postgresql/src/backend/utils/adt/ruleutils.c

Ignoring revisions in .git-blame-ignore-revs. Click here to bypass and see the normal blame view.

13268 lines
367 KiB
C
Raw Normal View History

/*-------------------------------------------------------------------------
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
*
* ruleutils.c
* Functions to convert stored expressions/querytrees back to
* source text
*
* Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
*
* IDENTIFICATION
2010-09-20 22:08:53 +02:00
* src/backend/utils/adt/ruleutils.c
*
*-------------------------------------------------------------------------
*/
#include "postgres.h"
#include <ctype.h>
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
#include <unistd.h>
#include <fcntl.h>
#include "access/amapi.h"
#include "access/htup_details.h"
#include "access/relation.h"
#include "access/table.h"
Support ordered-set (WITHIN GROUP) aggregates. This patch introduces generic support for ordered-set and hypothetical-set aggregate functions, as well as implementations of the instances defined in SQL:2008 (percentile_cont(), percentile_disc(), rank(), dense_rank(), percent_rank(), cume_dist()). We also added mode() though it is not in the spec, as well as versions of percentile_cont() and percentile_disc() that can compute multiple percentile values in one pass over the data. Unlike the original submission, this patch puts full control of the sorting process in the hands of the aggregate's support functions. To allow the support functions to find out how they're supposed to sort, a new API function AggGetAggref() is added to nodeAgg.c. This allows retrieval of the aggregate call's Aggref node, which may have other uses beyond the immediate need. There is also support for ordered-set aggregates to install cleanup callback functions, so that they can be sure that infrastructure such as tuplesort objects gets cleaned up. In passing, make some fixes in the recently-added support for variadic aggregates, and make some editorial adjustments in the recent FILTER additions for aggregates. Also, simplify use of IsBinaryCoercible() by allowing it to succeed whenever the target type is ANY or ANYELEMENT. It was inconsistent that it dealt with other polymorphic target types but not these. Atri Sharma and Andrew Gierth; reviewed by Pavel Stehule and Vik Fearing, and rather heavily editorialized upon by Tom Lane
2013-12-23 22:11:35 +01:00
#include "catalog/pg_aggregate.h"
#include "catalog/pg_am.h"
#include "catalog/pg_authid.h"
#include "catalog/pg_collation.h"
#include "catalog/pg_constraint.h"
#include "catalog/pg_depend.h"
#include "catalog/pg_language.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_operator.h"
Implement table partitioning. Table partitioning is like table inheritance and reuses much of the existing infrastructure, but there are some important differences. The parent is called a partitioned table and is always empty; it may not have indexes or non-inherited constraints, since those make no sense for a relation with no data of its own. The children are called partitions and contain all of the actual data. Each partition has an implicit partitioning constraint. Multiple inheritance is not allowed, and partitioning and inheritance can't be mixed. Partitions can't have extra columns and may not allow nulls unless the parent does. Tuples inserted into the parent are automatically routed to the correct partition, so tuple-routing ON INSERT triggers are not needed. Tuple routing isn't yet supported for partitions which are foreign tables, and it doesn't handle updates that cross partition boundaries. Currently, tables can be range-partitioned or list-partitioned. List partitioning is limited to a single column, but range partitioning can involve multiple columns. A partitioning "column" can be an expression. Because table partitioning is less general than table inheritance, it is hoped that it will be easier to reason about properties of partitions, and therefore that this will serve as a better foundation for a variety of possible optimizations, including query planner optimizations. The tuple routing based which this patch does based on the implicit partitioning constraints is an example of this, but it seems likely that many other useful optimizations are also possible. Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat, Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova, Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
#include "catalog/pg_partitioned_table.h"
#include "catalog/pg_proc.h"
Implement multivariate n-distinct coefficients Add support for explicitly declared statistic objects (CREATE STATISTICS), allowing collection of statistics on more complex combinations that individual table columns. Companion commands DROP STATISTICS and ALTER STATISTICS ... OWNER TO / SET SCHEMA / RENAME are added too. All this DDL has been designed so that more statistic types can be added later on, such as multivariate most-common-values and multivariate histograms between columns of a single table, leaving room for permitting columns on multiple tables, too, as well as expressions. This commit only adds support for collection of n-distinct coefficient on user-specified sets of columns in a single table. This is useful to estimate number of distinct groups in GROUP BY and DISTINCT clauses; estimation errors there can cause over-allocation of memory in hashed aggregates, for instance, so it's a worthwhile problem to solve. A new special pseudo-type pg_ndistinct is used. (num-distinct estimation was deemed sufficiently useful by itself that this is worthwhile even if no further statistic types are added immediately; so much so that another version of essentially the same functionality was submitted by Kyotaro Horiguchi: https://postgr.es/m/20150828.173334.114731693.horiguchi.kyotaro@lab.ntt.co.jp though this commit does not use that code.) Author: Tomas Vondra. Some code rework by Álvaro. Reviewed-by: Dean Rasheed, David Rowley, Kyotaro Horiguchi, Jeff Janes, Ideriha Takeshi Discussion: https://postgr.es/m/543AFA15.4080608@fuzzy.cz https://postgr.es/m/20170320190220.ixlaueanxegqd5gr@alvherre.pgsql
2017-03-24 18:06:10 +01:00
#include "catalog/pg_statistic_ext.h"
#include "catalog/pg_trigger.h"
#include "catalog/pg_type.h"
#include "commands/defrem.h"
#include "commands/tablespace.h"
#include "common/keywords.h"
#include "executor/spi.h"
#include "funcapi.h"
Speed up ruleutils' name de-duplication code, and fix overlength-name case. Since commit 11e131854f8231a21613f834c40fe9d046926387, ruleutils.c has attempted to ensure that each RTE in a query or plan tree has a unique alias name. However, the code that was added for this could be quite slow, even as bad as O(N^3) if N identical RTE names must be replaced, as noted by Jeff Janes. Improve matters by building a transient hash table within set_rtable_names. The hash table in itself reduces the cost of detecting a duplicate from O(N) to O(1), and we can save another factor of N by storing the number of de-duplicated names already created for each entry, so that we don't have to re-try names already created. This way is probably a bit slower overall for small range tables, but almost by definition, such cases should not be a performance problem. In principle the same problem applies to the column-name-de-duplication code; but in practice that seems to be less of a problem, first because N is limited since we don't support extremely wide tables, and second because duplicate column names within an RTE are fairly rare, so that in practice the cost is more like O(N^2) not O(N^3). It would be very much messier to fix the column-name code, so for now I've left that alone. An independent problem in the same area was that the de-duplication code paid no attention to the identifier length limit, and would happily produce identifiers that were longer than NAMEDATALEN and wouldn't be unique after truncation to NAMEDATALEN. This could result in dump/reload failures, or perhaps even views that silently behaved differently than before. We can fix that by shortening the base name as needed. Fix it for both the relation and column name cases. In passing, check for interrupts in set_rtable_names, just in case it's still slow enough to be an issue. Back-patch to 9.3 where this code was introduced.
2015-11-16 19:45:17 +01:00
#include "mb/pg_wchar.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
#include "nodes/nodeFuncs.h"
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
#include "nodes/pathnodes.h"
#include "optimizer/optimizer.h"
Support ordered-set (WITHIN GROUP) aggregates. This patch introduces generic support for ordered-set and hypothetical-set aggregate functions, as well as implementations of the instances defined in SQL:2008 (percentile_cont(), percentile_disc(), rank(), dense_rank(), percent_rank(), cume_dist()). We also added mode() though it is not in the spec, as well as versions of percentile_cont() and percentile_disc() that can compute multiple percentile values in one pass over the data. Unlike the original submission, this patch puts full control of the sorting process in the hands of the aggregate's support functions. To allow the support functions to find out how they're supposed to sort, a new API function AggGetAggref() is added to nodeAgg.c. This allows retrieval of the aggregate call's Aggref node, which may have other uses beyond the immediate need. There is also support for ordered-set aggregates to install cleanup callback functions, so that they can be sure that infrastructure such as tuplesort objects gets cleaned up. In passing, make some fixes in the recently-added support for variadic aggregates, and make some editorial adjustments in the recent FILTER additions for aggregates. Also, simplify use of IsBinaryCoercible() by allowing it to succeed whenever the target type is ANY or ANYELEMENT. It was inconsistent that it dealt with other polymorphic target types but not these. Atri Sharma and Andrew Gierth; reviewed by Pavel Stehule and Vik Fearing, and rather heavily editorialized upon by Tom Lane
2013-12-23 22:11:35 +01:00
#include "parser/parse_agg.h"
#include "parser/parse_func.h"
#include "parser/parse_node.h"
#include "parser/parse_oper.h"
Fix ruleutils issues with dropped cols in functions-returning-composite. Due to lack of concern for the case in the dependency code, it's possible to drop a column of a composite type even though stored queries have references to the dropped column via functions-in-FROM that return the composite type. There are "soft" references, namely FROM-clause aliases for such columns, and "hard" references, that is actual Vars referring to them. The right fix for hard references is to add dependencies preventing the drop; something we've known for many years and not done (and this commit still doesn't address it). A "soft" reference shouldn't prevent a drop though. We've been around on this before (cf. 9b35ddce9, 2c4debbd0), but nobody had noticed that the current behavior can result in dump/reload failures, because ruleutils.c can print more column aliases than the underlying composite type now has. So we need to rejigger the column-alias-handling code to treat such columns as dropped and not print aliases for them. Rather than writing new code for this, I used expandRTE() which already knows how to figure out which function result columns are dropped. I'd initially thought maybe we could use expandRTE() in all cases, but that fails for EXPLAIN's purposes, because the planner strips a lot of RTE infrastructure that expandRTE() needs. So this patch just uses it for unplanned function RTEs and otherwise does things the old way. If there is a hard reference (Var), then removing the column alias causes us to fail to print the Var, since there's no longer a name to print. Failing seems less desirable than printing a made-up name, so I made it print "?dropped?column?" instead. Per report from Timo Stolz. Back-patch to all supported branches. Discussion: https://postgr.es/m/5c91267e-3b6d-5795-189c-d15a55d61dbb@nullachtvierzehn.de
2022-07-21 19:56:02 +02:00
#include "parser/parse_relation.h"
#include "parser/parser.h"
#include "parser/parsetree.h"
#include "rewrite/rewriteHandler.h"
#include "rewrite/rewriteManip.h"
#include "rewrite/rewriteSupport.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/fmgroids.h"
Fix mishandling of quoted-list GUC values in pg_dump and ruleutils.c. Code that prints out the contents of setconfig or proconfig arrays in SQL format needs to handle GUC_LIST_QUOTE variables differently from other ones, because for those variables, flatten_set_variable_args() already applied a layer of quoting. The value can therefore safely be printed as-is, and indeed must be, or flatten_set_variable_args() will muck it up completely on reload. For all other GUC variables, it's necessary and sufficient to quote the value as a SQL literal. We'd recognized the need for this long ago, but mis-analyzed the need slightly, thinking that all GUC_LIST_INPUT variables needed the special treatment. That's actually wrong, since a valid value of a LIST variable might include characters that need quoting, although no existing variables accept such values. More to the point, we hadn't made any particular effort to keep the various places that deal with this up-to-date with the set of variables that actually need special treatment, meaning that we'd do the wrong thing with, for example, temp_tablespaces values. This affects dumping of SET clauses attached to functions, as well as ALTER DATABASE/ROLE SET commands. In ruleutils.c we can fix it reasonably honestly by exporting a guc.c function that allows discovering the flags for a given GUC variable. But pg_dump doesn't have easy access to that, so continue the old method of having a hard-wired list of affected variable names. At least we can fix it to have just one list not two, and update the list to match current reality. A remaining problem with this is that it only works for built-in GUC variables. pg_dump's list obvious knows nothing of third-party extensions, and even the "ask guc.c" method isn't bulletproof since the relevant extension might not be loaded. There's no obvious solution to that, so for now, we'll just have to discourage extension authors from inventing custom GUCs that need GUC_LIST_QUOTE. This has been busted for a long time, so back-patch to all supported branches. Michael Paquier and Tom Lane, reviewed by Kyotaro Horiguchi and Pavel Stehule Discussion: https://postgr.es/m/20180111064900.GA51030@paquier.xyz
2018-03-22 01:03:28 +01:00
#include "utils/guc.h"
Speed up ruleutils' name de-duplication code, and fix overlength-name case. Since commit 11e131854f8231a21613f834c40fe9d046926387, ruleutils.c has attempted to ensure that each RTE in a query or plan tree has a unique alias name. However, the code that was added for this could be quite slow, even as bad as O(N^3) if N identical RTE names must be replaced, as noted by Jeff Janes. Improve matters by building a transient hash table within set_rtable_names. The hash table in itself reduces the cost of detecting a duplicate from O(N) to O(1), and we can save another factor of N by storing the number of de-duplicated names already created for each entry, so that we don't have to re-try names already created. This way is probably a bit slower overall for small range tables, but almost by definition, such cases should not be a performance problem. In principle the same problem applies to the column-name-de-duplication code; but in practice that seems to be less of a problem, first because N is limited since we don't support extremely wide tables, and second because duplicate column names within an RTE are fairly rare, so that in practice the cost is more like O(N^2) not O(N^3). It would be very much messier to fix the column-name code, so for now I've left that alone. An independent problem in the same area was that the de-duplication code paid no attention to the identifier length limit, and would happily produce identifiers that were longer than NAMEDATALEN and wouldn't be unique after truncation to NAMEDATALEN. This could result in dump/reload failures, or perhaps even views that silently behaved differently than before. We can fix that by shortening the base name as needed. Fix it for both the relation and column name cases. In passing, check for interrupts in set_rtable_names, just in case it's still slow enough to be an issue. Back-patch to 9.3 where this code was introduced.
2015-11-16 19:45:17 +01:00
#include "utils/hsearch.h"
#include "utils/lsyscache.h"
#include "utils/partcache.h"
#include "utils/rel.h"
#include "utils/ruleutils.h"
#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/typcache.h"
#include "utils/varlena.h"
#include "utils/xml.h"
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
/* ----------
* Pretty formatting constants
* ----------
*/
/* Indent counts */
#define PRETTYINDENT_STD 8
#define PRETTYINDENT_JOIN 4
#define PRETTYINDENT_VAR 4
#define PRETTYINDENT_LIMIT 40 /* wrap limit */
/* Pretty flags */
#define PRETTYFLAG_PAREN 0x0001
#define PRETTYFLAG_INDENT 0x0002
#define PRETTYFLAG_SCHEMA 0x0004
/* Standard conversion of a "bool pretty" option to detailed flags */
#define GET_PRETTY_FLAGS(pretty) \
((pretty) ? (PRETTYFLAG_PAREN | PRETTYFLAG_INDENT | PRETTYFLAG_SCHEMA) \
: PRETTYFLAG_INDENT)
/* Default line length for pretty-print wrapping: 0 means wrap always */
#define WRAP_COLUMN_DEFAULT 0
/* macros to test if pretty action needed */
#define PRETTY_PAREN(context) ((context)->prettyFlags & PRETTYFLAG_PAREN)
#define PRETTY_INDENT(context) ((context)->prettyFlags & PRETTYFLAG_INDENT)
#define PRETTY_SCHEMA(context) ((context)->prettyFlags & PRETTYFLAG_SCHEMA)
/* ----------
* Local data types
* ----------
*/
/* Context info needed for invoking a recursive querytree display routine */
typedef struct
{
StringInfo buf; /* output buffer to append to */
List *namespaces; /* List of deparse_namespace nodes */
List *windowClause; /* Current query level's WINDOW clause */
List *windowTList; /* targetlist for resolving WINDOW clause */
int prettyFlags; /* enabling of pretty-print functions */
int wrapColumn; /* max line length, or -1 for no limit */
int indentLevel; /* current indent level for pretty-print */
bool varprefix; /* true to print prefixes on Vars */
ParseExprKind special_exprkind; /* set only for exprkinds needing special
* handling */
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
Bitmapset *appendparents; /* if not null, map child Vars of these relids
* back to the parent rel */
} deparse_context;
/*
* Each level of query context around a subtree needs a level of Var namespace.
* A Var having varlevelsup=N refers to the N'th item (counting from 0) in
* the current context's namespaces list.
*
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
* rtable is the list of actual RTEs from the Query or PlannedStmt.
* rtable_names holds the alias name to be used for each RTE (either a C
* string, or NULL for nameless RTEs such as unnamed joins).
* rtable_columns holds the column alias names to be used for each RTE.
*
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
* subplans is a list of Plan trees for SubPlans and CTEs (it's only used
* in the PlannedStmt case).
* ctes is a list of CommonTableExpr nodes (only used in the Query case).
* appendrels, if not null (it's only used in the PlannedStmt case), is an
* array of AppendRelInfo nodes, indexed by child relid. We use that to map
* child-table Vars to their inheritance parents.
*
* In some cases we need to make names of merged JOIN USING columns unique
* across the whole query, not only per-RTE. If so, unique_using is true
* and using_names is a list of C strings representing names already assigned
* to USING columns.
*
* When deparsing plan trees, there is always just a single item in the
* deparse_namespace list (since a plan tree never contains Vars with
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
* varlevelsup > 0). We store the Plan node that is the immediate
* parent of the expression to be deparsed, as well as a list of that
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
* Plan's ancestors. In addition, we store its outer and inner subplan nodes,
* as well as their targetlists, and the index tlist if the current plan node
* might contain INDEX_VAR Vars. (These fields could be derived on-the-fly
* from the current Plan node, but it seems notationally clearer to set them
* up as separate fields.)
*/
typedef struct
{
List *rtable; /* List of RangeTblEntry nodes */
Improve ruleutils.c's heuristics for dealing with rangetable aliases. The previous scheme had bugs in some corner cases involving tables that had been renamed since a view was made. This could result in dumped views that failed to reload or reloaded incorrectly, as seen in bug #7553 from Lloyd Albin, as well as in some pgsql-hackers discussion back in January. Also, its behavior for printing EXPLAIN plans was sometimes confusing because of willingness to use the same alias for multiple RTEs (it was Ashutosh Bapat's complaint about that aspect that started the January thread). To fix, ensure that each RTE in the query has a unique unqualified alias, by modifying the alias if necessary (we add "_" and digits as needed to create a non-conflicting name). Then we can just print its variables with that alias, avoiding the confusing and bug-prone scheme of sometimes schema-qualifying variable names. In EXPLAIN, it proves to be expedient to take the further step of only assigning such aliases to RTEs that are actually referenced in the query, since the planner has a habit of generating extra RTEs with the same alias in situations such as inheritance-tree expansion. Although this fixes a bug of very long standing, I'm hesitant to back-patch such a noticeable behavioral change. My experiments while creating a regression test convinced me that actually incorrect output (as opposed to confusing output) occurs only in very narrow cases, which is backed up by the lack of previous complaints from the field. So we may be better off living with it in released branches; and in any case it'd be smart to let this ripen awhile in HEAD before we consider back-patching it.
2012-09-22 01:03:10 +02:00
List *rtable_names; /* Parallel list of names for RTEs */
List *rtable_columns; /* Parallel list of deparse_columns structs */
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
List *subplans; /* List of Plan trees for SubPlans */
List *ctes; /* List of CommonTableExpr nodes */
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
AppendRelInfo **appendrels; /* Array of AppendRelInfo nodes, or NULL */
/* Workspace for column alias assignment: */
bool unique_using; /* Are we making USING names globally unique */
List *using_names; /* List of assigned names for USING columns */
/* Remaining fields are used only when deparsing a Plan tree: */
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
Plan *plan; /* immediate parent of current expression */
List *ancestors; /* ancestors of plan */
Plan *outer_plan; /* outer subnode, or NULL if none */
Plan *inner_plan; /* inner subnode, or NULL if none */
List *outer_tlist; /* referent for OUTER_VAR Vars */
List *inner_tlist; /* referent for INNER_VAR Vars */
List *index_tlist; /* referent for INDEX_VAR Vars */
SQL-standard function body This adds support for writing CREATE FUNCTION and CREATE PROCEDURE statements for language SQL with a function body that conforms to the SQL standard and is portable to other implementations. Instead of the PostgreSQL-specific AS $$ string literal $$ syntax, this allows writing out the SQL statements making up the body unquoted, either as a single statement: CREATE FUNCTION add(a integer, b integer) RETURNS integer LANGUAGE SQL RETURN a + b; or as a block CREATE PROCEDURE insert_data(a integer, b integer) LANGUAGE SQL BEGIN ATOMIC INSERT INTO tbl VALUES (a); INSERT INTO tbl VALUES (b); END; The function body is parsed at function definition time and stored as expression nodes in a new pg_proc column prosqlbody. So at run time, no further parsing is required. However, this form does not support polymorphic arguments, because there is no more parse analysis done at call time. Dependencies between the function and the objects it uses are fully tracked. A new RETURN statement is introduced. This can only be used inside function bodies. Internally, it is treated much like a SELECT statement. psql needs some new intelligence to keep track of function body boundaries so that it doesn't send off statements when it sees semicolons that are inside a function body. Tested-by: Jaime Casanova <jcasanov@systemguards.com.ec> Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/1c11f1eb-f00c-43b7-799d-2d44132c02d7@2ndquadrant.com
2021-04-07 21:30:08 +02:00
/* Special namespace representing a function signature: */
char *funcname;
int numargs;
char **argnames;
} deparse_namespace;
/*
* Per-relation data about column alias names.
*
* Selecting aliases is unreasonably complicated because of the need to dump
* rules/views whose underlying tables may have had columns added, deleted, or
* renamed since the query was parsed. We must nonetheless print the rule/view
* in a form that can be reloaded and will produce the same results as before.
*
* For each RTE used in the query, we must assign column aliases that are
* unique within that RTE. SQL does not require this of the original query,
* but due to factors such as *-expansion we need to be able to uniquely
* reference every column in a decompiled query. As long as we qualify all
* column references, per-RTE uniqueness is sufficient for that.
*
* However, we can't ensure per-column name uniqueness for unnamed join RTEs,
* since they just inherit column names from their input RTEs, and we can't
* rename the columns at the join level. Most of the time this isn't an issue
* because we don't need to reference the join's output columns as such; we
* can reference the input columns instead. That approach can fail for merged
* JOIN USING columns, however, so when we have one of those in an unnamed
* join, we have to make that column's alias globally unique across the whole
* query to ensure it can be referenced unambiguously.
*
* Another problem is that a JOIN USING clause requires the columns to be
* merged to have the same aliases in both input RTEs, and that no other
* columns in those RTEs or their children conflict with the USING names.
* To handle that, we do USING-column alias assignment in a recursive
* traversal of the query's jointree. When descending through a JOIN with
* USING, we preassign the USING column names to the child columns, overriding
* other rules for column alias assignment. We also mark each RTE with a list
* of all USING column names selected for joins containing that RTE, so that
* when we assign other columns' aliases later, we can avoid conflicts.
*
* Another problem is that if a JOIN's input tables have had columns added or
* deleted since the query was parsed, we must generate a column alias list
* for the join that matches the current set of input columns --- otherwise, a
* change in the number of columns in the left input would throw off matching
* of aliases to columns of the right input. Thus, positions in the printable
* column alias list are not necessarily one-for-one with varattnos of the
* JOIN, so we need a separate new_colnames[] array for printing purposes.
*/
typedef struct
{
/*
* colnames is an array containing column aliases to use for columns that
* existed when the query was parsed. Dropped columns have NULL entries.
* This array can be directly indexed by varattno to get a Var's name.
*
* Non-NULL entries are guaranteed unique within the RTE, *except* when
* this is for an unnamed JOIN RTE. In that case we merely copy up names
* from the two input RTEs.
*
* During the recursive descent in set_using_names(), forcible assignment
* of a child RTE's column name is represented by pre-setting that element
* of the child's colnames array. So at that stage, NULL entries in this
* array just mean that no name has been preassigned, not necessarily that
* the column is dropped.
*/
int num_cols; /* length of colnames[] array */
char **colnames; /* array of C strings and NULLs */
/*
* new_colnames is an array containing column aliases to use for columns
* that would exist if the query was re-parsed against the current
* definitions of its base tables. This is what to print as the column
* alias list for the RTE. This array does not include dropped columns,
* but it will include columns added since original parsing. Indexes in
* it therefore have little to do with current varattno values. As above,
* entries are unique unless this is for an unnamed JOIN RTE. (In such an
* RTE, we never actually print this array, but we must compute it anyway
* for possible use in computing column names of upper joins.) The
* parallel array is_new_col marks which of these columns are new since
* original parsing. Entries with is_new_col false must match the
* non-NULL colnames entries one-for-one.
*/
int num_new_cols; /* length of new_colnames[] array */
char **new_colnames; /* array of C strings */
bool *is_new_col; /* array of bool flags */
/* This flag tells whether we should actually print a column alias list */
bool printaliases;
/* This list has all names used as USING names in joins above this RTE */
List *parentUsing; /* names assigned to parent merged columns */
/*
* If this struct is for a JOIN RTE, we fill these fields during the
* set_using_names() pass to describe its relationship to its child RTEs.
*
* leftattnos and rightattnos are arrays with one entry per existing
* output column of the join (hence, indexable by join varattno). For a
* simple reference to a column of the left child, leftattnos[i] is the
* child RTE's attno and rightattnos[i] is zero; and conversely for a
* column of the right child. But for merged columns produced by JOIN
* USING/NATURAL JOIN, both leftattnos[i] and rightattnos[i] are nonzero.
Reconsider the representation of join alias Vars. The core idea of this patch is to make the parser generate join alias Vars (that is, ones with varno pointing to a JOIN RTE) only when the alias Var is actually different from any raw join input, that is a type coercion and/or COALESCE is necessary to generate the join output value. Otherwise just generate varno/varattno pointing to the relevant join input column. In effect, this means that the planner's flatten_join_alias_vars() transformation is already done in the parser, for all cases except (a) columns that are merged by JOIN USING and are transformed in the process, and (b) whole-row join Vars. In principle that would allow us to skip doing flatten_join_alias_vars() in many more queries than we do now, but we don't have quite enough infrastructure to know that we can do so --- in particular there's no cheap way to know whether there are any whole-row join Vars. I'm not sure if it's worth the trouble to add a Query-level flag for that, and in any case it seems like fit material for a separate patch. But even without skipping the work entirely, this should make flatten_join_alias_vars() faster, particularly where there are nested joins that it previously had to flatten recursively. An essential part of this change is to replace Var nodes' varnoold/varoattno fields with varnosyn/varattnosyn, which have considerably more tightly-defined meanings than the old fields: when they differ from varno/varattno, they identify the Var's position in an aliased JOIN RTE, and the join alias is what ruleutils.c should print for the Var. This is necessary because the varno change destroyed ruleutils.c's ability to find the JOIN RTE from the Var's varno. Another way in which this change broke ruleutils.c is that it's no longer feasible to determine, from a JOIN RTE's joinaliasvars list, which join columns correspond to which columns of the join's immediate input relations. (If those are sub-joins, the joinaliasvars entries may point to columns of their base relations, not the sub-joins.) But that was a horrid mess requiring a lot of fragile assumptions already, so let's just bite the bullet and add some more JOIN RTE fields to make it more straightforward to figure that out. I added two integer-List fields containing the relevant column numbers from the left and right input rels, plus a count of how many merged columns there are. This patch depends on the ParseNamespaceColumn infrastructure that I added in commit 5815696bc. The biggest bit of code change is restructuring transformFromClauseItem's handling of JOINs so that the ParseNamespaceColumn data is propagated upward correctly. Other than that and the ruleutils fixes, everything pretty much just works, though some processing is now inessential. I grabbed two pieces of low-hanging fruit in that line: 1. In find_expr_references, we don't need to recurse into join alias Vars anymore. There aren't any except for references to merged USING columns, which are more properly handled when we scan the join's RTE. This change actually fixes an edge-case issue: we will now record a dependency on any type-coercion function present in a USING column's joinaliasvar, even if that join column has no references in the query text. The odds of the missing dependency causing a problem seem quite small: you'd have to posit somebody dropping an implicit cast between two data types, without removing the types themselves, and then having a stored rule containing a whole-row Var for a join whose USING merge depends on that cast. So I don't feel a great need to change this in the back branches. But in theory this way is more correct. 2. markRTEForSelectPriv and markTargetListOrigin don't need to recurse into join alias Vars either, because the cases they care about don't apply to alias Vars for USING columns that are semantically distinct from the underlying columns. This removes the only case in which markVarForSelectPriv could be called with NULL for the RTE, so adjust the comments to describe that hack as being strictly internal to markRTEForSelectPriv. catversion bump required due to changes in stored rules. Discussion: https://postgr.es/m/7115.1577986646@sss.pgh.pa.us
2020-01-09 17:56:59 +01:00
* Note that a simple reference might be to a child RTE column that's been
* dropped; but that's OK since the column could not be used in the query.
*
* If it's a JOIN USING, usingNames holds the alias names selected for the
* merged columns (these might be different from the original USING list,
* if we had to modify names to achieve uniqueness).
*/
int leftrti; /* rangetable index of left child */
int rightrti; /* rangetable index of right child */
int *leftattnos; /* left-child varattnos of join cols, or 0 */
int *rightattnos; /* right-child varattnos of join cols, or 0 */
List *usingNames; /* names assigned to merged columns */
} deparse_columns;
/* This macro is analogous to rt_fetch(), but for deparse_columns structs */
#define deparse_columns_fetch(rangetable_index, dpns) \
((deparse_columns *) list_nth((dpns)->rtable_columns, (rangetable_index)-1))
Speed up ruleutils' name de-duplication code, and fix overlength-name case. Since commit 11e131854f8231a21613f834c40fe9d046926387, ruleutils.c has attempted to ensure that each RTE in a query or plan tree has a unique alias name. However, the code that was added for this could be quite slow, even as bad as O(N^3) if N identical RTE names must be replaced, as noted by Jeff Janes. Improve matters by building a transient hash table within set_rtable_names. The hash table in itself reduces the cost of detecting a duplicate from O(N) to O(1), and we can save another factor of N by storing the number of de-duplicated names already created for each entry, so that we don't have to re-try names already created. This way is probably a bit slower overall for small range tables, but almost by definition, such cases should not be a performance problem. In principle the same problem applies to the column-name-de-duplication code; but in practice that seems to be less of a problem, first because N is limited since we don't support extremely wide tables, and second because duplicate column names within an RTE are fairly rare, so that in practice the cost is more like O(N^2) not O(N^3). It would be very much messier to fix the column-name code, so for now I've left that alone. An independent problem in the same area was that the de-duplication code paid no attention to the identifier length limit, and would happily produce identifiers that were longer than NAMEDATALEN and wouldn't be unique after truncation to NAMEDATALEN. This could result in dump/reload failures, or perhaps even views that silently behaved differently than before. We can fix that by shortening the base name as needed. Fix it for both the relation and column name cases. In passing, check for interrupts in set_rtable_names, just in case it's still slow enough to be an issue. Back-patch to 9.3 where this code was introduced.
2015-11-16 19:45:17 +01:00
/*
* Entry in set_rtable_names' hash table
*/
typedef struct
{
char name[NAMEDATALEN]; /* Hash key --- must be first */
int counter; /* Largest addition used so far for name */
} NameHashEntry;
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
/* Callback signature for resolve_special_varno() */
typedef void (*rsv_callback) (Node *node, deparse_context *context,
void *callback_arg);
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
/* ----------
* Global data
* ----------
*/
static SPIPlanPtr plan_getrulebyoid = NULL;
static const char *const query_getrulebyoid = "SELECT * FROM pg_catalog.pg_rewrite WHERE oid = $1";
static SPIPlanPtr plan_getviewrule = NULL;
static const char *const query_getviewrule = "SELECT * FROM pg_catalog.pg_rewrite WHERE ev_class = $1 AND rulename = $2";
/* GUC parameters */
bool quote_all_identifiers = false;
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
/* ----------
* Local functions
*
* Most of these functions used to use fixed-size buffers to build their
* results. Now, they take an (already initialized) StringInfo object
* as a parameter, and append their text output to its contents.
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
* ----------
*/
static char *deparse_expression_pretty(Node *expr, List *dpcontext,
bool forceprefix, bool showimplicit,
int prettyFlags, int startIndent);
static char *pg_get_viewdef_worker(Oid viewoid,
int prettyFlags, int wrapColumn);
static char *pg_get_triggerdef_worker(Oid trigid, bool pretty);
static int decompile_column_index_array(Datum column_index_array, Oid relId,
bool withPeriod, StringInfo buf);
static char *pg_get_ruledef_worker(Oid ruleoid, int prettyFlags);
static char *pg_get_indexdef_worker(Oid indexrelid, int colno,
const Oid *excludeOps,
bool attrsOnly, bool keysOnly,
bool showTblSpc, bool inherits,
int prettyFlags, bool missing_ok);
Extended statistics on expressions Allow defining extended statistics on expressions, not just just on simple column references. With this commit, expressions are supported by all existing extended statistics kinds, improving the same types of estimates. A simple example may look like this: CREATE TABLE t (a int); CREATE STATISTICS s ON mod(a,10), mod(a,20) FROM t; ANALYZE t; The collected statistics are useful e.g. to estimate queries with those expressions in WHERE or GROUP BY clauses: SELECT * FROM t WHERE mod(a,10) = 0 AND mod(a,20) = 0; SELECT 1 FROM t GROUP BY mod(a,10), mod(a,20); This introduces new internal statistics kind 'e' (expressions) which is built automatically when the statistics object definition includes any expressions. This represents single-expression statistics, as if there was an expression index (but without the index maintenance overhead). The statistics is stored in pg_statistics_ext_data as an array of composite types, which is possible thanks to 79f6a942bd. CREATE STATISTICS allows building statistics on a single expression, in which case in which case it's not possible to specify statistics kinds. A new system view pg_stats_ext_exprs can be used to display expression statistics, similarly to pg_stats and pg_stats_ext views. ALTER TABLE ... ALTER COLUMN ... TYPE now treats indexes the same way it treats indexes, i.e. it drops and recreates the statistics. This means all statistics are reset, and we no longer try to preserve at least the functional dependencies. This should not be a major issue in practice, as the functional dependencies actually rely on per-column statistics, which were always reset anyway. Author: Tomas Vondra Reviewed-by: Justin Pryzby, Dean Rasheed, Zhihong Yu Discussion: https://postgr.es/m/ad7891d2-e90c-b446-9fe2-7419143847d7%40enterprisedb.com
2021-03-26 23:22:01 +01:00
static char *pg_get_statisticsobj_worker(Oid statextid, bool columns_only,
bool missing_ok);
static char *pg_get_partkeydef_worker(Oid relid, int prettyFlags,
bool attrsOnly, bool missing_ok);
static char *pg_get_constraintdef_worker(Oid constraintId, bool fullCommand,
int prettyFlags, bool missing_ok);
static text *pg_get_expr_worker(text *expr, Oid relid, int prettyFlags);
static int print_function_arguments(StringInfo buf, HeapTuple proctup,
bool print_table_args, bool print_defaults);
static void print_function_rettype(StringInfo buf, HeapTuple proctup);
static void print_function_trftypes(StringInfo buf, HeapTuple proctup);
SQL-standard function body This adds support for writing CREATE FUNCTION and CREATE PROCEDURE statements for language SQL with a function body that conforms to the SQL standard and is portable to other implementations. Instead of the PostgreSQL-specific AS $$ string literal $$ syntax, this allows writing out the SQL statements making up the body unquoted, either as a single statement: CREATE FUNCTION add(a integer, b integer) RETURNS integer LANGUAGE SQL RETURN a + b; or as a block CREATE PROCEDURE insert_data(a integer, b integer) LANGUAGE SQL BEGIN ATOMIC INSERT INTO tbl VALUES (a); INSERT INTO tbl VALUES (b); END; The function body is parsed at function definition time and stored as expression nodes in a new pg_proc column prosqlbody. So at run time, no further parsing is required. However, this form does not support polymorphic arguments, because there is no more parse analysis done at call time. Dependencies between the function and the objects it uses are fully tracked. A new RETURN statement is introduced. This can only be used inside function bodies. Internally, it is treated much like a SELECT statement. psql needs some new intelligence to keep track of function body boundaries so that it doesn't send off statements when it sees semicolons that are inside a function body. Tested-by: Jaime Casanova <jcasanov@systemguards.com.ec> Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/1c11f1eb-f00c-43b7-799d-2d44132c02d7@2ndquadrant.com
2021-04-07 21:30:08 +02:00
static void print_function_sqlbody(StringInfo buf, HeapTuple proctup);
Improve ruleutils.c's heuristics for dealing with rangetable aliases. The previous scheme had bugs in some corner cases involving tables that had been renamed since a view was made. This could result in dumped views that failed to reload or reloaded incorrectly, as seen in bug #7553 from Lloyd Albin, as well as in some pgsql-hackers discussion back in January. Also, its behavior for printing EXPLAIN plans was sometimes confusing because of willingness to use the same alias for multiple RTEs (it was Ashutosh Bapat's complaint about that aspect that started the January thread). To fix, ensure that each RTE in the query has a unique unqualified alias, by modifying the alias if necessary (we add "_" and digits as needed to create a non-conflicting name). Then we can just print its variables with that alias, avoiding the confusing and bug-prone scheme of sometimes schema-qualifying variable names. In EXPLAIN, it proves to be expedient to take the further step of only assigning such aliases to RTEs that are actually referenced in the query, since the planner has a habit of generating extra RTEs with the same alias in situations such as inheritance-tree expansion. Although this fixes a bug of very long standing, I'm hesitant to back-patch such a noticeable behavioral change. My experiments while creating a regression test convinced me that actually incorrect output (as opposed to confusing output) occurs only in very narrow cases, which is backed up by the lack of previous complaints from the field. So we may be better off living with it in released branches; and in any case it'd be smart to let this ripen awhile in HEAD before we consider back-patching it.
2012-09-22 01:03:10 +02:00
static void set_rtable_names(deparse_namespace *dpns, List *parent_namespaces,
Bitmapset *rels_used);
static void set_deparse_for_query(deparse_namespace *dpns, Query *query,
List *parent_namespaces);
static void set_simple_column_names(deparse_namespace *dpns);
static bool has_dangerous_join_using(deparse_namespace *dpns, Node *jtnode);
static void set_using_names(deparse_namespace *dpns, Node *jtnode,
List *parentUsing);
static void set_relation_column_names(deparse_namespace *dpns,
RangeTblEntry *rte,
deparse_columns *colinfo);
static void set_join_column_names(deparse_namespace *dpns, RangeTblEntry *rte,
deparse_columns *colinfo);
static bool colname_is_unique(const char *colname, deparse_namespace *dpns,
deparse_columns *colinfo);
static char *make_colname_unique(char *colname, deparse_namespace *dpns,
deparse_columns *colinfo);
static void expand_colnames_array_to(deparse_columns *colinfo, int n);
static void identify_join_columns(JoinExpr *j, RangeTblEntry *jrte,
deparse_columns *colinfo);
Improve ruleutils.c's heuristics for dealing with rangetable aliases. The previous scheme had bugs in some corner cases involving tables that had been renamed since a view was made. This could result in dumped views that failed to reload or reloaded incorrectly, as seen in bug #7553 from Lloyd Albin, as well as in some pgsql-hackers discussion back in January. Also, its behavior for printing EXPLAIN plans was sometimes confusing because of willingness to use the same alias for multiple RTEs (it was Ashutosh Bapat's complaint about that aspect that started the January thread). To fix, ensure that each RTE in the query has a unique unqualified alias, by modifying the alias if necessary (we add "_" and digits as needed to create a non-conflicting name). Then we can just print its variables with that alias, avoiding the confusing and bug-prone scheme of sometimes schema-qualifying variable names. In EXPLAIN, it proves to be expedient to take the further step of only assigning such aliases to RTEs that are actually referenced in the query, since the planner has a habit of generating extra RTEs with the same alias in situations such as inheritance-tree expansion. Although this fixes a bug of very long standing, I'm hesitant to back-patch such a noticeable behavioral change. My experiments while creating a regression test convinced me that actually incorrect output (as opposed to confusing output) occurs only in very narrow cases, which is backed up by the lack of previous complaints from the field. So we may be better off living with it in released branches; and in any case it'd be smart to let this ripen awhile in HEAD before we consider back-patching it.
2012-09-22 01:03:10 +02:00
static char *get_rtable_name(int rtindex, deparse_context *context);
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
static void set_deparse_plan(deparse_namespace *dpns, Plan *plan);
static Plan *find_recursive_union(deparse_namespace *dpns,
WorkTableScan *wtscan);
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
static void push_child_plan(deparse_namespace *dpns, Plan *plan,
deparse_namespace *save_dpns);
static void pop_child_plan(deparse_namespace *dpns,
deparse_namespace *save_dpns);
static void push_ancestor_plan(deparse_namespace *dpns, ListCell *ancestor_cell,
deparse_namespace *save_dpns);
static void pop_ancestor_plan(deparse_namespace *dpns,
deparse_namespace *save_dpns);
static void make_ruledef(StringInfo buf, HeapTuple ruletup, TupleDesc rulettc,
int prettyFlags);
static void make_viewdef(StringInfo buf, HeapTuple ruletup, TupleDesc rulettc,
int prettyFlags, int wrapColumn);
static void get_query_def(Query *query, StringInfo buf, List *parentnamespace,
TupleDesc resultDesc, bool colNamesVisible,
int prettyFlags, int wrapColumn, int startIndent);
static void get_values_def(List *values_lists, deparse_context *context);
static void get_with_clause(Query *query, deparse_context *context);
static void get_select_query_def(Query *query, deparse_context *context,
TupleDesc resultDesc, bool colNamesVisible);
static void get_insert_query_def(Query *query, deparse_context *context,
bool colNamesVisible);
static void get_update_query_def(Query *query, deparse_context *context,
bool colNamesVisible);
Add support for INSERT ... ON CONFLICT DO NOTHING/UPDATE. The newly added ON CONFLICT clause allows to specify an alternative to raising a unique or exclusion constraint violation error when inserting. ON CONFLICT refers to constraints that can either be specified using a inference clause (by specifying the columns of a unique constraint) or by naming a unique or exclusion constraint. DO NOTHING avoids the constraint violation, without touching the pre-existing row. DO UPDATE SET ... [WHERE ...] updates the pre-existing tuple, and has access to both the tuple proposed for insertion and the existing tuple; the optional WHERE clause can be used to prevent an update from being executed. The UPDATE SET and WHERE clauses have access to the tuple proposed for insertion using the "magic" EXCLUDED alias, and to the pre-existing tuple using the table name or its alias. This feature is often referred to as upsert. This is implemented using a new infrastructure called "speculative insertion". It is an optimistic variant of regular insertion that first does a pre-check for existing tuples and then attempts an insert. If a violating tuple was inserted concurrently, the speculatively inserted tuple is deleted and a new attempt is made. If the pre-check finds a matching tuple the alternative DO NOTHING or DO UPDATE action is taken. If the insertion succeeds without detecting a conflict, the tuple is deemed inserted. To handle the possible ambiguity between the excluded alias and a table named excluded, and for convenience with long relation names, INSERT INTO now can alias its target table. Bumps catversion as stored rules change. Author: Peter Geoghegan, with significant contributions from Heikki Linnakangas and Andres Freund. Testing infrastructure by Jeff Janes. Reviewed-By: Heikki Linnakangas, Andres Freund, Robert Haas, Simon Riggs, Dean Rasheed, Stephen Frost and many others.
2015-05-08 05:31:36 +02:00
static void get_update_query_targetlist_def(Query *query, List *targetList,
deparse_context *context,
RangeTblEntry *rte);
static void get_delete_query_def(Query *query, deparse_context *context,
bool colNamesVisible);
static void get_merge_query_def(Query *query, deparse_context *context,
bool colNamesVisible);
static void get_utility_query_def(Query *query, deparse_context *context);
static void get_basic_select_query(Query *query, deparse_context *context,
TupleDesc resultDesc, bool colNamesVisible);
static void get_target_list(List *targetList, deparse_context *context,
TupleDesc resultDesc, bool colNamesVisible);
static void get_setop_query(Node *setOp, Query *query,
deparse_context *context,
TupleDesc resultDesc, bool colNamesVisible);
Support GROUPING SETS, CUBE and ROLLUP. This SQL standard functionality allows to aggregate data by different GROUP BY clauses at once. Each grouping set returns rows with columns grouped by in other sets set to NULL. This could previously be achieved by doing each grouping as a separate query, conjoined by UNION ALLs. Besides being considerably more concise, grouping sets will in many cases be faster, requiring only one scan over the underlying data. The current implementation of grouping sets only supports using sorting for input. Individual sets that share a sort order are computed in one pass. If there are sets that don't share a sort order, additional sort & aggregation steps are performed. These additional passes are sourced by the previous sort step; thus avoiding repeated scans of the source data. The code is structured in a way that adding support for purely using hash aggregation or a mix of hashing and sorting is possible. Sorting was chosen to be supported first, as it is the most generic method of implementation. Instead of, as in an earlier versions of the patch, representing the chain of sort and aggregation steps as full blown planner and executor nodes, all but the first sort are performed inside the aggregation node itself. This avoids the need to do some unusual gymnastics to handle having to return aggregated and non-aggregated tuples from underlying nodes, as well as having to shut down underlying nodes early to limit memory usage. The optimizer still builds Sort/Agg node to describe each phase, but they're not part of the plan tree, but instead additional data for the aggregation node. They're a convenient and preexisting way to describe aggregation and sorting. The first (and possibly only) sort step is still performed as a separate execution step. That retains similarity with existing group by plans, makes rescans fairly simple, avoids very deep plans (leading to slow explains) and easily allows to avoid the sorting step if the underlying data is sorted by other means. A somewhat ugly side of this patch is having to deal with a grammar ambiguity between the new CUBE keyword and the cube extension/functions named cube (and rollup). To avoid breaking existing deployments of the cube extension it has not been renamed, neither has cube been made a reserved keyword. Instead precedence hacking is used to make GROUP BY cube(..) refer to the CUBE grouping sets feature, and not the function cube(). To actually group by a function cube(), unlikely as that might be, the function name has to be quoted. Needs a catversion bump because stored rules may change. Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
2015-05-16 03:40:59 +02:00
static Node *get_rule_sortgroupclause(Index ref, List *tlist,
bool force_colno,
deparse_context *context);
Support GROUPING SETS, CUBE and ROLLUP. This SQL standard functionality allows to aggregate data by different GROUP BY clauses at once. Each grouping set returns rows with columns grouped by in other sets set to NULL. This could previously be achieved by doing each grouping as a separate query, conjoined by UNION ALLs. Besides being considerably more concise, grouping sets will in many cases be faster, requiring only one scan over the underlying data. The current implementation of grouping sets only supports using sorting for input. Individual sets that share a sort order are computed in one pass. If there are sets that don't share a sort order, additional sort & aggregation steps are performed. These additional passes are sourced by the previous sort step; thus avoiding repeated scans of the source data. The code is structured in a way that adding support for purely using hash aggregation or a mix of hashing and sorting is possible. Sorting was chosen to be supported first, as it is the most generic method of implementation. Instead of, as in an earlier versions of the patch, representing the chain of sort and aggregation steps as full blown planner and executor nodes, all but the first sort are performed inside the aggregation node itself. This avoids the need to do some unusual gymnastics to handle having to return aggregated and non-aggregated tuples from underlying nodes, as well as having to shut down underlying nodes early to limit memory usage. The optimizer still builds Sort/Agg node to describe each phase, but they're not part of the plan tree, but instead additional data for the aggregation node. They're a convenient and preexisting way to describe aggregation and sorting. The first (and possibly only) sort step is still performed as a separate execution step. That retains similarity with existing group by plans, makes rescans fairly simple, avoids very deep plans (leading to slow explains) and easily allows to avoid the sorting step if the underlying data is sorted by other means. A somewhat ugly side of this patch is having to deal with a grammar ambiguity between the new CUBE keyword and the cube extension/functions named cube (and rollup). To avoid breaking existing deployments of the cube extension it has not been renamed, neither has cube been made a reserved keyword. Instead precedence hacking is used to make GROUP BY cube(..) refer to the CUBE grouping sets feature, and not the function cube(). To actually group by a function cube(), unlikely as that might be, the function name has to be quoted. Needs a catversion bump because stored rules may change. Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
2015-05-16 03:40:59 +02:00
static void get_rule_groupingset(GroupingSet *gset, List *targetlist,
bool omit_parens, deparse_context *context);
static void get_rule_orderby(List *orderList, List *targetList,
bool force_colno, deparse_context *context);
static void get_rule_windowclause(Query *query, deparse_context *context);
static void get_rule_windowspec(WindowClause *wc, List *targetList,
deparse_context *context);
static char *get_variable(Var *var, int levelsup, bool istoplevel,
deparse_context *context);
static void get_special_variable(Node *node, deparse_context *context,
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
void *callback_arg);
static void resolve_special_varno(Node *node, deparse_context *context,
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
rsv_callback callback, void *callback_arg);
static Node *find_param_referent(Param *param, deparse_context *context,
deparse_namespace **dpns_p, ListCell **ancestor_cell_p);
static SubPlan *find_param_generator(Param *param, deparse_context *context,
int *column_p);
static SubPlan *find_param_generator_initplan(Param *param, Plan *plan,
int *column_p);
static void get_parameter(Param *param, deparse_context *context);
static const char *get_simple_binary_op_name(OpExpr *expr);
static bool isSimpleNode(Node *node, Node *parentNode, int prettyFlags);
static void appendContextKeyword(deparse_context *context, const char *str,
int indentBefore, int indentAfter, int indentPlus);
static void removeStringInfoSpaces(StringInfo str);
static void get_rule_expr(Node *node, deparse_context *context,
bool showimplicit);
static void get_rule_expr_toplevel(Node *node, deparse_context *context,
bool showimplicit);
static void get_rule_list_toplevel(List *lst, deparse_context *context,
bool showimplicit);
static void get_rule_expr_funccall(Node *node, deparse_context *context,
bool showimplicit);
static bool looks_like_function(Node *node);
static void get_oper_expr(OpExpr *expr, deparse_context *context);
static void get_func_expr(FuncExpr *expr, deparse_context *context,
bool showimplicit);
static void get_agg_expr(Aggref *aggref, deparse_context *context,
Aggref *original_aggref);
static void get_agg_expr_helper(Aggref *aggref, deparse_context *context,
Aggref *original_aggref, const char *funcname,
const char *options, bool is_json_objectagg);
static void get_agg_combine_expr(Node *node, deparse_context *context,
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
void *callback_arg);
static void get_windowfunc_expr(WindowFunc *wfunc, deparse_context *context);
static void get_windowfunc_expr_helper(WindowFunc *wfunc, deparse_context *context,
const char *funcname, const char *options,
bool is_json_objectagg);
Improve our ability to regurgitate SQL-syntax function calls. The SQL spec calls out nonstandard syntax for certain function calls, for example substring() with numeric position info is supposed to be spelled "SUBSTRING(string FROM start FOR count)". We accept many of these things, but up to now would not print them in the same format, instead simplifying down to "substring"(string, start, count). That's long annoyed me because it creates an interoperability problem: we're gratuitously injecting Postgres-specific syntax into what might otherwise be a perfectly spec-compliant view definition. However, the real reason for addressing it right now is to support a planned change in the semantics of EXTRACT() a/k/a date_part(). When we switch that to returning numeric, we'll have the parser translate EXTRACT() to some new function name (might as well be "extract" if you ask me) and then teach ruleutils.c to reverse-list that per SQL spec. In this way existing calls to date_part() will continue to have the old semantics. To implement this, invent a new CoercionForm value COERCE_SQL_SYNTAX, and make the parser insert that rather than COERCE_EXPLICIT_CALL when the input has SQL-spec decoration. (But if the input has the form of a plain function call, continue to mark it COERCE_EXPLICIT_CALL, even if it's calling one of these functions.) Then ruleutils.c recognizes COERCE_SQL_SYNTAX as a cue to emit SQL call syntax. It can know which decoration to emit using hard-wired knowledge about the functions that could be called this way. (While this solution isn't extensible without manual additions, neither is the grammar, so this doesn't seem unmaintainable.) Notice that this solution will reverse-list a function call with SQL decoration only if it was entered that way; so dump-and-reload will not by itself produce any changes in the appearance of views. This requires adding a CoercionForm field to struct FuncCall. (I couldn't resist the temptation to rearrange that struct's field order a tad while I was at it.) FuncCall doesn't appear in stored rules, so that change isn't a reason for a catversion bump, but I did one anyway because the new enum value for CoercionForm fields could confuse old backend code. Possible future work: * Perhaps CoercionForm should now be renamed to DisplayForm, or something like that, to reflect its more general meaning. This'd require touching a couple hundred places, so it's not clear it's worth the code churn. * The SQLValueFunction node type, which was invented partly for the same goal of improving SQL-compatibility of view output, could perhaps be replaced with regular function calls marked with COERCE_SQL_SYNTAX. It's unclear if this would be a net code savings, however. Discussion: https://postgr.es/m/42b73d2d-da12-ba9f-570a-420e0cce19d9@phystech.edu
2020-11-04 18:34:50 +01:00
static bool get_func_sql_syntax(FuncExpr *expr, deparse_context *context);
static void get_coercion_expr(Node *arg, deparse_context *context,
Oid resulttype, int32 resulttypmod,
Node *parentNode);
static void get_const_expr(Const *constval, deparse_context *context,
int showtype);
static void get_const_collation(Const *constval, deparse_context *context);
static void get_json_format(JsonFormat *format, StringInfo buf);
Add SQL/JSON query functions This introduces the following SQL/JSON functions for querying JSON data using jsonpath expressions: JSON_EXISTS(), which can be used to apply a jsonpath expression to a JSON value to check if it yields any values. JSON_QUERY(), which can be used to to apply a jsonpath expression to a JSON value to get a JSON object, an array, or a string. There are various options to control whether multi-value result uses array wrappers and whether the singleton scalar strings are quoted or not. JSON_VALUE(), which can be used to apply a jsonpath expression to a JSON value to return a single scalar value, producing an error if it multiple values are matched. Both JSON_VALUE() and JSON_QUERY() functions have options for handling EMPTY and ERROR conditions, which can be used to specify the behavior when no values are matched and when an error occurs during jsonpath evaluation, respectively. Author: Nikita Glukhov <n.gluhov@postgrespro.ru> Author: Teodor Sigaev <teodor@sigaev.ru> Author: Oleg Bartunov <obartunov@gmail.com> Author: Alexander Korotkov <aekorotkov@gmail.com> Author: Andrew Dunstan <andrew@dunslane.net> Author: Amit Langote <amitlangote09@gmail.com> Author: Peter Eisentraut <peter@eisentraut.org> Author: Jian He <jian.universality@gmail.com> Reviewers have included (in no particular order): Andres Freund, Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu, Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby, Álvaro Herrera, Jian He, Anton A. Melnikov, Nikita Malakhov, Peter Eisentraut, Tomas Vondra Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org Discussion: https://postgr.es/m/CA+HiwqHROpf9e644D8BRqYvaAPmgBZVup-xKMDPk-nd4EpgzHw@mail.gmail.com Discussion: https://postgr.es/m/CA+HiwqE4XTdfb1nW=Ojoy_tQSRhYt-q_kb6i5d4xcKyrLC1Nbg@mail.gmail.com
2024-03-21 09:06:27 +01:00
static void get_json_returning(JsonReturning *returning, StringInfo buf,
bool json_format_by_default);
static void get_json_constructor(JsonConstructorExpr *ctor,
deparse_context *context, bool showimplicit);
static void get_json_constructor_options(JsonConstructorExpr *ctor,
StringInfo buf);
static void get_json_agg_constructor(JsonConstructorExpr *ctor,
deparse_context *context,
const char *funcname,
bool is_json_objectagg);
static void simple_quote_literal(StringInfo buf, const char *val);
static void get_sublink_expr(SubLink *sublink, deparse_context *context);
static void get_tablefunc(TableFunc *tf, deparse_context *context,
bool showimplicit);
static void get_from_clause(Query *query, const char *prefix,
deparse_context *context);
static void get_from_clause_item(Node *jtnode, Query *query,
deparse_context *context);
static void get_rte_alias(RangeTblEntry *rte, int varno, bool use_as,
deparse_context *context);
static void get_column_alias_list(deparse_columns *colinfo,
deparse_context *context);
static void get_from_clause_coldeflist(RangeTblFunction *rtfunc,
deparse_columns *colinfo,
deparse_context *context);
Redesign tablesample method API, and do extensive code review. The original implementation of TABLESAMPLE modeled the tablesample method API on index access methods, which wasn't a good choice because, without specialized DDL commands, there's no way to build an extension that can implement a TSM. (Raw inserts into system catalogs are not an acceptable thing to do, because we can't undo them during DROP EXTENSION, nor will pg_upgrade behave sanely.) Instead adopt an API more like procedural language handlers or foreign data wrappers, wherein the only SQL-level support object needed is a single handler function identified by having a special return type. This lets us get rid of the supporting catalog altogether, so that no custom DDL support is needed for the feature. Adjust the API so that it can support non-constant tablesample arguments (the original coding assumed we could evaluate the argument expressions at ExecInitSampleScan time, which is undesirable even if it weren't outright unsafe), and discourage sampling methods from looking at invisible tuples. Make sure that the BERNOULLI and SYSTEM methods are genuinely repeatable within and across queries, as required by the SQL standard, and deal more honestly with methods that can't support that requirement. Make a full code-review pass over the tablesample additions, and fix assorted bugs, omissions, infelicities, and cosmetic issues (such as failure to put the added code stanzas in a consistent ordering). Improve EXPLAIN's output of tablesample plans, too. Back-patch to 9.5 so that we don't have to support the original API in production.
2015-07-25 20:39:00 +02:00
static void get_tablesample_def(TableSampleClause *tablesample,
deparse_context *context);
static void get_opclass_name(Oid opclass, Oid actual_datatype,
StringInfo buf);
Make INSERT-from-multiple-VALUES-rows handle targetlist indirection better. Previously, if an INSERT with multiple rows of VALUES had indirection (array subscripting or field selection) in its target-columns list, the parser handled that by applying transformAssignedExpr() to each element of each VALUES row independently. This led to having ArrayRef assignment nodes or FieldStore nodes in each row of the VALUES RTE. That works for simple cases, but in bug #14265 Nuri Boardman points out that it fails if there are multiple assignments to elements/fields of the same target column. For such cases to work, rewriteTargetListIU() has to nest the ArrayRefs or FieldStores together to produce a single expression to be assigned to the column. But it failed to find them in the top-level targetlist and issued an error about "multiple assignments to same column". We could possibly fix this by teaching the rewriter to apply rewriteTargetListIU to each VALUES row separately, but that would be messy (it would change the output rowtype of the VALUES RTE, for example) and inefficient. Instead, let's fix the parser so that the VALUES RTE outputs are just the user-specified values, cast to the right type if necessary, and then the ArrayRefs or FieldStores are applied in the top-level targetlist to Vars representing the RTE's outputs. This is the same parsetree representation already used for similar cases with INSERT/SELECT syntax, so it allows simplifications in ruleutils.c, which no longer needs to treat INSERT-from-multiple-VALUES as its own special case. This implementation works by applying transformAssignedExpr to the VALUES entries as before, and then stripping off any ArrayRefs or FieldStores it adds. With lots of VALUES rows it would be noticeably more efficient to not add those nodes in the first place. But that's just an optimization not a bug fix, and there doesn't seem to be any good way to do it without significant refactoring. (A non-invasive answer would be to apply transformAssignedExpr + stripping to just the first VALUES row, and then just forcibly cast remaining rows to the same data types exposed in the first row. But this way would lead to different, not-INSERT-specific errors being reported in casting failure cases, so it doesn't seem very nice.) So leave that for later; this patch at least isn't making the per-row parsing work worse, and it does make the finished parsetree smaller, saving rewriter and planner work. Catversion bump because stored rules containing such INSERTs would need to change. Because of that, no back-patch, even though this is a very long-standing bug. Report: <20160727005725.7438.26021@wrigleys.postgresql.org> Discussion: <9578.1469645245@sss.pgh.pa.us>
2016-08-03 22:37:03 +02:00
static Node *processIndirection(Node *node, deparse_context *context);
static void printSubscripts(SubscriptingRef *sbsref, deparse_context *context);
static char *get_relation_name(Oid relid);
static char *generate_relation_name(Oid relid, List *namespaces);
Fix handling of inherited check constraints in ALTER COLUMN TYPE (again). The previous way of reconstructing check constraints was to do a separate "ALTER TABLE ONLY tab ADD CONSTRAINT" for each table in an inheritance hierarchy. However, that way has no hope of reconstructing the check constraints' own inheritance properties correctly, as pointed out in bug #13779 from Jan Dirk Zijlstra. What we should do instead is to do a regular "ALTER TABLE", allowing recursion, at the topmost table that has a particular constraint, and then suppress the work queue entries for inherited instances of the constraint. Annoyingly, we'd tried to fix this behavior before, in commit 5ed6546cf, but we failed to notice that it wasn't reconstructing the pg_constraint field values correctly. As long as I'm touching pg_get_constraintdef_worker anyway, tweak it to always schema-qualify the target table name; this seems like useful backup to the protections installed by commit 5f173040. In HEAD/9.5, get rid of get_constraint_relation_oids, which is now unused. (I could alternatively have modified it to also return conislocal, but that seemed like a pretty single-purpose API, so let's not pretend it has some other use.) It's unused in the back branches as well, but I left it in place just in case some third-party code has decided to use it. In HEAD/9.5, also rename pg_get_constraintdef_string to pg_get_constraintdef_command, as the previous name did nothing to explain what that entry point did differently from others (and its comment was equally useless). Again, that change doesn't seem like material for back-patching. I did a bit of re-pgindenting in tablecmds.c in HEAD/9.5, as well. Otherwise, back-patch to all supported branches.
2015-11-20 20:55:28 +01:00
static char *generate_qualified_relation_name(Oid relid);
static char *generate_function_name(Oid funcid, int nargs,
Support GROUPING SETS, CUBE and ROLLUP. This SQL standard functionality allows to aggregate data by different GROUP BY clauses at once. Each grouping set returns rows with columns grouped by in other sets set to NULL. This could previously be achieved by doing each grouping as a separate query, conjoined by UNION ALLs. Besides being considerably more concise, grouping sets will in many cases be faster, requiring only one scan over the underlying data. The current implementation of grouping sets only supports using sorting for input. Individual sets that share a sort order are computed in one pass. If there are sets that don't share a sort order, additional sort & aggregation steps are performed. These additional passes are sourced by the previous sort step; thus avoiding repeated scans of the source data. The code is structured in a way that adding support for purely using hash aggregation or a mix of hashing and sorting is possible. Sorting was chosen to be supported first, as it is the most generic method of implementation. Instead of, as in an earlier versions of the patch, representing the chain of sort and aggregation steps as full blown planner and executor nodes, all but the first sort are performed inside the aggregation node itself. This avoids the need to do some unusual gymnastics to handle having to return aggregated and non-aggregated tuples from underlying nodes, as well as having to shut down underlying nodes early to limit memory usage. The optimizer still builds Sort/Agg node to describe each phase, but they're not part of the plan tree, but instead additional data for the aggregation node. They're a convenient and preexisting way to describe aggregation and sorting. The first (and possibly only) sort step is still performed as a separate execution step. That retains similarity with existing group by plans, makes rescans fairly simple, avoids very deep plans (leading to slow explains) and easily allows to avoid the sorting step if the underlying data is sorted by other means. A somewhat ugly side of this patch is having to deal with a grammar ambiguity between the new CUBE keyword and the cube extension/functions named cube (and rollup). To avoid breaking existing deployments of the cube extension it has not been renamed, neither has cube been made a reserved keyword. Instead precedence hacking is used to make GROUP BY cube(..) refer to the CUBE grouping sets feature, and not the function cube(). To actually group by a function cube(), unlikely as that might be, the function name has to be quoted. Needs a catversion bump because stored rules may change. Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
2015-05-16 03:40:59 +02:00
List *argnames, Oid *argtypes,
bool has_variadic, bool *use_variadic_p,
ParseExprKind special_exprkind);
static char *generate_operator_name(Oid operid, Oid arg1, Oid arg2);
static void add_cast_to(StringInfo buf, Oid typid);
static char *generate_qualified_type_name(Oid typid);
static text *string_to_text(char *str);
static char *flatten_reloptions(Oid relid);
Implement operator class parameters PostgreSQL provides set of template index access methods, where opclasses have much freedom in the semantics of indexing. These index AMs are GiST, GIN, SP-GiST and BRIN. There opclasses define representation of keys, operations on them and supported search strategies. So, it's natural that opclasses may be faced some tradeoffs, which require user-side decision. This commit implements opclass parameters allowing users to set some values, which tell opclass how to index the particular dataset. This commit doesn't introduce new storage in system catalog. Instead it uses pg_attribute.attoptions, which is used for table column storage options but unused for index attributes. In order to evade changing signature of each opclass support function, we implement unified way to pass options to opclass support functions. Options are set to fn_expr as the constant bytea expression. It's possible due to the fact that opclass support functions are executed outside of expressions, so fn_expr is unused for them. This commit comes with some examples of opclass options usage. We parametrize signature length in GiST. That applies to multiple opclasses: tsvector_ops, gist__intbig_ops, gist_ltree_ops, gist__ltree_ops, gist_trgm_ops and gist_hstore_ops. Also we parametrize maximum number of integer ranges for gist__int_ops. However, the main future usage of this feature is expected to be json, where users would be able to specify which way to index particular json parts. Catversion is bumped. Discussion: https://postgr.es/m/d22c3a18-31c7-1879-fc11-4c1ce2f5e5af%40postgrespro.ru Author: Nikita Glukhov, revised by me Reviwed-by: Nikolay Shaplov, Robert Haas, Tom Lane, Tomas Vondra, Alvaro Herrera
2020-03-30 18:17:11 +02:00
static void get_reloptions(StringInfo buf, Datum reloptions);
Add SQL/JSON query functions This introduces the following SQL/JSON functions for querying JSON data using jsonpath expressions: JSON_EXISTS(), which can be used to apply a jsonpath expression to a JSON value to check if it yields any values. JSON_QUERY(), which can be used to to apply a jsonpath expression to a JSON value to get a JSON object, an array, or a string. There are various options to control whether multi-value result uses array wrappers and whether the singleton scalar strings are quoted or not. JSON_VALUE(), which can be used to apply a jsonpath expression to a JSON value to return a single scalar value, producing an error if it multiple values are matched. Both JSON_VALUE() and JSON_QUERY() functions have options for handling EMPTY and ERROR conditions, which can be used to specify the behavior when no values are matched and when an error occurs during jsonpath evaluation, respectively. Author: Nikita Glukhov <n.gluhov@postgrespro.ru> Author: Teodor Sigaev <teodor@sigaev.ru> Author: Oleg Bartunov <obartunov@gmail.com> Author: Alexander Korotkov <aekorotkov@gmail.com> Author: Andrew Dunstan <andrew@dunslane.net> Author: Amit Langote <amitlangote09@gmail.com> Author: Peter Eisentraut <peter@eisentraut.org> Author: Jian He <jian.universality@gmail.com> Reviewers have included (in no particular order): Andres Freund, Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu, Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby, Álvaro Herrera, Jian He, Anton A. Melnikov, Nikita Malakhov, Peter Eisentraut, Tomas Vondra Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org Discussion: https://postgr.es/m/CA+HiwqHROpf9e644D8BRqYvaAPmgBZVup-xKMDPk-nd4EpgzHw@mail.gmail.com Discussion: https://postgr.es/m/CA+HiwqE4XTdfb1nW=Ojoy_tQSRhYt-q_kb6i5d4xcKyrLC1Nbg@mail.gmail.com
2024-03-21 09:06:27 +01:00
static void get_json_path_spec(Node *path_spec, deparse_context *context,
bool showimplicit);
Add basic JSON_TABLE() functionality JSON_TABLE() allows JSON data to be converted into a relational view and thus used, for example, in a FROM clause, like other tabular data. Data to show in the view is selected from a source JSON object using a JSON path expression to get a sequence of JSON objects that's called a "row pattern", which becomes the source to compute the SQL/JSON values that populate the view's output columns. Column values themselves are computed using JSON path expressions applied to each of the JSON objects comprising the "row pattern", for which the SQL/JSON query functions added in 6185c9737cf4 are used. To implement JSON_TABLE() as a table function, this augments the TableFunc and TableFuncScanState nodes that are currently used to support XMLTABLE() with some JSON_TABLE()-specific fields. Note that the JSON_TABLE() spec includes NESTED COLUMNS and PLAN clauses, which are required to provide more flexibility to extract data out of nested JSON objects, but they are not implemented here to keep this commit of manageable size. Author: Nikita Glukhov <n.gluhov@postgrespro.ru> Author: Teodor Sigaev <teodor@sigaev.ru> Author: Oleg Bartunov <obartunov@gmail.com> Author: Alexander Korotkov <aekorotkov@gmail.com> Author: Andrew Dunstan <andrew@dunslane.net> Author: Amit Langote <amitlangote09@gmail.com> Author: Jian He <jian.universality@gmail.com> Reviewers have included (in no particular order): Andres Freund, Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu, Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby, Álvaro Herrera, Jian He Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org Discussion: https://postgr.es/m/CA+HiwqE4XTdfb1nW=Ojoy_tQSRhYt-q_kb6i5d4xcKyrLC1Nbg@mail.gmail.com
2024-04-04 12:57:08 +02:00
static void get_json_table_columns(TableFunc *tf, deparse_context *context,
bool showimplicit);
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
#define only_marker(rte) ((rte)->inh ? "" : "ONLY ")
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
/* ----------
* pg_get_ruledef - Do it all and return a text
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
* that could be used as a statement
* to recreate the rule
* ----------
*/
Datum
pg_get_ruledef(PG_FUNCTION_ARGS)
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
{
Oid ruleoid = PG_GETARG_OID(0);
int prettyFlags;
char *res;
prettyFlags = PRETTYFLAG_INDENT;
res = pg_get_ruledef_worker(ruleoid, prettyFlags);
if (res == NULL)
PG_RETURN_NULL();
PG_RETURN_TEXT_P(string_to_text(res));
}
Datum
pg_get_ruledef_ext(PG_FUNCTION_ARGS)
{
Oid ruleoid = PG_GETARG_OID(0);
bool pretty = PG_GETARG_BOOL(1);
int prettyFlags;
char *res;
prettyFlags = GET_PRETTY_FLAGS(pretty);
res = pg_get_ruledef_worker(ruleoid, prettyFlags);
if (res == NULL)
PG_RETURN_NULL();
PG_RETURN_TEXT_P(string_to_text(res));
}
static char *
pg_get_ruledef_worker(Oid ruleoid, int prettyFlags)
{
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
Datum args[1];
char nulls[1];
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
int spirc;
HeapTuple ruletup;
TupleDesc rulettc;
StringInfoData buf;
/*
* Do this first so that string is alloc'd in outer context not SPI's.
*/
initStringInfo(&buf);
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
/*
* Connect to SPI manager
*/
if (SPI_connect() != SPI_OK_CONNECT)
elog(ERROR, "SPI_connect failed");
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
/*
* On the first call prepare the plan to lookup pg_rewrite. We read
* pg_rewrite over the SPI manager instead of using the syscache to be
* checked for read access on pg_rewrite.
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
*/
if (plan_getrulebyoid == NULL)
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
{
Oid argtypes[1];
SPIPlanPtr plan;
argtypes[0] = OIDOID;
plan = SPI_prepare(query_getrulebyoid, 1, argtypes);
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
if (plan == NULL)
elog(ERROR, "SPI_prepare failed for \"%s\"", query_getrulebyoid);
SPI_keepplan(plan);
plan_getrulebyoid = plan;
}
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
/*
* Get the pg_rewrite tuple for this rule
*/
args[0] = ObjectIdGetDatum(ruleoid);
nulls[0] = ' ';
spirc = SPI_execute_plan(plan_getrulebyoid, args, nulls, true, 0);
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
if (spirc != SPI_OK_SELECT)
elog(ERROR, "failed to get pg_rewrite tuple for rule %u", ruleoid);
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
if (SPI_processed != 1)
{
/*
* There is no tuple data available here, just keep the output buffer
* empty.
*/
}
else
{
/*
2006-07-04 06:35:49 +02:00
* Get the rule's definition and put it into executor's memory
*/
ruletup = SPI_tuptable->vals[0];
rulettc = SPI_tuptable->tupdesc;
make_ruledef(&buf, ruletup, rulettc, prettyFlags);
}
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
/*
* Disconnect from SPI manager
*/
if (SPI_finish() != SPI_OK_FINISH)
elog(ERROR, "SPI_finish failed");
if (buf.len == 0)
return NULL;
return buf.data;
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
}
/* ----------
* pg_get_viewdef - Mainly the same thing, but we
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
* only return the SELECT part of a view
* ----------
*/
Datum
pg_get_viewdef(PG_FUNCTION_ARGS)
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
{
/* By OID */
Oid viewoid = PG_GETARG_OID(0);
int prettyFlags;
char *res;
prettyFlags = PRETTYFLAG_INDENT;
res = pg_get_viewdef_worker(viewoid, prettyFlags, WRAP_COLUMN_DEFAULT);
if (res == NULL)
PG_RETURN_NULL();
PG_RETURN_TEXT_P(string_to_text(res));
}
Datum
pg_get_viewdef_ext(PG_FUNCTION_ARGS)
{
/* By OID */
Oid viewoid = PG_GETARG_OID(0);
bool pretty = PG_GETARG_BOOL(1);
int prettyFlags;
char *res;
prettyFlags = GET_PRETTY_FLAGS(pretty);
res = pg_get_viewdef_worker(viewoid, prettyFlags, WRAP_COLUMN_DEFAULT);
if (res == NULL)
PG_RETURN_NULL();
PG_RETURN_TEXT_P(string_to_text(res));
}
Datum
pg_get_viewdef_wrap(PG_FUNCTION_ARGS)
{
/* By OID */
Oid viewoid = PG_GETARG_OID(0);
int wrap = PG_GETARG_INT32(1);
int prettyFlags;
char *res;
/* calling this implies we want pretty printing */
prettyFlags = GET_PRETTY_FLAGS(true);
res = pg_get_viewdef_worker(viewoid, prettyFlags, wrap);
if (res == NULL)
PG_RETURN_NULL();
PG_RETURN_TEXT_P(string_to_text(res));
}
Datum
pg_get_viewdef_name(PG_FUNCTION_ARGS)
{
/* By qualified name */
text *viewname = PG_GETARG_TEXT_PP(0);
int prettyFlags;
RangeVar *viewrel;
Oid viewoid;
char *res;
prettyFlags = PRETTYFLAG_INDENT;
/* Look up view name. Can't lock it - we might not have privileges. */
viewrel = makeRangeVarFromNameList(textToQualifiedNameList(viewname));
Improve table locking behavior in the face of current DDL. In the previous coding, callers were faced with an awkward choice: look up the name, do permissions checks, and then lock the table; or look up the name, lock the table, and then do permissions checks. The first choice was wrong because the results of the name lookup and permissions checks might be out-of-date by the time the table lock was acquired, while the second allowed a user with no privileges to interfere with access to a table by users who do have privileges (e.g. if a malicious backend queues up for an AccessExclusiveLock on a table on which AccessShareLock is already held, further attempts to access the table will be blocked until the AccessExclusiveLock is obtained and the malicious backend's transaction rolls back). To fix, allow callers of RangeVarGetRelid() to pass a callback which gets executed after performing the name lookup but before acquiring the relation lock. If the name lookup is retried (because invalidation messages are received), the callback will be re-executed as well, so we get the best of both worlds. RangeVarGetRelid() is renamed to RangeVarGetRelidExtended(); callers not wishing to supply a callback can continue to invoke it as RangeVarGetRelid(), which is now a macro. Since the only one caller that uses nowait = true now passes a callback anyway, the RangeVarGetRelid() macro defaults nowait as well. The callback can also be used for supplemental locking - for example, REINDEX INDEX needs to acquire the table lock before the index lock to reduce deadlock possibilities. There's a lot more work to be done here to fix all the cases where this can be a problem, but this commit provides the general infrastructure and fixes the following specific cases: REINDEX INDEX, REINDEX TABLE, LOCK TABLE, and and DROP TABLE/INDEX/SEQUENCE/VIEW/FOREIGN TABLE. Per discussion with Noah Misch and Alvaro Herrera.
2011-11-30 16:12:27 +01:00
viewoid = RangeVarGetRelid(viewrel, NoLock, false);
res = pg_get_viewdef_worker(viewoid, prettyFlags, WRAP_COLUMN_DEFAULT);
if (res == NULL)
PG_RETURN_NULL();
PG_RETURN_TEXT_P(string_to_text(res));
}
Datum
pg_get_viewdef_name_ext(PG_FUNCTION_ARGS)
{
/* By qualified name */
text *viewname = PG_GETARG_TEXT_PP(0);
bool pretty = PG_GETARG_BOOL(1);
int prettyFlags;
RangeVar *viewrel;
Oid viewoid;
char *res;
prettyFlags = GET_PRETTY_FLAGS(pretty);
/* Look up view name. Can't lock it - we might not have privileges. */
viewrel = makeRangeVarFromNameList(textToQualifiedNameList(viewname));
Improve table locking behavior in the face of current DDL. In the previous coding, callers were faced with an awkward choice: look up the name, do permissions checks, and then lock the table; or look up the name, lock the table, and then do permissions checks. The first choice was wrong because the results of the name lookup and permissions checks might be out-of-date by the time the table lock was acquired, while the second allowed a user with no privileges to interfere with access to a table by users who do have privileges (e.g. if a malicious backend queues up for an AccessExclusiveLock on a table on which AccessShareLock is already held, further attempts to access the table will be blocked until the AccessExclusiveLock is obtained and the malicious backend's transaction rolls back). To fix, allow callers of RangeVarGetRelid() to pass a callback which gets executed after performing the name lookup but before acquiring the relation lock. If the name lookup is retried (because invalidation messages are received), the callback will be re-executed as well, so we get the best of both worlds. RangeVarGetRelid() is renamed to RangeVarGetRelidExtended(); callers not wishing to supply a callback can continue to invoke it as RangeVarGetRelid(), which is now a macro. Since the only one caller that uses nowait = true now passes a callback anyway, the RangeVarGetRelid() macro defaults nowait as well. The callback can also be used for supplemental locking - for example, REINDEX INDEX needs to acquire the table lock before the index lock to reduce deadlock possibilities. There's a lot more work to be done here to fix all the cases where this can be a problem, but this commit provides the general infrastructure and fixes the following specific cases: REINDEX INDEX, REINDEX TABLE, LOCK TABLE, and and DROP TABLE/INDEX/SEQUENCE/VIEW/FOREIGN TABLE. Per discussion with Noah Misch and Alvaro Herrera.
2011-11-30 16:12:27 +01:00
viewoid = RangeVarGetRelid(viewrel, NoLock, false);
res = pg_get_viewdef_worker(viewoid, prettyFlags, WRAP_COLUMN_DEFAULT);
if (res == NULL)
PG_RETURN_NULL();
PG_RETURN_TEXT_P(string_to_text(res));
}
/*
* Common code for by-OID and by-name variants of pg_get_viewdef
*/
static char *
pg_get_viewdef_worker(Oid viewoid, int prettyFlags, int wrapColumn)
{
Datum args[2];
char nulls[2];
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
int spirc;
HeapTuple ruletup;
TupleDesc rulettc;
StringInfoData buf;
/*
* Do this first so that string is alloc'd in outer context not SPI's.
*/
initStringInfo(&buf);
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
/*
* Connect to SPI manager
*/
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
if (SPI_connect() != SPI_OK_CONNECT)
elog(ERROR, "SPI_connect failed");
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
/*
* On the first call prepare the plan to lookup pg_rewrite. We read
* pg_rewrite over the SPI manager instead of using the syscache to be
* checked for read access on pg_rewrite.
*/
if (plan_getviewrule == NULL)
{
Oid argtypes[2];
SPIPlanPtr plan;
argtypes[0] = OIDOID;
argtypes[1] = NAMEOID;
plan = SPI_prepare(query_getviewrule, 2, argtypes);
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
if (plan == NULL)
elog(ERROR, "SPI_prepare failed for \"%s\"", query_getviewrule);
SPI_keepplan(plan);
plan_getviewrule = plan;
}
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
/*
* Get the pg_rewrite tuple for the view's SELECT rule
*/
args[0] = ObjectIdGetDatum(viewoid);
args[1] = DirectFunctionCall1(namein, CStringGetDatum(ViewSelectRuleName));
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
nulls[0] = ' ';
nulls[1] = ' ';
spirc = SPI_execute_plan(plan_getviewrule, args, nulls, true, 0);
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
if (spirc != SPI_OK_SELECT)
elog(ERROR, "failed to get pg_rewrite tuple for view %u", viewoid);
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
if (SPI_processed != 1)
{
/*
* There is no tuple data available here, just keep the output buffer
* empty.
*/
}
else
{
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
/*
2006-07-04 06:35:49 +02:00
* Get the rule's definition and put it into executor's memory
*/
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
ruletup = SPI_tuptable->vals[0];
rulettc = SPI_tuptable->tupdesc;
make_viewdef(&buf, ruletup, rulettc, prettyFlags, wrapColumn);
}
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
/*
* Disconnect from SPI manager
*/
if (SPI_finish() != SPI_OK_FINISH)
elog(ERROR, "SPI_finish failed");
if (buf.len == 0)
return NULL;
return buf.data;
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
}
/* ----------
* pg_get_triggerdef - Get the definition of a trigger
* ----------
*/
Datum
pg_get_triggerdef(PG_FUNCTION_ARGS)
{
Oid trigid = PG_GETARG_OID(0);
char *res;
res = pg_get_triggerdef_worker(trigid, false);
if (res == NULL)
PG_RETURN_NULL();
PG_RETURN_TEXT_P(string_to_text(res));
}
Datum
pg_get_triggerdef_ext(PG_FUNCTION_ARGS)
{
Oid trigid = PG_GETARG_OID(0);
bool pretty = PG_GETARG_BOOL(1);
char *res;
res = pg_get_triggerdef_worker(trigid, pretty);
if (res == NULL)
PG_RETURN_NULL();
PG_RETURN_TEXT_P(string_to_text(res));
}
static char *
pg_get_triggerdef_worker(Oid trigid, bool pretty)
{
HeapTuple ht_trig;
Form_pg_trigger trigrec;
StringInfoData buf;
Relation tgrel;
ScanKeyData skey[1];
SysScanDesc tgscan;
int findx = 0;
char *tgname;
char *tgoldtable;
char *tgnewtable;
Datum value;
bool isnull;
/*
* Fetch the pg_trigger tuple by the Oid of the trigger
*/
tgrel = table_open(TriggerRelationId, AccessShareLock);
ScanKeyInit(&skey[0],
Remove WITH OIDS support, change oid catalog column visibility. Previously tables declared WITH OIDS, including a significant fraction of the catalog tables, stored the oid column not as a normal column, but as part of the tuple header. This special column was not shown by default, which was somewhat odd, as it's often (consider e.g. pg_class.oid) one of the more important parts of a row. Neither pg_dump nor COPY included the contents of the oid column by default. The fact that the oid column was not an ordinary column necessitated a significant amount of special case code to support oid columns. That already was painful for the existing, but upcoming work aiming to make table storage pluggable, would have required expanding and duplicating that "specialness" significantly. WITH OIDS has been deprecated since 2005 (commit ff02d0a05280e0). Remove it. Removing includes: - CREATE TABLE and ALTER TABLE syntax for declaring the table to be WITH OIDS has been removed (WITH (oids[ = true]) will error out) - pg_dump does not support dumping tables declared WITH OIDS and will issue a warning when dumping one (and ignore the oid column). - restoring an pg_dump archive with pg_restore will warn when restoring a table with oid contents (and ignore the oid column) - COPY will refuse to load binary dump that includes oids. - pg_upgrade will error out when encountering tables declared WITH OIDS, they have to be altered to remove the oid column first. - Functionality to access the oid of the last inserted row (like plpgsql's RESULT_OID, spi's SPI_lastoid, ...) has been removed. The syntax for declaring a table WITHOUT OIDS (or WITH (oids = false) for CREATE TABLE) is still supported. While that requires a bit of support code, it seems unnecessary to break applications / dumps that do not use oids, and are explicit about not using them. The biggest user of WITH OID columns was postgres' catalog. This commit changes all 'magic' oid columns to be columns that are normally declared and stored. To reduce unnecessary query breakage all the newly added columns are still named 'oid', even if a table's column naming scheme would indicate 'reloid' or such. This obviously requires adapting a lot code, mostly replacing oid access via HeapTupleGetOid() with access to the underlying Form_pg_*->oid column. The bootstrap process now assigns oids for all oid columns in genbki.pl that do not have an explicit value (starting at the largest oid previously used), only oids assigned later by oids will be above FirstBootstrapObjectId. As the oid column now is a normal column the special bootstrap syntax for oids has been removed. Oids are not automatically assigned during insertion anymore, all backend code explicitly assigns oids with GetNewOidWithIndex(). For the rare case that insertions into the catalog via SQL are called for the new pg_nextoid() function can be used (which only works on catalog tables). The fact that oid columns on system tables are now normal columns means that they will be included in the set of columns expanded by * (i.e. SELECT * FROM pg_class will now include the table's oid, previously it did not). It'd not technically be hard to hide oid column by default, but that'd mean confusing behavior would either have to be carried forward forever, or it'd cause breakage down the line. While it's not unlikely that further adjustments are needed, the scope/invasiveness of the patch makes it worthwhile to get merge this now. It's painful to maintain externally, too complicated to commit after the code code freeze, and a dependency of a number of other patches. Catversion bump, for obvious reasons. Author: Andres Freund, with contributions by John Naylor Discussion: https://postgr.es/m/20180930034810.ywp2c7awz7opzcfr@alap3.anarazel.de
2018-11-21 00:36:57 +01:00
Anum_pg_trigger_oid,
BTEqualStrategyNumber, F_OIDEQ,
ObjectIdGetDatum(trigid));
tgscan = systable_beginscan(tgrel, TriggerOidIndexId, true,
NULL, 1, skey);
ht_trig = systable_getnext(tgscan);
if (!HeapTupleIsValid(ht_trig))
{
systable_endscan(tgscan);
table_close(tgrel, AccessShareLock);
return NULL;
}
trigrec = (Form_pg_trigger) GETSTRUCT(ht_trig);
/*
* Start the trigger definition. Note that the trigger's name should never
* be schema-qualified, but the trigger rel's name may be.
*/
initStringInfo(&buf);
tgname = NameStr(trigrec->tgname);
appendStringInfo(&buf, "CREATE %sTRIGGER %s ",
OidIsValid(trigrec->tgconstraint) ? "CONSTRAINT " : "",
quote_identifier(tgname));
if (TRIGGER_FOR_BEFORE(trigrec->tgtype))
appendStringInfoString(&buf, "BEFORE");
else if (TRIGGER_FOR_AFTER(trigrec->tgtype))
appendStringInfoString(&buf, "AFTER");
else if (TRIGGER_FOR_INSTEAD(trigrec->tgtype))
appendStringInfoString(&buf, "INSTEAD OF");
else
elog(ERROR, "unexpected tgtype value: %d", trigrec->tgtype);
if (TRIGGER_FOR_INSERT(trigrec->tgtype))
{
appendStringInfoString(&buf, " INSERT");
findx++;
}
if (TRIGGER_FOR_DELETE(trigrec->tgtype))
{
if (findx > 0)
appendStringInfoString(&buf, " OR DELETE");
else
appendStringInfoString(&buf, " DELETE");
findx++;
}
if (TRIGGER_FOR_UPDATE(trigrec->tgtype))
{
if (findx > 0)
appendStringInfoString(&buf, " OR UPDATE");
else
appendStringInfoString(&buf, " UPDATE");
findx++;
/* tgattr is first var-width field, so OK to access directly */
if (trigrec->tgattr.dim1 > 0)
{
int i;
appendStringInfoString(&buf, " OF ");
for (i = 0; i < trigrec->tgattr.dim1; i++)
{
char *attname;
if (i > 0)
appendStringInfoString(&buf, ", ");
attname = get_attname(trigrec->tgrelid,
trigrec->tgattr.values[i], false);
appendStringInfoString(&buf, quote_identifier(attname));
}
}
}
if (TRIGGER_FOR_TRUNCATE(trigrec->tgtype))
{
if (findx > 0)
appendStringInfoString(&buf, " OR TRUNCATE");
else
appendStringInfoString(&buf, " TRUNCATE");
findx++;
}
/*
* In non-pretty mode, always schema-qualify the target table name for
* safety. In pretty mode, schema-qualify only if not visible.
*/
appendStringInfo(&buf, " ON %s ",
pretty ?
generate_relation_name(trigrec->tgrelid, NIL) :
generate_qualified_relation_name(trigrec->tgrelid));
if (OidIsValid(trigrec->tgconstraint))
{
if (OidIsValid(trigrec->tgconstrrelid))
appendStringInfo(&buf, "FROM %s ",
generate_relation_name(trigrec->tgconstrrelid, NIL));
if (!trigrec->tgdeferrable)
appendStringInfoString(&buf, "NOT ");
appendStringInfoString(&buf, "DEFERRABLE INITIALLY ");
if (trigrec->tginitdeferred)
appendStringInfoString(&buf, "DEFERRED ");
else
appendStringInfoString(&buf, "IMMEDIATE ");
}
value = fastgetattr(ht_trig, Anum_pg_trigger_tgoldtable,
tgrel->rd_att, &isnull);
if (!isnull)
tgoldtable = NameStr(*DatumGetName(value));
else
tgoldtable = NULL;
value = fastgetattr(ht_trig, Anum_pg_trigger_tgnewtable,
tgrel->rd_att, &isnull);
if (!isnull)
tgnewtable = NameStr(*DatumGetName(value));
else
tgnewtable = NULL;
if (tgoldtable != NULL || tgnewtable != NULL)
{
appendStringInfoString(&buf, "REFERENCING ");
if (tgoldtable != NULL)
appendStringInfo(&buf, "OLD TABLE AS %s ",
quote_identifier(tgoldtable));
if (tgnewtable != NULL)
appendStringInfo(&buf, "NEW TABLE AS %s ",
quote_identifier(tgnewtable));
}
if (TRIGGER_FOR_ROW(trigrec->tgtype))
appendStringInfoString(&buf, "FOR EACH ROW ");
else
appendStringInfoString(&buf, "FOR EACH STATEMENT ");
/* If the trigger has a WHEN qualification, add that */
value = fastgetattr(ht_trig, Anum_pg_trigger_tgqual,
tgrel->rd_att, &isnull);
if (!isnull)
{
Node *qual;
char relkind;
deparse_context context;
deparse_namespace dpns;
RangeTblEntry *oldrte;
RangeTblEntry *newrte;
appendStringInfoString(&buf, "WHEN (");
qual = stringToNode(TextDatumGetCString(value));
relkind = get_rel_relkind(trigrec->tgrelid);
/* Build minimal OLD and NEW RTEs for the rel */
oldrte = makeNode(RangeTblEntry);
oldrte->rtekind = RTE_RELATION;
oldrte->relid = trigrec->tgrelid;
oldrte->relkind = relkind;
oldrte->rellockmode = AccessShareLock;
Improve ruleutils.c's heuristics for dealing with rangetable aliases. The previous scheme had bugs in some corner cases involving tables that had been renamed since a view was made. This could result in dumped views that failed to reload or reloaded incorrectly, as seen in bug #7553 from Lloyd Albin, as well as in some pgsql-hackers discussion back in January. Also, its behavior for printing EXPLAIN plans was sometimes confusing because of willingness to use the same alias for multiple RTEs (it was Ashutosh Bapat's complaint about that aspect that started the January thread). To fix, ensure that each RTE in the query has a unique unqualified alias, by modifying the alias if necessary (we add "_" and digits as needed to create a non-conflicting name). Then we can just print its variables with that alias, avoiding the confusing and bug-prone scheme of sometimes schema-qualifying variable names. In EXPLAIN, it proves to be expedient to take the further step of only assigning such aliases to RTEs that are actually referenced in the query, since the planner has a habit of generating extra RTEs with the same alias in situations such as inheritance-tree expansion. Although this fixes a bug of very long standing, I'm hesitant to back-patch such a noticeable behavioral change. My experiments while creating a regression test convinced me that actually incorrect output (as opposed to confusing output) occurs only in very narrow cases, which is backed up by the lack of previous complaints from the field. So we may be better off living with it in released branches; and in any case it'd be smart to let this ripen awhile in HEAD before we consider back-patching it.
2012-09-22 01:03:10 +02:00
oldrte->alias = makeAlias("old", NIL);
oldrte->eref = oldrte->alias;
oldrte->lateral = false;
oldrte->inh = false;
oldrte->inFromCl = true;
newrte = makeNode(RangeTblEntry);
newrte->rtekind = RTE_RELATION;
newrte->relid = trigrec->tgrelid;
newrte->relkind = relkind;
newrte->rellockmode = AccessShareLock;
Improve ruleutils.c's heuristics for dealing with rangetable aliases. The previous scheme had bugs in some corner cases involving tables that had been renamed since a view was made. This could result in dumped views that failed to reload or reloaded incorrectly, as seen in bug #7553 from Lloyd Albin, as well as in some pgsql-hackers discussion back in January. Also, its behavior for printing EXPLAIN plans was sometimes confusing because of willingness to use the same alias for multiple RTEs (it was Ashutosh Bapat's complaint about that aspect that started the January thread). To fix, ensure that each RTE in the query has a unique unqualified alias, by modifying the alias if necessary (we add "_" and digits as needed to create a non-conflicting name). Then we can just print its variables with that alias, avoiding the confusing and bug-prone scheme of sometimes schema-qualifying variable names. In EXPLAIN, it proves to be expedient to take the further step of only assigning such aliases to RTEs that are actually referenced in the query, since the planner has a habit of generating extra RTEs with the same alias in situations such as inheritance-tree expansion. Although this fixes a bug of very long standing, I'm hesitant to back-patch such a noticeable behavioral change. My experiments while creating a regression test convinced me that actually incorrect output (as opposed to confusing output) occurs only in very narrow cases, which is backed up by the lack of previous complaints from the field. So we may be better off living with it in released branches; and in any case it'd be smart to let this ripen awhile in HEAD before we consider back-patching it.
2012-09-22 01:03:10 +02:00
newrte->alias = makeAlias("new", NIL);
newrte->eref = newrte->alias;
newrte->lateral = false;
newrte->inh = false;
newrte->inFromCl = true;
/* Build two-element rtable */
memset(&dpns, 0, sizeof(dpns));
dpns.rtable = list_make2(oldrte, newrte);
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
dpns.subplans = NIL;
dpns.ctes = NIL;
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
dpns.appendrels = NULL;
Improve ruleutils.c's heuristics for dealing with rangetable aliases. The previous scheme had bugs in some corner cases involving tables that had been renamed since a view was made. This could result in dumped views that failed to reload or reloaded incorrectly, as seen in bug #7553 from Lloyd Albin, as well as in some pgsql-hackers discussion back in January. Also, its behavior for printing EXPLAIN plans was sometimes confusing because of willingness to use the same alias for multiple RTEs (it was Ashutosh Bapat's complaint about that aspect that started the January thread). To fix, ensure that each RTE in the query has a unique unqualified alias, by modifying the alias if necessary (we add "_" and digits as needed to create a non-conflicting name). Then we can just print its variables with that alias, avoiding the confusing and bug-prone scheme of sometimes schema-qualifying variable names. In EXPLAIN, it proves to be expedient to take the further step of only assigning such aliases to RTEs that are actually referenced in the query, since the planner has a habit of generating extra RTEs with the same alias in situations such as inheritance-tree expansion. Although this fixes a bug of very long standing, I'm hesitant to back-patch such a noticeable behavioral change. My experiments while creating a regression test convinced me that actually incorrect output (as opposed to confusing output) occurs only in very narrow cases, which is backed up by the lack of previous complaints from the field. So we may be better off living with it in released branches; and in any case it'd be smart to let this ripen awhile in HEAD before we consider back-patching it.
2012-09-22 01:03:10 +02:00
set_rtable_names(&dpns, NIL, NULL);
set_simple_column_names(&dpns);
/* Set up context with one-deep namespace stack */
context.buf = &buf;
context.namespaces = list_make1(&dpns);
context.windowClause = NIL;
context.windowTList = NIL;
context.varprefix = true;
context.prettyFlags = GET_PRETTY_FLAGS(pretty);
context.wrapColumn = WRAP_COLUMN_DEFAULT;
context.indentLevel = PRETTYINDENT_STD;
Support GROUPING SETS, CUBE and ROLLUP. This SQL standard functionality allows to aggregate data by different GROUP BY clauses at once. Each grouping set returns rows with columns grouped by in other sets set to NULL. This could previously be achieved by doing each grouping as a separate query, conjoined by UNION ALLs. Besides being considerably more concise, grouping sets will in many cases be faster, requiring only one scan over the underlying data. The current implementation of grouping sets only supports using sorting for input. Individual sets that share a sort order are computed in one pass. If there are sets that don't share a sort order, additional sort & aggregation steps are performed. These additional passes are sourced by the previous sort step; thus avoiding repeated scans of the source data. The code is structured in a way that adding support for purely using hash aggregation or a mix of hashing and sorting is possible. Sorting was chosen to be supported first, as it is the most generic method of implementation. Instead of, as in an earlier versions of the patch, representing the chain of sort and aggregation steps as full blown planner and executor nodes, all but the first sort are performed inside the aggregation node itself. This avoids the need to do some unusual gymnastics to handle having to return aggregated and non-aggregated tuples from underlying nodes, as well as having to shut down underlying nodes early to limit memory usage. The optimizer still builds Sort/Agg node to describe each phase, but they're not part of the plan tree, but instead additional data for the aggregation node. They're a convenient and preexisting way to describe aggregation and sorting. The first (and possibly only) sort step is still performed as a separate execution step. That retains similarity with existing group by plans, makes rescans fairly simple, avoids very deep plans (leading to slow explains) and easily allows to avoid the sorting step if the underlying data is sorted by other means. A somewhat ugly side of this patch is having to deal with a grammar ambiguity between the new CUBE keyword and the cube extension/functions named cube (and rollup). To avoid breaking existing deployments of the cube extension it has not been renamed, neither has cube been made a reserved keyword. Instead precedence hacking is used to make GROUP BY cube(..) refer to the CUBE grouping sets feature, and not the function cube(). To actually group by a function cube(), unlikely as that might be, the function name has to be quoted. Needs a catversion bump because stored rules may change. Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
2015-05-16 03:40:59 +02:00
context.special_exprkind = EXPR_KIND_NONE;
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
context.appendparents = NULL;
get_rule_expr(qual, &context, false);
appendStringInfoString(&buf, ") ");
}
appendStringInfo(&buf, "EXECUTE FUNCTION %s(",
generate_function_name(trigrec->tgfoid, 0,
NIL, NULL,
Support GROUPING SETS, CUBE and ROLLUP. This SQL standard functionality allows to aggregate data by different GROUP BY clauses at once. Each grouping set returns rows with columns grouped by in other sets set to NULL. This could previously be achieved by doing each grouping as a separate query, conjoined by UNION ALLs. Besides being considerably more concise, grouping sets will in many cases be faster, requiring only one scan over the underlying data. The current implementation of grouping sets only supports using sorting for input. Individual sets that share a sort order are computed in one pass. If there are sets that don't share a sort order, additional sort & aggregation steps are performed. These additional passes are sourced by the previous sort step; thus avoiding repeated scans of the source data. The code is structured in a way that adding support for purely using hash aggregation or a mix of hashing and sorting is possible. Sorting was chosen to be supported first, as it is the most generic method of implementation. Instead of, as in an earlier versions of the patch, representing the chain of sort and aggregation steps as full blown planner and executor nodes, all but the first sort are performed inside the aggregation node itself. This avoids the need to do some unusual gymnastics to handle having to return aggregated and non-aggregated tuples from underlying nodes, as well as having to shut down underlying nodes early to limit memory usage. The optimizer still builds Sort/Agg node to describe each phase, but they're not part of the plan tree, but instead additional data for the aggregation node. They're a convenient and preexisting way to describe aggregation and sorting. The first (and possibly only) sort step is still performed as a separate execution step. That retains similarity with existing group by plans, makes rescans fairly simple, avoids very deep plans (leading to slow explains) and easily allows to avoid the sorting step if the underlying data is sorted by other means. A somewhat ugly side of this patch is having to deal with a grammar ambiguity between the new CUBE keyword and the cube extension/functions named cube (and rollup). To avoid breaking existing deployments of the cube extension it has not been renamed, neither has cube been made a reserved keyword. Instead precedence hacking is used to make GROUP BY cube(..) refer to the CUBE grouping sets feature, and not the function cube(). To actually group by a function cube(), unlikely as that might be, the function name has to be quoted. Needs a catversion bump because stored rules may change. Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
2015-05-16 03:40:59 +02:00
false, NULL, EXPR_KIND_NONE));
if (trigrec->tgnargs > 0)
{
char *p;
int i;
value = fastgetattr(ht_trig, Anum_pg_trigger_tgargs,
tgrel->rd_att, &isnull);
if (isnull)
elog(ERROR, "tgargs is null for trigger %u", trigid);
p = (char *) VARDATA_ANY(DatumGetByteaPP(value));
for (i = 0; i < trigrec->tgnargs; i++)
{
if (i > 0)
appendStringInfoString(&buf, ", ");
simple_quote_literal(&buf, p);
/* advance p to next string embedded in tgargs */
while (*p)
p++;
p++;
}
}
/* We deliberately do not put semi-colon at end */
appendStringInfoChar(&buf, ')');
/* Clean up */
systable_endscan(tgscan);
table_close(tgrel, AccessShareLock);
return buf.data;
}
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
/* ----------
* pg_get_indexdef - Get the definition of an index
*
* In the extended version, there is a colno argument as well as pretty bool.
* if colno == 0, we want a complete index definition.
* if colno > 0, we only want the Nth index key's variable or expression.
*
* Note that the SQL-function versions of this omit any info about the
* index tablespace; this is intentional because pg_dump wants it that way.
* However pg_get_indexdef_string() includes the index tablespace.
* ----------
*/
Datum
pg_get_indexdef(PG_FUNCTION_ARGS)
{
Oid indexrelid = PG_GETARG_OID(0);
int prettyFlags;
char *res;
prettyFlags = PRETTYFLAG_INDENT;
res = pg_get_indexdef_worker(indexrelid, 0, NULL,
false, false,
false, false,
prettyFlags, true);
if (res == NULL)
PG_RETURN_NULL();
PG_RETURN_TEXT_P(string_to_text(res));
}
Datum
pg_get_indexdef_ext(PG_FUNCTION_ARGS)
{
Oid indexrelid = PG_GETARG_OID(0);
int32 colno = PG_GETARG_INT32(1);
bool pretty = PG_GETARG_BOOL(2);
int prettyFlags;
char *res;
prettyFlags = GET_PRETTY_FLAGS(pretty);
res = pg_get_indexdef_worker(indexrelid, colno, NULL,
colno != 0, false,
false, false,
prettyFlags, true);
if (res == NULL)
PG_RETURN_NULL();
PG_RETURN_TEXT_P(string_to_text(res));
}
/*
* Internal version for use by ALTER TABLE.
* Includes a tablespace clause in the result.
* Returns a palloc'd C string; no pretty-printing.
*/
char *
pg_get_indexdef_string(Oid indexrelid)
{
return pg_get_indexdef_worker(indexrelid, 0, NULL,
false, false,
true, true,
0, false);
}
/* Internal version that just reports the key-column definitions */
char *
pg_get_indexdef_columns(Oid indexrelid, bool pretty)
{
int prettyFlags;
prettyFlags = GET_PRETTY_FLAGS(pretty);
return pg_get_indexdef_worker(indexrelid, 0, NULL,
true, true,
false, false,
prettyFlags, false);
}
pageinspect: Fix gist_page_items() with included columns Non-leaf pages of GiST indexes contain key attributes, leaf pages contain both key and non-key attributes, and gist_page_items() ignored the handling of non-key attributes. This caused a few problems when using gist_page_items() on a GiST index with INCLUDE: - On a non-leaf page, the function would crash. - On a leaf page, the function would work, but miss to display all the values for included attributes. This commit fixes gist_page_items() to handle such cases in a more appropriate way, and now displays the values of key and non-key attributes for each item separately in a style consistent with what ruleutils.c would generate for the attribute list, depending on the page type dealt with. In a way similar to how a record is displayed, values would be double-quoted for key or non-key attributes if required. ruleutils.c did not provide a routine able to control if non-key attributes should be displayed, so an extended() routine for index definitions is added to work around the leaf and non-leaf page differences. While on it, this commit fixes a third problem related to the amount of data reported for key attributes. The code originally relied on BuildIndexValueDescription() (used for error reports on constraints) that would not print all the data stored in the index but the index opclass's input type, so this limited the amount of information available. This switch makes gist_page_items() much cheaper as there is no need to run ACL checks for each item printed, which is not an issue anyway as superuser rights are required to execute the functions of pageinspect. Opclasses whose data cannot be displayed can rely on gist_page_items_bytea(). The documentation of this function was slightly incorrect for the output results generated on HEAD and v15, so adjust it on these branches. Author: Alexander Lakhin, Michael Paquier Discussion: https://postgr.es/m/17884-cb8c326522977acb@postgresql.org Backpatch-through: 14
2023-05-19 05:37:58 +02:00
/* Internal version, extensible with flags to control its behavior */
char *
pg_get_indexdef_columns_extended(Oid indexrelid, bits16 flags)
{
bool pretty = ((flags & RULE_INDEXDEF_PRETTY) != 0);
bool keys_only = ((flags & RULE_INDEXDEF_KEYS_ONLY) != 0);
int prettyFlags;
prettyFlags = GET_PRETTY_FLAGS(pretty);
return pg_get_indexdef_worker(indexrelid, 0, NULL,
true, keys_only,
false, false,
prettyFlags, false);
}
/*
* Internal workhorse to decompile an index definition.
*
* This is now used for exclusion constraints as well: if excludeOps is not
* NULL then it points to an array of exclusion operator OIDs.
*/
static char *
pg_get_indexdef_worker(Oid indexrelid, int colno,
const Oid *excludeOps,
bool attrsOnly, bool keysOnly,
bool showTblSpc, bool inherits,
int prettyFlags, bool missing_ok)
{
/* might want a separate isConstraint parameter later */
bool isConstraint = (excludeOps != NULL);
HeapTuple ht_idx;
HeapTuple ht_idxrel;
HeapTuple ht_am;
Form_pg_index idxrec;
Form_pg_class idxrelrec;
Form_pg_am amrec;
IndexAmRoutine *amroutine;
List *indexprs;
ListCell *indexpr_item;
List *context;
Oid indrelid;
int keyno;
Datum indcollDatum;
Datum indclassDatum;
Datum indoptionDatum;
oidvector *indcollation;
oidvector *indclass;
int2vector *indoption;
StringInfoData buf;
char *str;
char *sep;
/*
* Fetch the pg_index tuple by the Oid of the index
*/
ht_idx = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(indexrelid));
if (!HeapTupleIsValid(ht_idx))
{
if (missing_ok)
return NULL;
elog(ERROR, "cache lookup failed for index %u", indexrelid);
}
idxrec = (Form_pg_index) GETSTRUCT(ht_idx);
indrelid = idxrec->indrelid;
Assert(indexrelid == idxrec->indexrelid);
/* Must get indcollation, indclass, and indoption the hard way */
indcollDatum = SysCacheGetAttrNotNull(INDEXRELID, ht_idx,
Anum_pg_index_indcollation);
indcollation = (oidvector *) DatumGetPointer(indcollDatum);
indclassDatum = SysCacheGetAttrNotNull(INDEXRELID, ht_idx,
Anum_pg_index_indclass);
indclass = (oidvector *) DatumGetPointer(indclassDatum);
indoptionDatum = SysCacheGetAttrNotNull(INDEXRELID, ht_idx,
Anum_pg_index_indoption);
indoption = (int2vector *) DatumGetPointer(indoptionDatum);
/*
* Fetch the pg_class tuple of the index relation
*/
ht_idxrel = SearchSysCache1(RELOID, ObjectIdGetDatum(indexrelid));
if (!HeapTupleIsValid(ht_idxrel))
elog(ERROR, "cache lookup failed for relation %u", indexrelid);
idxrelrec = (Form_pg_class) GETSTRUCT(ht_idxrel);
/*
* Fetch the pg_am tuple of the index' access method
*/
ht_am = SearchSysCache1(AMOID, ObjectIdGetDatum(idxrelrec->relam));
if (!HeapTupleIsValid(ht_am))
elog(ERROR, "cache lookup failed for access method %u",
idxrelrec->relam);
amrec = (Form_pg_am) GETSTRUCT(ht_am);
/* Fetch the index AM's API struct */
amroutine = GetIndexAmRoutine(amrec->amhandler);
/*
* Get the index expressions, if any. (NOTE: we do not use the relcache
* versions of the expressions and predicate, because we want to display
* non-const-folded expressions.)
*/
if (!heap_attisnull(ht_idx, Anum_pg_index_indexprs, NULL))
{
Datum exprsDatum;
char *exprsString;
exprsDatum = SysCacheGetAttrNotNull(INDEXRELID, ht_idx,
Anum_pg_index_indexprs);
exprsString = TextDatumGetCString(exprsDatum);
indexprs = (List *) stringToNode(exprsString);
pfree(exprsString);
}
else
indexprs = NIL;
indexpr_item = list_head(indexprs);
context = deparse_context_for(get_relation_name(indrelid), indrelid);
/*
* Start the index definition. Note that the index's name should never be
* schema-qualified, but the indexed rel's name may be.
*/
initStringInfo(&buf);
if (!attrsOnly)
{
if (!isConstraint)
appendStringInfo(&buf, "CREATE %sINDEX %s ON %s%s USING %s (",
idxrec->indisunique ? "UNIQUE " : "",
quote_identifier(NameStr(idxrelrec->relname)),
idxrelrec->relkind == RELKIND_PARTITIONED_INDEX
&& !inherits ? "ONLY " : "",
(prettyFlags & PRETTYFLAG_SCHEMA) ?
generate_relation_name(indrelid, NIL) :
generate_qualified_relation_name(indrelid),
quote_identifier(NameStr(amrec->amname)));
else /* currently, must be EXCLUDE constraint */
appendStringInfo(&buf, "EXCLUDE USING %s (",
quote_identifier(NameStr(amrec->amname)));
}
1999-05-25 18:15:34 +02:00
/*
* Report the indexed attributes
*/
sep = "";
for (keyno = 0; keyno < idxrec->indnatts; keyno++)
{
AttrNumber attnum = idxrec->indkey.values[keyno];
Oid keycoltype;
Oid keycolcollation;
/*
* Ignore non-key attributes if told to.
*/
if (keysOnly && keyno >= idxrec->indnkeyatts)
break;
/* Otherwise, print INCLUDE to divide key and non-key attrs. */
if (!colno && keyno == idxrec->indnkeyatts)
{
appendStringInfoString(&buf, ") INCLUDE (");
sep = "";
}
if (!colno)
appendStringInfoString(&buf, sep);
sep = ", ";
if (attnum != 0)
{
/* Simple index column */
char *attname;
int32 keycoltypmod;
attname = get_attname(indrelid, attnum, false);
if (!colno || colno == keyno + 1)
appendStringInfoString(&buf, quote_identifier(attname));
get_atttypetypmodcoll(indrelid, attnum,
&keycoltype, &keycoltypmod,
&keycolcollation);
}
else
{
/* expressional index */
Node *indexkey;
if (indexpr_item == NULL)
elog(ERROR, "too few entries in indexprs list");
indexkey = (Node *) lfirst(indexpr_item);
Represent Lists as expansible arrays, not chains of cons-cells. Originally, Postgres Lists were a more or less exact reimplementation of Lisp lists, which consist of chains of separately-allocated cons cells, each having a value and a next-cell link. We'd hacked that once before (commit d0b4399d8) to add a separate List header, but the data was still in cons cells. That makes some operations -- notably list_nth() -- O(N), and it's bulky because of the next-cell pointers and per-cell palloc overhead, and it's very cache-unfriendly if the cons cells end up scattered around rather than being adjacent. In this rewrite, we still have List headers, but the data is in a resizable array of values, with no next-cell links. Now we need at most two palloc's per List, and often only one, since we can allocate some values in the same palloc call as the List header. (Of course, extending an existing List may require repalloc's to enlarge the array. But this involves just O(log N) allocations not O(N).) Of course this is not without downsides. The key difficulty is that addition or deletion of a list entry may now cause other entries to move, which it did not before. For example, that breaks foreach() and sister macros, which historically used a pointer to the current cons-cell as loop state. We can repair those macros transparently by making their actual loop state be an integer list index; the exposed "ListCell *" pointer is no longer state carried across loop iterations, but is just a derived value. (In practice, modern compilers can optimize things back to having just one loop state value, at least for simple cases with inline loop bodies.) In principle, this is a semantics change for cases where the loop body inserts or deletes list entries ahead of the current loop index; but I found no such cases in the Postgres code. The change is not at all transparent for code that doesn't use foreach() but chases lists "by hand" using lnext(). The largest share of such code in the backend is in loops that were maintaining "prev" and "next" variables in addition to the current-cell pointer, in order to delete list cells efficiently using list_delete_cell(). However, we no longer need a previous-cell pointer to delete a list cell efficiently. Keeping a next-cell pointer doesn't work, as explained above, but we can improve matters by changing such code to use a regular foreach() loop and then using the new macro foreach_delete_current() to delete the current cell. (This macro knows how to update the associated foreach loop's state so that no cells will be missed in the traversal.) There remains a nontrivial risk of code assuming that a ListCell * pointer will remain good over an operation that could now move the list contents. To help catch such errors, list.c can be compiled with a new define symbol DEBUG_LIST_MEMORY_USAGE that forcibly moves list contents whenever that could possibly happen. This makes list operations significantly more expensive so it's not normally turned on (though it is on by default if USE_VALGRIND is on). There are two notable API differences from the previous code: * lnext() now requires the List's header pointer in addition to the current cell's address. * list_delete_cell() no longer requires a previous-cell argument. These changes are somewhat unfortunate, but on the other hand code using either function needs inspection to see if it is assuming anything it shouldn't, so it's not all bad. Programmers should be aware of these significant performance changes: * list_nth() and related functions are now O(1); so there's no major access-speed difference between a list and an array. * Inserting or deleting a list element now takes time proportional to the distance to the end of the list, due to moving the array elements. (However, it typically *doesn't* require palloc or pfree, so except in long lists it's probably still faster than before.) Notably, lcons() used to be about the same cost as lappend(), but that's no longer true if the list is long. Code that uses lcons() and list_delete_first() to maintain a stack might usefully be rewritten to push and pop at the end of the list rather than the beginning. * There are now list_insert_nth...() and list_delete_nth...() functions that add or remove a list cell identified by index. These have the data-movement penalty explained above, but there's no search penalty. * list_concat() and variants now copy the second list's data into storage belonging to the first list, so there is no longer any sharing of cells between the input lists. The second argument is now declared "const List *" to reflect that it isn't changed. This patch just does the minimum needed to get the new implementation in place and fix bugs exposed by the regression tests. As suggested by the foregoing, there's a fair amount of followup work remaining to do. Also, the ENABLE_LIST_COMPAT macros are finally removed in this commit. Code using those should have been gone a dozen years ago. Patch by me; thanks to David Rowley, Jesper Pedersen, and others for review. Discussion: https://postgr.es/m/11587.1550975080@sss.pgh.pa.us
2019-07-15 19:41:58 +02:00
indexpr_item = lnext(indexprs, indexpr_item);
/* Deparse */
str = deparse_expression_pretty(indexkey, context, false, false,
prettyFlags, 0);
if (!colno || colno == keyno + 1)
{
/* Need parens if it's not a bare function call */
if (looks_like_function(indexkey))
appendStringInfoString(&buf, str);
else
appendStringInfo(&buf, "(%s)", str);
}
keycoltype = exprType(indexkey);
keycolcollation = exprCollation(indexkey);
}
/* Print additional decoration for (selected) key columns */
if (!attrsOnly && keyno < idxrec->indnkeyatts &&
(!colno || colno == keyno + 1))
{
int16 opt = indoption->values[keyno];
Oid indcoll = indcollation->values[keyno];
Implement operator class parameters PostgreSQL provides set of template index access methods, where opclasses have much freedom in the semantics of indexing. These index AMs are GiST, GIN, SP-GiST and BRIN. There opclasses define representation of keys, operations on them and supported search strategies. So, it's natural that opclasses may be faced some tradeoffs, which require user-side decision. This commit implements opclass parameters allowing users to set some values, which tell opclass how to index the particular dataset. This commit doesn't introduce new storage in system catalog. Instead it uses pg_attribute.attoptions, which is used for table column storage options but unused for index attributes. In order to evade changing signature of each opclass support function, we implement unified way to pass options to opclass support functions. Options are set to fn_expr as the constant bytea expression. It's possible due to the fact that opclass support functions are executed outside of expressions, so fn_expr is unused for them. This commit comes with some examples of opclass options usage. We parametrize signature length in GiST. That applies to multiple opclasses: tsvector_ops, gist__intbig_ops, gist_ltree_ops, gist__ltree_ops, gist_trgm_ops and gist_hstore_ops. Also we parametrize maximum number of integer ranges for gist__int_ops. However, the main future usage of this feature is expected to be json, where users would be able to specify which way to index particular json parts. Catversion is bumped. Discussion: https://postgr.es/m/d22c3a18-31c7-1879-fc11-4c1ce2f5e5af%40postgrespro.ru Author: Nikita Glukhov, revised by me Reviwed-by: Nikolay Shaplov, Robert Haas, Tom Lane, Tomas Vondra, Alvaro Herrera
2020-03-30 18:17:11 +02:00
Datum attoptions = get_attoptions(indexrelid, keyno + 1);
bool has_options = attoptions != (Datum) 0;
/* Add collation, if not default for column */
if (OidIsValid(indcoll) && indcoll != keycolcollation)
appendStringInfo(&buf, " COLLATE %s",
generate_collation_name((indcoll)));
/* Add the operator class name, if not default */
Implement operator class parameters PostgreSQL provides set of template index access methods, where opclasses have much freedom in the semantics of indexing. These index AMs are GiST, GIN, SP-GiST and BRIN. There opclasses define representation of keys, operations on them and supported search strategies. So, it's natural that opclasses may be faced some tradeoffs, which require user-side decision. This commit implements opclass parameters allowing users to set some values, which tell opclass how to index the particular dataset. This commit doesn't introduce new storage in system catalog. Instead it uses pg_attribute.attoptions, which is used for table column storage options but unused for index attributes. In order to evade changing signature of each opclass support function, we implement unified way to pass options to opclass support functions. Options are set to fn_expr as the constant bytea expression. It's possible due to the fact that opclass support functions are executed outside of expressions, so fn_expr is unused for them. This commit comes with some examples of opclass options usage. We parametrize signature length in GiST. That applies to multiple opclasses: tsvector_ops, gist__intbig_ops, gist_ltree_ops, gist__ltree_ops, gist_trgm_ops and gist_hstore_ops. Also we parametrize maximum number of integer ranges for gist__int_ops. However, the main future usage of this feature is expected to be json, where users would be able to specify which way to index particular json parts. Catversion is bumped. Discussion: https://postgr.es/m/d22c3a18-31c7-1879-fc11-4c1ce2f5e5af%40postgrespro.ru Author: Nikita Glukhov, revised by me Reviwed-by: Nikolay Shaplov, Robert Haas, Tom Lane, Tomas Vondra, Alvaro Herrera
2020-03-30 18:17:11 +02:00
get_opclass_name(indclass->values[keyno],
has_options ? InvalidOid : keycoltype, &buf);
if (has_options)
{
appendStringInfoString(&buf, " (");
get_reloptions(&buf, attoptions);
appendStringInfoChar(&buf, ')');
}
/* Add options if relevant */
if (amroutine->amcanorder)
{
/* if it supports sort ordering, report DESC and NULLS opts */
if (opt & INDOPTION_DESC)
{
appendStringInfoString(&buf, " DESC");
/* NULLS FIRST is the default in this case */
if (!(opt & INDOPTION_NULLS_FIRST))
appendStringInfoString(&buf, " NULLS LAST");
}
else
{
if (opt & INDOPTION_NULLS_FIRST)
appendStringInfoString(&buf, " NULLS FIRST");
}
}
/* Add the exclusion operator if relevant */
if (excludeOps != NULL)
appendStringInfo(&buf, " WITH %s",
generate_operator_name(excludeOps[keyno],
keycoltype,
keycoltype));
}
}
if (!attrsOnly)
{
appendStringInfoChar(&buf, ')');
if (idxrec->indnullsnotdistinct)
appendStringInfoString(&buf, " NULLS NOT DISTINCT");
/*
* If it has options, append "WITH (options)"
*/
str = flatten_reloptions(indexrelid);
if (str)
{
appendStringInfo(&buf, " WITH (%s)", str);
pfree(str);
}
/*
* Print tablespace, but only if requested
*/
if (showTblSpc)
{
Oid tblspc;
tblspc = get_rel_tablespace(indexrelid);
if (OidIsValid(tblspc))
{
if (isConstraint)
appendStringInfoString(&buf, " USING INDEX");
appendStringInfo(&buf, " TABLESPACE %s",
quote_identifier(get_tablespace_name(tblspc)));
}
}
/*
* If it's a partial index, decompile and append the predicate
*/
if (!heap_attisnull(ht_idx, Anum_pg_index_indpred, NULL))
{
Node *node;
Datum predDatum;
char *predString;
/* Convert text string to node tree */
predDatum = SysCacheGetAttrNotNull(INDEXRELID, ht_idx,
Anum_pg_index_indpred);
predString = TextDatumGetCString(predDatum);
node = (Node *) stringToNode(predString);
pfree(predString);
/* Deparse */
str = deparse_expression_pretty(node, context, false, false,
prettyFlags, 0);
if (isConstraint)
appendStringInfo(&buf, " WHERE (%s)", str);
else
appendStringInfo(&buf, " WHERE %s", str);
}
}
/* Clean up */
ReleaseSysCache(ht_idx);
ReleaseSysCache(ht_idxrel);
ReleaseSysCache(ht_am);
return buf.data;
}
/* ----------
* pg_get_querydef
*
* Public entry point to deparse one query parsetree.
* The pretty flags are determined by GET_PRETTY_FLAGS(pretty).
*
* The result is a palloc'd C string.
* ----------
*/
char *
pg_get_querydef(Query *query, bool pretty)
{
StringInfoData buf;
int prettyFlags;
prettyFlags = GET_PRETTY_FLAGS(pretty);
initStringInfo(&buf);
get_query_def(query, &buf, NIL, NULL, true,
prettyFlags, WRAP_COLUMN_DEFAULT, 0);
return buf.data;
}
Implement multivariate n-distinct coefficients Add support for explicitly declared statistic objects (CREATE STATISTICS), allowing collection of statistics on more complex combinations that individual table columns. Companion commands DROP STATISTICS and ALTER STATISTICS ... OWNER TO / SET SCHEMA / RENAME are added too. All this DDL has been designed so that more statistic types can be added later on, such as multivariate most-common-values and multivariate histograms between columns of a single table, leaving room for permitting columns on multiple tables, too, as well as expressions. This commit only adds support for collection of n-distinct coefficient on user-specified sets of columns in a single table. This is useful to estimate number of distinct groups in GROUP BY and DISTINCT clauses; estimation errors there can cause over-allocation of memory in hashed aggregates, for instance, so it's a worthwhile problem to solve. A new special pseudo-type pg_ndistinct is used. (num-distinct estimation was deemed sufficiently useful by itself that this is worthwhile even if no further statistic types are added immediately; so much so that another version of essentially the same functionality was submitted by Kyotaro Horiguchi: https://postgr.es/m/20150828.173334.114731693.horiguchi.kyotaro@lab.ntt.co.jp though this commit does not use that code.) Author: Tomas Vondra. Some code rework by Álvaro. Reviewed-by: Dean Rasheed, David Rowley, Kyotaro Horiguchi, Jeff Janes, Ideriha Takeshi Discussion: https://postgr.es/m/543AFA15.4080608@fuzzy.cz https://postgr.es/m/20170320190220.ixlaueanxegqd5gr@alvherre.pgsql
2017-03-24 18:06:10 +01:00
/*
* pg_get_statisticsobjdef
Implement multivariate n-distinct coefficients Add support for explicitly declared statistic objects (CREATE STATISTICS), allowing collection of statistics on more complex combinations that individual table columns. Companion commands DROP STATISTICS and ALTER STATISTICS ... OWNER TO / SET SCHEMA / RENAME are added too. All this DDL has been designed so that more statistic types can be added later on, such as multivariate most-common-values and multivariate histograms between columns of a single table, leaving room for permitting columns on multiple tables, too, as well as expressions. This commit only adds support for collection of n-distinct coefficient on user-specified sets of columns in a single table. This is useful to estimate number of distinct groups in GROUP BY and DISTINCT clauses; estimation errors there can cause over-allocation of memory in hashed aggregates, for instance, so it's a worthwhile problem to solve. A new special pseudo-type pg_ndistinct is used. (num-distinct estimation was deemed sufficiently useful by itself that this is worthwhile even if no further statistic types are added immediately; so much so that another version of essentially the same functionality was submitted by Kyotaro Horiguchi: https://postgr.es/m/20150828.173334.114731693.horiguchi.kyotaro@lab.ntt.co.jp though this commit does not use that code.) Author: Tomas Vondra. Some code rework by Álvaro. Reviewed-by: Dean Rasheed, David Rowley, Kyotaro Horiguchi, Jeff Janes, Ideriha Takeshi Discussion: https://postgr.es/m/543AFA15.4080608@fuzzy.cz https://postgr.es/m/20170320190220.ixlaueanxegqd5gr@alvherre.pgsql
2017-03-24 18:06:10 +01:00
* Get the definition of an extended statistics object
*/
Datum
pg_get_statisticsobjdef(PG_FUNCTION_ARGS)
Implement multivariate n-distinct coefficients Add support for explicitly declared statistic objects (CREATE STATISTICS), allowing collection of statistics on more complex combinations that individual table columns. Companion commands DROP STATISTICS and ALTER STATISTICS ... OWNER TO / SET SCHEMA / RENAME are added too. All this DDL has been designed so that more statistic types can be added later on, such as multivariate most-common-values and multivariate histograms between columns of a single table, leaving room for permitting columns on multiple tables, too, as well as expressions. This commit only adds support for collection of n-distinct coefficient on user-specified sets of columns in a single table. This is useful to estimate number of distinct groups in GROUP BY and DISTINCT clauses; estimation errors there can cause over-allocation of memory in hashed aggregates, for instance, so it's a worthwhile problem to solve. A new special pseudo-type pg_ndistinct is used. (num-distinct estimation was deemed sufficiently useful by itself that this is worthwhile even if no further statistic types are added immediately; so much so that another version of essentially the same functionality was submitted by Kyotaro Horiguchi: https://postgr.es/m/20150828.173334.114731693.horiguchi.kyotaro@lab.ntt.co.jp though this commit does not use that code.) Author: Tomas Vondra. Some code rework by Álvaro. Reviewed-by: Dean Rasheed, David Rowley, Kyotaro Horiguchi, Jeff Janes, Ideriha Takeshi Discussion: https://postgr.es/m/543AFA15.4080608@fuzzy.cz https://postgr.es/m/20170320190220.ixlaueanxegqd5gr@alvherre.pgsql
2017-03-24 18:06:10 +01:00
{
Oid statextid = PG_GETARG_OID(0);
char *res;
Extended statistics on expressions Allow defining extended statistics on expressions, not just just on simple column references. With this commit, expressions are supported by all existing extended statistics kinds, improving the same types of estimates. A simple example may look like this: CREATE TABLE t (a int); CREATE STATISTICS s ON mod(a,10), mod(a,20) FROM t; ANALYZE t; The collected statistics are useful e.g. to estimate queries with those expressions in WHERE or GROUP BY clauses: SELECT * FROM t WHERE mod(a,10) = 0 AND mod(a,20) = 0; SELECT 1 FROM t GROUP BY mod(a,10), mod(a,20); This introduces new internal statistics kind 'e' (expressions) which is built automatically when the statistics object definition includes any expressions. This represents single-expression statistics, as if there was an expression index (but without the index maintenance overhead). The statistics is stored in pg_statistics_ext_data as an array of composite types, which is possible thanks to 79f6a942bd. CREATE STATISTICS allows building statistics on a single expression, in which case in which case it's not possible to specify statistics kinds. A new system view pg_stats_ext_exprs can be used to display expression statistics, similarly to pg_stats and pg_stats_ext views. ALTER TABLE ... ALTER COLUMN ... TYPE now treats indexes the same way it treats indexes, i.e. it drops and recreates the statistics. This means all statistics are reset, and we no longer try to preserve at least the functional dependencies. This should not be a major issue in practice, as the functional dependencies actually rely on per-column statistics, which were always reset anyway. Author: Tomas Vondra Reviewed-by: Justin Pryzby, Dean Rasheed, Zhihong Yu Discussion: https://postgr.es/m/ad7891d2-e90c-b446-9fe2-7419143847d7%40enterprisedb.com
2021-03-26 23:22:01 +01:00
res = pg_get_statisticsobj_worker(statextid, false, true);
if (res == NULL)
PG_RETURN_NULL();
PG_RETURN_TEXT_P(string_to_text(res));
}
/*
* Internal version for use by ALTER TABLE.
* Includes a tablespace clause in the result.
* Returns a palloc'd C string; no pretty-printing.
*/
char *
pg_get_statisticsobjdef_string(Oid statextid)
{
return pg_get_statisticsobj_worker(statextid, false, false);
}
/*
* pg_get_statisticsobjdef_columns
* Get columns and expressions for an extended statistics object
*/
Datum
pg_get_statisticsobjdef_columns(PG_FUNCTION_ARGS)
{
Oid statextid = PG_GETARG_OID(0);
char *res;
res = pg_get_statisticsobj_worker(statextid, true, true);
Implement multivariate n-distinct coefficients Add support for explicitly declared statistic objects (CREATE STATISTICS), allowing collection of statistics on more complex combinations that individual table columns. Companion commands DROP STATISTICS and ALTER STATISTICS ... OWNER TO / SET SCHEMA / RENAME are added too. All this DDL has been designed so that more statistic types can be added later on, such as multivariate most-common-values and multivariate histograms between columns of a single table, leaving room for permitting columns on multiple tables, too, as well as expressions. This commit only adds support for collection of n-distinct coefficient on user-specified sets of columns in a single table. This is useful to estimate number of distinct groups in GROUP BY and DISTINCT clauses; estimation errors there can cause over-allocation of memory in hashed aggregates, for instance, so it's a worthwhile problem to solve. A new special pseudo-type pg_ndistinct is used. (num-distinct estimation was deemed sufficiently useful by itself that this is worthwhile even if no further statistic types are added immediately; so much so that another version of essentially the same functionality was submitted by Kyotaro Horiguchi: https://postgr.es/m/20150828.173334.114731693.horiguchi.kyotaro@lab.ntt.co.jp though this commit does not use that code.) Author: Tomas Vondra. Some code rework by Álvaro. Reviewed-by: Dean Rasheed, David Rowley, Kyotaro Horiguchi, Jeff Janes, Ideriha Takeshi Discussion: https://postgr.es/m/543AFA15.4080608@fuzzy.cz https://postgr.es/m/20170320190220.ixlaueanxegqd5gr@alvherre.pgsql
2017-03-24 18:06:10 +01:00
if (res == NULL)
PG_RETURN_NULL();
PG_RETURN_TEXT_P(string_to_text(res));
}
/*
* Internal workhorse to decompile an extended statistics object.
*/
static char *
Extended statistics on expressions Allow defining extended statistics on expressions, not just just on simple column references. With this commit, expressions are supported by all existing extended statistics kinds, improving the same types of estimates. A simple example may look like this: CREATE TABLE t (a int); CREATE STATISTICS s ON mod(a,10), mod(a,20) FROM t; ANALYZE t; The collected statistics are useful e.g. to estimate queries with those expressions in WHERE or GROUP BY clauses: SELECT * FROM t WHERE mod(a,10) = 0 AND mod(a,20) = 0; SELECT 1 FROM t GROUP BY mod(a,10), mod(a,20); This introduces new internal statistics kind 'e' (expressions) which is built automatically when the statistics object definition includes any expressions. This represents single-expression statistics, as if there was an expression index (but without the index maintenance overhead). The statistics is stored in pg_statistics_ext_data as an array of composite types, which is possible thanks to 79f6a942bd. CREATE STATISTICS allows building statistics on a single expression, in which case in which case it's not possible to specify statistics kinds. A new system view pg_stats_ext_exprs can be used to display expression statistics, similarly to pg_stats and pg_stats_ext views. ALTER TABLE ... ALTER COLUMN ... TYPE now treats indexes the same way it treats indexes, i.e. it drops and recreates the statistics. This means all statistics are reset, and we no longer try to preserve at least the functional dependencies. This should not be a major issue in practice, as the functional dependencies actually rely on per-column statistics, which were always reset anyway. Author: Tomas Vondra Reviewed-by: Justin Pryzby, Dean Rasheed, Zhihong Yu Discussion: https://postgr.es/m/ad7891d2-e90c-b446-9fe2-7419143847d7%40enterprisedb.com
2021-03-26 23:22:01 +01:00
pg_get_statisticsobj_worker(Oid statextid, bool columns_only, bool missing_ok)
Implement multivariate n-distinct coefficients Add support for explicitly declared statistic objects (CREATE STATISTICS), allowing collection of statistics on more complex combinations that individual table columns. Companion commands DROP STATISTICS and ALTER STATISTICS ... OWNER TO / SET SCHEMA / RENAME are added too. All this DDL has been designed so that more statistic types can be added later on, such as multivariate most-common-values and multivariate histograms between columns of a single table, leaving room for permitting columns on multiple tables, too, as well as expressions. This commit only adds support for collection of n-distinct coefficient on user-specified sets of columns in a single table. This is useful to estimate number of distinct groups in GROUP BY and DISTINCT clauses; estimation errors there can cause over-allocation of memory in hashed aggregates, for instance, so it's a worthwhile problem to solve. A new special pseudo-type pg_ndistinct is used. (num-distinct estimation was deemed sufficiently useful by itself that this is worthwhile even if no further statistic types are added immediately; so much so that another version of essentially the same functionality was submitted by Kyotaro Horiguchi: https://postgr.es/m/20150828.173334.114731693.horiguchi.kyotaro@lab.ntt.co.jp though this commit does not use that code.) Author: Tomas Vondra. Some code rework by Álvaro. Reviewed-by: Dean Rasheed, David Rowley, Kyotaro Horiguchi, Jeff Janes, Ideriha Takeshi Discussion: https://postgr.es/m/543AFA15.4080608@fuzzy.cz https://postgr.es/m/20170320190220.ixlaueanxegqd5gr@alvherre.pgsql
2017-03-24 18:06:10 +01:00
{
Form_pg_statistic_ext statextrec;
HeapTuple statexttup;
StringInfoData buf;
int colno;
char *nsp;
ArrayType *arr;
char *enabled;
Datum datum;
bool ndistinct_enabled;
bool dependencies_enabled;
bool mcv_enabled;
int i;
Extended statistics on expressions Allow defining extended statistics on expressions, not just just on simple column references. With this commit, expressions are supported by all existing extended statistics kinds, improving the same types of estimates. A simple example may look like this: CREATE TABLE t (a int); CREATE STATISTICS s ON mod(a,10), mod(a,20) FROM t; ANALYZE t; The collected statistics are useful e.g. to estimate queries with those expressions in WHERE or GROUP BY clauses: SELECT * FROM t WHERE mod(a,10) = 0 AND mod(a,20) = 0; SELECT 1 FROM t GROUP BY mod(a,10), mod(a,20); This introduces new internal statistics kind 'e' (expressions) which is built automatically when the statistics object definition includes any expressions. This represents single-expression statistics, as if there was an expression index (but without the index maintenance overhead). The statistics is stored in pg_statistics_ext_data as an array of composite types, which is possible thanks to 79f6a942bd. CREATE STATISTICS allows building statistics on a single expression, in which case in which case it's not possible to specify statistics kinds. A new system view pg_stats_ext_exprs can be used to display expression statistics, similarly to pg_stats and pg_stats_ext views. ALTER TABLE ... ALTER COLUMN ... TYPE now treats indexes the same way it treats indexes, i.e. it drops and recreates the statistics. This means all statistics are reset, and we no longer try to preserve at least the functional dependencies. This should not be a major issue in practice, as the functional dependencies actually rely on per-column statistics, which were always reset anyway. Author: Tomas Vondra Reviewed-by: Justin Pryzby, Dean Rasheed, Zhihong Yu Discussion: https://postgr.es/m/ad7891d2-e90c-b446-9fe2-7419143847d7%40enterprisedb.com
2021-03-26 23:22:01 +01:00
List *context;
ListCell *lc;
List *exprs = NIL;
bool has_exprs;
int ncolumns;
Implement multivariate n-distinct coefficients Add support for explicitly declared statistic objects (CREATE STATISTICS), allowing collection of statistics on more complex combinations that individual table columns. Companion commands DROP STATISTICS and ALTER STATISTICS ... OWNER TO / SET SCHEMA / RENAME are added too. All this DDL has been designed so that more statistic types can be added later on, such as multivariate most-common-values and multivariate histograms between columns of a single table, leaving room for permitting columns on multiple tables, too, as well as expressions. This commit only adds support for collection of n-distinct coefficient on user-specified sets of columns in a single table. This is useful to estimate number of distinct groups in GROUP BY and DISTINCT clauses; estimation errors there can cause over-allocation of memory in hashed aggregates, for instance, so it's a worthwhile problem to solve. A new special pseudo-type pg_ndistinct is used. (num-distinct estimation was deemed sufficiently useful by itself that this is worthwhile even if no further statistic types are added immediately; so much so that another version of essentially the same functionality was submitted by Kyotaro Horiguchi: https://postgr.es/m/20150828.173334.114731693.horiguchi.kyotaro@lab.ntt.co.jp though this commit does not use that code.) Author: Tomas Vondra. Some code rework by Álvaro. Reviewed-by: Dean Rasheed, David Rowley, Kyotaro Horiguchi, Jeff Janes, Ideriha Takeshi Discussion: https://postgr.es/m/543AFA15.4080608@fuzzy.cz https://postgr.es/m/20170320190220.ixlaueanxegqd5gr@alvherre.pgsql
2017-03-24 18:06:10 +01:00
statexttup = SearchSysCache1(STATEXTOID, ObjectIdGetDatum(statextid));
if (!HeapTupleIsValid(statexttup))
{
if (missing_ok)
return NULL;
elog(ERROR, "cache lookup failed for statistics object %u", statextid);
Implement multivariate n-distinct coefficients Add support for explicitly declared statistic objects (CREATE STATISTICS), allowing collection of statistics on more complex combinations that individual table columns. Companion commands DROP STATISTICS and ALTER STATISTICS ... OWNER TO / SET SCHEMA / RENAME are added too. All this DDL has been designed so that more statistic types can be added later on, such as multivariate most-common-values and multivariate histograms between columns of a single table, leaving room for permitting columns on multiple tables, too, as well as expressions. This commit only adds support for collection of n-distinct coefficient on user-specified sets of columns in a single table. This is useful to estimate number of distinct groups in GROUP BY and DISTINCT clauses; estimation errors there can cause over-allocation of memory in hashed aggregates, for instance, so it's a worthwhile problem to solve. A new special pseudo-type pg_ndistinct is used. (num-distinct estimation was deemed sufficiently useful by itself that this is worthwhile even if no further statistic types are added immediately; so much so that another version of essentially the same functionality was submitted by Kyotaro Horiguchi: https://postgr.es/m/20150828.173334.114731693.horiguchi.kyotaro@lab.ntt.co.jp though this commit does not use that code.) Author: Tomas Vondra. Some code rework by Álvaro. Reviewed-by: Dean Rasheed, David Rowley, Kyotaro Horiguchi, Jeff Janes, Ideriha Takeshi Discussion: https://postgr.es/m/543AFA15.4080608@fuzzy.cz https://postgr.es/m/20170320190220.ixlaueanxegqd5gr@alvherre.pgsql
2017-03-24 18:06:10 +01:00
}
Extended statistics on expressions Allow defining extended statistics on expressions, not just just on simple column references. With this commit, expressions are supported by all existing extended statistics kinds, improving the same types of estimates. A simple example may look like this: CREATE TABLE t (a int); CREATE STATISTICS s ON mod(a,10), mod(a,20) FROM t; ANALYZE t; The collected statistics are useful e.g. to estimate queries with those expressions in WHERE or GROUP BY clauses: SELECT * FROM t WHERE mod(a,10) = 0 AND mod(a,20) = 0; SELECT 1 FROM t GROUP BY mod(a,10), mod(a,20); This introduces new internal statistics kind 'e' (expressions) which is built automatically when the statistics object definition includes any expressions. This represents single-expression statistics, as if there was an expression index (but without the index maintenance overhead). The statistics is stored in pg_statistics_ext_data as an array of composite types, which is possible thanks to 79f6a942bd. CREATE STATISTICS allows building statistics on a single expression, in which case in which case it's not possible to specify statistics kinds. A new system view pg_stats_ext_exprs can be used to display expression statistics, similarly to pg_stats and pg_stats_ext views. ALTER TABLE ... ALTER COLUMN ... TYPE now treats indexes the same way it treats indexes, i.e. it drops and recreates the statistics. This means all statistics are reset, and we no longer try to preserve at least the functional dependencies. This should not be a major issue in practice, as the functional dependencies actually rely on per-column statistics, which were always reset anyway. Author: Tomas Vondra Reviewed-by: Justin Pryzby, Dean Rasheed, Zhihong Yu Discussion: https://postgr.es/m/ad7891d2-e90c-b446-9fe2-7419143847d7%40enterprisedb.com
2021-03-26 23:22:01 +01:00
/* has the statistics expressions? */
has_exprs = !heap_attisnull(statexttup, Anum_pg_statistic_ext_stxexprs, NULL);
Implement multivariate n-distinct coefficients Add support for explicitly declared statistic objects (CREATE STATISTICS), allowing collection of statistics on more complex combinations that individual table columns. Companion commands DROP STATISTICS and ALTER STATISTICS ... OWNER TO / SET SCHEMA / RENAME are added too. All this DDL has been designed so that more statistic types can be added later on, such as multivariate most-common-values and multivariate histograms between columns of a single table, leaving room for permitting columns on multiple tables, too, as well as expressions. This commit only adds support for collection of n-distinct coefficient on user-specified sets of columns in a single table. This is useful to estimate number of distinct groups in GROUP BY and DISTINCT clauses; estimation errors there can cause over-allocation of memory in hashed aggregates, for instance, so it's a worthwhile problem to solve. A new special pseudo-type pg_ndistinct is used. (num-distinct estimation was deemed sufficiently useful by itself that this is worthwhile even if no further statistic types are added immediately; so much so that another version of essentially the same functionality was submitted by Kyotaro Horiguchi: https://postgr.es/m/20150828.173334.114731693.horiguchi.kyotaro@lab.ntt.co.jp though this commit does not use that code.) Author: Tomas Vondra. Some code rework by Álvaro. Reviewed-by: Dean Rasheed, David Rowley, Kyotaro Horiguchi, Jeff Janes, Ideriha Takeshi Discussion: https://postgr.es/m/543AFA15.4080608@fuzzy.cz https://postgr.es/m/20170320190220.ixlaueanxegqd5gr@alvherre.pgsql
2017-03-24 18:06:10 +01:00
Extended statistics on expressions Allow defining extended statistics on expressions, not just just on simple column references. With this commit, expressions are supported by all existing extended statistics kinds, improving the same types of estimates. A simple example may look like this: CREATE TABLE t (a int); CREATE STATISTICS s ON mod(a,10), mod(a,20) FROM t; ANALYZE t; The collected statistics are useful e.g. to estimate queries with those expressions in WHERE or GROUP BY clauses: SELECT * FROM t WHERE mod(a,10) = 0 AND mod(a,20) = 0; SELECT 1 FROM t GROUP BY mod(a,10), mod(a,20); This introduces new internal statistics kind 'e' (expressions) which is built automatically when the statistics object definition includes any expressions. This represents single-expression statistics, as if there was an expression index (but without the index maintenance overhead). The statistics is stored in pg_statistics_ext_data as an array of composite types, which is possible thanks to 79f6a942bd. CREATE STATISTICS allows building statistics on a single expression, in which case in which case it's not possible to specify statistics kinds. A new system view pg_stats_ext_exprs can be used to display expression statistics, similarly to pg_stats and pg_stats_ext views. ALTER TABLE ... ALTER COLUMN ... TYPE now treats indexes the same way it treats indexes, i.e. it drops and recreates the statistics. This means all statistics are reset, and we no longer try to preserve at least the functional dependencies. This should not be a major issue in practice, as the functional dependencies actually rely on per-column statistics, which were always reset anyway. Author: Tomas Vondra Reviewed-by: Justin Pryzby, Dean Rasheed, Zhihong Yu Discussion: https://postgr.es/m/ad7891d2-e90c-b446-9fe2-7419143847d7%40enterprisedb.com
2021-03-26 23:22:01 +01:00
statextrec = (Form_pg_statistic_ext) GETSTRUCT(statexttup);
Implement multivariate n-distinct coefficients Add support for explicitly declared statistic objects (CREATE STATISTICS), allowing collection of statistics on more complex combinations that individual table columns. Companion commands DROP STATISTICS and ALTER STATISTICS ... OWNER TO / SET SCHEMA / RENAME are added too. All this DDL has been designed so that more statistic types can be added later on, such as multivariate most-common-values and multivariate histograms between columns of a single table, leaving room for permitting columns on multiple tables, too, as well as expressions. This commit only adds support for collection of n-distinct coefficient on user-specified sets of columns in a single table. This is useful to estimate number of distinct groups in GROUP BY and DISTINCT clauses; estimation errors there can cause over-allocation of memory in hashed aggregates, for instance, so it's a worthwhile problem to solve. A new special pseudo-type pg_ndistinct is used. (num-distinct estimation was deemed sufficiently useful by itself that this is worthwhile even if no further statistic types are added immediately; so much so that another version of essentially the same functionality was submitted by Kyotaro Horiguchi: https://postgr.es/m/20150828.173334.114731693.horiguchi.kyotaro@lab.ntt.co.jp though this commit does not use that code.) Author: Tomas Vondra. Some code rework by Álvaro. Reviewed-by: Dean Rasheed, David Rowley, Kyotaro Horiguchi, Jeff Janes, Ideriha Takeshi Discussion: https://postgr.es/m/543AFA15.4080608@fuzzy.cz https://postgr.es/m/20170320190220.ixlaueanxegqd5gr@alvherre.pgsql
2017-03-24 18:06:10 +01:00
/*
Extended statistics on expressions Allow defining extended statistics on expressions, not just just on simple column references. With this commit, expressions are supported by all existing extended statistics kinds, improving the same types of estimates. A simple example may look like this: CREATE TABLE t (a int); CREATE STATISTICS s ON mod(a,10), mod(a,20) FROM t; ANALYZE t; The collected statistics are useful e.g. to estimate queries with those expressions in WHERE or GROUP BY clauses: SELECT * FROM t WHERE mod(a,10) = 0 AND mod(a,20) = 0; SELECT 1 FROM t GROUP BY mod(a,10), mod(a,20); This introduces new internal statistics kind 'e' (expressions) which is built automatically when the statistics object definition includes any expressions. This represents single-expression statistics, as if there was an expression index (but without the index maintenance overhead). The statistics is stored in pg_statistics_ext_data as an array of composite types, which is possible thanks to 79f6a942bd. CREATE STATISTICS allows building statistics on a single expression, in which case in which case it's not possible to specify statistics kinds. A new system view pg_stats_ext_exprs can be used to display expression statistics, similarly to pg_stats and pg_stats_ext views. ALTER TABLE ... ALTER COLUMN ... TYPE now treats indexes the same way it treats indexes, i.e. it drops and recreates the statistics. This means all statistics are reset, and we no longer try to preserve at least the functional dependencies. This should not be a major issue in practice, as the functional dependencies actually rely on per-column statistics, which were always reset anyway. Author: Tomas Vondra Reviewed-by: Justin Pryzby, Dean Rasheed, Zhihong Yu Discussion: https://postgr.es/m/ad7891d2-e90c-b446-9fe2-7419143847d7%40enterprisedb.com
2021-03-26 23:22:01 +01:00
* Get the statistics expressions, if any. (NOTE: we do not use the
* relcache versions of the expressions, because we want to display
* non-const-folded expressions.)
*/
Extended statistics on expressions Allow defining extended statistics on expressions, not just just on simple column references. With this commit, expressions are supported by all existing extended statistics kinds, improving the same types of estimates. A simple example may look like this: CREATE TABLE t (a int); CREATE STATISTICS s ON mod(a,10), mod(a,20) FROM t; ANALYZE t; The collected statistics are useful e.g. to estimate queries with those expressions in WHERE or GROUP BY clauses: SELECT * FROM t WHERE mod(a,10) = 0 AND mod(a,20) = 0; SELECT 1 FROM t GROUP BY mod(a,10), mod(a,20); This introduces new internal statistics kind 'e' (expressions) which is built automatically when the statistics object definition includes any expressions. This represents single-expression statistics, as if there was an expression index (but without the index maintenance overhead). The statistics is stored in pg_statistics_ext_data as an array of composite types, which is possible thanks to 79f6a942bd. CREATE STATISTICS allows building statistics on a single expression, in which case in which case it's not possible to specify statistics kinds. A new system view pg_stats_ext_exprs can be used to display expression statistics, similarly to pg_stats and pg_stats_ext views. ALTER TABLE ... ALTER COLUMN ... TYPE now treats indexes the same way it treats indexes, i.e. it drops and recreates the statistics. This means all statistics are reset, and we no longer try to preserve at least the functional dependencies. This should not be a major issue in practice, as the functional dependencies actually rely on per-column statistics, which were always reset anyway. Author: Tomas Vondra Reviewed-by: Justin Pryzby, Dean Rasheed, Zhihong Yu Discussion: https://postgr.es/m/ad7891d2-e90c-b446-9fe2-7419143847d7%40enterprisedb.com
2021-03-26 23:22:01 +01:00
if (has_exprs)
{
Extended statistics on expressions Allow defining extended statistics on expressions, not just just on simple column references. With this commit, expressions are supported by all existing extended statistics kinds, improving the same types of estimates. A simple example may look like this: CREATE TABLE t (a int); CREATE STATISTICS s ON mod(a,10), mod(a,20) FROM t; ANALYZE t; The collected statistics are useful e.g. to estimate queries with those expressions in WHERE or GROUP BY clauses: SELECT * FROM t WHERE mod(a,10) = 0 AND mod(a,20) = 0; SELECT 1 FROM t GROUP BY mod(a,10), mod(a,20); This introduces new internal statistics kind 'e' (expressions) which is built automatically when the statistics object definition includes any expressions. This represents single-expression statistics, as if there was an expression index (but without the index maintenance overhead). The statistics is stored in pg_statistics_ext_data as an array of composite types, which is possible thanks to 79f6a942bd. CREATE STATISTICS allows building statistics on a single expression, in which case in which case it's not possible to specify statistics kinds. A new system view pg_stats_ext_exprs can be used to display expression statistics, similarly to pg_stats and pg_stats_ext views. ALTER TABLE ... ALTER COLUMN ... TYPE now treats indexes the same way it treats indexes, i.e. it drops and recreates the statistics. This means all statistics are reset, and we no longer try to preserve at least the functional dependencies. This should not be a major issue in practice, as the functional dependencies actually rely on per-column statistics, which were always reset anyway. Author: Tomas Vondra Reviewed-by: Justin Pryzby, Dean Rasheed, Zhihong Yu Discussion: https://postgr.es/m/ad7891d2-e90c-b446-9fe2-7419143847d7%40enterprisedb.com
2021-03-26 23:22:01 +01:00
Datum exprsDatum;
char *exprsString;
exprsDatum = SysCacheGetAttrNotNull(STATEXTOID, statexttup,
Anum_pg_statistic_ext_stxexprs);
Extended statistics on expressions Allow defining extended statistics on expressions, not just just on simple column references. With this commit, expressions are supported by all existing extended statistics kinds, improving the same types of estimates. A simple example may look like this: CREATE TABLE t (a int); CREATE STATISTICS s ON mod(a,10), mod(a,20) FROM t; ANALYZE t; The collected statistics are useful e.g. to estimate queries with those expressions in WHERE or GROUP BY clauses: SELECT * FROM t WHERE mod(a,10) = 0 AND mod(a,20) = 0; SELECT 1 FROM t GROUP BY mod(a,10), mod(a,20); This introduces new internal statistics kind 'e' (expressions) which is built automatically when the statistics object definition includes any expressions. This represents single-expression statistics, as if there was an expression index (but without the index maintenance overhead). The statistics is stored in pg_statistics_ext_data as an array of composite types, which is possible thanks to 79f6a942bd. CREATE STATISTICS allows building statistics on a single expression, in which case in which case it's not possible to specify statistics kinds. A new system view pg_stats_ext_exprs can be used to display expression statistics, similarly to pg_stats and pg_stats_ext views. ALTER TABLE ... ALTER COLUMN ... TYPE now treats indexes the same way it treats indexes, i.e. it drops and recreates the statistics. This means all statistics are reset, and we no longer try to preserve at least the functional dependencies. This should not be a major issue in practice, as the functional dependencies actually rely on per-column statistics, which were always reset anyway. Author: Tomas Vondra Reviewed-by: Justin Pryzby, Dean Rasheed, Zhihong Yu Discussion: https://postgr.es/m/ad7891d2-e90c-b446-9fe2-7419143847d7%40enterprisedb.com
2021-03-26 23:22:01 +01:00
exprsString = TextDatumGetCString(exprsDatum);
exprs = (List *) stringToNode(exprsString);
pfree(exprsString);
}
Extended statistics on expressions Allow defining extended statistics on expressions, not just just on simple column references. With this commit, expressions are supported by all existing extended statistics kinds, improving the same types of estimates. A simple example may look like this: CREATE TABLE t (a int); CREATE STATISTICS s ON mod(a,10), mod(a,20) FROM t; ANALYZE t; The collected statistics are useful e.g. to estimate queries with those expressions in WHERE or GROUP BY clauses: SELECT * FROM t WHERE mod(a,10) = 0 AND mod(a,20) = 0; SELECT 1 FROM t GROUP BY mod(a,10), mod(a,20); This introduces new internal statistics kind 'e' (expressions) which is built automatically when the statistics object definition includes any expressions. This represents single-expression statistics, as if there was an expression index (but without the index maintenance overhead). The statistics is stored in pg_statistics_ext_data as an array of composite types, which is possible thanks to 79f6a942bd. CREATE STATISTICS allows building statistics on a single expression, in which case in which case it's not possible to specify statistics kinds. A new system view pg_stats_ext_exprs can be used to display expression statistics, similarly to pg_stats and pg_stats_ext views. ALTER TABLE ... ALTER COLUMN ... TYPE now treats indexes the same way it treats indexes, i.e. it drops and recreates the statistics. This means all statistics are reset, and we no longer try to preserve at least the functional dependencies. This should not be a major issue in practice, as the functional dependencies actually rely on per-column statistics, which were always reset anyway. Author: Tomas Vondra Reviewed-by: Justin Pryzby, Dean Rasheed, Zhihong Yu Discussion: https://postgr.es/m/ad7891d2-e90c-b446-9fe2-7419143847d7%40enterprisedb.com
2021-03-26 23:22:01 +01:00
else
exprs = NIL;
Extended statistics on expressions Allow defining extended statistics on expressions, not just just on simple column references. With this commit, expressions are supported by all existing extended statistics kinds, improving the same types of estimates. A simple example may look like this: CREATE TABLE t (a int); CREATE STATISTICS s ON mod(a,10), mod(a,20) FROM t; ANALYZE t; The collected statistics are useful e.g. to estimate queries with those expressions in WHERE or GROUP BY clauses: SELECT * FROM t WHERE mod(a,10) = 0 AND mod(a,20) = 0; SELECT 1 FROM t GROUP BY mod(a,10), mod(a,20); This introduces new internal statistics kind 'e' (expressions) which is built automatically when the statistics object definition includes any expressions. This represents single-expression statistics, as if there was an expression index (but without the index maintenance overhead). The statistics is stored in pg_statistics_ext_data as an array of composite types, which is possible thanks to 79f6a942bd. CREATE STATISTICS allows building statistics on a single expression, in which case in which case it's not possible to specify statistics kinds. A new system view pg_stats_ext_exprs can be used to display expression statistics, similarly to pg_stats and pg_stats_ext views. ALTER TABLE ... ALTER COLUMN ... TYPE now treats indexes the same way it treats indexes, i.e. it drops and recreates the statistics. This means all statistics are reset, and we no longer try to preserve at least the functional dependencies. This should not be a major issue in practice, as the functional dependencies actually rely on per-column statistics, which were always reset anyway. Author: Tomas Vondra Reviewed-by: Justin Pryzby, Dean Rasheed, Zhihong Yu Discussion: https://postgr.es/m/ad7891d2-e90c-b446-9fe2-7419143847d7%40enterprisedb.com
2021-03-26 23:22:01 +01:00
/* count the number of columns (attributes and expressions) */
ncolumns = statextrec->stxkeys.dim1 + list_length(exprs);
Extended statistics on expressions Allow defining extended statistics on expressions, not just just on simple column references. With this commit, expressions are supported by all existing extended statistics kinds, improving the same types of estimates. A simple example may look like this: CREATE TABLE t (a int); CREATE STATISTICS s ON mod(a,10), mod(a,20) FROM t; ANALYZE t; The collected statistics are useful e.g. to estimate queries with those expressions in WHERE or GROUP BY clauses: SELECT * FROM t WHERE mod(a,10) = 0 AND mod(a,20) = 0; SELECT 1 FROM t GROUP BY mod(a,10), mod(a,20); This introduces new internal statistics kind 'e' (expressions) which is built automatically when the statistics object definition includes any expressions. This represents single-expression statistics, as if there was an expression index (but without the index maintenance overhead). The statistics is stored in pg_statistics_ext_data as an array of composite types, which is possible thanks to 79f6a942bd. CREATE STATISTICS allows building statistics on a single expression, in which case in which case it's not possible to specify statistics kinds. A new system view pg_stats_ext_exprs can be used to display expression statistics, similarly to pg_stats and pg_stats_ext views. ALTER TABLE ... ALTER COLUMN ... TYPE now treats indexes the same way it treats indexes, i.e. it drops and recreates the statistics. This means all statistics are reset, and we no longer try to preserve at least the functional dependencies. This should not be a major issue in practice, as the functional dependencies actually rely on per-column statistics, which were always reset anyway. Author: Tomas Vondra Reviewed-by: Justin Pryzby, Dean Rasheed, Zhihong Yu Discussion: https://postgr.es/m/ad7891d2-e90c-b446-9fe2-7419143847d7%40enterprisedb.com
2021-03-26 23:22:01 +01:00
initStringInfo(&buf);
if (!columns_only)
{
nsp = get_namespace_name_or_temp(statextrec->stxnamespace);
Extended statistics on expressions Allow defining extended statistics on expressions, not just just on simple column references. With this commit, expressions are supported by all existing extended statistics kinds, improving the same types of estimates. A simple example may look like this: CREATE TABLE t (a int); CREATE STATISTICS s ON mod(a,10), mod(a,20) FROM t; ANALYZE t; The collected statistics are useful e.g. to estimate queries with those expressions in WHERE or GROUP BY clauses: SELECT * FROM t WHERE mod(a,10) = 0 AND mod(a,20) = 0; SELECT 1 FROM t GROUP BY mod(a,10), mod(a,20); This introduces new internal statistics kind 'e' (expressions) which is built automatically when the statistics object definition includes any expressions. This represents single-expression statistics, as if there was an expression index (but without the index maintenance overhead). The statistics is stored in pg_statistics_ext_data as an array of composite types, which is possible thanks to 79f6a942bd. CREATE STATISTICS allows building statistics on a single expression, in which case in which case it's not possible to specify statistics kinds. A new system view pg_stats_ext_exprs can be used to display expression statistics, similarly to pg_stats and pg_stats_ext views. ALTER TABLE ... ALTER COLUMN ... TYPE now treats indexes the same way it treats indexes, i.e. it drops and recreates the statistics. This means all statistics are reset, and we no longer try to preserve at least the functional dependencies. This should not be a major issue in practice, as the functional dependencies actually rely on per-column statistics, which were always reset anyway. Author: Tomas Vondra Reviewed-by: Justin Pryzby, Dean Rasheed, Zhihong Yu Discussion: https://postgr.es/m/ad7891d2-e90c-b446-9fe2-7419143847d7%40enterprisedb.com
2021-03-26 23:22:01 +01:00
appendStringInfo(&buf, "CREATE STATISTICS %s",
quote_qualified_identifier(nsp,
NameStr(statextrec->stxname)));
Extended statistics on expressions Allow defining extended statistics on expressions, not just just on simple column references. With this commit, expressions are supported by all existing extended statistics kinds, improving the same types of estimates. A simple example may look like this: CREATE TABLE t (a int); CREATE STATISTICS s ON mod(a,10), mod(a,20) FROM t; ANALYZE t; The collected statistics are useful e.g. to estimate queries with those expressions in WHERE or GROUP BY clauses: SELECT * FROM t WHERE mod(a,10) = 0 AND mod(a,20) = 0; SELECT 1 FROM t GROUP BY mod(a,10), mod(a,20); This introduces new internal statistics kind 'e' (expressions) which is built automatically when the statistics object definition includes any expressions. This represents single-expression statistics, as if there was an expression index (but without the index maintenance overhead). The statistics is stored in pg_statistics_ext_data as an array of composite types, which is possible thanks to 79f6a942bd. CREATE STATISTICS allows building statistics on a single expression, in which case in which case it's not possible to specify statistics kinds. A new system view pg_stats_ext_exprs can be used to display expression statistics, similarly to pg_stats and pg_stats_ext views. ALTER TABLE ... ALTER COLUMN ... TYPE now treats indexes the same way it treats indexes, i.e. it drops and recreates the statistics. This means all statistics are reset, and we no longer try to preserve at least the functional dependencies. This should not be a major issue in practice, as the functional dependencies actually rely on per-column statistics, which were always reset anyway. Author: Tomas Vondra Reviewed-by: Justin Pryzby, Dean Rasheed, Zhihong Yu Discussion: https://postgr.es/m/ad7891d2-e90c-b446-9fe2-7419143847d7%40enterprisedb.com
2021-03-26 23:22:01 +01:00
/*
* Decode the stxkind column so that we know which stats types to
* print.
*/
datum = SysCacheGetAttrNotNull(STATEXTOID, statexttup,
Anum_pg_statistic_ext_stxkind);
Extended statistics on expressions Allow defining extended statistics on expressions, not just just on simple column references. With this commit, expressions are supported by all existing extended statistics kinds, improving the same types of estimates. A simple example may look like this: CREATE TABLE t (a int); CREATE STATISTICS s ON mod(a,10), mod(a,20) FROM t; ANALYZE t; The collected statistics are useful e.g. to estimate queries with those expressions in WHERE or GROUP BY clauses: SELECT * FROM t WHERE mod(a,10) = 0 AND mod(a,20) = 0; SELECT 1 FROM t GROUP BY mod(a,10), mod(a,20); This introduces new internal statistics kind 'e' (expressions) which is built automatically when the statistics object definition includes any expressions. This represents single-expression statistics, as if there was an expression index (but without the index maintenance overhead). The statistics is stored in pg_statistics_ext_data as an array of composite types, which is possible thanks to 79f6a942bd. CREATE STATISTICS allows building statistics on a single expression, in which case in which case it's not possible to specify statistics kinds. A new system view pg_stats_ext_exprs can be used to display expression statistics, similarly to pg_stats and pg_stats_ext views. ALTER TABLE ... ALTER COLUMN ... TYPE now treats indexes the same way it treats indexes, i.e. it drops and recreates the statistics. This means all statistics are reset, and we no longer try to preserve at least the functional dependencies. This should not be a major issue in practice, as the functional dependencies actually rely on per-column statistics, which were always reset anyway. Author: Tomas Vondra Reviewed-by: Justin Pryzby, Dean Rasheed, Zhihong Yu Discussion: https://postgr.es/m/ad7891d2-e90c-b446-9fe2-7419143847d7%40enterprisedb.com
2021-03-26 23:22:01 +01:00
arr = DatumGetArrayTypeP(datum);
if (ARR_NDIM(arr) != 1 ||
ARR_HASNULL(arr) ||
ARR_ELEMTYPE(arr) != CHAROID)
elog(ERROR, "stxkind is not a 1-D char array");
enabled = (char *) ARR_DATA_PTR(arr);
ndistinct_enabled = false;
dependencies_enabled = false;
mcv_enabled = false;
for (i = 0; i < ARR_DIMS(arr)[0]; i++)
{
Extended statistics on expressions Allow defining extended statistics on expressions, not just just on simple column references. With this commit, expressions are supported by all existing extended statistics kinds, improving the same types of estimates. A simple example may look like this: CREATE TABLE t (a int); CREATE STATISTICS s ON mod(a,10), mod(a,20) FROM t; ANALYZE t; The collected statistics are useful e.g. to estimate queries with those expressions in WHERE or GROUP BY clauses: SELECT * FROM t WHERE mod(a,10) = 0 AND mod(a,20) = 0; SELECT 1 FROM t GROUP BY mod(a,10), mod(a,20); This introduces new internal statistics kind 'e' (expressions) which is built automatically when the statistics object definition includes any expressions. This represents single-expression statistics, as if there was an expression index (but without the index maintenance overhead). The statistics is stored in pg_statistics_ext_data as an array of composite types, which is possible thanks to 79f6a942bd. CREATE STATISTICS allows building statistics on a single expression, in which case in which case it's not possible to specify statistics kinds. A new system view pg_stats_ext_exprs can be used to display expression statistics, similarly to pg_stats and pg_stats_ext views. ALTER TABLE ... ALTER COLUMN ... TYPE now treats indexes the same way it treats indexes, i.e. it drops and recreates the statistics. This means all statistics are reset, and we no longer try to preserve at least the functional dependencies. This should not be a major issue in practice, as the functional dependencies actually rely on per-column statistics, which were always reset anyway. Author: Tomas Vondra Reviewed-by: Justin Pryzby, Dean Rasheed, Zhihong Yu Discussion: https://postgr.es/m/ad7891d2-e90c-b446-9fe2-7419143847d7%40enterprisedb.com
2021-03-26 23:22:01 +01:00
if (enabled[i] == STATS_EXT_NDISTINCT)
ndistinct_enabled = true;
else if (enabled[i] == STATS_EXT_DEPENDENCIES)
dependencies_enabled = true;
else if (enabled[i] == STATS_EXT_MCV)
mcv_enabled = true;
/* ignore STATS_EXT_EXPRESSIONS (it's built automatically) */
}
Extended statistics on expressions Allow defining extended statistics on expressions, not just just on simple column references. With this commit, expressions are supported by all existing extended statistics kinds, improving the same types of estimates. A simple example may look like this: CREATE TABLE t (a int); CREATE STATISTICS s ON mod(a,10), mod(a,20) FROM t; ANALYZE t; The collected statistics are useful e.g. to estimate queries with those expressions in WHERE or GROUP BY clauses: SELECT * FROM t WHERE mod(a,10) = 0 AND mod(a,20) = 0; SELECT 1 FROM t GROUP BY mod(a,10), mod(a,20); This introduces new internal statistics kind 'e' (expressions) which is built automatically when the statistics object definition includes any expressions. This represents single-expression statistics, as if there was an expression index (but without the index maintenance overhead). The statistics is stored in pg_statistics_ext_data as an array of composite types, which is possible thanks to 79f6a942bd. CREATE STATISTICS allows building statistics on a single expression, in which case in which case it's not possible to specify statistics kinds. A new system view pg_stats_ext_exprs can be used to display expression statistics, similarly to pg_stats and pg_stats_ext views. ALTER TABLE ... ALTER COLUMN ... TYPE now treats indexes the same way it treats indexes, i.e. it drops and recreates the statistics. This means all statistics are reset, and we no longer try to preserve at least the functional dependencies. This should not be a major issue in practice, as the functional dependencies actually rely on per-column statistics, which were always reset anyway. Author: Tomas Vondra Reviewed-by: Justin Pryzby, Dean Rasheed, Zhihong Yu Discussion: https://postgr.es/m/ad7891d2-e90c-b446-9fe2-7419143847d7%40enterprisedb.com
2021-03-26 23:22:01 +01:00
/*
* If any option is disabled, then we'll need to append the types
* clause to show which options are enabled. We omit the types clause
* on purpose when all options are enabled, so a pg_dump/pg_restore
* will create all statistics types on a newer postgres version, if
* the statistics had all options enabled on the original version.
*
* But if the statistics is defined on just a single column, it has to
* be an expression statistics. In that case we don't need to specify
* kinds.
*/
if ((!ndistinct_enabled || !dependencies_enabled || !mcv_enabled) &&
(ncolumns > 1))
{
Extended statistics on expressions Allow defining extended statistics on expressions, not just just on simple column references. With this commit, expressions are supported by all existing extended statistics kinds, improving the same types of estimates. A simple example may look like this: CREATE TABLE t (a int); CREATE STATISTICS s ON mod(a,10), mod(a,20) FROM t; ANALYZE t; The collected statistics are useful e.g. to estimate queries with those expressions in WHERE or GROUP BY clauses: SELECT * FROM t WHERE mod(a,10) = 0 AND mod(a,20) = 0; SELECT 1 FROM t GROUP BY mod(a,10), mod(a,20); This introduces new internal statistics kind 'e' (expressions) which is built automatically when the statistics object definition includes any expressions. This represents single-expression statistics, as if there was an expression index (but without the index maintenance overhead). The statistics is stored in pg_statistics_ext_data as an array of composite types, which is possible thanks to 79f6a942bd. CREATE STATISTICS allows building statistics on a single expression, in which case in which case it's not possible to specify statistics kinds. A new system view pg_stats_ext_exprs can be used to display expression statistics, similarly to pg_stats and pg_stats_ext views. ALTER TABLE ... ALTER COLUMN ... TYPE now treats indexes the same way it treats indexes, i.e. it drops and recreates the statistics. This means all statistics are reset, and we no longer try to preserve at least the functional dependencies. This should not be a major issue in practice, as the functional dependencies actually rely on per-column statistics, which were always reset anyway. Author: Tomas Vondra Reviewed-by: Justin Pryzby, Dean Rasheed, Zhihong Yu Discussion: https://postgr.es/m/ad7891d2-e90c-b446-9fe2-7419143847d7%40enterprisedb.com
2021-03-26 23:22:01 +01:00
bool gotone = false;
Extended statistics on expressions Allow defining extended statistics on expressions, not just just on simple column references. With this commit, expressions are supported by all existing extended statistics kinds, improving the same types of estimates. A simple example may look like this: CREATE TABLE t (a int); CREATE STATISTICS s ON mod(a,10), mod(a,20) FROM t; ANALYZE t; The collected statistics are useful e.g. to estimate queries with those expressions in WHERE or GROUP BY clauses: SELECT * FROM t WHERE mod(a,10) = 0 AND mod(a,20) = 0; SELECT 1 FROM t GROUP BY mod(a,10), mod(a,20); This introduces new internal statistics kind 'e' (expressions) which is built automatically when the statistics object definition includes any expressions. This represents single-expression statistics, as if there was an expression index (but without the index maintenance overhead). The statistics is stored in pg_statistics_ext_data as an array of composite types, which is possible thanks to 79f6a942bd. CREATE STATISTICS allows building statistics on a single expression, in which case in which case it's not possible to specify statistics kinds. A new system view pg_stats_ext_exprs can be used to display expression statistics, similarly to pg_stats and pg_stats_ext views. ALTER TABLE ... ALTER COLUMN ... TYPE now treats indexes the same way it treats indexes, i.e. it drops and recreates the statistics. This means all statistics are reset, and we no longer try to preserve at least the functional dependencies. This should not be a major issue in practice, as the functional dependencies actually rely on per-column statistics, which were always reset anyway. Author: Tomas Vondra Reviewed-by: Justin Pryzby, Dean Rasheed, Zhihong Yu Discussion: https://postgr.es/m/ad7891d2-e90c-b446-9fe2-7419143847d7%40enterprisedb.com
2021-03-26 23:22:01 +01:00
appendStringInfoString(&buf, " (");
Extended statistics on expressions Allow defining extended statistics on expressions, not just just on simple column references. With this commit, expressions are supported by all existing extended statistics kinds, improving the same types of estimates. A simple example may look like this: CREATE TABLE t (a int); CREATE STATISTICS s ON mod(a,10), mod(a,20) FROM t; ANALYZE t; The collected statistics are useful e.g. to estimate queries with those expressions in WHERE or GROUP BY clauses: SELECT * FROM t WHERE mod(a,10) = 0 AND mod(a,20) = 0; SELECT 1 FROM t GROUP BY mod(a,10), mod(a,20); This introduces new internal statistics kind 'e' (expressions) which is built automatically when the statistics object definition includes any expressions. This represents single-expression statistics, as if there was an expression index (but without the index maintenance overhead). The statistics is stored in pg_statistics_ext_data as an array of composite types, which is possible thanks to 79f6a942bd. CREATE STATISTICS allows building statistics on a single expression, in which case in which case it's not possible to specify statistics kinds. A new system view pg_stats_ext_exprs can be used to display expression statistics, similarly to pg_stats and pg_stats_ext views. ALTER TABLE ... ALTER COLUMN ... TYPE now treats indexes the same way it treats indexes, i.e. it drops and recreates the statistics. This means all statistics are reset, and we no longer try to preserve at least the functional dependencies. This should not be a major issue in practice, as the functional dependencies actually rely on per-column statistics, which were always reset anyway. Author: Tomas Vondra Reviewed-by: Justin Pryzby, Dean Rasheed, Zhihong Yu Discussion: https://postgr.es/m/ad7891d2-e90c-b446-9fe2-7419143847d7%40enterprisedb.com
2021-03-26 23:22:01 +01:00
if (ndistinct_enabled)
{
appendStringInfoString(&buf, "ndistinct");
gotone = true;
}
Extended statistics on expressions Allow defining extended statistics on expressions, not just just on simple column references. With this commit, expressions are supported by all existing extended statistics kinds, improving the same types of estimates. A simple example may look like this: CREATE TABLE t (a int); CREATE STATISTICS s ON mod(a,10), mod(a,20) FROM t; ANALYZE t; The collected statistics are useful e.g. to estimate queries with those expressions in WHERE or GROUP BY clauses: SELECT * FROM t WHERE mod(a,10) = 0 AND mod(a,20) = 0; SELECT 1 FROM t GROUP BY mod(a,10), mod(a,20); This introduces new internal statistics kind 'e' (expressions) which is built automatically when the statistics object definition includes any expressions. This represents single-expression statistics, as if there was an expression index (but without the index maintenance overhead). The statistics is stored in pg_statistics_ext_data as an array of composite types, which is possible thanks to 79f6a942bd. CREATE STATISTICS allows building statistics on a single expression, in which case in which case it's not possible to specify statistics kinds. A new system view pg_stats_ext_exprs can be used to display expression statistics, similarly to pg_stats and pg_stats_ext views. ALTER TABLE ... ALTER COLUMN ... TYPE now treats indexes the same way it treats indexes, i.e. it drops and recreates the statistics. This means all statistics are reset, and we no longer try to preserve at least the functional dependencies. This should not be a major issue in practice, as the functional dependencies actually rely on per-column statistics, which were always reset anyway. Author: Tomas Vondra Reviewed-by: Justin Pryzby, Dean Rasheed, Zhihong Yu Discussion: https://postgr.es/m/ad7891d2-e90c-b446-9fe2-7419143847d7%40enterprisedb.com
2021-03-26 23:22:01 +01:00
if (dependencies_enabled)
{
appendStringInfo(&buf, "%sdependencies", gotone ? ", " : "");
gotone = true;
}
if (mcv_enabled)
appendStringInfo(&buf, "%smcv", gotone ? ", " : "");
appendStringInfoChar(&buf, ')');
}
appendStringInfoString(&buf, " ON ");
}
Extended statistics on expressions Allow defining extended statistics on expressions, not just just on simple column references. With this commit, expressions are supported by all existing extended statistics kinds, improving the same types of estimates. A simple example may look like this: CREATE TABLE t (a int); CREATE STATISTICS s ON mod(a,10), mod(a,20) FROM t; ANALYZE t; The collected statistics are useful e.g. to estimate queries with those expressions in WHERE or GROUP BY clauses: SELECT * FROM t WHERE mod(a,10) = 0 AND mod(a,20) = 0; SELECT 1 FROM t GROUP BY mod(a,10), mod(a,20); This introduces new internal statistics kind 'e' (expressions) which is built automatically when the statistics object definition includes any expressions. This represents single-expression statistics, as if there was an expression index (but without the index maintenance overhead). The statistics is stored in pg_statistics_ext_data as an array of composite types, which is possible thanks to 79f6a942bd. CREATE STATISTICS allows building statistics on a single expression, in which case in which case it's not possible to specify statistics kinds. A new system view pg_stats_ext_exprs can be used to display expression statistics, similarly to pg_stats and pg_stats_ext views. ALTER TABLE ... ALTER COLUMN ... TYPE now treats indexes the same way it treats indexes, i.e. it drops and recreates the statistics. This means all statistics are reset, and we no longer try to preserve at least the functional dependencies. This should not be a major issue in practice, as the functional dependencies actually rely on per-column statistics, which were always reset anyway. Author: Tomas Vondra Reviewed-by: Justin Pryzby, Dean Rasheed, Zhihong Yu Discussion: https://postgr.es/m/ad7891d2-e90c-b446-9fe2-7419143847d7%40enterprisedb.com
2021-03-26 23:22:01 +01:00
/* decode simple column references */
for (colno = 0; colno < statextrec->stxkeys.dim1; colno++)
Implement multivariate n-distinct coefficients Add support for explicitly declared statistic objects (CREATE STATISTICS), allowing collection of statistics on more complex combinations that individual table columns. Companion commands DROP STATISTICS and ALTER STATISTICS ... OWNER TO / SET SCHEMA / RENAME are added too. All this DDL has been designed so that more statistic types can be added later on, such as multivariate most-common-values and multivariate histograms between columns of a single table, leaving room for permitting columns on multiple tables, too, as well as expressions. This commit only adds support for collection of n-distinct coefficient on user-specified sets of columns in a single table. This is useful to estimate number of distinct groups in GROUP BY and DISTINCT clauses; estimation errors there can cause over-allocation of memory in hashed aggregates, for instance, so it's a worthwhile problem to solve. A new special pseudo-type pg_ndistinct is used. (num-distinct estimation was deemed sufficiently useful by itself that this is worthwhile even if no further statistic types are added immediately; so much so that another version of essentially the same functionality was submitted by Kyotaro Horiguchi: https://postgr.es/m/20150828.173334.114731693.horiguchi.kyotaro@lab.ntt.co.jp though this commit does not use that code.) Author: Tomas Vondra. Some code rework by Álvaro. Reviewed-by: Dean Rasheed, David Rowley, Kyotaro Horiguchi, Jeff Janes, Ideriha Takeshi Discussion: https://postgr.es/m/543AFA15.4080608@fuzzy.cz https://postgr.es/m/20170320190220.ixlaueanxegqd5gr@alvherre.pgsql
2017-03-24 18:06:10 +01:00
{
AttrNumber attnum = statextrec->stxkeys.values[colno];
Implement multivariate n-distinct coefficients Add support for explicitly declared statistic objects (CREATE STATISTICS), allowing collection of statistics on more complex combinations that individual table columns. Companion commands DROP STATISTICS and ALTER STATISTICS ... OWNER TO / SET SCHEMA / RENAME are added too. All this DDL has been designed so that more statistic types can be added later on, such as multivariate most-common-values and multivariate histograms between columns of a single table, leaving room for permitting columns on multiple tables, too, as well as expressions. This commit only adds support for collection of n-distinct coefficient on user-specified sets of columns in a single table. This is useful to estimate number of distinct groups in GROUP BY and DISTINCT clauses; estimation errors there can cause over-allocation of memory in hashed aggregates, for instance, so it's a worthwhile problem to solve. A new special pseudo-type pg_ndistinct is used. (num-distinct estimation was deemed sufficiently useful by itself that this is worthwhile even if no further statistic types are added immediately; so much so that another version of essentially the same functionality was submitted by Kyotaro Horiguchi: https://postgr.es/m/20150828.173334.114731693.horiguchi.kyotaro@lab.ntt.co.jp though this commit does not use that code.) Author: Tomas Vondra. Some code rework by Álvaro. Reviewed-by: Dean Rasheed, David Rowley, Kyotaro Horiguchi, Jeff Janes, Ideriha Takeshi Discussion: https://postgr.es/m/543AFA15.4080608@fuzzy.cz https://postgr.es/m/20170320190220.ixlaueanxegqd5gr@alvherre.pgsql
2017-03-24 18:06:10 +01:00
char *attname;
if (colno > 0)
appendStringInfoString(&buf, ", ");
attname = get_attname(statextrec->stxrelid, attnum, false);
Implement multivariate n-distinct coefficients Add support for explicitly declared statistic objects (CREATE STATISTICS), allowing collection of statistics on more complex combinations that individual table columns. Companion commands DROP STATISTICS and ALTER STATISTICS ... OWNER TO / SET SCHEMA / RENAME are added too. All this DDL has been designed so that more statistic types can be added later on, such as multivariate most-common-values and multivariate histograms between columns of a single table, leaving room for permitting columns on multiple tables, too, as well as expressions. This commit only adds support for collection of n-distinct coefficient on user-specified sets of columns in a single table. This is useful to estimate number of distinct groups in GROUP BY and DISTINCT clauses; estimation errors there can cause over-allocation of memory in hashed aggregates, for instance, so it's a worthwhile problem to solve. A new special pseudo-type pg_ndistinct is used. (num-distinct estimation was deemed sufficiently useful by itself that this is worthwhile even if no further statistic types are added immediately; so much so that another version of essentially the same functionality was submitted by Kyotaro Horiguchi: https://postgr.es/m/20150828.173334.114731693.horiguchi.kyotaro@lab.ntt.co.jp though this commit does not use that code.) Author: Tomas Vondra. Some code rework by Álvaro. Reviewed-by: Dean Rasheed, David Rowley, Kyotaro Horiguchi, Jeff Janes, Ideriha Takeshi Discussion: https://postgr.es/m/543AFA15.4080608@fuzzy.cz https://postgr.es/m/20170320190220.ixlaueanxegqd5gr@alvherre.pgsql
2017-03-24 18:06:10 +01:00
appendStringInfoString(&buf, quote_identifier(attname));
}
Extended statistics on expressions Allow defining extended statistics on expressions, not just just on simple column references. With this commit, expressions are supported by all existing extended statistics kinds, improving the same types of estimates. A simple example may look like this: CREATE TABLE t (a int); CREATE STATISTICS s ON mod(a,10), mod(a,20) FROM t; ANALYZE t; The collected statistics are useful e.g. to estimate queries with those expressions in WHERE or GROUP BY clauses: SELECT * FROM t WHERE mod(a,10) = 0 AND mod(a,20) = 0; SELECT 1 FROM t GROUP BY mod(a,10), mod(a,20); This introduces new internal statistics kind 'e' (expressions) which is built automatically when the statistics object definition includes any expressions. This represents single-expression statistics, as if there was an expression index (but without the index maintenance overhead). The statistics is stored in pg_statistics_ext_data as an array of composite types, which is possible thanks to 79f6a942bd. CREATE STATISTICS allows building statistics on a single expression, in which case in which case it's not possible to specify statistics kinds. A new system view pg_stats_ext_exprs can be used to display expression statistics, similarly to pg_stats and pg_stats_ext views. ALTER TABLE ... ALTER COLUMN ... TYPE now treats indexes the same way it treats indexes, i.e. it drops and recreates the statistics. This means all statistics are reset, and we no longer try to preserve at least the functional dependencies. This should not be a major issue in practice, as the functional dependencies actually rely on per-column statistics, which were always reset anyway. Author: Tomas Vondra Reviewed-by: Justin Pryzby, Dean Rasheed, Zhihong Yu Discussion: https://postgr.es/m/ad7891d2-e90c-b446-9fe2-7419143847d7%40enterprisedb.com
2021-03-26 23:22:01 +01:00
context = deparse_context_for(get_relation_name(statextrec->stxrelid),
statextrec->stxrelid);
foreach(lc, exprs)
{
Node *expr = (Node *) lfirst(lc);
char *str;
int prettyFlags = PRETTYFLAG_PAREN;
Extended statistics on expressions Allow defining extended statistics on expressions, not just just on simple column references. With this commit, expressions are supported by all existing extended statistics kinds, improving the same types of estimates. A simple example may look like this: CREATE TABLE t (a int); CREATE STATISTICS s ON mod(a,10), mod(a,20) FROM t; ANALYZE t; The collected statistics are useful e.g. to estimate queries with those expressions in WHERE or GROUP BY clauses: SELECT * FROM t WHERE mod(a,10) = 0 AND mod(a,20) = 0; SELECT 1 FROM t GROUP BY mod(a,10), mod(a,20); This introduces new internal statistics kind 'e' (expressions) which is built automatically when the statistics object definition includes any expressions. This represents single-expression statistics, as if there was an expression index (but without the index maintenance overhead). The statistics is stored in pg_statistics_ext_data as an array of composite types, which is possible thanks to 79f6a942bd. CREATE STATISTICS allows building statistics on a single expression, in which case in which case it's not possible to specify statistics kinds. A new system view pg_stats_ext_exprs can be used to display expression statistics, similarly to pg_stats and pg_stats_ext views. ALTER TABLE ... ALTER COLUMN ... TYPE now treats indexes the same way it treats indexes, i.e. it drops and recreates the statistics. This means all statistics are reset, and we no longer try to preserve at least the functional dependencies. This should not be a major issue in practice, as the functional dependencies actually rely on per-column statistics, which were always reset anyway. Author: Tomas Vondra Reviewed-by: Justin Pryzby, Dean Rasheed, Zhihong Yu Discussion: https://postgr.es/m/ad7891d2-e90c-b446-9fe2-7419143847d7%40enterprisedb.com
2021-03-26 23:22:01 +01:00
str = deparse_expression_pretty(expr, context, false, false,
prettyFlags, 0);
if (colno > 0)
appendStringInfoString(&buf, ", ");
/* Need parens if it's not a bare function call */
if (looks_like_function(expr))
appendStringInfoString(&buf, str);
else
appendStringInfo(&buf, "(%s)", str);
colno++;
}
if (!columns_only)
appendStringInfo(&buf, " FROM %s",
generate_relation_name(statextrec->stxrelid, NIL));
Implement multivariate n-distinct coefficients Add support for explicitly declared statistic objects (CREATE STATISTICS), allowing collection of statistics on more complex combinations that individual table columns. Companion commands DROP STATISTICS and ALTER STATISTICS ... OWNER TO / SET SCHEMA / RENAME are added too. All this DDL has been designed so that more statistic types can be added later on, such as multivariate most-common-values and multivariate histograms between columns of a single table, leaving room for permitting columns on multiple tables, too, as well as expressions. This commit only adds support for collection of n-distinct coefficient on user-specified sets of columns in a single table. This is useful to estimate number of distinct groups in GROUP BY and DISTINCT clauses; estimation errors there can cause over-allocation of memory in hashed aggregates, for instance, so it's a worthwhile problem to solve. A new special pseudo-type pg_ndistinct is used. (num-distinct estimation was deemed sufficiently useful by itself that this is worthwhile even if no further statistic types are added immediately; so much so that another version of essentially the same functionality was submitted by Kyotaro Horiguchi: https://postgr.es/m/20150828.173334.114731693.horiguchi.kyotaro@lab.ntt.co.jp though this commit does not use that code.) Author: Tomas Vondra. Some code rework by Álvaro. Reviewed-by: Dean Rasheed, David Rowley, Kyotaro Horiguchi, Jeff Janes, Ideriha Takeshi Discussion: https://postgr.es/m/543AFA15.4080608@fuzzy.cz https://postgr.es/m/20170320190220.ixlaueanxegqd5gr@alvherre.pgsql
2017-03-24 18:06:10 +01:00
ReleaseSysCache(statexttup);
return buf.data;
}
Extended statistics on expressions Allow defining extended statistics on expressions, not just just on simple column references. With this commit, expressions are supported by all existing extended statistics kinds, improving the same types of estimates. A simple example may look like this: CREATE TABLE t (a int); CREATE STATISTICS s ON mod(a,10), mod(a,20) FROM t; ANALYZE t; The collected statistics are useful e.g. to estimate queries with those expressions in WHERE or GROUP BY clauses: SELECT * FROM t WHERE mod(a,10) = 0 AND mod(a,20) = 0; SELECT 1 FROM t GROUP BY mod(a,10), mod(a,20); This introduces new internal statistics kind 'e' (expressions) which is built automatically when the statistics object definition includes any expressions. This represents single-expression statistics, as if there was an expression index (but without the index maintenance overhead). The statistics is stored in pg_statistics_ext_data as an array of composite types, which is possible thanks to 79f6a942bd. CREATE STATISTICS allows building statistics on a single expression, in which case in which case it's not possible to specify statistics kinds. A new system view pg_stats_ext_exprs can be used to display expression statistics, similarly to pg_stats and pg_stats_ext views. ALTER TABLE ... ALTER COLUMN ... TYPE now treats indexes the same way it treats indexes, i.e. it drops and recreates the statistics. This means all statistics are reset, and we no longer try to preserve at least the functional dependencies. This should not be a major issue in practice, as the functional dependencies actually rely on per-column statistics, which were always reset anyway. Author: Tomas Vondra Reviewed-by: Justin Pryzby, Dean Rasheed, Zhihong Yu Discussion: https://postgr.es/m/ad7891d2-e90c-b446-9fe2-7419143847d7%40enterprisedb.com
2021-03-26 23:22:01 +01:00
/*
* Generate text array of expressions for statistics object.
*/
Datum
pg_get_statisticsobjdef_expressions(PG_FUNCTION_ARGS)
{
Oid statextid = PG_GETARG_OID(0);
Form_pg_statistic_ext statextrec;
HeapTuple statexttup;
Datum datum;
List *context;
ListCell *lc;
List *exprs = NIL;
bool has_exprs;
char *tmp;
ArrayBuildState *astate = NULL;
statexttup = SearchSysCache1(STATEXTOID, ObjectIdGetDatum(statextid));
if (!HeapTupleIsValid(statexttup))
PG_RETURN_NULL();
Extended statistics on expressions Allow defining extended statistics on expressions, not just just on simple column references. With this commit, expressions are supported by all existing extended statistics kinds, improving the same types of estimates. A simple example may look like this: CREATE TABLE t (a int); CREATE STATISTICS s ON mod(a,10), mod(a,20) FROM t; ANALYZE t; The collected statistics are useful e.g. to estimate queries with those expressions in WHERE or GROUP BY clauses: SELECT * FROM t WHERE mod(a,10) = 0 AND mod(a,20) = 0; SELECT 1 FROM t GROUP BY mod(a,10), mod(a,20); This introduces new internal statistics kind 'e' (expressions) which is built automatically when the statistics object definition includes any expressions. This represents single-expression statistics, as if there was an expression index (but without the index maintenance overhead). The statistics is stored in pg_statistics_ext_data as an array of composite types, which is possible thanks to 79f6a942bd. CREATE STATISTICS allows building statistics on a single expression, in which case in which case it's not possible to specify statistics kinds. A new system view pg_stats_ext_exprs can be used to display expression statistics, similarly to pg_stats and pg_stats_ext views. ALTER TABLE ... ALTER COLUMN ... TYPE now treats indexes the same way it treats indexes, i.e. it drops and recreates the statistics. This means all statistics are reset, and we no longer try to preserve at least the functional dependencies. This should not be a major issue in practice, as the functional dependencies actually rely on per-column statistics, which were always reset anyway. Author: Tomas Vondra Reviewed-by: Justin Pryzby, Dean Rasheed, Zhihong Yu Discussion: https://postgr.es/m/ad7891d2-e90c-b446-9fe2-7419143847d7%40enterprisedb.com
2021-03-26 23:22:01 +01:00
/* Does the stats object have expressions? */
Extended statistics on expressions Allow defining extended statistics on expressions, not just just on simple column references. With this commit, expressions are supported by all existing extended statistics kinds, improving the same types of estimates. A simple example may look like this: CREATE TABLE t (a int); CREATE STATISTICS s ON mod(a,10), mod(a,20) FROM t; ANALYZE t; The collected statistics are useful e.g. to estimate queries with those expressions in WHERE or GROUP BY clauses: SELECT * FROM t WHERE mod(a,10) = 0 AND mod(a,20) = 0; SELECT 1 FROM t GROUP BY mod(a,10), mod(a,20); This introduces new internal statistics kind 'e' (expressions) which is built automatically when the statistics object definition includes any expressions. This represents single-expression statistics, as if there was an expression index (but without the index maintenance overhead). The statistics is stored in pg_statistics_ext_data as an array of composite types, which is possible thanks to 79f6a942bd. CREATE STATISTICS allows building statistics on a single expression, in which case in which case it's not possible to specify statistics kinds. A new system view pg_stats_ext_exprs can be used to display expression statistics, similarly to pg_stats and pg_stats_ext views. ALTER TABLE ... ALTER COLUMN ... TYPE now treats indexes the same way it treats indexes, i.e. it drops and recreates the statistics. This means all statistics are reset, and we no longer try to preserve at least the functional dependencies. This should not be a major issue in practice, as the functional dependencies actually rely on per-column statistics, which were always reset anyway. Author: Tomas Vondra Reviewed-by: Justin Pryzby, Dean Rasheed, Zhihong Yu Discussion: https://postgr.es/m/ad7891d2-e90c-b446-9fe2-7419143847d7%40enterprisedb.com
2021-03-26 23:22:01 +01:00
has_exprs = !heap_attisnull(statexttup, Anum_pg_statistic_ext_stxexprs, NULL);
/* no expressions? we're done */
if (!has_exprs)
{
ReleaseSysCache(statexttup);
PG_RETURN_NULL();
}
statextrec = (Form_pg_statistic_ext) GETSTRUCT(statexttup);
/*
* Get the statistics expressions, and deparse them into text values.
*/
datum = SysCacheGetAttrNotNull(STATEXTOID, statexttup,
Anum_pg_statistic_ext_stxexprs);
Extended statistics on expressions Allow defining extended statistics on expressions, not just just on simple column references. With this commit, expressions are supported by all existing extended statistics kinds, improving the same types of estimates. A simple example may look like this: CREATE TABLE t (a int); CREATE STATISTICS s ON mod(a,10), mod(a,20) FROM t; ANALYZE t; The collected statistics are useful e.g. to estimate queries with those expressions in WHERE or GROUP BY clauses: SELECT * FROM t WHERE mod(a,10) = 0 AND mod(a,20) = 0; SELECT 1 FROM t GROUP BY mod(a,10), mod(a,20); This introduces new internal statistics kind 'e' (expressions) which is built automatically when the statistics object definition includes any expressions. This represents single-expression statistics, as if there was an expression index (but without the index maintenance overhead). The statistics is stored in pg_statistics_ext_data as an array of composite types, which is possible thanks to 79f6a942bd. CREATE STATISTICS allows building statistics on a single expression, in which case in which case it's not possible to specify statistics kinds. A new system view pg_stats_ext_exprs can be used to display expression statistics, similarly to pg_stats and pg_stats_ext views. ALTER TABLE ... ALTER COLUMN ... TYPE now treats indexes the same way it treats indexes, i.e. it drops and recreates the statistics. This means all statistics are reset, and we no longer try to preserve at least the functional dependencies. This should not be a major issue in practice, as the functional dependencies actually rely on per-column statistics, which were always reset anyway. Author: Tomas Vondra Reviewed-by: Justin Pryzby, Dean Rasheed, Zhihong Yu Discussion: https://postgr.es/m/ad7891d2-e90c-b446-9fe2-7419143847d7%40enterprisedb.com
2021-03-26 23:22:01 +01:00
tmp = TextDatumGetCString(datum);
exprs = (List *) stringToNode(tmp);
pfree(tmp);
context = deparse_context_for(get_relation_name(statextrec->stxrelid),
statextrec->stxrelid);
foreach(lc, exprs)
{
Node *expr = (Node *) lfirst(lc);
char *str;
int prettyFlags = PRETTYFLAG_INDENT;
str = deparse_expression_pretty(expr, context, false, false,
prettyFlags, 0);
astate = accumArrayResult(astate,
PointerGetDatum(cstring_to_text(str)),
false,
TEXTOID,
CurrentMemoryContext);
}
ReleaseSysCache(statexttup);
PG_RETURN_DATUM(makeArrayResult(astate, CurrentMemoryContext));
}
Implement table partitioning. Table partitioning is like table inheritance and reuses much of the existing infrastructure, but there are some important differences. The parent is called a partitioned table and is always empty; it may not have indexes or non-inherited constraints, since those make no sense for a relation with no data of its own. The children are called partitions and contain all of the actual data. Each partition has an implicit partitioning constraint. Multiple inheritance is not allowed, and partitioning and inheritance can't be mixed. Partitions can't have extra columns and may not allow nulls unless the parent does. Tuples inserted into the parent are automatically routed to the correct partition, so tuple-routing ON INSERT triggers are not needed. Tuple routing isn't yet supported for partitions which are foreign tables, and it doesn't handle updates that cross partition boundaries. Currently, tables can be range-partitioned or list-partitioned. List partitioning is limited to a single column, but range partitioning can involve multiple columns. A partitioning "column" can be an expression. Because table partitioning is less general than table inheritance, it is hoped that it will be easier to reason about properties of partitions, and therefore that this will serve as a better foundation for a variety of possible optimizations, including query planner optimizations. The tuple routing based which this patch does based on the implicit partitioning constraints is an example of this, but it seems likely that many other useful optimizations are also possible. Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat, Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova, Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
/*
* pg_get_partkeydef
*
* Returns the partition key specification, ie, the following:
*
* { RANGE | LIST | HASH } (column opt_collation opt_opclass [, ...])
Implement table partitioning. Table partitioning is like table inheritance and reuses much of the existing infrastructure, but there are some important differences. The parent is called a partitioned table and is always empty; it may not have indexes or non-inherited constraints, since those make no sense for a relation with no data of its own. The children are called partitions and contain all of the actual data. Each partition has an implicit partitioning constraint. Multiple inheritance is not allowed, and partitioning and inheritance can't be mixed. Partitions can't have extra columns and may not allow nulls unless the parent does. Tuples inserted into the parent are automatically routed to the correct partition, so tuple-routing ON INSERT triggers are not needed. Tuple routing isn't yet supported for partitions which are foreign tables, and it doesn't handle updates that cross partition boundaries. Currently, tables can be range-partitioned or list-partitioned. List partitioning is limited to a single column, but range partitioning can involve multiple columns. A partitioning "column" can be an expression. Because table partitioning is less general than table inheritance, it is hoped that it will be easier to reason about properties of partitions, and therefore that this will serve as a better foundation for a variety of possible optimizations, including query planner optimizations. The tuple routing based which this patch does based on the implicit partitioning constraints is an example of this, but it seems likely that many other useful optimizations are also possible. Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat, Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova, Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
*/
Datum
pg_get_partkeydef(PG_FUNCTION_ARGS)
{
Oid relid = PG_GETARG_OID(0);
char *res;
res = pg_get_partkeydef_worker(relid, PRETTYFLAG_INDENT, false, true);
if (res == NULL)
PG_RETURN_NULL();
Implement table partitioning. Table partitioning is like table inheritance and reuses much of the existing infrastructure, but there are some important differences. The parent is called a partitioned table and is always empty; it may not have indexes or non-inherited constraints, since those make no sense for a relation with no data of its own. The children are called partitions and contain all of the actual data. Each partition has an implicit partitioning constraint. Multiple inheritance is not allowed, and partitioning and inheritance can't be mixed. Partitions can't have extra columns and may not allow nulls unless the parent does. Tuples inserted into the parent are automatically routed to the correct partition, so tuple-routing ON INSERT triggers are not needed. Tuple routing isn't yet supported for partitions which are foreign tables, and it doesn't handle updates that cross partition boundaries. Currently, tables can be range-partitioned or list-partitioned. List partitioning is limited to a single column, but range partitioning can involve multiple columns. A partitioning "column" can be an expression. Because table partitioning is less general than table inheritance, it is hoped that it will be easier to reason about properties of partitions, and therefore that this will serve as a better foundation for a variety of possible optimizations, including query planner optimizations. The tuple routing based which this patch does based on the implicit partitioning constraints is an example of this, but it seems likely that many other useful optimizations are also possible. Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat, Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova, Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
PG_RETURN_TEXT_P(string_to_text(res));
}
/* Internal version that just reports the column definitions */
char *
pg_get_partkeydef_columns(Oid relid, bool pretty)
{
int prettyFlags;
prettyFlags = GET_PRETTY_FLAGS(pretty);
return pg_get_partkeydef_worker(relid, prettyFlags, true, false);
Implement table partitioning. Table partitioning is like table inheritance and reuses much of the existing infrastructure, but there are some important differences. The parent is called a partitioned table and is always empty; it may not have indexes or non-inherited constraints, since those make no sense for a relation with no data of its own. The children are called partitions and contain all of the actual data. Each partition has an implicit partitioning constraint. Multiple inheritance is not allowed, and partitioning and inheritance can't be mixed. Partitions can't have extra columns and may not allow nulls unless the parent does. Tuples inserted into the parent are automatically routed to the correct partition, so tuple-routing ON INSERT triggers are not needed. Tuple routing isn't yet supported for partitions which are foreign tables, and it doesn't handle updates that cross partition boundaries. Currently, tables can be range-partitioned or list-partitioned. List partitioning is limited to a single column, but range partitioning can involve multiple columns. A partitioning "column" can be an expression. Because table partitioning is less general than table inheritance, it is hoped that it will be easier to reason about properties of partitions, and therefore that this will serve as a better foundation for a variety of possible optimizations, including query planner optimizations. The tuple routing based which this patch does based on the implicit partitioning constraints is an example of this, but it seems likely that many other useful optimizations are also possible. Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat, Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova, Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
}
/*
* Internal workhorse to decompile a partition key definition.
*/
static char *
pg_get_partkeydef_worker(Oid relid, int prettyFlags,
bool attrsOnly, bool missing_ok)
Implement table partitioning. Table partitioning is like table inheritance and reuses much of the existing infrastructure, but there are some important differences. The parent is called a partitioned table and is always empty; it may not have indexes or non-inherited constraints, since those make no sense for a relation with no data of its own. The children are called partitions and contain all of the actual data. Each partition has an implicit partitioning constraint. Multiple inheritance is not allowed, and partitioning and inheritance can't be mixed. Partitions can't have extra columns and may not allow nulls unless the parent does. Tuples inserted into the parent are automatically routed to the correct partition, so tuple-routing ON INSERT triggers are not needed. Tuple routing isn't yet supported for partitions which are foreign tables, and it doesn't handle updates that cross partition boundaries. Currently, tables can be range-partitioned or list-partitioned. List partitioning is limited to a single column, but range partitioning can involve multiple columns. A partitioning "column" can be an expression. Because table partitioning is less general than table inheritance, it is hoped that it will be easier to reason about properties of partitions, and therefore that this will serve as a better foundation for a variety of possible optimizations, including query planner optimizations. The tuple routing based which this patch does based on the implicit partitioning constraints is an example of this, but it seems likely that many other useful optimizations are also possible. Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat, Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova, Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
{
Form_pg_partitioned_table form;
Implement table partitioning. Table partitioning is like table inheritance and reuses much of the existing infrastructure, but there are some important differences. The parent is called a partitioned table and is always empty; it may not have indexes or non-inherited constraints, since those make no sense for a relation with no data of its own. The children are called partitions and contain all of the actual data. Each partition has an implicit partitioning constraint. Multiple inheritance is not allowed, and partitioning and inheritance can't be mixed. Partitions can't have extra columns and may not allow nulls unless the parent does. Tuples inserted into the parent are automatically routed to the correct partition, so tuple-routing ON INSERT triggers are not needed. Tuple routing isn't yet supported for partitions which are foreign tables, and it doesn't handle updates that cross partition boundaries. Currently, tables can be range-partitioned or list-partitioned. List partitioning is limited to a single column, but range partitioning can involve multiple columns. A partitioning "column" can be an expression. Because table partitioning is less general than table inheritance, it is hoped that it will be easier to reason about properties of partitions, and therefore that this will serve as a better foundation for a variety of possible optimizations, including query planner optimizations. The tuple routing based which this patch does based on the implicit partitioning constraints is an example of this, but it seems likely that many other useful optimizations are also possible. Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat, Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova, Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
HeapTuple tuple;
oidvector *partclass;
oidvector *partcollation;
List *partexprs;
ListCell *partexpr_item;
List *context;
Datum datum;
StringInfoData buf;
int keyno;
char *str;
char *sep;
tuple = SearchSysCache1(PARTRELID, ObjectIdGetDatum(relid));
if (!HeapTupleIsValid(tuple))
{
if (missing_ok)
return NULL;
Implement table partitioning. Table partitioning is like table inheritance and reuses much of the existing infrastructure, but there are some important differences. The parent is called a partitioned table and is always empty; it may not have indexes or non-inherited constraints, since those make no sense for a relation with no data of its own. The children are called partitions and contain all of the actual data. Each partition has an implicit partitioning constraint. Multiple inheritance is not allowed, and partitioning and inheritance can't be mixed. Partitions can't have extra columns and may not allow nulls unless the parent does. Tuples inserted into the parent are automatically routed to the correct partition, so tuple-routing ON INSERT triggers are not needed. Tuple routing isn't yet supported for partitions which are foreign tables, and it doesn't handle updates that cross partition boundaries. Currently, tables can be range-partitioned or list-partitioned. List partitioning is limited to a single column, but range partitioning can involve multiple columns. A partitioning "column" can be an expression. Because table partitioning is less general than table inheritance, it is hoped that it will be easier to reason about properties of partitions, and therefore that this will serve as a better foundation for a variety of possible optimizations, including query planner optimizations. The tuple routing based which this patch does based on the implicit partitioning constraints is an example of this, but it seems likely that many other useful optimizations are also possible. Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat, Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova, Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
elog(ERROR, "cache lookup failed for partition key of %u", relid);
}
Implement table partitioning. Table partitioning is like table inheritance and reuses much of the existing infrastructure, but there are some important differences. The parent is called a partitioned table and is always empty; it may not have indexes or non-inherited constraints, since those make no sense for a relation with no data of its own. The children are called partitions and contain all of the actual data. Each partition has an implicit partitioning constraint. Multiple inheritance is not allowed, and partitioning and inheritance can't be mixed. Partitions can't have extra columns and may not allow nulls unless the parent does. Tuples inserted into the parent are automatically routed to the correct partition, so tuple-routing ON INSERT triggers are not needed. Tuple routing isn't yet supported for partitions which are foreign tables, and it doesn't handle updates that cross partition boundaries. Currently, tables can be range-partitioned or list-partitioned. List partitioning is limited to a single column, but range partitioning can involve multiple columns. A partitioning "column" can be an expression. Because table partitioning is less general than table inheritance, it is hoped that it will be easier to reason about properties of partitions, and therefore that this will serve as a better foundation for a variety of possible optimizations, including query planner optimizations. The tuple routing based which this patch does based on the implicit partitioning constraints is an example of this, but it seems likely that many other useful optimizations are also possible. Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat, Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova, Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
form = (Form_pg_partitioned_table) GETSTRUCT(tuple);
Assert(form->partrelid == relid);
/* Must get partclass and partcollation the hard way */
datum = SysCacheGetAttrNotNull(PARTRELID, tuple,
Anum_pg_partitioned_table_partclass);
Implement table partitioning. Table partitioning is like table inheritance and reuses much of the existing infrastructure, but there are some important differences. The parent is called a partitioned table and is always empty; it may not have indexes or non-inherited constraints, since those make no sense for a relation with no data of its own. The children are called partitions and contain all of the actual data. Each partition has an implicit partitioning constraint. Multiple inheritance is not allowed, and partitioning and inheritance can't be mixed. Partitions can't have extra columns and may not allow nulls unless the parent does. Tuples inserted into the parent are automatically routed to the correct partition, so tuple-routing ON INSERT triggers are not needed. Tuple routing isn't yet supported for partitions which are foreign tables, and it doesn't handle updates that cross partition boundaries. Currently, tables can be range-partitioned or list-partitioned. List partitioning is limited to a single column, but range partitioning can involve multiple columns. A partitioning "column" can be an expression. Because table partitioning is less general than table inheritance, it is hoped that it will be easier to reason about properties of partitions, and therefore that this will serve as a better foundation for a variety of possible optimizations, including query planner optimizations. The tuple routing based which this patch does based on the implicit partitioning constraints is an example of this, but it seems likely that many other useful optimizations are also possible. Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat, Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova, Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
partclass = (oidvector *) DatumGetPointer(datum);
datum = SysCacheGetAttrNotNull(PARTRELID, tuple,
Anum_pg_partitioned_table_partcollation);
Implement table partitioning. Table partitioning is like table inheritance and reuses much of the existing infrastructure, but there are some important differences. The parent is called a partitioned table and is always empty; it may not have indexes or non-inherited constraints, since those make no sense for a relation with no data of its own. The children are called partitions and contain all of the actual data. Each partition has an implicit partitioning constraint. Multiple inheritance is not allowed, and partitioning and inheritance can't be mixed. Partitions can't have extra columns and may not allow nulls unless the parent does. Tuples inserted into the parent are automatically routed to the correct partition, so tuple-routing ON INSERT triggers are not needed. Tuple routing isn't yet supported for partitions which are foreign tables, and it doesn't handle updates that cross partition boundaries. Currently, tables can be range-partitioned or list-partitioned. List partitioning is limited to a single column, but range partitioning can involve multiple columns. A partitioning "column" can be an expression. Because table partitioning is less general than table inheritance, it is hoped that it will be easier to reason about properties of partitions, and therefore that this will serve as a better foundation for a variety of possible optimizations, including query planner optimizations. The tuple routing based which this patch does based on the implicit partitioning constraints is an example of this, but it seems likely that many other useful optimizations are also possible. Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat, Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova, Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
partcollation = (oidvector *) DatumGetPointer(datum);
/*
* Get the expressions, if any. (NOTE: we do not use the relcache
* versions of the expressions, because we want to display
* non-const-folded expressions.)
Implement table partitioning. Table partitioning is like table inheritance and reuses much of the existing infrastructure, but there are some important differences. The parent is called a partitioned table and is always empty; it may not have indexes or non-inherited constraints, since those make no sense for a relation with no data of its own. The children are called partitions and contain all of the actual data. Each partition has an implicit partitioning constraint. Multiple inheritance is not allowed, and partitioning and inheritance can't be mixed. Partitions can't have extra columns and may not allow nulls unless the parent does. Tuples inserted into the parent are automatically routed to the correct partition, so tuple-routing ON INSERT triggers are not needed. Tuple routing isn't yet supported for partitions which are foreign tables, and it doesn't handle updates that cross partition boundaries. Currently, tables can be range-partitioned or list-partitioned. List partitioning is limited to a single column, but range partitioning can involve multiple columns. A partitioning "column" can be an expression. Because table partitioning is less general than table inheritance, it is hoped that it will be easier to reason about properties of partitions, and therefore that this will serve as a better foundation for a variety of possible optimizations, including query planner optimizations. The tuple routing based which this patch does based on the implicit partitioning constraints is an example of this, but it seems likely that many other useful optimizations are also possible. Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat, Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova, Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
*/
if (!heap_attisnull(tuple, Anum_pg_partitioned_table_partexprs, NULL))
Implement table partitioning. Table partitioning is like table inheritance and reuses much of the existing infrastructure, but there are some important differences. The parent is called a partitioned table and is always empty; it may not have indexes or non-inherited constraints, since those make no sense for a relation with no data of its own. The children are called partitions and contain all of the actual data. Each partition has an implicit partitioning constraint. Multiple inheritance is not allowed, and partitioning and inheritance can't be mixed. Partitions can't have extra columns and may not allow nulls unless the parent does. Tuples inserted into the parent are automatically routed to the correct partition, so tuple-routing ON INSERT triggers are not needed. Tuple routing isn't yet supported for partitions which are foreign tables, and it doesn't handle updates that cross partition boundaries. Currently, tables can be range-partitioned or list-partitioned. List partitioning is limited to a single column, but range partitioning can involve multiple columns. A partitioning "column" can be an expression. Because table partitioning is less general than table inheritance, it is hoped that it will be easier to reason about properties of partitions, and therefore that this will serve as a better foundation for a variety of possible optimizations, including query planner optimizations. The tuple routing based which this patch does based on the implicit partitioning constraints is an example of this, but it seems likely that many other useful optimizations are also possible. Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat, Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova, Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
{
Datum exprsDatum;
char *exprsString;
exprsDatum = SysCacheGetAttrNotNull(PARTRELID, tuple,
Anum_pg_partitioned_table_partexprs);
Implement table partitioning. Table partitioning is like table inheritance and reuses much of the existing infrastructure, but there are some important differences. The parent is called a partitioned table and is always empty; it may not have indexes or non-inherited constraints, since those make no sense for a relation with no data of its own. The children are called partitions and contain all of the actual data. Each partition has an implicit partitioning constraint. Multiple inheritance is not allowed, and partitioning and inheritance can't be mixed. Partitions can't have extra columns and may not allow nulls unless the parent does. Tuples inserted into the parent are automatically routed to the correct partition, so tuple-routing ON INSERT triggers are not needed. Tuple routing isn't yet supported for partitions which are foreign tables, and it doesn't handle updates that cross partition boundaries. Currently, tables can be range-partitioned or list-partitioned. List partitioning is limited to a single column, but range partitioning can involve multiple columns. A partitioning "column" can be an expression. Because table partitioning is less general than table inheritance, it is hoped that it will be easier to reason about properties of partitions, and therefore that this will serve as a better foundation for a variety of possible optimizations, including query planner optimizations. The tuple routing based which this patch does based on the implicit partitioning constraints is an example of this, but it seems likely that many other useful optimizations are also possible. Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat, Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova, Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
exprsString = TextDatumGetCString(exprsDatum);
partexprs = (List *) stringToNode(exprsString);
if (!IsA(partexprs, List))
elog(ERROR, "unexpected node type found in partexprs: %d",
(int) nodeTag(partexprs));
Implement table partitioning. Table partitioning is like table inheritance and reuses much of the existing infrastructure, but there are some important differences. The parent is called a partitioned table and is always empty; it may not have indexes or non-inherited constraints, since those make no sense for a relation with no data of its own. The children are called partitions and contain all of the actual data. Each partition has an implicit partitioning constraint. Multiple inheritance is not allowed, and partitioning and inheritance can't be mixed. Partitions can't have extra columns and may not allow nulls unless the parent does. Tuples inserted into the parent are automatically routed to the correct partition, so tuple-routing ON INSERT triggers are not needed. Tuple routing isn't yet supported for partitions which are foreign tables, and it doesn't handle updates that cross partition boundaries. Currently, tables can be range-partitioned or list-partitioned. List partitioning is limited to a single column, but range partitioning can involve multiple columns. A partitioning "column" can be an expression. Because table partitioning is less general than table inheritance, it is hoped that it will be easier to reason about properties of partitions, and therefore that this will serve as a better foundation for a variety of possible optimizations, including query planner optimizations. The tuple routing based which this patch does based on the implicit partitioning constraints is an example of this, but it seems likely that many other useful optimizations are also possible. Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat, Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova, Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
pfree(exprsString);
}
else
partexprs = NIL;
partexpr_item = list_head(partexprs);
context = deparse_context_for(get_relation_name(relid), relid);
initStringInfo(&buf);
switch (form->partstrat)
{
case PARTITION_STRATEGY_HASH:
if (!attrsOnly)
appendStringInfoString(&buf, "HASH");
break;
Implement table partitioning. Table partitioning is like table inheritance and reuses much of the existing infrastructure, but there are some important differences. The parent is called a partitioned table and is always empty; it may not have indexes or non-inherited constraints, since those make no sense for a relation with no data of its own. The children are called partitions and contain all of the actual data. Each partition has an implicit partitioning constraint. Multiple inheritance is not allowed, and partitioning and inheritance can't be mixed. Partitions can't have extra columns and may not allow nulls unless the parent does. Tuples inserted into the parent are automatically routed to the correct partition, so tuple-routing ON INSERT triggers are not needed. Tuple routing isn't yet supported for partitions which are foreign tables, and it doesn't handle updates that cross partition boundaries. Currently, tables can be range-partitioned or list-partitioned. List partitioning is limited to a single column, but range partitioning can involve multiple columns. A partitioning "column" can be an expression. Because table partitioning is less general than table inheritance, it is hoped that it will be easier to reason about properties of partitions, and therefore that this will serve as a better foundation for a variety of possible optimizations, including query planner optimizations. The tuple routing based which this patch does based on the implicit partitioning constraints is an example of this, but it seems likely that many other useful optimizations are also possible. Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat, Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova, Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
case PARTITION_STRATEGY_LIST:
if (!attrsOnly)
appendStringInfoString(&buf, "LIST");
Implement table partitioning. Table partitioning is like table inheritance and reuses much of the existing infrastructure, but there are some important differences. The parent is called a partitioned table and is always empty; it may not have indexes or non-inherited constraints, since those make no sense for a relation with no data of its own. The children are called partitions and contain all of the actual data. Each partition has an implicit partitioning constraint. Multiple inheritance is not allowed, and partitioning and inheritance can't be mixed. Partitions can't have extra columns and may not allow nulls unless the parent does. Tuples inserted into the parent are automatically routed to the correct partition, so tuple-routing ON INSERT triggers are not needed. Tuple routing isn't yet supported for partitions which are foreign tables, and it doesn't handle updates that cross partition boundaries. Currently, tables can be range-partitioned or list-partitioned. List partitioning is limited to a single column, but range partitioning can involve multiple columns. A partitioning "column" can be an expression. Because table partitioning is less general than table inheritance, it is hoped that it will be easier to reason about properties of partitions, and therefore that this will serve as a better foundation for a variety of possible optimizations, including query planner optimizations. The tuple routing based which this patch does based on the implicit partitioning constraints is an example of this, but it seems likely that many other useful optimizations are also possible. Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat, Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova, Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
break;
case PARTITION_STRATEGY_RANGE:
if (!attrsOnly)
appendStringInfoString(&buf, "RANGE");
Implement table partitioning. Table partitioning is like table inheritance and reuses much of the existing infrastructure, but there are some important differences. The parent is called a partitioned table and is always empty; it may not have indexes or non-inherited constraints, since those make no sense for a relation with no data of its own. The children are called partitions and contain all of the actual data. Each partition has an implicit partitioning constraint. Multiple inheritance is not allowed, and partitioning and inheritance can't be mixed. Partitions can't have extra columns and may not allow nulls unless the parent does. Tuples inserted into the parent are automatically routed to the correct partition, so tuple-routing ON INSERT triggers are not needed. Tuple routing isn't yet supported for partitions which are foreign tables, and it doesn't handle updates that cross partition boundaries. Currently, tables can be range-partitioned or list-partitioned. List partitioning is limited to a single column, but range partitioning can involve multiple columns. A partitioning "column" can be an expression. Because table partitioning is less general than table inheritance, it is hoped that it will be easier to reason about properties of partitions, and therefore that this will serve as a better foundation for a variety of possible optimizations, including query planner optimizations. The tuple routing based which this patch does based on the implicit partitioning constraints is an example of this, but it seems likely that many other useful optimizations are also possible. Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat, Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova, Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
break;
default:
elog(ERROR, "unexpected partition strategy: %d",
(int) form->partstrat);
Implement table partitioning. Table partitioning is like table inheritance and reuses much of the existing infrastructure, but there are some important differences. The parent is called a partitioned table and is always empty; it may not have indexes or non-inherited constraints, since those make no sense for a relation with no data of its own. The children are called partitions and contain all of the actual data. Each partition has an implicit partitioning constraint. Multiple inheritance is not allowed, and partitioning and inheritance can't be mixed. Partitions can't have extra columns and may not allow nulls unless the parent does. Tuples inserted into the parent are automatically routed to the correct partition, so tuple-routing ON INSERT triggers are not needed. Tuple routing isn't yet supported for partitions which are foreign tables, and it doesn't handle updates that cross partition boundaries. Currently, tables can be range-partitioned or list-partitioned. List partitioning is limited to a single column, but range partitioning can involve multiple columns. A partitioning "column" can be an expression. Because table partitioning is less general than table inheritance, it is hoped that it will be easier to reason about properties of partitions, and therefore that this will serve as a better foundation for a variety of possible optimizations, including query planner optimizations. The tuple routing based which this patch does based on the implicit partitioning constraints is an example of this, but it seems likely that many other useful optimizations are also possible. Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat, Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova, Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
}
if (!attrsOnly)
appendStringInfoString(&buf, " (");
Implement table partitioning. Table partitioning is like table inheritance and reuses much of the existing infrastructure, but there are some important differences. The parent is called a partitioned table and is always empty; it may not have indexes or non-inherited constraints, since those make no sense for a relation with no data of its own. The children are called partitions and contain all of the actual data. Each partition has an implicit partitioning constraint. Multiple inheritance is not allowed, and partitioning and inheritance can't be mixed. Partitions can't have extra columns and may not allow nulls unless the parent does. Tuples inserted into the parent are automatically routed to the correct partition, so tuple-routing ON INSERT triggers are not needed. Tuple routing isn't yet supported for partitions which are foreign tables, and it doesn't handle updates that cross partition boundaries. Currently, tables can be range-partitioned or list-partitioned. List partitioning is limited to a single column, but range partitioning can involve multiple columns. A partitioning "column" can be an expression. Because table partitioning is less general than table inheritance, it is hoped that it will be easier to reason about properties of partitions, and therefore that this will serve as a better foundation for a variety of possible optimizations, including query planner optimizations. The tuple routing based which this patch does based on the implicit partitioning constraints is an example of this, but it seems likely that many other useful optimizations are also possible. Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat, Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova, Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
sep = "";
for (keyno = 0; keyno < form->partnatts; keyno++)
{
AttrNumber attnum = form->partattrs.values[keyno];
Oid keycoltype;
Oid keycolcollation;
Oid partcoll;
appendStringInfoString(&buf, sep);
sep = ", ";
if (attnum != 0)
{
/* Simple attribute reference */
char *attname;
int32 keycoltypmod;
attname = get_attname(relid, attnum, false);
Implement table partitioning. Table partitioning is like table inheritance and reuses much of the existing infrastructure, but there are some important differences. The parent is called a partitioned table and is always empty; it may not have indexes or non-inherited constraints, since those make no sense for a relation with no data of its own. The children are called partitions and contain all of the actual data. Each partition has an implicit partitioning constraint. Multiple inheritance is not allowed, and partitioning and inheritance can't be mixed. Partitions can't have extra columns and may not allow nulls unless the parent does. Tuples inserted into the parent are automatically routed to the correct partition, so tuple-routing ON INSERT triggers are not needed. Tuple routing isn't yet supported for partitions which are foreign tables, and it doesn't handle updates that cross partition boundaries. Currently, tables can be range-partitioned or list-partitioned. List partitioning is limited to a single column, but range partitioning can involve multiple columns. A partitioning "column" can be an expression. Because table partitioning is less general than table inheritance, it is hoped that it will be easier to reason about properties of partitions, and therefore that this will serve as a better foundation for a variety of possible optimizations, including query planner optimizations. The tuple routing based which this patch does based on the implicit partitioning constraints is an example of this, but it seems likely that many other useful optimizations are also possible. Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat, Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova, Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
appendStringInfoString(&buf, quote_identifier(attname));
get_atttypetypmodcoll(relid, attnum,
&keycoltype, &keycoltypmod,
&keycolcollation);
}
else
{
/* Expression */
Node *partkey;
if (partexpr_item == NULL)
elog(ERROR, "too few entries in partexprs list");
partkey = (Node *) lfirst(partexpr_item);
Represent Lists as expansible arrays, not chains of cons-cells. Originally, Postgres Lists were a more or less exact reimplementation of Lisp lists, which consist of chains of separately-allocated cons cells, each having a value and a next-cell link. We'd hacked that once before (commit d0b4399d8) to add a separate List header, but the data was still in cons cells. That makes some operations -- notably list_nth() -- O(N), and it's bulky because of the next-cell pointers and per-cell palloc overhead, and it's very cache-unfriendly if the cons cells end up scattered around rather than being adjacent. In this rewrite, we still have List headers, but the data is in a resizable array of values, with no next-cell links. Now we need at most two palloc's per List, and often only one, since we can allocate some values in the same palloc call as the List header. (Of course, extending an existing List may require repalloc's to enlarge the array. But this involves just O(log N) allocations not O(N).) Of course this is not without downsides. The key difficulty is that addition or deletion of a list entry may now cause other entries to move, which it did not before. For example, that breaks foreach() and sister macros, which historically used a pointer to the current cons-cell as loop state. We can repair those macros transparently by making their actual loop state be an integer list index; the exposed "ListCell *" pointer is no longer state carried across loop iterations, but is just a derived value. (In practice, modern compilers can optimize things back to having just one loop state value, at least for simple cases with inline loop bodies.) In principle, this is a semantics change for cases where the loop body inserts or deletes list entries ahead of the current loop index; but I found no such cases in the Postgres code. The change is not at all transparent for code that doesn't use foreach() but chases lists "by hand" using lnext(). The largest share of such code in the backend is in loops that were maintaining "prev" and "next" variables in addition to the current-cell pointer, in order to delete list cells efficiently using list_delete_cell(). However, we no longer need a previous-cell pointer to delete a list cell efficiently. Keeping a next-cell pointer doesn't work, as explained above, but we can improve matters by changing such code to use a regular foreach() loop and then using the new macro foreach_delete_current() to delete the current cell. (This macro knows how to update the associated foreach loop's state so that no cells will be missed in the traversal.) There remains a nontrivial risk of code assuming that a ListCell * pointer will remain good over an operation that could now move the list contents. To help catch such errors, list.c can be compiled with a new define symbol DEBUG_LIST_MEMORY_USAGE that forcibly moves list contents whenever that could possibly happen. This makes list operations significantly more expensive so it's not normally turned on (though it is on by default if USE_VALGRIND is on). There are two notable API differences from the previous code: * lnext() now requires the List's header pointer in addition to the current cell's address. * list_delete_cell() no longer requires a previous-cell argument. These changes are somewhat unfortunate, but on the other hand code using either function needs inspection to see if it is assuming anything it shouldn't, so it's not all bad. Programmers should be aware of these significant performance changes: * list_nth() and related functions are now O(1); so there's no major access-speed difference between a list and an array. * Inserting or deleting a list element now takes time proportional to the distance to the end of the list, due to moving the array elements. (However, it typically *doesn't* require palloc or pfree, so except in long lists it's probably still faster than before.) Notably, lcons() used to be about the same cost as lappend(), but that's no longer true if the list is long. Code that uses lcons() and list_delete_first() to maintain a stack might usefully be rewritten to push and pop at the end of the list rather than the beginning. * There are now list_insert_nth...() and list_delete_nth...() functions that add or remove a list cell identified by index. These have the data-movement penalty explained above, but there's no search penalty. * list_concat() and variants now copy the second list's data into storage belonging to the first list, so there is no longer any sharing of cells between the input lists. The second argument is now declared "const List *" to reflect that it isn't changed. This patch just does the minimum needed to get the new implementation in place and fix bugs exposed by the regression tests. As suggested by the foregoing, there's a fair amount of followup work remaining to do. Also, the ENABLE_LIST_COMPAT macros are finally removed in this commit. Code using those should have been gone a dozen years ago. Patch by me; thanks to David Rowley, Jesper Pedersen, and others for review. Discussion: https://postgr.es/m/11587.1550975080@sss.pgh.pa.us
2019-07-15 19:41:58 +02:00
partexpr_item = lnext(partexprs, partexpr_item);
Implement table partitioning. Table partitioning is like table inheritance and reuses much of the existing infrastructure, but there are some important differences. The parent is called a partitioned table and is always empty; it may not have indexes or non-inherited constraints, since those make no sense for a relation with no data of its own. The children are called partitions and contain all of the actual data. Each partition has an implicit partitioning constraint. Multiple inheritance is not allowed, and partitioning and inheritance can't be mixed. Partitions can't have extra columns and may not allow nulls unless the parent does. Tuples inserted into the parent are automatically routed to the correct partition, so tuple-routing ON INSERT triggers are not needed. Tuple routing isn't yet supported for partitions which are foreign tables, and it doesn't handle updates that cross partition boundaries. Currently, tables can be range-partitioned or list-partitioned. List partitioning is limited to a single column, but range partitioning can involve multiple columns. A partitioning "column" can be an expression. Because table partitioning is less general than table inheritance, it is hoped that it will be easier to reason about properties of partitions, and therefore that this will serve as a better foundation for a variety of possible optimizations, including query planner optimizations. The tuple routing based which this patch does based on the implicit partitioning constraints is an example of this, but it seems likely that many other useful optimizations are also possible. Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat, Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova, Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
/* Deparse */
str = deparse_expression_pretty(partkey, context, false, false,
prettyFlags, 0);
/* Need parens if it's not a bare function call */
if (looks_like_function(partkey))
appendStringInfoString(&buf, str);
else
appendStringInfo(&buf, "(%s)", str);
Implement table partitioning. Table partitioning is like table inheritance and reuses much of the existing infrastructure, but there are some important differences. The parent is called a partitioned table and is always empty; it may not have indexes or non-inherited constraints, since those make no sense for a relation with no data of its own. The children are called partitions and contain all of the actual data. Each partition has an implicit partitioning constraint. Multiple inheritance is not allowed, and partitioning and inheritance can't be mixed. Partitions can't have extra columns and may not allow nulls unless the parent does. Tuples inserted into the parent are automatically routed to the correct partition, so tuple-routing ON INSERT triggers are not needed. Tuple routing isn't yet supported for partitions which are foreign tables, and it doesn't handle updates that cross partition boundaries. Currently, tables can be range-partitioned or list-partitioned. List partitioning is limited to a single column, but range partitioning can involve multiple columns. A partitioning "column" can be an expression. Because table partitioning is less general than table inheritance, it is hoped that it will be easier to reason about properties of partitions, and therefore that this will serve as a better foundation for a variety of possible optimizations, including query planner optimizations. The tuple routing based which this patch does based on the implicit partitioning constraints is an example of this, but it seems likely that many other useful optimizations are also possible. Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat, Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova, Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
keycoltype = exprType(partkey);
keycolcollation = exprCollation(partkey);
}
/* Add collation, if not default for column */
partcoll = partcollation->values[keyno];
if (!attrsOnly && OidIsValid(partcoll) && partcoll != keycolcollation)
Implement table partitioning. Table partitioning is like table inheritance and reuses much of the existing infrastructure, but there are some important differences. The parent is called a partitioned table and is always empty; it may not have indexes or non-inherited constraints, since those make no sense for a relation with no data of its own. The children are called partitions and contain all of the actual data. Each partition has an implicit partitioning constraint. Multiple inheritance is not allowed, and partitioning and inheritance can't be mixed. Partitions can't have extra columns and may not allow nulls unless the parent does. Tuples inserted into the parent are automatically routed to the correct partition, so tuple-routing ON INSERT triggers are not needed. Tuple routing isn't yet supported for partitions which are foreign tables, and it doesn't handle updates that cross partition boundaries. Currently, tables can be range-partitioned or list-partitioned. List partitioning is limited to a single column, but range partitioning can involve multiple columns. A partitioning "column" can be an expression. Because table partitioning is less general than table inheritance, it is hoped that it will be easier to reason about properties of partitions, and therefore that this will serve as a better foundation for a variety of possible optimizations, including query planner optimizations. The tuple routing based which this patch does based on the implicit partitioning constraints is an example of this, but it seems likely that many other useful optimizations are also possible. Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat, Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova, Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
appendStringInfo(&buf, " COLLATE %s",
generate_collation_name((partcoll)));
/* Add the operator class name, if not default */
if (!attrsOnly)
get_opclass_name(partclass->values[keyno], keycoltype, &buf);
Implement table partitioning. Table partitioning is like table inheritance and reuses much of the existing infrastructure, but there are some important differences. The parent is called a partitioned table and is always empty; it may not have indexes or non-inherited constraints, since those make no sense for a relation with no data of its own. The children are called partitions and contain all of the actual data. Each partition has an implicit partitioning constraint. Multiple inheritance is not allowed, and partitioning and inheritance can't be mixed. Partitions can't have extra columns and may not allow nulls unless the parent does. Tuples inserted into the parent are automatically routed to the correct partition, so tuple-routing ON INSERT triggers are not needed. Tuple routing isn't yet supported for partitions which are foreign tables, and it doesn't handle updates that cross partition boundaries. Currently, tables can be range-partitioned or list-partitioned. List partitioning is limited to a single column, but range partitioning can involve multiple columns. A partitioning "column" can be an expression. Because table partitioning is less general than table inheritance, it is hoped that it will be easier to reason about properties of partitions, and therefore that this will serve as a better foundation for a variety of possible optimizations, including query planner optimizations. The tuple routing based which this patch does based on the implicit partitioning constraints is an example of this, but it seems likely that many other useful optimizations are also possible. Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat, Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova, Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
}
if (!attrsOnly)
appendStringInfoChar(&buf, ')');
Implement table partitioning. Table partitioning is like table inheritance and reuses much of the existing infrastructure, but there are some important differences. The parent is called a partitioned table and is always empty; it may not have indexes or non-inherited constraints, since those make no sense for a relation with no data of its own. The children are called partitions and contain all of the actual data. Each partition has an implicit partitioning constraint. Multiple inheritance is not allowed, and partitioning and inheritance can't be mixed. Partitions can't have extra columns and may not allow nulls unless the parent does. Tuples inserted into the parent are automatically routed to the correct partition, so tuple-routing ON INSERT triggers are not needed. Tuple routing isn't yet supported for partitions which are foreign tables, and it doesn't handle updates that cross partition boundaries. Currently, tables can be range-partitioned or list-partitioned. List partitioning is limited to a single column, but range partitioning can involve multiple columns. A partitioning "column" can be an expression. Because table partitioning is less general than table inheritance, it is hoped that it will be easier to reason about properties of partitions, and therefore that this will serve as a better foundation for a variety of possible optimizations, including query planner optimizations. The tuple routing based which this patch does based on the implicit partitioning constraints is an example of this, but it seems likely that many other useful optimizations are also possible. Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat, Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova, Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
/* Clean up */
ReleaseSysCache(tuple);
return buf.data;
}
/*
* pg_get_partition_constraintdef
*
* Returns partition constraint expression as a string for the input relation
*/
Datum
pg_get_partition_constraintdef(PG_FUNCTION_ARGS)
{
Oid relationId = PG_GETARG_OID(0);
Expr *constr_expr;
int prettyFlags;
List *context;
char *consrc;
constr_expr = get_partition_qual_relid(relationId);
/* Quick exit if no partition constraint */
if (constr_expr == NULL)
PG_RETURN_NULL();
/*
* Deparse and return the constraint expression.
*/
prettyFlags = PRETTYFLAG_INDENT;
context = deparse_context_for(get_relation_name(relationId), relationId);
consrc = deparse_expression_pretty((Node *) constr_expr, context, false,
false, prettyFlags, 0);
PG_RETURN_TEXT_P(string_to_text(consrc));
}
/*
* pg_get_partconstrdef_string
*
* Returns the partition constraint as a C-string for the input relation, with
* the given alias. No pretty-printing.
*/
char *
pg_get_partconstrdef_string(Oid partitionId, char *aliasname)
{
Expr *constr_expr;
List *context;
constr_expr = get_partition_qual_relid(partitionId);
context = deparse_context_for(aliasname, partitionId);
return deparse_expression((Node *) constr_expr, context, true, false);
}
/*
* pg_get_constraintdef
*
* Returns the definition for the constraint, ie, everything that needs to
* appear after "ALTER TABLE ... ADD CONSTRAINT <constraintname>".
*/
Datum
pg_get_constraintdef(PG_FUNCTION_ARGS)
{
Oid constraintId = PG_GETARG_OID(0);
int prettyFlags;
char *res;
prettyFlags = PRETTYFLAG_INDENT;
res = pg_get_constraintdef_worker(constraintId, false, prettyFlags, true);
if (res == NULL)
PG_RETURN_NULL();
PG_RETURN_TEXT_P(string_to_text(res));
}
Datum
pg_get_constraintdef_ext(PG_FUNCTION_ARGS)
{
Oid constraintId = PG_GETARG_OID(0);
bool pretty = PG_GETARG_BOOL(1);
int prettyFlags;
char *res;
prettyFlags = GET_PRETTY_FLAGS(pretty);
res = pg_get_constraintdef_worker(constraintId, false, prettyFlags, true);
if (res == NULL)
PG_RETURN_NULL();
PG_RETURN_TEXT_P(string_to_text(res));
}
Fix handling of inherited check constraints in ALTER COLUMN TYPE (again). The previous way of reconstructing check constraints was to do a separate "ALTER TABLE ONLY tab ADD CONSTRAINT" for each table in an inheritance hierarchy. However, that way has no hope of reconstructing the check constraints' own inheritance properties correctly, as pointed out in bug #13779 from Jan Dirk Zijlstra. What we should do instead is to do a regular "ALTER TABLE", allowing recursion, at the topmost table that has a particular constraint, and then suppress the work queue entries for inherited instances of the constraint. Annoyingly, we'd tried to fix this behavior before, in commit 5ed6546cf, but we failed to notice that it wasn't reconstructing the pg_constraint field values correctly. As long as I'm touching pg_get_constraintdef_worker anyway, tweak it to always schema-qualify the target table name; this seems like useful backup to the protections installed by commit 5f173040. In HEAD/9.5, get rid of get_constraint_relation_oids, which is now unused. (I could alternatively have modified it to also return conislocal, but that seemed like a pretty single-purpose API, so let's not pretend it has some other use.) It's unused in the back branches as well, but I left it in place just in case some third-party code has decided to use it. In HEAD/9.5, also rename pg_get_constraintdef_string to pg_get_constraintdef_command, as the previous name did nothing to explain what that entry point did differently from others (and its comment was equally useless). Again, that change doesn't seem like material for back-patching. I did a bit of re-pgindenting in tablecmds.c in HEAD/9.5, as well. Otherwise, back-patch to all supported branches.
2015-11-20 20:55:28 +01:00
/*
* Internal version that returns a full ALTER TABLE ... ADD CONSTRAINT command
*/
char *
Fix handling of inherited check constraints in ALTER COLUMN TYPE (again). The previous way of reconstructing check constraints was to do a separate "ALTER TABLE ONLY tab ADD CONSTRAINT" for each table in an inheritance hierarchy. However, that way has no hope of reconstructing the check constraints' own inheritance properties correctly, as pointed out in bug #13779 from Jan Dirk Zijlstra. What we should do instead is to do a regular "ALTER TABLE", allowing recursion, at the topmost table that has a particular constraint, and then suppress the work queue entries for inherited instances of the constraint. Annoyingly, we'd tried to fix this behavior before, in commit 5ed6546cf, but we failed to notice that it wasn't reconstructing the pg_constraint field values correctly. As long as I'm touching pg_get_constraintdef_worker anyway, tweak it to always schema-qualify the target table name; this seems like useful backup to the protections installed by commit 5f173040. In HEAD/9.5, get rid of get_constraint_relation_oids, which is now unused. (I could alternatively have modified it to also return conislocal, but that seemed like a pretty single-purpose API, so let's not pretend it has some other use.) It's unused in the back branches as well, but I left it in place just in case some third-party code has decided to use it. In HEAD/9.5, also rename pg_get_constraintdef_string to pg_get_constraintdef_command, as the previous name did nothing to explain what that entry point did differently from others (and its comment was equally useless). Again, that change doesn't seem like material for back-patching. I did a bit of re-pgindenting in tablecmds.c in HEAD/9.5, as well. Otherwise, back-patch to all supported branches.
2015-11-20 20:55:28 +01:00
pg_get_constraintdef_command(Oid constraintId)
{
return pg_get_constraintdef_worker(constraintId, true, 0, false);
}
/*
* As of 9.4, we now use an MVCC snapshot for this.
*/
static char *
pg_get_constraintdef_worker(Oid constraintId, bool fullCommand,
int prettyFlags, bool missing_ok)
{
HeapTuple tup;
Form_pg_constraint conForm;
StringInfoData buf;
SysScanDesc scandesc;
ScanKeyData scankey[1];
Snapshot snapshot = RegisterSnapshot(GetTransactionSnapshot());
Relation relation = table_open(ConstraintRelationId, AccessShareLock);
ScanKeyInit(&scankey[0],
Remove WITH OIDS support, change oid catalog column visibility. Previously tables declared WITH OIDS, including a significant fraction of the catalog tables, stored the oid column not as a normal column, but as part of the tuple header. This special column was not shown by default, which was somewhat odd, as it's often (consider e.g. pg_class.oid) one of the more important parts of a row. Neither pg_dump nor COPY included the contents of the oid column by default. The fact that the oid column was not an ordinary column necessitated a significant amount of special case code to support oid columns. That already was painful for the existing, but upcoming work aiming to make table storage pluggable, would have required expanding and duplicating that "specialness" significantly. WITH OIDS has been deprecated since 2005 (commit ff02d0a05280e0). Remove it. Removing includes: - CREATE TABLE and ALTER TABLE syntax for declaring the table to be WITH OIDS has been removed (WITH (oids[ = true]) will error out) - pg_dump does not support dumping tables declared WITH OIDS and will issue a warning when dumping one (and ignore the oid column). - restoring an pg_dump archive with pg_restore will warn when restoring a table with oid contents (and ignore the oid column) - COPY will refuse to load binary dump that includes oids. - pg_upgrade will error out when encountering tables declared WITH OIDS, they have to be altered to remove the oid column first. - Functionality to access the oid of the last inserted row (like plpgsql's RESULT_OID, spi's SPI_lastoid, ...) has been removed. The syntax for declaring a table WITHOUT OIDS (or WITH (oids = false) for CREATE TABLE) is still supported. While that requires a bit of support code, it seems unnecessary to break applications / dumps that do not use oids, and are explicit about not using them. The biggest user of WITH OID columns was postgres' catalog. This commit changes all 'magic' oid columns to be columns that are normally declared and stored. To reduce unnecessary query breakage all the newly added columns are still named 'oid', even if a table's column naming scheme would indicate 'reloid' or such. This obviously requires adapting a lot code, mostly replacing oid access via HeapTupleGetOid() with access to the underlying Form_pg_*->oid column. The bootstrap process now assigns oids for all oid columns in genbki.pl that do not have an explicit value (starting at the largest oid previously used), only oids assigned later by oids will be above FirstBootstrapObjectId. As the oid column now is a normal column the special bootstrap syntax for oids has been removed. Oids are not automatically assigned during insertion anymore, all backend code explicitly assigns oids with GetNewOidWithIndex(). For the rare case that insertions into the catalog via SQL are called for the new pg_nextoid() function can be used (which only works on catalog tables). The fact that oid columns on system tables are now normal columns means that they will be included in the set of columns expanded by * (i.e. SELECT * FROM pg_class will now include the table's oid, previously it did not). It'd not technically be hard to hide oid column by default, but that'd mean confusing behavior would either have to be carried forward forever, or it'd cause breakage down the line. While it's not unlikely that further adjustments are needed, the scope/invasiveness of the patch makes it worthwhile to get merge this now. It's painful to maintain externally, too complicated to commit after the code code freeze, and a dependency of a number of other patches. Catversion bump, for obvious reasons. Author: Andres Freund, with contributions by John Naylor Discussion: https://postgr.es/m/20180930034810.ywp2c7awz7opzcfr@alap3.anarazel.de
2018-11-21 00:36:57 +01:00
Anum_pg_constraint_oid,
BTEqualStrategyNumber, F_OIDEQ,
ObjectIdGetDatum(constraintId));
scandesc = systable_beginscan(relation,
ConstraintOidIndexId,
true,
snapshot,
1,
scankey);
/*
* We later use the tuple with SysCacheGetAttr() as if we had obtained it
* via SearchSysCache, which works fine.
*/
tup = systable_getnext(scandesc);
UnregisterSnapshot(snapshot);
if (!HeapTupleIsValid(tup))
{
if (missing_ok)
{
systable_endscan(scandesc);
table_close(relation, AccessShareLock);
return NULL;
}
elog(ERROR, "could not find tuple for constraint %u", constraintId);
}
conForm = (Form_pg_constraint) GETSTRUCT(tup);
initStringInfo(&buf);
Fix handling of inherited check constraints in ALTER COLUMN TYPE (again). The previous way of reconstructing check constraints was to do a separate "ALTER TABLE ONLY tab ADD CONSTRAINT" for each table in an inheritance hierarchy. However, that way has no hope of reconstructing the check constraints' own inheritance properties correctly, as pointed out in bug #13779 from Jan Dirk Zijlstra. What we should do instead is to do a regular "ALTER TABLE", allowing recursion, at the topmost table that has a particular constraint, and then suppress the work queue entries for inherited instances of the constraint. Annoyingly, we'd tried to fix this behavior before, in commit 5ed6546cf, but we failed to notice that it wasn't reconstructing the pg_constraint field values correctly. As long as I'm touching pg_get_constraintdef_worker anyway, tweak it to always schema-qualify the target table name; this seems like useful backup to the protections installed by commit 5f173040. In HEAD/9.5, get rid of get_constraint_relation_oids, which is now unused. (I could alternatively have modified it to also return conislocal, but that seemed like a pretty single-purpose API, so let's not pretend it has some other use.) It's unused in the back branches as well, but I left it in place just in case some third-party code has decided to use it. In HEAD/9.5, also rename pg_get_constraintdef_string to pg_get_constraintdef_command, as the previous name did nothing to explain what that entry point did differently from others (and its comment was equally useless). Again, that change doesn't seem like material for back-patching. I did a bit of re-pgindenting in tablecmds.c in HEAD/9.5, as well. Otherwise, back-patch to all supported branches.
2015-11-20 20:55:28 +01:00
if (fullCommand)
{
if (OidIsValid(conForm->conrelid))
{
/*
* Currently, callers want ALTER TABLE (without ONLY) for CHECK
* constraints, and other types of constraints don't inherit
* anyway so it doesn't matter whether we say ONLY or not. Someday
* we might need to let callers specify whether to put ONLY in the
* command.
*/
appendStringInfo(&buf, "ALTER TABLE %s ADD CONSTRAINT %s ",
generate_qualified_relation_name(conForm->conrelid),
quote_identifier(NameStr(conForm->conname)));
}
else
{
/* Must be a domain constraint */
Assert(OidIsValid(conForm->contypid));
appendStringInfo(&buf, "ALTER DOMAIN %s ADD CONSTRAINT %s ",
generate_qualified_type_name(conForm->contypid),
quote_identifier(NameStr(conForm->conname)));
}
}
switch (conForm->contype)
{
case CONSTRAINT_FOREIGN:
{
Datum val;
bool isnull;
const char *string;
/* Start off the constraint definition */
appendStringInfoString(&buf, "FOREIGN KEY (");
/* Fetch and build referencing-column list */
val = SysCacheGetAttrNotNull(CONSTROID, tup,
Anum_pg_constraint_conkey);
/* If it is a temporal foreign key then it uses PERIOD. */
decompile_column_index_array(val, conForm->conrelid, conForm->conperiod, &buf);
/* add foreign relation name */
appendStringInfo(&buf, ") REFERENCES %s(",
generate_relation_name(conForm->confrelid,
NIL));
/* Fetch and build referenced-column list */
val = SysCacheGetAttrNotNull(CONSTROID, tup,
Anum_pg_constraint_confkey);
decompile_column_index_array(val, conForm->confrelid, conForm->conperiod, &buf);
appendStringInfoChar(&buf, ')');
/* Add match type */
switch (conForm->confmatchtype)
{
case FKCONSTR_MATCH_FULL:
string = " MATCH FULL";
break;
case FKCONSTR_MATCH_PARTIAL:
string = " MATCH PARTIAL";
break;
case FKCONSTR_MATCH_SIMPLE:
string = "";
break;
default:
elog(ERROR, "unrecognized confmatchtype: %d",
conForm->confmatchtype);
string = ""; /* keep compiler quiet */
break;
}
appendStringInfoString(&buf, string);
/* Add ON UPDATE and ON DELETE clauses, if needed */
switch (conForm->confupdtype)
{
case FKCONSTR_ACTION_NOACTION:
string = NULL; /* suppress default */
break;
case FKCONSTR_ACTION_RESTRICT:
string = "RESTRICT";
break;
case FKCONSTR_ACTION_CASCADE:
string = "CASCADE";
break;
case FKCONSTR_ACTION_SETNULL:
string = "SET NULL";
break;
case FKCONSTR_ACTION_SETDEFAULT:
string = "SET DEFAULT";
break;
default:
elog(ERROR, "unrecognized confupdtype: %d",
conForm->confupdtype);
string = NULL; /* keep compiler quiet */
break;
}
if (string)
appendStringInfo(&buf, " ON UPDATE %s", string);
switch (conForm->confdeltype)
{
case FKCONSTR_ACTION_NOACTION:
string = NULL; /* suppress default */
break;
case FKCONSTR_ACTION_RESTRICT:
string = "RESTRICT";
break;
case FKCONSTR_ACTION_CASCADE:
string = "CASCADE";
break;
case FKCONSTR_ACTION_SETNULL:
string = "SET NULL";
break;
case FKCONSTR_ACTION_SETDEFAULT:
string = "SET DEFAULT";
break;
default:
elog(ERROR, "unrecognized confdeltype: %d",
conForm->confdeltype);
string = NULL; /* keep compiler quiet */
break;
}
if (string)
appendStringInfo(&buf, " ON DELETE %s", string);
/*
* Add columns specified to SET NULL or SET DEFAULT if
* provided.
*/
val = SysCacheGetAttr(CONSTROID, tup,
Anum_pg_constraint_confdelsetcols, &isnull);
if (!isnull)
{
appendStringInfoString(&buf, " (");
decompile_column_index_array(val, conForm->conrelid, false, &buf);
appendStringInfoChar(&buf, ')');
}
break;
}
case CONSTRAINT_PRIMARY:
case CONSTRAINT_UNIQUE:
{
Datum val;
Oid indexId;
int keyatts;
HeapTuple indtup;
/* Start off the constraint definition */
if (conForm->contype == CONSTRAINT_PRIMARY)
appendStringInfoString(&buf, "PRIMARY KEY ");
else
appendStringInfoString(&buf, "UNIQUE ");
indexId = conForm->conindid;
indtup = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(indexId));
if (!HeapTupleIsValid(indtup))
elog(ERROR, "cache lookup failed for index %u", indexId);
if (conForm->contype == CONSTRAINT_UNIQUE &&
((Form_pg_index) GETSTRUCT(indtup))->indnullsnotdistinct)
appendStringInfoString(&buf, "NULLS NOT DISTINCT ");
appendStringInfoChar(&buf, '(');
/* Fetch and build target column list */
val = SysCacheGetAttrNotNull(CONSTROID, tup,
Anum_pg_constraint_conkey);
keyatts = decompile_column_index_array(val, conForm->conrelid, false, &buf);
if (conForm->conperiod)
appendStringInfoString(&buf, " WITHOUT OVERLAPS");
appendStringInfoChar(&buf, ')');
/* Build including column list (from pg_index.indkeys) */
val = SysCacheGetAttrNotNull(INDEXRELID, indtup,
Anum_pg_index_indnatts);
if (DatumGetInt32(val) > keyatts)
{
Datum cols;
Datum *keys;
int nKeys;
int j;
appendStringInfoString(&buf, " INCLUDE (");
cols = SysCacheGetAttrNotNull(INDEXRELID, indtup,
Anum_pg_index_indkey);
deconstruct_array_builtin(DatumGetArrayTypeP(cols), INT2OID,
&keys, NULL, &nKeys);
for (j = keyatts; j < nKeys; j++)
{
char *colName;
colName = get_attname(conForm->conrelid,
DatumGetInt16(keys[j]), false);
if (j > keyatts)
appendStringInfoString(&buf, ", ");
appendStringInfoString(&buf, quote_identifier(colName));
}
appendStringInfoChar(&buf, ')');
}
ReleaseSysCache(indtup);
/* XXX why do we only print these bits if fullCommand? */
if (fullCommand && OidIsValid(indexId))
{
char *options = flatten_reloptions(indexId);
Oid tblspc;
if (options)
{
appendStringInfo(&buf, " WITH (%s)", options);
pfree(options);
}
/*
* Print the tablespace, unless it's the database default.
* This is to help ALTER TABLE usage of this facility,
* which needs this behavior to recreate exact catalog
* state.
*/
tblspc = get_rel_tablespace(indexId);
if (OidIsValid(tblspc))
appendStringInfo(&buf, " USING INDEX TABLESPACE %s",
quote_identifier(get_tablespace_name(tblspc)));
}
break;
}
case CONSTRAINT_CHECK:
{
Datum val;
char *conbin;
char *consrc;
Node *expr;
List *context;
/* Fetch constraint expression in parsetree form */
val = SysCacheGetAttrNotNull(CONSTROID, tup,
Anum_pg_constraint_conbin);
conbin = TextDatumGetCString(val);
expr = stringToNode(conbin);
/* Set up deparsing context for Var nodes in constraint */
if (conForm->conrelid != InvalidOid)
{
/* relation constraint */
context = deparse_context_for(get_relation_name(conForm->conrelid),
conForm->conrelid);
}
else
{
/* domain constraint --- can't have Vars */
context = NIL;
}
consrc = deparse_expression_pretty(expr, context, false, false,
prettyFlags, 0);
/*
* Now emit the constraint definition, adding NO INHERIT if
* necessary.
*
* There are cases where the constraint expression will be
* fully parenthesized and we don't need the outer parens ...
* but there are other cases where we do need 'em. Be
* conservative for now.
*
* Note that simply checking for leading '(' and trailing ')'
* would NOT be good enough, consider "(x > 0) AND (y > 0)".
*/
appendStringInfo(&buf, "CHECK (%s)%s",
consrc,
conForm->connoinherit ? " NO INHERIT" : "");
break;
}
Catalog not-null constraints We now create contype='n' pg_constraint rows for not-null constraints. We propagate these constraints to other tables during operations such as adding inheritance relationships, creating and attaching partitions and creating tables LIKE other tables. We also spawn not-null constraints for inheritance child tables when their parents have primary keys. These related constraints mostly follow the well-known rules of conislocal and coninhcount that we have for CHECK constraints, with some adaptations: for example, as opposed to CHECK constraints, we don't match not-null ones by name when descending a hierarchy to alter it, instead matching by column name that they apply to. This means we don't require the constraint names to be identical across a hierarchy. For now, we omit them for system catalogs. Maybe this is worth reconsidering. We don't support NOT VALID nor DEFERRABLE clauses either; these can be added as separate features later (this patch is already large and complicated enough.) psql shows these constraints in \d+. pg_dump requires some ad-hoc hacks, particularly when dumping a primary key. We now create one "throwaway" not-null constraint for each column in the PK together with the CREATE TABLE command, and once the PK is created, all those throwaway constraints are removed. This avoids having to check each tuple for nullness when the dump restores the primary key creation. pg_upgrading from an older release requires a somewhat brittle procedure to create a constraint state that matches what would be created if the database were being created fresh in Postgres 17. I have tested all the scenarios I could think of, and it works correctly as far as I can tell, but I could have neglected weird cases. This patch has been very long in the making. The first patch was written by Bernd Helmle in 2010 to add a new pg_constraint.contype value ('n'), which I (Álvaro) then hijacked in 2011 and 2012, until that one was killed by the realization that we ought to use contype='c' instead: manufactured CHECK constraints. However, later SQL standard development, as well as nonobvious emergent properties of that design (mostly, failure to distinguish them from "normal" CHECK constraints as well as the performance implication of having to test the CHECK expression) led us to reconsider this choice, so now the current implementation uses contype='n' again. During Postgres 16 this had already been introduced by commit e056c557aef4, but there were some problems mainly with the pg_upgrade procedure that couldn't be fixed in reasonable time, so it was reverted. In 2016 Vitaly Burovoy also worked on this feature[1] but found no consensus for his proposed approach, which was claimed to be closer to the letter of the standard, requiring an additional pg_attribute column to track the OID of the not-null constraint for that column. [1] https://postgr.es/m/CAKOSWNkN6HSyatuys8xZxzRCR-KL1OkHS5-b9qd9bf1Rad3PLA@mail.gmail.com Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Author: Bernd Helmle <mailings@oopsware.de> Reviewed-by: Justin Pryzby <pryzby@telsasoft.com> Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com> Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
2023-08-25 13:31:24 +02:00
case CONSTRAINT_NOTNULL:
{
if (conForm->conrelid)
{
AttrNumber attnum;
Catalog not-null constraints We now create contype='n' pg_constraint rows for not-null constraints. We propagate these constraints to other tables during operations such as adding inheritance relationships, creating and attaching partitions and creating tables LIKE other tables. We also spawn not-null constraints for inheritance child tables when their parents have primary keys. These related constraints mostly follow the well-known rules of conislocal and coninhcount that we have for CHECK constraints, with some adaptations: for example, as opposed to CHECK constraints, we don't match not-null ones by name when descending a hierarchy to alter it, instead matching by column name that they apply to. This means we don't require the constraint names to be identical across a hierarchy. For now, we omit them for system catalogs. Maybe this is worth reconsidering. We don't support NOT VALID nor DEFERRABLE clauses either; these can be added as separate features later (this patch is already large and complicated enough.) psql shows these constraints in \d+. pg_dump requires some ad-hoc hacks, particularly when dumping a primary key. We now create one "throwaway" not-null constraint for each column in the PK together with the CREATE TABLE command, and once the PK is created, all those throwaway constraints are removed. This avoids having to check each tuple for nullness when the dump restores the primary key creation. pg_upgrading from an older release requires a somewhat brittle procedure to create a constraint state that matches what would be created if the database were being created fresh in Postgres 17. I have tested all the scenarios I could think of, and it works correctly as far as I can tell, but I could have neglected weird cases. This patch has been very long in the making. The first patch was written by Bernd Helmle in 2010 to add a new pg_constraint.contype value ('n'), which I (Álvaro) then hijacked in 2011 and 2012, until that one was killed by the realization that we ought to use contype='c' instead: manufactured CHECK constraints. However, later SQL standard development, as well as nonobvious emergent properties of that design (mostly, failure to distinguish them from "normal" CHECK constraints as well as the performance implication of having to test the CHECK expression) led us to reconsider this choice, so now the current implementation uses contype='n' again. During Postgres 16 this had already been introduced by commit e056c557aef4, but there were some problems mainly with the pg_upgrade procedure that couldn't be fixed in reasonable time, so it was reverted. In 2016 Vitaly Burovoy also worked on this feature[1] but found no consensus for his proposed approach, which was claimed to be closer to the letter of the standard, requiring an additional pg_attribute column to track the OID of the not-null constraint for that column. [1] https://postgr.es/m/CAKOSWNkN6HSyatuys8xZxzRCR-KL1OkHS5-b9qd9bf1Rad3PLA@mail.gmail.com Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Author: Bernd Helmle <mailings@oopsware.de> Reviewed-by: Justin Pryzby <pryzby@telsasoft.com> Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com> Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
2023-08-25 13:31:24 +02:00
attnum = extractNotNullColumn(tup);
Catalog not-null constraints We now create contype='n' pg_constraint rows for not-null constraints. We propagate these constraints to other tables during operations such as adding inheritance relationships, creating and attaching partitions and creating tables LIKE other tables. We also spawn not-null constraints for inheritance child tables when their parents have primary keys. These related constraints mostly follow the well-known rules of conislocal and coninhcount that we have for CHECK constraints, with some adaptations: for example, as opposed to CHECK constraints, we don't match not-null ones by name when descending a hierarchy to alter it, instead matching by column name that they apply to. This means we don't require the constraint names to be identical across a hierarchy. For now, we omit them for system catalogs. Maybe this is worth reconsidering. We don't support NOT VALID nor DEFERRABLE clauses either; these can be added as separate features later (this patch is already large and complicated enough.) psql shows these constraints in \d+. pg_dump requires some ad-hoc hacks, particularly when dumping a primary key. We now create one "throwaway" not-null constraint for each column in the PK together with the CREATE TABLE command, and once the PK is created, all those throwaway constraints are removed. This avoids having to check each tuple for nullness when the dump restores the primary key creation. pg_upgrading from an older release requires a somewhat brittle procedure to create a constraint state that matches what would be created if the database were being created fresh in Postgres 17. I have tested all the scenarios I could think of, and it works correctly as far as I can tell, but I could have neglected weird cases. This patch has been very long in the making. The first patch was written by Bernd Helmle in 2010 to add a new pg_constraint.contype value ('n'), which I (Álvaro) then hijacked in 2011 and 2012, until that one was killed by the realization that we ought to use contype='c' instead: manufactured CHECK constraints. However, later SQL standard development, as well as nonobvious emergent properties of that design (mostly, failure to distinguish them from "normal" CHECK constraints as well as the performance implication of having to test the CHECK expression) led us to reconsider this choice, so now the current implementation uses contype='n' again. During Postgres 16 this had already been introduced by commit e056c557aef4, but there were some problems mainly with the pg_upgrade procedure that couldn't be fixed in reasonable time, so it was reverted. In 2016 Vitaly Burovoy also worked on this feature[1] but found no consensus for his proposed approach, which was claimed to be closer to the letter of the standard, requiring an additional pg_attribute column to track the OID of the not-null constraint for that column. [1] https://postgr.es/m/CAKOSWNkN6HSyatuys8xZxzRCR-KL1OkHS5-b9qd9bf1Rad3PLA@mail.gmail.com Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Author: Bernd Helmle <mailings@oopsware.de> Reviewed-by: Justin Pryzby <pryzby@telsasoft.com> Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com> Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
2023-08-25 13:31:24 +02:00
appendStringInfo(&buf, "NOT NULL %s",
quote_identifier(get_attname(conForm->conrelid,
attnum, false)));
if (((Form_pg_constraint) GETSTRUCT(tup))->connoinherit)
appendStringInfoString(&buf, " NO INHERIT");
}
else if (conForm->contypid)
{
/* conkey is null for domain not-null constraints */
appendStringInfoString(&buf, "NOT NULL VALUE");
}
Catalog not-null constraints We now create contype='n' pg_constraint rows for not-null constraints. We propagate these constraints to other tables during operations such as adding inheritance relationships, creating and attaching partitions and creating tables LIKE other tables. We also spawn not-null constraints for inheritance child tables when their parents have primary keys. These related constraints mostly follow the well-known rules of conislocal and coninhcount that we have for CHECK constraints, with some adaptations: for example, as opposed to CHECK constraints, we don't match not-null ones by name when descending a hierarchy to alter it, instead matching by column name that they apply to. This means we don't require the constraint names to be identical across a hierarchy. For now, we omit them for system catalogs. Maybe this is worth reconsidering. We don't support NOT VALID nor DEFERRABLE clauses either; these can be added as separate features later (this patch is already large and complicated enough.) psql shows these constraints in \d+. pg_dump requires some ad-hoc hacks, particularly when dumping a primary key. We now create one "throwaway" not-null constraint for each column in the PK together with the CREATE TABLE command, and once the PK is created, all those throwaway constraints are removed. This avoids having to check each tuple for nullness when the dump restores the primary key creation. pg_upgrading from an older release requires a somewhat brittle procedure to create a constraint state that matches what would be created if the database were being created fresh in Postgres 17. I have tested all the scenarios I could think of, and it works correctly as far as I can tell, but I could have neglected weird cases. This patch has been very long in the making. The first patch was written by Bernd Helmle in 2010 to add a new pg_constraint.contype value ('n'), which I (Álvaro) then hijacked in 2011 and 2012, until that one was killed by the realization that we ought to use contype='c' instead: manufactured CHECK constraints. However, later SQL standard development, as well as nonobvious emergent properties of that design (mostly, failure to distinguish them from "normal" CHECK constraints as well as the performance implication of having to test the CHECK expression) led us to reconsider this choice, so now the current implementation uses contype='n' again. During Postgres 16 this had already been introduced by commit e056c557aef4, but there were some problems mainly with the pg_upgrade procedure that couldn't be fixed in reasonable time, so it was reverted. In 2016 Vitaly Burovoy also worked on this feature[1] but found no consensus for his proposed approach, which was claimed to be closer to the letter of the standard, requiring an additional pg_attribute column to track the OID of the not-null constraint for that column. [1] https://postgr.es/m/CAKOSWNkN6HSyatuys8xZxzRCR-KL1OkHS5-b9qd9bf1Rad3PLA@mail.gmail.com Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Author: Bernd Helmle <mailings@oopsware.de> Reviewed-by: Justin Pryzby <pryzby@telsasoft.com> Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com> Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
2023-08-25 13:31:24 +02:00
break;
}
case CONSTRAINT_TRIGGER:
2010-02-26 03:01:40 +01:00
/*
* There isn't an ALTER TABLE syntax for creating a user-defined
* constraint trigger, but it seems better to print something than
* throw an error; if we throw error then this function couldn't
* safely be applied to all rows of pg_constraint.
*/
appendStringInfoString(&buf, "TRIGGER");
break;
case CONSTRAINT_EXCLUSION:
{
Oid indexOid = conForm->conindid;
Datum val;
Datum *elems;
int nElems;
int i;
Oid *operators;
/* Extract operator OIDs from the pg_constraint tuple */
val = SysCacheGetAttrNotNull(CONSTROID, tup,
Anum_pg_constraint_conexclop);
deconstruct_array_builtin(DatumGetArrayTypeP(val), OIDOID,
&elems, NULL, &nElems);
operators = (Oid *) palloc(nElems * sizeof(Oid));
for (i = 0; i < nElems; i++)
operators[i] = DatumGetObjectId(elems[i]);
/* pg_get_indexdef_worker does the rest */
/* suppress tablespace because pg_dump wants it that way */
appendStringInfoString(&buf,
pg_get_indexdef_worker(indexOid,
0,
operators,
false,
false,
false,
false,
prettyFlags,
false));
break;
}
default:
elog(ERROR, "invalid constraint type \"%c\"", conForm->contype);
break;
}
if (conForm->condeferrable)
appendStringInfoString(&buf, " DEFERRABLE");
if (conForm->condeferred)
appendStringInfoString(&buf, " INITIALLY DEFERRED");
if (!conForm->convalidated)
appendStringInfoString(&buf, " NOT VALID");
/* Cleanup */
systable_endscan(scandesc);
table_close(relation, AccessShareLock);
return buf.data;
}
/*
* Convert an int16[] Datum into a comma-separated list of column names
* for the indicated relation; append the list to buf. Returns the number
* of keys.
*/
static int
decompile_column_index_array(Datum column_index_array, Oid relId,
bool withPeriod, StringInfo buf)
{
Datum *keys;
int nKeys;
int j;
/* Extract data from array of int16 */
deconstruct_array_builtin(DatumGetArrayTypeP(column_index_array), INT2OID,
&keys, NULL, &nKeys);
for (j = 0; j < nKeys; j++)
{
char *colName;
colName = get_attname(relId, DatumGetInt16(keys[j]), false);
if (j == 0)
appendStringInfoString(buf, quote_identifier(colName));
else
appendStringInfo(buf, ", %s%s",
(withPeriod && j == nKeys - 1) ? "PERIOD " : "",
quote_identifier(colName));
}
return nKeys;
}
/* ----------
* pg_get_expr - Decompile an expression tree
*
* Input: an expression tree in nodeToString form, and a relation OID
*
* Output: reverse-listed expression
*
* Currently, the expression can only refer to a single relation, namely
* the one specified by the second parameter. This is sufficient for
* partial indexes, column default expressions, etc. We also support
* Var-free expressions, for which the OID can be InvalidOid.
*
* If the OID is nonzero but not actually valid, don't throw an error,
* just return NULL. This is a bit questionable, but it's what we've
* done historically, and it can help avoid unwanted failures when
* examining catalog entries for just-deleted relations.
*
* We expect this function to work, or throw a reasonably clean error,
* for any node tree that can appear in a catalog pg_node_tree column.
* Query trees, such as those appearing in pg_rewrite.ev_action, are
* not supported. Nor are expressions in more than one relation, which
* can appear in places like pg_rewrite.ev_qual.
* ----------
*/
Datum
pg_get_expr(PG_FUNCTION_ARGS)
{
text *expr = PG_GETARG_TEXT_PP(0);
Oid relid = PG_GETARG_OID(1);
text *result;
int prettyFlags;
prettyFlags = PRETTYFLAG_INDENT;
result = pg_get_expr_worker(expr, relid, prettyFlags);
if (result)
PG_RETURN_TEXT_P(result);
else
PG_RETURN_NULL();
}
Datum
pg_get_expr_ext(PG_FUNCTION_ARGS)
{
text *expr = PG_GETARG_TEXT_PP(0);
Oid relid = PG_GETARG_OID(1);
bool pretty = PG_GETARG_BOOL(2);
text *result;
int prettyFlags;
prettyFlags = GET_PRETTY_FLAGS(pretty);
result = pg_get_expr_worker(expr, relid, prettyFlags);
if (result)
PG_RETURN_TEXT_P(result);
else
PG_RETURN_NULL();
}
static text *
pg_get_expr_worker(text *expr, Oid relid, int prettyFlags)
{
Node *node;
Node *tst;
Relids relids;
List *context;
char *exprstr;
Relation rel = NULL;
char *str;
/* Convert input pg_node_tree (really TEXT) object to C string */
exprstr = text_to_cstring(expr);
/* Convert expression to node tree */
node = (Node *) stringToNode(exprstr);
pfree(exprstr);
/*
* Throw error if the input is a querytree rather than an expression tree.
* While we could support queries here, there seems no very good reason
* to. In most such catalog columns, we'll see a List of Query nodes, or
* even nested Lists, so drill down to a non-List node before checking.
*/
tst = node;
while (tst && IsA(tst, List))
tst = linitial((List *) tst);
if (tst && IsA(tst, Query))
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("input is a query, not an expression")));
/*
* Throw error if the expression contains Vars we won't be able to
* deparse.
*/
relids = pull_varnos(NULL, node);
if (OidIsValid(relid))
{
if (!bms_is_subset(relids, bms_make_singleton(1)))
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("expression contains variables of more than one relation")));
}
else
{
if (!bms_is_empty(relids))
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("expression contains variables")));
}
/*
* Prepare deparse context if needed. If we are deparsing with a relid,
* we need to transiently open and lock the rel, to make sure it won't go
* away underneath us. (set_relation_column_names would lock it anyway,
* so this isn't really introducing any new behavior.)
*/
if (OidIsValid(relid))
{
rel = try_relation_open(relid, AccessShareLock);
if (rel == NULL)
return NULL;
context = deparse_context_for(RelationGetRelationName(rel), relid);
}
else
context = NIL;
/* Deparse */
str = deparse_expression_pretty(node, context, false, false,
prettyFlags, 0);
if (rel != NULL)
relation_close(rel, AccessShareLock);
return string_to_text(str);
}
/* ----------
* pg_get_userbyid - Get a user name by roleid and
* fallback to 'unknown (OID=n)'
* ----------
*/
Datum
pg_get_userbyid(PG_FUNCTION_ARGS)
{
Oid roleid = PG_GETARG_OID(0);
Name result;
HeapTuple roletup;
Form_pg_authid role_rec;
/*
* Allocate space for the result
*/
result = (Name) palloc(NAMEDATALEN);
memset(NameStr(*result), 0, NAMEDATALEN);
/*
* Get the pg_authid entry and print the result
*/
roletup = SearchSysCache1(AUTHOID, ObjectIdGetDatum(roleid));
if (HeapTupleIsValid(roletup))
{
role_rec = (Form_pg_authid) GETSTRUCT(roletup);
*result = role_rec->rolname;
ReleaseSysCache(roletup);
}
else
sprintf(NameStr(*result), "unknown (OID=%u)", roleid);
PG_RETURN_NAME(result);
}
/*
* pg_get_serial_sequence
* Get the name of the sequence used by an identity or serial column,
* formatted suitably for passing to setval, nextval or currval.
* First parameter is not treated as double-quoted, second parameter
* is --- see documentation for reason.
*/
Datum
pg_get_serial_sequence(PG_FUNCTION_ARGS)
{
text *tablename = PG_GETARG_TEXT_PP(0);
text *columnname = PG_GETARG_TEXT_PP(1);
RangeVar *tablerv;
Oid tableOid;
char *column;
AttrNumber attnum;
Oid sequenceId = InvalidOid;
Relation depRel;
ScanKeyData key[3];
SysScanDesc scan;
HeapTuple tup;
/* Look up table name. Can't lock it - we might not have privileges. */
tablerv = makeRangeVarFromNameList(textToQualifiedNameList(tablename));
Improve table locking behavior in the face of current DDL. In the previous coding, callers were faced with an awkward choice: look up the name, do permissions checks, and then lock the table; or look up the name, lock the table, and then do permissions checks. The first choice was wrong because the results of the name lookup and permissions checks might be out-of-date by the time the table lock was acquired, while the second allowed a user with no privileges to interfere with access to a table by users who do have privileges (e.g. if a malicious backend queues up for an AccessExclusiveLock on a table on which AccessShareLock is already held, further attempts to access the table will be blocked until the AccessExclusiveLock is obtained and the malicious backend's transaction rolls back). To fix, allow callers of RangeVarGetRelid() to pass a callback which gets executed after performing the name lookup but before acquiring the relation lock. If the name lookup is retried (because invalidation messages are received), the callback will be re-executed as well, so we get the best of both worlds. RangeVarGetRelid() is renamed to RangeVarGetRelidExtended(); callers not wishing to supply a callback can continue to invoke it as RangeVarGetRelid(), which is now a macro. Since the only one caller that uses nowait = true now passes a callback anyway, the RangeVarGetRelid() macro defaults nowait as well. The callback can also be used for supplemental locking - for example, REINDEX INDEX needs to acquire the table lock before the index lock to reduce deadlock possibilities. There's a lot more work to be done here to fix all the cases where this can be a problem, but this commit provides the general infrastructure and fixes the following specific cases: REINDEX INDEX, REINDEX TABLE, LOCK TABLE, and and DROP TABLE/INDEX/SEQUENCE/VIEW/FOREIGN TABLE. Per discussion with Noah Misch and Alvaro Herrera.
2011-11-30 16:12:27 +01:00
tableOid = RangeVarGetRelid(tablerv, NoLock, false);
/* Get the number of the column */
column = text_to_cstring(columnname);
attnum = get_attnum(tableOid, column);
if (attnum == InvalidAttrNumber)
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_COLUMN),
errmsg("column \"%s\" of relation \"%s\" does not exist",
column, tablerv->relname)));
/* Search the dependency table for the dependent sequence */
depRel = table_open(DependRelationId, AccessShareLock);
ScanKeyInit(&key[0],
Anum_pg_depend_refclassid,
BTEqualStrategyNumber, F_OIDEQ,
ObjectIdGetDatum(RelationRelationId));
ScanKeyInit(&key[1],
Anum_pg_depend_refobjid,
BTEqualStrategyNumber, F_OIDEQ,
ObjectIdGetDatum(tableOid));
ScanKeyInit(&key[2],
Anum_pg_depend_refobjsubid,
BTEqualStrategyNumber, F_INT4EQ,
Int32GetDatum(attnum));
scan = systable_beginscan(depRel, DependReferenceIndexId, true,
NULL, 3, key);
while (HeapTupleIsValid(tup = systable_getnext(scan)))
{
Form_pg_depend deprec = (Form_pg_depend) GETSTRUCT(tup);
/*
* Look for an auto dependency (serial column) or internal dependency
* (identity column) of a sequence on a column. (We need the relkind
* test because indexes can also have auto dependencies on columns.)
*/
if (deprec->classid == RelationRelationId &&
deprec->objsubid == 0 &&
(deprec->deptype == DEPENDENCY_AUTO ||
deprec->deptype == DEPENDENCY_INTERNAL) &&
get_rel_relkind(deprec->objid) == RELKIND_SEQUENCE)
{
sequenceId = deprec->objid;
break;
}
}
systable_endscan(scan);
table_close(depRel, AccessShareLock);
if (OidIsValid(sequenceId))
{
char *result;
Fix handling of inherited check constraints in ALTER COLUMN TYPE (again). The previous way of reconstructing check constraints was to do a separate "ALTER TABLE ONLY tab ADD CONSTRAINT" for each table in an inheritance hierarchy. However, that way has no hope of reconstructing the check constraints' own inheritance properties correctly, as pointed out in bug #13779 from Jan Dirk Zijlstra. What we should do instead is to do a regular "ALTER TABLE", allowing recursion, at the topmost table that has a particular constraint, and then suppress the work queue entries for inherited instances of the constraint. Annoyingly, we'd tried to fix this behavior before, in commit 5ed6546cf, but we failed to notice that it wasn't reconstructing the pg_constraint field values correctly. As long as I'm touching pg_get_constraintdef_worker anyway, tweak it to always schema-qualify the target table name; this seems like useful backup to the protections installed by commit 5f173040. In HEAD/9.5, get rid of get_constraint_relation_oids, which is now unused. (I could alternatively have modified it to also return conislocal, but that seemed like a pretty single-purpose API, so let's not pretend it has some other use.) It's unused in the back branches as well, but I left it in place just in case some third-party code has decided to use it. In HEAD/9.5, also rename pg_get_constraintdef_string to pg_get_constraintdef_command, as the previous name did nothing to explain what that entry point did differently from others (and its comment was equally useless). Again, that change doesn't seem like material for back-patching. I did a bit of re-pgindenting in tablecmds.c in HEAD/9.5, as well. Otherwise, back-patch to all supported branches.
2015-11-20 20:55:28 +01:00
result = generate_qualified_relation_name(sequenceId);
PG_RETURN_TEXT_P(string_to_text(result));
}
PG_RETURN_NULL();
}
/*
* pg_get_functiondef
* Returns the complete "CREATE OR REPLACE FUNCTION ..." statement for
* the specified function.
*
* Note: if you change the output format of this function, be careful not
* to break psql's rules (in \ef and \sf) for identifying the start of the
* function body. To wit: the function body starts on a line that begins with
* "AS ", "BEGIN ", or "RETURN ", and no preceding line will look like that.
*/
Datum
pg_get_functiondef(PG_FUNCTION_ARGS)
{
Oid funcid = PG_GETARG_OID(0);
StringInfoData buf;
StringInfoData dq;
HeapTuple proctup;
Form_pg_proc proc;
bool isfunction;
Datum tmp;
bool isnull;
const char *prosrc;
const char *name;
const char *nsp;
float4 procost;
int oldlen;
initStringInfo(&buf);
/* Look up the function */
proctup = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcid));
if (!HeapTupleIsValid(proctup))
PG_RETURN_NULL();
proc = (Form_pg_proc) GETSTRUCT(proctup);
name = NameStr(proc->proname);
if (proc->prokind == PROKIND_AGGREGATE)
ereport(ERROR,
(errcode(ERRCODE_WRONG_OBJECT_TYPE),
errmsg("\"%s\" is an aggregate function", name)));
isfunction = (proc->prokind != PROKIND_PROCEDURE);
/*
* We always qualify the function name, to ensure the right function gets
* replaced.
*/
nsp = get_namespace_name_or_temp(proc->pronamespace);
appendStringInfo(&buf, "CREATE OR REPLACE %s %s(",
isfunction ? "FUNCTION" : "PROCEDURE",
quote_qualified_identifier(nsp, name));
(void) print_function_arguments(&buf, proctup, false, true);
appendStringInfoString(&buf, ")\n");
if (isfunction)
{
appendStringInfoString(&buf, " RETURNS ");
print_function_rettype(&buf, proctup);
appendStringInfoChar(&buf, '\n');
}
print_function_trftypes(&buf, proctup);
appendStringInfo(&buf, " LANGUAGE %s\n",
quote_identifier(get_language_name(proc->prolang, false)));
/* Emit some miscellaneous options on one line */
oldlen = buf.len;
if (proc->prokind == PROKIND_WINDOW)
appendStringInfoString(&buf, " WINDOW");
switch (proc->provolatile)
{
case PROVOLATILE_IMMUTABLE:
appendStringInfoString(&buf, " IMMUTABLE");
break;
case PROVOLATILE_STABLE:
appendStringInfoString(&buf, " STABLE");
break;
case PROVOLATILE_VOLATILE:
break;
}
switch (proc->proparallel)
{
case PROPARALLEL_SAFE:
appendStringInfoString(&buf, " PARALLEL SAFE");
break;
case PROPARALLEL_RESTRICTED:
appendStringInfoString(&buf, " PARALLEL RESTRICTED");
break;
case PROPARALLEL_UNSAFE:
break;
}
if (proc->proisstrict)
appendStringInfoString(&buf, " STRICT");
if (proc->prosecdef)
appendStringInfoString(&buf, " SECURITY DEFINER");
if (proc->proleakproof)
appendStringInfoString(&buf, " LEAKPROOF");
/* This code for the default cost and rows should match functioncmds.c */
if (proc->prolang == INTERNALlanguageId ||
proc->prolang == ClanguageId)
procost = 1;
else
procost = 100;
if (proc->procost != procost)
appendStringInfo(&buf, " COST %g", proc->procost);
if (proc->prorows > 0 && proc->prorows != 1000)
appendStringInfo(&buf, " ROWS %g", proc->prorows);
if (proc->prosupport)
{
Oid argtypes[1];
/*
* We should qualify the support function's name if it wouldn't be
* resolved by lookup in the current search path.
*/
argtypes[0] = INTERNALOID;
appendStringInfo(&buf, " SUPPORT %s",
generate_function_name(proc->prosupport, 1,
NIL, argtypes,
false, NULL, EXPR_KIND_NONE));
}
if (oldlen != buf.len)
appendStringInfoChar(&buf, '\n');
/* Emit any proconfig options, one per line */
tmp = SysCacheGetAttr(PROCOID, proctup, Anum_pg_proc_proconfig, &isnull);
if (!isnull)
{
ArrayType *a = DatumGetArrayTypeP(tmp);
int i;
Assert(ARR_ELEMTYPE(a) == TEXTOID);
Assert(ARR_NDIM(a) == 1);
Assert(ARR_LBOUND(a)[0] == 1);
for (i = 1; i <= ARR_DIMS(a)[0]; i++)
{
Datum d;
d = array_ref(a, 1, &i,
-1 /* varlenarray */ ,
-1 /* TEXT's typlen */ ,
false /* TEXT's typbyval */ ,
TYPALIGN_INT /* TEXT's typalign */ ,
&isnull);
if (!isnull)
{
char *configitem = TextDatumGetCString(d);
char *pos;
pos = strchr(configitem, '=');
if (pos == NULL)
continue;
*pos++ = '\0';
appendStringInfo(&buf, " SET %s TO ",
quote_identifier(configitem));
/*
Fix mishandling of quoted-list GUC values in pg_dump and ruleutils.c. Code that prints out the contents of setconfig or proconfig arrays in SQL format needs to handle GUC_LIST_QUOTE variables differently from other ones, because for those variables, flatten_set_variable_args() already applied a layer of quoting. The value can therefore safely be printed as-is, and indeed must be, or flatten_set_variable_args() will muck it up completely on reload. For all other GUC variables, it's necessary and sufficient to quote the value as a SQL literal. We'd recognized the need for this long ago, but mis-analyzed the need slightly, thinking that all GUC_LIST_INPUT variables needed the special treatment. That's actually wrong, since a valid value of a LIST variable might include characters that need quoting, although no existing variables accept such values. More to the point, we hadn't made any particular effort to keep the various places that deal with this up-to-date with the set of variables that actually need special treatment, meaning that we'd do the wrong thing with, for example, temp_tablespaces values. This affects dumping of SET clauses attached to functions, as well as ALTER DATABASE/ROLE SET commands. In ruleutils.c we can fix it reasonably honestly by exporting a guc.c function that allows discovering the flags for a given GUC variable. But pg_dump doesn't have easy access to that, so continue the old method of having a hard-wired list of affected variable names. At least we can fix it to have just one list not two, and update the list to match current reality. A remaining problem with this is that it only works for built-in GUC variables. pg_dump's list obvious knows nothing of third-party extensions, and even the "ask guc.c" method isn't bulletproof since the relevant extension might not be loaded. There's no obvious solution to that, so for now, we'll just have to discourage extension authors from inventing custom GUCs that need GUC_LIST_QUOTE. This has been busted for a long time, so back-patch to all supported branches. Michael Paquier and Tom Lane, reviewed by Kyotaro Horiguchi and Pavel Stehule Discussion: https://postgr.es/m/20180111064900.GA51030@paquier.xyz
2018-03-22 01:03:28 +01:00
* Variables that are marked GUC_LIST_QUOTE were already fully
* quoted by flatten_set_variable_args() before they were put
Further fixes for quoted-list GUC values in pg_dump and ruleutils.c. Commits 742869946 et al turn out to be a couple bricks shy of a load. We were dumping the stored values of GUC_LIST_QUOTE variables as they appear in proconfig or setconfig catalog columns. However, although that quoting rule looks a lot like SQL-identifier double quotes, there are two critical differences: empty strings ("") are legal, and depending on which variable you're considering, values longer than NAMEDATALEN might be valid too. So the current technique fails altogether on empty-string list entries (as reported by Steven Winfield in bug #15248) and it also risks truncating file pathnames during dump/reload of GUC values that are lists of pathnames. To fix, split the stored value without any downcasing or truncation, and then emit each element as a SQL string literal. This is a tad annoying, because we now have three copies of the comma-separated-string splitting logic in varlena.c as well as a fourth one in dumputils.c. (Not to mention the randomly-different-from-those splitting logic in libpq...) I looked at unifying these, but it would be rather a mess unless we're willing to tweak the API definitions of SplitIdentifierString, SplitDirectoriesString, or both. That might be worth doing in future; but it seems pretty unsafe for a back-patched bug fix, so for now accept the duplication. Back-patch to all supported branches, as the previous fix was. Discussion: https://postgr.es/m/7585.1529435872@sss.pgh.pa.us
2018-07-31 19:00:07 +02:00
* into the proconfig array. However, because the quoting
* rules used there aren't exactly like SQL's, we have to
* break the list value apart and then quote the elements as
* string literals. (The elements may be double-quoted as-is,
* but we can't just feed them to the SQL parser; it would do
* the wrong thing with elements that are zero-length or
* longer than NAMEDATALEN.)
*
* Variables that are not so marked should just be emitted as
* simple string literals. If the variable is not known to
* guc.c, we'll do that; this makes it unsafe to use
* GUC_LIST_QUOTE for extension variables.
*/
Fix mishandling of quoted-list GUC values in pg_dump and ruleutils.c. Code that prints out the contents of setconfig or proconfig arrays in SQL format needs to handle GUC_LIST_QUOTE variables differently from other ones, because for those variables, flatten_set_variable_args() already applied a layer of quoting. The value can therefore safely be printed as-is, and indeed must be, or flatten_set_variable_args() will muck it up completely on reload. For all other GUC variables, it's necessary and sufficient to quote the value as a SQL literal. We'd recognized the need for this long ago, but mis-analyzed the need slightly, thinking that all GUC_LIST_INPUT variables needed the special treatment. That's actually wrong, since a valid value of a LIST variable might include characters that need quoting, although no existing variables accept such values. More to the point, we hadn't made any particular effort to keep the various places that deal with this up-to-date with the set of variables that actually need special treatment, meaning that we'd do the wrong thing with, for example, temp_tablespaces values. This affects dumping of SET clauses attached to functions, as well as ALTER DATABASE/ROLE SET commands. In ruleutils.c we can fix it reasonably honestly by exporting a guc.c function that allows discovering the flags for a given GUC variable. But pg_dump doesn't have easy access to that, so continue the old method of having a hard-wired list of affected variable names. At least we can fix it to have just one list not two, and update the list to match current reality. A remaining problem with this is that it only works for built-in GUC variables. pg_dump's list obvious knows nothing of third-party extensions, and even the "ask guc.c" method isn't bulletproof since the relevant extension might not be loaded. There's no obvious solution to that, so for now, we'll just have to discourage extension authors from inventing custom GUCs that need GUC_LIST_QUOTE. This has been busted for a long time, so back-patch to all supported branches. Michael Paquier and Tom Lane, reviewed by Kyotaro Horiguchi and Pavel Stehule Discussion: https://postgr.es/m/20180111064900.GA51030@paquier.xyz
2018-03-22 01:03:28 +01:00
if (GetConfigOptionFlags(configitem, true) & GUC_LIST_QUOTE)
Further fixes for quoted-list GUC values in pg_dump and ruleutils.c. Commits 742869946 et al turn out to be a couple bricks shy of a load. We were dumping the stored values of GUC_LIST_QUOTE variables as they appear in proconfig or setconfig catalog columns. However, although that quoting rule looks a lot like SQL-identifier double quotes, there are two critical differences: empty strings ("") are legal, and depending on which variable you're considering, values longer than NAMEDATALEN might be valid too. So the current technique fails altogether on empty-string list entries (as reported by Steven Winfield in bug #15248) and it also risks truncating file pathnames during dump/reload of GUC values that are lists of pathnames. To fix, split the stored value without any downcasing or truncation, and then emit each element as a SQL string literal. This is a tad annoying, because we now have three copies of the comma-separated-string splitting logic in varlena.c as well as a fourth one in dumputils.c. (Not to mention the randomly-different-from-those splitting logic in libpq...) I looked at unifying these, but it would be rather a mess unless we're willing to tweak the API definitions of SplitIdentifierString, SplitDirectoriesString, or both. That might be worth doing in future; but it seems pretty unsafe for a back-patched bug fix, so for now accept the duplication. Back-patch to all supported branches, as the previous fix was. Discussion: https://postgr.es/m/7585.1529435872@sss.pgh.pa.us
2018-07-31 19:00:07 +02:00
{
List *namelist;
ListCell *lc;
/* Parse string into list of identifiers */
if (!SplitGUCList(pos, ',', &namelist))
{
/* this shouldn't fail really */
elog(ERROR, "invalid list syntax in proconfig item");
}
foreach(lc, namelist)
{
char *curname = (char *) lfirst(lc);
simple_quote_literal(&buf, curname);
Represent Lists as expansible arrays, not chains of cons-cells. Originally, Postgres Lists were a more or less exact reimplementation of Lisp lists, which consist of chains of separately-allocated cons cells, each having a value and a next-cell link. We'd hacked that once before (commit d0b4399d8) to add a separate List header, but the data was still in cons cells. That makes some operations -- notably list_nth() -- O(N), and it's bulky because of the next-cell pointers and per-cell palloc overhead, and it's very cache-unfriendly if the cons cells end up scattered around rather than being adjacent. In this rewrite, we still have List headers, but the data is in a resizable array of values, with no next-cell links. Now we need at most two palloc's per List, and often only one, since we can allocate some values in the same palloc call as the List header. (Of course, extending an existing List may require repalloc's to enlarge the array. But this involves just O(log N) allocations not O(N).) Of course this is not without downsides. The key difficulty is that addition or deletion of a list entry may now cause other entries to move, which it did not before. For example, that breaks foreach() and sister macros, which historically used a pointer to the current cons-cell as loop state. We can repair those macros transparently by making their actual loop state be an integer list index; the exposed "ListCell *" pointer is no longer state carried across loop iterations, but is just a derived value. (In practice, modern compilers can optimize things back to having just one loop state value, at least for simple cases with inline loop bodies.) In principle, this is a semantics change for cases where the loop body inserts or deletes list entries ahead of the current loop index; but I found no such cases in the Postgres code. The change is not at all transparent for code that doesn't use foreach() but chases lists "by hand" using lnext(). The largest share of such code in the backend is in loops that were maintaining "prev" and "next" variables in addition to the current-cell pointer, in order to delete list cells efficiently using list_delete_cell(). However, we no longer need a previous-cell pointer to delete a list cell efficiently. Keeping a next-cell pointer doesn't work, as explained above, but we can improve matters by changing such code to use a regular foreach() loop and then using the new macro foreach_delete_current() to delete the current cell. (This macro knows how to update the associated foreach loop's state so that no cells will be missed in the traversal.) There remains a nontrivial risk of code assuming that a ListCell * pointer will remain good over an operation that could now move the list contents. To help catch such errors, list.c can be compiled with a new define symbol DEBUG_LIST_MEMORY_USAGE that forcibly moves list contents whenever that could possibly happen. This makes list operations significantly more expensive so it's not normally turned on (though it is on by default if USE_VALGRIND is on). There are two notable API differences from the previous code: * lnext() now requires the List's header pointer in addition to the current cell's address. * list_delete_cell() no longer requires a previous-cell argument. These changes are somewhat unfortunate, but on the other hand code using either function needs inspection to see if it is assuming anything it shouldn't, so it's not all bad. Programmers should be aware of these significant performance changes: * list_nth() and related functions are now O(1); so there's no major access-speed difference between a list and an array. * Inserting or deleting a list element now takes time proportional to the distance to the end of the list, due to moving the array elements. (However, it typically *doesn't* require palloc or pfree, so except in long lists it's probably still faster than before.) Notably, lcons() used to be about the same cost as lappend(), but that's no longer true if the list is long. Code that uses lcons() and list_delete_first() to maintain a stack might usefully be rewritten to push and pop at the end of the list rather than the beginning. * There are now list_insert_nth...() and list_delete_nth...() functions that add or remove a list cell identified by index. These have the data-movement penalty explained above, but there's no search penalty. * list_concat() and variants now copy the second list's data into storage belonging to the first list, so there is no longer any sharing of cells between the input lists. The second argument is now declared "const List *" to reflect that it isn't changed. This patch just does the minimum needed to get the new implementation in place and fix bugs exposed by the regression tests. As suggested by the foregoing, there's a fair amount of followup work remaining to do. Also, the ENABLE_LIST_COMPAT macros are finally removed in this commit. Code using those should have been gone a dozen years ago. Patch by me; thanks to David Rowley, Jesper Pedersen, and others for review. Discussion: https://postgr.es/m/11587.1550975080@sss.pgh.pa.us
2019-07-15 19:41:58 +02:00
if (lnext(namelist, lc))
Further fixes for quoted-list GUC values in pg_dump and ruleutils.c. Commits 742869946 et al turn out to be a couple bricks shy of a load. We were dumping the stored values of GUC_LIST_QUOTE variables as they appear in proconfig or setconfig catalog columns. However, although that quoting rule looks a lot like SQL-identifier double quotes, there are two critical differences: empty strings ("") are legal, and depending on which variable you're considering, values longer than NAMEDATALEN might be valid too. So the current technique fails altogether on empty-string list entries (as reported by Steven Winfield in bug #15248) and it also risks truncating file pathnames during dump/reload of GUC values that are lists of pathnames. To fix, split the stored value without any downcasing or truncation, and then emit each element as a SQL string literal. This is a tad annoying, because we now have three copies of the comma-separated-string splitting logic in varlena.c as well as a fourth one in dumputils.c. (Not to mention the randomly-different-from-those splitting logic in libpq...) I looked at unifying these, but it would be rather a mess unless we're willing to tweak the API definitions of SplitIdentifierString, SplitDirectoriesString, or both. That might be worth doing in future; but it seems pretty unsafe for a back-patched bug fix, so for now accept the duplication. Back-patch to all supported branches, as the previous fix was. Discussion: https://postgr.es/m/7585.1529435872@sss.pgh.pa.us
2018-07-31 19:00:07 +02:00
appendStringInfoString(&buf, ", ");
}
}
else
simple_quote_literal(&buf, pos);
appendStringInfoChar(&buf, '\n');
}
}
}
/* And finally the function definition ... */
(void) SysCacheGetAttr(PROCOID, proctup, Anum_pg_proc_prosqlbody, &isnull);
SQL-standard function body This adds support for writing CREATE FUNCTION and CREATE PROCEDURE statements for language SQL with a function body that conforms to the SQL standard and is portable to other implementations. Instead of the PostgreSQL-specific AS $$ string literal $$ syntax, this allows writing out the SQL statements making up the body unquoted, either as a single statement: CREATE FUNCTION add(a integer, b integer) RETURNS integer LANGUAGE SQL RETURN a + b; or as a block CREATE PROCEDURE insert_data(a integer, b integer) LANGUAGE SQL BEGIN ATOMIC INSERT INTO tbl VALUES (a); INSERT INTO tbl VALUES (b); END; The function body is parsed at function definition time and stored as expression nodes in a new pg_proc column prosqlbody. So at run time, no further parsing is required. However, this form does not support polymorphic arguments, because there is no more parse analysis done at call time. Dependencies between the function and the objects it uses are fully tracked. A new RETURN statement is introduced. This can only be used inside function bodies. Internally, it is treated much like a SELECT statement. psql needs some new intelligence to keep track of function body boundaries so that it doesn't send off statements when it sees semicolons that are inside a function body. Tested-by: Jaime Casanova <jcasanov@systemguards.com.ec> Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/1c11f1eb-f00c-43b7-799d-2d44132c02d7@2ndquadrant.com
2021-04-07 21:30:08 +02:00
if (proc->prolang == SQLlanguageId && !isnull)
{
print_function_sqlbody(&buf, proctup);
}
else
{
appendStringInfoString(&buf, "AS ");
tmp = SysCacheGetAttr(PROCOID, proctup, Anum_pg_proc_probin, &isnull);
if (!isnull)
{
simple_quote_literal(&buf, TextDatumGetCString(tmp));
appendStringInfoString(&buf, ", "); /* assume prosrc isn't null */
}
tmp = SysCacheGetAttrNotNull(PROCOID, proctup, Anum_pg_proc_prosrc);
prosrc = TextDatumGetCString(tmp);
/*
* We always use dollar quoting. Figure out a suitable delimiter.
*
* Since the user is likely to be editing the function body string, we
* shouldn't use a short delimiter that he might easily create a
* conflict with. Hence prefer "$function$"/"$procedure$", but extend
* if needed.
*/
initStringInfo(&dq);
appendStringInfoChar(&dq, '$');
appendStringInfoString(&dq, (isfunction ? "function" : "procedure"));
while (strstr(prosrc, dq.data) != NULL)
appendStringInfoChar(&dq, 'x');
appendStringInfoChar(&dq, '$');
appendBinaryStringInfo(&buf, dq.data, dq.len);
appendStringInfoString(&buf, prosrc);
appendBinaryStringInfo(&buf, dq.data, dq.len);
SQL-standard function body This adds support for writing CREATE FUNCTION and CREATE PROCEDURE statements for language SQL with a function body that conforms to the SQL standard and is portable to other implementations. Instead of the PostgreSQL-specific AS $$ string literal $$ syntax, this allows writing out the SQL statements making up the body unquoted, either as a single statement: CREATE FUNCTION add(a integer, b integer) RETURNS integer LANGUAGE SQL RETURN a + b; or as a block CREATE PROCEDURE insert_data(a integer, b integer) LANGUAGE SQL BEGIN ATOMIC INSERT INTO tbl VALUES (a); INSERT INTO tbl VALUES (b); END; The function body is parsed at function definition time and stored as expression nodes in a new pg_proc column prosqlbody. So at run time, no further parsing is required. However, this form does not support polymorphic arguments, because there is no more parse analysis done at call time. Dependencies between the function and the objects it uses are fully tracked. A new RETURN statement is introduced. This can only be used inside function bodies. Internally, it is treated much like a SELECT statement. psql needs some new intelligence to keep track of function body boundaries so that it doesn't send off statements when it sees semicolons that are inside a function body. Tested-by: Jaime Casanova <jcasanov@systemguards.com.ec> Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/1c11f1eb-f00c-43b7-799d-2d44132c02d7@2ndquadrant.com
2021-04-07 21:30:08 +02:00
}
appendStringInfoChar(&buf, '\n');
ReleaseSysCache(proctup);
PG_RETURN_TEXT_P(string_to_text(buf.data));
}
/*
* pg_get_function_arguments
* Get a nicely-formatted list of arguments for a function.
* This is everything that would go between the parentheses in
* CREATE FUNCTION.
*/
Datum
pg_get_function_arguments(PG_FUNCTION_ARGS)
{
Oid funcid = PG_GETARG_OID(0);
StringInfoData buf;
HeapTuple proctup;
proctup = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcid));
if (!HeapTupleIsValid(proctup))
PG_RETURN_NULL();
initStringInfo(&buf);
(void) print_function_arguments(&buf, proctup, false, true);
ReleaseSysCache(proctup);
PG_RETURN_TEXT_P(string_to_text(buf.data));
}
/*
* pg_get_function_identity_arguments
* Get a formatted list of arguments for a function.
* This is everything that would go between the parentheses in
* ALTER FUNCTION, etc. In particular, don't print defaults.
*/
Datum
pg_get_function_identity_arguments(PG_FUNCTION_ARGS)
{
Oid funcid = PG_GETARG_OID(0);
StringInfoData buf;
HeapTuple proctup;
proctup = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcid));
if (!HeapTupleIsValid(proctup))
PG_RETURN_NULL();
initStringInfo(&buf);
(void) print_function_arguments(&buf, proctup, false, false);
ReleaseSysCache(proctup);
PG_RETURN_TEXT_P(string_to_text(buf.data));
}
/*
* pg_get_function_result
* Get a nicely-formatted version of the result type of a function.
* This is what would appear after RETURNS in CREATE FUNCTION.
*/
Datum
pg_get_function_result(PG_FUNCTION_ARGS)
{
Oid funcid = PG_GETARG_OID(0);
StringInfoData buf;
HeapTuple proctup;
proctup = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcid));
if (!HeapTupleIsValid(proctup))
PG_RETURN_NULL();
if (((Form_pg_proc) GETSTRUCT(proctup))->prokind == PROKIND_PROCEDURE)
{
ReleaseSysCache(proctup);
PG_RETURN_NULL();
}
initStringInfo(&buf);
print_function_rettype(&buf, proctup);
ReleaseSysCache(proctup);
PG_RETURN_TEXT_P(string_to_text(buf.data));
}
/*
* Guts of pg_get_function_result: append the function's return type
* to the specified buffer.
*/
static void
print_function_rettype(StringInfo buf, HeapTuple proctup)
{
Form_pg_proc proc = (Form_pg_proc) GETSTRUCT(proctup);
int ntabargs = 0;
StringInfoData rbuf;
initStringInfo(&rbuf);
if (proc->proretset)
{
/* It might be a table function; try to print the arguments */
appendStringInfoString(&rbuf, "TABLE(");
ntabargs = print_function_arguments(&rbuf, proctup, true, false);
if (ntabargs > 0)
appendStringInfoChar(&rbuf, ')');
else
resetStringInfo(&rbuf);
}
if (ntabargs == 0)
{
/* Not a table function, so do the normal thing */
if (proc->proretset)
appendStringInfoString(&rbuf, "SETOF ");
appendStringInfoString(&rbuf, format_type_be(proc->prorettype));
}
appendBinaryStringInfo(buf, rbuf.data, rbuf.len);
}
/*
* Common code for pg_get_function_arguments and pg_get_function_result:
* append the desired subset of arguments to buf. We print only TABLE
* arguments when print_table_args is true, and all the others when it's false.
* We print argument defaults only if print_defaults is true.
* Function return value is the number of arguments printed.
*/
static int
print_function_arguments(StringInfo buf, HeapTuple proctup,
bool print_table_args, bool print_defaults)
{
Form_pg_proc proc = (Form_pg_proc) GETSTRUCT(proctup);
int numargs;
Oid *argtypes;
char **argnames;
char *argmodes;
Support ordered-set (WITHIN GROUP) aggregates. This patch introduces generic support for ordered-set and hypothetical-set aggregate functions, as well as implementations of the instances defined in SQL:2008 (percentile_cont(), percentile_disc(), rank(), dense_rank(), percent_rank(), cume_dist()). We also added mode() though it is not in the spec, as well as versions of percentile_cont() and percentile_disc() that can compute multiple percentile values in one pass over the data. Unlike the original submission, this patch puts full control of the sorting process in the hands of the aggregate's support functions. To allow the support functions to find out how they're supposed to sort, a new API function AggGetAggref() is added to nodeAgg.c. This allows retrieval of the aggregate call's Aggref node, which may have other uses beyond the immediate need. There is also support for ordered-set aggregates to install cleanup callback functions, so that they can be sure that infrastructure such as tuplesort objects gets cleaned up. In passing, make some fixes in the recently-added support for variadic aggregates, and make some editorial adjustments in the recent FILTER additions for aggregates. Also, simplify use of IsBinaryCoercible() by allowing it to succeed whenever the target type is ANY or ANYELEMENT. It was inconsistent that it dealt with other polymorphic target types but not these. Atri Sharma and Andrew Gierth; reviewed by Pavel Stehule and Vik Fearing, and rather heavily editorialized upon by Tom Lane
2013-12-23 22:11:35 +01:00
int insertorderbyat = -1;
int argsprinted;
int inputargno;
int nlackdefaults;
Represent Lists as expansible arrays, not chains of cons-cells. Originally, Postgres Lists were a more or less exact reimplementation of Lisp lists, which consist of chains of separately-allocated cons cells, each having a value and a next-cell link. We'd hacked that once before (commit d0b4399d8) to add a separate List header, but the data was still in cons cells. That makes some operations -- notably list_nth() -- O(N), and it's bulky because of the next-cell pointers and per-cell palloc overhead, and it's very cache-unfriendly if the cons cells end up scattered around rather than being adjacent. In this rewrite, we still have List headers, but the data is in a resizable array of values, with no next-cell links. Now we need at most two palloc's per List, and often only one, since we can allocate some values in the same palloc call as the List header. (Of course, extending an existing List may require repalloc's to enlarge the array. But this involves just O(log N) allocations not O(N).) Of course this is not without downsides. The key difficulty is that addition or deletion of a list entry may now cause other entries to move, which it did not before. For example, that breaks foreach() and sister macros, which historically used a pointer to the current cons-cell as loop state. We can repair those macros transparently by making their actual loop state be an integer list index; the exposed "ListCell *" pointer is no longer state carried across loop iterations, but is just a derived value. (In practice, modern compilers can optimize things back to having just one loop state value, at least for simple cases with inline loop bodies.) In principle, this is a semantics change for cases where the loop body inserts or deletes list entries ahead of the current loop index; but I found no such cases in the Postgres code. The change is not at all transparent for code that doesn't use foreach() but chases lists "by hand" using lnext(). The largest share of such code in the backend is in loops that were maintaining "prev" and "next" variables in addition to the current-cell pointer, in order to delete list cells efficiently using list_delete_cell(). However, we no longer need a previous-cell pointer to delete a list cell efficiently. Keeping a next-cell pointer doesn't work, as explained above, but we can improve matters by changing such code to use a regular foreach() loop and then using the new macro foreach_delete_current() to delete the current cell. (This macro knows how to update the associated foreach loop's state so that no cells will be missed in the traversal.) There remains a nontrivial risk of code assuming that a ListCell * pointer will remain good over an operation that could now move the list contents. To help catch such errors, list.c can be compiled with a new define symbol DEBUG_LIST_MEMORY_USAGE that forcibly moves list contents whenever that could possibly happen. This makes list operations significantly more expensive so it's not normally turned on (though it is on by default if USE_VALGRIND is on). There are two notable API differences from the previous code: * lnext() now requires the List's header pointer in addition to the current cell's address. * list_delete_cell() no longer requires a previous-cell argument. These changes are somewhat unfortunate, but on the other hand code using either function needs inspection to see if it is assuming anything it shouldn't, so it's not all bad. Programmers should be aware of these significant performance changes: * list_nth() and related functions are now O(1); so there's no major access-speed difference between a list and an array. * Inserting or deleting a list element now takes time proportional to the distance to the end of the list, due to moving the array elements. (However, it typically *doesn't* require palloc or pfree, so except in long lists it's probably still faster than before.) Notably, lcons() used to be about the same cost as lappend(), but that's no longer true if the list is long. Code that uses lcons() and list_delete_first() to maintain a stack might usefully be rewritten to push and pop at the end of the list rather than the beginning. * There are now list_insert_nth...() and list_delete_nth...() functions that add or remove a list cell identified by index. These have the data-movement penalty explained above, but there's no search penalty. * list_concat() and variants now copy the second list's data into storage belonging to the first list, so there is no longer any sharing of cells between the input lists. The second argument is now declared "const List *" to reflect that it isn't changed. This patch just does the minimum needed to get the new implementation in place and fix bugs exposed by the regression tests. As suggested by the foregoing, there's a fair amount of followup work remaining to do. Also, the ENABLE_LIST_COMPAT macros are finally removed in this commit. Code using those should have been gone a dozen years ago. Patch by me; thanks to David Rowley, Jesper Pedersen, and others for review. Discussion: https://postgr.es/m/11587.1550975080@sss.pgh.pa.us
2019-07-15 19:41:58 +02:00
List *argdefaults = NIL;
ListCell *nextargdefault = NULL;
int i;
numargs = get_func_arg_info(proctup,
&argtypes, &argnames, &argmodes);
nlackdefaults = numargs;
if (print_defaults && proc->pronargdefaults > 0)
{
Datum proargdefaults;
bool isnull;
proargdefaults = SysCacheGetAttr(PROCOID, proctup,
Anum_pg_proc_proargdefaults,
&isnull);
if (!isnull)
{
char *str;
str = TextDatumGetCString(proargdefaults);
2017-02-21 17:33:07 +01:00
argdefaults = castNode(List, stringToNode(str));
pfree(str);
nextargdefault = list_head(argdefaults);
/* nlackdefaults counts only *input* arguments lacking defaults */
nlackdefaults = proc->pronargs - list_length(argdefaults);
}
}
Support ordered-set (WITHIN GROUP) aggregates. This patch introduces generic support for ordered-set and hypothetical-set aggregate functions, as well as implementations of the instances defined in SQL:2008 (percentile_cont(), percentile_disc(), rank(), dense_rank(), percent_rank(), cume_dist()). We also added mode() though it is not in the spec, as well as versions of percentile_cont() and percentile_disc() that can compute multiple percentile values in one pass over the data. Unlike the original submission, this patch puts full control of the sorting process in the hands of the aggregate's support functions. To allow the support functions to find out how they're supposed to sort, a new API function AggGetAggref() is added to nodeAgg.c. This allows retrieval of the aggregate call's Aggref node, which may have other uses beyond the immediate need. There is also support for ordered-set aggregates to install cleanup callback functions, so that they can be sure that infrastructure such as tuplesort objects gets cleaned up. In passing, make some fixes in the recently-added support for variadic aggregates, and make some editorial adjustments in the recent FILTER additions for aggregates. Also, simplify use of IsBinaryCoercible() by allowing it to succeed whenever the target type is ANY or ANYELEMENT. It was inconsistent that it dealt with other polymorphic target types but not these. Atri Sharma and Andrew Gierth; reviewed by Pavel Stehule and Vik Fearing, and rather heavily editorialized upon by Tom Lane
2013-12-23 22:11:35 +01:00
/* Check for special treatment of ordered-set aggregates */
if (proc->prokind == PROKIND_AGGREGATE)
Support ordered-set (WITHIN GROUP) aggregates. This patch introduces generic support for ordered-set and hypothetical-set aggregate functions, as well as implementations of the instances defined in SQL:2008 (percentile_cont(), percentile_disc(), rank(), dense_rank(), percent_rank(), cume_dist()). We also added mode() though it is not in the spec, as well as versions of percentile_cont() and percentile_disc() that can compute multiple percentile values in one pass over the data. Unlike the original submission, this patch puts full control of the sorting process in the hands of the aggregate's support functions. To allow the support functions to find out how they're supposed to sort, a new API function AggGetAggref() is added to nodeAgg.c. This allows retrieval of the aggregate call's Aggref node, which may have other uses beyond the immediate need. There is also support for ordered-set aggregates to install cleanup callback functions, so that they can be sure that infrastructure such as tuplesort objects gets cleaned up. In passing, make some fixes in the recently-added support for variadic aggregates, and make some editorial adjustments in the recent FILTER additions for aggregates. Also, simplify use of IsBinaryCoercible() by allowing it to succeed whenever the target type is ANY or ANYELEMENT. It was inconsistent that it dealt with other polymorphic target types but not these. Atri Sharma and Andrew Gierth; reviewed by Pavel Stehule and Vik Fearing, and rather heavily editorialized upon by Tom Lane
2013-12-23 22:11:35 +01:00
{
HeapTuple aggtup;
Form_pg_aggregate agg;
aggtup = SearchSysCache1(AGGFNOID, ObjectIdGetDatum(proc->oid));
Support ordered-set (WITHIN GROUP) aggregates. This patch introduces generic support for ordered-set and hypothetical-set aggregate functions, as well as implementations of the instances defined in SQL:2008 (percentile_cont(), percentile_disc(), rank(), dense_rank(), percent_rank(), cume_dist()). We also added mode() though it is not in the spec, as well as versions of percentile_cont() and percentile_disc() that can compute multiple percentile values in one pass over the data. Unlike the original submission, this patch puts full control of the sorting process in the hands of the aggregate's support functions. To allow the support functions to find out how they're supposed to sort, a new API function AggGetAggref() is added to nodeAgg.c. This allows retrieval of the aggregate call's Aggref node, which may have other uses beyond the immediate need. There is also support for ordered-set aggregates to install cleanup callback functions, so that they can be sure that infrastructure such as tuplesort objects gets cleaned up. In passing, make some fixes in the recently-added support for variadic aggregates, and make some editorial adjustments in the recent FILTER additions for aggregates. Also, simplify use of IsBinaryCoercible() by allowing it to succeed whenever the target type is ANY or ANYELEMENT. It was inconsistent that it dealt with other polymorphic target types but not these. Atri Sharma and Andrew Gierth; reviewed by Pavel Stehule and Vik Fearing, and rather heavily editorialized upon by Tom Lane
2013-12-23 22:11:35 +01:00
if (!HeapTupleIsValid(aggtup))
elog(ERROR, "cache lookup failed for aggregate %u",
Remove WITH OIDS support, change oid catalog column visibility. Previously tables declared WITH OIDS, including a significant fraction of the catalog tables, stored the oid column not as a normal column, but as part of the tuple header. This special column was not shown by default, which was somewhat odd, as it's often (consider e.g. pg_class.oid) one of the more important parts of a row. Neither pg_dump nor COPY included the contents of the oid column by default. The fact that the oid column was not an ordinary column necessitated a significant amount of special case code to support oid columns. That already was painful for the existing, but upcoming work aiming to make table storage pluggable, would have required expanding and duplicating that "specialness" significantly. WITH OIDS has been deprecated since 2005 (commit ff02d0a05280e0). Remove it. Removing includes: - CREATE TABLE and ALTER TABLE syntax for declaring the table to be WITH OIDS has been removed (WITH (oids[ = true]) will error out) - pg_dump does not support dumping tables declared WITH OIDS and will issue a warning when dumping one (and ignore the oid column). - restoring an pg_dump archive with pg_restore will warn when restoring a table with oid contents (and ignore the oid column) - COPY will refuse to load binary dump that includes oids. - pg_upgrade will error out when encountering tables declared WITH OIDS, they have to be altered to remove the oid column first. - Functionality to access the oid of the last inserted row (like plpgsql's RESULT_OID, spi's SPI_lastoid, ...) has been removed. The syntax for declaring a table WITHOUT OIDS (or WITH (oids = false) for CREATE TABLE) is still supported. While that requires a bit of support code, it seems unnecessary to break applications / dumps that do not use oids, and are explicit about not using them. The biggest user of WITH OID columns was postgres' catalog. This commit changes all 'magic' oid columns to be columns that are normally declared and stored. To reduce unnecessary query breakage all the newly added columns are still named 'oid', even if a table's column naming scheme would indicate 'reloid' or such. This obviously requires adapting a lot code, mostly replacing oid access via HeapTupleGetOid() with access to the underlying Form_pg_*->oid column. The bootstrap process now assigns oids for all oid columns in genbki.pl that do not have an explicit value (starting at the largest oid previously used), only oids assigned later by oids will be above FirstBootstrapObjectId. As the oid column now is a normal column the special bootstrap syntax for oids has been removed. Oids are not automatically assigned during insertion anymore, all backend code explicitly assigns oids with GetNewOidWithIndex(). For the rare case that insertions into the catalog via SQL are called for the new pg_nextoid() function can be used (which only works on catalog tables). The fact that oid columns on system tables are now normal columns means that they will be included in the set of columns expanded by * (i.e. SELECT * FROM pg_class will now include the table's oid, previously it did not). It'd not technically be hard to hide oid column by default, but that'd mean confusing behavior would either have to be carried forward forever, or it'd cause breakage down the line. While it's not unlikely that further adjustments are needed, the scope/invasiveness of the patch makes it worthwhile to get merge this now. It's painful to maintain externally, too complicated to commit after the code code freeze, and a dependency of a number of other patches. Catversion bump, for obvious reasons. Author: Andres Freund, with contributions by John Naylor Discussion: https://postgr.es/m/20180930034810.ywp2c7awz7opzcfr@alap3.anarazel.de
2018-11-21 00:36:57 +01:00
proc->oid);
Support ordered-set (WITHIN GROUP) aggregates. This patch introduces generic support for ordered-set and hypothetical-set aggregate functions, as well as implementations of the instances defined in SQL:2008 (percentile_cont(), percentile_disc(), rank(), dense_rank(), percent_rank(), cume_dist()). We also added mode() though it is not in the spec, as well as versions of percentile_cont() and percentile_disc() that can compute multiple percentile values in one pass over the data. Unlike the original submission, this patch puts full control of the sorting process in the hands of the aggregate's support functions. To allow the support functions to find out how they're supposed to sort, a new API function AggGetAggref() is added to nodeAgg.c. This allows retrieval of the aggregate call's Aggref node, which may have other uses beyond the immediate need. There is also support for ordered-set aggregates to install cleanup callback functions, so that they can be sure that infrastructure such as tuplesort objects gets cleaned up. In passing, make some fixes in the recently-added support for variadic aggregates, and make some editorial adjustments in the recent FILTER additions for aggregates. Also, simplify use of IsBinaryCoercible() by allowing it to succeed whenever the target type is ANY or ANYELEMENT. It was inconsistent that it dealt with other polymorphic target types but not these. Atri Sharma and Andrew Gierth; reviewed by Pavel Stehule and Vik Fearing, and rather heavily editorialized upon by Tom Lane
2013-12-23 22:11:35 +01:00
agg = (Form_pg_aggregate) GETSTRUCT(aggtup);
if (AGGKIND_IS_ORDERED_SET(agg->aggkind))
insertorderbyat = agg->aggnumdirectargs;
ReleaseSysCache(aggtup);
}
argsprinted = 0;
inputargno = 0;
for (i = 0; i < numargs; i++)
{
Oid argtype = argtypes[i];
char *argname = argnames ? argnames[i] : NULL;
char argmode = argmodes ? argmodes[i] : PROARGMODE_IN;
const char *modename;
bool isinput;
switch (argmode)
{
case PROARGMODE_IN:
Reconsider the handling of procedure OUT parameters. Commit 2453ea142 redefined pg_proc.proargtypes to include the types of OUT parameters, for procedures only. While that had some advantages for implementing the SQL-spec behavior of DROP PROCEDURE, it was pretty disastrous from a number of other perspectives. Notably, since the primary key of pg_proc is name + proargtypes, this made it possible to have multiple procedures with identical names + input arguments and differing output argument types. That would make it impossible to call any one of the procedures by writing just NULL (or "?", or any other data-type-free notation) for the output argument(s). The change also seems likely to cause grave confusion for client applications that examine pg_proc and expect the traditional definition of proargtypes. Hence, revert the definition of proargtypes to what it was, and undo a number of complications that had been added to support that. To support the SQL-spec behavior of DROP PROCEDURE, when there are no argmode markers in the command's parameter list, we perform the lookup both ways (that is, matching against both proargtypes and proallargtypes), succeeding if we get just one unique match. In principle this could result in ambiguous-function failures that would not happen when using only one of the two rules. However, overloading of procedure names is thought to be a pretty rare usage, so this shouldn't cause many problems in practice. Postgres-specific code such as pg_dump can defend against any possibility of such failures by being careful to specify argmodes for all procedure arguments. This also fixes a few other bugs in the area of CALL statements with named parameters, and improves the documentation a little. catversion bump forced because the representation of procedures with OUT arguments changes. Discussion: https://postgr.es/m/3742981.1621533210@sss.pgh.pa.us
2021-06-10 23:11:36 +02:00
/*
* For procedures, explicitly mark all argument modes, so as
* to avoid ambiguity with the SQL syntax for DROP PROCEDURE.
*/
if (proc->prokind == PROKIND_PROCEDURE)
modename = "IN ";
else
modename = "";
isinput = true;
break;
case PROARGMODE_INOUT:
modename = "INOUT ";
isinput = true;
break;
case PROARGMODE_OUT:
modename = "OUT ";
isinput = false;
break;
case PROARGMODE_VARIADIC:
modename = "VARIADIC ";
isinput = true;
break;
case PROARGMODE_TABLE:
modename = "";
isinput = false;
break;
default:
elog(ERROR, "invalid parameter mode '%c'", argmode);
modename = NULL; /* keep compiler quiet */
isinput = false;
break;
}
if (isinput)
inputargno++; /* this is a 1-based counter */
if (print_table_args != (argmode == PROARGMODE_TABLE))
continue;
Support ordered-set (WITHIN GROUP) aggregates. This patch introduces generic support for ordered-set and hypothetical-set aggregate functions, as well as implementations of the instances defined in SQL:2008 (percentile_cont(), percentile_disc(), rank(), dense_rank(), percent_rank(), cume_dist()). We also added mode() though it is not in the spec, as well as versions of percentile_cont() and percentile_disc() that can compute multiple percentile values in one pass over the data. Unlike the original submission, this patch puts full control of the sorting process in the hands of the aggregate's support functions. To allow the support functions to find out how they're supposed to sort, a new API function AggGetAggref() is added to nodeAgg.c. This allows retrieval of the aggregate call's Aggref node, which may have other uses beyond the immediate need. There is also support for ordered-set aggregates to install cleanup callback functions, so that they can be sure that infrastructure such as tuplesort objects gets cleaned up. In passing, make some fixes in the recently-added support for variadic aggregates, and make some editorial adjustments in the recent FILTER additions for aggregates. Also, simplify use of IsBinaryCoercible() by allowing it to succeed whenever the target type is ANY or ANYELEMENT. It was inconsistent that it dealt with other polymorphic target types but not these. Atri Sharma and Andrew Gierth; reviewed by Pavel Stehule and Vik Fearing, and rather heavily editorialized upon by Tom Lane
2013-12-23 22:11:35 +01:00
if (argsprinted == insertorderbyat)
{
if (argsprinted)
appendStringInfoChar(buf, ' ');
appendStringInfoString(buf, "ORDER BY ");
}
else if (argsprinted)
appendStringInfoString(buf, ", ");
Support ordered-set (WITHIN GROUP) aggregates. This patch introduces generic support for ordered-set and hypothetical-set aggregate functions, as well as implementations of the instances defined in SQL:2008 (percentile_cont(), percentile_disc(), rank(), dense_rank(), percent_rank(), cume_dist()). We also added mode() though it is not in the spec, as well as versions of percentile_cont() and percentile_disc() that can compute multiple percentile values in one pass over the data. Unlike the original submission, this patch puts full control of the sorting process in the hands of the aggregate's support functions. To allow the support functions to find out how they're supposed to sort, a new API function AggGetAggref() is added to nodeAgg.c. This allows retrieval of the aggregate call's Aggref node, which may have other uses beyond the immediate need. There is also support for ordered-set aggregates to install cleanup callback functions, so that they can be sure that infrastructure such as tuplesort objects gets cleaned up. In passing, make some fixes in the recently-added support for variadic aggregates, and make some editorial adjustments in the recent FILTER additions for aggregates. Also, simplify use of IsBinaryCoercible() by allowing it to succeed whenever the target type is ANY or ANYELEMENT. It was inconsistent that it dealt with other polymorphic target types but not these. Atri Sharma and Andrew Gierth; reviewed by Pavel Stehule and Vik Fearing, and rather heavily editorialized upon by Tom Lane
2013-12-23 22:11:35 +01:00
appendStringInfoString(buf, modename);
if (argname && argname[0])
appendStringInfo(buf, "%s ", quote_identifier(argname));
appendStringInfoString(buf, format_type_be(argtype));
if (print_defaults && isinput && inputargno > nlackdefaults)
{
Node *expr;
Assert(nextargdefault != NULL);
expr = (Node *) lfirst(nextargdefault);
Represent Lists as expansible arrays, not chains of cons-cells. Originally, Postgres Lists were a more or less exact reimplementation of Lisp lists, which consist of chains of separately-allocated cons cells, each having a value and a next-cell link. We'd hacked that once before (commit d0b4399d8) to add a separate List header, but the data was still in cons cells. That makes some operations -- notably list_nth() -- O(N), and it's bulky because of the next-cell pointers and per-cell palloc overhead, and it's very cache-unfriendly if the cons cells end up scattered around rather than being adjacent. In this rewrite, we still have List headers, but the data is in a resizable array of values, with no next-cell links. Now we need at most two palloc's per List, and often only one, since we can allocate some values in the same palloc call as the List header. (Of course, extending an existing List may require repalloc's to enlarge the array. But this involves just O(log N) allocations not O(N).) Of course this is not without downsides. The key difficulty is that addition or deletion of a list entry may now cause other entries to move, which it did not before. For example, that breaks foreach() and sister macros, which historically used a pointer to the current cons-cell as loop state. We can repair those macros transparently by making their actual loop state be an integer list index; the exposed "ListCell *" pointer is no longer state carried across loop iterations, but is just a derived value. (In practice, modern compilers can optimize things back to having just one loop state value, at least for simple cases with inline loop bodies.) In principle, this is a semantics change for cases where the loop body inserts or deletes list entries ahead of the current loop index; but I found no such cases in the Postgres code. The change is not at all transparent for code that doesn't use foreach() but chases lists "by hand" using lnext(). The largest share of such code in the backend is in loops that were maintaining "prev" and "next" variables in addition to the current-cell pointer, in order to delete list cells efficiently using list_delete_cell(). However, we no longer need a previous-cell pointer to delete a list cell efficiently. Keeping a next-cell pointer doesn't work, as explained above, but we can improve matters by changing such code to use a regular foreach() loop and then using the new macro foreach_delete_current() to delete the current cell. (This macro knows how to update the associated foreach loop's state so that no cells will be missed in the traversal.) There remains a nontrivial risk of code assuming that a ListCell * pointer will remain good over an operation that could now move the list contents. To help catch such errors, list.c can be compiled with a new define symbol DEBUG_LIST_MEMORY_USAGE that forcibly moves list contents whenever that could possibly happen. This makes list operations significantly more expensive so it's not normally turned on (though it is on by default if USE_VALGRIND is on). There are two notable API differences from the previous code: * lnext() now requires the List's header pointer in addition to the current cell's address. * list_delete_cell() no longer requires a previous-cell argument. These changes are somewhat unfortunate, but on the other hand code using either function needs inspection to see if it is assuming anything it shouldn't, so it's not all bad. Programmers should be aware of these significant performance changes: * list_nth() and related functions are now O(1); so there's no major access-speed difference between a list and an array. * Inserting or deleting a list element now takes time proportional to the distance to the end of the list, due to moving the array elements. (However, it typically *doesn't* require palloc or pfree, so except in long lists it's probably still faster than before.) Notably, lcons() used to be about the same cost as lappend(), but that's no longer true if the list is long. Code that uses lcons() and list_delete_first() to maintain a stack might usefully be rewritten to push and pop at the end of the list rather than the beginning. * There are now list_insert_nth...() and list_delete_nth...() functions that add or remove a list cell identified by index. These have the data-movement penalty explained above, but there's no search penalty. * list_concat() and variants now copy the second list's data into storage belonging to the first list, so there is no longer any sharing of cells between the input lists. The second argument is now declared "const List *" to reflect that it isn't changed. This patch just does the minimum needed to get the new implementation in place and fix bugs exposed by the regression tests. As suggested by the foregoing, there's a fair amount of followup work remaining to do. Also, the ENABLE_LIST_COMPAT macros are finally removed in this commit. Code using those should have been gone a dozen years ago. Patch by me; thanks to David Rowley, Jesper Pedersen, and others for review. Discussion: https://postgr.es/m/11587.1550975080@sss.pgh.pa.us
2019-07-15 19:41:58 +02:00
nextargdefault = lnext(argdefaults, nextargdefault);
appendStringInfo(buf, " DEFAULT %s",
deparse_expression(expr, NIL, false, false));
}
argsprinted++;
Support ordered-set (WITHIN GROUP) aggregates. This patch introduces generic support for ordered-set and hypothetical-set aggregate functions, as well as implementations of the instances defined in SQL:2008 (percentile_cont(), percentile_disc(), rank(), dense_rank(), percent_rank(), cume_dist()). We also added mode() though it is not in the spec, as well as versions of percentile_cont() and percentile_disc() that can compute multiple percentile values in one pass over the data. Unlike the original submission, this patch puts full control of the sorting process in the hands of the aggregate's support functions. To allow the support functions to find out how they're supposed to sort, a new API function AggGetAggref() is added to nodeAgg.c. This allows retrieval of the aggregate call's Aggref node, which may have other uses beyond the immediate need. There is also support for ordered-set aggregates to install cleanup callback functions, so that they can be sure that infrastructure such as tuplesort objects gets cleaned up. In passing, make some fixes in the recently-added support for variadic aggregates, and make some editorial adjustments in the recent FILTER additions for aggregates. Also, simplify use of IsBinaryCoercible() by allowing it to succeed whenever the target type is ANY or ANYELEMENT. It was inconsistent that it dealt with other polymorphic target types but not these. Atri Sharma and Andrew Gierth; reviewed by Pavel Stehule and Vik Fearing, and rather heavily editorialized upon by Tom Lane
2013-12-23 22:11:35 +01:00
/* nasty hack: print the last arg twice for variadic ordered-set agg */
if (argsprinted == insertorderbyat && i == numargs - 1)
{
i--;
/* aggs shouldn't have defaults anyway, but just to be sure ... */
print_defaults = false;
}
}
return argsprinted;
}
static bool
is_input_argument(int nth, const char *argmodes)
{
return (!argmodes
|| argmodes[nth] == PROARGMODE_IN
|| argmodes[nth] == PROARGMODE_INOUT
|| argmodes[nth] == PROARGMODE_VARIADIC);
}
/*
* Append used transformed types to specified buffer
*/
static void
print_function_trftypes(StringInfo buf, HeapTuple proctup)
{
Oid *trftypes;
int ntypes;
ntypes = get_func_trftypes(proctup, &trftypes);
if (ntypes > 0)
{
int i;
appendStringInfoString(buf, " TRANSFORM ");
for (i = 0; i < ntypes; i++)
{
if (i != 0)
appendStringInfoString(buf, ", ");
appendStringInfo(buf, "FOR TYPE %s", format_type_be(trftypes[i]));
}
appendStringInfoChar(buf, '\n');
}
}
/*
* Get textual representation of a function argument's default value. The
* second argument of this function is the argument number among all arguments
* (i.e. proallargtypes, *not* proargtypes), starting with 1, because that's
* how information_schema.sql uses it.
*/
Datum
pg_get_function_arg_default(PG_FUNCTION_ARGS)
{
Oid funcid = PG_GETARG_OID(0);
int32 nth_arg = PG_GETARG_INT32(1);
HeapTuple proctup;
Form_pg_proc proc;
int numargs;
Oid *argtypes;
char **argnames;
char *argmodes;
int i;
List *argdefaults;
Node *node;
char *str;
int nth_inputarg;
Datum proargdefaults;
bool isnull;
int nth_default;
proctup = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcid));
if (!HeapTupleIsValid(proctup))
PG_RETURN_NULL();
numargs = get_func_arg_info(proctup, &argtypes, &argnames, &argmodes);
if (nth_arg < 1 || nth_arg > numargs || !is_input_argument(nth_arg - 1, argmodes))
{
ReleaseSysCache(proctup);
PG_RETURN_NULL();
}
nth_inputarg = 0;
for (i = 0; i < nth_arg; i++)
if (is_input_argument(i, argmodes))
nth_inputarg++;
proargdefaults = SysCacheGetAttr(PROCOID, proctup,
Anum_pg_proc_proargdefaults,
&isnull);
if (isnull)
{
ReleaseSysCache(proctup);
PG_RETURN_NULL();
}
str = TextDatumGetCString(proargdefaults);
2017-02-21 17:33:07 +01:00
argdefaults = castNode(List, stringToNode(str));
pfree(str);
proc = (Form_pg_proc) GETSTRUCT(proctup);
/*
* Calculate index into proargdefaults: proargdefaults corresponds to the
* last N input arguments, where N = pronargdefaults.
*/
nth_default = nth_inputarg - 1 - (proc->pronargs - proc->pronargdefaults);
if (nth_default < 0 || nth_default >= list_length(argdefaults))
{
ReleaseSysCache(proctup);
PG_RETURN_NULL();
}
node = list_nth(argdefaults, nth_default);
str = deparse_expression(node, NIL, false, false);
ReleaseSysCache(proctup);
PG_RETURN_TEXT_P(string_to_text(str));
}
SQL-standard function body This adds support for writing CREATE FUNCTION and CREATE PROCEDURE statements for language SQL with a function body that conforms to the SQL standard and is portable to other implementations. Instead of the PostgreSQL-specific AS $$ string literal $$ syntax, this allows writing out the SQL statements making up the body unquoted, either as a single statement: CREATE FUNCTION add(a integer, b integer) RETURNS integer LANGUAGE SQL RETURN a + b; or as a block CREATE PROCEDURE insert_data(a integer, b integer) LANGUAGE SQL BEGIN ATOMIC INSERT INTO tbl VALUES (a); INSERT INTO tbl VALUES (b); END; The function body is parsed at function definition time and stored as expression nodes in a new pg_proc column prosqlbody. So at run time, no further parsing is required. However, this form does not support polymorphic arguments, because there is no more parse analysis done at call time. Dependencies between the function and the objects it uses are fully tracked. A new RETURN statement is introduced. This can only be used inside function bodies. Internally, it is treated much like a SELECT statement. psql needs some new intelligence to keep track of function body boundaries so that it doesn't send off statements when it sees semicolons that are inside a function body. Tested-by: Jaime Casanova <jcasanov@systemguards.com.ec> Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/1c11f1eb-f00c-43b7-799d-2d44132c02d7@2ndquadrant.com
2021-04-07 21:30:08 +02:00
static void
print_function_sqlbody(StringInfo buf, HeapTuple proctup)
{
int numargs;
Oid *argtypes;
char **argnames;
char *argmodes;
deparse_namespace dpns = {0};
Datum tmp;
Node *n;
dpns.funcname = pstrdup(NameStr(((Form_pg_proc) GETSTRUCT(proctup))->proname));
numargs = get_func_arg_info(proctup,
&argtypes, &argnames, &argmodes);
dpns.numargs = numargs;
dpns.argnames = argnames;
tmp = SysCacheGetAttrNotNull(PROCOID, proctup, Anum_pg_proc_prosqlbody);
SQL-standard function body This adds support for writing CREATE FUNCTION and CREATE PROCEDURE statements for language SQL with a function body that conforms to the SQL standard and is portable to other implementations. Instead of the PostgreSQL-specific AS $$ string literal $$ syntax, this allows writing out the SQL statements making up the body unquoted, either as a single statement: CREATE FUNCTION add(a integer, b integer) RETURNS integer LANGUAGE SQL RETURN a + b; or as a block CREATE PROCEDURE insert_data(a integer, b integer) LANGUAGE SQL BEGIN ATOMIC INSERT INTO tbl VALUES (a); INSERT INTO tbl VALUES (b); END; The function body is parsed at function definition time and stored as expression nodes in a new pg_proc column prosqlbody. So at run time, no further parsing is required. However, this form does not support polymorphic arguments, because there is no more parse analysis done at call time. Dependencies between the function and the objects it uses are fully tracked. A new RETURN statement is introduced. This can only be used inside function bodies. Internally, it is treated much like a SELECT statement. psql needs some new intelligence to keep track of function body boundaries so that it doesn't send off statements when it sees semicolons that are inside a function body. Tested-by: Jaime Casanova <jcasanov@systemguards.com.ec> Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/1c11f1eb-f00c-43b7-799d-2d44132c02d7@2ndquadrant.com
2021-04-07 21:30:08 +02:00
n = stringToNode(TextDatumGetCString(tmp));
if (IsA(n, List))
{
List *stmts;
ListCell *lc;
stmts = linitial(castNode(List, n));
appendStringInfoString(buf, "BEGIN ATOMIC\n");
foreach(lc, stmts)
{
Query *query = lfirst_node(Query, lc);
/* It seems advisable to get at least AccessShareLock on rels */
AcquireRewriteLocks(query, false, false);
get_query_def(query, buf, list_make1(&dpns), NULL, false,
PRETTYFLAG_INDENT, WRAP_COLUMN_DEFAULT, 1);
SQL-standard function body This adds support for writing CREATE FUNCTION and CREATE PROCEDURE statements for language SQL with a function body that conforms to the SQL standard and is portable to other implementations. Instead of the PostgreSQL-specific AS $$ string literal $$ syntax, this allows writing out the SQL statements making up the body unquoted, either as a single statement: CREATE FUNCTION add(a integer, b integer) RETURNS integer LANGUAGE SQL RETURN a + b; or as a block CREATE PROCEDURE insert_data(a integer, b integer) LANGUAGE SQL BEGIN ATOMIC INSERT INTO tbl VALUES (a); INSERT INTO tbl VALUES (b); END; The function body is parsed at function definition time and stored as expression nodes in a new pg_proc column prosqlbody. So at run time, no further parsing is required. However, this form does not support polymorphic arguments, because there is no more parse analysis done at call time. Dependencies between the function and the objects it uses are fully tracked. A new RETURN statement is introduced. This can only be used inside function bodies. Internally, it is treated much like a SELECT statement. psql needs some new intelligence to keep track of function body boundaries so that it doesn't send off statements when it sees semicolons that are inside a function body. Tested-by: Jaime Casanova <jcasanov@systemguards.com.ec> Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/1c11f1eb-f00c-43b7-799d-2d44132c02d7@2ndquadrant.com
2021-04-07 21:30:08 +02:00
appendStringInfoChar(buf, ';');
appendStringInfoChar(buf, '\n');
}
appendStringInfoString(buf, "END");
}
else
{
Query *query = castNode(Query, n);
/* It seems advisable to get at least AccessShareLock on rels */
AcquireRewriteLocks(query, false, false);
get_query_def(query, buf, list_make1(&dpns), NULL, false,
0, WRAP_COLUMN_DEFAULT, 0);
SQL-standard function body This adds support for writing CREATE FUNCTION and CREATE PROCEDURE statements for language SQL with a function body that conforms to the SQL standard and is portable to other implementations. Instead of the PostgreSQL-specific AS $$ string literal $$ syntax, this allows writing out the SQL statements making up the body unquoted, either as a single statement: CREATE FUNCTION add(a integer, b integer) RETURNS integer LANGUAGE SQL RETURN a + b; or as a block CREATE PROCEDURE insert_data(a integer, b integer) LANGUAGE SQL BEGIN ATOMIC INSERT INTO tbl VALUES (a); INSERT INTO tbl VALUES (b); END; The function body is parsed at function definition time and stored as expression nodes in a new pg_proc column prosqlbody. So at run time, no further parsing is required. However, this form does not support polymorphic arguments, because there is no more parse analysis done at call time. Dependencies between the function and the objects it uses are fully tracked. A new RETURN statement is introduced. This can only be used inside function bodies. Internally, it is treated much like a SELECT statement. psql needs some new intelligence to keep track of function body boundaries so that it doesn't send off statements when it sees semicolons that are inside a function body. Tested-by: Jaime Casanova <jcasanov@systemguards.com.ec> Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/1c11f1eb-f00c-43b7-799d-2d44132c02d7@2ndquadrant.com
2021-04-07 21:30:08 +02:00
}
}
Datum
pg_get_function_sqlbody(PG_FUNCTION_ARGS)
{
Oid funcid = PG_GETARG_OID(0);
StringInfoData buf;
HeapTuple proctup;
bool isnull;
initStringInfo(&buf);
/* Look up the function */
proctup = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcid));
if (!HeapTupleIsValid(proctup))
PG_RETURN_NULL();
(void) SysCacheGetAttr(PROCOID, proctup, Anum_pg_proc_prosqlbody, &isnull);
SQL-standard function body This adds support for writing CREATE FUNCTION and CREATE PROCEDURE statements for language SQL with a function body that conforms to the SQL standard and is portable to other implementations. Instead of the PostgreSQL-specific AS $$ string literal $$ syntax, this allows writing out the SQL statements making up the body unquoted, either as a single statement: CREATE FUNCTION add(a integer, b integer) RETURNS integer LANGUAGE SQL RETURN a + b; or as a block CREATE PROCEDURE insert_data(a integer, b integer) LANGUAGE SQL BEGIN ATOMIC INSERT INTO tbl VALUES (a); INSERT INTO tbl VALUES (b); END; The function body is parsed at function definition time and stored as expression nodes in a new pg_proc column prosqlbody. So at run time, no further parsing is required. However, this form does not support polymorphic arguments, because there is no more parse analysis done at call time. Dependencies between the function and the objects it uses are fully tracked. A new RETURN statement is introduced. This can only be used inside function bodies. Internally, it is treated much like a SELECT statement. psql needs some new intelligence to keep track of function body boundaries so that it doesn't send off statements when it sees semicolons that are inside a function body. Tested-by: Jaime Casanova <jcasanov@systemguards.com.ec> Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/1c11f1eb-f00c-43b7-799d-2d44132c02d7@2ndquadrant.com
2021-04-07 21:30:08 +02:00
if (isnull)
{
ReleaseSysCache(proctup);
PG_RETURN_NULL();
}
print_function_sqlbody(&buf, proctup);
ReleaseSysCache(proctup);
PG_RETURN_TEXT_P(cstring_to_text_with_len(buf.data, buf.len));
SQL-standard function body This adds support for writing CREATE FUNCTION and CREATE PROCEDURE statements for language SQL with a function body that conforms to the SQL standard and is portable to other implementations. Instead of the PostgreSQL-specific AS $$ string literal $$ syntax, this allows writing out the SQL statements making up the body unquoted, either as a single statement: CREATE FUNCTION add(a integer, b integer) RETURNS integer LANGUAGE SQL RETURN a + b; or as a block CREATE PROCEDURE insert_data(a integer, b integer) LANGUAGE SQL BEGIN ATOMIC INSERT INTO tbl VALUES (a); INSERT INTO tbl VALUES (b); END; The function body is parsed at function definition time and stored as expression nodes in a new pg_proc column prosqlbody. So at run time, no further parsing is required. However, this form does not support polymorphic arguments, because there is no more parse analysis done at call time. Dependencies between the function and the objects it uses are fully tracked. A new RETURN statement is introduced. This can only be used inside function bodies. Internally, it is treated much like a SELECT statement. psql needs some new intelligence to keep track of function body boundaries so that it doesn't send off statements when it sees semicolons that are inside a function body. Tested-by: Jaime Casanova <jcasanov@systemguards.com.ec> Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/1c11f1eb-f00c-43b7-799d-2d44132c02d7@2ndquadrant.com
2021-04-07 21:30:08 +02:00
}
/*
* deparse_expression - General utility for deparsing expressions
*
* calls deparse_expression_pretty with all prettyPrinting disabled
*/
char *
deparse_expression(Node *expr, List *dpcontext,
bool forceprefix, bool showimplicit)
{
return deparse_expression_pretty(expr, dpcontext, forceprefix,
showimplicit, 0, 0);
}
/* ----------
* deparse_expression_pretty - General utility for deparsing expressions
*
* expr is the node tree to be deparsed. It must be a transformed expression
* tree (ie, not the raw output of gram.y).
*
* dpcontext is a list of deparse_namespace nodes representing the context
* for interpreting Vars in the node tree. It can be NIL if no Vars are
* expected.
*
* forceprefix is true to force all Vars to be prefixed with their table names.
*
* showimplicit is true to force all implicit casts to be shown explicitly.
*
* Tries to pretty up the output according to prettyFlags and startIndent.
*
* The result is a palloc'd string.
* ----------
*/
static char *
deparse_expression_pretty(Node *expr, List *dpcontext,
bool forceprefix, bool showimplicit,
int prettyFlags, int startIndent)
{
StringInfoData buf;
deparse_context context;
initStringInfo(&buf);
context.buf = &buf;
context.namespaces = dpcontext;
context.windowClause = NIL;
context.windowTList = NIL;
context.varprefix = forceprefix;
context.prettyFlags = prettyFlags;
context.wrapColumn = WRAP_COLUMN_DEFAULT;
context.indentLevel = startIndent;
Support GROUPING SETS, CUBE and ROLLUP. This SQL standard functionality allows to aggregate data by different GROUP BY clauses at once. Each grouping set returns rows with columns grouped by in other sets set to NULL. This could previously be achieved by doing each grouping as a separate query, conjoined by UNION ALLs. Besides being considerably more concise, grouping sets will in many cases be faster, requiring only one scan over the underlying data. The current implementation of grouping sets only supports using sorting for input. Individual sets that share a sort order are computed in one pass. If there are sets that don't share a sort order, additional sort & aggregation steps are performed. These additional passes are sourced by the previous sort step; thus avoiding repeated scans of the source data. The code is structured in a way that adding support for purely using hash aggregation or a mix of hashing and sorting is possible. Sorting was chosen to be supported first, as it is the most generic method of implementation. Instead of, as in an earlier versions of the patch, representing the chain of sort and aggregation steps as full blown planner and executor nodes, all but the first sort are performed inside the aggregation node itself. This avoids the need to do some unusual gymnastics to handle having to return aggregated and non-aggregated tuples from underlying nodes, as well as having to shut down underlying nodes early to limit memory usage. The optimizer still builds Sort/Agg node to describe each phase, but they're not part of the plan tree, but instead additional data for the aggregation node. They're a convenient and preexisting way to describe aggregation and sorting. The first (and possibly only) sort step is still performed as a separate execution step. That retains similarity with existing group by plans, makes rescans fairly simple, avoids very deep plans (leading to slow explains) and easily allows to avoid the sorting step if the underlying data is sorted by other means. A somewhat ugly side of this patch is having to deal with a grammar ambiguity between the new CUBE keyword and the cube extension/functions named cube (and rollup). To avoid breaking existing deployments of the cube extension it has not been renamed, neither has cube been made a reserved keyword. Instead precedence hacking is used to make GROUP BY cube(..) refer to the CUBE grouping sets feature, and not the function cube(). To actually group by a function cube(), unlikely as that might be, the function name has to be quoted. Needs a catversion bump because stored rules may change. Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
2015-05-16 03:40:59 +02:00
context.special_exprkind = EXPR_KIND_NONE;
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
context.appendparents = NULL;
get_rule_expr(expr, &context, showimplicit);
return buf.data;
}
/* ----------
* deparse_context_for - Build deparse context for a single relation
*
* Given the reference name (alias) and OID of a relation, build deparsing
* context for an expression referencing only that relation (as varno 1,
* varlevelsup 0). This is sufficient for many uses of deparse_expression.
* ----------
*/
List *
deparse_context_for(const char *aliasname, Oid relid)
{
deparse_namespace *dpns;
RangeTblEntry *rte;
dpns = (deparse_namespace *) palloc0(sizeof(deparse_namespace));
/* Build a minimal RTE for the rel */
rte = makeNode(RangeTblEntry);
rte->rtekind = RTE_RELATION;
rte->relid = relid;
rte->relkind = RELKIND_RELATION; /* no need for exactness here */
rte->rellockmode = AccessShareLock;
Improve ruleutils.c's heuristics for dealing with rangetable aliases. The previous scheme had bugs in some corner cases involving tables that had been renamed since a view was made. This could result in dumped views that failed to reload or reloaded incorrectly, as seen in bug #7553 from Lloyd Albin, as well as in some pgsql-hackers discussion back in January. Also, its behavior for printing EXPLAIN plans was sometimes confusing because of willingness to use the same alias for multiple RTEs (it was Ashutosh Bapat's complaint about that aspect that started the January thread). To fix, ensure that each RTE in the query has a unique unqualified alias, by modifying the alias if necessary (we add "_" and digits as needed to create a non-conflicting name). Then we can just print its variables with that alias, avoiding the confusing and bug-prone scheme of sometimes schema-qualifying variable names. In EXPLAIN, it proves to be expedient to take the further step of only assigning such aliases to RTEs that are actually referenced in the query, since the planner has a habit of generating extra RTEs with the same alias in situations such as inheritance-tree expansion. Although this fixes a bug of very long standing, I'm hesitant to back-patch such a noticeable behavioral change. My experiments while creating a regression test convinced me that actually incorrect output (as opposed to confusing output) occurs only in very narrow cases, which is backed up by the lack of previous complaints from the field. So we may be better off living with it in released branches; and in any case it'd be smart to let this ripen awhile in HEAD before we consider back-patching it.
2012-09-22 01:03:10 +02:00
rte->alias = makeAlias(aliasname, NIL);
rte->eref = rte->alias;
rte->lateral = false;
rte->inh = false;
rte->inFromCl = true;
/* Build one-element rtable */
dpns->rtable = list_make1(rte);
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
dpns->subplans = NIL;
dpns->ctes = NIL;
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
dpns->appendrels = NULL;
Improve ruleutils.c's heuristics for dealing with rangetable aliases. The previous scheme had bugs in some corner cases involving tables that had been renamed since a view was made. This could result in dumped views that failed to reload or reloaded incorrectly, as seen in bug #7553 from Lloyd Albin, as well as in some pgsql-hackers discussion back in January. Also, its behavior for printing EXPLAIN plans was sometimes confusing because of willingness to use the same alias for multiple RTEs (it was Ashutosh Bapat's complaint about that aspect that started the January thread). To fix, ensure that each RTE in the query has a unique unqualified alias, by modifying the alias if necessary (we add "_" and digits as needed to create a non-conflicting name). Then we can just print its variables with that alias, avoiding the confusing and bug-prone scheme of sometimes schema-qualifying variable names. In EXPLAIN, it proves to be expedient to take the further step of only assigning such aliases to RTEs that are actually referenced in the query, since the planner has a habit of generating extra RTEs with the same alias in situations such as inheritance-tree expansion. Although this fixes a bug of very long standing, I'm hesitant to back-patch such a noticeable behavioral change. My experiments while creating a regression test convinced me that actually incorrect output (as opposed to confusing output) occurs only in very narrow cases, which is backed up by the lack of previous complaints from the field. So we may be better off living with it in released branches; and in any case it'd be smart to let this ripen awhile in HEAD before we consider back-patching it.
2012-09-22 01:03:10 +02:00
set_rtable_names(dpns, NIL, NULL);
set_simple_column_names(dpns);
/* Return a one-deep namespace stack */
return list_make1(dpns);
}
/*
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
* deparse_context_for_plan_tree - Build deparse context for a Plan tree
*
* When deparsing an expression in a Plan tree, we use the plan's rangetable
* to resolve names of simple Vars. The initialization of column names for
* this is rather expensive if the rangetable is large, and it'll be the same
* for every expression in the Plan tree; so we do it just once and re-use
* the result of this function for each expression. (Note that the result
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
* is not usable until set_deparse_context_plan() is applied to it.)
*
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
* In addition to the PlannedStmt, pass the per-RTE alias names
* assigned by a previous call to select_rtable_names_for_explain.
*/
List *
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
deparse_context_for_plan_tree(PlannedStmt *pstmt, List *rtable_names)
{
deparse_namespace *dpns;
dpns = (deparse_namespace *) palloc0(sizeof(deparse_namespace));
/* Initialize fields that stay the same across the whole plan tree */
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
dpns->rtable = pstmt->rtable;
dpns->rtable_names = rtable_names;
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
dpns->subplans = pstmt->subplans;
dpns->ctes = NIL;
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
if (pstmt->appendRelations)
{
/* Set up the array, indexed by child relid */
int ntables = list_length(dpns->rtable);
ListCell *lc;
dpns->appendrels = (AppendRelInfo **)
palloc0((ntables + 1) * sizeof(AppendRelInfo *));
foreach(lc, pstmt->appendRelations)
{
AppendRelInfo *appinfo = lfirst_node(AppendRelInfo, lc);
Index crelid = appinfo->child_relid;
Assert(crelid > 0 && crelid <= ntables);
Assert(dpns->appendrels[crelid] == NULL);
dpns->appendrels[crelid] = appinfo;
}
}
else
dpns->appendrels = NULL; /* don't need it */
/*
* Set up column name aliases. We will get rather bogus results for join
* RTEs, but that doesn't matter because plan trees don't contain any join
* alias Vars.
*/
set_simple_column_names(dpns);
/* Return a one-deep namespace stack */
return list_make1(dpns);
}
/*
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
* set_deparse_context_plan - Specify Plan node containing expression
*
* When deparsing an expression in a Plan tree, we might have to resolve
* OUTER_VAR, INNER_VAR, or INDEX_VAR references. To do this, the caller must
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
* provide the parent Plan node. Then OUTER_VAR and INNER_VAR references
* can be resolved by drilling down into the left and right child plans.
* Similarly, INDEX_VAR references can be resolved by reference to the
Code review for foreign/custom join pushdown patch. Commit e7cb7ee14555cc9c5773e2c102efd6371f6f2005 included some design decisions that seem pretty questionable to me, and there was quite a lot of stuff not to like about the documentation and comments. Clean up as follows: * Consider foreign joins only between foreign tables on the same server, rather than between any two foreign tables with the same underlying FDW handler function. In most if not all cases, the FDW would simply have had to apply the same-server restriction itself (far more expensively, both for lack of caching and because it would be repeated for each combination of input sub-joins), or else risk nasty bugs. Anyone who's really intent on doing something outside this restriction can always use the set_join_pathlist_hook. * Rename fdw_ps_tlist/custom_ps_tlist to fdw_scan_tlist/custom_scan_tlist to better reflect what they're for, and allow these custom scan tlists to be used even for base relations. * Change make_foreignscan() API to include passing the fdw_scan_tlist value, since the FDW is required to set that. Backwards compatibility doesn't seem like an adequate reason to expect FDWs to set it in some ad-hoc extra step, and anyway existing FDWs can just pass NIL. * Change the API of path-generating subroutines of add_paths_to_joinrel, and in particular that of GetForeignJoinPaths and set_join_pathlist_hook, so that various less-used parameters are passed in a struct rather than as separate parameter-list entries. The objective here is to reduce the probability that future additions to those parameter lists will result in source-level API breaks for users of these hooks. It's possible that this is even a small win for the core code, since most CPU architectures can't pass more than half a dozen parameters efficiently anyway. I kept root, joinrel, outerrel, innerrel, and jointype as separate parameters to reduce code churn in joinpath.c --- in particular, putting jointype into the struct would have been problematic because of the subroutines' habit of changing their local copies of that variable. * Avoid ad-hocery in ExecAssignScanProjectionInfo. It was probably all right for it to know about IndexOnlyScan, but if the list is to grow we should refactor the knowledge out to the callers. * Restore nodeForeignscan.c's previous use of the relcache to avoid extra GetFdwRoutine lookups for base-relation scans. * Lots of cleanup of documentation and missed comments. Re-order some code additions into more logical places.
2015-05-10 20:36:30 +02:00
* indextlist given in a parent IndexOnlyScan node, or to the scan tlist in
* ForeignScan and CustomScan nodes. (Note that we don't currently support
* deparsing of indexquals in regular IndexScan or BitmapIndexScan nodes;
* for those, we can only deparse the indexqualorig fields, which won't
* contain INDEX_VAR Vars.)
*
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
* The ancestors list is a list of the Plan's parent Plan and SubPlan nodes,
* the most-closely-nested first. This is needed to resolve PARAM_EXEC
* Params. Note we assume that all the Plan nodes share the same rtable.
*
* Once this function has been called, deparse_expression() can be called on
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
* subsidiary expression(s) of the specified Plan node. To deparse
* expressions of a different Plan node in the same Plan tree, re-call this
* function to identify the new parent Plan node.
*
* The result is the same List passed in; this is a notational convenience.
*/
List *
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
set_deparse_context_plan(List *dpcontext, Plan *plan, List *ancestors)
{
deparse_namespace *dpns;
/* Should always have one-entry namespace list for Plan deparsing */
Assert(list_length(dpcontext) == 1);
dpns = (deparse_namespace *) linitial(dpcontext);
/* Set our attention on the specific plan node passed in */
dpns->ancestors = ancestors;
set_deparse_plan(dpns, plan);
return dpcontext;
}
Improve ruleutils.c's heuristics for dealing with rangetable aliases. The previous scheme had bugs in some corner cases involving tables that had been renamed since a view was made. This could result in dumped views that failed to reload or reloaded incorrectly, as seen in bug #7553 from Lloyd Albin, as well as in some pgsql-hackers discussion back in January. Also, its behavior for printing EXPLAIN plans was sometimes confusing because of willingness to use the same alias for multiple RTEs (it was Ashutosh Bapat's complaint about that aspect that started the January thread). To fix, ensure that each RTE in the query has a unique unqualified alias, by modifying the alias if necessary (we add "_" and digits as needed to create a non-conflicting name). Then we can just print its variables with that alias, avoiding the confusing and bug-prone scheme of sometimes schema-qualifying variable names. In EXPLAIN, it proves to be expedient to take the further step of only assigning such aliases to RTEs that are actually referenced in the query, since the planner has a habit of generating extra RTEs with the same alias in situations such as inheritance-tree expansion. Although this fixes a bug of very long standing, I'm hesitant to back-patch such a noticeable behavioral change. My experiments while creating a regression test convinced me that actually incorrect output (as opposed to confusing output) occurs only in very narrow cases, which is backed up by the lack of previous complaints from the field. So we may be better off living with it in released branches; and in any case it'd be smart to let this ripen awhile in HEAD before we consider back-patching it.
2012-09-22 01:03:10 +02:00
/*
* select_rtable_names_for_explain - Select RTE aliases for EXPLAIN
*
* Determine the relation aliases we'll use during an EXPLAIN operation.
* This is just a frontend to set_rtable_names. We have to expose the aliases
Improve ruleutils.c's heuristics for dealing with rangetable aliases. The previous scheme had bugs in some corner cases involving tables that had been renamed since a view was made. This could result in dumped views that failed to reload or reloaded incorrectly, as seen in bug #7553 from Lloyd Albin, as well as in some pgsql-hackers discussion back in January. Also, its behavior for printing EXPLAIN plans was sometimes confusing because of willingness to use the same alias for multiple RTEs (it was Ashutosh Bapat's complaint about that aspect that started the January thread). To fix, ensure that each RTE in the query has a unique unqualified alias, by modifying the alias if necessary (we add "_" and digits as needed to create a non-conflicting name). Then we can just print its variables with that alias, avoiding the confusing and bug-prone scheme of sometimes schema-qualifying variable names. In EXPLAIN, it proves to be expedient to take the further step of only assigning such aliases to RTEs that are actually referenced in the query, since the planner has a habit of generating extra RTEs with the same alias in situations such as inheritance-tree expansion. Although this fixes a bug of very long standing, I'm hesitant to back-patch such a noticeable behavioral change. My experiments while creating a regression test convinced me that actually incorrect output (as opposed to confusing output) occurs only in very narrow cases, which is backed up by the lack of previous complaints from the field. So we may be better off living with it in released branches; and in any case it'd be smart to let this ripen awhile in HEAD before we consider back-patching it.
2012-09-22 01:03:10 +02:00
* to EXPLAIN because EXPLAIN needs to know the right alias names to print.
*/
List *
select_rtable_names_for_explain(List *rtable, Bitmapset *rels_used)
{
deparse_namespace dpns;
memset(&dpns, 0, sizeof(dpns));
dpns.rtable = rtable;
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
dpns.subplans = NIL;
Improve ruleutils.c's heuristics for dealing with rangetable aliases. The previous scheme had bugs in some corner cases involving tables that had been renamed since a view was made. This could result in dumped views that failed to reload or reloaded incorrectly, as seen in bug #7553 from Lloyd Albin, as well as in some pgsql-hackers discussion back in January. Also, its behavior for printing EXPLAIN plans was sometimes confusing because of willingness to use the same alias for multiple RTEs (it was Ashutosh Bapat's complaint about that aspect that started the January thread). To fix, ensure that each RTE in the query has a unique unqualified alias, by modifying the alias if necessary (we add "_" and digits as needed to create a non-conflicting name). Then we can just print its variables with that alias, avoiding the confusing and bug-prone scheme of sometimes schema-qualifying variable names. In EXPLAIN, it proves to be expedient to take the further step of only assigning such aliases to RTEs that are actually referenced in the query, since the planner has a habit of generating extra RTEs with the same alias in situations such as inheritance-tree expansion. Although this fixes a bug of very long standing, I'm hesitant to back-patch such a noticeable behavioral change. My experiments while creating a regression test convinced me that actually incorrect output (as opposed to confusing output) occurs only in very narrow cases, which is backed up by the lack of previous complaints from the field. So we may be better off living with it in released branches; and in any case it'd be smart to let this ripen awhile in HEAD before we consider back-patching it.
2012-09-22 01:03:10 +02:00
dpns.ctes = NIL;
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
dpns.appendrels = NULL;
Improve ruleutils.c's heuristics for dealing with rangetable aliases. The previous scheme had bugs in some corner cases involving tables that had been renamed since a view was made. This could result in dumped views that failed to reload or reloaded incorrectly, as seen in bug #7553 from Lloyd Albin, as well as in some pgsql-hackers discussion back in January. Also, its behavior for printing EXPLAIN plans was sometimes confusing because of willingness to use the same alias for multiple RTEs (it was Ashutosh Bapat's complaint about that aspect that started the January thread). To fix, ensure that each RTE in the query has a unique unqualified alias, by modifying the alias if necessary (we add "_" and digits as needed to create a non-conflicting name). Then we can just print its variables with that alias, avoiding the confusing and bug-prone scheme of sometimes schema-qualifying variable names. In EXPLAIN, it proves to be expedient to take the further step of only assigning such aliases to RTEs that are actually referenced in the query, since the planner has a habit of generating extra RTEs with the same alias in situations such as inheritance-tree expansion. Although this fixes a bug of very long standing, I'm hesitant to back-patch such a noticeable behavioral change. My experiments while creating a regression test convinced me that actually incorrect output (as opposed to confusing output) occurs only in very narrow cases, which is backed up by the lack of previous complaints from the field. So we may be better off living with it in released branches; and in any case it'd be smart to let this ripen awhile in HEAD before we consider back-patching it.
2012-09-22 01:03:10 +02:00
set_rtable_names(&dpns, NIL, rels_used);
/* We needn't bother computing column aliases yet */
Improve ruleutils.c's heuristics for dealing with rangetable aliases. The previous scheme had bugs in some corner cases involving tables that had been renamed since a view was made. This could result in dumped views that failed to reload or reloaded incorrectly, as seen in bug #7553 from Lloyd Albin, as well as in some pgsql-hackers discussion back in January. Also, its behavior for printing EXPLAIN plans was sometimes confusing because of willingness to use the same alias for multiple RTEs (it was Ashutosh Bapat's complaint about that aspect that started the January thread). To fix, ensure that each RTE in the query has a unique unqualified alias, by modifying the alias if necessary (we add "_" and digits as needed to create a non-conflicting name). Then we can just print its variables with that alias, avoiding the confusing and bug-prone scheme of sometimes schema-qualifying variable names. In EXPLAIN, it proves to be expedient to take the further step of only assigning such aliases to RTEs that are actually referenced in the query, since the planner has a habit of generating extra RTEs with the same alias in situations such as inheritance-tree expansion. Although this fixes a bug of very long standing, I'm hesitant to back-patch such a noticeable behavioral change. My experiments while creating a regression test convinced me that actually incorrect output (as opposed to confusing output) occurs only in very narrow cases, which is backed up by the lack of previous complaints from the field. So we may be better off living with it in released branches; and in any case it'd be smart to let this ripen awhile in HEAD before we consider back-patching it.
2012-09-22 01:03:10 +02:00
return dpns.rtable_names;
}
/*
* set_rtable_names: select RTE aliases to be used in printing a query
Improve ruleutils.c's heuristics for dealing with rangetable aliases. The previous scheme had bugs in some corner cases involving tables that had been renamed since a view was made. This could result in dumped views that failed to reload or reloaded incorrectly, as seen in bug #7553 from Lloyd Albin, as well as in some pgsql-hackers discussion back in January. Also, its behavior for printing EXPLAIN plans was sometimes confusing because of willingness to use the same alias for multiple RTEs (it was Ashutosh Bapat's complaint about that aspect that started the January thread). To fix, ensure that each RTE in the query has a unique unqualified alias, by modifying the alias if necessary (we add "_" and digits as needed to create a non-conflicting name). Then we can just print its variables with that alias, avoiding the confusing and bug-prone scheme of sometimes schema-qualifying variable names. In EXPLAIN, it proves to be expedient to take the further step of only assigning such aliases to RTEs that are actually referenced in the query, since the planner has a habit of generating extra RTEs with the same alias in situations such as inheritance-tree expansion. Although this fixes a bug of very long standing, I'm hesitant to back-patch such a noticeable behavioral change. My experiments while creating a regression test convinced me that actually incorrect output (as opposed to confusing output) occurs only in very narrow cases, which is backed up by the lack of previous complaints from the field. So we may be better off living with it in released branches; and in any case it'd be smart to let this ripen awhile in HEAD before we consider back-patching it.
2012-09-22 01:03:10 +02:00
*
* We fill in dpns->rtable_names with a list of names that is one-for-one with
* the already-filled dpns->rtable list. Each RTE name is unique among those
* in the new namespace plus any ancestor namespaces listed in
* parent_namespaces.
*
* If rels_used isn't NULL, only RTE indexes listed in it are given aliases.
*
* Note that this function is only concerned with relation names, not column
* names.
Improve ruleutils.c's heuristics for dealing with rangetable aliases. The previous scheme had bugs in some corner cases involving tables that had been renamed since a view was made. This could result in dumped views that failed to reload or reloaded incorrectly, as seen in bug #7553 from Lloyd Albin, as well as in some pgsql-hackers discussion back in January. Also, its behavior for printing EXPLAIN plans was sometimes confusing because of willingness to use the same alias for multiple RTEs (it was Ashutosh Bapat's complaint about that aspect that started the January thread). To fix, ensure that each RTE in the query has a unique unqualified alias, by modifying the alias if necessary (we add "_" and digits as needed to create a non-conflicting name). Then we can just print its variables with that alias, avoiding the confusing and bug-prone scheme of sometimes schema-qualifying variable names. In EXPLAIN, it proves to be expedient to take the further step of only assigning such aliases to RTEs that are actually referenced in the query, since the planner has a habit of generating extra RTEs with the same alias in situations such as inheritance-tree expansion. Although this fixes a bug of very long standing, I'm hesitant to back-patch such a noticeable behavioral change. My experiments while creating a regression test convinced me that actually incorrect output (as opposed to confusing output) occurs only in very narrow cases, which is backed up by the lack of previous complaints from the field. So we may be better off living with it in released branches; and in any case it'd be smart to let this ripen awhile in HEAD before we consider back-patching it.
2012-09-22 01:03:10 +02:00
*/
static void
set_rtable_names(deparse_namespace *dpns, List *parent_namespaces,
Bitmapset *rels_used)
{
Speed up ruleutils' name de-duplication code, and fix overlength-name case. Since commit 11e131854f8231a21613f834c40fe9d046926387, ruleutils.c has attempted to ensure that each RTE in a query or plan tree has a unique alias name. However, the code that was added for this could be quite slow, even as bad as O(N^3) if N identical RTE names must be replaced, as noted by Jeff Janes. Improve matters by building a transient hash table within set_rtable_names. The hash table in itself reduces the cost of detecting a duplicate from O(N) to O(1), and we can save another factor of N by storing the number of de-duplicated names already created for each entry, so that we don't have to re-try names already created. This way is probably a bit slower overall for small range tables, but almost by definition, such cases should not be a performance problem. In principle the same problem applies to the column-name-de-duplication code; but in practice that seems to be less of a problem, first because N is limited since we don't support extremely wide tables, and second because duplicate column names within an RTE are fairly rare, so that in practice the cost is more like O(N^2) not O(N^3). It would be very much messier to fix the column-name code, so for now I've left that alone. An independent problem in the same area was that the de-duplication code paid no attention to the identifier length limit, and would happily produce identifiers that were longer than NAMEDATALEN and wouldn't be unique after truncation to NAMEDATALEN. This could result in dump/reload failures, or perhaps even views that silently behaved differently than before. We can fix that by shortening the base name as needed. Fix it for both the relation and column name cases. In passing, check for interrupts in set_rtable_names, just in case it's still slow enough to be an issue. Back-patch to 9.3 where this code was introduced.
2015-11-16 19:45:17 +01:00
HASHCTL hash_ctl;
HTAB *names_hash;
NameHashEntry *hentry;
bool found;
int rtindex;
Improve ruleutils.c's heuristics for dealing with rangetable aliases. The previous scheme had bugs in some corner cases involving tables that had been renamed since a view was made. This could result in dumped views that failed to reload or reloaded incorrectly, as seen in bug #7553 from Lloyd Albin, as well as in some pgsql-hackers discussion back in January. Also, its behavior for printing EXPLAIN plans was sometimes confusing because of willingness to use the same alias for multiple RTEs (it was Ashutosh Bapat's complaint about that aspect that started the January thread). To fix, ensure that each RTE in the query has a unique unqualified alias, by modifying the alias if necessary (we add "_" and digits as needed to create a non-conflicting name). Then we can just print its variables with that alias, avoiding the confusing and bug-prone scheme of sometimes schema-qualifying variable names. In EXPLAIN, it proves to be expedient to take the further step of only assigning such aliases to RTEs that are actually referenced in the query, since the planner has a habit of generating extra RTEs with the same alias in situations such as inheritance-tree expansion. Although this fixes a bug of very long standing, I'm hesitant to back-patch such a noticeable behavioral change. My experiments while creating a regression test convinced me that actually incorrect output (as opposed to confusing output) occurs only in very narrow cases, which is backed up by the lack of previous complaints from the field. So we may be better off living with it in released branches; and in any case it'd be smart to let this ripen awhile in HEAD before we consider back-patching it.
2012-09-22 01:03:10 +02:00
ListCell *lc;
dpns->rtable_names = NIL;
Speed up ruleutils' name de-duplication code, and fix overlength-name case. Since commit 11e131854f8231a21613f834c40fe9d046926387, ruleutils.c has attempted to ensure that each RTE in a query or plan tree has a unique alias name. However, the code that was added for this could be quite slow, even as bad as O(N^3) if N identical RTE names must be replaced, as noted by Jeff Janes. Improve matters by building a transient hash table within set_rtable_names. The hash table in itself reduces the cost of detecting a duplicate from O(N) to O(1), and we can save another factor of N by storing the number of de-duplicated names already created for each entry, so that we don't have to re-try names already created. This way is probably a bit slower overall for small range tables, but almost by definition, such cases should not be a performance problem. In principle the same problem applies to the column-name-de-duplication code; but in practice that seems to be less of a problem, first because N is limited since we don't support extremely wide tables, and second because duplicate column names within an RTE are fairly rare, so that in practice the cost is more like O(N^2) not O(N^3). It would be very much messier to fix the column-name code, so for now I've left that alone. An independent problem in the same area was that the de-duplication code paid no attention to the identifier length limit, and would happily produce identifiers that were longer than NAMEDATALEN and wouldn't be unique after truncation to NAMEDATALEN. This could result in dump/reload failures, or perhaps even views that silently behaved differently than before. We can fix that by shortening the base name as needed. Fix it for both the relation and column name cases. In passing, check for interrupts in set_rtable_names, just in case it's still slow enough to be an issue. Back-patch to 9.3 where this code was introduced.
2015-11-16 19:45:17 +01:00
/* nothing more to do if empty rtable */
if (dpns->rtable == NIL)
return;
/*
* We use a hash table to hold known names, so that this process is O(N)
* not O(N^2) for N names.
*/
hash_ctl.keysize = NAMEDATALEN;
hash_ctl.entrysize = sizeof(NameHashEntry);
hash_ctl.hcxt = CurrentMemoryContext;
names_hash = hash_create("set_rtable_names names",
list_length(dpns->rtable),
&hash_ctl,
Improve hash_create()'s API for some added robustness. Invent a new flag bit HASH_STRINGS to specify C-string hashing, which was formerly the default; and add assertions insisting that exactly one of the bits HASH_STRINGS, HASH_BLOBS, and HASH_FUNCTION be set. This is in hopes of preventing recurrences of the type of oversight fixed in commit a1b8aa1e4 (i.e., mistakenly omitting HASH_BLOBS). Also, when HASH_STRINGS is specified, insist that the keysize be more than 8 bytes. This is a heuristic, but it should catch accidental use of HASH_STRINGS for integer or pointer keys. (Nearly all existing use-cases set the keysize to NAMEDATALEN or more, so there's little reason to think this restriction should be problematic.) Tweak hash_create() to insist that the HASH_ELEM flag be set, and remove the defaults it had for keysize and entrysize. Since those defaults were undocumented and basically useless, no callers omitted HASH_ELEM anyway. Also, remove memset's zeroing the HASHCTL parameter struct from those callers that had one. This has never been really necessary, and while it wasn't a bad coding convention it was confusing that some callers did it and some did not. We might as well save a few cycles by standardizing on "not". Also improve the documentation for hash_create(). In passing, improve reinit.c's usage of a hash table by storing the key as a binary Oid rather than a string; and, since that's a temporary hash table, allocate it in CurrentMemoryContext for neatness. Discussion: https://postgr.es/m/590625.1607878171@sss.pgh.pa.us
2020-12-15 17:38:53 +01:00
HASH_ELEM | HASH_STRINGS | HASH_CONTEXT);
Speed up ruleutils' name de-duplication code, and fix overlength-name case. Since commit 11e131854f8231a21613f834c40fe9d046926387, ruleutils.c has attempted to ensure that each RTE in a query or plan tree has a unique alias name. However, the code that was added for this could be quite slow, even as bad as O(N^3) if N identical RTE names must be replaced, as noted by Jeff Janes. Improve matters by building a transient hash table within set_rtable_names. The hash table in itself reduces the cost of detecting a duplicate from O(N) to O(1), and we can save another factor of N by storing the number of de-duplicated names already created for each entry, so that we don't have to re-try names already created. This way is probably a bit slower overall for small range tables, but almost by definition, such cases should not be a performance problem. In principle the same problem applies to the column-name-de-duplication code; but in practice that seems to be less of a problem, first because N is limited since we don't support extremely wide tables, and second because duplicate column names within an RTE are fairly rare, so that in practice the cost is more like O(N^2) not O(N^3). It would be very much messier to fix the column-name code, so for now I've left that alone. An independent problem in the same area was that the de-duplication code paid no attention to the identifier length limit, and would happily produce identifiers that were longer than NAMEDATALEN and wouldn't be unique after truncation to NAMEDATALEN. This could result in dump/reload failures, or perhaps even views that silently behaved differently than before. We can fix that by shortening the base name as needed. Fix it for both the relation and column name cases. In passing, check for interrupts in set_rtable_names, just in case it's still slow enough to be an issue. Back-patch to 9.3 where this code was introduced.
2015-11-16 19:45:17 +01:00
/* Preload the hash table with names appearing in parent_namespaces */
foreach(lc, parent_namespaces)
{
deparse_namespace *olddpns = (deparse_namespace *) lfirst(lc);
ListCell *lc2;
foreach(lc2, olddpns->rtable_names)
{
char *oldname = (char *) lfirst(lc2);
if (oldname == NULL)
continue;
hentry = (NameHashEntry *) hash_search(names_hash,
oldname,
HASH_ENTER,
&found);
/* we do not complain about duplicate names in parent namespaces */
hentry->counter = 0;
}
}
/* Now we can scan the rtable */
rtindex = 1;
Improve ruleutils.c's heuristics for dealing with rangetable aliases. The previous scheme had bugs in some corner cases involving tables that had been renamed since a view was made. This could result in dumped views that failed to reload or reloaded incorrectly, as seen in bug #7553 from Lloyd Albin, as well as in some pgsql-hackers discussion back in January. Also, its behavior for printing EXPLAIN plans was sometimes confusing because of willingness to use the same alias for multiple RTEs (it was Ashutosh Bapat's complaint about that aspect that started the January thread). To fix, ensure that each RTE in the query has a unique unqualified alias, by modifying the alias if necessary (we add "_" and digits as needed to create a non-conflicting name). Then we can just print its variables with that alias, avoiding the confusing and bug-prone scheme of sometimes schema-qualifying variable names. In EXPLAIN, it proves to be expedient to take the further step of only assigning such aliases to RTEs that are actually referenced in the query, since the planner has a habit of generating extra RTEs with the same alias in situations such as inheritance-tree expansion. Although this fixes a bug of very long standing, I'm hesitant to back-patch such a noticeable behavioral change. My experiments while creating a regression test convinced me that actually incorrect output (as opposed to confusing output) occurs only in very narrow cases, which is backed up by the lack of previous complaints from the field. So we may be better off living with it in released branches; and in any case it'd be smart to let this ripen awhile in HEAD before we consider back-patching it.
2012-09-22 01:03:10 +02:00
foreach(lc, dpns->rtable)
{
RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc);
char *refname;
Speed up ruleutils' name de-duplication code, and fix overlength-name case. Since commit 11e131854f8231a21613f834c40fe9d046926387, ruleutils.c has attempted to ensure that each RTE in a query or plan tree has a unique alias name. However, the code that was added for this could be quite slow, even as bad as O(N^3) if N identical RTE names must be replaced, as noted by Jeff Janes. Improve matters by building a transient hash table within set_rtable_names. The hash table in itself reduces the cost of detecting a duplicate from O(N) to O(1), and we can save another factor of N by storing the number of de-duplicated names already created for each entry, so that we don't have to re-try names already created. This way is probably a bit slower overall for small range tables, but almost by definition, such cases should not be a performance problem. In principle the same problem applies to the column-name-de-duplication code; but in practice that seems to be less of a problem, first because N is limited since we don't support extremely wide tables, and second because duplicate column names within an RTE are fairly rare, so that in practice the cost is more like O(N^2) not O(N^3). It would be very much messier to fix the column-name code, so for now I've left that alone. An independent problem in the same area was that the de-duplication code paid no attention to the identifier length limit, and would happily produce identifiers that were longer than NAMEDATALEN and wouldn't be unique after truncation to NAMEDATALEN. This could result in dump/reload failures, or perhaps even views that silently behaved differently than before. We can fix that by shortening the base name as needed. Fix it for both the relation and column name cases. In passing, check for interrupts in set_rtable_names, just in case it's still slow enough to be an issue. Back-patch to 9.3 where this code was introduced.
2015-11-16 19:45:17 +01:00
/* Just in case this takes an unreasonable amount of time ... */
CHECK_FOR_INTERRUPTS();
Improve ruleutils.c's heuristics for dealing with rangetable aliases. The previous scheme had bugs in some corner cases involving tables that had been renamed since a view was made. This could result in dumped views that failed to reload or reloaded incorrectly, as seen in bug #7553 from Lloyd Albin, as well as in some pgsql-hackers discussion back in January. Also, its behavior for printing EXPLAIN plans was sometimes confusing because of willingness to use the same alias for multiple RTEs (it was Ashutosh Bapat's complaint about that aspect that started the January thread). To fix, ensure that each RTE in the query has a unique unqualified alias, by modifying the alias if necessary (we add "_" and digits as needed to create a non-conflicting name). Then we can just print its variables with that alias, avoiding the confusing and bug-prone scheme of sometimes schema-qualifying variable names. In EXPLAIN, it proves to be expedient to take the further step of only assigning such aliases to RTEs that are actually referenced in the query, since the planner has a habit of generating extra RTEs with the same alias in situations such as inheritance-tree expansion. Although this fixes a bug of very long standing, I'm hesitant to back-patch such a noticeable behavioral change. My experiments while creating a regression test convinced me that actually incorrect output (as opposed to confusing output) occurs only in very narrow cases, which is backed up by the lack of previous complaints from the field. So we may be better off living with it in released branches; and in any case it'd be smart to let this ripen awhile in HEAD before we consider back-patching it.
2012-09-22 01:03:10 +02:00
if (rels_used && !bms_is_member(rtindex, rels_used))
{
/* Ignore unreferenced RTE */
refname = NULL;
}
else if (rte->alias)
{
/* If RTE has a user-defined alias, prefer that */
refname = rte->alias->aliasname;
}
else if (rte->rtekind == RTE_RELATION)
{
/* Use the current actual name of the relation */
refname = get_rel_name(rte->relid);
}
else if (rte->rtekind == RTE_JOIN)
{
/* Unnamed join has no refname */
refname = NULL;
}
else
{
/* Otherwise use whatever the parser assigned */
refname = rte->eref->aliasname;
}
/*
Speed up ruleutils' name de-duplication code, and fix overlength-name case. Since commit 11e131854f8231a21613f834c40fe9d046926387, ruleutils.c has attempted to ensure that each RTE in a query or plan tree has a unique alias name. However, the code that was added for this could be quite slow, even as bad as O(N^3) if N identical RTE names must be replaced, as noted by Jeff Janes. Improve matters by building a transient hash table within set_rtable_names. The hash table in itself reduces the cost of detecting a duplicate from O(N) to O(1), and we can save another factor of N by storing the number of de-duplicated names already created for each entry, so that we don't have to re-try names already created. This way is probably a bit slower overall for small range tables, but almost by definition, such cases should not be a performance problem. In principle the same problem applies to the column-name-de-duplication code; but in practice that seems to be less of a problem, first because N is limited since we don't support extremely wide tables, and second because duplicate column names within an RTE are fairly rare, so that in practice the cost is more like O(N^2) not O(N^3). It would be very much messier to fix the column-name code, so for now I've left that alone. An independent problem in the same area was that the de-duplication code paid no attention to the identifier length limit, and would happily produce identifiers that were longer than NAMEDATALEN and wouldn't be unique after truncation to NAMEDATALEN. This could result in dump/reload failures, or perhaps even views that silently behaved differently than before. We can fix that by shortening the base name as needed. Fix it for both the relation and column name cases. In passing, check for interrupts in set_rtable_names, just in case it's still slow enough to be an issue. Back-patch to 9.3 where this code was introduced.
2015-11-16 19:45:17 +01:00
* If the selected name isn't unique, append digits to make it so, and
* make a new hash entry for it once we've got a unique name. For a
* very long input name, we might have to truncate to stay within
* NAMEDATALEN.
Improve ruleutils.c's heuristics for dealing with rangetable aliases. The previous scheme had bugs in some corner cases involving tables that had been renamed since a view was made. This could result in dumped views that failed to reload or reloaded incorrectly, as seen in bug #7553 from Lloyd Albin, as well as in some pgsql-hackers discussion back in January. Also, its behavior for printing EXPLAIN plans was sometimes confusing because of willingness to use the same alias for multiple RTEs (it was Ashutosh Bapat's complaint about that aspect that started the January thread). To fix, ensure that each RTE in the query has a unique unqualified alias, by modifying the alias if necessary (we add "_" and digits as needed to create a non-conflicting name). Then we can just print its variables with that alias, avoiding the confusing and bug-prone scheme of sometimes schema-qualifying variable names. In EXPLAIN, it proves to be expedient to take the further step of only assigning such aliases to RTEs that are actually referenced in the query, since the planner has a habit of generating extra RTEs with the same alias in situations such as inheritance-tree expansion. Although this fixes a bug of very long standing, I'm hesitant to back-patch such a noticeable behavioral change. My experiments while creating a regression test convinced me that actually incorrect output (as opposed to confusing output) occurs only in very narrow cases, which is backed up by the lack of previous complaints from the field. So we may be better off living with it in released branches; and in any case it'd be smart to let this ripen awhile in HEAD before we consider back-patching it.
2012-09-22 01:03:10 +02:00
*/
Speed up ruleutils' name de-duplication code, and fix overlength-name case. Since commit 11e131854f8231a21613f834c40fe9d046926387, ruleutils.c has attempted to ensure that each RTE in a query or plan tree has a unique alias name. However, the code that was added for this could be quite slow, even as bad as O(N^3) if N identical RTE names must be replaced, as noted by Jeff Janes. Improve matters by building a transient hash table within set_rtable_names. The hash table in itself reduces the cost of detecting a duplicate from O(N) to O(1), and we can save another factor of N by storing the number of de-duplicated names already created for each entry, so that we don't have to re-try names already created. This way is probably a bit slower overall for small range tables, but almost by definition, such cases should not be a performance problem. In principle the same problem applies to the column-name-de-duplication code; but in practice that seems to be less of a problem, first because N is limited since we don't support extremely wide tables, and second because duplicate column names within an RTE are fairly rare, so that in practice the cost is more like O(N^2) not O(N^3). It would be very much messier to fix the column-name code, so for now I've left that alone. An independent problem in the same area was that the de-duplication code paid no attention to the identifier length limit, and would happily produce identifiers that were longer than NAMEDATALEN and wouldn't be unique after truncation to NAMEDATALEN. This could result in dump/reload failures, or perhaps even views that silently behaved differently than before. We can fix that by shortening the base name as needed. Fix it for both the relation and column name cases. In passing, check for interrupts in set_rtable_names, just in case it's still slow enough to be an issue. Back-patch to 9.3 where this code was introduced.
2015-11-16 19:45:17 +01:00
if (refname)
Improve ruleutils.c's heuristics for dealing with rangetable aliases. The previous scheme had bugs in some corner cases involving tables that had been renamed since a view was made. This could result in dumped views that failed to reload or reloaded incorrectly, as seen in bug #7553 from Lloyd Albin, as well as in some pgsql-hackers discussion back in January. Also, its behavior for printing EXPLAIN plans was sometimes confusing because of willingness to use the same alias for multiple RTEs (it was Ashutosh Bapat's complaint about that aspect that started the January thread). To fix, ensure that each RTE in the query has a unique unqualified alias, by modifying the alias if necessary (we add "_" and digits as needed to create a non-conflicting name). Then we can just print its variables with that alias, avoiding the confusing and bug-prone scheme of sometimes schema-qualifying variable names. In EXPLAIN, it proves to be expedient to take the further step of only assigning such aliases to RTEs that are actually referenced in the query, since the planner has a habit of generating extra RTEs with the same alias in situations such as inheritance-tree expansion. Although this fixes a bug of very long standing, I'm hesitant to back-patch such a noticeable behavioral change. My experiments while creating a regression test convinced me that actually incorrect output (as opposed to confusing output) occurs only in very narrow cases, which is backed up by the lack of previous complaints from the field. So we may be better off living with it in released branches; and in any case it'd be smart to let this ripen awhile in HEAD before we consider back-patching it.
2012-09-22 01:03:10 +02:00
{
Speed up ruleutils' name de-duplication code, and fix overlength-name case. Since commit 11e131854f8231a21613f834c40fe9d046926387, ruleutils.c has attempted to ensure that each RTE in a query or plan tree has a unique alias name. However, the code that was added for this could be quite slow, even as bad as O(N^3) if N identical RTE names must be replaced, as noted by Jeff Janes. Improve matters by building a transient hash table within set_rtable_names. The hash table in itself reduces the cost of detecting a duplicate from O(N) to O(1), and we can save another factor of N by storing the number of de-duplicated names already created for each entry, so that we don't have to re-try names already created. This way is probably a bit slower overall for small range tables, but almost by definition, such cases should not be a performance problem. In principle the same problem applies to the column-name-de-duplication code; but in practice that seems to be less of a problem, first because N is limited since we don't support extremely wide tables, and second because duplicate column names within an RTE are fairly rare, so that in practice the cost is more like O(N^2) not O(N^3). It would be very much messier to fix the column-name code, so for now I've left that alone. An independent problem in the same area was that the de-duplication code paid no attention to the identifier length limit, and would happily produce identifiers that were longer than NAMEDATALEN and wouldn't be unique after truncation to NAMEDATALEN. This could result in dump/reload failures, or perhaps even views that silently behaved differently than before. We can fix that by shortening the base name as needed. Fix it for both the relation and column name cases. In passing, check for interrupts in set_rtable_names, just in case it's still slow enough to be an issue. Back-patch to 9.3 where this code was introduced.
2015-11-16 19:45:17 +01:00
hentry = (NameHashEntry *) hash_search(names_hash,
refname,
HASH_ENTER,
&found);
if (found)
{
/* Name already in use, must choose a new one */
int refnamelen = strlen(refname);
char *modname = (char *) palloc(refnamelen + 16);
NameHashEntry *hentry2;
Improve ruleutils.c's heuristics for dealing with rangetable aliases. The previous scheme had bugs in some corner cases involving tables that had been renamed since a view was made. This could result in dumped views that failed to reload or reloaded incorrectly, as seen in bug #7553 from Lloyd Albin, as well as in some pgsql-hackers discussion back in January. Also, its behavior for printing EXPLAIN plans was sometimes confusing because of willingness to use the same alias for multiple RTEs (it was Ashutosh Bapat's complaint about that aspect that started the January thread). To fix, ensure that each RTE in the query has a unique unqualified alias, by modifying the alias if necessary (we add "_" and digits as needed to create a non-conflicting name). Then we can just print its variables with that alias, avoiding the confusing and bug-prone scheme of sometimes schema-qualifying variable names. In EXPLAIN, it proves to be expedient to take the further step of only assigning such aliases to RTEs that are actually referenced in the query, since the planner has a habit of generating extra RTEs with the same alias in situations such as inheritance-tree expansion. Although this fixes a bug of very long standing, I'm hesitant to back-patch such a noticeable behavioral change. My experiments while creating a regression test convinced me that actually incorrect output (as opposed to confusing output) occurs only in very narrow cases, which is backed up by the lack of previous complaints from the field. So we may be better off living with it in released branches; and in any case it'd be smart to let this ripen awhile in HEAD before we consider back-patching it.
2012-09-22 01:03:10 +02:00
Speed up ruleutils' name de-duplication code, and fix overlength-name case. Since commit 11e131854f8231a21613f834c40fe9d046926387, ruleutils.c has attempted to ensure that each RTE in a query or plan tree has a unique alias name. However, the code that was added for this could be quite slow, even as bad as O(N^3) if N identical RTE names must be replaced, as noted by Jeff Janes. Improve matters by building a transient hash table within set_rtable_names. The hash table in itself reduces the cost of detecting a duplicate from O(N) to O(1), and we can save another factor of N by storing the number of de-duplicated names already created for each entry, so that we don't have to re-try names already created. This way is probably a bit slower overall for small range tables, but almost by definition, such cases should not be a performance problem. In principle the same problem applies to the column-name-de-duplication code; but in practice that seems to be less of a problem, first because N is limited since we don't support extremely wide tables, and second because duplicate column names within an RTE are fairly rare, so that in practice the cost is more like O(N^2) not O(N^3). It would be very much messier to fix the column-name code, so for now I've left that alone. An independent problem in the same area was that the de-duplication code paid no attention to the identifier length limit, and would happily produce identifiers that were longer than NAMEDATALEN and wouldn't be unique after truncation to NAMEDATALEN. This could result in dump/reload failures, or perhaps even views that silently behaved differently than before. We can fix that by shortening the base name as needed. Fix it for both the relation and column name cases. In passing, check for interrupts in set_rtable_names, just in case it's still slow enough to be an issue. Back-patch to 9.3 where this code was introduced.
2015-11-16 19:45:17 +01:00
do
{
hentry->counter++;
for (;;)
{
memcpy(modname, refname, refnamelen);
sprintf(modname + refnamelen, "_%d", hentry->counter);
if (strlen(modname) < NAMEDATALEN)
break;
/* drop chars from refname to keep all the digits */
refnamelen = pg_mbcliplen(refname, refnamelen,
refnamelen - 1);
}
hentry2 = (NameHashEntry *) hash_search(names_hash,
modname,
HASH_ENTER,
&found);
} while (found);
hentry2->counter = 0; /* init new hash entry */
refname = modname;
}
else
Improve ruleutils.c's heuristics for dealing with rangetable aliases. The previous scheme had bugs in some corner cases involving tables that had been renamed since a view was made. This could result in dumped views that failed to reload or reloaded incorrectly, as seen in bug #7553 from Lloyd Albin, as well as in some pgsql-hackers discussion back in January. Also, its behavior for printing EXPLAIN plans was sometimes confusing because of willingness to use the same alias for multiple RTEs (it was Ashutosh Bapat's complaint about that aspect that started the January thread). To fix, ensure that each RTE in the query has a unique unqualified alias, by modifying the alias if necessary (we add "_" and digits as needed to create a non-conflicting name). Then we can just print its variables with that alias, avoiding the confusing and bug-prone scheme of sometimes schema-qualifying variable names. In EXPLAIN, it proves to be expedient to take the further step of only assigning such aliases to RTEs that are actually referenced in the query, since the planner has a habit of generating extra RTEs with the same alias in situations such as inheritance-tree expansion. Although this fixes a bug of very long standing, I'm hesitant to back-patch such a noticeable behavioral change. My experiments while creating a regression test convinced me that actually incorrect output (as opposed to confusing output) occurs only in very narrow cases, which is backed up by the lack of previous complaints from the field. So we may be better off living with it in released branches; and in any case it'd be smart to let this ripen awhile in HEAD before we consider back-patching it.
2012-09-22 01:03:10 +02:00
{
Speed up ruleutils' name de-duplication code, and fix overlength-name case. Since commit 11e131854f8231a21613f834c40fe9d046926387, ruleutils.c has attempted to ensure that each RTE in a query or plan tree has a unique alias name. However, the code that was added for this could be quite slow, even as bad as O(N^3) if N identical RTE names must be replaced, as noted by Jeff Janes. Improve matters by building a transient hash table within set_rtable_names. The hash table in itself reduces the cost of detecting a duplicate from O(N) to O(1), and we can save another factor of N by storing the number of de-duplicated names already created for each entry, so that we don't have to re-try names already created. This way is probably a bit slower overall for small range tables, but almost by definition, such cases should not be a performance problem. In principle the same problem applies to the column-name-de-duplication code; but in practice that seems to be less of a problem, first because N is limited since we don't support extremely wide tables, and second because duplicate column names within an RTE are fairly rare, so that in practice the cost is more like O(N^2) not O(N^3). It would be very much messier to fix the column-name code, so for now I've left that alone. An independent problem in the same area was that the de-duplication code paid no attention to the identifier length limit, and would happily produce identifiers that were longer than NAMEDATALEN and wouldn't be unique after truncation to NAMEDATALEN. This could result in dump/reload failures, or perhaps even views that silently behaved differently than before. We can fix that by shortening the base name as needed. Fix it for both the relation and column name cases. In passing, check for interrupts in set_rtable_names, just in case it's still slow enough to be an issue. Back-patch to 9.3 where this code was introduced.
2015-11-16 19:45:17 +01:00
/* Name not previously used, need only initialize hentry */
hentry->counter = 0;
}
Improve ruleutils.c's heuristics for dealing with rangetable aliases. The previous scheme had bugs in some corner cases involving tables that had been renamed since a view was made. This could result in dumped views that failed to reload or reloaded incorrectly, as seen in bug #7553 from Lloyd Albin, as well as in some pgsql-hackers discussion back in January. Also, its behavior for printing EXPLAIN plans was sometimes confusing because of willingness to use the same alias for multiple RTEs (it was Ashutosh Bapat's complaint about that aspect that started the January thread). To fix, ensure that each RTE in the query has a unique unqualified alias, by modifying the alias if necessary (we add "_" and digits as needed to create a non-conflicting name). Then we can just print its variables with that alias, avoiding the confusing and bug-prone scheme of sometimes schema-qualifying variable names. In EXPLAIN, it proves to be expedient to take the further step of only assigning such aliases to RTEs that are actually referenced in the query, since the planner has a habit of generating extra RTEs with the same alias in situations such as inheritance-tree expansion. Although this fixes a bug of very long standing, I'm hesitant to back-patch such a noticeable behavioral change. My experiments while creating a regression test convinced me that actually incorrect output (as opposed to confusing output) occurs only in very narrow cases, which is backed up by the lack of previous complaints from the field. So we may be better off living with it in released branches; and in any case it'd be smart to let this ripen awhile in HEAD before we consider back-patching it.
2012-09-22 01:03:10 +02:00
}
dpns->rtable_names = lappend(dpns->rtable_names, refname);
rtindex++;
}
Speed up ruleutils' name de-duplication code, and fix overlength-name case. Since commit 11e131854f8231a21613f834c40fe9d046926387, ruleutils.c has attempted to ensure that each RTE in a query or plan tree has a unique alias name. However, the code that was added for this could be quite slow, even as bad as O(N^3) if N identical RTE names must be replaced, as noted by Jeff Janes. Improve matters by building a transient hash table within set_rtable_names. The hash table in itself reduces the cost of detecting a duplicate from O(N) to O(1), and we can save another factor of N by storing the number of de-duplicated names already created for each entry, so that we don't have to re-try names already created. This way is probably a bit slower overall for small range tables, but almost by definition, such cases should not be a performance problem. In principle the same problem applies to the column-name-de-duplication code; but in practice that seems to be less of a problem, first because N is limited since we don't support extremely wide tables, and second because duplicate column names within an RTE are fairly rare, so that in practice the cost is more like O(N^2) not O(N^3). It would be very much messier to fix the column-name code, so for now I've left that alone. An independent problem in the same area was that the de-duplication code paid no attention to the identifier length limit, and would happily produce identifiers that were longer than NAMEDATALEN and wouldn't be unique after truncation to NAMEDATALEN. This could result in dump/reload failures, or perhaps even views that silently behaved differently than before. We can fix that by shortening the base name as needed. Fix it for both the relation and column name cases. In passing, check for interrupts in set_rtable_names, just in case it's still slow enough to be an issue. Back-patch to 9.3 where this code was introduced.
2015-11-16 19:45:17 +01:00
hash_destroy(names_hash);
Improve ruleutils.c's heuristics for dealing with rangetable aliases. The previous scheme had bugs in some corner cases involving tables that had been renamed since a view was made. This could result in dumped views that failed to reload or reloaded incorrectly, as seen in bug #7553 from Lloyd Albin, as well as in some pgsql-hackers discussion back in January. Also, its behavior for printing EXPLAIN plans was sometimes confusing because of willingness to use the same alias for multiple RTEs (it was Ashutosh Bapat's complaint about that aspect that started the January thread). To fix, ensure that each RTE in the query has a unique unqualified alias, by modifying the alias if necessary (we add "_" and digits as needed to create a non-conflicting name). Then we can just print its variables with that alias, avoiding the confusing and bug-prone scheme of sometimes schema-qualifying variable names. In EXPLAIN, it proves to be expedient to take the further step of only assigning such aliases to RTEs that are actually referenced in the query, since the planner has a habit of generating extra RTEs with the same alias in situations such as inheritance-tree expansion. Although this fixes a bug of very long standing, I'm hesitant to back-patch such a noticeable behavioral change. My experiments while creating a regression test convinced me that actually incorrect output (as opposed to confusing output) occurs only in very narrow cases, which is backed up by the lack of previous complaints from the field. So we may be better off living with it in released branches; and in any case it'd be smart to let this ripen awhile in HEAD before we consider back-patching it.
2012-09-22 01:03:10 +02:00
}
/*
* set_deparse_for_query: set up deparse_namespace for deparsing a Query tree
*
* For convenience, this is defined to initialize the deparse_namespace struct
* from scratch.
*/
static void
set_deparse_for_query(deparse_namespace *dpns, Query *query,
List *parent_namespaces)
{
ListCell *lc;
ListCell *lc2;
/* Initialize *dpns and fill rtable/ctes links */
memset(dpns, 0, sizeof(deparse_namespace));
dpns->rtable = query->rtable;
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
dpns->subplans = NIL;
dpns->ctes = query->cteList;
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
dpns->appendrels = NULL;
/* Assign a unique relation alias to each RTE */
set_rtable_names(dpns, parent_namespaces, NULL);
/* Initialize dpns->rtable_columns to contain zeroed structs */
dpns->rtable_columns = NIL;
while (list_length(dpns->rtable_columns) < list_length(dpns->rtable))
dpns->rtable_columns = lappend(dpns->rtable_columns,
palloc0(sizeof(deparse_columns)));
/* If it's a utility query, it won't have a jointree */
if (query->jointree)
{
/* Detect whether global uniqueness of USING names is needed */
dpns->unique_using =
has_dangerous_join_using(dpns, (Node *) query->jointree);
/*
* Select names for columns merged by USING, via a recursive pass over
* the query jointree.
*/
set_using_names(dpns, (Node *) query->jointree, NIL);
}
/*
* Now assign remaining column aliases for each RTE. We do this in a
* linear scan of the rtable, so as to process RTEs whether or not they
* are in the jointree (we mustn't miss NEW.*, INSERT target relations,
* etc). JOIN RTEs must be processed after their children, but this is
* okay because they appear later in the rtable list than their children
* (cf Asserts in identify_join_columns()).
*/
forboth(lc, dpns->rtable, lc2, dpns->rtable_columns)
{
RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc);
deparse_columns *colinfo = (deparse_columns *) lfirst(lc2);
if (rte->rtekind == RTE_JOIN)
set_join_column_names(dpns, rte, colinfo);
else
set_relation_column_names(dpns, rte, colinfo);
}
}
/*
* set_simple_column_names: fill in column aliases for non-query situations
*
* This handles EXPLAIN and cases where we only have relation RTEs. Without
* a join tree, we can't do anything smart about join RTEs, but we don't
* need to (note that EXPLAIN should never see join alias Vars anyway).
* If we do hit a join RTE we'll just process it like a non-table base RTE.
*/
static void
set_simple_column_names(deparse_namespace *dpns)
{
ListCell *lc;
ListCell *lc2;
/* Initialize dpns->rtable_columns to contain zeroed structs */
dpns->rtable_columns = NIL;
while (list_length(dpns->rtable_columns) < list_length(dpns->rtable))
dpns->rtable_columns = lappend(dpns->rtable_columns,
palloc0(sizeof(deparse_columns)));
/* Assign unique column aliases within each RTE */
forboth(lc, dpns->rtable, lc2, dpns->rtable_columns)
{
RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc);
deparse_columns *colinfo = (deparse_columns *) lfirst(lc2);
set_relation_column_names(dpns, rte, colinfo);
}
}
/*
* has_dangerous_join_using: search jointree for unnamed JOIN USING
*
* Merged columns of a JOIN USING may act differently from either of the input
* columns, either because they are merged with COALESCE (in a FULL JOIN) or
* because an implicit coercion of the underlying input column is required.
* In such a case the column must be referenced as a column of the JOIN not as
* a column of either input. And this is problematic if the join is unnamed
* (alias-less): we cannot qualify the column's name with an RTE name, since
* there is none. (Forcibly assigning an alias to the join is not a solution,
* since that will prevent legal references to tables below the join.)
* To ensure that every column in the query is unambiguously referenceable,
* we must assign such merged columns names that are globally unique across
* the whole query, aliasing other columns out of the way as necessary.
*
* Because the ensuing re-aliasing is fairly damaging to the readability of
* the query, we don't do this unless we have to. So, we must pre-scan
* the join tree to see if we have to, before starting set_using_names().
*/
static bool
has_dangerous_join_using(deparse_namespace *dpns, Node *jtnode)
{
if (IsA(jtnode, RangeTblRef))
{
/* nothing to do here */
}
else if (IsA(jtnode, FromExpr))
{
FromExpr *f = (FromExpr *) jtnode;
ListCell *lc;
foreach(lc, f->fromlist)
{
if (has_dangerous_join_using(dpns, (Node *) lfirst(lc)))
return true;
}
}
else if (IsA(jtnode, JoinExpr))
{
JoinExpr *j = (JoinExpr *) jtnode;
/* Is it an unnamed JOIN with USING? */
if (j->alias == NULL && j->usingClause)
{
/*
* Yes, so check each join alias var to see if any of them are not
* simple references to underlying columns. If so, we have a
* dangerous situation and must pick unique aliases.
*/
RangeTblEntry *jrte = rt_fetch(j->rtindex, dpns->rtable);
Reconsider the representation of join alias Vars. The core idea of this patch is to make the parser generate join alias Vars (that is, ones with varno pointing to a JOIN RTE) only when the alias Var is actually different from any raw join input, that is a type coercion and/or COALESCE is necessary to generate the join output value. Otherwise just generate varno/varattno pointing to the relevant join input column. In effect, this means that the planner's flatten_join_alias_vars() transformation is already done in the parser, for all cases except (a) columns that are merged by JOIN USING and are transformed in the process, and (b) whole-row join Vars. In principle that would allow us to skip doing flatten_join_alias_vars() in many more queries than we do now, but we don't have quite enough infrastructure to know that we can do so --- in particular there's no cheap way to know whether there are any whole-row join Vars. I'm not sure if it's worth the trouble to add a Query-level flag for that, and in any case it seems like fit material for a separate patch. But even without skipping the work entirely, this should make flatten_join_alias_vars() faster, particularly where there are nested joins that it previously had to flatten recursively. An essential part of this change is to replace Var nodes' varnoold/varoattno fields with varnosyn/varattnosyn, which have considerably more tightly-defined meanings than the old fields: when they differ from varno/varattno, they identify the Var's position in an aliased JOIN RTE, and the join alias is what ruleutils.c should print for the Var. This is necessary because the varno change destroyed ruleutils.c's ability to find the JOIN RTE from the Var's varno. Another way in which this change broke ruleutils.c is that it's no longer feasible to determine, from a JOIN RTE's joinaliasvars list, which join columns correspond to which columns of the join's immediate input relations. (If those are sub-joins, the joinaliasvars entries may point to columns of their base relations, not the sub-joins.) But that was a horrid mess requiring a lot of fragile assumptions already, so let's just bite the bullet and add some more JOIN RTE fields to make it more straightforward to figure that out. I added two integer-List fields containing the relevant column numbers from the left and right input rels, plus a count of how many merged columns there are. This patch depends on the ParseNamespaceColumn infrastructure that I added in commit 5815696bc. The biggest bit of code change is restructuring transformFromClauseItem's handling of JOINs so that the ParseNamespaceColumn data is propagated upward correctly. Other than that and the ruleutils fixes, everything pretty much just works, though some processing is now inessential. I grabbed two pieces of low-hanging fruit in that line: 1. In find_expr_references, we don't need to recurse into join alias Vars anymore. There aren't any except for references to merged USING columns, which are more properly handled when we scan the join's RTE. This change actually fixes an edge-case issue: we will now record a dependency on any type-coercion function present in a USING column's joinaliasvar, even if that join column has no references in the query text. The odds of the missing dependency causing a problem seem quite small: you'd have to posit somebody dropping an implicit cast between two data types, without removing the types themselves, and then having a stored rule containing a whole-row Var for a join whose USING merge depends on that cast. So I don't feel a great need to change this in the back branches. But in theory this way is more correct. 2. markRTEForSelectPriv and markTargetListOrigin don't need to recurse into join alias Vars either, because the cases they care about don't apply to alias Vars for USING columns that are semantically distinct from the underlying columns. This removes the only case in which markVarForSelectPriv could be called with NULL for the RTE, so adjust the comments to describe that hack as being strictly internal to markRTEForSelectPriv. catversion bump required due to changes in stored rules. Discussion: https://postgr.es/m/7115.1577986646@sss.pgh.pa.us
2020-01-09 17:56:59 +01:00
/* We need only examine the merged columns */
for (int i = 0; i < jrte->joinmergedcols; i++)
{
Reconsider the representation of join alias Vars. The core idea of this patch is to make the parser generate join alias Vars (that is, ones with varno pointing to a JOIN RTE) only when the alias Var is actually different from any raw join input, that is a type coercion and/or COALESCE is necessary to generate the join output value. Otherwise just generate varno/varattno pointing to the relevant join input column. In effect, this means that the planner's flatten_join_alias_vars() transformation is already done in the parser, for all cases except (a) columns that are merged by JOIN USING and are transformed in the process, and (b) whole-row join Vars. In principle that would allow us to skip doing flatten_join_alias_vars() in many more queries than we do now, but we don't have quite enough infrastructure to know that we can do so --- in particular there's no cheap way to know whether there are any whole-row join Vars. I'm not sure if it's worth the trouble to add a Query-level flag for that, and in any case it seems like fit material for a separate patch. But even without skipping the work entirely, this should make flatten_join_alias_vars() faster, particularly where there are nested joins that it previously had to flatten recursively. An essential part of this change is to replace Var nodes' varnoold/varoattno fields with varnosyn/varattnosyn, which have considerably more tightly-defined meanings than the old fields: when they differ from varno/varattno, they identify the Var's position in an aliased JOIN RTE, and the join alias is what ruleutils.c should print for the Var. This is necessary because the varno change destroyed ruleutils.c's ability to find the JOIN RTE from the Var's varno. Another way in which this change broke ruleutils.c is that it's no longer feasible to determine, from a JOIN RTE's joinaliasvars list, which join columns correspond to which columns of the join's immediate input relations. (If those are sub-joins, the joinaliasvars entries may point to columns of their base relations, not the sub-joins.) But that was a horrid mess requiring a lot of fragile assumptions already, so let's just bite the bullet and add some more JOIN RTE fields to make it more straightforward to figure that out. I added two integer-List fields containing the relevant column numbers from the left and right input rels, plus a count of how many merged columns there are. This patch depends on the ParseNamespaceColumn infrastructure that I added in commit 5815696bc. The biggest bit of code change is restructuring transformFromClauseItem's handling of JOINs so that the ParseNamespaceColumn data is propagated upward correctly. Other than that and the ruleutils fixes, everything pretty much just works, though some processing is now inessential. I grabbed two pieces of low-hanging fruit in that line: 1. In find_expr_references, we don't need to recurse into join alias Vars anymore. There aren't any except for references to merged USING columns, which are more properly handled when we scan the join's RTE. This change actually fixes an edge-case issue: we will now record a dependency on any type-coercion function present in a USING column's joinaliasvar, even if that join column has no references in the query text. The odds of the missing dependency causing a problem seem quite small: you'd have to posit somebody dropping an implicit cast between two data types, without removing the types themselves, and then having a stored rule containing a whole-row Var for a join whose USING merge depends on that cast. So I don't feel a great need to change this in the back branches. But in theory this way is more correct. 2. markRTEForSelectPriv and markTargetListOrigin don't need to recurse into join alias Vars either, because the cases they care about don't apply to alias Vars for USING columns that are semantically distinct from the underlying columns. This removes the only case in which markVarForSelectPriv could be called with NULL for the RTE, so adjust the comments to describe that hack as being strictly internal to markRTEForSelectPriv. catversion bump required due to changes in stored rules. Discussion: https://postgr.es/m/7115.1577986646@sss.pgh.pa.us
2020-01-09 17:56:59 +01:00
Node *aliasvar = list_nth(jrte->joinaliasvars, i);
Reconsider the representation of join alias Vars. The core idea of this patch is to make the parser generate join alias Vars (that is, ones with varno pointing to a JOIN RTE) only when the alias Var is actually different from any raw join input, that is a type coercion and/or COALESCE is necessary to generate the join output value. Otherwise just generate varno/varattno pointing to the relevant join input column. In effect, this means that the planner's flatten_join_alias_vars() transformation is already done in the parser, for all cases except (a) columns that are merged by JOIN USING and are transformed in the process, and (b) whole-row join Vars. In principle that would allow us to skip doing flatten_join_alias_vars() in many more queries than we do now, but we don't have quite enough infrastructure to know that we can do so --- in particular there's no cheap way to know whether there are any whole-row join Vars. I'm not sure if it's worth the trouble to add a Query-level flag for that, and in any case it seems like fit material for a separate patch. But even without skipping the work entirely, this should make flatten_join_alias_vars() faster, particularly where there are nested joins that it previously had to flatten recursively. An essential part of this change is to replace Var nodes' varnoold/varoattno fields with varnosyn/varattnosyn, which have considerably more tightly-defined meanings than the old fields: when they differ from varno/varattno, they identify the Var's position in an aliased JOIN RTE, and the join alias is what ruleutils.c should print for the Var. This is necessary because the varno change destroyed ruleutils.c's ability to find the JOIN RTE from the Var's varno. Another way in which this change broke ruleutils.c is that it's no longer feasible to determine, from a JOIN RTE's joinaliasvars list, which join columns correspond to which columns of the join's immediate input relations. (If those are sub-joins, the joinaliasvars entries may point to columns of their base relations, not the sub-joins.) But that was a horrid mess requiring a lot of fragile assumptions already, so let's just bite the bullet and add some more JOIN RTE fields to make it more straightforward to figure that out. I added two integer-List fields containing the relevant column numbers from the left and right input rels, plus a count of how many merged columns there are. This patch depends on the ParseNamespaceColumn infrastructure that I added in commit 5815696bc. The biggest bit of code change is restructuring transformFromClauseItem's handling of JOINs so that the ParseNamespaceColumn data is propagated upward correctly. Other than that and the ruleutils fixes, everything pretty much just works, though some processing is now inessential. I grabbed two pieces of low-hanging fruit in that line: 1. In find_expr_references, we don't need to recurse into join alias Vars anymore. There aren't any except for references to merged USING columns, which are more properly handled when we scan the join's RTE. This change actually fixes an edge-case issue: we will now record a dependency on any type-coercion function present in a USING column's joinaliasvar, even if that join column has no references in the query text. The odds of the missing dependency causing a problem seem quite small: you'd have to posit somebody dropping an implicit cast between two data types, without removing the types themselves, and then having a stored rule containing a whole-row Var for a join whose USING merge depends on that cast. So I don't feel a great need to change this in the back branches. But in theory this way is more correct. 2. markRTEForSelectPriv and markTargetListOrigin don't need to recurse into join alias Vars either, because the cases they care about don't apply to alias Vars for USING columns that are semantically distinct from the underlying columns. This removes the only case in which markVarForSelectPriv could be called with NULL for the RTE, so adjust the comments to describe that hack as being strictly internal to markRTEForSelectPriv. catversion bump required due to changes in stored rules. Discussion: https://postgr.es/m/7115.1577986646@sss.pgh.pa.us
2020-01-09 17:56:59 +01:00
if (!IsA(aliasvar, Var))
return true;
}
}
/* Nope, but inspect children */
if (has_dangerous_join_using(dpns, j->larg))
return true;
if (has_dangerous_join_using(dpns, j->rarg))
return true;
}
else
elog(ERROR, "unrecognized node type: %d",
(int) nodeTag(jtnode));
return false;
}
/*
* set_using_names: select column aliases to be used for merged USING columns
*
* We do this during a recursive descent of the query jointree.
* dpns->unique_using must already be set to determine the global strategy.
*
* Column alias info is saved in the dpns->rtable_columns list, which is
* assumed to be filled with pre-zeroed deparse_columns structs.
*
* parentUsing is a list of all USING aliases assigned in parent joins of
* the current jointree node. (The passed-in list must not be modified.)
*/
static void
set_using_names(deparse_namespace *dpns, Node *jtnode, List *parentUsing)
{
if (IsA(jtnode, RangeTblRef))
{
/* nothing to do now */
}
else if (IsA(jtnode, FromExpr))
{
FromExpr *f = (FromExpr *) jtnode;
ListCell *lc;
foreach(lc, f->fromlist)
set_using_names(dpns, (Node *) lfirst(lc), parentUsing);
}
else if (IsA(jtnode, JoinExpr))
{
JoinExpr *j = (JoinExpr *) jtnode;
RangeTblEntry *rte = rt_fetch(j->rtindex, dpns->rtable);
deparse_columns *colinfo = deparse_columns_fetch(j->rtindex, dpns);
int *leftattnos;
int *rightattnos;
deparse_columns *leftcolinfo;
deparse_columns *rightcolinfo;
int i;
ListCell *lc;
/* Get info about the shape of the join */
identify_join_columns(j, rte, colinfo);
leftattnos = colinfo->leftattnos;
rightattnos = colinfo->rightattnos;
/* Look up the not-yet-filled-in child deparse_columns structs */
leftcolinfo = deparse_columns_fetch(colinfo->leftrti, dpns);
rightcolinfo = deparse_columns_fetch(colinfo->rightrti, dpns);
/*
* If this join is unnamed, then we cannot substitute new aliases at
* this level, so any name requirements pushed down to here must be
* pushed down again to the children.
*/
if (rte->alias == NULL)
{
for (i = 0; i < colinfo->num_cols; i++)
{
char *colname = colinfo->colnames[i];
if (colname == NULL)
continue;
/* Push down to left column, unless it's a system column */
if (leftattnos[i] > 0)
{
expand_colnames_array_to(leftcolinfo, leftattnos[i]);
leftcolinfo->colnames[leftattnos[i] - 1] = colname;
}
/* Same on the righthand side */
if (rightattnos[i] > 0)
{
expand_colnames_array_to(rightcolinfo, rightattnos[i]);
rightcolinfo->colnames[rightattnos[i] - 1] = colname;
}
}
}
/*
* If there's a USING clause, select the USING column names and push
* those names down to the children. We have two strategies:
*
* If dpns->unique_using is true, we force all USING names to be
* unique across the whole query level. In principle we'd only need
* the names of dangerous USING columns to be globally unique, but to
* safely assign all USING names in a single pass, we have to enforce
* the same uniqueness rule for all of them. However, if a USING
* column's name has been pushed down from the parent, we should use
* it as-is rather than making a uniqueness adjustment. This is
* necessary when we're at an unnamed join, and it creates no risk of
* ambiguity. Also, if there's a user-written output alias for a
* merged column, we prefer to use that rather than the input name;
* this simplifies the logic and seems likely to lead to less aliasing
* overall.
*
* If dpns->unique_using is false, we only need USING names to be
* unique within their own join RTE. We still need to honor
* pushed-down names, though.
*
* Though significantly different in results, these two strategies are
* implemented by the same code, with only the difference of whether
* to put assigned names into dpns->using_names.
*/
if (j->usingClause)
{
/* Copy the input parentUsing list so we don't modify it */
parentUsing = list_copy(parentUsing);
/* USING names must correspond to the first join output columns */
expand_colnames_array_to(colinfo, list_length(j->usingClause));
i = 0;
foreach(lc, j->usingClause)
{
char *colname = strVal(lfirst(lc));
/* Assert it's a merged column */
Assert(leftattnos[i] != 0 && rightattnos[i] != 0);
/* Adopt passed-down name if any, else select unique name */
if (colinfo->colnames[i] != NULL)
colname = colinfo->colnames[i];
else
{
/* Prefer user-written output alias if any */
if (rte->alias && i < list_length(rte->alias->colnames))
colname = strVal(list_nth(rte->alias->colnames, i));
/* Make it appropriately unique */
colname = make_colname_unique(colname, dpns, colinfo);
if (dpns->unique_using)
dpns->using_names = lappend(dpns->using_names,
colname);
/* Save it as output column name, too */
colinfo->colnames[i] = colname;
}
/* Remember selected names for use later */
colinfo->usingNames = lappend(colinfo->usingNames, colname);
parentUsing = lappend(parentUsing, colname);
/* Push down to left column, unless it's a system column */
if (leftattnos[i] > 0)
{
expand_colnames_array_to(leftcolinfo, leftattnos[i]);
leftcolinfo->colnames[leftattnos[i] - 1] = colname;
}
/* Same on the righthand side */
if (rightattnos[i] > 0)
{
expand_colnames_array_to(rightcolinfo, rightattnos[i]);
rightcolinfo->colnames[rightattnos[i] - 1] = colname;
}
i++;
}
}
/* Mark child deparse_columns structs with correct parentUsing info */
leftcolinfo->parentUsing = parentUsing;
rightcolinfo->parentUsing = parentUsing;
/* Now recursively assign USING column names in children */
set_using_names(dpns, j->larg, parentUsing);
set_using_names(dpns, j->rarg, parentUsing);
}
else
elog(ERROR, "unrecognized node type: %d",
(int) nodeTag(jtnode));
}
/*
* set_relation_column_names: select column aliases for a non-join RTE
*
* Column alias info is saved in *colinfo, which is assumed to be pre-zeroed.
* If any colnames entries are already filled in, those override local
* choices.
*/
static void
set_relation_column_names(deparse_namespace *dpns, RangeTblEntry *rte,
deparse_columns *colinfo)
{
int ncolumns;
char **real_colnames;
bool changed_any;
int noldcolumns;
int i;
int j;
/*
Fix ruleutils issues with dropped cols in functions-returning-composite. Due to lack of concern for the case in the dependency code, it's possible to drop a column of a composite type even though stored queries have references to the dropped column via functions-in-FROM that return the composite type. There are "soft" references, namely FROM-clause aliases for such columns, and "hard" references, that is actual Vars referring to them. The right fix for hard references is to add dependencies preventing the drop; something we've known for many years and not done (and this commit still doesn't address it). A "soft" reference shouldn't prevent a drop though. We've been around on this before (cf. 9b35ddce9, 2c4debbd0), but nobody had noticed that the current behavior can result in dump/reload failures, because ruleutils.c can print more column aliases than the underlying composite type now has. So we need to rejigger the column-alias-handling code to treat such columns as dropped and not print aliases for them. Rather than writing new code for this, I used expandRTE() which already knows how to figure out which function result columns are dropped. I'd initially thought maybe we could use expandRTE() in all cases, but that fails for EXPLAIN's purposes, because the planner strips a lot of RTE infrastructure that expandRTE() needs. So this patch just uses it for unplanned function RTEs and otherwise does things the old way. If there is a hard reference (Var), then removing the column alias causes us to fail to print the Var, since there's no longer a name to print. Failing seems less desirable than printing a made-up name, so I made it print "?dropped?column?" instead. Per report from Timo Stolz. Back-patch to all supported branches. Discussion: https://postgr.es/m/5c91267e-3b6d-5795-189c-d15a55d61dbb@nullachtvierzehn.de
2022-07-21 19:56:02 +02:00
* Construct an array of the current "real" column names of the RTE.
* real_colnames[] will be indexed by physical column number, with NULL
* entries for dropped columns.
*/
if (rte->rtekind == RTE_RELATION)
{
/* Relation --- look to the system catalogs for up-to-date info */
Relation rel;
TupleDesc tupdesc;
rel = relation_open(rte->relid, AccessShareLock);
tupdesc = RelationGetDescr(rel);
ncolumns = tupdesc->natts;
real_colnames = (char **) palloc(ncolumns * sizeof(char *));
for (i = 0; i < ncolumns; i++)
{
Form_pg_attribute attr = TupleDescAttr(tupdesc, i);
if (attr->attisdropped)
real_colnames[i] = NULL;
else
real_colnames[i] = pstrdup(NameStr(attr->attname));
}
relation_close(rel, AccessShareLock);
}
else
{
Fix ruleutils issues with dropped cols in functions-returning-composite. Due to lack of concern for the case in the dependency code, it's possible to drop a column of a composite type even though stored queries have references to the dropped column via functions-in-FROM that return the composite type. There are "soft" references, namely FROM-clause aliases for such columns, and "hard" references, that is actual Vars referring to them. The right fix for hard references is to add dependencies preventing the drop; something we've known for many years and not done (and this commit still doesn't address it). A "soft" reference shouldn't prevent a drop though. We've been around on this before (cf. 9b35ddce9, 2c4debbd0), but nobody had noticed that the current behavior can result in dump/reload failures, because ruleutils.c can print more column aliases than the underlying composite type now has. So we need to rejigger the column-alias-handling code to treat such columns as dropped and not print aliases for them. Rather than writing new code for this, I used expandRTE() which already knows how to figure out which function result columns are dropped. I'd initially thought maybe we could use expandRTE() in all cases, but that fails for EXPLAIN's purposes, because the planner strips a lot of RTE infrastructure that expandRTE() needs. So this patch just uses it for unplanned function RTEs and otherwise does things the old way. If there is a hard reference (Var), then removing the column alias causes us to fail to print the Var, since there's no longer a name to print. Failing seems less desirable than printing a made-up name, so I made it print "?dropped?column?" instead. Per report from Timo Stolz. Back-patch to all supported branches. Discussion: https://postgr.es/m/5c91267e-3b6d-5795-189c-d15a55d61dbb@nullachtvierzehn.de
2022-07-21 19:56:02 +02:00
/* Otherwise get the column names from eref or expandRTE() */
List *colnames;
ListCell *lc;
Fix ruleutils issues with dropped cols in functions-returning-composite. Due to lack of concern for the case in the dependency code, it's possible to drop a column of a composite type even though stored queries have references to the dropped column via functions-in-FROM that return the composite type. There are "soft" references, namely FROM-clause aliases for such columns, and "hard" references, that is actual Vars referring to them. The right fix for hard references is to add dependencies preventing the drop; something we've known for many years and not done (and this commit still doesn't address it). A "soft" reference shouldn't prevent a drop though. We've been around on this before (cf. 9b35ddce9, 2c4debbd0), but nobody had noticed that the current behavior can result in dump/reload failures, because ruleutils.c can print more column aliases than the underlying composite type now has. So we need to rejigger the column-alias-handling code to treat such columns as dropped and not print aliases for them. Rather than writing new code for this, I used expandRTE() which already knows how to figure out which function result columns are dropped. I'd initially thought maybe we could use expandRTE() in all cases, but that fails for EXPLAIN's purposes, because the planner strips a lot of RTE infrastructure that expandRTE() needs. So this patch just uses it for unplanned function RTEs and otherwise does things the old way. If there is a hard reference (Var), then removing the column alias causes us to fail to print the Var, since there's no longer a name to print. Failing seems less desirable than printing a made-up name, so I made it print "?dropped?column?" instead. Per report from Timo Stolz. Back-patch to all supported branches. Discussion: https://postgr.es/m/5c91267e-3b6d-5795-189c-d15a55d61dbb@nullachtvierzehn.de
2022-07-21 19:56:02 +02:00
/*
* Functions returning composites have the annoying property that some
* of the composite type's columns might have been dropped since the
* query was parsed. If possible, use expandRTE() to handle that
* case, since it has the tedious logic needed to find out about
* dropped columns. However, if we're explaining a plan, then we
* don't have rte->functions because the planner thinks that won't be
* needed later, and that breaks expandRTE(). So in that case we have
* to rely on rte->eref, which may lead us to report a dropped
* column's old name; that seems close enough for EXPLAIN's purposes.
*
* For non-RELATION, non-FUNCTION RTEs, we can just look at rte->eref,
* which should be sufficiently up-to-date: no other RTE types can
* have columns get dropped from under them after parsing.
*/
if (rte->rtekind == RTE_FUNCTION && rte->functions != NIL)
{
/* Since we're not creating Vars, rtindex etc. don't matter */
expandRTE(rte, 1, 0, -1, true /* include dropped */ ,
&colnames, NULL);
}
else
colnames = rte->eref->colnames;
ncolumns = list_length(colnames);
real_colnames = (char **) palloc(ncolumns * sizeof(char *));
i = 0;
Fix ruleutils issues with dropped cols in functions-returning-composite. Due to lack of concern for the case in the dependency code, it's possible to drop a column of a composite type even though stored queries have references to the dropped column via functions-in-FROM that return the composite type. There are "soft" references, namely FROM-clause aliases for such columns, and "hard" references, that is actual Vars referring to them. The right fix for hard references is to add dependencies preventing the drop; something we've known for many years and not done (and this commit still doesn't address it). A "soft" reference shouldn't prevent a drop though. We've been around on this before (cf. 9b35ddce9, 2c4debbd0), but nobody had noticed that the current behavior can result in dump/reload failures, because ruleutils.c can print more column aliases than the underlying composite type now has. So we need to rejigger the column-alias-handling code to treat such columns as dropped and not print aliases for them. Rather than writing new code for this, I used expandRTE() which already knows how to figure out which function result columns are dropped. I'd initially thought maybe we could use expandRTE() in all cases, but that fails for EXPLAIN's purposes, because the planner strips a lot of RTE infrastructure that expandRTE() needs. So this patch just uses it for unplanned function RTEs and otherwise does things the old way. If there is a hard reference (Var), then removing the column alias causes us to fail to print the Var, since there's no longer a name to print. Failing seems less desirable than printing a made-up name, so I made it print "?dropped?column?" instead. Per report from Timo Stolz. Back-patch to all supported branches. Discussion: https://postgr.es/m/5c91267e-3b6d-5795-189c-d15a55d61dbb@nullachtvierzehn.de
2022-07-21 19:56:02 +02:00
foreach(lc, colnames)
{
Partial fix for dropped columns in functions returning composite. When a view has a function-returning-composite in FROM, and there are some dropped columns in the underlying composite type, ruleutils.c printed junk in the column alias list for the reconstructed FROM entry. Before 9.3, this was prevented by doing get_rte_attribute_is_dropped tests while printing the column alias list; but that solution is not currently available to us for reasons I'll explain below. Instead, check for empty-string entries in the alias list, which can only exist if that column position had been dropped at the time the view was made. (The parser fills in empty strings to preserve the invariant that the aliases correspond to physical column positions.) While this is sufficient to handle the case of columns dropped before the view was made, we have still got issues with columns dropped after the view was made. In particular, the view could contain Vars that explicitly reference such columns! The dependency machinery really ought to refuse the column drop attempt in such cases, as it would do when trying to drop a table column that's explicitly referenced in views. However, we currently neglect to store dependencies on columns of composite types, and fixing that is likely to be too big to be back-patchable (not to mention that existing views in existing databases would not have the needed pg_depend entries anyway). So I'll leave that for a separate patch. Pre-9.3, ruleutils would print such Vars normally (with their original column names) even though it suppressed their entries in the RTE's column alias list. This is certainly bogus, since the printed view definition would fail to reload, but at least it didn't crash. However, as of 9.3 the printed column alias list is tightly tied to the names printed for Vars; so we can't treat columns as dropped for one purpose and not dropped for the other. This is why we can't just put back the get_rte_attribute_is_dropped test: it results in an assertion failure if the view in fact contains any Vars referencing the dropped column. Once we've got dependencies preventing such cases, we'll probably want to do it that way instead of relying on the empty-string test used here. This fix turned up a very ancient bug in outfuncs/readfuncs, namely that T_String nodes containing empty strings were not dumped/reloaded correctly: the node was printed as "<>" which is read as a string value of <>. Since (per SQL) we disallow empty-string identifiers, such nodes don't occur normally, which is why we'd not noticed. (Such nodes aren't used for literal constants, just identifiers.) Per report from Marc Schablewski. Back-patch to 9.3 which is where the rule printing behavior changed. The dangling-variable case is broken all the way back, but that's not what his complaint is about.
2014-07-19 20:28:22 +02:00
/*
Fix ruleutils issues with dropped cols in functions-returning-composite. Due to lack of concern for the case in the dependency code, it's possible to drop a column of a composite type even though stored queries have references to the dropped column via functions-in-FROM that return the composite type. There are "soft" references, namely FROM-clause aliases for such columns, and "hard" references, that is actual Vars referring to them. The right fix for hard references is to add dependencies preventing the drop; something we've known for many years and not done (and this commit still doesn't address it). A "soft" reference shouldn't prevent a drop though. We've been around on this before (cf. 9b35ddce9, 2c4debbd0), but nobody had noticed that the current behavior can result in dump/reload failures, because ruleutils.c can print more column aliases than the underlying composite type now has. So we need to rejigger the column-alias-handling code to treat such columns as dropped and not print aliases for them. Rather than writing new code for this, I used expandRTE() which already knows how to figure out which function result columns are dropped. I'd initially thought maybe we could use expandRTE() in all cases, but that fails for EXPLAIN's purposes, because the planner strips a lot of RTE infrastructure that expandRTE() needs. So this patch just uses it for unplanned function RTEs and otherwise does things the old way. If there is a hard reference (Var), then removing the column alias causes us to fail to print the Var, since there's no longer a name to print. Failing seems less desirable than printing a made-up name, so I made it print "?dropped?column?" instead. Per report from Timo Stolz. Back-patch to all supported branches. Discussion: https://postgr.es/m/5c91267e-3b6d-5795-189c-d15a55d61dbb@nullachtvierzehn.de
2022-07-21 19:56:02 +02:00
* If the column name we find here is an empty string, then it's a
* dropped column, so change to NULL.
Partial fix for dropped columns in functions returning composite. When a view has a function-returning-composite in FROM, and there are some dropped columns in the underlying composite type, ruleutils.c printed junk in the column alias list for the reconstructed FROM entry. Before 9.3, this was prevented by doing get_rte_attribute_is_dropped tests while printing the column alias list; but that solution is not currently available to us for reasons I'll explain below. Instead, check for empty-string entries in the alias list, which can only exist if that column position had been dropped at the time the view was made. (The parser fills in empty strings to preserve the invariant that the aliases correspond to physical column positions.) While this is sufficient to handle the case of columns dropped before the view was made, we have still got issues with columns dropped after the view was made. In particular, the view could contain Vars that explicitly reference such columns! The dependency machinery really ought to refuse the column drop attempt in such cases, as it would do when trying to drop a table column that's explicitly referenced in views. However, we currently neglect to store dependencies on columns of composite types, and fixing that is likely to be too big to be back-patchable (not to mention that existing views in existing databases would not have the needed pg_depend entries anyway). So I'll leave that for a separate patch. Pre-9.3, ruleutils would print such Vars normally (with their original column names) even though it suppressed their entries in the RTE's column alias list. This is certainly bogus, since the printed view definition would fail to reload, but at least it didn't crash. However, as of 9.3 the printed column alias list is tightly tied to the names printed for Vars; so we can't treat columns as dropped for one purpose and not dropped for the other. This is why we can't just put back the get_rte_attribute_is_dropped test: it results in an assertion failure if the view in fact contains any Vars referencing the dropped column. Once we've got dependencies preventing such cases, we'll probably want to do it that way instead of relying on the empty-string test used here. This fix turned up a very ancient bug in outfuncs/readfuncs, namely that T_String nodes containing empty strings were not dumped/reloaded correctly: the node was printed as "<>" which is read as a string value of <>. Since (per SQL) we disallow empty-string identifiers, such nodes don't occur normally, which is why we'd not noticed. (Such nodes aren't used for literal constants, just identifiers.) Per report from Marc Schablewski. Back-patch to 9.3 which is where the rule printing behavior changed. The dangling-variable case is broken all the way back, but that's not what his complaint is about.
2014-07-19 20:28:22 +02:00
*/
char *cname = strVal(lfirst(lc));
if (cname[0] == '\0')
cname = NULL;
real_colnames[i] = cname;
i++;
}
}
/*
* Ensure colinfo->colnames has a slot for each column. (It could be long
* enough already, if we pushed down a name for the last column.) Note:
* it's possible that there are now more columns than there were when the
* query was parsed, ie colnames could be longer than rte->eref->colnames.
* We must assign unique aliases to the new columns too, else there could
* be unresolved conflicts when the view/rule is reloaded.
*/
expand_colnames_array_to(colinfo, ncolumns);
Assert(colinfo->num_cols == ncolumns);
/*
* Make sufficiently large new_colnames and is_new_col arrays, too.
*
* Note: because we leave colinfo->num_new_cols zero until after the loop,
* colname_is_unique will not consult that array, which is fine because it
* would only be duplicate effort.
*/
colinfo->new_colnames = (char **) palloc(ncolumns * sizeof(char *));
colinfo->is_new_col = (bool *) palloc(ncolumns * sizeof(bool));
/*
* Scan the columns, select a unique alias for each one, and store it in
* colinfo->colnames and colinfo->new_colnames. The former array has NULL
* entries for dropped columns, the latter omits them. Also mark
* new_colnames entries as to whether they are new since parse time; this
* is the case for entries beyond the length of rte->eref->colnames.
*/
noldcolumns = list_length(rte->eref->colnames);
changed_any = false;
j = 0;
for (i = 0; i < ncolumns; i++)
{
char *real_colname = real_colnames[i];
char *colname = colinfo->colnames[i];
/* Skip dropped columns */
if (real_colname == NULL)
{
Assert(colname == NULL); /* colnames[i] is already NULL */
continue;
}
/* If alias already assigned, that's what to use */
if (colname == NULL)
{
/* If user wrote an alias, prefer that over real column name */
if (rte->alias && i < list_length(rte->alias->colnames))
colname = strVal(list_nth(rte->alias->colnames, i));
else
colname = real_colname;
/* Unique-ify and insert into colinfo */
colname = make_colname_unique(colname, dpns, colinfo);
colinfo->colnames[i] = colname;
}
/* Put names of non-dropped columns in new_colnames[] too */
colinfo->new_colnames[j] = colname;
/* And mark them as new or not */
colinfo->is_new_col[j] = (i >= noldcolumns);
j++;
/* Remember if any assigned aliases differ from "real" name */
if (!changed_any && strcmp(colname, real_colname) != 0)
changed_any = true;
}
/*
* Set correct length for new_colnames[] array. (Note: if columns have
* been added, colinfo->num_cols includes them, which is not really quite
* right but is harmless, since any new columns must be at the end where
* they won't affect varattnos of pre-existing columns.)
*/
colinfo->num_new_cols = j;
/*
* For a relation RTE, we need only print the alias column names if any
* are different from the underlying "real" names. For a function RTE,
* always emit a complete column alias list; this is to protect against
* possible instability of the default column names (eg, from altering
* parameter names). For tablefunc RTEs, we never print aliases, because
* the column names are part of the clause itself. For other RTE types,
* print if we changed anything OR if there were user-written column
* aliases (since the latter would be part of the underlying "reality").
*/
if (rte->rtekind == RTE_RELATION)
colinfo->printaliases = changed_any;
else if (rte->rtekind == RTE_FUNCTION)
colinfo->printaliases = true;
else if (rte->rtekind == RTE_TABLEFUNC)
colinfo->printaliases = false;
else if (rte->alias && rte->alias->colnames != NIL)
colinfo->printaliases = true;
else
colinfo->printaliases = changed_any;
}
/*
* set_join_column_names: select column aliases for a join RTE
*
* Column alias info is saved in *colinfo, which is assumed to be pre-zeroed.
* If any colnames entries are already filled in, those override local
* choices. Also, names for USING columns were already chosen by
* set_using_names(). We further expect that column alias selection has been
* completed for both input RTEs.
*/
static void
set_join_column_names(deparse_namespace *dpns, RangeTblEntry *rte,
deparse_columns *colinfo)
{
deparse_columns *leftcolinfo;
deparse_columns *rightcolinfo;
bool changed_any;
int noldcolumns;
int nnewcolumns;
Bitmapset *leftmerged = NULL;
Bitmapset *rightmerged = NULL;
int i;
int j;
int ic;
int jc;
/* Look up the previously-filled-in child deparse_columns structs */
leftcolinfo = deparse_columns_fetch(colinfo->leftrti, dpns);
rightcolinfo = deparse_columns_fetch(colinfo->rightrti, dpns);
/*
* Ensure colinfo->colnames has a slot for each column. (It could be long
* enough already, if we pushed down a name for the last column.) Note:
* it's possible that one or both inputs now have more columns than there
* were when the query was parsed, but we'll deal with that below. We
* only need entries in colnames for pre-existing columns.
*/
noldcolumns = list_length(rte->eref->colnames);
expand_colnames_array_to(colinfo, noldcolumns);
Assert(colinfo->num_cols == noldcolumns);
/*
* Scan the join output columns, select an alias for each one, and store
* it in colinfo->colnames. If there are USING columns, set_using_names()
* already selected their names, so we can start the loop at the first
* non-merged column.
*/
changed_any = false;
for (i = list_length(colinfo->usingNames); i < noldcolumns; i++)
{
char *colname = colinfo->colnames[i];
char *real_colname;
Reconsider the representation of join alias Vars. The core idea of this patch is to make the parser generate join alias Vars (that is, ones with varno pointing to a JOIN RTE) only when the alias Var is actually different from any raw join input, that is a type coercion and/or COALESCE is necessary to generate the join output value. Otherwise just generate varno/varattno pointing to the relevant join input column. In effect, this means that the planner's flatten_join_alias_vars() transformation is already done in the parser, for all cases except (a) columns that are merged by JOIN USING and are transformed in the process, and (b) whole-row join Vars. In principle that would allow us to skip doing flatten_join_alias_vars() in many more queries than we do now, but we don't have quite enough infrastructure to know that we can do so --- in particular there's no cheap way to know whether there are any whole-row join Vars. I'm not sure if it's worth the trouble to add a Query-level flag for that, and in any case it seems like fit material for a separate patch. But even without skipping the work entirely, this should make flatten_join_alias_vars() faster, particularly where there are nested joins that it previously had to flatten recursively. An essential part of this change is to replace Var nodes' varnoold/varoattno fields with varnosyn/varattnosyn, which have considerably more tightly-defined meanings than the old fields: when they differ from varno/varattno, they identify the Var's position in an aliased JOIN RTE, and the join alias is what ruleutils.c should print for the Var. This is necessary because the varno change destroyed ruleutils.c's ability to find the JOIN RTE from the Var's varno. Another way in which this change broke ruleutils.c is that it's no longer feasible to determine, from a JOIN RTE's joinaliasvars list, which join columns correspond to which columns of the join's immediate input relations. (If those are sub-joins, the joinaliasvars entries may point to columns of their base relations, not the sub-joins.) But that was a horrid mess requiring a lot of fragile assumptions already, so let's just bite the bullet and add some more JOIN RTE fields to make it more straightforward to figure that out. I added two integer-List fields containing the relevant column numbers from the left and right input rels, plus a count of how many merged columns there are. This patch depends on the ParseNamespaceColumn infrastructure that I added in commit 5815696bc. The biggest bit of code change is restructuring transformFromClauseItem's handling of JOINs so that the ParseNamespaceColumn data is propagated upward correctly. Other than that and the ruleutils fixes, everything pretty much just works, though some processing is now inessential. I grabbed two pieces of low-hanging fruit in that line: 1. In find_expr_references, we don't need to recurse into join alias Vars anymore. There aren't any except for references to merged USING columns, which are more properly handled when we scan the join's RTE. This change actually fixes an edge-case issue: we will now record a dependency on any type-coercion function present in a USING column's joinaliasvar, even if that join column has no references in the query text. The odds of the missing dependency causing a problem seem quite small: you'd have to posit somebody dropping an implicit cast between two data types, without removing the types themselves, and then having a stored rule containing a whole-row Var for a join whose USING merge depends on that cast. So I don't feel a great need to change this in the back branches. But in theory this way is more correct. 2. markRTEForSelectPriv and markTargetListOrigin don't need to recurse into join alias Vars either, because the cases they care about don't apply to alias Vars for USING columns that are semantically distinct from the underlying columns. This removes the only case in which markVarForSelectPriv could be called with NULL for the RTE, so adjust the comments to describe that hack as being strictly internal to markRTEForSelectPriv. catversion bump required due to changes in stored rules. Discussion: https://postgr.es/m/7115.1577986646@sss.pgh.pa.us
2020-01-09 17:56:59 +01:00
/* Join column must refer to at least one input column */
Assert(colinfo->leftattnos[i] != 0 || colinfo->rightattnos[i] != 0);
/* Get the child column name */
if (colinfo->leftattnos[i] > 0)
real_colname = leftcolinfo->colnames[colinfo->leftattnos[i] - 1];
else if (colinfo->rightattnos[i] > 0)
real_colname = rightcolinfo->colnames[colinfo->rightattnos[i] - 1];
else
{
/* We're joining system columns --- use eref name */
real_colname = strVal(list_nth(rte->eref->colnames, i));
}
Reconsider the representation of join alias Vars. The core idea of this patch is to make the parser generate join alias Vars (that is, ones with varno pointing to a JOIN RTE) only when the alias Var is actually different from any raw join input, that is a type coercion and/or COALESCE is necessary to generate the join output value. Otherwise just generate varno/varattno pointing to the relevant join input column. In effect, this means that the planner's flatten_join_alias_vars() transformation is already done in the parser, for all cases except (a) columns that are merged by JOIN USING and are transformed in the process, and (b) whole-row join Vars. In principle that would allow us to skip doing flatten_join_alias_vars() in many more queries than we do now, but we don't have quite enough infrastructure to know that we can do so --- in particular there's no cheap way to know whether there are any whole-row join Vars. I'm not sure if it's worth the trouble to add a Query-level flag for that, and in any case it seems like fit material for a separate patch. But even without skipping the work entirely, this should make flatten_join_alias_vars() faster, particularly where there are nested joins that it previously had to flatten recursively. An essential part of this change is to replace Var nodes' varnoold/varoattno fields with varnosyn/varattnosyn, which have considerably more tightly-defined meanings than the old fields: when they differ from varno/varattno, they identify the Var's position in an aliased JOIN RTE, and the join alias is what ruleutils.c should print for the Var. This is necessary because the varno change destroyed ruleutils.c's ability to find the JOIN RTE from the Var's varno. Another way in which this change broke ruleutils.c is that it's no longer feasible to determine, from a JOIN RTE's joinaliasvars list, which join columns correspond to which columns of the join's immediate input relations. (If those are sub-joins, the joinaliasvars entries may point to columns of their base relations, not the sub-joins.) But that was a horrid mess requiring a lot of fragile assumptions already, so let's just bite the bullet and add some more JOIN RTE fields to make it more straightforward to figure that out. I added two integer-List fields containing the relevant column numbers from the left and right input rels, plus a count of how many merged columns there are. This patch depends on the ParseNamespaceColumn infrastructure that I added in commit 5815696bc. The biggest bit of code change is restructuring transformFromClauseItem's handling of JOINs so that the ParseNamespaceColumn data is propagated upward correctly. Other than that and the ruleutils fixes, everything pretty much just works, though some processing is now inessential. I grabbed two pieces of low-hanging fruit in that line: 1. In find_expr_references, we don't need to recurse into join alias Vars anymore. There aren't any except for references to merged USING columns, which are more properly handled when we scan the join's RTE. This change actually fixes an edge-case issue: we will now record a dependency on any type-coercion function present in a USING column's joinaliasvar, even if that join column has no references in the query text. The odds of the missing dependency causing a problem seem quite small: you'd have to posit somebody dropping an implicit cast between two data types, without removing the types themselves, and then having a stored rule containing a whole-row Var for a join whose USING merge depends on that cast. So I don't feel a great need to change this in the back branches. But in theory this way is more correct. 2. markRTEForSelectPriv and markTargetListOrigin don't need to recurse into join alias Vars either, because the cases they care about don't apply to alias Vars for USING columns that are semantically distinct from the underlying columns. This removes the only case in which markVarForSelectPriv could be called with NULL for the RTE, so adjust the comments to describe that hack as being strictly internal to markRTEForSelectPriv. catversion bump required due to changes in stored rules. Discussion: https://postgr.es/m/7115.1577986646@sss.pgh.pa.us
2020-01-09 17:56:59 +01:00
/* If child col has been dropped, no need to assign a join colname */
if (real_colname == NULL)
{
colinfo->colnames[i] = NULL;
continue;
}
/* In an unnamed join, just report child column names as-is */
if (rte->alias == NULL)
{
colinfo->colnames[i] = real_colname;
continue;
}
/* If alias already assigned, that's what to use */
if (colname == NULL)
{
/* If user wrote an alias, prefer that over real column name */
if (rte->alias && i < list_length(rte->alias->colnames))
colname = strVal(list_nth(rte->alias->colnames, i));
else
colname = real_colname;
/* Unique-ify and insert into colinfo */
colname = make_colname_unique(colname, dpns, colinfo);
colinfo->colnames[i] = colname;
}
/* Remember if any assigned aliases differ from "real" name */
if (!changed_any && strcmp(colname, real_colname) != 0)
changed_any = true;
}
/*
* Calculate number of columns the join would have if it were re-parsed
* now, and create storage for the new_colnames and is_new_col arrays.
*
* Note: colname_is_unique will be consulting new_colnames[] during the
* loops below, so its not-yet-filled entries must be zeroes.
*/
nnewcolumns = leftcolinfo->num_new_cols + rightcolinfo->num_new_cols -
list_length(colinfo->usingNames);
colinfo->num_new_cols = nnewcolumns;
colinfo->new_colnames = (char **) palloc0(nnewcolumns * sizeof(char *));
colinfo->is_new_col = (bool *) palloc0(nnewcolumns * sizeof(bool));
/*
* Generating the new_colnames array is a bit tricky since any new columns
* added since parse time must be inserted in the right places. This code
* must match the parser, which will order a join's columns as merged
* columns first (in USING-clause order), then non-merged columns from the
* left input (in attnum order), then non-merged columns from the right
* input (ditto). If one of the inputs is itself a join, its columns will
* be ordered according to the same rule, which means newly-added columns
* might not be at the end. We can figure out what's what by consulting
* the leftattnos and rightattnos arrays plus the input is_new_col arrays.
*
* In these loops, i indexes leftattnos/rightattnos (so it's join varattno
* less one), j indexes new_colnames/is_new_col, and ic/jc have similar
* meanings for the current child RTE.
*/
/* Handle merged columns; they are first and can't be new */
i = j = 0;
while (i < noldcolumns &&
colinfo->leftattnos[i] != 0 &&
colinfo->rightattnos[i] != 0)
{
/* column name is already determined and known unique */
colinfo->new_colnames[j] = colinfo->colnames[i];
colinfo->is_new_col[j] = false;
/* build bitmapsets of child attnums of merged columns */
if (colinfo->leftattnos[i] > 0)
leftmerged = bms_add_member(leftmerged, colinfo->leftattnos[i]);
if (colinfo->rightattnos[i] > 0)
rightmerged = bms_add_member(rightmerged, colinfo->rightattnos[i]);
i++, j++;
}
/* Handle non-merged left-child columns */
ic = 0;
for (jc = 0; jc < leftcolinfo->num_new_cols; jc++)
{
char *child_colname = leftcolinfo->new_colnames[jc];
if (!leftcolinfo->is_new_col[jc])
{
/* Advance ic to next non-dropped old column of left child */
while (ic < leftcolinfo->num_cols &&
leftcolinfo->colnames[ic] == NULL)
ic++;
Assert(ic < leftcolinfo->num_cols);
ic++;
/* If it is a merged column, we already processed it */
if (bms_is_member(ic, leftmerged))
continue;
/* Else, advance i to the corresponding existing join column */
while (i < colinfo->num_cols &&
colinfo->colnames[i] == NULL)
i++;
Assert(i < colinfo->num_cols);
Assert(ic == colinfo->leftattnos[i]);
/* Use the already-assigned name of this column */
colinfo->new_colnames[j] = colinfo->colnames[i];
i++;
}
else
{
/*
* Unique-ify the new child column name and assign, unless we're
* in an unnamed join, in which case just copy
*/
if (rte->alias != NULL)
{
colinfo->new_colnames[j] =
make_colname_unique(child_colname, dpns, colinfo);
if (!changed_any &&
strcmp(colinfo->new_colnames[j], child_colname) != 0)
changed_any = true;
}
else
colinfo->new_colnames[j] = child_colname;
}
colinfo->is_new_col[j] = leftcolinfo->is_new_col[jc];
j++;
}
/* Handle non-merged right-child columns in exactly the same way */
ic = 0;
for (jc = 0; jc < rightcolinfo->num_new_cols; jc++)
{
char *child_colname = rightcolinfo->new_colnames[jc];
if (!rightcolinfo->is_new_col[jc])
{
/* Advance ic to next non-dropped old column of right child */
while (ic < rightcolinfo->num_cols &&
rightcolinfo->colnames[ic] == NULL)
ic++;
Assert(ic < rightcolinfo->num_cols);
ic++;
/* If it is a merged column, we already processed it */
if (bms_is_member(ic, rightmerged))
continue;
/* Else, advance i to the corresponding existing join column */
while (i < colinfo->num_cols &&
colinfo->colnames[i] == NULL)
i++;
Assert(i < colinfo->num_cols);
Assert(ic == colinfo->rightattnos[i]);
/* Use the already-assigned name of this column */
colinfo->new_colnames[j] = colinfo->colnames[i];
i++;
}
else
{
/*
* Unique-ify the new child column name and assign, unless we're
* in an unnamed join, in which case just copy
*/
if (rte->alias != NULL)
{
colinfo->new_colnames[j] =
make_colname_unique(child_colname, dpns, colinfo);
if (!changed_any &&
strcmp(colinfo->new_colnames[j], child_colname) != 0)
changed_any = true;
}
else
colinfo->new_colnames[j] = child_colname;
}
colinfo->is_new_col[j] = rightcolinfo->is_new_col[jc];
j++;
}
/* Assert we processed the right number of columns */
#ifdef USE_ASSERT_CHECKING
while (i < colinfo->num_cols && colinfo->colnames[i] == NULL)
i++;
Assert(i == colinfo->num_cols);
Assert(j == nnewcolumns);
#endif
/*
* For a named join, print column aliases if we changed any from the child
* names. Unnamed joins cannot print aliases.
*/
if (rte->alias != NULL)
colinfo->printaliases = changed_any;
else
colinfo->printaliases = false;
}
/*
* colname_is_unique: is colname distinct from already-chosen column names?
*
* dpns is query-wide info, colinfo is for the column's RTE
*/
static bool
colname_is_unique(const char *colname, deparse_namespace *dpns,
deparse_columns *colinfo)
{
int i;
ListCell *lc;
/* Check against already-assigned column aliases within RTE */
for (i = 0; i < colinfo->num_cols; i++)
{
char *oldname = colinfo->colnames[i];
if (oldname && strcmp(oldname, colname) == 0)
return false;
}
/*
* If we're building a new_colnames array, check that too (this will be
* partially but not completely redundant with the previous checks)
*/
for (i = 0; i < colinfo->num_new_cols; i++)
{
char *oldname = colinfo->new_colnames[i];
if (oldname && strcmp(oldname, colname) == 0)
return false;
}
/* Also check against USING-column names that must be globally unique */
foreach(lc, dpns->using_names)
{
char *oldname = (char *) lfirst(lc);
if (strcmp(oldname, colname) == 0)
return false;
}
/* Also check against names already assigned for parent-join USING cols */
foreach(lc, colinfo->parentUsing)
{
char *oldname = (char *) lfirst(lc);
if (strcmp(oldname, colname) == 0)
return false;
}
return true;
}
/*
* make_colname_unique: modify colname if necessary to make it unique
*
* dpns is query-wide info, colinfo is for the column's RTE
*/
static char *
make_colname_unique(char *colname, deparse_namespace *dpns,
deparse_columns *colinfo)
{
/*
Speed up ruleutils' name de-duplication code, and fix overlength-name case. Since commit 11e131854f8231a21613f834c40fe9d046926387, ruleutils.c has attempted to ensure that each RTE in a query or plan tree has a unique alias name. However, the code that was added for this could be quite slow, even as bad as O(N^3) if N identical RTE names must be replaced, as noted by Jeff Janes. Improve matters by building a transient hash table within set_rtable_names. The hash table in itself reduces the cost of detecting a duplicate from O(N) to O(1), and we can save another factor of N by storing the number of de-duplicated names already created for each entry, so that we don't have to re-try names already created. This way is probably a bit slower overall for small range tables, but almost by definition, such cases should not be a performance problem. In principle the same problem applies to the column-name-de-duplication code; but in practice that seems to be less of a problem, first because N is limited since we don't support extremely wide tables, and second because duplicate column names within an RTE are fairly rare, so that in practice the cost is more like O(N^2) not O(N^3). It would be very much messier to fix the column-name code, so for now I've left that alone. An independent problem in the same area was that the de-duplication code paid no attention to the identifier length limit, and would happily produce identifiers that were longer than NAMEDATALEN and wouldn't be unique after truncation to NAMEDATALEN. This could result in dump/reload failures, or perhaps even views that silently behaved differently than before. We can fix that by shortening the base name as needed. Fix it for both the relation and column name cases. In passing, check for interrupts in set_rtable_names, just in case it's still slow enough to be an issue. Back-patch to 9.3 where this code was introduced.
2015-11-16 19:45:17 +01:00
* If the selected name isn't unique, append digits to make it so. For a
* very long input name, we might have to truncate to stay within
* NAMEDATALEN.
*/
if (!colname_is_unique(colname, dpns, colinfo))
{
Speed up ruleutils' name de-duplication code, and fix overlength-name case. Since commit 11e131854f8231a21613f834c40fe9d046926387, ruleutils.c has attempted to ensure that each RTE in a query or plan tree has a unique alias name. However, the code that was added for this could be quite slow, even as bad as O(N^3) if N identical RTE names must be replaced, as noted by Jeff Janes. Improve matters by building a transient hash table within set_rtable_names. The hash table in itself reduces the cost of detecting a duplicate from O(N) to O(1), and we can save another factor of N by storing the number of de-duplicated names already created for each entry, so that we don't have to re-try names already created. This way is probably a bit slower overall for small range tables, but almost by definition, such cases should not be a performance problem. In principle the same problem applies to the column-name-de-duplication code; but in practice that seems to be less of a problem, first because N is limited since we don't support extremely wide tables, and second because duplicate column names within an RTE are fairly rare, so that in practice the cost is more like O(N^2) not O(N^3). It would be very much messier to fix the column-name code, so for now I've left that alone. An independent problem in the same area was that the de-duplication code paid no attention to the identifier length limit, and would happily produce identifiers that were longer than NAMEDATALEN and wouldn't be unique after truncation to NAMEDATALEN. This could result in dump/reload failures, or perhaps even views that silently behaved differently than before. We can fix that by shortening the base name as needed. Fix it for both the relation and column name cases. In passing, check for interrupts in set_rtable_names, just in case it's still slow enough to be an issue. Back-patch to 9.3 where this code was introduced.
2015-11-16 19:45:17 +01:00
int colnamelen = strlen(colname);
char *modname = (char *) palloc(colnamelen + 16);
int i = 0;
do
{
Speed up ruleutils' name de-duplication code, and fix overlength-name case. Since commit 11e131854f8231a21613f834c40fe9d046926387, ruleutils.c has attempted to ensure that each RTE in a query or plan tree has a unique alias name. However, the code that was added for this could be quite slow, even as bad as O(N^3) if N identical RTE names must be replaced, as noted by Jeff Janes. Improve matters by building a transient hash table within set_rtable_names. The hash table in itself reduces the cost of detecting a duplicate from O(N) to O(1), and we can save another factor of N by storing the number of de-duplicated names already created for each entry, so that we don't have to re-try names already created. This way is probably a bit slower overall for small range tables, but almost by definition, such cases should not be a performance problem. In principle the same problem applies to the column-name-de-duplication code; but in practice that seems to be less of a problem, first because N is limited since we don't support extremely wide tables, and second because duplicate column names within an RTE are fairly rare, so that in practice the cost is more like O(N^2) not O(N^3). It would be very much messier to fix the column-name code, so for now I've left that alone. An independent problem in the same area was that the de-duplication code paid no attention to the identifier length limit, and would happily produce identifiers that were longer than NAMEDATALEN and wouldn't be unique after truncation to NAMEDATALEN. This could result in dump/reload failures, or perhaps even views that silently behaved differently than before. We can fix that by shortening the base name as needed. Fix it for both the relation and column name cases. In passing, check for interrupts in set_rtable_names, just in case it's still slow enough to be an issue. Back-patch to 9.3 where this code was introduced.
2015-11-16 19:45:17 +01:00
i++;
for (;;)
{
memcpy(modname, colname, colnamelen);
sprintf(modname + colnamelen, "_%d", i);
if (strlen(modname) < NAMEDATALEN)
break;
/* drop chars from colname to keep all the digits */
colnamelen = pg_mbcliplen(colname, colnamelen,
colnamelen - 1);
}
} while (!colname_is_unique(modname, dpns, colinfo));
colname = modname;
}
return colname;
}
/*
* expand_colnames_array_to: make colinfo->colnames at least n items long
*
* Any added array entries are initialized to zero.
*/
static void
expand_colnames_array_to(deparse_columns *colinfo, int n)
{
if (n > colinfo->num_cols)
{
if (colinfo->colnames == NULL)
colinfo->colnames = palloc0_array(char *, n);
else
colinfo->colnames = repalloc0_array(colinfo->colnames, char *, colinfo->num_cols, n);
colinfo->num_cols = n;
}
}
/*
* identify_join_columns: figure out where columns of a join come from
*
* Fills the join-specific fields of the colinfo struct, except for
* usingNames which is filled later.
*/
static void
identify_join_columns(JoinExpr *j, RangeTblEntry *jrte,
deparse_columns *colinfo)
{
int numjoincols;
Reconsider the representation of join alias Vars. The core idea of this patch is to make the parser generate join alias Vars (that is, ones with varno pointing to a JOIN RTE) only when the alias Var is actually different from any raw join input, that is a type coercion and/or COALESCE is necessary to generate the join output value. Otherwise just generate varno/varattno pointing to the relevant join input column. In effect, this means that the planner's flatten_join_alias_vars() transformation is already done in the parser, for all cases except (a) columns that are merged by JOIN USING and are transformed in the process, and (b) whole-row join Vars. In principle that would allow us to skip doing flatten_join_alias_vars() in many more queries than we do now, but we don't have quite enough infrastructure to know that we can do so --- in particular there's no cheap way to know whether there are any whole-row join Vars. I'm not sure if it's worth the trouble to add a Query-level flag for that, and in any case it seems like fit material for a separate patch. But even without skipping the work entirely, this should make flatten_join_alias_vars() faster, particularly where there are nested joins that it previously had to flatten recursively. An essential part of this change is to replace Var nodes' varnoold/varoattno fields with varnosyn/varattnosyn, which have considerably more tightly-defined meanings than the old fields: when they differ from varno/varattno, they identify the Var's position in an aliased JOIN RTE, and the join alias is what ruleutils.c should print for the Var. This is necessary because the varno change destroyed ruleutils.c's ability to find the JOIN RTE from the Var's varno. Another way in which this change broke ruleutils.c is that it's no longer feasible to determine, from a JOIN RTE's joinaliasvars list, which join columns correspond to which columns of the join's immediate input relations. (If those are sub-joins, the joinaliasvars entries may point to columns of their base relations, not the sub-joins.) But that was a horrid mess requiring a lot of fragile assumptions already, so let's just bite the bullet and add some more JOIN RTE fields to make it more straightforward to figure that out. I added two integer-List fields containing the relevant column numbers from the left and right input rels, plus a count of how many merged columns there are. This patch depends on the ParseNamespaceColumn infrastructure that I added in commit 5815696bc. The biggest bit of code change is restructuring transformFromClauseItem's handling of JOINs so that the ParseNamespaceColumn data is propagated upward correctly. Other than that and the ruleutils fixes, everything pretty much just works, though some processing is now inessential. I grabbed two pieces of low-hanging fruit in that line: 1. In find_expr_references, we don't need to recurse into join alias Vars anymore. There aren't any except for references to merged USING columns, which are more properly handled when we scan the join's RTE. This change actually fixes an edge-case issue: we will now record a dependency on any type-coercion function present in a USING column's joinaliasvar, even if that join column has no references in the query text. The odds of the missing dependency causing a problem seem quite small: you'd have to posit somebody dropping an implicit cast between two data types, without removing the types themselves, and then having a stored rule containing a whole-row Var for a join whose USING merge depends on that cast. So I don't feel a great need to change this in the back branches. But in theory this way is more correct. 2. markRTEForSelectPriv and markTargetListOrigin don't need to recurse into join alias Vars either, because the cases they care about don't apply to alias Vars for USING columns that are semantically distinct from the underlying columns. This removes the only case in which markVarForSelectPriv could be called with NULL for the RTE, so adjust the comments to describe that hack as being strictly internal to markRTEForSelectPriv. catversion bump required due to changes in stored rules. Discussion: https://postgr.es/m/7115.1577986646@sss.pgh.pa.us
2020-01-09 17:56:59 +01:00
int jcolno;
int rcolno;
ListCell *lc;
/* Extract left/right child RT indexes */
if (IsA(j->larg, RangeTblRef))
colinfo->leftrti = ((RangeTblRef *) j->larg)->rtindex;
else if (IsA(j->larg, JoinExpr))
colinfo->leftrti = ((JoinExpr *) j->larg)->rtindex;
else
elog(ERROR, "unrecognized node type in jointree: %d",
(int) nodeTag(j->larg));
if (IsA(j->rarg, RangeTblRef))
colinfo->rightrti = ((RangeTblRef *) j->rarg)->rtindex;
else if (IsA(j->rarg, JoinExpr))
colinfo->rightrti = ((JoinExpr *) j->rarg)->rtindex;
else
elog(ERROR, "unrecognized node type in jointree: %d",
(int) nodeTag(j->rarg));
/* Assert children will be processed earlier than join in second pass */
Assert(colinfo->leftrti < j->rtindex);
Assert(colinfo->rightrti < j->rtindex);
/* Initialize result arrays with zeroes */
numjoincols = list_length(jrte->joinaliasvars);
Assert(numjoincols == list_length(jrte->eref->colnames));
colinfo->leftattnos = (int *) palloc0(numjoincols * sizeof(int));
colinfo->rightattnos = (int *) palloc0(numjoincols * sizeof(int));
/*
Reconsider the representation of join alias Vars. The core idea of this patch is to make the parser generate join alias Vars (that is, ones with varno pointing to a JOIN RTE) only when the alias Var is actually different from any raw join input, that is a type coercion and/or COALESCE is necessary to generate the join output value. Otherwise just generate varno/varattno pointing to the relevant join input column. In effect, this means that the planner's flatten_join_alias_vars() transformation is already done in the parser, for all cases except (a) columns that are merged by JOIN USING and are transformed in the process, and (b) whole-row join Vars. In principle that would allow us to skip doing flatten_join_alias_vars() in many more queries than we do now, but we don't have quite enough infrastructure to know that we can do so --- in particular there's no cheap way to know whether there are any whole-row join Vars. I'm not sure if it's worth the trouble to add a Query-level flag for that, and in any case it seems like fit material for a separate patch. But even without skipping the work entirely, this should make flatten_join_alias_vars() faster, particularly where there are nested joins that it previously had to flatten recursively. An essential part of this change is to replace Var nodes' varnoold/varoattno fields with varnosyn/varattnosyn, which have considerably more tightly-defined meanings than the old fields: when they differ from varno/varattno, they identify the Var's position in an aliased JOIN RTE, and the join alias is what ruleutils.c should print for the Var. This is necessary because the varno change destroyed ruleutils.c's ability to find the JOIN RTE from the Var's varno. Another way in which this change broke ruleutils.c is that it's no longer feasible to determine, from a JOIN RTE's joinaliasvars list, which join columns correspond to which columns of the join's immediate input relations. (If those are sub-joins, the joinaliasvars entries may point to columns of their base relations, not the sub-joins.) But that was a horrid mess requiring a lot of fragile assumptions already, so let's just bite the bullet and add some more JOIN RTE fields to make it more straightforward to figure that out. I added two integer-List fields containing the relevant column numbers from the left and right input rels, plus a count of how many merged columns there are. This patch depends on the ParseNamespaceColumn infrastructure that I added in commit 5815696bc. The biggest bit of code change is restructuring transformFromClauseItem's handling of JOINs so that the ParseNamespaceColumn data is propagated upward correctly. Other than that and the ruleutils fixes, everything pretty much just works, though some processing is now inessential. I grabbed two pieces of low-hanging fruit in that line: 1. In find_expr_references, we don't need to recurse into join alias Vars anymore. There aren't any except for references to merged USING columns, which are more properly handled when we scan the join's RTE. This change actually fixes an edge-case issue: we will now record a dependency on any type-coercion function present in a USING column's joinaliasvar, even if that join column has no references in the query text. The odds of the missing dependency causing a problem seem quite small: you'd have to posit somebody dropping an implicit cast between two data types, without removing the types themselves, and then having a stored rule containing a whole-row Var for a join whose USING merge depends on that cast. So I don't feel a great need to change this in the back branches. But in theory this way is more correct. 2. markRTEForSelectPriv and markTargetListOrigin don't need to recurse into join alias Vars either, because the cases they care about don't apply to alias Vars for USING columns that are semantically distinct from the underlying columns. This removes the only case in which markVarForSelectPriv could be called with NULL for the RTE, so adjust the comments to describe that hack as being strictly internal to markRTEForSelectPriv. catversion bump required due to changes in stored rules. Discussion: https://postgr.es/m/7115.1577986646@sss.pgh.pa.us
2020-01-09 17:56:59 +01:00
* Deconstruct RTE's joinleftcols/joinrightcols into desired format.
* Recall that the column(s) merged due to USING are the first column(s)
* of the join output. We need not do anything special while scanning
* joinleftcols, but while scanning joinrightcols we must distinguish
* merged from unmerged columns.
*/
Reconsider the representation of join alias Vars. The core idea of this patch is to make the parser generate join alias Vars (that is, ones with varno pointing to a JOIN RTE) only when the alias Var is actually different from any raw join input, that is a type coercion and/or COALESCE is necessary to generate the join output value. Otherwise just generate varno/varattno pointing to the relevant join input column. In effect, this means that the planner's flatten_join_alias_vars() transformation is already done in the parser, for all cases except (a) columns that are merged by JOIN USING and are transformed in the process, and (b) whole-row join Vars. In principle that would allow us to skip doing flatten_join_alias_vars() in many more queries than we do now, but we don't have quite enough infrastructure to know that we can do so --- in particular there's no cheap way to know whether there are any whole-row join Vars. I'm not sure if it's worth the trouble to add a Query-level flag for that, and in any case it seems like fit material for a separate patch. But even without skipping the work entirely, this should make flatten_join_alias_vars() faster, particularly where there are nested joins that it previously had to flatten recursively. An essential part of this change is to replace Var nodes' varnoold/varoattno fields with varnosyn/varattnosyn, which have considerably more tightly-defined meanings than the old fields: when they differ from varno/varattno, they identify the Var's position in an aliased JOIN RTE, and the join alias is what ruleutils.c should print for the Var. This is necessary because the varno change destroyed ruleutils.c's ability to find the JOIN RTE from the Var's varno. Another way in which this change broke ruleutils.c is that it's no longer feasible to determine, from a JOIN RTE's joinaliasvars list, which join columns correspond to which columns of the join's immediate input relations. (If those are sub-joins, the joinaliasvars entries may point to columns of their base relations, not the sub-joins.) But that was a horrid mess requiring a lot of fragile assumptions already, so let's just bite the bullet and add some more JOIN RTE fields to make it more straightforward to figure that out. I added two integer-List fields containing the relevant column numbers from the left and right input rels, plus a count of how many merged columns there are. This patch depends on the ParseNamespaceColumn infrastructure that I added in commit 5815696bc. The biggest bit of code change is restructuring transformFromClauseItem's handling of JOINs so that the ParseNamespaceColumn data is propagated upward correctly. Other than that and the ruleutils fixes, everything pretty much just works, though some processing is now inessential. I grabbed two pieces of low-hanging fruit in that line: 1. In find_expr_references, we don't need to recurse into join alias Vars anymore. There aren't any except for references to merged USING columns, which are more properly handled when we scan the join's RTE. This change actually fixes an edge-case issue: we will now record a dependency on any type-coercion function present in a USING column's joinaliasvar, even if that join column has no references in the query text. The odds of the missing dependency causing a problem seem quite small: you'd have to posit somebody dropping an implicit cast between two data types, without removing the types themselves, and then having a stored rule containing a whole-row Var for a join whose USING merge depends on that cast. So I don't feel a great need to change this in the back branches. But in theory this way is more correct. 2. markRTEForSelectPriv and markTargetListOrigin don't need to recurse into join alias Vars either, because the cases they care about don't apply to alias Vars for USING columns that are semantically distinct from the underlying columns. This removes the only case in which markVarForSelectPriv could be called with NULL for the RTE, so adjust the comments to describe that hack as being strictly internal to markRTEForSelectPriv. catversion bump required due to changes in stored rules. Discussion: https://postgr.es/m/7115.1577986646@sss.pgh.pa.us
2020-01-09 17:56:59 +01:00
jcolno = 0;
foreach(lc, jrte->joinleftcols)
{
Reconsider the representation of join alias Vars. The core idea of this patch is to make the parser generate join alias Vars (that is, ones with varno pointing to a JOIN RTE) only when the alias Var is actually different from any raw join input, that is a type coercion and/or COALESCE is necessary to generate the join output value. Otherwise just generate varno/varattno pointing to the relevant join input column. In effect, this means that the planner's flatten_join_alias_vars() transformation is already done in the parser, for all cases except (a) columns that are merged by JOIN USING and are transformed in the process, and (b) whole-row join Vars. In principle that would allow us to skip doing flatten_join_alias_vars() in many more queries than we do now, but we don't have quite enough infrastructure to know that we can do so --- in particular there's no cheap way to know whether there are any whole-row join Vars. I'm not sure if it's worth the trouble to add a Query-level flag for that, and in any case it seems like fit material for a separate patch. But even without skipping the work entirely, this should make flatten_join_alias_vars() faster, particularly where there are nested joins that it previously had to flatten recursively. An essential part of this change is to replace Var nodes' varnoold/varoattno fields with varnosyn/varattnosyn, which have considerably more tightly-defined meanings than the old fields: when they differ from varno/varattno, they identify the Var's position in an aliased JOIN RTE, and the join alias is what ruleutils.c should print for the Var. This is necessary because the varno change destroyed ruleutils.c's ability to find the JOIN RTE from the Var's varno. Another way in which this change broke ruleutils.c is that it's no longer feasible to determine, from a JOIN RTE's joinaliasvars list, which join columns correspond to which columns of the join's immediate input relations. (If those are sub-joins, the joinaliasvars entries may point to columns of their base relations, not the sub-joins.) But that was a horrid mess requiring a lot of fragile assumptions already, so let's just bite the bullet and add some more JOIN RTE fields to make it more straightforward to figure that out. I added two integer-List fields containing the relevant column numbers from the left and right input rels, plus a count of how many merged columns there are. This patch depends on the ParseNamespaceColumn infrastructure that I added in commit 5815696bc. The biggest bit of code change is restructuring transformFromClauseItem's handling of JOINs so that the ParseNamespaceColumn data is propagated upward correctly. Other than that and the ruleutils fixes, everything pretty much just works, though some processing is now inessential. I grabbed two pieces of low-hanging fruit in that line: 1. In find_expr_references, we don't need to recurse into join alias Vars anymore. There aren't any except for references to merged USING columns, which are more properly handled when we scan the join's RTE. This change actually fixes an edge-case issue: we will now record a dependency on any type-coercion function present in a USING column's joinaliasvar, even if that join column has no references in the query text. The odds of the missing dependency causing a problem seem quite small: you'd have to posit somebody dropping an implicit cast between two data types, without removing the types themselves, and then having a stored rule containing a whole-row Var for a join whose USING merge depends on that cast. So I don't feel a great need to change this in the back branches. But in theory this way is more correct. 2. markRTEForSelectPriv and markTargetListOrigin don't need to recurse into join alias Vars either, because the cases they care about don't apply to alias Vars for USING columns that are semantically distinct from the underlying columns. This removes the only case in which markVarForSelectPriv could be called with NULL for the RTE, so adjust the comments to describe that hack as being strictly internal to markRTEForSelectPriv. catversion bump required due to changes in stored rules. Discussion: https://postgr.es/m/7115.1577986646@sss.pgh.pa.us
2020-01-09 17:56:59 +01:00
int leftattno = lfirst_int(lc);
Reconsider the representation of join alias Vars. The core idea of this patch is to make the parser generate join alias Vars (that is, ones with varno pointing to a JOIN RTE) only when the alias Var is actually different from any raw join input, that is a type coercion and/or COALESCE is necessary to generate the join output value. Otherwise just generate varno/varattno pointing to the relevant join input column. In effect, this means that the planner's flatten_join_alias_vars() transformation is already done in the parser, for all cases except (a) columns that are merged by JOIN USING and are transformed in the process, and (b) whole-row join Vars. In principle that would allow us to skip doing flatten_join_alias_vars() in many more queries than we do now, but we don't have quite enough infrastructure to know that we can do so --- in particular there's no cheap way to know whether there are any whole-row join Vars. I'm not sure if it's worth the trouble to add a Query-level flag for that, and in any case it seems like fit material for a separate patch. But even without skipping the work entirely, this should make flatten_join_alias_vars() faster, particularly where there are nested joins that it previously had to flatten recursively. An essential part of this change is to replace Var nodes' varnoold/varoattno fields with varnosyn/varattnosyn, which have considerably more tightly-defined meanings than the old fields: when they differ from varno/varattno, they identify the Var's position in an aliased JOIN RTE, and the join alias is what ruleutils.c should print for the Var. This is necessary because the varno change destroyed ruleutils.c's ability to find the JOIN RTE from the Var's varno. Another way in which this change broke ruleutils.c is that it's no longer feasible to determine, from a JOIN RTE's joinaliasvars list, which join columns correspond to which columns of the join's immediate input relations. (If those are sub-joins, the joinaliasvars entries may point to columns of their base relations, not the sub-joins.) But that was a horrid mess requiring a lot of fragile assumptions already, so let's just bite the bullet and add some more JOIN RTE fields to make it more straightforward to figure that out. I added two integer-List fields containing the relevant column numbers from the left and right input rels, plus a count of how many merged columns there are. This patch depends on the ParseNamespaceColumn infrastructure that I added in commit 5815696bc. The biggest bit of code change is restructuring transformFromClauseItem's handling of JOINs so that the ParseNamespaceColumn data is propagated upward correctly. Other than that and the ruleutils fixes, everything pretty much just works, though some processing is now inessential. I grabbed two pieces of low-hanging fruit in that line: 1. In find_expr_references, we don't need to recurse into join alias Vars anymore. There aren't any except for references to merged USING columns, which are more properly handled when we scan the join's RTE. This change actually fixes an edge-case issue: we will now record a dependency on any type-coercion function present in a USING column's joinaliasvar, even if that join column has no references in the query text. The odds of the missing dependency causing a problem seem quite small: you'd have to posit somebody dropping an implicit cast between two data types, without removing the types themselves, and then having a stored rule containing a whole-row Var for a join whose USING merge depends on that cast. So I don't feel a great need to change this in the back branches. But in theory this way is more correct. 2. markRTEForSelectPriv and markTargetListOrigin don't need to recurse into join alias Vars either, because the cases they care about don't apply to alias Vars for USING columns that are semantically distinct from the underlying columns. This removes the only case in which markVarForSelectPriv could be called with NULL for the RTE, so adjust the comments to describe that hack as being strictly internal to markRTEForSelectPriv. catversion bump required due to changes in stored rules. Discussion: https://postgr.es/m/7115.1577986646@sss.pgh.pa.us
2020-01-09 17:56:59 +01:00
colinfo->leftattnos[jcolno++] = leftattno;
}
Reconsider the representation of join alias Vars. The core idea of this patch is to make the parser generate join alias Vars (that is, ones with varno pointing to a JOIN RTE) only when the alias Var is actually different from any raw join input, that is a type coercion and/or COALESCE is necessary to generate the join output value. Otherwise just generate varno/varattno pointing to the relevant join input column. In effect, this means that the planner's flatten_join_alias_vars() transformation is already done in the parser, for all cases except (a) columns that are merged by JOIN USING and are transformed in the process, and (b) whole-row join Vars. In principle that would allow us to skip doing flatten_join_alias_vars() in many more queries than we do now, but we don't have quite enough infrastructure to know that we can do so --- in particular there's no cheap way to know whether there are any whole-row join Vars. I'm not sure if it's worth the trouble to add a Query-level flag for that, and in any case it seems like fit material for a separate patch. But even without skipping the work entirely, this should make flatten_join_alias_vars() faster, particularly where there are nested joins that it previously had to flatten recursively. An essential part of this change is to replace Var nodes' varnoold/varoattno fields with varnosyn/varattnosyn, which have considerably more tightly-defined meanings than the old fields: when they differ from varno/varattno, they identify the Var's position in an aliased JOIN RTE, and the join alias is what ruleutils.c should print for the Var. This is necessary because the varno change destroyed ruleutils.c's ability to find the JOIN RTE from the Var's varno. Another way in which this change broke ruleutils.c is that it's no longer feasible to determine, from a JOIN RTE's joinaliasvars list, which join columns correspond to which columns of the join's immediate input relations. (If those are sub-joins, the joinaliasvars entries may point to columns of their base relations, not the sub-joins.) But that was a horrid mess requiring a lot of fragile assumptions already, so let's just bite the bullet and add some more JOIN RTE fields to make it more straightforward to figure that out. I added two integer-List fields containing the relevant column numbers from the left and right input rels, plus a count of how many merged columns there are. This patch depends on the ParseNamespaceColumn infrastructure that I added in commit 5815696bc. The biggest bit of code change is restructuring transformFromClauseItem's handling of JOINs so that the ParseNamespaceColumn data is propagated upward correctly. Other than that and the ruleutils fixes, everything pretty much just works, though some processing is now inessential. I grabbed two pieces of low-hanging fruit in that line: 1. In find_expr_references, we don't need to recurse into join alias Vars anymore. There aren't any except for references to merged USING columns, which are more properly handled when we scan the join's RTE. This change actually fixes an edge-case issue: we will now record a dependency on any type-coercion function present in a USING column's joinaliasvar, even if that join column has no references in the query text. The odds of the missing dependency causing a problem seem quite small: you'd have to posit somebody dropping an implicit cast between two data types, without removing the types themselves, and then having a stored rule containing a whole-row Var for a join whose USING merge depends on that cast. So I don't feel a great need to change this in the back branches. But in theory this way is more correct. 2. markRTEForSelectPriv and markTargetListOrigin don't need to recurse into join alias Vars either, because the cases they care about don't apply to alias Vars for USING columns that are semantically distinct from the underlying columns. This removes the only case in which markVarForSelectPriv could be called with NULL for the RTE, so adjust the comments to describe that hack as being strictly internal to markRTEForSelectPriv. catversion bump required due to changes in stored rules. Discussion: https://postgr.es/m/7115.1577986646@sss.pgh.pa.us
2020-01-09 17:56:59 +01:00
rcolno = 0;
foreach(lc, jrte->joinrightcols)
{
Reconsider the representation of join alias Vars. The core idea of this patch is to make the parser generate join alias Vars (that is, ones with varno pointing to a JOIN RTE) only when the alias Var is actually different from any raw join input, that is a type coercion and/or COALESCE is necessary to generate the join output value. Otherwise just generate varno/varattno pointing to the relevant join input column. In effect, this means that the planner's flatten_join_alias_vars() transformation is already done in the parser, for all cases except (a) columns that are merged by JOIN USING and are transformed in the process, and (b) whole-row join Vars. In principle that would allow us to skip doing flatten_join_alias_vars() in many more queries than we do now, but we don't have quite enough infrastructure to know that we can do so --- in particular there's no cheap way to know whether there are any whole-row join Vars. I'm not sure if it's worth the trouble to add a Query-level flag for that, and in any case it seems like fit material for a separate patch. But even without skipping the work entirely, this should make flatten_join_alias_vars() faster, particularly where there are nested joins that it previously had to flatten recursively. An essential part of this change is to replace Var nodes' varnoold/varoattno fields with varnosyn/varattnosyn, which have considerably more tightly-defined meanings than the old fields: when they differ from varno/varattno, they identify the Var's position in an aliased JOIN RTE, and the join alias is what ruleutils.c should print for the Var. This is necessary because the varno change destroyed ruleutils.c's ability to find the JOIN RTE from the Var's varno. Another way in which this change broke ruleutils.c is that it's no longer feasible to determine, from a JOIN RTE's joinaliasvars list, which join columns correspond to which columns of the join's immediate input relations. (If those are sub-joins, the joinaliasvars entries may point to columns of their base relations, not the sub-joins.) But that was a horrid mess requiring a lot of fragile assumptions already, so let's just bite the bullet and add some more JOIN RTE fields to make it more straightforward to figure that out. I added two integer-List fields containing the relevant column numbers from the left and right input rels, plus a count of how many merged columns there are. This patch depends on the ParseNamespaceColumn infrastructure that I added in commit 5815696bc. The biggest bit of code change is restructuring transformFromClauseItem's handling of JOINs so that the ParseNamespaceColumn data is propagated upward correctly. Other than that and the ruleutils fixes, everything pretty much just works, though some processing is now inessential. I grabbed two pieces of low-hanging fruit in that line: 1. In find_expr_references, we don't need to recurse into join alias Vars anymore. There aren't any except for references to merged USING columns, which are more properly handled when we scan the join's RTE. This change actually fixes an edge-case issue: we will now record a dependency on any type-coercion function present in a USING column's joinaliasvar, even if that join column has no references in the query text. The odds of the missing dependency causing a problem seem quite small: you'd have to posit somebody dropping an implicit cast between two data types, without removing the types themselves, and then having a stored rule containing a whole-row Var for a join whose USING merge depends on that cast. So I don't feel a great need to change this in the back branches. But in theory this way is more correct. 2. markRTEForSelectPriv and markTargetListOrigin don't need to recurse into join alias Vars either, because the cases they care about don't apply to alias Vars for USING columns that are semantically distinct from the underlying columns. This removes the only case in which markVarForSelectPriv could be called with NULL for the RTE, so adjust the comments to describe that hack as being strictly internal to markRTEForSelectPriv. catversion bump required due to changes in stored rules. Discussion: https://postgr.es/m/7115.1577986646@sss.pgh.pa.us
2020-01-09 17:56:59 +01:00
int rightattno = lfirst_int(lc);
Reconsider the representation of join alias Vars. The core idea of this patch is to make the parser generate join alias Vars (that is, ones with varno pointing to a JOIN RTE) only when the alias Var is actually different from any raw join input, that is a type coercion and/or COALESCE is necessary to generate the join output value. Otherwise just generate varno/varattno pointing to the relevant join input column. In effect, this means that the planner's flatten_join_alias_vars() transformation is already done in the parser, for all cases except (a) columns that are merged by JOIN USING and are transformed in the process, and (b) whole-row join Vars. In principle that would allow us to skip doing flatten_join_alias_vars() in many more queries than we do now, but we don't have quite enough infrastructure to know that we can do so --- in particular there's no cheap way to know whether there are any whole-row join Vars. I'm not sure if it's worth the trouble to add a Query-level flag for that, and in any case it seems like fit material for a separate patch. But even without skipping the work entirely, this should make flatten_join_alias_vars() faster, particularly where there are nested joins that it previously had to flatten recursively. An essential part of this change is to replace Var nodes' varnoold/varoattno fields with varnosyn/varattnosyn, which have considerably more tightly-defined meanings than the old fields: when they differ from varno/varattno, they identify the Var's position in an aliased JOIN RTE, and the join alias is what ruleutils.c should print for the Var. This is necessary because the varno change destroyed ruleutils.c's ability to find the JOIN RTE from the Var's varno. Another way in which this change broke ruleutils.c is that it's no longer feasible to determine, from a JOIN RTE's joinaliasvars list, which join columns correspond to which columns of the join's immediate input relations. (If those are sub-joins, the joinaliasvars entries may point to columns of their base relations, not the sub-joins.) But that was a horrid mess requiring a lot of fragile assumptions already, so let's just bite the bullet and add some more JOIN RTE fields to make it more straightforward to figure that out. I added two integer-List fields containing the relevant column numbers from the left and right input rels, plus a count of how many merged columns there are. This patch depends on the ParseNamespaceColumn infrastructure that I added in commit 5815696bc. The biggest bit of code change is restructuring transformFromClauseItem's handling of JOINs so that the ParseNamespaceColumn data is propagated upward correctly. Other than that and the ruleutils fixes, everything pretty much just works, though some processing is now inessential. I grabbed two pieces of low-hanging fruit in that line: 1. In find_expr_references, we don't need to recurse into join alias Vars anymore. There aren't any except for references to merged USING columns, which are more properly handled when we scan the join's RTE. This change actually fixes an edge-case issue: we will now record a dependency on any type-coercion function present in a USING column's joinaliasvar, even if that join column has no references in the query text. The odds of the missing dependency causing a problem seem quite small: you'd have to posit somebody dropping an implicit cast between two data types, without removing the types themselves, and then having a stored rule containing a whole-row Var for a join whose USING merge depends on that cast. So I don't feel a great need to change this in the back branches. But in theory this way is more correct. 2. markRTEForSelectPriv and markTargetListOrigin don't need to recurse into join alias Vars either, because the cases they care about don't apply to alias Vars for USING columns that are semantically distinct from the underlying columns. This removes the only case in which markVarForSelectPriv could be called with NULL for the RTE, so adjust the comments to describe that hack as being strictly internal to markRTEForSelectPriv. catversion bump required due to changes in stored rules. Discussion: https://postgr.es/m/7115.1577986646@sss.pgh.pa.us
2020-01-09 17:56:59 +01:00
if (rcolno < jrte->joinmergedcols) /* merged column? */
colinfo->rightattnos[rcolno] = rightattno;
else
Reconsider the representation of join alias Vars. The core idea of this patch is to make the parser generate join alias Vars (that is, ones with varno pointing to a JOIN RTE) only when the alias Var is actually different from any raw join input, that is a type coercion and/or COALESCE is necessary to generate the join output value. Otherwise just generate varno/varattno pointing to the relevant join input column. In effect, this means that the planner's flatten_join_alias_vars() transformation is already done in the parser, for all cases except (a) columns that are merged by JOIN USING and are transformed in the process, and (b) whole-row join Vars. In principle that would allow us to skip doing flatten_join_alias_vars() in many more queries than we do now, but we don't have quite enough infrastructure to know that we can do so --- in particular there's no cheap way to know whether there are any whole-row join Vars. I'm not sure if it's worth the trouble to add a Query-level flag for that, and in any case it seems like fit material for a separate patch. But even without skipping the work entirely, this should make flatten_join_alias_vars() faster, particularly where there are nested joins that it previously had to flatten recursively. An essential part of this change is to replace Var nodes' varnoold/varoattno fields with varnosyn/varattnosyn, which have considerably more tightly-defined meanings than the old fields: when they differ from varno/varattno, they identify the Var's position in an aliased JOIN RTE, and the join alias is what ruleutils.c should print for the Var. This is necessary because the varno change destroyed ruleutils.c's ability to find the JOIN RTE from the Var's varno. Another way in which this change broke ruleutils.c is that it's no longer feasible to determine, from a JOIN RTE's joinaliasvars list, which join columns correspond to which columns of the join's immediate input relations. (If those are sub-joins, the joinaliasvars entries may point to columns of their base relations, not the sub-joins.) But that was a horrid mess requiring a lot of fragile assumptions already, so let's just bite the bullet and add some more JOIN RTE fields to make it more straightforward to figure that out. I added two integer-List fields containing the relevant column numbers from the left and right input rels, plus a count of how many merged columns there are. This patch depends on the ParseNamespaceColumn infrastructure that I added in commit 5815696bc. The biggest bit of code change is restructuring transformFromClauseItem's handling of JOINs so that the ParseNamespaceColumn data is propagated upward correctly. Other than that and the ruleutils fixes, everything pretty much just works, though some processing is now inessential. I grabbed two pieces of low-hanging fruit in that line: 1. In find_expr_references, we don't need to recurse into join alias Vars anymore. There aren't any except for references to merged USING columns, which are more properly handled when we scan the join's RTE. This change actually fixes an edge-case issue: we will now record a dependency on any type-coercion function present in a USING column's joinaliasvar, even if that join column has no references in the query text. The odds of the missing dependency causing a problem seem quite small: you'd have to posit somebody dropping an implicit cast between two data types, without removing the types themselves, and then having a stored rule containing a whole-row Var for a join whose USING merge depends on that cast. So I don't feel a great need to change this in the back branches. But in theory this way is more correct. 2. markRTEForSelectPriv and markTargetListOrigin don't need to recurse into join alias Vars either, because the cases they care about don't apply to alias Vars for USING columns that are semantically distinct from the underlying columns. This removes the only case in which markVarForSelectPriv could be called with NULL for the RTE, so adjust the comments to describe that hack as being strictly internal to markRTEForSelectPriv. catversion bump required due to changes in stored rules. Discussion: https://postgr.es/m/7115.1577986646@sss.pgh.pa.us
2020-01-09 17:56:59 +01:00
colinfo->rightattnos[jcolno++] = rightattno;
rcolno++;
}
Reconsider the representation of join alias Vars. The core idea of this patch is to make the parser generate join alias Vars (that is, ones with varno pointing to a JOIN RTE) only when the alias Var is actually different from any raw join input, that is a type coercion and/or COALESCE is necessary to generate the join output value. Otherwise just generate varno/varattno pointing to the relevant join input column. In effect, this means that the planner's flatten_join_alias_vars() transformation is already done in the parser, for all cases except (a) columns that are merged by JOIN USING and are transformed in the process, and (b) whole-row join Vars. In principle that would allow us to skip doing flatten_join_alias_vars() in many more queries than we do now, but we don't have quite enough infrastructure to know that we can do so --- in particular there's no cheap way to know whether there are any whole-row join Vars. I'm not sure if it's worth the trouble to add a Query-level flag for that, and in any case it seems like fit material for a separate patch. But even without skipping the work entirely, this should make flatten_join_alias_vars() faster, particularly where there are nested joins that it previously had to flatten recursively. An essential part of this change is to replace Var nodes' varnoold/varoattno fields with varnosyn/varattnosyn, which have considerably more tightly-defined meanings than the old fields: when they differ from varno/varattno, they identify the Var's position in an aliased JOIN RTE, and the join alias is what ruleutils.c should print for the Var. This is necessary because the varno change destroyed ruleutils.c's ability to find the JOIN RTE from the Var's varno. Another way in which this change broke ruleutils.c is that it's no longer feasible to determine, from a JOIN RTE's joinaliasvars list, which join columns correspond to which columns of the join's immediate input relations. (If those are sub-joins, the joinaliasvars entries may point to columns of their base relations, not the sub-joins.) But that was a horrid mess requiring a lot of fragile assumptions already, so let's just bite the bullet and add some more JOIN RTE fields to make it more straightforward to figure that out. I added two integer-List fields containing the relevant column numbers from the left and right input rels, plus a count of how many merged columns there are. This patch depends on the ParseNamespaceColumn infrastructure that I added in commit 5815696bc. The biggest bit of code change is restructuring transformFromClauseItem's handling of JOINs so that the ParseNamespaceColumn data is propagated upward correctly. Other than that and the ruleutils fixes, everything pretty much just works, though some processing is now inessential. I grabbed two pieces of low-hanging fruit in that line: 1. In find_expr_references, we don't need to recurse into join alias Vars anymore. There aren't any except for references to merged USING columns, which are more properly handled when we scan the join's RTE. This change actually fixes an edge-case issue: we will now record a dependency on any type-coercion function present in a USING column's joinaliasvar, even if that join column has no references in the query text. The odds of the missing dependency causing a problem seem quite small: you'd have to posit somebody dropping an implicit cast between two data types, without removing the types themselves, and then having a stored rule containing a whole-row Var for a join whose USING merge depends on that cast. So I don't feel a great need to change this in the back branches. But in theory this way is more correct. 2. markRTEForSelectPriv and markTargetListOrigin don't need to recurse into join alias Vars either, because the cases they care about don't apply to alias Vars for USING columns that are semantically distinct from the underlying columns. This removes the only case in which markVarForSelectPriv could be called with NULL for the RTE, so adjust the comments to describe that hack as being strictly internal to markRTEForSelectPriv. catversion bump required due to changes in stored rules. Discussion: https://postgr.es/m/7115.1577986646@sss.pgh.pa.us
2020-01-09 17:56:59 +01:00
Assert(jcolno == numjoincols);
}
Improve ruleutils.c's heuristics for dealing with rangetable aliases. The previous scheme had bugs in some corner cases involving tables that had been renamed since a view was made. This could result in dumped views that failed to reload or reloaded incorrectly, as seen in bug #7553 from Lloyd Albin, as well as in some pgsql-hackers discussion back in January. Also, its behavior for printing EXPLAIN plans was sometimes confusing because of willingness to use the same alias for multiple RTEs (it was Ashutosh Bapat's complaint about that aspect that started the January thread). To fix, ensure that each RTE in the query has a unique unqualified alias, by modifying the alias if necessary (we add "_" and digits as needed to create a non-conflicting name). Then we can just print its variables with that alias, avoiding the confusing and bug-prone scheme of sometimes schema-qualifying variable names. In EXPLAIN, it proves to be expedient to take the further step of only assigning such aliases to RTEs that are actually referenced in the query, since the planner has a habit of generating extra RTEs with the same alias in situations such as inheritance-tree expansion. Although this fixes a bug of very long standing, I'm hesitant to back-patch such a noticeable behavioral change. My experiments while creating a regression test convinced me that actually incorrect output (as opposed to confusing output) occurs only in very narrow cases, which is backed up by the lack of previous complaints from the field. So we may be better off living with it in released branches; and in any case it'd be smart to let this ripen awhile in HEAD before we consider back-patching it.
2012-09-22 01:03:10 +02:00
/*
* get_rtable_name: convenience function to get a previously assigned RTE alias
*
* The RTE must belong to the topmost namespace level in "context".
*/
static char *
get_rtable_name(int rtindex, deparse_context *context)
{
deparse_namespace *dpns = (deparse_namespace *) linitial(context->namespaces);
Assert(rtindex > 0 && rtindex <= list_length(dpns->rtable_names));
return (char *) list_nth(dpns->rtable_names, rtindex - 1);
}
/*
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
* set_deparse_plan: set up deparse_namespace to parse subexpressions
* of a given Plan node
*
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
* This sets the plan, outer_plan, inner_plan, outer_tlist, inner_tlist,
* and index_tlist fields. Caller must already have adjusted the ancestors
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
* list if necessary. Note that the rtable, subplans, and ctes fields do
* not need to change when shifting attention to different plan nodes in a
* single plan tree.
*/
static void
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
set_deparse_plan(deparse_namespace *dpns, Plan *plan)
{
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
dpns->plan = plan;
/*
* We special-case Append and MergeAppend to pretend that the first child
* plan is the OUTER referent; we have to interpret OUTER Vars in their
* tlists according to one of the children, and the first one is the most
Rework planning and execution of UPDATE and DELETE. This patch makes two closely related sets of changes: 1. For UPDATE, the subplan of the ModifyTable node now only delivers the new values of the changed columns (i.e., the expressions computed in the query's SET clause) plus row identity information such as CTID. ModifyTable must re-fetch the original tuple to merge in the old values of any unchanged columns. The core advantage of this is that the changed columns are uniform across all tables of an inherited or partitioned target relation, whereas the other columns might not be. A secondary advantage, when the UPDATE involves joins, is that less data needs to pass through the plan tree. The disadvantage of course is an extra fetch of each tuple to be updated. However, that seems to be very nearly free in context; even worst-case tests don't show it to add more than a couple percent to the total query cost. At some point it might be interesting to combine the re-fetch with the tuple access that ModifyTable must do anyway to mark the old tuple dead; but that would require a good deal of refactoring and it seems it wouldn't buy all that much, so this patch doesn't attempt it. 2. For inherited UPDATE/DELETE, instead of generating a separate subplan for each target relation, we now generate a single subplan that is just exactly like a SELECT's plan, then stick ModifyTable on top of that. To let ModifyTable know which target relation a given incoming row refers to, a tableoid junk column is added to the row identity information. This gets rid of the horrid hack that was inheritance_planner(), eliminating O(N^2) planning cost and memory consumption in cases where there were many unprunable target relations. Point 2 of course requires point 1, so that there is a uniform definition of the non-junk columns to be returned by the subplan. We can't insist on uniform definition of the row identity junk columns however, if we want to keep the ability to have both plain and foreign tables in a partitioning hierarchy. Since it wouldn't scale very far to have every child table have its own row identity column, this patch includes provisions to merge similar row identity columns into one column of the subplan result. In particular, we can merge the whole-row Vars typically used as row identity by FDWs into one column by pretending they are type RECORD. (It's still okay for the actual composite Datums to be labeled with the table's rowtype OID, though.) There is more that can be done to file down residual inefficiencies in this patch, but it seems to be committable now. FDW authors should note several API changes: * The argument list for AddForeignUpdateTargets() has changed, and so has the method it must use for adding junk columns to the query. Call add_row_identity_var() instead of manipulating the parse tree directly. You might want to reconsider exactly what you're adding, too. * PlanDirectModify() must now work a little harder to find the ForeignScan plan node; if the foreign table is part of a partitioning hierarchy then the ForeignScan might not be the direct child of ModifyTable. See postgres_fdw for sample code. * To check whether a relation is a target relation, it's no longer sufficient to compare its relid to root->parse->resultRelation. Instead, check it against all_result_relids or leaf_result_relids, as appropriate. Amit Langote and Tom Lane Discussion: https://postgr.es/m/CA+HiwqHpHdqdDn48yCEhynnniahH78rwcrv1rEX65-fsZGBOLQ@mail.gmail.com
2021-03-31 17:52:34 +02:00
* natural choice.
*/
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
if (IsA(plan, Append))
dpns->outer_plan = linitial(((Append *) plan)->appendplans);
else if (IsA(plan, MergeAppend))
dpns->outer_plan = linitial(((MergeAppend *) plan)->mergeplans);
else
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
dpns->outer_plan = outerPlan(plan);
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
if (dpns->outer_plan)
dpns->outer_tlist = dpns->outer_plan->targetlist;
else
dpns->outer_tlist = NIL;
/*
* For a SubqueryScan, pretend the subplan is INNER referent. (We don't
* use OUTER because that could someday conflict with the normal meaning.)
* Likewise, for a CteScan, pretend the subquery's plan is INNER referent.
* For a WorkTableScan, locate the parent RecursiveUnion plan node and use
* that as INNER referent.
*
* For MERGE, pretend the ModifyTable's source plan (its outer plan) is
* INNER referent. This is the join from the target relation to the data
* source, and all INNER_VAR Vars in other parts of the query refer to its
* targetlist.
*
Add support for INSERT ... ON CONFLICT DO NOTHING/UPDATE. The newly added ON CONFLICT clause allows to specify an alternative to raising a unique or exclusion constraint violation error when inserting. ON CONFLICT refers to constraints that can either be specified using a inference clause (by specifying the columns of a unique constraint) or by naming a unique or exclusion constraint. DO NOTHING avoids the constraint violation, without touching the pre-existing row. DO UPDATE SET ... [WHERE ...] updates the pre-existing tuple, and has access to both the tuple proposed for insertion and the existing tuple; the optional WHERE clause can be used to prevent an update from being executed. The UPDATE SET and WHERE clauses have access to the tuple proposed for insertion using the "magic" EXCLUDED alias, and to the pre-existing tuple using the table name or its alias. This feature is often referred to as upsert. This is implemented using a new infrastructure called "speculative insertion". It is an optimistic variant of regular insertion that first does a pre-check for existing tuples and then attempts an insert. If a violating tuple was inserted concurrently, the speculatively inserted tuple is deleted and a new attempt is made. If the pre-check finds a matching tuple the alternative DO NOTHING or DO UPDATE action is taken. If the insertion succeeds without detecting a conflict, the tuple is deemed inserted. To handle the possible ambiguity between the excluded alias and a table named excluded, and for convenience with long relation names, INSERT INTO now can alias its target table. Bumps catversion as stored rules change. Author: Peter Geoghegan, with significant contributions from Heikki Linnakangas and Andres Freund. Testing infrastructure by Jeff Janes. Reviewed-By: Heikki Linnakangas, Andres Freund, Robert Haas, Simon Riggs, Dean Rasheed, Stephen Frost and many others.
2015-05-08 05:31:36 +02:00
* For ON CONFLICT .. UPDATE we just need the inner tlist to point to the
* excluded expression's tlist. (Similar to the SubqueryScan we don't want
* to reuse OUTER, it's used for RETURNING in some modify table cases,
* although not INSERT .. CONFLICT).
*/
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
if (IsA(plan, SubqueryScan))
dpns->inner_plan = ((SubqueryScan *) plan)->subplan;
else if (IsA(plan, CteScan))
dpns->inner_plan = list_nth(dpns->subplans,
((CteScan *) plan)->ctePlanId - 1);
else if (IsA(plan, WorkTableScan))
dpns->inner_plan = find_recursive_union(dpns,
(WorkTableScan *) plan);
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
else if (IsA(plan, ModifyTable))
Add support for MERGE SQL command MERGE performs actions that modify rows in the target table using a source table or query. MERGE provides a single SQL statement that can conditionally INSERT/UPDATE/DELETE rows -- a task that would otherwise require multiple PL statements. For example, MERGE INTO target AS t USING source AS s ON t.tid = s.sid WHEN MATCHED AND t.balance > s.delta THEN UPDATE SET balance = t.balance - s.delta WHEN MATCHED THEN DELETE WHEN NOT MATCHED AND s.delta > 0 THEN INSERT VALUES (s.sid, s.delta) WHEN NOT MATCHED THEN DO NOTHING; MERGE works with regular tables, partitioned tables and inheritance hierarchies, including column and row security enforcement, as well as support for row and statement triggers and transition tables therein. MERGE is optimized for OLTP and is parameterizable, though also useful for large scale ETL/ELT. MERGE is not intended to be used in preference to existing single SQL commands for INSERT, UPDATE or DELETE since there is some overhead. MERGE can be used from PL/pgSQL. MERGE does not support targetting updatable views or foreign tables, and RETURNING clauses are not allowed either. These limitations are likely fixable with sufficient effort. Rewrite rules are also not supported, but it's not clear that we'd want to support them. Author: Pavan Deolasee <pavan.deolasee@gmail.com> Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Author: Amit Langote <amitlangote09@gmail.com> Author: Simon Riggs <simon.riggs@enterprisedb.com> Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com> Reviewed-by: Andres Freund <andres@anarazel.de> (earlier versions) Reviewed-by: Peter Geoghegan <pg@bowt.ie> (earlier versions) Reviewed-by: Robert Haas <robertmhaas@gmail.com> (earlier versions) Reviewed-by: Japin Li <japinli@hotmail.com> Reviewed-by: Justin Pryzby <pryzby@telsasoft.com> Reviewed-by: Tomas Vondra <tomas.vondra@enterprisedb.com> Reviewed-by: Zhihong Yu <zyu@yugabyte.com> Discussion: https://postgr.es/m/CANP8+jKitBSrB7oTgT9CY2i1ObfOt36z0XMraQc+Xrz8QB0nXA@mail.gmail.com Discussion: https://postgr.es/m/CAH2-WzkJdBuxj9PO=2QaO9-3h3xGbQPZ34kJH=HukRekwM-GZg@mail.gmail.com Discussion: https://postgr.es/m/20201231134736.GA25392@alvherre.pgsql
2022-03-28 16:45:58 +02:00
{
if (((ModifyTable *) plan)->operation == CMD_MERGE)
dpns->inner_plan = outerPlan(plan);
Add support for MERGE SQL command MERGE performs actions that modify rows in the target table using a source table or query. MERGE provides a single SQL statement that can conditionally INSERT/UPDATE/DELETE rows -- a task that would otherwise require multiple PL statements. For example, MERGE INTO target AS t USING source AS s ON t.tid = s.sid WHEN MATCHED AND t.balance > s.delta THEN UPDATE SET balance = t.balance - s.delta WHEN MATCHED THEN DELETE WHEN NOT MATCHED AND s.delta > 0 THEN INSERT VALUES (s.sid, s.delta) WHEN NOT MATCHED THEN DO NOTHING; MERGE works with regular tables, partitioned tables and inheritance hierarchies, including column and row security enforcement, as well as support for row and statement triggers and transition tables therein. MERGE is optimized for OLTP and is parameterizable, though also useful for large scale ETL/ELT. MERGE is not intended to be used in preference to existing single SQL commands for INSERT, UPDATE or DELETE since there is some overhead. MERGE can be used from PL/pgSQL. MERGE does not support targetting updatable views or foreign tables, and RETURNING clauses are not allowed either. These limitations are likely fixable with sufficient effort. Rewrite rules are also not supported, but it's not clear that we'd want to support them. Author: Pavan Deolasee <pavan.deolasee@gmail.com> Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Author: Amit Langote <amitlangote09@gmail.com> Author: Simon Riggs <simon.riggs@enterprisedb.com> Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com> Reviewed-by: Andres Freund <andres@anarazel.de> (earlier versions) Reviewed-by: Peter Geoghegan <pg@bowt.ie> (earlier versions) Reviewed-by: Robert Haas <robertmhaas@gmail.com> (earlier versions) Reviewed-by: Japin Li <japinli@hotmail.com> Reviewed-by: Justin Pryzby <pryzby@telsasoft.com> Reviewed-by: Tomas Vondra <tomas.vondra@enterprisedb.com> Reviewed-by: Zhihong Yu <zyu@yugabyte.com> Discussion: https://postgr.es/m/CANP8+jKitBSrB7oTgT9CY2i1ObfOt36z0XMraQc+Xrz8QB0nXA@mail.gmail.com Discussion: https://postgr.es/m/CAH2-WzkJdBuxj9PO=2QaO9-3h3xGbQPZ34kJH=HukRekwM-GZg@mail.gmail.com Discussion: https://postgr.es/m/20201231134736.GA25392@alvherre.pgsql
2022-03-28 16:45:58 +02:00
else
dpns->inner_plan = plan;
Add support for MERGE SQL command MERGE performs actions that modify rows in the target table using a source table or query. MERGE provides a single SQL statement that can conditionally INSERT/UPDATE/DELETE rows -- a task that would otherwise require multiple PL statements. For example, MERGE INTO target AS t USING source AS s ON t.tid = s.sid WHEN MATCHED AND t.balance > s.delta THEN UPDATE SET balance = t.balance - s.delta WHEN MATCHED THEN DELETE WHEN NOT MATCHED AND s.delta > 0 THEN INSERT VALUES (s.sid, s.delta) WHEN NOT MATCHED THEN DO NOTHING; MERGE works with regular tables, partitioned tables and inheritance hierarchies, including column and row security enforcement, as well as support for row and statement triggers and transition tables therein. MERGE is optimized for OLTP and is parameterizable, though also useful for large scale ETL/ELT. MERGE is not intended to be used in preference to existing single SQL commands for INSERT, UPDATE or DELETE since there is some overhead. MERGE can be used from PL/pgSQL. MERGE does not support targetting updatable views or foreign tables, and RETURNING clauses are not allowed either. These limitations are likely fixable with sufficient effort. Rewrite rules are also not supported, but it's not clear that we'd want to support them. Author: Pavan Deolasee <pavan.deolasee@gmail.com> Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Author: Amit Langote <amitlangote09@gmail.com> Author: Simon Riggs <simon.riggs@enterprisedb.com> Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com> Reviewed-by: Andres Freund <andres@anarazel.de> (earlier versions) Reviewed-by: Peter Geoghegan <pg@bowt.ie> (earlier versions) Reviewed-by: Robert Haas <robertmhaas@gmail.com> (earlier versions) Reviewed-by: Japin Li <japinli@hotmail.com> Reviewed-by: Justin Pryzby <pryzby@telsasoft.com> Reviewed-by: Tomas Vondra <tomas.vondra@enterprisedb.com> Reviewed-by: Zhihong Yu <zyu@yugabyte.com> Discussion: https://postgr.es/m/CANP8+jKitBSrB7oTgT9CY2i1ObfOt36z0XMraQc+Xrz8QB0nXA@mail.gmail.com Discussion: https://postgr.es/m/CAH2-WzkJdBuxj9PO=2QaO9-3h3xGbQPZ34kJH=HukRekwM-GZg@mail.gmail.com Discussion: https://postgr.es/m/20201231134736.GA25392@alvherre.pgsql
2022-03-28 16:45:58 +02:00
}
else
dpns->inner_plan = innerPlan(plan);
if (IsA(plan, ModifyTable) && ((ModifyTable *) plan)->operation == CMD_INSERT)
dpns->inner_tlist = ((ModifyTable *) plan)->exclRelTlist;
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
else if (dpns->inner_plan)
dpns->inner_tlist = dpns->inner_plan->targetlist;
else
dpns->inner_tlist = NIL;
Code review for foreign/custom join pushdown patch. Commit e7cb7ee14555cc9c5773e2c102efd6371f6f2005 included some design decisions that seem pretty questionable to me, and there was quite a lot of stuff not to like about the documentation and comments. Clean up as follows: * Consider foreign joins only between foreign tables on the same server, rather than between any two foreign tables with the same underlying FDW handler function. In most if not all cases, the FDW would simply have had to apply the same-server restriction itself (far more expensively, both for lack of caching and because it would be repeated for each combination of input sub-joins), or else risk nasty bugs. Anyone who's really intent on doing something outside this restriction can always use the set_join_pathlist_hook. * Rename fdw_ps_tlist/custom_ps_tlist to fdw_scan_tlist/custom_scan_tlist to better reflect what they're for, and allow these custom scan tlists to be used even for base relations. * Change make_foreignscan() API to include passing the fdw_scan_tlist value, since the FDW is required to set that. Backwards compatibility doesn't seem like an adequate reason to expect FDWs to set it in some ad-hoc extra step, and anyway existing FDWs can just pass NIL. * Change the API of path-generating subroutines of add_paths_to_joinrel, and in particular that of GetForeignJoinPaths and set_join_pathlist_hook, so that various less-used parameters are passed in a struct rather than as separate parameter-list entries. The objective here is to reduce the probability that future additions to those parameter lists will result in source-level API breaks for users of these hooks. It's possible that this is even a small win for the core code, since most CPU architectures can't pass more than half a dozen parameters efficiently anyway. I kept root, joinrel, outerrel, innerrel, and jointype as separate parameters to reduce code churn in joinpath.c --- in particular, putting jointype into the struct would have been problematic because of the subroutines' habit of changing their local copies of that variable. * Avoid ad-hocery in ExecAssignScanProjectionInfo. It was probably all right for it to know about IndexOnlyScan, but if the list is to grow we should refactor the knowledge out to the callers. * Restore nodeForeignscan.c's previous use of the relcache to avoid extra GetFdwRoutine lookups for base-relation scans. * Lots of cleanup of documentation and missed comments. Re-order some code additions into more logical places.
2015-05-10 20:36:30 +02:00
/* Set up referent for INDEX_VAR Vars, if needed */
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
if (IsA(plan, IndexOnlyScan))
dpns->index_tlist = ((IndexOnlyScan *) plan)->indextlist;
else if (IsA(plan, ForeignScan))
dpns->index_tlist = ((ForeignScan *) plan)->fdw_scan_tlist;
else if (IsA(plan, CustomScan))
dpns->index_tlist = ((CustomScan *) plan)->custom_scan_tlist;
else
dpns->index_tlist = NIL;
}
/*
* Locate the ancestor plan node that is the RecursiveUnion generating
* the WorkTableScan's work table. We can match on wtParam, since that
* should be unique within the plan tree.
*/
static Plan *
find_recursive_union(deparse_namespace *dpns, WorkTableScan *wtscan)
{
ListCell *lc;
foreach(lc, dpns->ancestors)
{
Plan *ancestor = (Plan *) lfirst(lc);
if (IsA(ancestor, RecursiveUnion) &&
((RecursiveUnion *) ancestor)->wtParam == wtscan->wtParam)
return ancestor;
}
elog(ERROR, "could not find RecursiveUnion for WorkTableScan with wtParam %d",
wtscan->wtParam);
return NULL;
}
/*
* push_child_plan: temporarily transfer deparsing attention to a child plan
*
* When expanding an OUTER_VAR or INNER_VAR reference, we must adjust the
* deparse context in case the referenced expression itself uses
* OUTER_VAR/INNER_VAR. We modify the top stack entry in-place to avoid
* affecting levelsup issues (although in a Plan tree there really shouldn't
* be any).
*
* Caller must provide a local deparse_namespace variable to save the
* previous state for pop_child_plan.
*/
static void
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
push_child_plan(deparse_namespace *dpns, Plan *plan,
deparse_namespace *save_dpns)
{
/* Save state for restoration later */
*save_dpns = *dpns;
/* Link current plan node into ancestors list */
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
dpns->ancestors = lcons(dpns->plan, dpns->ancestors);
/* Set attention on selected child */
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
set_deparse_plan(dpns, plan);
}
/*
* pop_child_plan: undo the effects of push_child_plan
*/
static void
pop_child_plan(deparse_namespace *dpns, deparse_namespace *save_dpns)
{
List *ancestors;
/* Get rid of ancestors list cell added by push_child_plan */
ancestors = list_delete_first(dpns->ancestors);
/* Restore fields changed by push_child_plan */
*dpns = *save_dpns;
/* Make sure dpns->ancestors is right (may be unnecessary) */
dpns->ancestors = ancestors;
}
/*
* push_ancestor_plan: temporarily transfer deparsing attention to an
* ancestor plan
*
* When expanding a Param reference, we must adjust the deparse context
* to match the plan node that contains the expression being printed;
* otherwise we'd fail if that expression itself contains a Param or
* OUTER_VAR/INNER_VAR/INDEX_VAR variable.
*
* The target ancestor is conveniently identified by the ListCell holding it
* in dpns->ancestors.
*
* Caller must provide a local deparse_namespace variable to save the
* previous state for pop_ancestor_plan.
*/
static void
push_ancestor_plan(deparse_namespace *dpns, ListCell *ancestor_cell,
deparse_namespace *save_dpns)
{
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
Plan *plan = (Plan *) lfirst(ancestor_cell);
/* Save state for restoration later */
*save_dpns = *dpns;
/* Build a new ancestor list with just this node's ancestors */
Represent Lists as expansible arrays, not chains of cons-cells. Originally, Postgres Lists were a more or less exact reimplementation of Lisp lists, which consist of chains of separately-allocated cons cells, each having a value and a next-cell link. We'd hacked that once before (commit d0b4399d8) to add a separate List header, but the data was still in cons cells. That makes some operations -- notably list_nth() -- O(N), and it's bulky because of the next-cell pointers and per-cell palloc overhead, and it's very cache-unfriendly if the cons cells end up scattered around rather than being adjacent. In this rewrite, we still have List headers, but the data is in a resizable array of values, with no next-cell links. Now we need at most two palloc's per List, and often only one, since we can allocate some values in the same palloc call as the List header. (Of course, extending an existing List may require repalloc's to enlarge the array. But this involves just O(log N) allocations not O(N).) Of course this is not without downsides. The key difficulty is that addition or deletion of a list entry may now cause other entries to move, which it did not before. For example, that breaks foreach() and sister macros, which historically used a pointer to the current cons-cell as loop state. We can repair those macros transparently by making their actual loop state be an integer list index; the exposed "ListCell *" pointer is no longer state carried across loop iterations, but is just a derived value. (In practice, modern compilers can optimize things back to having just one loop state value, at least for simple cases with inline loop bodies.) In principle, this is a semantics change for cases where the loop body inserts or deletes list entries ahead of the current loop index; but I found no such cases in the Postgres code. The change is not at all transparent for code that doesn't use foreach() but chases lists "by hand" using lnext(). The largest share of such code in the backend is in loops that were maintaining "prev" and "next" variables in addition to the current-cell pointer, in order to delete list cells efficiently using list_delete_cell(). However, we no longer need a previous-cell pointer to delete a list cell efficiently. Keeping a next-cell pointer doesn't work, as explained above, but we can improve matters by changing such code to use a regular foreach() loop and then using the new macro foreach_delete_current() to delete the current cell. (This macro knows how to update the associated foreach loop's state so that no cells will be missed in the traversal.) There remains a nontrivial risk of code assuming that a ListCell * pointer will remain good over an operation that could now move the list contents. To help catch such errors, list.c can be compiled with a new define symbol DEBUG_LIST_MEMORY_USAGE that forcibly moves list contents whenever that could possibly happen. This makes list operations significantly more expensive so it's not normally turned on (though it is on by default if USE_VALGRIND is on). There are two notable API differences from the previous code: * lnext() now requires the List's header pointer in addition to the current cell's address. * list_delete_cell() no longer requires a previous-cell argument. These changes are somewhat unfortunate, but on the other hand code using either function needs inspection to see if it is assuming anything it shouldn't, so it's not all bad. Programmers should be aware of these significant performance changes: * list_nth() and related functions are now O(1); so there's no major access-speed difference between a list and an array. * Inserting or deleting a list element now takes time proportional to the distance to the end of the list, due to moving the array elements. (However, it typically *doesn't* require palloc or pfree, so except in long lists it's probably still faster than before.) Notably, lcons() used to be about the same cost as lappend(), but that's no longer true if the list is long. Code that uses lcons() and list_delete_first() to maintain a stack might usefully be rewritten to push and pop at the end of the list rather than the beginning. * There are now list_insert_nth...() and list_delete_nth...() functions that add or remove a list cell identified by index. These have the data-movement penalty explained above, but there's no search penalty. * list_concat() and variants now copy the second list's data into storage belonging to the first list, so there is no longer any sharing of cells between the input lists. The second argument is now declared "const List *" to reflect that it isn't changed. This patch just does the minimum needed to get the new implementation in place and fix bugs exposed by the regression tests. As suggested by the foregoing, there's a fair amount of followup work remaining to do. Also, the ENABLE_LIST_COMPAT macros are finally removed in this commit. Code using those should have been gone a dozen years ago. Patch by me; thanks to David Rowley, Jesper Pedersen, and others for review. Discussion: https://postgr.es/m/11587.1550975080@sss.pgh.pa.us
2019-07-15 19:41:58 +02:00
dpns->ancestors =
list_copy_tail(dpns->ancestors,
list_cell_number(dpns->ancestors, ancestor_cell) + 1);
/* Set attention on selected ancestor */
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
set_deparse_plan(dpns, plan);
}
/*
* pop_ancestor_plan: undo the effects of push_ancestor_plan
*/
static void
pop_ancestor_plan(deparse_namespace *dpns, deparse_namespace *save_dpns)
{
/* Free the ancestor list made in push_ancestor_plan */
list_free(dpns->ancestors);
/* Restore fields changed by push_ancestor_plan */
*dpns = *save_dpns;
}
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
/* ----------
* make_ruledef - reconstruct the CREATE RULE command
* for a given pg_rewrite tuple
* ----------
*/
static void
make_ruledef(StringInfo buf, HeapTuple ruletup, TupleDesc rulettc,
int prettyFlags)
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
{
char *rulename;
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
char ev_type;
Oid ev_class;
bool is_instead;
char *ev_qual;
char *ev_action;
List *actions;
Relation ev_relation;
TupleDesc viewResultDesc = NULL;
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
int fno;
Datum dat;
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
bool isnull;
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
/*
* Get the attribute values from the rules tuple
*/
fno = SPI_fnumber(rulettc, "rulename");
dat = SPI_getbinval(ruletup, rulettc, fno, &isnull);
Assert(!isnull);
rulename = NameStr(*(DatumGetName(dat)));
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
fno = SPI_fnumber(rulettc, "ev_type");
dat = SPI_getbinval(ruletup, rulettc, fno, &isnull);
Assert(!isnull);
ev_type = DatumGetChar(dat);
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
fno = SPI_fnumber(rulettc, "ev_class");
dat = SPI_getbinval(ruletup, rulettc, fno, &isnull);
Assert(!isnull);
ev_class = DatumGetObjectId(dat);
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
fno = SPI_fnumber(rulettc, "is_instead");
dat = SPI_getbinval(ruletup, rulettc, fno, &isnull);
Assert(!isnull);
is_instead = DatumGetBool(dat);
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
fno = SPI_fnumber(rulettc, "ev_qual");
ev_qual = SPI_getvalue(ruletup, rulettc, fno);
Assert(ev_qual != NULL);
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
fno = SPI_fnumber(rulettc, "ev_action");
ev_action = SPI_getvalue(ruletup, rulettc, fno);
Assert(ev_action != NULL);
actions = (List *) stringToNode(ev_action);
if (actions == NIL)
elog(ERROR, "invalid empty ev_action list");
ev_relation = table_open(ev_class, AccessShareLock);
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
/*
* Build the rules definition text
*/
appendStringInfo(buf, "CREATE RULE %s AS",
quote_identifier(rulename));
if (prettyFlags & PRETTYFLAG_INDENT)
appendStringInfoString(buf, "\n ON ");
else
appendStringInfoString(buf, " ON ");
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
/* The event the rule is fired for */
switch (ev_type)
{
case '1':
appendStringInfoString(buf, "SELECT");
viewResultDesc = RelationGetDescr(ev_relation);
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
break;
case '2':
appendStringInfoString(buf, "UPDATE");
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
break;
case '3':
appendStringInfoString(buf, "INSERT");
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
break;
case '4':
appendStringInfoString(buf, "DELETE");
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
break;
default:
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("rule \"%s\" has unsupported event type %d",
rulename, ev_type)));
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
break;
}
/* The relation the rule is fired on */
appendStringInfo(buf, " TO %s",
(prettyFlags & PRETTYFLAG_SCHEMA) ?
generate_relation_name(ev_class, NIL) :
generate_qualified_relation_name(ev_class));
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
/* If the rule has an event qualification, add it */
if (strcmp(ev_qual, "<>") != 0)
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
{
Node *qual;
Query *query;
deparse_context context;
deparse_namespace dpns;
if (prettyFlags & PRETTYFLAG_INDENT)
appendStringInfoString(buf, "\n ");
appendStringInfoString(buf, " WHERE ");
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
qual = stringToNode(ev_qual);
/*
* We need to make a context for recognizing any Vars in the qual
* (which can only be references to OLD and NEW). Use the rtable of
* the first query in the action list for this purpose.
*/
query = (Query *) linitial(actions);
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
/*
* If the action is INSERT...SELECT, OLD/NEW have been pushed down
* into the SELECT, and that's what we need to look at. (Ugly kluge
* ... try to fix this when we redesign querytrees.)
*/
query = getInsertSelectQuery(query, NULL);
/* Must acquire locks right away; see notes in get_query_def() */
AcquireRewriteLocks(query, false, false);
context.buf = buf;
context.namespaces = list_make1(&dpns);
context.windowClause = NIL;
context.windowTList = NIL;
context.varprefix = (list_length(query->rtable) != 1);
context.prettyFlags = prettyFlags;
context.wrapColumn = WRAP_COLUMN_DEFAULT;
context.indentLevel = PRETTYINDENT_STD;
Support GROUPING SETS, CUBE and ROLLUP. This SQL standard functionality allows to aggregate data by different GROUP BY clauses at once. Each grouping set returns rows with columns grouped by in other sets set to NULL. This could previously be achieved by doing each grouping as a separate query, conjoined by UNION ALLs. Besides being considerably more concise, grouping sets will in many cases be faster, requiring only one scan over the underlying data. The current implementation of grouping sets only supports using sorting for input. Individual sets that share a sort order are computed in one pass. If there are sets that don't share a sort order, additional sort & aggregation steps are performed. These additional passes are sourced by the previous sort step; thus avoiding repeated scans of the source data. The code is structured in a way that adding support for purely using hash aggregation or a mix of hashing and sorting is possible. Sorting was chosen to be supported first, as it is the most generic method of implementation. Instead of, as in an earlier versions of the patch, representing the chain of sort and aggregation steps as full blown planner and executor nodes, all but the first sort are performed inside the aggregation node itself. This avoids the need to do some unusual gymnastics to handle having to return aggregated and non-aggregated tuples from underlying nodes, as well as having to shut down underlying nodes early to limit memory usage. The optimizer still builds Sort/Agg node to describe each phase, but they're not part of the plan tree, but instead additional data for the aggregation node. They're a convenient and preexisting way to describe aggregation and sorting. The first (and possibly only) sort step is still performed as a separate execution step. That retains similarity with existing group by plans, makes rescans fairly simple, avoids very deep plans (leading to slow explains) and easily allows to avoid the sorting step if the underlying data is sorted by other means. A somewhat ugly side of this patch is having to deal with a grammar ambiguity between the new CUBE keyword and the cube extension/functions named cube (and rollup). To avoid breaking existing deployments of the cube extension it has not been renamed, neither has cube been made a reserved keyword. Instead precedence hacking is used to make GROUP BY cube(..) refer to the CUBE grouping sets feature, and not the function cube(). To actually group by a function cube(), unlikely as that might be, the function name has to be quoted. Needs a catversion bump because stored rules may change. Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
2015-05-16 03:40:59 +02:00
context.special_exprkind = EXPR_KIND_NONE;
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
context.appendparents = NULL;
set_deparse_for_query(&dpns, query, NIL);
get_rule_expr(qual, &context, false);
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
}
appendStringInfoString(buf, " DO ");
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
/* The INSTEAD keyword (if so) */
if (is_instead)
appendStringInfoString(buf, "INSTEAD ");
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
/* Finally the rules actions */
if (list_length(actions) > 1)
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
{
ListCell *action;
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
Query *query;
appendStringInfoChar(buf, '(');
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
foreach(action, actions)
{
query = (Query *) lfirst(action);
get_query_def(query, buf, NIL, viewResultDesc, true,
prettyFlags, WRAP_COLUMN_DEFAULT, 0);
if (prettyFlags)
appendStringInfoString(buf, ";\n");
else
appendStringInfoString(buf, "; ");
}
appendStringInfoString(buf, ");");
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
}
else
{
Query *query;
query = (Query *) linitial(actions);
get_query_def(query, buf, NIL, viewResultDesc, true,
prettyFlags, WRAP_COLUMN_DEFAULT, 0);
appendStringInfoChar(buf, ';');
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
}
table_close(ev_relation, AccessShareLock);
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
}
/* ----------
* make_viewdef - reconstruct the SELECT part of a
* view rewrite rule
* ----------
*/
static void
make_viewdef(StringInfo buf, HeapTuple ruletup, TupleDesc rulettc,
int prettyFlags, int wrapColumn)
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
{
Query *query;
char ev_type;
Oid ev_class;
bool is_instead;
char *ev_qual;
char *ev_action;
List *actions;
Relation ev_relation;
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
int fno;
Datum dat;
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
bool isnull;
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
/*
* Get the attribute values from the rules tuple
*/
fno = SPI_fnumber(rulettc, "ev_type");
dat = SPI_getbinval(ruletup, rulettc, fno, &isnull);
Assert(!isnull);
ev_type = DatumGetChar(dat);
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
fno = SPI_fnumber(rulettc, "ev_class");
dat = SPI_getbinval(ruletup, rulettc, fno, &isnull);
Assert(!isnull);
ev_class = DatumGetObjectId(dat);
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
fno = SPI_fnumber(rulettc, "is_instead");
dat = SPI_getbinval(ruletup, rulettc, fno, &isnull);
Assert(!isnull);
is_instead = DatumGetBool(dat);
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
fno = SPI_fnumber(rulettc, "ev_qual");
ev_qual = SPI_getvalue(ruletup, rulettc, fno);
Assert(ev_qual != NULL);
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
fno = SPI_fnumber(rulettc, "ev_action");
ev_action = SPI_getvalue(ruletup, rulettc, fno);
Assert(ev_action != NULL);
actions = (List *) stringToNode(ev_action);
if (list_length(actions) != 1)
{
/* keep output buffer empty and leave */
return;
}
query = (Query *) linitial(actions);
if (ev_type != '1' || !is_instead ||
strcmp(ev_qual, "<>") != 0 || query->commandType != CMD_SELECT)
{
/* keep output buffer empty and leave */
return;
}
ev_relation = table_open(ev_class, AccessShareLock);
get_query_def(query, buf, NIL, RelationGetDescr(ev_relation), true,
prettyFlags, wrapColumn, 0);
appendStringInfoChar(buf, ';');
table_close(ev_relation, AccessShareLock);
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
}
/* ----------
* get_query_def - Parse back one query parsetree
*
* query: parsetree to be displayed
* buf: output text is appended to buf
* parentnamespace: list (initially empty) of outer-level deparse_namespace's
* resultDesc: if not NULL, the output tuple descriptor for the view
* represented by a SELECT query. We use the column names from it
* to label SELECT output columns, in preference to names in the query
* colNamesVisible: true if the surrounding context cares about the output
* column names at all (as, for example, an EXISTS() context does not);
* when false, we can suppress dummy column labels such as "?column?"
* prettyFlags: bitmask of PRETTYFLAG_XXX options
* wrapColumn: maximum line length, or -1 to disable wrapping
* startIndent: initial indentation amount
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
* ----------
*/
static void
get_query_def(Query *query, StringInfo buf, List *parentnamespace,
TupleDesc resultDesc, bool colNamesVisible,
int prettyFlags, int wrapColumn, int startIndent)
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
{
deparse_context context;
deparse_namespace dpns;
/* Guard against excessively long or deeply-nested queries */
CHECK_FOR_INTERRUPTS();
check_stack_depth();
/*
* Before we begin to examine the query, acquire locks on referenced
* relations, and fix up deleted columns in JOIN RTEs. This ensures
* consistent results. Note we assume it's OK to scribble on the passed
* querytree!
*
* We are only deparsing the query (we are not about to execute it), so we
* only need AccessShareLock on the relations it mentions.
*/
AcquireRewriteLocks(query, false, false);
context.buf = buf;
context.namespaces = lcons(&dpns, list_copy(parentnamespace));
context.windowClause = NIL;
context.windowTList = NIL;
context.varprefix = (parentnamespace != NIL ||
list_length(query->rtable) != 1);
context.prettyFlags = prettyFlags;
context.wrapColumn = wrapColumn;
context.indentLevel = startIndent;
Support GROUPING SETS, CUBE and ROLLUP. This SQL standard functionality allows to aggregate data by different GROUP BY clauses at once. Each grouping set returns rows with columns grouped by in other sets set to NULL. This could previously be achieved by doing each grouping as a separate query, conjoined by UNION ALLs. Besides being considerably more concise, grouping sets will in many cases be faster, requiring only one scan over the underlying data. The current implementation of grouping sets only supports using sorting for input. Individual sets that share a sort order are computed in one pass. If there are sets that don't share a sort order, additional sort & aggregation steps are performed. These additional passes are sourced by the previous sort step; thus avoiding repeated scans of the source data. The code is structured in a way that adding support for purely using hash aggregation or a mix of hashing and sorting is possible. Sorting was chosen to be supported first, as it is the most generic method of implementation. Instead of, as in an earlier versions of the patch, representing the chain of sort and aggregation steps as full blown planner and executor nodes, all but the first sort are performed inside the aggregation node itself. This avoids the need to do some unusual gymnastics to handle having to return aggregated and non-aggregated tuples from underlying nodes, as well as having to shut down underlying nodes early to limit memory usage. The optimizer still builds Sort/Agg node to describe each phase, but they're not part of the plan tree, but instead additional data for the aggregation node. They're a convenient and preexisting way to describe aggregation and sorting. The first (and possibly only) sort step is still performed as a separate execution step. That retains similarity with existing group by plans, makes rescans fairly simple, avoids very deep plans (leading to slow explains) and easily allows to avoid the sorting step if the underlying data is sorted by other means. A somewhat ugly side of this patch is having to deal with a grammar ambiguity between the new CUBE keyword and the cube extension/functions named cube (and rollup). To avoid breaking existing deployments of the cube extension it has not been renamed, neither has cube been made a reserved keyword. Instead precedence hacking is used to make GROUP BY cube(..) refer to the CUBE grouping sets feature, and not the function cube(). To actually group by a function cube(), unlikely as that might be, the function name has to be quoted. Needs a catversion bump because stored rules may change. Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
2015-05-16 03:40:59 +02:00
context.special_exprkind = EXPR_KIND_NONE;
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
context.appendparents = NULL;
set_deparse_for_query(&dpns, query, parentnamespace);
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
switch (query->commandType)
{
case CMD_SELECT:
get_select_query_def(query, &context, resultDesc, colNamesVisible);
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
break;
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
case CMD_UPDATE:
get_update_query_def(query, &context, colNamesVisible);
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
break;
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
case CMD_INSERT:
get_insert_query_def(query, &context, colNamesVisible);
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
break;
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
case CMD_DELETE:
get_delete_query_def(query, &context, colNamesVisible);
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
break;
case CMD_MERGE:
get_merge_query_def(query, &context, colNamesVisible);
break;
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
case CMD_NOTHING:
appendStringInfoString(buf, "NOTHING");
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
break;
case CMD_UTILITY:
get_utility_query_def(query, &context);
break;
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
default:
elog(ERROR, "unrecognized query command type: %d",
query->commandType);
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
break;
}
}
/* ----------
* get_values_def - Parse back a VALUES list
* ----------
*/
static void
get_values_def(List *values_lists, deparse_context *context)
{
StringInfo buf = context->buf;
bool first_list = true;
ListCell *vtl;
appendStringInfoString(buf, "VALUES ");
foreach(vtl, values_lists)
{
List *sublist = (List *) lfirst(vtl);
bool first_col = true;
ListCell *lc;
if (first_list)
first_list = false;
else
appendStringInfoString(buf, ", ");
appendStringInfoChar(buf, '(');
foreach(lc, sublist)
{
Node *col = (Node *) lfirst(lc);
if (first_col)
first_col = false;
else
appendStringInfoChar(buf, ',');
/*
Make INSERT-from-multiple-VALUES-rows handle targetlist indirection better. Previously, if an INSERT with multiple rows of VALUES had indirection (array subscripting or field selection) in its target-columns list, the parser handled that by applying transformAssignedExpr() to each element of each VALUES row independently. This led to having ArrayRef assignment nodes or FieldStore nodes in each row of the VALUES RTE. That works for simple cases, but in bug #14265 Nuri Boardman points out that it fails if there are multiple assignments to elements/fields of the same target column. For such cases to work, rewriteTargetListIU() has to nest the ArrayRefs or FieldStores together to produce a single expression to be assigned to the column. But it failed to find them in the top-level targetlist and issued an error about "multiple assignments to same column". We could possibly fix this by teaching the rewriter to apply rewriteTargetListIU to each VALUES row separately, but that would be messy (it would change the output rowtype of the VALUES RTE, for example) and inefficient. Instead, let's fix the parser so that the VALUES RTE outputs are just the user-specified values, cast to the right type if necessary, and then the ArrayRefs or FieldStores are applied in the top-level targetlist to Vars representing the RTE's outputs. This is the same parsetree representation already used for similar cases with INSERT/SELECT syntax, so it allows simplifications in ruleutils.c, which no longer needs to treat INSERT-from-multiple-VALUES as its own special case. This implementation works by applying transformAssignedExpr to the VALUES entries as before, and then stripping off any ArrayRefs or FieldStores it adds. With lots of VALUES rows it would be noticeably more efficient to not add those nodes in the first place. But that's just an optimization not a bug fix, and there doesn't seem to be any good way to do it without significant refactoring. (A non-invasive answer would be to apply transformAssignedExpr + stripping to just the first VALUES row, and then just forcibly cast remaining rows to the same data types exposed in the first row. But this way would lead to different, not-INSERT-specific errors being reported in casting failure cases, so it doesn't seem very nice.) So leave that for later; this patch at least isn't making the per-row parsing work worse, and it does make the finished parsetree smaller, saving rewriter and planner work. Catversion bump because stored rules containing such INSERTs would need to change. Because of that, no back-patch, even though this is a very long-standing bug. Report: <20160727005725.7438.26021@wrigleys.postgresql.org> Discussion: <9578.1469645245@sss.pgh.pa.us>
2016-08-03 22:37:03 +02:00
* Print the value. Whole-row Vars need special treatment.
*/
Make INSERT-from-multiple-VALUES-rows handle targetlist indirection better. Previously, if an INSERT with multiple rows of VALUES had indirection (array subscripting or field selection) in its target-columns list, the parser handled that by applying transformAssignedExpr() to each element of each VALUES row independently. This led to having ArrayRef assignment nodes or FieldStore nodes in each row of the VALUES RTE. That works for simple cases, but in bug #14265 Nuri Boardman points out that it fails if there are multiple assignments to elements/fields of the same target column. For such cases to work, rewriteTargetListIU() has to nest the ArrayRefs or FieldStores together to produce a single expression to be assigned to the column. But it failed to find them in the top-level targetlist and issued an error about "multiple assignments to same column". We could possibly fix this by teaching the rewriter to apply rewriteTargetListIU to each VALUES row separately, but that would be messy (it would change the output rowtype of the VALUES RTE, for example) and inefficient. Instead, let's fix the parser so that the VALUES RTE outputs are just the user-specified values, cast to the right type if necessary, and then the ArrayRefs or FieldStores are applied in the top-level targetlist to Vars representing the RTE's outputs. This is the same parsetree representation already used for similar cases with INSERT/SELECT syntax, so it allows simplifications in ruleutils.c, which no longer needs to treat INSERT-from-multiple-VALUES as its own special case. This implementation works by applying transformAssignedExpr to the VALUES entries as before, and then stripping off any ArrayRefs or FieldStores it adds. With lots of VALUES rows it would be noticeably more efficient to not add those nodes in the first place. But that's just an optimization not a bug fix, and there doesn't seem to be any good way to do it without significant refactoring. (A non-invasive answer would be to apply transformAssignedExpr + stripping to just the first VALUES row, and then just forcibly cast remaining rows to the same data types exposed in the first row. But this way would lead to different, not-INSERT-specific errors being reported in casting failure cases, so it doesn't seem very nice.) So leave that for later; this patch at least isn't making the per-row parsing work worse, and it does make the finished parsetree smaller, saving rewriter and planner work. Catversion bump because stored rules containing such INSERTs would need to change. Because of that, no back-patch, even though this is a very long-standing bug. Report: <20160727005725.7438.26021@wrigleys.postgresql.org> Discussion: <9578.1469645245@sss.pgh.pa.us>
2016-08-03 22:37:03 +02:00
get_rule_expr_toplevel(col, context, false);
}
appendStringInfoChar(buf, ')');
}
}
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
/* ----------
* get_with_clause - Parse back a WITH clause
* ----------
*/
static void
get_with_clause(Query *query, deparse_context *context)
{
StringInfo buf = context->buf;
const char *sep;
ListCell *l;
if (query->cteList == NIL)
return;
if (PRETTY_INDENT(context))
{
context->indentLevel += PRETTYINDENT_STD;
appendStringInfoChar(buf, ' ');
}
if (query->hasRecursive)
sep = "WITH RECURSIVE ";
else
sep = "WITH ";
foreach(l, query->cteList)
{
CommonTableExpr *cte = (CommonTableExpr *) lfirst(l);
appendStringInfoString(buf, sep);
appendStringInfoString(buf, quote_identifier(cte->ctename));
if (cte->aliascolnames)
{
bool first = true;
ListCell *col;
appendStringInfoChar(buf, '(');
foreach(col, cte->aliascolnames)
{
if (first)
first = false;
else
appendStringInfoString(buf, ", ");
appendStringInfoString(buf,
quote_identifier(strVal(lfirst(col))));
}
appendStringInfoChar(buf, ')');
}
Allow user control of CTE materialization, and change the default behavior. Historically we've always materialized the full output of a CTE query, treating WITH as an optimization fence (so that, for example, restrictions from the outer query cannot be pushed into it). This is appropriate when the CTE query is INSERT/UPDATE/DELETE, or is recursive; but when the CTE query is non-recursive and side-effect-free, there's no hazard of changing the query results by pushing restrictions down. Another argument for materialization is that it can avoid duplicate computation of an expensive WITH query --- but that only applies if the WITH query is called more than once in the outer query. Even then it could still be a net loss, if each call has restrictions that would allow just a small part of the WITH query to be computed. Hence, let's change the behavior for WITH queries that are non-recursive and side-effect-free. By default, we will inline them into the outer query (removing the optimization fence) if they are called just once. If they are called more than once, we will keep the old behavior by default, but the user can override this and force inlining by specifying NOT MATERIALIZED. Lastly, the user can force the old behavior by specifying MATERIALIZED; this would mainly be useful when the query had deliberately been employing WITH as an optimization fence to prevent a poor choice of plan. Andreas Karlsson, Andrew Gierth, David Fetter Discussion: https://postgr.es/m/87sh48ffhb.fsf@news-spur.riddles.org.uk
2019-02-16 22:11:12 +01:00
appendStringInfoString(buf, " AS ");
switch (cte->ctematerialized)
{
case CTEMaterializeDefault:
break;
case CTEMaterializeAlways:
appendStringInfoString(buf, "MATERIALIZED ");
break;
case CTEMaterializeNever:
appendStringInfoString(buf, "NOT MATERIALIZED ");
break;
}
appendStringInfoChar(buf, '(');
if (PRETTY_INDENT(context))
appendContextKeyword(context, "", 0, 0, 0);
get_query_def((Query *) cte->ctequery, buf, context->namespaces, NULL,
true,
context->prettyFlags, context->wrapColumn,
context->indentLevel);
if (PRETTY_INDENT(context))
appendContextKeyword(context, "", 0, 0, 0);
appendStringInfoChar(buf, ')');
if (cte->search_clause)
{
bool first = true;
ListCell *lc;
appendStringInfo(buf, " SEARCH %s FIRST BY ",
cte->search_clause->search_breadth_first ? "BREADTH" : "DEPTH");
foreach(lc, cte->search_clause->search_col_list)
{
if (first)
first = false;
else
appendStringInfoString(buf, ", ");
appendStringInfoString(buf,
quote_identifier(strVal(lfirst(lc))));
}
appendStringInfo(buf, " SET %s", quote_identifier(cte->search_clause->search_seq_column));
}
if (cte->cycle_clause)
{
bool first = true;
ListCell *lc;
appendStringInfoString(buf, " CYCLE ");
foreach(lc, cte->cycle_clause->cycle_col_list)
{
if (first)
first = false;
else
appendStringInfoString(buf, ", ");
appendStringInfoString(buf,
quote_identifier(strVal(lfirst(lc))));
}
appendStringInfo(buf, " SET %s", quote_identifier(cte->cycle_clause->cycle_mark_column));
{
Const *cmv = castNode(Const, cte->cycle_clause->cycle_mark_value);
Const *cmd = castNode(Const, cte->cycle_clause->cycle_mark_default);
if (!(cmv->consttype == BOOLOID && !cmv->constisnull && DatumGetBool(cmv->constvalue) == true &&
cmd->consttype == BOOLOID && !cmd->constisnull && DatumGetBool(cmd->constvalue) == false))
{
appendStringInfoString(buf, " TO ");
get_rule_expr(cte->cycle_clause->cycle_mark_value, context, false);
appendStringInfoString(buf, " DEFAULT ");
get_rule_expr(cte->cycle_clause->cycle_mark_default, context, false);
}
}
appendStringInfo(buf, " USING %s", quote_identifier(cte->cycle_clause->cycle_path_column));
}
sep = ", ";
}
if (PRETTY_INDENT(context))
{
context->indentLevel -= PRETTYINDENT_STD;
appendContextKeyword(context, "", 0, 0, 0);
}
else
appendStringInfoChar(buf, ' ');
}
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
/* ----------
* get_select_query_def - Parse back a SELECT parsetree
* ----------
*/
static void
get_select_query_def(Query *query, deparse_context *context,
TupleDesc resultDesc, bool colNamesVisible)
{
StringInfo buf = context->buf;
List *save_windowclause;
List *save_windowtlist;
bool force_colno;
ListCell *l;
/* Insert the WITH clause if given */
get_with_clause(query, context);
/* Set up context for possible window functions */
save_windowclause = context->windowClause;
context->windowClause = query->windowClause;
save_windowtlist = context->windowTList;
context->windowTList = query->targetList;
/*
* If the Query node has a setOperations tree, then it's the top level of
* a UNION/INTERSECT/EXCEPT query; only the WITH, ORDER BY and LIMIT
* fields are interesting in the top query itself.
*/
if (query->setOperations)
{
get_setop_query(query->setOperations, query, context, resultDesc,
colNamesVisible);
/* ORDER BY clauses must be simple in this case */
force_colno = true;
}
else
{
get_basic_select_query(query, context, resultDesc, colNamesVisible);
force_colno = false;
}
/* Add the ORDER BY clause if given */
if (query->sortClause != NIL)
{
appendContextKeyword(context, " ORDER BY ",
-PRETTYINDENT_STD, PRETTYINDENT_STD, 1);
get_rule_orderby(query->sortClause, query->targetList,
force_colno, context);
}
/*
* Add the LIMIT/OFFSET clauses if given. If non-default options, use the
* standard spelling of LIMIT.
*/
if (query->limitOffset != NULL)
{
appendContextKeyword(context, " OFFSET ",
-PRETTYINDENT_STD, PRETTYINDENT_STD, 0);
get_rule_expr(query->limitOffset, context, false);
}
if (query->limitCount != NULL)
{
if (query->limitOption == LIMIT_OPTION_WITH_TIES)
{
appendContextKeyword(context, " FETCH FIRST ",
-PRETTYINDENT_STD, PRETTYINDENT_STD, 0);
get_rule_expr(query->limitCount, context, false);
appendStringInfoString(buf, " ROWS WITH TIES");
}
else
{
appendContextKeyword(context, " LIMIT ",
-PRETTYINDENT_STD, PRETTYINDENT_STD, 0);
if (IsA(query->limitCount, Const) &&
((Const *) query->limitCount)->constisnull)
appendStringInfoString(buf, "ALL");
else
get_rule_expr(query->limitCount, context, false);
}
}
Improve concurrency of foreign key locking This patch introduces two additional lock modes for tuples: "SELECT FOR KEY SHARE" and "SELECT FOR NO KEY UPDATE". These don't block each other, in contrast with already existing "SELECT FOR SHARE" and "SELECT FOR UPDATE". UPDATE commands that do not modify the values stored in the columns that are part of the key of the tuple now grab a SELECT FOR NO KEY UPDATE lock on the tuple, allowing them to proceed concurrently with tuple locks of the FOR KEY SHARE variety. Foreign key triggers now use FOR KEY SHARE instead of FOR SHARE; this means the concurrency improvement applies to them, which is the whole point of this patch. The added tuple lock semantics require some rejiggering of the multixact module, so that the locking level that each transaction is holding can be stored alongside its Xid. Also, multixacts now need to persist across server restarts and crashes, because they can now represent not only tuple locks, but also tuple updates. This means we need more careful tracking of lifetime of pg_multixact SLRU files; since they now persist longer, we require more infrastructure to figure out when they can be removed. pg_upgrade also needs to be careful to copy pg_multixact files over from the old server to the new, or at least part of multixact.c state, depending on the versions of the old and new servers. Tuple time qualification rules (HeapTupleSatisfies routines) need to be careful not to consider tuples with the "is multi" infomask bit set as being only locked; they might need to look up MultiXact values (i.e. possibly do pg_multixact I/O) to find out the Xid that updated a tuple, whereas they previously were assured to only use information readily available from the tuple header. This is considered acceptable, because the extra I/O would involve cases that would previously cause some commands to block waiting for concurrent transactions to finish. Another important change is the fact that locking tuples that have previously been updated causes the future versions to be marked as locked, too; this is essential for correctness of foreign key checks. This causes additional WAL-logging, also (there was previously a single WAL record for a locked tuple; now there are as many as updated copies of the tuple there exist.) With all this in place, contention related to tuples being checked by foreign key rules should be much reduced. As a bonus, the old behavior that a subtransaction grabbing a stronger tuple lock than the parent (sub)transaction held on a given tuple and later aborting caused the weaker lock to be lost, has been fixed. Many new spec files were added for isolation tester framework, to ensure overall behavior is sane. There's probably room for several more tests. There were several reviewers of this patch; in particular, Noah Misch and Andres Freund spent considerable time in it. Original idea for the patch came from Simon Riggs, after a problem report by Joel Jacobson. Most code is from me, with contributions from Marti Raudsepp, Alexander Shulgin, Noah Misch and Andres Freund. This patch was discussed in several pgsql-hackers threads; the most important start at the following message-ids: AANLkTimo9XVcEzfiBR-ut3KVNDkjm2Vxh+t8kAmWjPuv@mail.gmail.com 1290721684-sup-3951@alvh.no-ip.org 1294953201-sup-2099@alvh.no-ip.org 1320343602-sup-2290@alvh.no-ip.org 1339690386-sup-8927@alvh.no-ip.org 4FE5FF020200002500048A3D@gw.wicourts.gov 4FEAB90A0200002500048B7D@gw.wicourts.gov
2013-01-23 16:04:59 +01:00
/* Add FOR [KEY] UPDATE/SHARE clauses if present */
if (query->hasForUpdate)
{
foreach(l, query->rowMarks)
{
RowMarkClause *rc = (RowMarkClause *) lfirst(l);
/* don't print implicit clauses */
if (rc->pushedDown)
continue;
Improve concurrency of foreign key locking This patch introduces two additional lock modes for tuples: "SELECT FOR KEY SHARE" and "SELECT FOR NO KEY UPDATE". These don't block each other, in contrast with already existing "SELECT FOR SHARE" and "SELECT FOR UPDATE". UPDATE commands that do not modify the values stored in the columns that are part of the key of the tuple now grab a SELECT FOR NO KEY UPDATE lock on the tuple, allowing them to proceed concurrently with tuple locks of the FOR KEY SHARE variety. Foreign key triggers now use FOR KEY SHARE instead of FOR SHARE; this means the concurrency improvement applies to them, which is the whole point of this patch. The added tuple lock semantics require some rejiggering of the multixact module, so that the locking level that each transaction is holding can be stored alongside its Xid. Also, multixacts now need to persist across server restarts and crashes, because they can now represent not only tuple locks, but also tuple updates. This means we need more careful tracking of lifetime of pg_multixact SLRU files; since they now persist longer, we require more infrastructure to figure out when they can be removed. pg_upgrade also needs to be careful to copy pg_multixact files over from the old server to the new, or at least part of multixact.c state, depending on the versions of the old and new servers. Tuple time qualification rules (HeapTupleSatisfies routines) need to be careful not to consider tuples with the "is multi" infomask bit set as being only locked; they might need to look up MultiXact values (i.e. possibly do pg_multixact I/O) to find out the Xid that updated a tuple, whereas they previously were assured to only use information readily available from the tuple header. This is considered acceptable, because the extra I/O would involve cases that would previously cause some commands to block waiting for concurrent transactions to finish. Another important change is the fact that locking tuples that have previously been updated causes the future versions to be marked as locked, too; this is essential for correctness of foreign key checks. This causes additional WAL-logging, also (there was previously a single WAL record for a locked tuple; now there are as many as updated copies of the tuple there exist.) With all this in place, contention related to tuples being checked by foreign key rules should be much reduced. As a bonus, the old behavior that a subtransaction grabbing a stronger tuple lock than the parent (sub)transaction held on a given tuple and later aborting caused the weaker lock to be lost, has been fixed. Many new spec files were added for isolation tester framework, to ensure overall behavior is sane. There's probably room for several more tests. There were several reviewers of this patch; in particular, Noah Misch and Andres Freund spent considerable time in it. Original idea for the patch came from Simon Riggs, after a problem report by Joel Jacobson. Most code is from me, with contributions from Marti Raudsepp, Alexander Shulgin, Noah Misch and Andres Freund. This patch was discussed in several pgsql-hackers threads; the most important start at the following message-ids: AANLkTimo9XVcEzfiBR-ut3KVNDkjm2Vxh+t8kAmWjPuv@mail.gmail.com 1290721684-sup-3951@alvh.no-ip.org 1294953201-sup-2099@alvh.no-ip.org 1320343602-sup-2290@alvh.no-ip.org 1339690386-sup-8927@alvh.no-ip.org 4FE5FF020200002500048A3D@gw.wicourts.gov 4FEAB90A0200002500048B7D@gw.wicourts.gov
2013-01-23 16:04:59 +01:00
switch (rc->strength)
{
2015-03-15 23:41:47 +01:00
case LCS_NONE:
/* we intentionally throw an error for LCS_NONE */
elog(ERROR, "unrecognized LockClauseStrength %d",
(int) rc->strength);
break;
Improve concurrency of foreign key locking This patch introduces two additional lock modes for tuples: "SELECT FOR KEY SHARE" and "SELECT FOR NO KEY UPDATE". These don't block each other, in contrast with already existing "SELECT FOR SHARE" and "SELECT FOR UPDATE". UPDATE commands that do not modify the values stored in the columns that are part of the key of the tuple now grab a SELECT FOR NO KEY UPDATE lock on the tuple, allowing them to proceed concurrently with tuple locks of the FOR KEY SHARE variety. Foreign key triggers now use FOR KEY SHARE instead of FOR SHARE; this means the concurrency improvement applies to them, which is the whole point of this patch. The added tuple lock semantics require some rejiggering of the multixact module, so that the locking level that each transaction is holding can be stored alongside its Xid. Also, multixacts now need to persist across server restarts and crashes, because they can now represent not only tuple locks, but also tuple updates. This means we need more careful tracking of lifetime of pg_multixact SLRU files; since they now persist longer, we require more infrastructure to figure out when they can be removed. pg_upgrade also needs to be careful to copy pg_multixact files over from the old server to the new, or at least part of multixact.c state, depending on the versions of the old and new servers. Tuple time qualification rules (HeapTupleSatisfies routines) need to be careful not to consider tuples with the "is multi" infomask bit set as being only locked; they might need to look up MultiXact values (i.e. possibly do pg_multixact I/O) to find out the Xid that updated a tuple, whereas they previously were assured to only use information readily available from the tuple header. This is considered acceptable, because the extra I/O would involve cases that would previously cause some commands to block waiting for concurrent transactions to finish. Another important change is the fact that locking tuples that have previously been updated causes the future versions to be marked as locked, too; this is essential for correctness of foreign key checks. This causes additional WAL-logging, also (there was previously a single WAL record for a locked tuple; now there are as many as updated copies of the tuple there exist.) With all this in place, contention related to tuples being checked by foreign key rules should be much reduced. As a bonus, the old behavior that a subtransaction grabbing a stronger tuple lock than the parent (sub)transaction held on a given tuple and later aborting caused the weaker lock to be lost, has been fixed. Many new spec files were added for isolation tester framework, to ensure overall behavior is sane. There's probably room for several more tests. There were several reviewers of this patch; in particular, Noah Misch and Andres Freund spent considerable time in it. Original idea for the patch came from Simon Riggs, after a problem report by Joel Jacobson. Most code is from me, with contributions from Marti Raudsepp, Alexander Shulgin, Noah Misch and Andres Freund. This patch was discussed in several pgsql-hackers threads; the most important start at the following message-ids: AANLkTimo9XVcEzfiBR-ut3KVNDkjm2Vxh+t8kAmWjPuv@mail.gmail.com 1290721684-sup-3951@alvh.no-ip.org 1294953201-sup-2099@alvh.no-ip.org 1320343602-sup-2290@alvh.no-ip.org 1339690386-sup-8927@alvh.no-ip.org 4FE5FF020200002500048A3D@gw.wicourts.gov 4FEAB90A0200002500048B7D@gw.wicourts.gov
2013-01-23 16:04:59 +01:00
case LCS_FORKEYSHARE:
appendContextKeyword(context, " FOR KEY SHARE",
-PRETTYINDENT_STD, PRETTYINDENT_STD, 0);
break;
case LCS_FORSHARE:
appendContextKeyword(context, " FOR SHARE",
-PRETTYINDENT_STD, PRETTYINDENT_STD, 0);
break;
case LCS_FORNOKEYUPDATE:
appendContextKeyword(context, " FOR NO KEY UPDATE",
-PRETTYINDENT_STD, PRETTYINDENT_STD, 0);
break;
case LCS_FORUPDATE:
appendContextKeyword(context, " FOR UPDATE",
-PRETTYINDENT_STD, PRETTYINDENT_STD, 0);
break;
}
appendStringInfo(buf, " OF %s",
Improve ruleutils.c's heuristics for dealing with rangetable aliases. The previous scheme had bugs in some corner cases involving tables that had been renamed since a view was made. This could result in dumped views that failed to reload or reloaded incorrectly, as seen in bug #7553 from Lloyd Albin, as well as in some pgsql-hackers discussion back in January. Also, its behavior for printing EXPLAIN plans was sometimes confusing because of willingness to use the same alias for multiple RTEs (it was Ashutosh Bapat's complaint about that aspect that started the January thread). To fix, ensure that each RTE in the query has a unique unqualified alias, by modifying the alias if necessary (we add "_" and digits as needed to create a non-conflicting name). Then we can just print its variables with that alias, avoiding the confusing and bug-prone scheme of sometimes schema-qualifying variable names. In EXPLAIN, it proves to be expedient to take the further step of only assigning such aliases to RTEs that are actually referenced in the query, since the planner has a habit of generating extra RTEs with the same alias in situations such as inheritance-tree expansion. Although this fixes a bug of very long standing, I'm hesitant to back-patch such a noticeable behavioral change. My experiments while creating a regression test convinced me that actually incorrect output (as opposed to confusing output) occurs only in very narrow cases, which is backed up by the lack of previous complaints from the field. So we may be better off living with it in released branches; and in any case it'd be smart to let this ripen awhile in HEAD before we consider back-patching it.
2012-09-22 01:03:10 +02:00
quote_identifier(get_rtable_name(rc->rti,
context)));
if (rc->waitPolicy == LockWaitError)
appendStringInfoString(buf, " NOWAIT");
else if (rc->waitPolicy == LockWaitSkip)
appendStringInfoString(buf, " SKIP LOCKED");
}
}
context->windowClause = save_windowclause;
context->windowTList = save_windowtlist;
}
/*
* Detect whether query looks like SELECT ... FROM VALUES(),
* with no need to rename the output columns of the VALUES RTE.
* If so, return the VALUES RTE. Otherwise return NULL.
*/
static RangeTblEntry *
get_simple_values_rte(Query *query, TupleDesc resultDesc)
{
RangeTblEntry *result = NULL;
ListCell *lc;
/*
* We want to detect a match even if the Query also contains OLD or NEW
* rule RTEs. So the idea is to scan the rtable and see if there is only
* one inFromCl RTE that is a VALUES RTE.
*/
foreach(lc, query->rtable)
{
RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc);
if (rte->rtekind == RTE_VALUES && rte->inFromCl)
{
if (result)
return NULL; /* multiple VALUES (probably not possible) */
result = rte;
}
else if (rte->rtekind == RTE_RELATION && !rte->inFromCl)
continue; /* ignore rule entries */
else
return NULL; /* something else -> not simple VALUES */
}
/*
* We don't need to check the targetlist in any great detail, because
* parser/analyze.c will never generate a "bare" VALUES RTE --- they only
* appear inside auto-generated sub-queries with very restricted
* structure. However, DefineView might have modified the tlist by
* injecting new column aliases, or we might have some other column
* aliases forced by a resultDesc. We can only simplify if the RTE's
* column names match the names that get_target_list() would select.
*/
if (result)
{
ListCell *lcn;
int colno;
if (list_length(query->targetList) != list_length(result->eref->colnames))
return NULL; /* this probably cannot happen */
colno = 0;
forboth(lc, query->targetList, lcn, result->eref->colnames)
{
TargetEntry *tle = (TargetEntry *) lfirst(lc);
char *cname = strVal(lfirst(lcn));
char *colname;
if (tle->resjunk)
return NULL; /* this probably cannot happen */
/* compute name that get_target_list would use for column */
colno++;
if (resultDesc && colno <= resultDesc->natts)
colname = NameStr(TupleDescAttr(resultDesc, colno - 1)->attname);
else
colname = tle->resname;
/* does it match the VALUES RTE? */
if (colname == NULL || strcmp(colname, cname) != 0)
return NULL; /* column name has been changed */
}
}
return result;
}
static void
get_basic_select_query(Query *query, deparse_context *context,
TupleDesc resultDesc, bool colNamesVisible)
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
{
StringInfo buf = context->buf;
RangeTblEntry *values_rte;
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
char *sep;
ListCell *l;
if (PRETTY_INDENT(context))
{
context->indentLevel += PRETTYINDENT_STD;
appendStringInfoChar(buf, ' ');
}
/*
* If the query looks like SELECT * FROM (VALUES ...), then print just the
* VALUES part. This reverses what transformValuesClause() did at parse
* time.
*/
values_rte = get_simple_values_rte(query, resultDesc);
if (values_rte)
{
get_values_def(values_rte->values_lists, context);
return;
}
/*
* Build up the query string - first we say SELECT
*/
SQL-standard function body This adds support for writing CREATE FUNCTION and CREATE PROCEDURE statements for language SQL with a function body that conforms to the SQL standard and is portable to other implementations. Instead of the PostgreSQL-specific AS $$ string literal $$ syntax, this allows writing out the SQL statements making up the body unquoted, either as a single statement: CREATE FUNCTION add(a integer, b integer) RETURNS integer LANGUAGE SQL RETURN a + b; or as a block CREATE PROCEDURE insert_data(a integer, b integer) LANGUAGE SQL BEGIN ATOMIC INSERT INTO tbl VALUES (a); INSERT INTO tbl VALUES (b); END; The function body is parsed at function definition time and stored as expression nodes in a new pg_proc column prosqlbody. So at run time, no further parsing is required. However, this form does not support polymorphic arguments, because there is no more parse analysis done at call time. Dependencies between the function and the objects it uses are fully tracked. A new RETURN statement is introduced. This can only be used inside function bodies. Internally, it is treated much like a SELECT statement. psql needs some new intelligence to keep track of function body boundaries so that it doesn't send off statements when it sees semicolons that are inside a function body. Tested-by: Jaime Casanova <jcasanov@systemguards.com.ec> Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/1c11f1eb-f00c-43b7-799d-2d44132c02d7@2ndquadrant.com
2021-04-07 21:30:08 +02:00
if (query->isReturn)
appendStringInfoString(buf, "RETURN");
else
appendStringInfoString(buf, "SELECT");
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
/* Add the DISTINCT clause if given */
if (query->distinctClause != NIL)
{
if (query->hasDistinctOn)
{
appendStringInfoString(buf, " DISTINCT ON (");
sep = "";
foreach(l, query->distinctClause)
{
SortGroupClause *srt = (SortGroupClause *) lfirst(l);
appendStringInfoString(buf, sep);
Support GROUPING SETS, CUBE and ROLLUP. This SQL standard functionality allows to aggregate data by different GROUP BY clauses at once. Each grouping set returns rows with columns grouped by in other sets set to NULL. This could previously be achieved by doing each grouping as a separate query, conjoined by UNION ALLs. Besides being considerably more concise, grouping sets will in many cases be faster, requiring only one scan over the underlying data. The current implementation of grouping sets only supports using sorting for input. Individual sets that share a sort order are computed in one pass. If there are sets that don't share a sort order, additional sort & aggregation steps are performed. These additional passes are sourced by the previous sort step; thus avoiding repeated scans of the source data. The code is structured in a way that adding support for purely using hash aggregation or a mix of hashing and sorting is possible. Sorting was chosen to be supported first, as it is the most generic method of implementation. Instead of, as in an earlier versions of the patch, representing the chain of sort and aggregation steps as full blown planner and executor nodes, all but the first sort are performed inside the aggregation node itself. This avoids the need to do some unusual gymnastics to handle having to return aggregated and non-aggregated tuples from underlying nodes, as well as having to shut down underlying nodes early to limit memory usage. The optimizer still builds Sort/Agg node to describe each phase, but they're not part of the plan tree, but instead additional data for the aggregation node. They're a convenient and preexisting way to describe aggregation and sorting. The first (and possibly only) sort step is still performed as a separate execution step. That retains similarity with existing group by plans, makes rescans fairly simple, avoids very deep plans (leading to slow explains) and easily allows to avoid the sorting step if the underlying data is sorted by other means. A somewhat ugly side of this patch is having to deal with a grammar ambiguity between the new CUBE keyword and the cube extension/functions named cube (and rollup). To avoid breaking existing deployments of the cube extension it has not been renamed, neither has cube been made a reserved keyword. Instead precedence hacking is used to make GROUP BY cube(..) refer to the CUBE grouping sets feature, and not the function cube(). To actually group by a function cube(), unlikely as that might be, the function name has to be quoted. Needs a catversion bump because stored rules may change. Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
2015-05-16 03:40:59 +02:00
get_rule_sortgroupclause(srt->tleSortGroupRef, query->targetList,
false, context);
sep = ", ";
}
appendStringInfoChar(buf, ')');
}
else
appendStringInfoString(buf, " DISTINCT");
}
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
/* Then we tell what to select (the targetlist) */
get_target_list(query->targetList, context, resultDesc, colNamesVisible);
/* Add the FROM clause if needed */
get_from_clause(query, " FROM ", context);
/* Add the WHERE clause if given */
if (query->jointree->quals != NULL)
{
appendContextKeyword(context, " WHERE ",
-PRETTYINDENT_STD, PRETTYINDENT_STD, 1);
get_rule_expr(query->jointree->quals, context, false);
}
/* Add the GROUP BY clause if given */
Support GROUPING SETS, CUBE and ROLLUP. This SQL standard functionality allows to aggregate data by different GROUP BY clauses at once. Each grouping set returns rows with columns grouped by in other sets set to NULL. This could previously be achieved by doing each grouping as a separate query, conjoined by UNION ALLs. Besides being considerably more concise, grouping sets will in many cases be faster, requiring only one scan over the underlying data. The current implementation of grouping sets only supports using sorting for input. Individual sets that share a sort order are computed in one pass. If there are sets that don't share a sort order, additional sort & aggregation steps are performed. These additional passes are sourced by the previous sort step; thus avoiding repeated scans of the source data. The code is structured in a way that adding support for purely using hash aggregation or a mix of hashing and sorting is possible. Sorting was chosen to be supported first, as it is the most generic method of implementation. Instead of, as in an earlier versions of the patch, representing the chain of sort and aggregation steps as full blown planner and executor nodes, all but the first sort are performed inside the aggregation node itself. This avoids the need to do some unusual gymnastics to handle having to return aggregated and non-aggregated tuples from underlying nodes, as well as having to shut down underlying nodes early to limit memory usage. The optimizer still builds Sort/Agg node to describe each phase, but they're not part of the plan tree, but instead additional data for the aggregation node. They're a convenient and preexisting way to describe aggregation and sorting. The first (and possibly only) sort step is still performed as a separate execution step. That retains similarity with existing group by plans, makes rescans fairly simple, avoids very deep plans (leading to slow explains) and easily allows to avoid the sorting step if the underlying data is sorted by other means. A somewhat ugly side of this patch is having to deal with a grammar ambiguity between the new CUBE keyword and the cube extension/functions named cube (and rollup). To avoid breaking existing deployments of the cube extension it has not been renamed, neither has cube been made a reserved keyword. Instead precedence hacking is used to make GROUP BY cube(..) refer to the CUBE grouping sets feature, and not the function cube(). To actually group by a function cube(), unlikely as that might be, the function name has to be quoted. Needs a catversion bump because stored rules may change. Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
2015-05-16 03:40:59 +02:00
if (query->groupClause != NULL || query->groupingSets != NULL)
{
Support GROUPING SETS, CUBE and ROLLUP. This SQL standard functionality allows to aggregate data by different GROUP BY clauses at once. Each grouping set returns rows with columns grouped by in other sets set to NULL. This could previously be achieved by doing each grouping as a separate query, conjoined by UNION ALLs. Besides being considerably more concise, grouping sets will in many cases be faster, requiring only one scan over the underlying data. The current implementation of grouping sets only supports using sorting for input. Individual sets that share a sort order are computed in one pass. If there are sets that don't share a sort order, additional sort & aggregation steps are performed. These additional passes are sourced by the previous sort step; thus avoiding repeated scans of the source data. The code is structured in a way that adding support for purely using hash aggregation or a mix of hashing and sorting is possible. Sorting was chosen to be supported first, as it is the most generic method of implementation. Instead of, as in an earlier versions of the patch, representing the chain of sort and aggregation steps as full blown planner and executor nodes, all but the first sort are performed inside the aggregation node itself. This avoids the need to do some unusual gymnastics to handle having to return aggregated and non-aggregated tuples from underlying nodes, as well as having to shut down underlying nodes early to limit memory usage. The optimizer still builds Sort/Agg node to describe each phase, but they're not part of the plan tree, but instead additional data for the aggregation node. They're a convenient and preexisting way to describe aggregation and sorting. The first (and possibly only) sort step is still performed as a separate execution step. That retains similarity with existing group by plans, makes rescans fairly simple, avoids very deep plans (leading to slow explains) and easily allows to avoid the sorting step if the underlying data is sorted by other means. A somewhat ugly side of this patch is having to deal with a grammar ambiguity between the new CUBE keyword and the cube extension/functions named cube (and rollup). To avoid breaking existing deployments of the cube extension it has not been renamed, neither has cube been made a reserved keyword. Instead precedence hacking is used to make GROUP BY cube(..) refer to the CUBE grouping sets feature, and not the function cube(). To actually group by a function cube(), unlikely as that might be, the function name has to be quoted. Needs a catversion bump because stored rules may change. Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
2015-05-16 03:40:59 +02:00
ParseExprKind save_exprkind;
appendContextKeyword(context, " GROUP BY ",
-PRETTYINDENT_STD, PRETTYINDENT_STD, 1);
if (query->groupDistinct)
appendStringInfoString(buf, "DISTINCT ");
Support GROUPING SETS, CUBE and ROLLUP. This SQL standard functionality allows to aggregate data by different GROUP BY clauses at once. Each grouping set returns rows with columns grouped by in other sets set to NULL. This could previously be achieved by doing each grouping as a separate query, conjoined by UNION ALLs. Besides being considerably more concise, grouping sets will in many cases be faster, requiring only one scan over the underlying data. The current implementation of grouping sets only supports using sorting for input. Individual sets that share a sort order are computed in one pass. If there are sets that don't share a sort order, additional sort & aggregation steps are performed. These additional passes are sourced by the previous sort step; thus avoiding repeated scans of the source data. The code is structured in a way that adding support for purely using hash aggregation or a mix of hashing and sorting is possible. Sorting was chosen to be supported first, as it is the most generic method of implementation. Instead of, as in an earlier versions of the patch, representing the chain of sort and aggregation steps as full blown planner and executor nodes, all but the first sort are performed inside the aggregation node itself. This avoids the need to do some unusual gymnastics to handle having to return aggregated and non-aggregated tuples from underlying nodes, as well as having to shut down underlying nodes early to limit memory usage. The optimizer still builds Sort/Agg node to describe each phase, but they're not part of the plan tree, but instead additional data for the aggregation node. They're a convenient and preexisting way to describe aggregation and sorting. The first (and possibly only) sort step is still performed as a separate execution step. That retains similarity with existing group by plans, makes rescans fairly simple, avoids very deep plans (leading to slow explains) and easily allows to avoid the sorting step if the underlying data is sorted by other means. A somewhat ugly side of this patch is having to deal with a grammar ambiguity between the new CUBE keyword and the cube extension/functions named cube (and rollup). To avoid breaking existing deployments of the cube extension it has not been renamed, neither has cube been made a reserved keyword. Instead precedence hacking is used to make GROUP BY cube(..) refer to the CUBE grouping sets feature, and not the function cube(). To actually group by a function cube(), unlikely as that might be, the function name has to be quoted. Needs a catversion bump because stored rules may change. Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
2015-05-16 03:40:59 +02:00
save_exprkind = context->special_exprkind;
context->special_exprkind = EXPR_KIND_GROUP_BY;
if (query->groupingSets == NIL)
{
Support GROUPING SETS, CUBE and ROLLUP. This SQL standard functionality allows to aggregate data by different GROUP BY clauses at once. Each grouping set returns rows with columns grouped by in other sets set to NULL. This could previously be achieved by doing each grouping as a separate query, conjoined by UNION ALLs. Besides being considerably more concise, grouping sets will in many cases be faster, requiring only one scan over the underlying data. The current implementation of grouping sets only supports using sorting for input. Individual sets that share a sort order are computed in one pass. If there are sets that don't share a sort order, additional sort & aggregation steps are performed. These additional passes are sourced by the previous sort step; thus avoiding repeated scans of the source data. The code is structured in a way that adding support for purely using hash aggregation or a mix of hashing and sorting is possible. Sorting was chosen to be supported first, as it is the most generic method of implementation. Instead of, as in an earlier versions of the patch, representing the chain of sort and aggregation steps as full blown planner and executor nodes, all but the first sort are performed inside the aggregation node itself. This avoids the need to do some unusual gymnastics to handle having to return aggregated and non-aggregated tuples from underlying nodes, as well as having to shut down underlying nodes early to limit memory usage. The optimizer still builds Sort/Agg node to describe each phase, but they're not part of the plan tree, but instead additional data for the aggregation node. They're a convenient and preexisting way to describe aggregation and sorting. The first (and possibly only) sort step is still performed as a separate execution step. That retains similarity with existing group by plans, makes rescans fairly simple, avoids very deep plans (leading to slow explains) and easily allows to avoid the sorting step if the underlying data is sorted by other means. A somewhat ugly side of this patch is having to deal with a grammar ambiguity between the new CUBE keyword and the cube extension/functions named cube (and rollup). To avoid breaking existing deployments of the cube extension it has not been renamed, neither has cube been made a reserved keyword. Instead precedence hacking is used to make GROUP BY cube(..) refer to the CUBE grouping sets feature, and not the function cube(). To actually group by a function cube(), unlikely as that might be, the function name has to be quoted. Needs a catversion bump because stored rules may change. Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
2015-05-16 03:40:59 +02:00
sep = "";
foreach(l, query->groupClause)
{
SortGroupClause *grp = (SortGroupClause *) lfirst(l);
Support GROUPING SETS, CUBE and ROLLUP. This SQL standard functionality allows to aggregate data by different GROUP BY clauses at once. Each grouping set returns rows with columns grouped by in other sets set to NULL. This could previously be achieved by doing each grouping as a separate query, conjoined by UNION ALLs. Besides being considerably more concise, grouping sets will in many cases be faster, requiring only one scan over the underlying data. The current implementation of grouping sets only supports using sorting for input. Individual sets that share a sort order are computed in one pass. If there are sets that don't share a sort order, additional sort & aggregation steps are performed. These additional passes are sourced by the previous sort step; thus avoiding repeated scans of the source data. The code is structured in a way that adding support for purely using hash aggregation or a mix of hashing and sorting is possible. Sorting was chosen to be supported first, as it is the most generic method of implementation. Instead of, as in an earlier versions of the patch, representing the chain of sort and aggregation steps as full blown planner and executor nodes, all but the first sort are performed inside the aggregation node itself. This avoids the need to do some unusual gymnastics to handle having to return aggregated and non-aggregated tuples from underlying nodes, as well as having to shut down underlying nodes early to limit memory usage. The optimizer still builds Sort/Agg node to describe each phase, but they're not part of the plan tree, but instead additional data for the aggregation node. They're a convenient and preexisting way to describe aggregation and sorting. The first (and possibly only) sort step is still performed as a separate execution step. That retains similarity with existing group by plans, makes rescans fairly simple, avoids very deep plans (leading to slow explains) and easily allows to avoid the sorting step if the underlying data is sorted by other means. A somewhat ugly side of this patch is having to deal with a grammar ambiguity between the new CUBE keyword and the cube extension/functions named cube (and rollup). To avoid breaking existing deployments of the cube extension it has not been renamed, neither has cube been made a reserved keyword. Instead precedence hacking is used to make GROUP BY cube(..) refer to the CUBE grouping sets feature, and not the function cube(). To actually group by a function cube(), unlikely as that might be, the function name has to be quoted. Needs a catversion bump because stored rules may change. Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
2015-05-16 03:40:59 +02:00
appendStringInfoString(buf, sep);
get_rule_sortgroupclause(grp->tleSortGroupRef, query->targetList,
false, context);
sep = ", ";
}
}
Support GROUPING SETS, CUBE and ROLLUP. This SQL standard functionality allows to aggregate data by different GROUP BY clauses at once. Each grouping set returns rows with columns grouped by in other sets set to NULL. This could previously be achieved by doing each grouping as a separate query, conjoined by UNION ALLs. Besides being considerably more concise, grouping sets will in many cases be faster, requiring only one scan over the underlying data. The current implementation of grouping sets only supports using sorting for input. Individual sets that share a sort order are computed in one pass. If there are sets that don't share a sort order, additional sort & aggregation steps are performed. These additional passes are sourced by the previous sort step; thus avoiding repeated scans of the source data. The code is structured in a way that adding support for purely using hash aggregation or a mix of hashing and sorting is possible. Sorting was chosen to be supported first, as it is the most generic method of implementation. Instead of, as in an earlier versions of the patch, representing the chain of sort and aggregation steps as full blown planner and executor nodes, all but the first sort are performed inside the aggregation node itself. This avoids the need to do some unusual gymnastics to handle having to return aggregated and non-aggregated tuples from underlying nodes, as well as having to shut down underlying nodes early to limit memory usage. The optimizer still builds Sort/Agg node to describe each phase, but they're not part of the plan tree, but instead additional data for the aggregation node. They're a convenient and preexisting way to describe aggregation and sorting. The first (and possibly only) sort step is still performed as a separate execution step. That retains similarity with existing group by plans, makes rescans fairly simple, avoids very deep plans (leading to slow explains) and easily allows to avoid the sorting step if the underlying data is sorted by other means. A somewhat ugly side of this patch is having to deal with a grammar ambiguity between the new CUBE keyword and the cube extension/functions named cube (and rollup). To avoid breaking existing deployments of the cube extension it has not been renamed, neither has cube been made a reserved keyword. Instead precedence hacking is used to make GROUP BY cube(..) refer to the CUBE grouping sets feature, and not the function cube(). To actually group by a function cube(), unlikely as that might be, the function name has to be quoted. Needs a catversion bump because stored rules may change. Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
2015-05-16 03:40:59 +02:00
else
{
sep = "";
foreach(l, query->groupingSets)
{
GroupingSet *grp = lfirst(l);
appendStringInfoString(buf, sep);
get_rule_groupingset(grp, query->targetList, true, context);
sep = ", ";
}
}
context->special_exprkind = save_exprkind;
}
/* Add the HAVING clause if given */
if (query->havingQual != NULL)
{
appendContextKeyword(context, " HAVING ",
-PRETTYINDENT_STD, PRETTYINDENT_STD, 0);
get_rule_expr(query->havingQual, context, false);
}
/* Add the WINDOW clause if needed */
if (query->windowClause != NIL)
get_rule_windowclause(query, context);
}
/* ----------
* get_target_list - Parse back a SELECT target list
*
* This is also used for RETURNING lists in INSERT/UPDATE/DELETE.
*
* resultDesc and colNamesVisible are as for get_query_def()
* ----------
*/
static void
get_target_list(List *targetList, deparse_context *context,
TupleDesc resultDesc, bool colNamesVisible)
{
StringInfo buf = context->buf;
StringInfoData targetbuf;
bool last_was_multiline = false;
char *sep;
int colno;
ListCell *l;
/* we use targetbuf to hold each TLE's text temporarily */
initStringInfo(&targetbuf);
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
sep = " ";
colno = 0;
foreach(l, targetList)
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
{
TargetEntry *tle = (TargetEntry *) lfirst(l);
char *colname;
char *attname;
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
if (tle->resjunk)
continue; /* ignore junk entries */
appendStringInfoString(buf, sep);
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
sep = ", ";
colno++;
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
/*
* Put the new field text into targetbuf so we can decide after we've
* got it whether or not it needs to go on a new line.
*/
resetStringInfo(&targetbuf);
context->buf = &targetbuf;
/*
* We special-case Var nodes rather than using get_rule_expr. This is
* needed because get_rule_expr will display a whole-row Var as
* "foo.*", which is the preferred notation in most contexts, but at
* the top level of a SELECT list it's not right (the parser will
* expand that notation into multiple columns, yielding behavior
* different from a whole-row Var). We need to call get_variable
* directly so that we can tell it to do the right thing, and so that
* we can get the attribute name which is the default AS label.
*/
Support GROUPING SETS, CUBE and ROLLUP. This SQL standard functionality allows to aggregate data by different GROUP BY clauses at once. Each grouping set returns rows with columns grouped by in other sets set to NULL. This could previously be achieved by doing each grouping as a separate query, conjoined by UNION ALLs. Besides being considerably more concise, grouping sets will in many cases be faster, requiring only one scan over the underlying data. The current implementation of grouping sets only supports using sorting for input. Individual sets that share a sort order are computed in one pass. If there are sets that don't share a sort order, additional sort & aggregation steps are performed. These additional passes are sourced by the previous sort step; thus avoiding repeated scans of the source data. The code is structured in a way that adding support for purely using hash aggregation or a mix of hashing and sorting is possible. Sorting was chosen to be supported first, as it is the most generic method of implementation. Instead of, as in an earlier versions of the patch, representing the chain of sort and aggregation steps as full blown planner and executor nodes, all but the first sort are performed inside the aggregation node itself. This avoids the need to do some unusual gymnastics to handle having to return aggregated and non-aggregated tuples from underlying nodes, as well as having to shut down underlying nodes early to limit memory usage. The optimizer still builds Sort/Agg node to describe each phase, but they're not part of the plan tree, but instead additional data for the aggregation node. They're a convenient and preexisting way to describe aggregation and sorting. The first (and possibly only) sort step is still performed as a separate execution step. That retains similarity with existing group by plans, makes rescans fairly simple, avoids very deep plans (leading to slow explains) and easily allows to avoid the sorting step if the underlying data is sorted by other means. A somewhat ugly side of this patch is having to deal with a grammar ambiguity between the new CUBE keyword and the cube extension/functions named cube (and rollup). To avoid breaking existing deployments of the cube extension it has not been renamed, neither has cube been made a reserved keyword. Instead precedence hacking is used to make GROUP BY cube(..) refer to the CUBE grouping sets feature, and not the function cube(). To actually group by a function cube(), unlikely as that might be, the function name has to be quoted. Needs a catversion bump because stored rules may change. Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
2015-05-16 03:40:59 +02:00
if (tle->expr && (IsA(tle->expr, Var)))
{
attname = get_variable((Var *) tle->expr, 0, true, context);
}
else
{
get_rule_expr((Node *) tle->expr, context, true);
/*
* When colNamesVisible is true, we should always show the
* assigned column name explicitly. Otherwise, show it only if
* it's not FigureColname's fallback.
*/
attname = colNamesVisible ? NULL : "?column?";
}
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
/*
* Figure out what the result column should be called. In the context
* of a view, use the view's tuple descriptor (so as to pick up the
* effects of any column RENAME that's been done on the view).
* Otherwise, just use what we can find in the TLE.
*/
if (resultDesc && colno <= resultDesc->natts)
colname = NameStr(TupleDescAttr(resultDesc, colno - 1)->attname);
else
colname = tle->resname;
/* Show AS unless the column's name is correct as-is */
if (colname) /* resname could be NULL */
{
if (attname == NULL || strcmp(attname, colname) != 0)
appendStringInfo(&targetbuf, " AS %s", quote_identifier(colname));
}
/* Restore context's output buffer */
context->buf = buf;
/* Consider line-wrapping if enabled */
if (PRETTY_INDENT(context) && context->wrapColumn >= 0)
{
int leading_nl_pos;
/* Does the new field start with a new line? */
if (targetbuf.len > 0 && targetbuf.data[0] == '\n')
leading_nl_pos = 0;
else
leading_nl_pos = -1;
/* If so, we shouldn't add anything */
if (leading_nl_pos >= 0)
{
/* instead, remove any trailing spaces currently in buf */
removeStringInfoSpaces(buf);
}
else
{
char *trailing_nl;
/* Locate the start of the current line in the output buffer */
trailing_nl = strrchr(buf->data, '\n');
if (trailing_nl == NULL)
trailing_nl = buf->data;
else
trailing_nl++;
/*
* Add a newline, plus some indentation, if the new field is
* not the first and either the new field would cause an
* overflow or the last field used more than one line.
*/
if (colno > 1 &&
((strlen(trailing_nl) + targetbuf.len > context->wrapColumn) ||
last_was_multiline))
appendContextKeyword(context, "", -PRETTYINDENT_STD,
PRETTYINDENT_STD, PRETTYINDENT_VAR);
}
/* Remember this field's multiline status for next iteration */
last_was_multiline =
(strchr(targetbuf.data + leading_nl_pos + 1, '\n') != NULL);
}
/* Add the new field */
appendBinaryStringInfo(buf, targetbuf.data, targetbuf.len);
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
}
/* clean up */
pfree(targetbuf.data);
}
static void
get_setop_query(Node *setOp, Query *query, deparse_context *context,
TupleDesc resultDesc, bool colNamesVisible)
{
StringInfo buf = context->buf;
bool need_paren;
/* Guard against excessively long or deeply-nested queries */
CHECK_FOR_INTERRUPTS();
check_stack_depth();
if (IsA(setOp, RangeTblRef))
{
RangeTblRef *rtr = (RangeTblRef *) setOp;
RangeTblEntry *rte = rt_fetch(rtr->rtindex, query->rtable);
Query *subquery = rte->subquery;
Assert(subquery != NULL);
Assert(subquery->setOperations == NULL);
/* Need parens if WITH, ORDER BY, FOR UPDATE, or LIMIT; see gram.y */
need_paren = (subquery->cteList ||
subquery->sortClause ||
subquery->rowMarks ||
subquery->limitOffset ||
subquery->limitCount);
if (need_paren)
appendStringInfoChar(buf, '(');
get_query_def(subquery, buf, context->namespaces, resultDesc,
colNamesVisible,
context->prettyFlags, context->wrapColumn,
context->indentLevel);
if (need_paren)
appendStringInfoChar(buf, ')');
}
else if (IsA(setOp, SetOperationStmt))
{
SetOperationStmt *op = (SetOperationStmt *) setOp;
int subindent;
/*
* We force parens when nesting two SetOperationStmts, except when the
* lefthand input is another setop of the same kind. Syntactically,
* we could omit parens in rather more cases, but it seems best to use
* parens to flag cases where the setop operator changes. If we use
* parens, we also increase the indentation level for the child query.
*
* There are some cases in which parens are needed around a leaf query
* too, but those are more easily handled at the next level down (see
* code above).
*/
if (IsA(op->larg, SetOperationStmt))
{
SetOperationStmt *lop = (SetOperationStmt *) op->larg;
if (op->op == lop->op && op->all == lop->all)
need_paren = false;
else
need_paren = true;
}
else
need_paren = false;
if (need_paren)
{
appendStringInfoChar(buf, '(');
subindent = PRETTYINDENT_STD;
appendContextKeyword(context, "", subindent, 0, 0);
}
else
subindent = 0;
get_setop_query(op->larg, query, context, resultDesc, colNamesVisible);
if (need_paren)
appendContextKeyword(context, ") ", -subindent, 0, 0);
else if (PRETTY_INDENT(context))
appendContextKeyword(context, "", -subindent, 0, 0);
else
appendStringInfoChar(buf, ' ');
switch (op->op)
{
case SETOP_UNION:
appendStringInfoString(buf, "UNION ");
break;
case SETOP_INTERSECT:
appendStringInfoString(buf, "INTERSECT ");
break;
case SETOP_EXCEPT:
appendStringInfoString(buf, "EXCEPT ");
break;
default:
elog(ERROR, "unrecognized set op: %d",
(int) op->op);
}
if (op->all)
appendStringInfoString(buf, "ALL ");
/* Always parenthesize if RHS is another setop */
need_paren = IsA(op->rarg, SetOperationStmt);
/*
* The indentation code here is deliberately a bit different from that
* for the lefthand input, because we want the line breaks in
* different places.
*/
if (need_paren)
{
appendStringInfoChar(buf, '(');
subindent = PRETTYINDENT_STD;
}
else
subindent = 0;
appendContextKeyword(context, "", subindent, 0, 0);
get_setop_query(op->rarg, query, context, resultDesc, false);
if (PRETTY_INDENT(context))
context->indentLevel -= subindent;
if (need_paren)
appendContextKeyword(context, ")", 0, 0, 0);
}
else
{
elog(ERROR, "unrecognized node type: %d",
(int) nodeTag(setOp));
}
}
/*
* Display a sort/group clause.
*
* Also returns the expression tree, so caller need not find it again.
*/
static Node *
Support GROUPING SETS, CUBE and ROLLUP. This SQL standard functionality allows to aggregate data by different GROUP BY clauses at once. Each grouping set returns rows with columns grouped by in other sets set to NULL. This could previously be achieved by doing each grouping as a separate query, conjoined by UNION ALLs. Besides being considerably more concise, grouping sets will in many cases be faster, requiring only one scan over the underlying data. The current implementation of grouping sets only supports using sorting for input. Individual sets that share a sort order are computed in one pass. If there are sets that don't share a sort order, additional sort & aggregation steps are performed. These additional passes are sourced by the previous sort step; thus avoiding repeated scans of the source data. The code is structured in a way that adding support for purely using hash aggregation or a mix of hashing and sorting is possible. Sorting was chosen to be supported first, as it is the most generic method of implementation. Instead of, as in an earlier versions of the patch, representing the chain of sort and aggregation steps as full blown planner and executor nodes, all but the first sort are performed inside the aggregation node itself. This avoids the need to do some unusual gymnastics to handle having to return aggregated and non-aggregated tuples from underlying nodes, as well as having to shut down underlying nodes early to limit memory usage. The optimizer still builds Sort/Agg node to describe each phase, but they're not part of the plan tree, but instead additional data for the aggregation node. They're a convenient and preexisting way to describe aggregation and sorting. The first (and possibly only) sort step is still performed as a separate execution step. That retains similarity with existing group by plans, makes rescans fairly simple, avoids very deep plans (leading to slow explains) and easily allows to avoid the sorting step if the underlying data is sorted by other means. A somewhat ugly side of this patch is having to deal with a grammar ambiguity between the new CUBE keyword and the cube extension/functions named cube (and rollup). To avoid breaking existing deployments of the cube extension it has not been renamed, neither has cube been made a reserved keyword. Instead precedence hacking is used to make GROUP BY cube(..) refer to the CUBE grouping sets feature, and not the function cube(). To actually group by a function cube(), unlikely as that might be, the function name has to be quoted. Needs a catversion bump because stored rules may change. Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
2015-05-16 03:40:59 +02:00
get_rule_sortgroupclause(Index ref, List *tlist, bool force_colno,
deparse_context *context)
{
StringInfo buf = context->buf;
TargetEntry *tle;
Node *expr;
Support GROUPING SETS, CUBE and ROLLUP. This SQL standard functionality allows to aggregate data by different GROUP BY clauses at once. Each grouping set returns rows with columns grouped by in other sets set to NULL. This could previously be achieved by doing each grouping as a separate query, conjoined by UNION ALLs. Besides being considerably more concise, grouping sets will in many cases be faster, requiring only one scan over the underlying data. The current implementation of grouping sets only supports using sorting for input. Individual sets that share a sort order are computed in one pass. If there are sets that don't share a sort order, additional sort & aggregation steps are performed. These additional passes are sourced by the previous sort step; thus avoiding repeated scans of the source data. The code is structured in a way that adding support for purely using hash aggregation or a mix of hashing and sorting is possible. Sorting was chosen to be supported first, as it is the most generic method of implementation. Instead of, as in an earlier versions of the patch, representing the chain of sort and aggregation steps as full blown planner and executor nodes, all but the first sort are performed inside the aggregation node itself. This avoids the need to do some unusual gymnastics to handle having to return aggregated and non-aggregated tuples from underlying nodes, as well as having to shut down underlying nodes early to limit memory usage. The optimizer still builds Sort/Agg node to describe each phase, but they're not part of the plan tree, but instead additional data for the aggregation node. They're a convenient and preexisting way to describe aggregation and sorting. The first (and possibly only) sort step is still performed as a separate execution step. That retains similarity with existing group by plans, makes rescans fairly simple, avoids very deep plans (leading to slow explains) and easily allows to avoid the sorting step if the underlying data is sorted by other means. A somewhat ugly side of this patch is having to deal with a grammar ambiguity between the new CUBE keyword and the cube extension/functions named cube (and rollup). To avoid breaking existing deployments of the cube extension it has not been renamed, neither has cube been made a reserved keyword. Instead precedence hacking is used to make GROUP BY cube(..) refer to the CUBE grouping sets feature, and not the function cube(). To actually group by a function cube(), unlikely as that might be, the function name has to be quoted. Needs a catversion bump because stored rules may change. Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
2015-05-16 03:40:59 +02:00
tle = get_sortgroupref_tle(ref, tlist);
expr = (Node *) tle->expr;
/*
Support GROUPING SETS, CUBE and ROLLUP. This SQL standard functionality allows to aggregate data by different GROUP BY clauses at once. Each grouping set returns rows with columns grouped by in other sets set to NULL. This could previously be achieved by doing each grouping as a separate query, conjoined by UNION ALLs. Besides being considerably more concise, grouping sets will in many cases be faster, requiring only one scan over the underlying data. The current implementation of grouping sets only supports using sorting for input. Individual sets that share a sort order are computed in one pass. If there are sets that don't share a sort order, additional sort & aggregation steps are performed. These additional passes are sourced by the previous sort step; thus avoiding repeated scans of the source data. The code is structured in a way that adding support for purely using hash aggregation or a mix of hashing and sorting is possible. Sorting was chosen to be supported first, as it is the most generic method of implementation. Instead of, as in an earlier versions of the patch, representing the chain of sort and aggregation steps as full blown planner and executor nodes, all but the first sort are performed inside the aggregation node itself. This avoids the need to do some unusual gymnastics to handle having to return aggregated and non-aggregated tuples from underlying nodes, as well as having to shut down underlying nodes early to limit memory usage. The optimizer still builds Sort/Agg node to describe each phase, but they're not part of the plan tree, but instead additional data for the aggregation node. They're a convenient and preexisting way to describe aggregation and sorting. The first (and possibly only) sort step is still performed as a separate execution step. That retains similarity with existing group by plans, makes rescans fairly simple, avoids very deep plans (leading to slow explains) and easily allows to avoid the sorting step if the underlying data is sorted by other means. A somewhat ugly side of this patch is having to deal with a grammar ambiguity between the new CUBE keyword and the cube extension/functions named cube (and rollup). To avoid breaking existing deployments of the cube extension it has not been renamed, neither has cube been made a reserved keyword. Instead precedence hacking is used to make GROUP BY cube(..) refer to the CUBE grouping sets feature, and not the function cube(). To actually group by a function cube(), unlikely as that might be, the function name has to be quoted. Needs a catversion bump because stored rules may change. Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
2015-05-16 03:40:59 +02:00
* Use column-number form if requested by caller. Otherwise, if
* expression is a constant, force it to be dumped with an explicit cast
* as decoration --- this is because a simple integer constant is
* ambiguous (and will be misinterpreted by findTargetlistEntry()) if we
* dump it without any decoration. If it's anything more complex than a
* simple Var, then force extra parens around it, to ensure it can't be
* misinterpreted as a cube() or rollup() construct.
*/
if (force_colno)
{
Assert(!tle->resjunk);
appendStringInfo(buf, "%d", tle->resno);
}
else if (expr && IsA(expr, Const))
get_const_expr((Const *) expr, context, 1);
Support GROUPING SETS, CUBE and ROLLUP. This SQL standard functionality allows to aggregate data by different GROUP BY clauses at once. Each grouping set returns rows with columns grouped by in other sets set to NULL. This could previously be achieved by doing each grouping as a separate query, conjoined by UNION ALLs. Besides being considerably more concise, grouping sets will in many cases be faster, requiring only one scan over the underlying data. The current implementation of grouping sets only supports using sorting for input. Individual sets that share a sort order are computed in one pass. If there are sets that don't share a sort order, additional sort & aggregation steps are performed. These additional passes are sourced by the previous sort step; thus avoiding repeated scans of the source data. The code is structured in a way that adding support for purely using hash aggregation or a mix of hashing and sorting is possible. Sorting was chosen to be supported first, as it is the most generic method of implementation. Instead of, as in an earlier versions of the patch, representing the chain of sort and aggregation steps as full blown planner and executor nodes, all but the first sort are performed inside the aggregation node itself. This avoids the need to do some unusual gymnastics to handle having to return aggregated and non-aggregated tuples from underlying nodes, as well as having to shut down underlying nodes early to limit memory usage. The optimizer still builds Sort/Agg node to describe each phase, but they're not part of the plan tree, but instead additional data for the aggregation node. They're a convenient and preexisting way to describe aggregation and sorting. The first (and possibly only) sort step is still performed as a separate execution step. That retains similarity with existing group by plans, makes rescans fairly simple, avoids very deep plans (leading to slow explains) and easily allows to avoid the sorting step if the underlying data is sorted by other means. A somewhat ugly side of this patch is having to deal with a grammar ambiguity between the new CUBE keyword and the cube extension/functions named cube (and rollup). To avoid breaking existing deployments of the cube extension it has not been renamed, neither has cube been made a reserved keyword. Instead precedence hacking is used to make GROUP BY cube(..) refer to the CUBE grouping sets feature, and not the function cube(). To actually group by a function cube(), unlikely as that might be, the function name has to be quoted. Needs a catversion bump because stored rules may change. Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
2015-05-16 03:40:59 +02:00
else if (!expr || IsA(expr, Var))
get_rule_expr(expr, context, true);
else
Support GROUPING SETS, CUBE and ROLLUP. This SQL standard functionality allows to aggregate data by different GROUP BY clauses at once. Each grouping set returns rows with columns grouped by in other sets set to NULL. This could previously be achieved by doing each grouping as a separate query, conjoined by UNION ALLs. Besides being considerably more concise, grouping sets will in many cases be faster, requiring only one scan over the underlying data. The current implementation of grouping sets only supports using sorting for input. Individual sets that share a sort order are computed in one pass. If there are sets that don't share a sort order, additional sort & aggregation steps are performed. These additional passes are sourced by the previous sort step; thus avoiding repeated scans of the source data. The code is structured in a way that adding support for purely using hash aggregation or a mix of hashing and sorting is possible. Sorting was chosen to be supported first, as it is the most generic method of implementation. Instead of, as in an earlier versions of the patch, representing the chain of sort and aggregation steps as full blown planner and executor nodes, all but the first sort are performed inside the aggregation node itself. This avoids the need to do some unusual gymnastics to handle having to return aggregated and non-aggregated tuples from underlying nodes, as well as having to shut down underlying nodes early to limit memory usage. The optimizer still builds Sort/Agg node to describe each phase, but they're not part of the plan tree, but instead additional data for the aggregation node. They're a convenient and preexisting way to describe aggregation and sorting. The first (and possibly only) sort step is still performed as a separate execution step. That retains similarity with existing group by plans, makes rescans fairly simple, avoids very deep plans (leading to slow explains) and easily allows to avoid the sorting step if the underlying data is sorted by other means. A somewhat ugly side of this patch is having to deal with a grammar ambiguity between the new CUBE keyword and the cube extension/functions named cube (and rollup). To avoid breaking existing deployments of the cube extension it has not been renamed, neither has cube been made a reserved keyword. Instead precedence hacking is used to make GROUP BY cube(..) refer to the CUBE grouping sets feature, and not the function cube(). To actually group by a function cube(), unlikely as that might be, the function name has to be quoted. Needs a catversion bump because stored rules may change. Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
2015-05-16 03:40:59 +02:00
{
/*
* We must force parens for function-like expressions even if
* PRETTY_PAREN is off, since those are the ones in danger of
* misparsing. For other expressions we need to force them only if
* PRETTY_PAREN is on, since otherwise the expression will output them
* itself. (We can't skip the parens.)
*/
bool need_paren = (PRETTY_PAREN(context)
|| IsA(expr, FuncExpr)
|| IsA(expr, Aggref)
|| IsA(expr, WindowFunc)
|| IsA(expr, JsonConstructorExpr));
2015-05-24 03:35:49 +02:00
Support GROUPING SETS, CUBE and ROLLUP. This SQL standard functionality allows to aggregate data by different GROUP BY clauses at once. Each grouping set returns rows with columns grouped by in other sets set to NULL. This could previously be achieved by doing each grouping as a separate query, conjoined by UNION ALLs. Besides being considerably more concise, grouping sets will in many cases be faster, requiring only one scan over the underlying data. The current implementation of grouping sets only supports using sorting for input. Individual sets that share a sort order are computed in one pass. If there are sets that don't share a sort order, additional sort & aggregation steps are performed. These additional passes are sourced by the previous sort step; thus avoiding repeated scans of the source data. The code is structured in a way that adding support for purely using hash aggregation or a mix of hashing and sorting is possible. Sorting was chosen to be supported first, as it is the most generic method of implementation. Instead of, as in an earlier versions of the patch, representing the chain of sort and aggregation steps as full blown planner and executor nodes, all but the first sort are performed inside the aggregation node itself. This avoids the need to do some unusual gymnastics to handle having to return aggregated and non-aggregated tuples from underlying nodes, as well as having to shut down underlying nodes early to limit memory usage. The optimizer still builds Sort/Agg node to describe each phase, but they're not part of the plan tree, but instead additional data for the aggregation node. They're a convenient and preexisting way to describe aggregation and sorting. The first (and possibly only) sort step is still performed as a separate execution step. That retains similarity with existing group by plans, makes rescans fairly simple, avoids very deep plans (leading to slow explains) and easily allows to avoid the sorting step if the underlying data is sorted by other means. A somewhat ugly side of this patch is having to deal with a grammar ambiguity between the new CUBE keyword and the cube extension/functions named cube (and rollup). To avoid breaking existing deployments of the cube extension it has not been renamed, neither has cube been made a reserved keyword. Instead precedence hacking is used to make GROUP BY cube(..) refer to the CUBE grouping sets feature, and not the function cube(). To actually group by a function cube(), unlikely as that might be, the function name has to be quoted. Needs a catversion bump because stored rules may change. Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
2015-05-16 03:40:59 +02:00
if (need_paren)
appendStringInfoChar(context->buf, '(');
get_rule_expr(expr, context, true);
Support GROUPING SETS, CUBE and ROLLUP. This SQL standard functionality allows to aggregate data by different GROUP BY clauses at once. Each grouping set returns rows with columns grouped by in other sets set to NULL. This could previously be achieved by doing each grouping as a separate query, conjoined by UNION ALLs. Besides being considerably more concise, grouping sets will in many cases be faster, requiring only one scan over the underlying data. The current implementation of grouping sets only supports using sorting for input. Individual sets that share a sort order are computed in one pass. If there are sets that don't share a sort order, additional sort & aggregation steps are performed. These additional passes are sourced by the previous sort step; thus avoiding repeated scans of the source data. The code is structured in a way that adding support for purely using hash aggregation or a mix of hashing and sorting is possible. Sorting was chosen to be supported first, as it is the most generic method of implementation. Instead of, as in an earlier versions of the patch, representing the chain of sort and aggregation steps as full blown planner and executor nodes, all but the first sort are performed inside the aggregation node itself. This avoids the need to do some unusual gymnastics to handle having to return aggregated and non-aggregated tuples from underlying nodes, as well as having to shut down underlying nodes early to limit memory usage. The optimizer still builds Sort/Agg node to describe each phase, but they're not part of the plan tree, but instead additional data for the aggregation node. They're a convenient and preexisting way to describe aggregation and sorting. The first (and possibly only) sort step is still performed as a separate execution step. That retains similarity with existing group by plans, makes rescans fairly simple, avoids very deep plans (leading to slow explains) and easily allows to avoid the sorting step if the underlying data is sorted by other means. A somewhat ugly side of this patch is having to deal with a grammar ambiguity between the new CUBE keyword and the cube extension/functions named cube (and rollup). To avoid breaking existing deployments of the cube extension it has not been renamed, neither has cube been made a reserved keyword. Instead precedence hacking is used to make GROUP BY cube(..) refer to the CUBE grouping sets feature, and not the function cube(). To actually group by a function cube(), unlikely as that might be, the function name has to be quoted. Needs a catversion bump because stored rules may change. Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
2015-05-16 03:40:59 +02:00
if (need_paren)
appendStringInfoChar(context->buf, ')');
Support GROUPING SETS, CUBE and ROLLUP. This SQL standard functionality allows to aggregate data by different GROUP BY clauses at once. Each grouping set returns rows with columns grouped by in other sets set to NULL. This could previously be achieved by doing each grouping as a separate query, conjoined by UNION ALLs. Besides being considerably more concise, grouping sets will in many cases be faster, requiring only one scan over the underlying data. The current implementation of grouping sets only supports using sorting for input. Individual sets that share a sort order are computed in one pass. If there are sets that don't share a sort order, additional sort & aggregation steps are performed. These additional passes are sourced by the previous sort step; thus avoiding repeated scans of the source data. The code is structured in a way that adding support for purely using hash aggregation or a mix of hashing and sorting is possible. Sorting was chosen to be supported first, as it is the most generic method of implementation. Instead of, as in an earlier versions of the patch, representing the chain of sort and aggregation steps as full blown planner and executor nodes, all but the first sort are performed inside the aggregation node itself. This avoids the need to do some unusual gymnastics to handle having to return aggregated and non-aggregated tuples from underlying nodes, as well as having to shut down underlying nodes early to limit memory usage. The optimizer still builds Sort/Agg node to describe each phase, but they're not part of the plan tree, but instead additional data for the aggregation node. They're a convenient and preexisting way to describe aggregation and sorting. The first (and possibly only) sort step is still performed as a separate execution step. That retains similarity with existing group by plans, makes rescans fairly simple, avoids very deep plans (leading to slow explains) and easily allows to avoid the sorting step if the underlying data is sorted by other means. A somewhat ugly side of this patch is having to deal with a grammar ambiguity between the new CUBE keyword and the cube extension/functions named cube (and rollup). To avoid breaking existing deployments of the cube extension it has not been renamed, neither has cube been made a reserved keyword. Instead precedence hacking is used to make GROUP BY cube(..) refer to the CUBE grouping sets feature, and not the function cube(). To actually group by a function cube(), unlikely as that might be, the function name has to be quoted. Needs a catversion bump because stored rules may change. Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
2015-05-16 03:40:59 +02:00
}
return expr;
}
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
Support GROUPING SETS, CUBE and ROLLUP. This SQL standard functionality allows to aggregate data by different GROUP BY clauses at once. Each grouping set returns rows with columns grouped by in other sets set to NULL. This could previously be achieved by doing each grouping as a separate query, conjoined by UNION ALLs. Besides being considerably more concise, grouping sets will in many cases be faster, requiring only one scan over the underlying data. The current implementation of grouping sets only supports using sorting for input. Individual sets that share a sort order are computed in one pass. If there are sets that don't share a sort order, additional sort & aggregation steps are performed. These additional passes are sourced by the previous sort step; thus avoiding repeated scans of the source data. The code is structured in a way that adding support for purely using hash aggregation or a mix of hashing and sorting is possible. Sorting was chosen to be supported first, as it is the most generic method of implementation. Instead of, as in an earlier versions of the patch, representing the chain of sort and aggregation steps as full blown planner and executor nodes, all but the first sort are performed inside the aggregation node itself. This avoids the need to do some unusual gymnastics to handle having to return aggregated and non-aggregated tuples from underlying nodes, as well as having to shut down underlying nodes early to limit memory usage. The optimizer still builds Sort/Agg node to describe each phase, but they're not part of the plan tree, but instead additional data for the aggregation node. They're a convenient and preexisting way to describe aggregation and sorting. The first (and possibly only) sort step is still performed as a separate execution step. That retains similarity with existing group by plans, makes rescans fairly simple, avoids very deep plans (leading to slow explains) and easily allows to avoid the sorting step if the underlying data is sorted by other means. A somewhat ugly side of this patch is having to deal with a grammar ambiguity between the new CUBE keyword and the cube extension/functions named cube (and rollup). To avoid breaking existing deployments of the cube extension it has not been renamed, neither has cube been made a reserved keyword. Instead precedence hacking is used to make GROUP BY cube(..) refer to the CUBE grouping sets feature, and not the function cube(). To actually group by a function cube(), unlikely as that might be, the function name has to be quoted. Needs a catversion bump because stored rules may change. Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
2015-05-16 03:40:59 +02:00
/*
* Display a GroupingSet
*/
static void
get_rule_groupingset(GroupingSet *gset, List *targetlist,
bool omit_parens, deparse_context *context)
{
ListCell *l;
StringInfo buf = context->buf;
bool omit_child_parens = true;
char *sep = "";
switch (gset->kind)
{
case GROUPING_SET_EMPTY:
appendStringInfoString(buf, "()");
return;
case GROUPING_SET_SIMPLE:
{
if (!omit_parens || list_length(gset->content) != 1)
appendStringInfoChar(buf, '(');
Support GROUPING SETS, CUBE and ROLLUP. This SQL standard functionality allows to aggregate data by different GROUP BY clauses at once. Each grouping set returns rows with columns grouped by in other sets set to NULL. This could previously be achieved by doing each grouping as a separate query, conjoined by UNION ALLs. Besides being considerably more concise, grouping sets will in many cases be faster, requiring only one scan over the underlying data. The current implementation of grouping sets only supports using sorting for input. Individual sets that share a sort order are computed in one pass. If there are sets that don't share a sort order, additional sort & aggregation steps are performed. These additional passes are sourced by the previous sort step; thus avoiding repeated scans of the source data. The code is structured in a way that adding support for purely using hash aggregation or a mix of hashing and sorting is possible. Sorting was chosen to be supported first, as it is the most generic method of implementation. Instead of, as in an earlier versions of the patch, representing the chain of sort and aggregation steps as full blown planner and executor nodes, all but the first sort are performed inside the aggregation node itself. This avoids the need to do some unusual gymnastics to handle having to return aggregated and non-aggregated tuples from underlying nodes, as well as having to shut down underlying nodes early to limit memory usage. The optimizer still builds Sort/Agg node to describe each phase, but they're not part of the plan tree, but instead additional data for the aggregation node. They're a convenient and preexisting way to describe aggregation and sorting. The first (and possibly only) sort step is still performed as a separate execution step. That retains similarity with existing group by plans, makes rescans fairly simple, avoids very deep plans (leading to slow explains) and easily allows to avoid the sorting step if the underlying data is sorted by other means. A somewhat ugly side of this patch is having to deal with a grammar ambiguity between the new CUBE keyword and the cube extension/functions named cube (and rollup). To avoid breaking existing deployments of the cube extension it has not been renamed, neither has cube been made a reserved keyword. Instead precedence hacking is used to make GROUP BY cube(..) refer to the CUBE grouping sets feature, and not the function cube(). To actually group by a function cube(), unlikely as that might be, the function name has to be quoted. Needs a catversion bump because stored rules may change. Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
2015-05-16 03:40:59 +02:00
foreach(l, gset->content)
{
Index ref = lfirst_int(l);
appendStringInfoString(buf, sep);
get_rule_sortgroupclause(ref, targetlist,
false, context);
sep = ", ";
}
if (!omit_parens || list_length(gset->content) != 1)
appendStringInfoChar(buf, ')');
Support GROUPING SETS, CUBE and ROLLUP. This SQL standard functionality allows to aggregate data by different GROUP BY clauses at once. Each grouping set returns rows with columns grouped by in other sets set to NULL. This could previously be achieved by doing each grouping as a separate query, conjoined by UNION ALLs. Besides being considerably more concise, grouping sets will in many cases be faster, requiring only one scan over the underlying data. The current implementation of grouping sets only supports using sorting for input. Individual sets that share a sort order are computed in one pass. If there are sets that don't share a sort order, additional sort & aggregation steps are performed. These additional passes are sourced by the previous sort step; thus avoiding repeated scans of the source data. The code is structured in a way that adding support for purely using hash aggregation or a mix of hashing and sorting is possible. Sorting was chosen to be supported first, as it is the most generic method of implementation. Instead of, as in an earlier versions of the patch, representing the chain of sort and aggregation steps as full blown planner and executor nodes, all but the first sort are performed inside the aggregation node itself. This avoids the need to do some unusual gymnastics to handle having to return aggregated and non-aggregated tuples from underlying nodes, as well as having to shut down underlying nodes early to limit memory usage. The optimizer still builds Sort/Agg node to describe each phase, but they're not part of the plan tree, but instead additional data for the aggregation node. They're a convenient and preexisting way to describe aggregation and sorting. The first (and possibly only) sort step is still performed as a separate execution step. That retains similarity with existing group by plans, makes rescans fairly simple, avoids very deep plans (leading to slow explains) and easily allows to avoid the sorting step if the underlying data is sorted by other means. A somewhat ugly side of this patch is having to deal with a grammar ambiguity between the new CUBE keyword and the cube extension/functions named cube (and rollup). To avoid breaking existing deployments of the cube extension it has not been renamed, neither has cube been made a reserved keyword. Instead precedence hacking is used to make GROUP BY cube(..) refer to the CUBE grouping sets feature, and not the function cube(). To actually group by a function cube(), unlikely as that might be, the function name has to be quoted. Needs a catversion bump because stored rules may change. Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
2015-05-16 03:40:59 +02:00
}
return;
case GROUPING_SET_ROLLUP:
appendStringInfoString(buf, "ROLLUP(");
break;
case GROUPING_SET_CUBE:
appendStringInfoString(buf, "CUBE(");
break;
case GROUPING_SET_SETS:
appendStringInfoString(buf, "GROUPING SETS (");
omit_child_parens = false;
break;
}
foreach(l, gset->content)
{
appendStringInfoString(buf, sep);
get_rule_groupingset(lfirst(l), targetlist, omit_child_parens, context);
sep = ", ";
}
appendStringInfoChar(buf, ')');
Support GROUPING SETS, CUBE and ROLLUP. This SQL standard functionality allows to aggregate data by different GROUP BY clauses at once. Each grouping set returns rows with columns grouped by in other sets set to NULL. This could previously be achieved by doing each grouping as a separate query, conjoined by UNION ALLs. Besides being considerably more concise, grouping sets will in many cases be faster, requiring only one scan over the underlying data. The current implementation of grouping sets only supports using sorting for input. Individual sets that share a sort order are computed in one pass. If there are sets that don't share a sort order, additional sort & aggregation steps are performed. These additional passes are sourced by the previous sort step; thus avoiding repeated scans of the source data. The code is structured in a way that adding support for purely using hash aggregation or a mix of hashing and sorting is possible. Sorting was chosen to be supported first, as it is the most generic method of implementation. Instead of, as in an earlier versions of the patch, representing the chain of sort and aggregation steps as full blown planner and executor nodes, all but the first sort are performed inside the aggregation node itself. This avoids the need to do some unusual gymnastics to handle having to return aggregated and non-aggregated tuples from underlying nodes, as well as having to shut down underlying nodes early to limit memory usage. The optimizer still builds Sort/Agg node to describe each phase, but they're not part of the plan tree, but instead additional data for the aggregation node. They're a convenient and preexisting way to describe aggregation and sorting. The first (and possibly only) sort step is still performed as a separate execution step. That retains similarity with existing group by plans, makes rescans fairly simple, avoids very deep plans (leading to slow explains) and easily allows to avoid the sorting step if the underlying data is sorted by other means. A somewhat ugly side of this patch is having to deal with a grammar ambiguity between the new CUBE keyword and the cube extension/functions named cube (and rollup). To avoid breaking existing deployments of the cube extension it has not been renamed, neither has cube been made a reserved keyword. Instead precedence hacking is used to make GROUP BY cube(..) refer to the CUBE grouping sets feature, and not the function cube(). To actually group by a function cube(), unlikely as that might be, the function name has to be quoted. Needs a catversion bump because stored rules may change. Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
2015-05-16 03:40:59 +02:00
}
/*
* Display an ORDER BY list.
*/
static void
get_rule_orderby(List *orderList, List *targetList,
bool force_colno, deparse_context *context)
{
StringInfo buf = context->buf;
const char *sep;
ListCell *l;
sep = "";
foreach(l, orderList)
{
SortGroupClause *srt = (SortGroupClause *) lfirst(l);
Node *sortexpr;
Oid sortcoltype;
TypeCacheEntry *typentry;
appendStringInfoString(buf, sep);
Support GROUPING SETS, CUBE and ROLLUP. This SQL standard functionality allows to aggregate data by different GROUP BY clauses at once. Each grouping set returns rows with columns grouped by in other sets set to NULL. This could previously be achieved by doing each grouping as a separate query, conjoined by UNION ALLs. Besides being considerably more concise, grouping sets will in many cases be faster, requiring only one scan over the underlying data. The current implementation of grouping sets only supports using sorting for input. Individual sets that share a sort order are computed in one pass. If there are sets that don't share a sort order, additional sort & aggregation steps are performed. These additional passes are sourced by the previous sort step; thus avoiding repeated scans of the source data. The code is structured in a way that adding support for purely using hash aggregation or a mix of hashing and sorting is possible. Sorting was chosen to be supported first, as it is the most generic method of implementation. Instead of, as in an earlier versions of the patch, representing the chain of sort and aggregation steps as full blown planner and executor nodes, all but the first sort are performed inside the aggregation node itself. This avoids the need to do some unusual gymnastics to handle having to return aggregated and non-aggregated tuples from underlying nodes, as well as having to shut down underlying nodes early to limit memory usage. The optimizer still builds Sort/Agg node to describe each phase, but they're not part of the plan tree, but instead additional data for the aggregation node. They're a convenient and preexisting way to describe aggregation and sorting. The first (and possibly only) sort step is still performed as a separate execution step. That retains similarity with existing group by plans, makes rescans fairly simple, avoids very deep plans (leading to slow explains) and easily allows to avoid the sorting step if the underlying data is sorted by other means. A somewhat ugly side of this patch is having to deal with a grammar ambiguity between the new CUBE keyword and the cube extension/functions named cube (and rollup). To avoid breaking existing deployments of the cube extension it has not been renamed, neither has cube been made a reserved keyword. Instead precedence hacking is used to make GROUP BY cube(..) refer to the CUBE grouping sets feature, and not the function cube(). To actually group by a function cube(), unlikely as that might be, the function name has to be quoted. Needs a catversion bump because stored rules may change. Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
2015-05-16 03:40:59 +02:00
sortexpr = get_rule_sortgroupclause(srt->tleSortGroupRef, targetList,
force_colno, context);
sortcoltype = exprType(sortexpr);
/* See whether operator is default < or > for datatype */
typentry = lookup_type_cache(sortcoltype,
TYPECACHE_LT_OPR | TYPECACHE_GT_OPR);
if (srt->sortop == typentry->lt_opr)
{
/* ASC is default, so emit nothing for it */
if (srt->nulls_first)
appendStringInfoString(buf, " NULLS FIRST");
}
else if (srt->sortop == typentry->gt_opr)
{
appendStringInfoString(buf, " DESC");
/* DESC defaults to NULLS FIRST */
if (!srt->nulls_first)
appendStringInfoString(buf, " NULLS LAST");
}
else
{
appendStringInfo(buf, " USING %s",
generate_operator_name(srt->sortop,
sortcoltype,
sortcoltype));
/* be specific to eliminate ambiguity */
if (srt->nulls_first)
appendStringInfoString(buf, " NULLS FIRST");
else
appendStringInfoString(buf, " NULLS LAST");
}
sep = ", ";
}
}
/*
* Display a WINDOW clause.
*
* Note that the windowClause list might contain only anonymous window
* specifications, in which case we should print nothing here.
*/
static void
get_rule_windowclause(Query *query, deparse_context *context)
{
StringInfo buf = context->buf;
const char *sep;
ListCell *l;
sep = NULL;
foreach(l, query->windowClause)
{
WindowClause *wc = (WindowClause *) lfirst(l);
if (wc->name == NULL)
continue; /* ignore anonymous windows */
if (sep == NULL)
appendContextKeyword(context, " WINDOW ",
-PRETTYINDENT_STD, PRETTYINDENT_STD, 1);
else
appendStringInfoString(buf, sep);
appendStringInfo(buf, "%s AS ", quote_identifier(wc->name));
get_rule_windowspec(wc, query->targetList, context);
sep = ", ";
}
}
/*
* Display a window definition
*/
static void
get_rule_windowspec(WindowClause *wc, List *targetList,
deparse_context *context)
{
StringInfo buf = context->buf;
bool needspace = false;
const char *sep;
ListCell *l;
appendStringInfoChar(buf, '(');
if (wc->refname)
{
appendStringInfoString(buf, quote_identifier(wc->refname));
needspace = true;
}
/* partition clauses are always inherited, so only print if no refname */
if (wc->partitionClause && !wc->refname)
{
if (needspace)
appendStringInfoChar(buf, ' ');
appendStringInfoString(buf, "PARTITION BY ");
sep = "";
foreach(l, wc->partitionClause)
{
SortGroupClause *grp = (SortGroupClause *) lfirst(l);
appendStringInfoString(buf, sep);
Support GROUPING SETS, CUBE and ROLLUP. This SQL standard functionality allows to aggregate data by different GROUP BY clauses at once. Each grouping set returns rows with columns grouped by in other sets set to NULL. This could previously be achieved by doing each grouping as a separate query, conjoined by UNION ALLs. Besides being considerably more concise, grouping sets will in many cases be faster, requiring only one scan over the underlying data. The current implementation of grouping sets only supports using sorting for input. Individual sets that share a sort order are computed in one pass. If there are sets that don't share a sort order, additional sort & aggregation steps are performed. These additional passes are sourced by the previous sort step; thus avoiding repeated scans of the source data. The code is structured in a way that adding support for purely using hash aggregation or a mix of hashing and sorting is possible. Sorting was chosen to be supported first, as it is the most generic method of implementation. Instead of, as in an earlier versions of the patch, representing the chain of sort and aggregation steps as full blown planner and executor nodes, all but the first sort are performed inside the aggregation node itself. This avoids the need to do some unusual gymnastics to handle having to return aggregated and non-aggregated tuples from underlying nodes, as well as having to shut down underlying nodes early to limit memory usage. The optimizer still builds Sort/Agg node to describe each phase, but they're not part of the plan tree, but instead additional data for the aggregation node. They're a convenient and preexisting way to describe aggregation and sorting. The first (and possibly only) sort step is still performed as a separate execution step. That retains similarity with existing group by plans, makes rescans fairly simple, avoids very deep plans (leading to slow explains) and easily allows to avoid the sorting step if the underlying data is sorted by other means. A somewhat ugly side of this patch is having to deal with a grammar ambiguity between the new CUBE keyword and the cube extension/functions named cube (and rollup). To avoid breaking existing deployments of the cube extension it has not been renamed, neither has cube been made a reserved keyword. Instead precedence hacking is used to make GROUP BY cube(..) refer to the CUBE grouping sets feature, and not the function cube(). To actually group by a function cube(), unlikely as that might be, the function name has to be quoted. Needs a catversion bump because stored rules may change. Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
2015-05-16 03:40:59 +02:00
get_rule_sortgroupclause(grp->tleSortGroupRef, targetList,
false, context);
sep = ", ";
}
needspace = true;
}
/* print ordering clause only if not inherited */
if (wc->orderClause && !wc->copiedOrder)
{
if (needspace)
appendStringInfoChar(buf, ' ');
appendStringInfoString(buf, "ORDER BY ");
get_rule_orderby(wc->orderClause, targetList, false, context);
needspace = true;
}
/* framing clause is never inherited, so print unless it's default */
if (wc->frameOptions & FRAMEOPTION_NONDEFAULT)
{
if (needspace)
appendStringInfoChar(buf, ' ');
if (wc->frameOptions & FRAMEOPTION_RANGE)
appendStringInfoString(buf, "RANGE ");
else if (wc->frameOptions & FRAMEOPTION_ROWS)
appendStringInfoString(buf, "ROWS ");
Support all SQL:2011 options for window frame clauses. This patch adds the ability to use "RANGE offset PRECEDING/FOLLOWING" frame boundaries in window functions. We'd punted on that back in the original patch to add window functions, because it was not clear how to do it in a reasonably data-type-extensible fashion. That problem is resolved here by adding the ability for btree operator classes to provide an "in_range" support function that defines how to add or subtract the RANGE offset value. Factoring it this way also allows the operator class to avoid overflow problems near the ends of the datatype's range, if it wishes to expend effort on that. (In the committed patch, the integer opclasses handle that issue, but it did not seem worth the trouble to avoid overflow failures for datetime types.) The patch includes in_range support for the integer_ops opfamily (int2/int4/int8) as well as the standard datetime types. Support for other numeric types has been requested, but that seems like suitable material for a follow-on patch. In addition, the patch adds GROUPS mode which counts the offset in ORDER-BY peer groups rather than rows, and it adds the frame_exclusion options specified by SQL:2011. As far as I can see, we are now fully up to spec on window framing options. Existing behaviors remain unchanged, except that I changed the errcode for a couple of existing error reports to meet the SQL spec's expectation that negative "offset" values should be reported as SQLSTATE 22013. Internally and in relevant parts of the documentation, we now consistently use the terminology "offset PRECEDING/FOLLOWING" rather than "value PRECEDING/FOLLOWING", since the term "value" is confusingly vague. Oliver Ford, reviewed and whacked around some by me Discussion: https://postgr.es/m/CAGMVOdu9sivPAxbNN0X+q19Sfv9edEPv=HibOJhB14TJv_RCQg@mail.gmail.com
2018-02-07 06:06:50 +01:00
else if (wc->frameOptions & FRAMEOPTION_GROUPS)
appendStringInfoString(buf, "GROUPS ");
else
Assert(false);
if (wc->frameOptions & FRAMEOPTION_BETWEEN)
appendStringInfoString(buf, "BETWEEN ");
if (wc->frameOptions & FRAMEOPTION_START_UNBOUNDED_PRECEDING)
appendStringInfoString(buf, "UNBOUNDED PRECEDING ");
else if (wc->frameOptions & FRAMEOPTION_START_CURRENT_ROW)
appendStringInfoString(buf, "CURRENT ROW ");
Support all SQL:2011 options for window frame clauses. This patch adds the ability to use "RANGE offset PRECEDING/FOLLOWING" frame boundaries in window functions. We'd punted on that back in the original patch to add window functions, because it was not clear how to do it in a reasonably data-type-extensible fashion. That problem is resolved here by adding the ability for btree operator classes to provide an "in_range" support function that defines how to add or subtract the RANGE offset value. Factoring it this way also allows the operator class to avoid overflow problems near the ends of the datatype's range, if it wishes to expend effort on that. (In the committed patch, the integer opclasses handle that issue, but it did not seem worth the trouble to avoid overflow failures for datetime types.) The patch includes in_range support for the integer_ops opfamily (int2/int4/int8) as well as the standard datetime types. Support for other numeric types has been requested, but that seems like suitable material for a follow-on patch. In addition, the patch adds GROUPS mode which counts the offset in ORDER-BY peer groups rather than rows, and it adds the frame_exclusion options specified by SQL:2011. As far as I can see, we are now fully up to spec on window framing options. Existing behaviors remain unchanged, except that I changed the errcode for a couple of existing error reports to meet the SQL spec's expectation that negative "offset" values should be reported as SQLSTATE 22013. Internally and in relevant parts of the documentation, we now consistently use the terminology "offset PRECEDING/FOLLOWING" rather than "value PRECEDING/FOLLOWING", since the term "value" is confusingly vague. Oliver Ford, reviewed and whacked around some by me Discussion: https://postgr.es/m/CAGMVOdu9sivPAxbNN0X+q19Sfv9edEPv=HibOJhB14TJv_RCQg@mail.gmail.com
2018-02-07 06:06:50 +01:00
else if (wc->frameOptions & FRAMEOPTION_START_OFFSET)
{
get_rule_expr(wc->startOffset, context, false);
Support all SQL:2011 options for window frame clauses. This patch adds the ability to use "RANGE offset PRECEDING/FOLLOWING" frame boundaries in window functions. We'd punted on that back in the original patch to add window functions, because it was not clear how to do it in a reasonably data-type-extensible fashion. That problem is resolved here by adding the ability for btree operator classes to provide an "in_range" support function that defines how to add or subtract the RANGE offset value. Factoring it this way also allows the operator class to avoid overflow problems near the ends of the datatype's range, if it wishes to expend effort on that. (In the committed patch, the integer opclasses handle that issue, but it did not seem worth the trouble to avoid overflow failures for datetime types.) The patch includes in_range support for the integer_ops opfamily (int2/int4/int8) as well as the standard datetime types. Support for other numeric types has been requested, but that seems like suitable material for a follow-on patch. In addition, the patch adds GROUPS mode which counts the offset in ORDER-BY peer groups rather than rows, and it adds the frame_exclusion options specified by SQL:2011. As far as I can see, we are now fully up to spec on window framing options. Existing behaviors remain unchanged, except that I changed the errcode for a couple of existing error reports to meet the SQL spec's expectation that negative "offset" values should be reported as SQLSTATE 22013. Internally and in relevant parts of the documentation, we now consistently use the terminology "offset PRECEDING/FOLLOWING" rather than "value PRECEDING/FOLLOWING", since the term "value" is confusingly vague. Oliver Ford, reviewed and whacked around some by me Discussion: https://postgr.es/m/CAGMVOdu9sivPAxbNN0X+q19Sfv9edEPv=HibOJhB14TJv_RCQg@mail.gmail.com
2018-02-07 06:06:50 +01:00
if (wc->frameOptions & FRAMEOPTION_START_OFFSET_PRECEDING)
appendStringInfoString(buf, " PRECEDING ");
Support all SQL:2011 options for window frame clauses. This patch adds the ability to use "RANGE offset PRECEDING/FOLLOWING" frame boundaries in window functions. We'd punted on that back in the original patch to add window functions, because it was not clear how to do it in a reasonably data-type-extensible fashion. That problem is resolved here by adding the ability for btree operator classes to provide an "in_range" support function that defines how to add or subtract the RANGE offset value. Factoring it this way also allows the operator class to avoid overflow problems near the ends of the datatype's range, if it wishes to expend effort on that. (In the committed patch, the integer opclasses handle that issue, but it did not seem worth the trouble to avoid overflow failures for datetime types.) The patch includes in_range support for the integer_ops opfamily (int2/int4/int8) as well as the standard datetime types. Support for other numeric types has been requested, but that seems like suitable material for a follow-on patch. In addition, the patch adds GROUPS mode which counts the offset in ORDER-BY peer groups rather than rows, and it adds the frame_exclusion options specified by SQL:2011. As far as I can see, we are now fully up to spec on window framing options. Existing behaviors remain unchanged, except that I changed the errcode for a couple of existing error reports to meet the SQL spec's expectation that negative "offset" values should be reported as SQLSTATE 22013. Internally and in relevant parts of the documentation, we now consistently use the terminology "offset PRECEDING/FOLLOWING" rather than "value PRECEDING/FOLLOWING", since the term "value" is confusingly vague. Oliver Ford, reviewed and whacked around some by me Discussion: https://postgr.es/m/CAGMVOdu9sivPAxbNN0X+q19Sfv9edEPv=HibOJhB14TJv_RCQg@mail.gmail.com
2018-02-07 06:06:50 +01:00
else if (wc->frameOptions & FRAMEOPTION_START_OFFSET_FOLLOWING)
appendStringInfoString(buf, " FOLLOWING ");
else
Assert(false);
}
else
Assert(false);
if (wc->frameOptions & FRAMEOPTION_BETWEEN)
{
appendStringInfoString(buf, "AND ");
if (wc->frameOptions & FRAMEOPTION_END_UNBOUNDED_FOLLOWING)
appendStringInfoString(buf, "UNBOUNDED FOLLOWING ");
else if (wc->frameOptions & FRAMEOPTION_END_CURRENT_ROW)
appendStringInfoString(buf, "CURRENT ROW ");
Support all SQL:2011 options for window frame clauses. This patch adds the ability to use "RANGE offset PRECEDING/FOLLOWING" frame boundaries in window functions. We'd punted on that back in the original patch to add window functions, because it was not clear how to do it in a reasonably data-type-extensible fashion. That problem is resolved here by adding the ability for btree operator classes to provide an "in_range" support function that defines how to add or subtract the RANGE offset value. Factoring it this way also allows the operator class to avoid overflow problems near the ends of the datatype's range, if it wishes to expend effort on that. (In the committed patch, the integer opclasses handle that issue, but it did not seem worth the trouble to avoid overflow failures for datetime types.) The patch includes in_range support for the integer_ops opfamily (int2/int4/int8) as well as the standard datetime types. Support for other numeric types has been requested, but that seems like suitable material for a follow-on patch. In addition, the patch adds GROUPS mode which counts the offset in ORDER-BY peer groups rather than rows, and it adds the frame_exclusion options specified by SQL:2011. As far as I can see, we are now fully up to spec on window framing options. Existing behaviors remain unchanged, except that I changed the errcode for a couple of existing error reports to meet the SQL spec's expectation that negative "offset" values should be reported as SQLSTATE 22013. Internally and in relevant parts of the documentation, we now consistently use the terminology "offset PRECEDING/FOLLOWING" rather than "value PRECEDING/FOLLOWING", since the term "value" is confusingly vague. Oliver Ford, reviewed and whacked around some by me Discussion: https://postgr.es/m/CAGMVOdu9sivPAxbNN0X+q19Sfv9edEPv=HibOJhB14TJv_RCQg@mail.gmail.com
2018-02-07 06:06:50 +01:00
else if (wc->frameOptions & FRAMEOPTION_END_OFFSET)
{
get_rule_expr(wc->endOffset, context, false);
Support all SQL:2011 options for window frame clauses. This patch adds the ability to use "RANGE offset PRECEDING/FOLLOWING" frame boundaries in window functions. We'd punted on that back in the original patch to add window functions, because it was not clear how to do it in a reasonably data-type-extensible fashion. That problem is resolved here by adding the ability for btree operator classes to provide an "in_range" support function that defines how to add or subtract the RANGE offset value. Factoring it this way also allows the operator class to avoid overflow problems near the ends of the datatype's range, if it wishes to expend effort on that. (In the committed patch, the integer opclasses handle that issue, but it did not seem worth the trouble to avoid overflow failures for datetime types.) The patch includes in_range support for the integer_ops opfamily (int2/int4/int8) as well as the standard datetime types. Support for other numeric types has been requested, but that seems like suitable material for a follow-on patch. In addition, the patch adds GROUPS mode which counts the offset in ORDER-BY peer groups rather than rows, and it adds the frame_exclusion options specified by SQL:2011. As far as I can see, we are now fully up to spec on window framing options. Existing behaviors remain unchanged, except that I changed the errcode for a couple of existing error reports to meet the SQL spec's expectation that negative "offset" values should be reported as SQLSTATE 22013. Internally and in relevant parts of the documentation, we now consistently use the terminology "offset PRECEDING/FOLLOWING" rather than "value PRECEDING/FOLLOWING", since the term "value" is confusingly vague. Oliver Ford, reviewed and whacked around some by me Discussion: https://postgr.es/m/CAGMVOdu9sivPAxbNN0X+q19Sfv9edEPv=HibOJhB14TJv_RCQg@mail.gmail.com
2018-02-07 06:06:50 +01:00
if (wc->frameOptions & FRAMEOPTION_END_OFFSET_PRECEDING)
appendStringInfoString(buf, " PRECEDING ");
Support all SQL:2011 options for window frame clauses. This patch adds the ability to use "RANGE offset PRECEDING/FOLLOWING" frame boundaries in window functions. We'd punted on that back in the original patch to add window functions, because it was not clear how to do it in a reasonably data-type-extensible fashion. That problem is resolved here by adding the ability for btree operator classes to provide an "in_range" support function that defines how to add or subtract the RANGE offset value. Factoring it this way also allows the operator class to avoid overflow problems near the ends of the datatype's range, if it wishes to expend effort on that. (In the committed patch, the integer opclasses handle that issue, but it did not seem worth the trouble to avoid overflow failures for datetime types.) The patch includes in_range support for the integer_ops opfamily (int2/int4/int8) as well as the standard datetime types. Support for other numeric types has been requested, but that seems like suitable material for a follow-on patch. In addition, the patch adds GROUPS mode which counts the offset in ORDER-BY peer groups rather than rows, and it adds the frame_exclusion options specified by SQL:2011. As far as I can see, we are now fully up to spec on window framing options. Existing behaviors remain unchanged, except that I changed the errcode for a couple of existing error reports to meet the SQL spec's expectation that negative "offset" values should be reported as SQLSTATE 22013. Internally and in relevant parts of the documentation, we now consistently use the terminology "offset PRECEDING/FOLLOWING" rather than "value PRECEDING/FOLLOWING", since the term "value" is confusingly vague. Oliver Ford, reviewed and whacked around some by me Discussion: https://postgr.es/m/CAGMVOdu9sivPAxbNN0X+q19Sfv9edEPv=HibOJhB14TJv_RCQg@mail.gmail.com
2018-02-07 06:06:50 +01:00
else if (wc->frameOptions & FRAMEOPTION_END_OFFSET_FOLLOWING)
appendStringInfoString(buf, " FOLLOWING ");
else
Assert(false);
}
else
Assert(false);
}
Support all SQL:2011 options for window frame clauses. This patch adds the ability to use "RANGE offset PRECEDING/FOLLOWING" frame boundaries in window functions. We'd punted on that back in the original patch to add window functions, because it was not clear how to do it in a reasonably data-type-extensible fashion. That problem is resolved here by adding the ability for btree operator classes to provide an "in_range" support function that defines how to add or subtract the RANGE offset value. Factoring it this way also allows the operator class to avoid overflow problems near the ends of the datatype's range, if it wishes to expend effort on that. (In the committed patch, the integer opclasses handle that issue, but it did not seem worth the trouble to avoid overflow failures for datetime types.) The patch includes in_range support for the integer_ops opfamily (int2/int4/int8) as well as the standard datetime types. Support for other numeric types has been requested, but that seems like suitable material for a follow-on patch. In addition, the patch adds GROUPS mode which counts the offset in ORDER-BY peer groups rather than rows, and it adds the frame_exclusion options specified by SQL:2011. As far as I can see, we are now fully up to spec on window framing options. Existing behaviors remain unchanged, except that I changed the errcode for a couple of existing error reports to meet the SQL spec's expectation that negative "offset" values should be reported as SQLSTATE 22013. Internally and in relevant parts of the documentation, we now consistently use the terminology "offset PRECEDING/FOLLOWING" rather than "value PRECEDING/FOLLOWING", since the term "value" is confusingly vague. Oliver Ford, reviewed and whacked around some by me Discussion: https://postgr.es/m/CAGMVOdu9sivPAxbNN0X+q19Sfv9edEPv=HibOJhB14TJv_RCQg@mail.gmail.com
2018-02-07 06:06:50 +01:00
if (wc->frameOptions & FRAMEOPTION_EXCLUDE_CURRENT_ROW)
appendStringInfoString(buf, "EXCLUDE CURRENT ROW ");
else if (wc->frameOptions & FRAMEOPTION_EXCLUDE_GROUP)
appendStringInfoString(buf, "EXCLUDE GROUP ");
else if (wc->frameOptions & FRAMEOPTION_EXCLUDE_TIES)
appendStringInfoString(buf, "EXCLUDE TIES ");
/* we will now have a trailing space; remove it */
buf->len--;
}
appendStringInfoChar(buf, ')');
}
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
/* ----------
* get_insert_query_def - Parse back an INSERT parsetree
* ----------
*/
static void
get_insert_query_def(Query *query, deparse_context *context,
bool colNamesVisible)
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
{
StringInfo buf = context->buf;
RangeTblEntry *select_rte = NULL;
RangeTblEntry *values_rte = NULL;
RangeTblEntry *rte;
char *sep;
ListCell *l;
List *strippedexprs;
/* Insert the WITH clause if given */
get_with_clause(query, context);
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
/*
* If it's an INSERT ... SELECT or multi-row VALUES, there will be a
* single RTE for the SELECT or VALUES. Plain VALUES has neither.
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
*/
foreach(l, query->rtable)
{
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
rte = (RangeTblEntry *) lfirst(l);
if (rte->rtekind == RTE_SUBQUERY)
{
if (select_rte)
elog(ERROR, "too many subquery RTEs in INSERT");
select_rte = rte;
}
2006-10-04 02:30:14 +02:00
if (rte->rtekind == RTE_VALUES)
{
if (values_rte)
elog(ERROR, "too many values RTEs in INSERT");
values_rte = rte;
}
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
}
if (select_rte && values_rte)
elog(ERROR, "both subquery and values RTEs in INSERT");
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
/*
* Start the query with INSERT INTO relname
*/
rte = rt_fetch(query->resultRelation, query->rtable);
Assert(rte->rtekind == RTE_RELATION);
if (PRETTY_INDENT(context))
{
context->indentLevel += PRETTYINDENT_STD;
appendStringInfoChar(buf, ' ');
}
appendStringInfo(buf, "INSERT INTO %s",
generate_relation_name(rte->relid, NIL));
/* Print the relation alias, if needed; INSERT requires explicit AS */
get_rte_alias(rte, query->resultRelation, true, context);
/* always want a space here */
appendStringInfoChar(buf, ' ');
/*
Make INSERT-from-multiple-VALUES-rows handle targetlist indirection better. Previously, if an INSERT with multiple rows of VALUES had indirection (array subscripting or field selection) in its target-columns list, the parser handled that by applying transformAssignedExpr() to each element of each VALUES row independently. This led to having ArrayRef assignment nodes or FieldStore nodes in each row of the VALUES RTE. That works for simple cases, but in bug #14265 Nuri Boardman points out that it fails if there are multiple assignments to elements/fields of the same target column. For such cases to work, rewriteTargetListIU() has to nest the ArrayRefs or FieldStores together to produce a single expression to be assigned to the column. But it failed to find them in the top-level targetlist and issued an error about "multiple assignments to same column". We could possibly fix this by teaching the rewriter to apply rewriteTargetListIU to each VALUES row separately, but that would be messy (it would change the output rowtype of the VALUES RTE, for example) and inefficient. Instead, let's fix the parser so that the VALUES RTE outputs are just the user-specified values, cast to the right type if necessary, and then the ArrayRefs or FieldStores are applied in the top-level targetlist to Vars representing the RTE's outputs. This is the same parsetree representation already used for similar cases with INSERT/SELECT syntax, so it allows simplifications in ruleutils.c, which no longer needs to treat INSERT-from-multiple-VALUES as its own special case. This implementation works by applying transformAssignedExpr to the VALUES entries as before, and then stripping off any ArrayRefs or FieldStores it adds. With lots of VALUES rows it would be noticeably more efficient to not add those nodes in the first place. But that's just an optimization not a bug fix, and there doesn't seem to be any good way to do it without significant refactoring. (A non-invasive answer would be to apply transformAssignedExpr + stripping to just the first VALUES row, and then just forcibly cast remaining rows to the same data types exposed in the first row. But this way would lead to different, not-INSERT-specific errors being reported in casting failure cases, so it doesn't seem very nice.) So leave that for later; this patch at least isn't making the per-row parsing work worse, and it does make the finished parsetree smaller, saving rewriter and planner work. Catversion bump because stored rules containing such INSERTs would need to change. Because of that, no back-patch, even though this is a very long-standing bug. Report: <20160727005725.7438.26021@wrigleys.postgresql.org> Discussion: <9578.1469645245@sss.pgh.pa.us>
2016-08-03 22:37:03 +02:00
* Add the insert-column-names list. Any indirection decoration needed on
* the column names can be inferred from the top targetlist.
*/
strippedexprs = NIL;
sep = "";
if (query->targetList)
appendStringInfoChar(buf, '(');
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
foreach(l, query->targetList)
{
TargetEntry *tle = (TargetEntry *) lfirst(l);
if (tle->resjunk)
continue; /* ignore junk entries */
appendStringInfoString(buf, sep);
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
sep = ", ";
/*
* Put out name of target column; look in the catalogs, not at
* tle->resname, since resname will fail to track RENAME.
*/
appendStringInfoString(buf,
quote_identifier(get_attname(rte->relid,
tle->resno,
false)));
/*
* Print any indirection needed (subfields or subscripts), and strip
* off the top-level nodes representing the indirection assignments.
Make INSERT-from-multiple-VALUES-rows handle targetlist indirection better. Previously, if an INSERT with multiple rows of VALUES had indirection (array subscripting or field selection) in its target-columns list, the parser handled that by applying transformAssignedExpr() to each element of each VALUES row independently. This led to having ArrayRef assignment nodes or FieldStore nodes in each row of the VALUES RTE. That works for simple cases, but in bug #14265 Nuri Boardman points out that it fails if there are multiple assignments to elements/fields of the same target column. For such cases to work, rewriteTargetListIU() has to nest the ArrayRefs or FieldStores together to produce a single expression to be assigned to the column. But it failed to find them in the top-level targetlist and issued an error about "multiple assignments to same column". We could possibly fix this by teaching the rewriter to apply rewriteTargetListIU to each VALUES row separately, but that would be messy (it would change the output rowtype of the VALUES RTE, for example) and inefficient. Instead, let's fix the parser so that the VALUES RTE outputs are just the user-specified values, cast to the right type if necessary, and then the ArrayRefs or FieldStores are applied in the top-level targetlist to Vars representing the RTE's outputs. This is the same parsetree representation already used for similar cases with INSERT/SELECT syntax, so it allows simplifications in ruleutils.c, which no longer needs to treat INSERT-from-multiple-VALUES as its own special case. This implementation works by applying transformAssignedExpr to the VALUES entries as before, and then stripping off any ArrayRefs or FieldStores it adds. With lots of VALUES rows it would be noticeably more efficient to not add those nodes in the first place. But that's just an optimization not a bug fix, and there doesn't seem to be any good way to do it without significant refactoring. (A non-invasive answer would be to apply transformAssignedExpr + stripping to just the first VALUES row, and then just forcibly cast remaining rows to the same data types exposed in the first row. But this way would lead to different, not-INSERT-specific errors being reported in casting failure cases, so it doesn't seem very nice.) So leave that for later; this patch at least isn't making the per-row parsing work worse, and it does make the finished parsetree smaller, saving rewriter and planner work. Catversion bump because stored rules containing such INSERTs would need to change. Because of that, no back-patch, even though this is a very long-standing bug. Report: <20160727005725.7438.26021@wrigleys.postgresql.org> Discussion: <9578.1469645245@sss.pgh.pa.us>
2016-08-03 22:37:03 +02:00
* Add the stripped expressions to strippedexprs. (If it's a
* single-VALUES statement, the stripped expressions are the VALUES to
* print below. Otherwise they're just Vars and not really
* interesting.)
*/
Make INSERT-from-multiple-VALUES-rows handle targetlist indirection better. Previously, if an INSERT with multiple rows of VALUES had indirection (array subscripting or field selection) in its target-columns list, the parser handled that by applying transformAssignedExpr() to each element of each VALUES row independently. This led to having ArrayRef assignment nodes or FieldStore nodes in each row of the VALUES RTE. That works for simple cases, but in bug #14265 Nuri Boardman points out that it fails if there are multiple assignments to elements/fields of the same target column. For such cases to work, rewriteTargetListIU() has to nest the ArrayRefs or FieldStores together to produce a single expression to be assigned to the column. But it failed to find them in the top-level targetlist and issued an error about "multiple assignments to same column". We could possibly fix this by teaching the rewriter to apply rewriteTargetListIU to each VALUES row separately, but that would be messy (it would change the output rowtype of the VALUES RTE, for example) and inefficient. Instead, let's fix the parser so that the VALUES RTE outputs are just the user-specified values, cast to the right type if necessary, and then the ArrayRefs or FieldStores are applied in the top-level targetlist to Vars representing the RTE's outputs. This is the same parsetree representation already used for similar cases with INSERT/SELECT syntax, so it allows simplifications in ruleutils.c, which no longer needs to treat INSERT-from-multiple-VALUES as its own special case. This implementation works by applying transformAssignedExpr to the VALUES entries as before, and then stripping off any ArrayRefs or FieldStores it adds. With lots of VALUES rows it would be noticeably more efficient to not add those nodes in the first place. But that's just an optimization not a bug fix, and there doesn't seem to be any good way to do it without significant refactoring. (A non-invasive answer would be to apply transformAssignedExpr + stripping to just the first VALUES row, and then just forcibly cast remaining rows to the same data types exposed in the first row. But this way would lead to different, not-INSERT-specific errors being reported in casting failure cases, so it doesn't seem very nice.) So leave that for later; this patch at least isn't making the per-row parsing work worse, and it does make the finished parsetree smaller, saving rewriter and planner work. Catversion bump because stored rules containing such INSERTs would need to change. Because of that, no back-patch, even though this is a very long-standing bug. Report: <20160727005725.7438.26021@wrigleys.postgresql.org> Discussion: <9578.1469645245@sss.pgh.pa.us>
2016-08-03 22:37:03 +02:00
strippedexprs = lappend(strippedexprs,
processIndirection((Node *) tle->expr,
context));
}
if (query->targetList)
appendStringInfoString(buf, ") ");
if (query->override)
{
if (query->override == OVERRIDING_SYSTEM_VALUE)
appendStringInfoString(buf, "OVERRIDING SYSTEM VALUE ");
else if (query->override == OVERRIDING_USER_VALUE)
appendStringInfoString(buf, "OVERRIDING USER VALUE ");
}
if (select_rte)
{
/* Add the SELECT */
get_query_def(select_rte->subquery, buf, context->namespaces, NULL,
false,
context->prettyFlags, context->wrapColumn,
context->indentLevel);
}
else if (values_rte)
{
/* Add the multi-VALUES expression lists */
get_values_def(values_rte->values_lists, context);
}
else if (strippedexprs)
{
/* Add the single-VALUES expression list */
appendContextKeyword(context, "VALUES (",
-PRETTYINDENT_STD, PRETTYINDENT_STD, 2);
get_rule_list_toplevel(strippedexprs, context, false);
appendStringInfoChar(buf, ')');
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
}
else
{
/* No expressions, so it must be DEFAULT VALUES */
appendStringInfoString(buf, "DEFAULT VALUES");
}
Add support for INSERT ... ON CONFLICT DO NOTHING/UPDATE. The newly added ON CONFLICT clause allows to specify an alternative to raising a unique or exclusion constraint violation error when inserting. ON CONFLICT refers to constraints that can either be specified using a inference clause (by specifying the columns of a unique constraint) or by naming a unique or exclusion constraint. DO NOTHING avoids the constraint violation, without touching the pre-existing row. DO UPDATE SET ... [WHERE ...] updates the pre-existing tuple, and has access to both the tuple proposed for insertion and the existing tuple; the optional WHERE clause can be used to prevent an update from being executed. The UPDATE SET and WHERE clauses have access to the tuple proposed for insertion using the "magic" EXCLUDED alias, and to the pre-existing tuple using the table name or its alias. This feature is often referred to as upsert. This is implemented using a new infrastructure called "speculative insertion". It is an optimistic variant of regular insertion that first does a pre-check for existing tuples and then attempts an insert. If a violating tuple was inserted concurrently, the speculatively inserted tuple is deleted and a new attempt is made. If the pre-check finds a matching tuple the alternative DO NOTHING or DO UPDATE action is taken. If the insertion succeeds without detecting a conflict, the tuple is deemed inserted. To handle the possible ambiguity between the excluded alias and a table named excluded, and for convenience with long relation names, INSERT INTO now can alias its target table. Bumps catversion as stored rules change. Author: Peter Geoghegan, with significant contributions from Heikki Linnakangas and Andres Freund. Testing infrastructure by Jeff Janes. Reviewed-By: Heikki Linnakangas, Andres Freund, Robert Haas, Simon Riggs, Dean Rasheed, Stephen Frost and many others.
2015-05-08 05:31:36 +02:00
/* Add ON CONFLICT if present */
if (query->onConflict)
{
OnConflictExpr *confl = query->onConflict;
appendStringInfoString(buf, " ON CONFLICT");
if (confl->arbiterElems)
{
/* Add the single-VALUES expression list */
appendStringInfoChar(buf, '(');
get_rule_expr((Node *) confl->arbiterElems, context, false);
appendStringInfoChar(buf, ')');
/* Add a WHERE clause (for partial indexes) if given */
if (confl->arbiterWhere != NULL)
{
bool save_varprefix;
/*
* Force non-prefixing of Vars, since parser assumes that they
* belong to target relation. WHERE clause does not use
* InferenceElem, so this is separately required.
*/
save_varprefix = context->varprefix;
context->varprefix = false;
appendContextKeyword(context, " WHERE ",
-PRETTYINDENT_STD, PRETTYINDENT_STD, 1);
get_rule_expr(confl->arbiterWhere, context, false);
context->varprefix = save_varprefix;
}
}
2016-05-11 22:20:03 +02:00
else if (OidIsValid(confl->constraint))
{
char *constraint = get_constraint_name(confl->constraint);
2016-05-11 22:20:03 +02:00
if (!constraint)
elog(ERROR, "cache lookup failed for constraint %u",
confl->constraint);
appendStringInfo(buf, " ON CONSTRAINT %s",
2016-05-11 22:20:03 +02:00
quote_identifier(constraint));
}
Add support for INSERT ... ON CONFLICT DO NOTHING/UPDATE. The newly added ON CONFLICT clause allows to specify an alternative to raising a unique or exclusion constraint violation error when inserting. ON CONFLICT refers to constraints that can either be specified using a inference clause (by specifying the columns of a unique constraint) or by naming a unique or exclusion constraint. DO NOTHING avoids the constraint violation, without touching the pre-existing row. DO UPDATE SET ... [WHERE ...] updates the pre-existing tuple, and has access to both the tuple proposed for insertion and the existing tuple; the optional WHERE clause can be used to prevent an update from being executed. The UPDATE SET and WHERE clauses have access to the tuple proposed for insertion using the "magic" EXCLUDED alias, and to the pre-existing tuple using the table name or its alias. This feature is often referred to as upsert. This is implemented using a new infrastructure called "speculative insertion". It is an optimistic variant of regular insertion that first does a pre-check for existing tuples and then attempts an insert. If a violating tuple was inserted concurrently, the speculatively inserted tuple is deleted and a new attempt is made. If the pre-check finds a matching tuple the alternative DO NOTHING or DO UPDATE action is taken. If the insertion succeeds without detecting a conflict, the tuple is deemed inserted. To handle the possible ambiguity between the excluded alias and a table named excluded, and for convenience with long relation names, INSERT INTO now can alias its target table. Bumps catversion as stored rules change. Author: Peter Geoghegan, with significant contributions from Heikki Linnakangas and Andres Freund. Testing infrastructure by Jeff Janes. Reviewed-By: Heikki Linnakangas, Andres Freund, Robert Haas, Simon Riggs, Dean Rasheed, Stephen Frost and many others.
2015-05-08 05:31:36 +02:00
if (confl->action == ONCONFLICT_NOTHING)
{
appendStringInfoString(buf, " DO NOTHING");
Add support for INSERT ... ON CONFLICT DO NOTHING/UPDATE. The newly added ON CONFLICT clause allows to specify an alternative to raising a unique or exclusion constraint violation error when inserting. ON CONFLICT refers to constraints that can either be specified using a inference clause (by specifying the columns of a unique constraint) or by naming a unique or exclusion constraint. DO NOTHING avoids the constraint violation, without touching the pre-existing row. DO UPDATE SET ... [WHERE ...] updates the pre-existing tuple, and has access to both the tuple proposed for insertion and the existing tuple; the optional WHERE clause can be used to prevent an update from being executed. The UPDATE SET and WHERE clauses have access to the tuple proposed for insertion using the "magic" EXCLUDED alias, and to the pre-existing tuple using the table name or its alias. This feature is often referred to as upsert. This is implemented using a new infrastructure called "speculative insertion". It is an optimistic variant of regular insertion that first does a pre-check for existing tuples and then attempts an insert. If a violating tuple was inserted concurrently, the speculatively inserted tuple is deleted and a new attempt is made. If the pre-check finds a matching tuple the alternative DO NOTHING or DO UPDATE action is taken. If the insertion succeeds without detecting a conflict, the tuple is deemed inserted. To handle the possible ambiguity between the excluded alias and a table named excluded, and for convenience with long relation names, INSERT INTO now can alias its target table. Bumps catversion as stored rules change. Author: Peter Geoghegan, with significant contributions from Heikki Linnakangas and Andres Freund. Testing infrastructure by Jeff Janes. Reviewed-By: Heikki Linnakangas, Andres Freund, Robert Haas, Simon Riggs, Dean Rasheed, Stephen Frost and many others.
2015-05-08 05:31:36 +02:00
}
else
{
appendStringInfoString(buf, " DO UPDATE SET ");
Add support for INSERT ... ON CONFLICT DO NOTHING/UPDATE. The newly added ON CONFLICT clause allows to specify an alternative to raising a unique or exclusion constraint violation error when inserting. ON CONFLICT refers to constraints that can either be specified using a inference clause (by specifying the columns of a unique constraint) or by naming a unique or exclusion constraint. DO NOTHING avoids the constraint violation, without touching the pre-existing row. DO UPDATE SET ... [WHERE ...] updates the pre-existing tuple, and has access to both the tuple proposed for insertion and the existing tuple; the optional WHERE clause can be used to prevent an update from being executed. The UPDATE SET and WHERE clauses have access to the tuple proposed for insertion using the "magic" EXCLUDED alias, and to the pre-existing tuple using the table name or its alias. This feature is often referred to as upsert. This is implemented using a new infrastructure called "speculative insertion". It is an optimistic variant of regular insertion that first does a pre-check for existing tuples and then attempts an insert. If a violating tuple was inserted concurrently, the speculatively inserted tuple is deleted and a new attempt is made. If the pre-check finds a matching tuple the alternative DO NOTHING or DO UPDATE action is taken. If the insertion succeeds without detecting a conflict, the tuple is deemed inserted. To handle the possible ambiguity between the excluded alias and a table named excluded, and for convenience with long relation names, INSERT INTO now can alias its target table. Bumps catversion as stored rules change. Author: Peter Geoghegan, with significant contributions from Heikki Linnakangas and Andres Freund. Testing infrastructure by Jeff Janes. Reviewed-By: Heikki Linnakangas, Andres Freund, Robert Haas, Simon Riggs, Dean Rasheed, Stephen Frost and many others.
2015-05-08 05:31:36 +02:00
/* Deparse targetlist */
get_update_query_targetlist_def(query, confl->onConflictSet,
context, rte);
/* Add a WHERE clause if given */
if (confl->onConflictWhere != NULL)
{
appendContextKeyword(context, " WHERE ",
-PRETTYINDENT_STD, PRETTYINDENT_STD, 1);
get_rule_expr(confl->onConflictWhere, context, false);
}
}
}
/* Add RETURNING if present */
if (query->returningList)
{
appendContextKeyword(context, " RETURNING",
-PRETTYINDENT_STD, PRETTYINDENT_STD, 1);
get_target_list(query->returningList, context, NULL, colNamesVisible);
}
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
}
/* ----------
* get_update_query_def - Parse back an UPDATE parsetree
* ----------
*/
static void
get_update_query_def(Query *query, deparse_context *context,
bool colNamesVisible)
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
{
StringInfo buf = context->buf;
RangeTblEntry *rte;
/* Insert the WITH clause if given */
get_with_clause(query, context);
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
/*
* Start the query with UPDATE relname SET
*/
rte = rt_fetch(query->resultRelation, query->rtable);
Assert(rte->rtekind == RTE_RELATION);
if (PRETTY_INDENT(context))
{
appendStringInfoChar(buf, ' ');
context->indentLevel += PRETTYINDENT_STD;
}
appendStringInfo(buf, "UPDATE %s%s",
only_marker(rte),
generate_relation_name(rte->relid, NIL));
/* Print the relation alias, if needed */
get_rte_alias(rte, query->resultRelation, false, context);
appendStringInfoString(buf, " SET ");
Add support for INSERT ... ON CONFLICT DO NOTHING/UPDATE. The newly added ON CONFLICT clause allows to specify an alternative to raising a unique or exclusion constraint violation error when inserting. ON CONFLICT refers to constraints that can either be specified using a inference clause (by specifying the columns of a unique constraint) or by naming a unique or exclusion constraint. DO NOTHING avoids the constraint violation, without touching the pre-existing row. DO UPDATE SET ... [WHERE ...] updates the pre-existing tuple, and has access to both the tuple proposed for insertion and the existing tuple; the optional WHERE clause can be used to prevent an update from being executed. The UPDATE SET and WHERE clauses have access to the tuple proposed for insertion using the "magic" EXCLUDED alias, and to the pre-existing tuple using the table name or its alias. This feature is often referred to as upsert. This is implemented using a new infrastructure called "speculative insertion". It is an optimistic variant of regular insertion that first does a pre-check for existing tuples and then attempts an insert. If a violating tuple was inserted concurrently, the speculatively inserted tuple is deleted and a new attempt is made. If the pre-check finds a matching tuple the alternative DO NOTHING or DO UPDATE action is taken. If the insertion succeeds without detecting a conflict, the tuple is deemed inserted. To handle the possible ambiguity between the excluded alias and a table named excluded, and for convenience with long relation names, INSERT INTO now can alias its target table. Bumps catversion as stored rules change. Author: Peter Geoghegan, with significant contributions from Heikki Linnakangas and Andres Freund. Testing infrastructure by Jeff Janes. Reviewed-By: Heikki Linnakangas, Andres Freund, Robert Haas, Simon Riggs, Dean Rasheed, Stephen Frost and many others.
2015-05-08 05:31:36 +02:00
/* Deparse targetlist */
get_update_query_targetlist_def(query, query->targetList, context, rte);
/* Add the FROM clause if needed */
get_from_clause(query, " FROM ", context);
/* Add a WHERE clause if given */
if (query->jointree->quals != NULL)
{
appendContextKeyword(context, " WHERE ",
-PRETTYINDENT_STD, PRETTYINDENT_STD, 1);
get_rule_expr(query->jointree->quals, context, false);
}
/* Add RETURNING if present */
if (query->returningList)
{
appendContextKeyword(context, " RETURNING",
-PRETTYINDENT_STD, PRETTYINDENT_STD, 1);
get_target_list(query->returningList, context, NULL, colNamesVisible);
Add support for INSERT ... ON CONFLICT DO NOTHING/UPDATE. The newly added ON CONFLICT clause allows to specify an alternative to raising a unique or exclusion constraint violation error when inserting. ON CONFLICT refers to constraints that can either be specified using a inference clause (by specifying the columns of a unique constraint) or by naming a unique or exclusion constraint. DO NOTHING avoids the constraint violation, without touching the pre-existing row. DO UPDATE SET ... [WHERE ...] updates the pre-existing tuple, and has access to both the tuple proposed for insertion and the existing tuple; the optional WHERE clause can be used to prevent an update from being executed. The UPDATE SET and WHERE clauses have access to the tuple proposed for insertion using the "magic" EXCLUDED alias, and to the pre-existing tuple using the table name or its alias. This feature is often referred to as upsert. This is implemented using a new infrastructure called "speculative insertion". It is an optimistic variant of regular insertion that first does a pre-check for existing tuples and then attempts an insert. If a violating tuple was inserted concurrently, the speculatively inserted tuple is deleted and a new attempt is made. If the pre-check finds a matching tuple the alternative DO NOTHING or DO UPDATE action is taken. If the insertion succeeds without detecting a conflict, the tuple is deemed inserted. To handle the possible ambiguity between the excluded alias and a table named excluded, and for convenience with long relation names, INSERT INTO now can alias its target table. Bumps catversion as stored rules change. Author: Peter Geoghegan, with significant contributions from Heikki Linnakangas and Andres Freund. Testing infrastructure by Jeff Janes. Reviewed-By: Heikki Linnakangas, Andres Freund, Robert Haas, Simon Riggs, Dean Rasheed, Stephen Frost and many others.
2015-05-08 05:31:36 +02:00
}
}
/* ----------
* get_update_query_targetlist_def - Parse back an UPDATE targetlist
* ----------
*/
static void
get_update_query_targetlist_def(Query *query, List *targetList,
deparse_context *context, RangeTblEntry *rte)
{
StringInfo buf = context->buf;
ListCell *l;
ListCell *next_ma_cell;
int remaining_ma_columns;
const char *sep;
SubLink *cur_ma_sublink;
List *ma_sublinks;
/*
* Prepare to deal with MULTIEXPR assignments: collect the source SubLinks
* into a list. We expect them to appear, in ID order, in resjunk tlist
* entries.
*/
ma_sublinks = NIL;
if (query->hasSubLinks) /* else there can't be any */
{
Add support for INSERT ... ON CONFLICT DO NOTHING/UPDATE. The newly added ON CONFLICT clause allows to specify an alternative to raising a unique or exclusion constraint violation error when inserting. ON CONFLICT refers to constraints that can either be specified using a inference clause (by specifying the columns of a unique constraint) or by naming a unique or exclusion constraint. DO NOTHING avoids the constraint violation, without touching the pre-existing row. DO UPDATE SET ... [WHERE ...] updates the pre-existing tuple, and has access to both the tuple proposed for insertion and the existing tuple; the optional WHERE clause can be used to prevent an update from being executed. The UPDATE SET and WHERE clauses have access to the tuple proposed for insertion using the "magic" EXCLUDED alias, and to the pre-existing tuple using the table name or its alias. This feature is often referred to as upsert. This is implemented using a new infrastructure called "speculative insertion". It is an optimistic variant of regular insertion that first does a pre-check for existing tuples and then attempts an insert. If a violating tuple was inserted concurrently, the speculatively inserted tuple is deleted and a new attempt is made. If the pre-check finds a matching tuple the alternative DO NOTHING or DO UPDATE action is taken. If the insertion succeeds without detecting a conflict, the tuple is deemed inserted. To handle the possible ambiguity between the excluded alias and a table named excluded, and for convenience with long relation names, INSERT INTO now can alias its target table. Bumps catversion as stored rules change. Author: Peter Geoghegan, with significant contributions from Heikki Linnakangas and Andres Freund. Testing infrastructure by Jeff Janes. Reviewed-By: Heikki Linnakangas, Andres Freund, Robert Haas, Simon Riggs, Dean Rasheed, Stephen Frost and many others.
2015-05-08 05:31:36 +02:00
foreach(l, targetList)
{
TargetEntry *tle = (TargetEntry *) lfirst(l);
if (tle->resjunk && IsA(tle->expr, SubLink))
{
SubLink *sl = (SubLink *) tle->expr;
if (sl->subLinkType == MULTIEXPR_SUBLINK)
{
ma_sublinks = lappend(ma_sublinks, sl);
Assert(sl->subLinkId == list_length(ma_sublinks));
}
}
}
}
next_ma_cell = list_head(ma_sublinks);
cur_ma_sublink = NULL;
remaining_ma_columns = 0;
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
/* Add the comma separated list of 'attname = value' */
sep = "";
Add support for INSERT ... ON CONFLICT DO NOTHING/UPDATE. The newly added ON CONFLICT clause allows to specify an alternative to raising a unique or exclusion constraint violation error when inserting. ON CONFLICT refers to constraints that can either be specified using a inference clause (by specifying the columns of a unique constraint) or by naming a unique or exclusion constraint. DO NOTHING avoids the constraint violation, without touching the pre-existing row. DO UPDATE SET ... [WHERE ...] updates the pre-existing tuple, and has access to both the tuple proposed for insertion and the existing tuple; the optional WHERE clause can be used to prevent an update from being executed. The UPDATE SET and WHERE clauses have access to the tuple proposed for insertion using the "magic" EXCLUDED alias, and to the pre-existing tuple using the table name or its alias. This feature is often referred to as upsert. This is implemented using a new infrastructure called "speculative insertion". It is an optimistic variant of regular insertion that first does a pre-check for existing tuples and then attempts an insert. If a violating tuple was inserted concurrently, the speculatively inserted tuple is deleted and a new attempt is made. If the pre-check finds a matching tuple the alternative DO NOTHING or DO UPDATE action is taken. If the insertion succeeds without detecting a conflict, the tuple is deemed inserted. To handle the possible ambiguity between the excluded alias and a table named excluded, and for convenience with long relation names, INSERT INTO now can alias its target table. Bumps catversion as stored rules change. Author: Peter Geoghegan, with significant contributions from Heikki Linnakangas and Andres Freund. Testing infrastructure by Jeff Janes. Reviewed-By: Heikki Linnakangas, Andres Freund, Robert Haas, Simon Riggs, Dean Rasheed, Stephen Frost and many others.
2015-05-08 05:31:36 +02:00
foreach(l, targetList)
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
{
TargetEntry *tle = (TargetEntry *) lfirst(l);
Node *expr;
if (tle->resjunk)
continue; /* ignore junk entries */
/* Emit separator (OK whether we're in multiassignment or not) */
appendStringInfoString(buf, sep);
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
sep = ", ";
2001-03-22 05:01:46 +01:00
/*
* Check to see if we're starting a multiassignment group: if so,
* output a left paren.
*/
if (next_ma_cell != NULL && cur_ma_sublink == NULL)
{
/*
* We must dig down into the expr to see if it's a PARAM_MULTIEXPR
* Param. That could be buried under FieldStores and
* SubscriptingRefs and CoerceToDomains (cf processIndirection()),
* and underneath those there could be an implicit type coercion.
* Because we would ignore implicit type coercions anyway, we
* don't need to be as careful as processIndirection() is about
* descending past implicit CoerceToDomains.
*/
expr = (Node *) tle->expr;
while (expr)
{
if (IsA(expr, FieldStore))
{
FieldStore *fstore = (FieldStore *) expr;
expr = (Node *) linitial(fstore->newvals);
}
else if (IsA(expr, SubscriptingRef))
{
SubscriptingRef *sbsref = (SubscriptingRef *) expr;
if (sbsref->refassgnexpr == NULL)
break;
expr = (Node *) sbsref->refassgnexpr;
}
else if (IsA(expr, CoerceToDomain))
{
CoerceToDomain *cdomain = (CoerceToDomain *) expr;
if (cdomain->coercionformat != COERCE_IMPLICIT_CAST)
break;
expr = (Node *) cdomain->arg;
}
else
break;
}
expr = strip_implicit_coercions(expr);
if (expr && IsA(expr, Param) &&
((Param *) expr)->paramkind == PARAM_MULTIEXPR)
{
cur_ma_sublink = (SubLink *) lfirst(next_ma_cell);
Represent Lists as expansible arrays, not chains of cons-cells. Originally, Postgres Lists were a more or less exact reimplementation of Lisp lists, which consist of chains of separately-allocated cons cells, each having a value and a next-cell link. We'd hacked that once before (commit d0b4399d8) to add a separate List header, but the data was still in cons cells. That makes some operations -- notably list_nth() -- O(N), and it's bulky because of the next-cell pointers and per-cell palloc overhead, and it's very cache-unfriendly if the cons cells end up scattered around rather than being adjacent. In this rewrite, we still have List headers, but the data is in a resizable array of values, with no next-cell links. Now we need at most two palloc's per List, and often only one, since we can allocate some values in the same palloc call as the List header. (Of course, extending an existing List may require repalloc's to enlarge the array. But this involves just O(log N) allocations not O(N).) Of course this is not without downsides. The key difficulty is that addition or deletion of a list entry may now cause other entries to move, which it did not before. For example, that breaks foreach() and sister macros, which historically used a pointer to the current cons-cell as loop state. We can repair those macros transparently by making their actual loop state be an integer list index; the exposed "ListCell *" pointer is no longer state carried across loop iterations, but is just a derived value. (In practice, modern compilers can optimize things back to having just one loop state value, at least for simple cases with inline loop bodies.) In principle, this is a semantics change for cases where the loop body inserts or deletes list entries ahead of the current loop index; but I found no such cases in the Postgres code. The change is not at all transparent for code that doesn't use foreach() but chases lists "by hand" using lnext(). The largest share of such code in the backend is in loops that were maintaining "prev" and "next" variables in addition to the current-cell pointer, in order to delete list cells efficiently using list_delete_cell(). However, we no longer need a previous-cell pointer to delete a list cell efficiently. Keeping a next-cell pointer doesn't work, as explained above, but we can improve matters by changing such code to use a regular foreach() loop and then using the new macro foreach_delete_current() to delete the current cell. (This macro knows how to update the associated foreach loop's state so that no cells will be missed in the traversal.) There remains a nontrivial risk of code assuming that a ListCell * pointer will remain good over an operation that could now move the list contents. To help catch such errors, list.c can be compiled with a new define symbol DEBUG_LIST_MEMORY_USAGE that forcibly moves list contents whenever that could possibly happen. This makes list operations significantly more expensive so it's not normally turned on (though it is on by default if USE_VALGRIND is on). There are two notable API differences from the previous code: * lnext() now requires the List's header pointer in addition to the current cell's address. * list_delete_cell() no longer requires a previous-cell argument. These changes are somewhat unfortunate, but on the other hand code using either function needs inspection to see if it is assuming anything it shouldn't, so it's not all bad. Programmers should be aware of these significant performance changes: * list_nth() and related functions are now O(1); so there's no major access-speed difference between a list and an array. * Inserting or deleting a list element now takes time proportional to the distance to the end of the list, due to moving the array elements. (However, it typically *doesn't* require palloc or pfree, so except in long lists it's probably still faster than before.) Notably, lcons() used to be about the same cost as lappend(), but that's no longer true if the list is long. Code that uses lcons() and list_delete_first() to maintain a stack might usefully be rewritten to push and pop at the end of the list rather than the beginning. * There are now list_insert_nth...() and list_delete_nth...() functions that add or remove a list cell identified by index. These have the data-movement penalty explained above, but there's no search penalty. * list_concat() and variants now copy the second list's data into storage belonging to the first list, so there is no longer any sharing of cells between the input lists. The second argument is now declared "const List *" to reflect that it isn't changed. This patch just does the minimum needed to get the new implementation in place and fix bugs exposed by the regression tests. As suggested by the foregoing, there's a fair amount of followup work remaining to do. Also, the ENABLE_LIST_COMPAT macros are finally removed in this commit. Code using those should have been gone a dozen years ago. Patch by me; thanks to David Rowley, Jesper Pedersen, and others for review. Discussion: https://postgr.es/m/11587.1550975080@sss.pgh.pa.us
2019-07-15 19:41:58 +02:00
next_ma_cell = lnext(ma_sublinks, next_ma_cell);
remaining_ma_columns = count_nonjunk_tlist_entries(((Query *) cur_ma_sublink->subselect)->targetList);
Assert(((Param *) expr)->paramid ==
((cur_ma_sublink->subLinkId << 16) | 1));
appendStringInfoChar(buf, '(');
}
}
/*
* Put out name of target column; look in the catalogs, not at
* tle->resname, since resname will fail to track RENAME.
*/
appendStringInfoString(buf,
quote_identifier(get_attname(rte->relid,
tle->resno,
false)));
/*
* Print any indirection needed (subfields or subscripts), and strip
* off the top-level nodes representing the indirection assignments.
*/
Make INSERT-from-multiple-VALUES-rows handle targetlist indirection better. Previously, if an INSERT with multiple rows of VALUES had indirection (array subscripting or field selection) in its target-columns list, the parser handled that by applying transformAssignedExpr() to each element of each VALUES row independently. This led to having ArrayRef assignment nodes or FieldStore nodes in each row of the VALUES RTE. That works for simple cases, but in bug #14265 Nuri Boardman points out that it fails if there are multiple assignments to elements/fields of the same target column. For such cases to work, rewriteTargetListIU() has to nest the ArrayRefs or FieldStores together to produce a single expression to be assigned to the column. But it failed to find them in the top-level targetlist and issued an error about "multiple assignments to same column". We could possibly fix this by teaching the rewriter to apply rewriteTargetListIU to each VALUES row separately, but that would be messy (it would change the output rowtype of the VALUES RTE, for example) and inefficient. Instead, let's fix the parser so that the VALUES RTE outputs are just the user-specified values, cast to the right type if necessary, and then the ArrayRefs or FieldStores are applied in the top-level targetlist to Vars representing the RTE's outputs. This is the same parsetree representation already used for similar cases with INSERT/SELECT syntax, so it allows simplifications in ruleutils.c, which no longer needs to treat INSERT-from-multiple-VALUES as its own special case. This implementation works by applying transformAssignedExpr to the VALUES entries as before, and then stripping off any ArrayRefs or FieldStores it adds. With lots of VALUES rows it would be noticeably more efficient to not add those nodes in the first place. But that's just an optimization not a bug fix, and there doesn't seem to be any good way to do it without significant refactoring. (A non-invasive answer would be to apply transformAssignedExpr + stripping to just the first VALUES row, and then just forcibly cast remaining rows to the same data types exposed in the first row. But this way would lead to different, not-INSERT-specific errors being reported in casting failure cases, so it doesn't seem very nice.) So leave that for later; this patch at least isn't making the per-row parsing work worse, and it does make the finished parsetree smaller, saving rewriter and planner work. Catversion bump because stored rules containing such INSERTs would need to change. Because of that, no back-patch, even though this is a very long-standing bug. Report: <20160727005725.7438.26021@wrigleys.postgresql.org> Discussion: <9578.1469645245@sss.pgh.pa.us>
2016-08-03 22:37:03 +02:00
expr = processIndirection((Node *) tle->expr, context);
/*
* If we're in a multiassignment, skip printing anything more, unless
* this is the last column; in which case, what we print should be the
* sublink, not the Param.
*/
if (cur_ma_sublink != NULL)
{
if (--remaining_ma_columns > 0)
continue; /* not the last column of multiassignment */
appendStringInfoChar(buf, ')');
expr = (Node *) cur_ma_sublink;
cur_ma_sublink = NULL;
}
appendStringInfoString(buf, " = ");
get_rule_expr(expr, context, false);
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
}
}
/* ----------
* get_delete_query_def - Parse back a DELETE parsetree
* ----------
*/
static void
get_delete_query_def(Query *query, deparse_context *context,
bool colNamesVisible)
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
{
StringInfo buf = context->buf;
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
RangeTblEntry *rte;
/* Insert the WITH clause if given */
get_with_clause(query, context);
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
/*
* Start the query with DELETE FROM relname
*/
rte = rt_fetch(query->resultRelation, query->rtable);
Assert(rte->rtekind == RTE_RELATION);
if (PRETTY_INDENT(context))
{
appendStringInfoChar(buf, ' ');
context->indentLevel += PRETTYINDENT_STD;
}
appendStringInfo(buf, "DELETE FROM %s%s",
only_marker(rte),
generate_relation_name(rte->relid, NIL));
/* Print the relation alias, if needed */
get_rte_alias(rte, query->resultRelation, false, context);
/* Add the USING clause if given */
get_from_clause(query, " USING ", context);
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
/* Add a WHERE clause if given */
if (query->jointree->quals != NULL)
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
{
appendContextKeyword(context, " WHERE ",
-PRETTYINDENT_STD, PRETTYINDENT_STD, 1);
get_rule_expr(query->jointree->quals, context, false);
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
}
/* Add RETURNING if present */
if (query->returningList)
{
appendContextKeyword(context, " RETURNING",
-PRETTYINDENT_STD, PRETTYINDENT_STD, 1);
get_target_list(query->returningList, context, NULL, colNamesVisible);
}
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
}
/* ----------
* get_merge_query_def - Parse back a MERGE parsetree
* ----------
*/
static void
get_merge_query_def(Query *query, deparse_context *context,
bool colNamesVisible)
{
StringInfo buf = context->buf;
RangeTblEntry *rte;
ListCell *lc;
bool haveNotMatchedBySource;
/* Insert the WITH clause if given */
get_with_clause(query, context);
/*
* Start the query with MERGE INTO relname
*/
rte = rt_fetch(query->resultRelation, query->rtable);
Assert(rte->rtekind == RTE_RELATION);
if (PRETTY_INDENT(context))
{
appendStringInfoChar(buf, ' ');
context->indentLevel += PRETTYINDENT_STD;
}
appendStringInfo(buf, "MERGE INTO %s%s",
only_marker(rte),
generate_relation_name(rte->relid, NIL));
/* Print the relation alias, if needed */
get_rte_alias(rte, query->resultRelation, false, context);
/* Print the source relation and join clause */
get_from_clause(query, " USING ", context);
appendContextKeyword(context, " ON ",
-PRETTYINDENT_STD, PRETTYINDENT_STD, 2);
get_rule_expr(query->mergeJoinCondition, context, false);
/*
* Test for any NOT MATCHED BY SOURCE actions. If there are none, then
* any NOT MATCHED BY TARGET actions are output as "WHEN NOT MATCHED", per
* SQL standard. Otherwise, we have a non-SQL-standard query, so output
* "BY SOURCE" / "BY TARGET" qualifiers for all NOT MATCHED actions, to be
* more explicit.
*/
haveNotMatchedBySource = false;
foreach(lc, query->mergeActionList)
{
MergeAction *action = lfirst_node(MergeAction, lc);
if (action->matchKind == MERGE_WHEN_NOT_MATCHED_BY_SOURCE)
{
haveNotMatchedBySource = true;
break;
}
}
/* Print each merge action */
foreach(lc, query->mergeActionList)
{
MergeAction *action = lfirst_node(MergeAction, lc);
appendContextKeyword(context, " WHEN ",
-PRETTYINDENT_STD, PRETTYINDENT_STD, 2);
switch (action->matchKind)
{
case MERGE_WHEN_MATCHED:
appendStringInfo(buf, "MATCHED");
break;
case MERGE_WHEN_NOT_MATCHED_BY_SOURCE:
appendStringInfo(buf, "NOT MATCHED BY SOURCE");
break;
case MERGE_WHEN_NOT_MATCHED_BY_TARGET:
if (haveNotMatchedBySource)
appendStringInfo(buf, "NOT MATCHED BY TARGET");
else
appendStringInfo(buf, "NOT MATCHED");
break;
default:
elog(ERROR, "unrecognized matchKind: %d",
(int) action->matchKind);
}
if (action->qual)
{
appendContextKeyword(context, " AND ",
-PRETTYINDENT_STD, PRETTYINDENT_STD, 3);
get_rule_expr(action->qual, context, false);
}
appendContextKeyword(context, " THEN ",
-PRETTYINDENT_STD, PRETTYINDENT_STD, 3);
if (action->commandType == CMD_INSERT)
{
/* This generally matches get_insert_query_def() */
List *strippedexprs = NIL;
const char *sep = "";
ListCell *lc2;
appendStringInfoString(buf, "INSERT");
if (action->targetList)
appendStringInfoString(buf, " (");
foreach(lc2, action->targetList)
{
TargetEntry *tle = (TargetEntry *) lfirst(lc2);
Assert(!tle->resjunk);
appendStringInfoString(buf, sep);
sep = ", ";
appendStringInfoString(buf,
quote_identifier(get_attname(rte->relid,
tle->resno,
false)));
strippedexprs = lappend(strippedexprs,
processIndirection((Node *) tle->expr,
context));
}
if (action->targetList)
appendStringInfoChar(buf, ')');
if (action->override)
{
if (action->override == OVERRIDING_SYSTEM_VALUE)
appendStringInfoString(buf, " OVERRIDING SYSTEM VALUE");
else if (action->override == OVERRIDING_USER_VALUE)
appendStringInfoString(buf, " OVERRIDING USER VALUE");
}
if (strippedexprs)
{
appendContextKeyword(context, " VALUES (",
-PRETTYINDENT_STD, PRETTYINDENT_STD, 4);
get_rule_list_toplevel(strippedexprs, context, false);
appendStringInfoChar(buf, ')');
}
else
appendStringInfoString(buf, " DEFAULT VALUES");
}
else if (action->commandType == CMD_UPDATE)
{
appendStringInfoString(buf, "UPDATE SET ");
get_update_query_targetlist_def(query, action->targetList,
context, rte);
}
else if (action->commandType == CMD_DELETE)
appendStringInfoString(buf, "DELETE");
else if (action->commandType == CMD_NOTHING)
appendStringInfoString(buf, "DO NOTHING");
}
/* Add RETURNING if present */
if (query->returningList)
{
appendContextKeyword(context, " RETURNING",
-PRETTYINDENT_STD, PRETTYINDENT_STD, 1);
get_target_list(query->returningList, context, NULL, colNamesVisible);
}
}
/* ----------
* get_utility_query_def - Parse back a UTILITY parsetree
* ----------
*/
static void
get_utility_query_def(Query *query, deparse_context *context)
{
StringInfo buf = context->buf;
if (query->utilityStmt && IsA(query->utilityStmt, NotifyStmt))
{
NotifyStmt *stmt = (NotifyStmt *) query->utilityStmt;
2001-03-22 05:01:46 +01:00
appendContextKeyword(context, "",
0, PRETTYINDENT_STD, 1);
appendStringInfo(buf, "NOTIFY %s",
quote_identifier(stmt->conditionname));
if (stmt->payload)
{
appendStringInfoString(buf, ", ");
simple_quote_literal(buf, stmt->payload);
}
}
else
{
/* Currently only NOTIFY utility commands can appear in rules */
elog(ERROR, "unexpected utility statement type");
}
}
/*
* Display a Var appropriately.
*
* In some cases (currently only when recursing into an unnamed join)
* the Var's varlevelsup has to be interpreted with respect to a context
* above the current one; levelsup indicates the offset.
*
* If istoplevel is true, the Var is at the top level of a SELECT's
* targetlist, which means we need special treatment of whole-row Vars.
* Instead of the normal "tab.*", we'll print "tab.*::typename", which is a
* dirty hack to prevent "tab.*" from being expanded into multiple columns.
* (The parser will strip the useless coercion, so no inefficiency is added in
* dump and reload.) We used to print just "tab" in such cases, but that is
* ambiguous and will yield the wrong result if "tab" is also a plain column
* name in the query.
*
* Returns the attname of the Var, or NULL if the Var has no attname (because
* it is a whole-row Var or a subplan output reference).
*/
static char *
get_variable(Var *var, int levelsup, bool istoplevel, deparse_context *context)
{
StringInfo buf = context->buf;
RangeTblEntry *rte;
AttrNumber attnum;
int netlevelsup;
deparse_namespace *dpns;
Remove arbitrary 64K-or-so limit on rangetable size. Up to now the size of a query's rangetable has been limited by the constants INNER_VAR et al, which mustn't be equal to any real rangetable index. 65000 doubtless seemed like enough for anybody, and it still is orders of magnitude larger than the number of joins we can realistically handle. However, we need a rangetable entry for each child partition that is (or might be) processed by a query. Queries with a few thousand partitions are getting more realistic, so that the day when that limit becomes a problem is in sight, even if it's not here yet. Hence, let's raise the limit. Rather than just increase the values of INNER_VAR et al, this patch adopts the approach of making them small negative values, so that rangetables could theoretically become as long as INT_MAX. The bulk of the patch is concerned with changing Var.varno and some related variables from "Index" (unsigned int) to plain "int". This is basically cosmetic, with little actual effect other than to help debuggers print their values nicely. As such, I've only bothered with changing places that could actually see INNER_VAR et al, which the parser and most of the planner don't. We do have to be careful in places that are performing less/greater comparisons on varnos, but there are very few such places, other than the IS_SPECIAL_VARNO macro itself. A notable side effect of this patch is that while it used to be possible to add INNER_VAR et al to a Bitmapset, that will now draw an error. I don't see any likelihood that it wouldn't be a bug to include these fake varnos in a bitmapset of real varnos, so I think this is all to the good. Although this touches outfuncs/readfuncs, I don't think a catversion bump is required, since stored rules would never contain Vars with these fake varnos. Andrey Lepikhov and Tom Lane, after a suggestion by Peter Eisentraut Discussion: https://postgr.es/m/43c7f2f5-1e27-27aa-8c65-c91859d15190@postgrespro.ru
2021-09-15 20:11:21 +02:00
int varno;
Reconsider the representation of join alias Vars. The core idea of this patch is to make the parser generate join alias Vars (that is, ones with varno pointing to a JOIN RTE) only when the alias Var is actually different from any raw join input, that is a type coercion and/or COALESCE is necessary to generate the join output value. Otherwise just generate varno/varattno pointing to the relevant join input column. In effect, this means that the planner's flatten_join_alias_vars() transformation is already done in the parser, for all cases except (a) columns that are merged by JOIN USING and are transformed in the process, and (b) whole-row join Vars. In principle that would allow us to skip doing flatten_join_alias_vars() in many more queries than we do now, but we don't have quite enough infrastructure to know that we can do so --- in particular there's no cheap way to know whether there are any whole-row join Vars. I'm not sure if it's worth the trouble to add a Query-level flag for that, and in any case it seems like fit material for a separate patch. But even without skipping the work entirely, this should make flatten_join_alias_vars() faster, particularly where there are nested joins that it previously had to flatten recursively. An essential part of this change is to replace Var nodes' varnoold/varoattno fields with varnosyn/varattnosyn, which have considerably more tightly-defined meanings than the old fields: when they differ from varno/varattno, they identify the Var's position in an aliased JOIN RTE, and the join alias is what ruleutils.c should print for the Var. This is necessary because the varno change destroyed ruleutils.c's ability to find the JOIN RTE from the Var's varno. Another way in which this change broke ruleutils.c is that it's no longer feasible to determine, from a JOIN RTE's joinaliasvars list, which join columns correspond to which columns of the join's immediate input relations. (If those are sub-joins, the joinaliasvars entries may point to columns of their base relations, not the sub-joins.) But that was a horrid mess requiring a lot of fragile assumptions already, so let's just bite the bullet and add some more JOIN RTE fields to make it more straightforward to figure that out. I added two integer-List fields containing the relevant column numbers from the left and right input rels, plus a count of how many merged columns there are. This patch depends on the ParseNamespaceColumn infrastructure that I added in commit 5815696bc. The biggest bit of code change is restructuring transformFromClauseItem's handling of JOINs so that the ParseNamespaceColumn data is propagated upward correctly. Other than that and the ruleutils fixes, everything pretty much just works, though some processing is now inessential. I grabbed two pieces of low-hanging fruit in that line: 1. In find_expr_references, we don't need to recurse into join alias Vars anymore. There aren't any except for references to merged USING columns, which are more properly handled when we scan the join's RTE. This change actually fixes an edge-case issue: we will now record a dependency on any type-coercion function present in a USING column's joinaliasvar, even if that join column has no references in the query text. The odds of the missing dependency causing a problem seem quite small: you'd have to posit somebody dropping an implicit cast between two data types, without removing the types themselves, and then having a stored rule containing a whole-row Var for a join whose USING merge depends on that cast. So I don't feel a great need to change this in the back branches. But in theory this way is more correct. 2. markRTEForSelectPriv and markTargetListOrigin don't need to recurse into join alias Vars either, because the cases they care about don't apply to alias Vars for USING columns that are semantically distinct from the underlying columns. This removes the only case in which markVarForSelectPriv could be called with NULL for the RTE, so adjust the comments to describe that hack as being strictly internal to markRTEForSelectPriv. catversion bump required due to changes in stored rules. Discussion: https://postgr.es/m/7115.1577986646@sss.pgh.pa.us
2020-01-09 17:56:59 +01:00
AttrNumber varattno;
deparse_columns *colinfo;
char *refname;
char *attname;
/* Find appropriate nesting depth */
netlevelsup = var->varlevelsup + levelsup;
if (netlevelsup >= list_length(context->namespaces))
elog(ERROR, "bogus varlevelsup: %d offset %d",
var->varlevelsup, levelsup);
dpns = (deparse_namespace *) list_nth(context->namespaces,
netlevelsup);
Reconsider the representation of join alias Vars. The core idea of this patch is to make the parser generate join alias Vars (that is, ones with varno pointing to a JOIN RTE) only when the alias Var is actually different from any raw join input, that is a type coercion and/or COALESCE is necessary to generate the join output value. Otherwise just generate varno/varattno pointing to the relevant join input column. In effect, this means that the planner's flatten_join_alias_vars() transformation is already done in the parser, for all cases except (a) columns that are merged by JOIN USING and are transformed in the process, and (b) whole-row join Vars. In principle that would allow us to skip doing flatten_join_alias_vars() in many more queries than we do now, but we don't have quite enough infrastructure to know that we can do so --- in particular there's no cheap way to know whether there are any whole-row join Vars. I'm not sure if it's worth the trouble to add a Query-level flag for that, and in any case it seems like fit material for a separate patch. But even without skipping the work entirely, this should make flatten_join_alias_vars() faster, particularly where there are nested joins that it previously had to flatten recursively. An essential part of this change is to replace Var nodes' varnoold/varoattno fields with varnosyn/varattnosyn, which have considerably more tightly-defined meanings than the old fields: when they differ from varno/varattno, they identify the Var's position in an aliased JOIN RTE, and the join alias is what ruleutils.c should print for the Var. This is necessary because the varno change destroyed ruleutils.c's ability to find the JOIN RTE from the Var's varno. Another way in which this change broke ruleutils.c is that it's no longer feasible to determine, from a JOIN RTE's joinaliasvars list, which join columns correspond to which columns of the join's immediate input relations. (If those are sub-joins, the joinaliasvars entries may point to columns of their base relations, not the sub-joins.) But that was a horrid mess requiring a lot of fragile assumptions already, so let's just bite the bullet and add some more JOIN RTE fields to make it more straightforward to figure that out. I added two integer-List fields containing the relevant column numbers from the left and right input rels, plus a count of how many merged columns there are. This patch depends on the ParseNamespaceColumn infrastructure that I added in commit 5815696bc. The biggest bit of code change is restructuring transformFromClauseItem's handling of JOINs so that the ParseNamespaceColumn data is propagated upward correctly. Other than that and the ruleutils fixes, everything pretty much just works, though some processing is now inessential. I grabbed two pieces of low-hanging fruit in that line: 1. In find_expr_references, we don't need to recurse into join alias Vars anymore. There aren't any except for references to merged USING columns, which are more properly handled when we scan the join's RTE. This change actually fixes an edge-case issue: we will now record a dependency on any type-coercion function present in a USING column's joinaliasvar, even if that join column has no references in the query text. The odds of the missing dependency causing a problem seem quite small: you'd have to posit somebody dropping an implicit cast between two data types, without removing the types themselves, and then having a stored rule containing a whole-row Var for a join whose USING merge depends on that cast. So I don't feel a great need to change this in the back branches. But in theory this way is more correct. 2. markRTEForSelectPriv and markTargetListOrigin don't need to recurse into join alias Vars either, because the cases they care about don't apply to alias Vars for USING columns that are semantically distinct from the underlying columns. This removes the only case in which markVarForSelectPriv could be called with NULL for the RTE, so adjust the comments to describe that hack as being strictly internal to markRTEForSelectPriv. catversion bump required due to changes in stored rules. Discussion: https://postgr.es/m/7115.1577986646@sss.pgh.pa.us
2020-01-09 17:56:59 +01:00
/*
* If we have a syntactic referent for the Var, and we're working from a
* parse tree, prefer to use the syntactic referent. Otherwise, fall back
* on the semantic referent. (Forcing use of the semantic referent when
* printing plan trees is a design choice that's perhaps more motivated by
* backwards compatibility than anything else. But it does have the
* advantage of making plans more explicit.)
*/
if (var->varnosyn > 0 && dpns->plan == NULL)
{
varno = var->varnosyn;
varattno = var->varattnosyn;
}
else
{
varno = var->varno;
varattno = var->varattno;
}
/*
* Try to find the relevant RTE in this rtable. In a plan tree, it's
* likely that varno is OUTER_VAR or INNER_VAR, in which case we must dig
* down into the subplans, or INDEX_VAR, which is resolved similarly. Also
* find the aliases previously assigned for this RTE.
*/
Reconsider the representation of join alias Vars. The core idea of this patch is to make the parser generate join alias Vars (that is, ones with varno pointing to a JOIN RTE) only when the alias Var is actually different from any raw join input, that is a type coercion and/or COALESCE is necessary to generate the join output value. Otherwise just generate varno/varattno pointing to the relevant join input column. In effect, this means that the planner's flatten_join_alias_vars() transformation is already done in the parser, for all cases except (a) columns that are merged by JOIN USING and are transformed in the process, and (b) whole-row join Vars. In principle that would allow us to skip doing flatten_join_alias_vars() in many more queries than we do now, but we don't have quite enough infrastructure to know that we can do so --- in particular there's no cheap way to know whether there are any whole-row join Vars. I'm not sure if it's worth the trouble to add a Query-level flag for that, and in any case it seems like fit material for a separate patch. But even without skipping the work entirely, this should make flatten_join_alias_vars() faster, particularly where there are nested joins that it previously had to flatten recursively. An essential part of this change is to replace Var nodes' varnoold/varoattno fields with varnosyn/varattnosyn, which have considerably more tightly-defined meanings than the old fields: when they differ from varno/varattno, they identify the Var's position in an aliased JOIN RTE, and the join alias is what ruleutils.c should print for the Var. This is necessary because the varno change destroyed ruleutils.c's ability to find the JOIN RTE from the Var's varno. Another way in which this change broke ruleutils.c is that it's no longer feasible to determine, from a JOIN RTE's joinaliasvars list, which join columns correspond to which columns of the join's immediate input relations. (If those are sub-joins, the joinaliasvars entries may point to columns of their base relations, not the sub-joins.) But that was a horrid mess requiring a lot of fragile assumptions already, so let's just bite the bullet and add some more JOIN RTE fields to make it more straightforward to figure that out. I added two integer-List fields containing the relevant column numbers from the left and right input rels, plus a count of how many merged columns there are. This patch depends on the ParseNamespaceColumn infrastructure that I added in commit 5815696bc. The biggest bit of code change is restructuring transformFromClauseItem's handling of JOINs so that the ParseNamespaceColumn data is propagated upward correctly. Other than that and the ruleutils fixes, everything pretty much just works, though some processing is now inessential. I grabbed two pieces of low-hanging fruit in that line: 1. In find_expr_references, we don't need to recurse into join alias Vars anymore. There aren't any except for references to merged USING columns, which are more properly handled when we scan the join's RTE. This change actually fixes an edge-case issue: we will now record a dependency on any type-coercion function present in a USING column's joinaliasvar, even if that join column has no references in the query text. The odds of the missing dependency causing a problem seem quite small: you'd have to posit somebody dropping an implicit cast between two data types, without removing the types themselves, and then having a stored rule containing a whole-row Var for a join whose USING merge depends on that cast. So I don't feel a great need to change this in the back branches. But in theory this way is more correct. 2. markRTEForSelectPriv and markTargetListOrigin don't need to recurse into join alias Vars either, because the cases they care about don't apply to alias Vars for USING columns that are semantically distinct from the underlying columns. This removes the only case in which markVarForSelectPriv could be called with NULL for the RTE, so adjust the comments to describe that hack as being strictly internal to markRTEForSelectPriv. catversion bump required due to changes in stored rules. Discussion: https://postgr.es/m/7115.1577986646@sss.pgh.pa.us
2020-01-09 17:56:59 +01:00
if (varno >= 1 && varno <= list_length(dpns->rtable))
{
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
/*
* We might have been asked to map child Vars to some parent relation.
*/
if (context->appendparents && dpns->appendrels)
{
Remove arbitrary 64K-or-so limit on rangetable size. Up to now the size of a query's rangetable has been limited by the constants INNER_VAR et al, which mustn't be equal to any real rangetable index. 65000 doubtless seemed like enough for anybody, and it still is orders of magnitude larger than the number of joins we can realistically handle. However, we need a rangetable entry for each child partition that is (or might be) processed by a query. Queries with a few thousand partitions are getting more realistic, so that the day when that limit becomes a problem is in sight, even if it's not here yet. Hence, let's raise the limit. Rather than just increase the values of INNER_VAR et al, this patch adopts the approach of making them small negative values, so that rangetables could theoretically become as long as INT_MAX. The bulk of the patch is concerned with changing Var.varno and some related variables from "Index" (unsigned int) to plain "int". This is basically cosmetic, with little actual effect other than to help debuggers print their values nicely. As such, I've only bothered with changing places that could actually see INNER_VAR et al, which the parser and most of the planner don't. We do have to be careful in places that are performing less/greater comparisons on varnos, but there are very few such places, other than the IS_SPECIAL_VARNO macro itself. A notable side effect of this patch is that while it used to be possible to add INNER_VAR et al to a Bitmapset, that will now draw an error. I don't see any likelihood that it wouldn't be a bug to include these fake varnos in a bitmapset of real varnos, so I think this is all to the good. Although this touches outfuncs/readfuncs, I don't think a catversion bump is required, since stored rules would never contain Vars with these fake varnos. Andrey Lepikhov and Tom Lane, after a suggestion by Peter Eisentraut Discussion: https://postgr.es/m/43c7f2f5-1e27-27aa-8c65-c91859d15190@postgrespro.ru
2021-09-15 20:11:21 +02:00
int pvarno = varno;
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
AttrNumber pvarattno = varattno;
AppendRelInfo *appinfo = dpns->appendrels[pvarno];
bool found = false;
/* Only map up to inheritance parents, not UNION ALL appendrels */
while (appinfo &&
rt_fetch(appinfo->parent_relid,
dpns->rtable)->rtekind == RTE_RELATION)
{
found = false;
if (pvarattno > 0) /* system columns stay as-is */
{
if (pvarattno > appinfo->num_child_cols)
break; /* safety check */
pvarattno = appinfo->parent_colnos[pvarattno - 1];
if (pvarattno == 0)
break; /* Var is local to child */
}
pvarno = appinfo->parent_relid;
found = true;
/* If the parent is itself a child, continue up. */
Assert(pvarno > 0 && pvarno <= list_length(dpns->rtable));
appinfo = dpns->appendrels[pvarno];
}
/*
* If we found an ancestral rel, and that rel is included in
* appendparents, print that column not the original one.
*/
if (found && bms_is_member(pvarno, context->appendparents))
{
varno = pvarno;
varattno = pvarattno;
}
}
rte = rt_fetch(varno, dpns->rtable);
refname = (char *) list_nth(dpns->rtable_names, varno - 1);
colinfo = deparse_columns_fetch(varno, dpns);
attnum = varattno;
}
else
{
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
resolve_special_varno((Node *) var, context,
get_special_variable, NULL);
return NULL;
}
/*
* The planner will sometimes emit Vars referencing resjunk elements of a
* subquery's target list (this is currently only possible if it chooses
* to generate a "physical tlist" for a SubqueryScan or CteScan node).
* Although we prefer to print subquery-referencing Vars using the
* subquery's alias, that's not possible for resjunk items since they have
* no alias. So in that case, drill down to the subplan and print the
* contents of the referenced tlist item. This works because in a plan
* tree, such Vars can only occur in a SubqueryScan or CteScan node, and
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
* we'll have set dpns->inner_plan to reference the child plan node.
*/
if ((rte->rtekind == RTE_SUBQUERY || rte->rtekind == RTE_CTE) &&
attnum > list_length(rte->eref->colnames) &&
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
dpns->inner_plan)
{
TargetEntry *tle;
deparse_namespace save_dpns;
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
tle = get_tle_by_resno(dpns->inner_tlist, attnum);
if (!tle)
elog(ERROR, "invalid attnum %d for relation \"%s\"",
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
attnum, rte->eref->aliasname);
Assert(netlevelsup == 0);
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
push_child_plan(dpns, dpns->inner_plan, &save_dpns);
/*
* Force parentheses because our caller probably assumed a Var is a
* simple expression.
*/
if (!IsA(tle->expr, Var))
appendStringInfoChar(buf, '(');
get_rule_expr((Node *) tle->expr, context, true);
if (!IsA(tle->expr, Var))
appendStringInfoChar(buf, ')');
pop_child_plan(dpns, &save_dpns);
return NULL;
}
Improve ruleutils.c's heuristics for dealing with rangetable aliases. The previous scheme had bugs in some corner cases involving tables that had been renamed since a view was made. This could result in dumped views that failed to reload or reloaded incorrectly, as seen in bug #7553 from Lloyd Albin, as well as in some pgsql-hackers discussion back in January. Also, its behavior for printing EXPLAIN plans was sometimes confusing because of willingness to use the same alias for multiple RTEs (it was Ashutosh Bapat's complaint about that aspect that started the January thread). To fix, ensure that each RTE in the query has a unique unqualified alias, by modifying the alias if necessary (we add "_" and digits as needed to create a non-conflicting name). Then we can just print its variables with that alias, avoiding the confusing and bug-prone scheme of sometimes schema-qualifying variable names. In EXPLAIN, it proves to be expedient to take the further step of only assigning such aliases to RTEs that are actually referenced in the query, since the planner has a habit of generating extra RTEs with the same alias in situations such as inheritance-tree expansion. Although this fixes a bug of very long standing, I'm hesitant to back-patch such a noticeable behavioral change. My experiments while creating a regression test convinced me that actually incorrect output (as opposed to confusing output) occurs only in very narrow cases, which is backed up by the lack of previous complaints from the field. So we may be better off living with it in released branches; and in any case it'd be smart to let this ripen awhile in HEAD before we consider back-patching it.
2012-09-22 01:03:10 +02:00
/*
* If it's an unnamed join, look at the expansion of the alias variable.
* If it's a simple reference to one of the input vars, then recursively
* print the name of that var instead. When it's not a simple reference,
* we have to just print the unqualified join column name. (This can only
* happen with "dangerous" merged columns in a JOIN USING; we took pains
* previously to make the unqualified column name unique in such cases.)
Improve ruleutils.c's heuristics for dealing with rangetable aliases. The previous scheme had bugs in some corner cases involving tables that had been renamed since a view was made. This could result in dumped views that failed to reload or reloaded incorrectly, as seen in bug #7553 from Lloyd Albin, as well as in some pgsql-hackers discussion back in January. Also, its behavior for printing EXPLAIN plans was sometimes confusing because of willingness to use the same alias for multiple RTEs (it was Ashutosh Bapat's complaint about that aspect that started the January thread). To fix, ensure that each RTE in the query has a unique unqualified alias, by modifying the alias if necessary (we add "_" and digits as needed to create a non-conflicting name). Then we can just print its variables with that alias, avoiding the confusing and bug-prone scheme of sometimes schema-qualifying variable names. In EXPLAIN, it proves to be expedient to take the further step of only assigning such aliases to RTEs that are actually referenced in the query, since the planner has a habit of generating extra RTEs with the same alias in situations such as inheritance-tree expansion. Although this fixes a bug of very long standing, I'm hesitant to back-patch such a noticeable behavioral change. My experiments while creating a regression test convinced me that actually incorrect output (as opposed to confusing output) occurs only in very narrow cases, which is backed up by the lack of previous complaints from the field. So we may be better off living with it in released branches; and in any case it'd be smart to let this ripen awhile in HEAD before we consider back-patching it.
2012-09-22 01:03:10 +02:00
*
* This wouldn't work in decompiling plan trees, because we don't store
* joinaliasvars lists after planning; but a plan tree should never
* contain a join alias variable.
*/
if (rte->rtekind == RTE_JOIN && rte->alias == NULL)
{
Improve ruleutils.c's heuristics for dealing with rangetable aliases. The previous scheme had bugs in some corner cases involving tables that had been renamed since a view was made. This could result in dumped views that failed to reload or reloaded incorrectly, as seen in bug #7553 from Lloyd Albin, as well as in some pgsql-hackers discussion back in January. Also, its behavior for printing EXPLAIN plans was sometimes confusing because of willingness to use the same alias for multiple RTEs (it was Ashutosh Bapat's complaint about that aspect that started the January thread). To fix, ensure that each RTE in the query has a unique unqualified alias, by modifying the alias if necessary (we add "_" and digits as needed to create a non-conflicting name). Then we can just print its variables with that alias, avoiding the confusing and bug-prone scheme of sometimes schema-qualifying variable names. In EXPLAIN, it proves to be expedient to take the further step of only assigning such aliases to RTEs that are actually referenced in the query, since the planner has a habit of generating extra RTEs with the same alias in situations such as inheritance-tree expansion. Although this fixes a bug of very long standing, I'm hesitant to back-patch such a noticeable behavioral change. My experiments while creating a regression test convinced me that actually incorrect output (as opposed to confusing output) occurs only in very narrow cases, which is backed up by the lack of previous complaints from the field. So we may be better off living with it in released branches; and in any case it'd be smart to let this ripen awhile in HEAD before we consider back-patching it.
2012-09-22 01:03:10 +02:00
if (rte->joinaliasvars == NIL)
elog(ERROR, "cannot decompile join alias var in plan tree");
if (attnum > 0)
{
Improve ruleutils.c's heuristics for dealing with rangetable aliases. The previous scheme had bugs in some corner cases involving tables that had been renamed since a view was made. This could result in dumped views that failed to reload or reloaded incorrectly, as seen in bug #7553 from Lloyd Albin, as well as in some pgsql-hackers discussion back in January. Also, its behavior for printing EXPLAIN plans was sometimes confusing because of willingness to use the same alias for multiple RTEs (it was Ashutosh Bapat's complaint about that aspect that started the January thread). To fix, ensure that each RTE in the query has a unique unqualified alias, by modifying the alias if necessary (we add "_" and digits as needed to create a non-conflicting name). Then we can just print its variables with that alias, avoiding the confusing and bug-prone scheme of sometimes schema-qualifying variable names. In EXPLAIN, it proves to be expedient to take the further step of only assigning such aliases to RTEs that are actually referenced in the query, since the planner has a habit of generating extra RTEs with the same alias in situations such as inheritance-tree expansion. Although this fixes a bug of very long standing, I'm hesitant to back-patch such a noticeable behavioral change. My experiments while creating a regression test convinced me that actually incorrect output (as opposed to confusing output) occurs only in very narrow cases, which is backed up by the lack of previous complaints from the field. So we may be better off living with it in released branches; and in any case it'd be smart to let this ripen awhile in HEAD before we consider back-patching it.
2012-09-22 01:03:10 +02:00
Var *aliasvar;
Improve ruleutils.c's heuristics for dealing with rangetable aliases. The previous scheme had bugs in some corner cases involving tables that had been renamed since a view was made. This could result in dumped views that failed to reload or reloaded incorrectly, as seen in bug #7553 from Lloyd Albin, as well as in some pgsql-hackers discussion back in January. Also, its behavior for printing EXPLAIN plans was sometimes confusing because of willingness to use the same alias for multiple RTEs (it was Ashutosh Bapat's complaint about that aspect that started the January thread). To fix, ensure that each RTE in the query has a unique unqualified alias, by modifying the alias if necessary (we add "_" and digits as needed to create a non-conflicting name). Then we can just print its variables with that alias, avoiding the confusing and bug-prone scheme of sometimes schema-qualifying variable names. In EXPLAIN, it proves to be expedient to take the further step of only assigning such aliases to RTEs that are actually referenced in the query, since the planner has a habit of generating extra RTEs with the same alias in situations such as inheritance-tree expansion. Although this fixes a bug of very long standing, I'm hesitant to back-patch such a noticeable behavioral change. My experiments while creating a regression test convinced me that actually incorrect output (as opposed to confusing output) occurs only in very narrow cases, which is backed up by the lack of previous complaints from the field. So we may be better off living with it in released branches; and in any case it'd be smart to let this ripen awhile in HEAD before we consider back-patching it.
2012-09-22 01:03:10 +02:00
aliasvar = (Var *) list_nth(rte->joinaliasvars, attnum - 1);
/* we intentionally don't strip implicit coercions here */
if (aliasvar && IsA(aliasvar, Var))
Improve ruleutils.c's heuristics for dealing with rangetable aliases. The previous scheme had bugs in some corner cases involving tables that had been renamed since a view was made. This could result in dumped views that failed to reload or reloaded incorrectly, as seen in bug #7553 from Lloyd Albin, as well as in some pgsql-hackers discussion back in January. Also, its behavior for printing EXPLAIN plans was sometimes confusing because of willingness to use the same alias for multiple RTEs (it was Ashutosh Bapat's complaint about that aspect that started the January thread). To fix, ensure that each RTE in the query has a unique unqualified alias, by modifying the alias if necessary (we add "_" and digits as needed to create a non-conflicting name). Then we can just print its variables with that alias, avoiding the confusing and bug-prone scheme of sometimes schema-qualifying variable names. In EXPLAIN, it proves to be expedient to take the further step of only assigning such aliases to RTEs that are actually referenced in the query, since the planner has a habit of generating extra RTEs with the same alias in situations such as inheritance-tree expansion. Although this fixes a bug of very long standing, I'm hesitant to back-patch such a noticeable behavioral change. My experiments while creating a regression test convinced me that actually incorrect output (as opposed to confusing output) occurs only in very narrow cases, which is backed up by the lack of previous complaints from the field. So we may be better off living with it in released branches; and in any case it'd be smart to let this ripen awhile in HEAD before we consider back-patching it.
2012-09-22 01:03:10 +02:00
{
return get_variable(aliasvar, var->varlevelsup + levelsup,
istoplevel, context);
}
}
Improve ruleutils.c's heuristics for dealing with rangetable aliases. The previous scheme had bugs in some corner cases involving tables that had been renamed since a view was made. This could result in dumped views that failed to reload or reloaded incorrectly, as seen in bug #7553 from Lloyd Albin, as well as in some pgsql-hackers discussion back in January. Also, its behavior for printing EXPLAIN plans was sometimes confusing because of willingness to use the same alias for multiple RTEs (it was Ashutosh Bapat's complaint about that aspect that started the January thread). To fix, ensure that each RTE in the query has a unique unqualified alias, by modifying the alias if necessary (we add "_" and digits as needed to create a non-conflicting name). Then we can just print its variables with that alias, avoiding the confusing and bug-prone scheme of sometimes schema-qualifying variable names. In EXPLAIN, it proves to be expedient to take the further step of only assigning such aliases to RTEs that are actually referenced in the query, since the planner has a habit of generating extra RTEs with the same alias in situations such as inheritance-tree expansion. Although this fixes a bug of very long standing, I'm hesitant to back-patch such a noticeable behavioral change. My experiments while creating a regression test convinced me that actually incorrect output (as opposed to confusing output) occurs only in very narrow cases, which is backed up by the lack of previous complaints from the field. So we may be better off living with it in released branches; and in any case it'd be smart to let this ripen awhile in HEAD before we consider back-patching it.
2012-09-22 01:03:10 +02:00
/*
* Unnamed join has no refname. (Note: since it's unnamed, there is
* no way the user could have referenced it to create a whole-row Var
* for it. So we don't have to cover that case below.)
*/
Assert(refname == NULL);
}
if (attnum == InvalidAttrNumber)
attname = NULL;
else if (attnum > 0)
{
/* Get column name to use from the colinfo struct */
if (attnum > colinfo->num_cols)
elog(ERROR, "invalid attnum %d for relation \"%s\"",
attnum, rte->eref->aliasname);
attname = colinfo->colnames[attnum - 1];
Fix ruleutils issues with dropped cols in functions-returning-composite. Due to lack of concern for the case in the dependency code, it's possible to drop a column of a composite type even though stored queries have references to the dropped column via functions-in-FROM that return the composite type. There are "soft" references, namely FROM-clause aliases for such columns, and "hard" references, that is actual Vars referring to them. The right fix for hard references is to add dependencies preventing the drop; something we've known for many years and not done (and this commit still doesn't address it). A "soft" reference shouldn't prevent a drop though. We've been around on this before (cf. 9b35ddce9, 2c4debbd0), but nobody had noticed that the current behavior can result in dump/reload failures, because ruleutils.c can print more column aliases than the underlying composite type now has. So we need to rejigger the column-alias-handling code to treat such columns as dropped and not print aliases for them. Rather than writing new code for this, I used expandRTE() which already knows how to figure out which function result columns are dropped. I'd initially thought maybe we could use expandRTE() in all cases, but that fails for EXPLAIN's purposes, because the planner strips a lot of RTE infrastructure that expandRTE() needs. So this patch just uses it for unplanned function RTEs and otherwise does things the old way. If there is a hard reference (Var), then removing the column alias causes us to fail to print the Var, since there's no longer a name to print. Failing seems less desirable than printing a made-up name, so I made it print "?dropped?column?" instead. Per report from Timo Stolz. Back-patch to all supported branches. Discussion: https://postgr.es/m/5c91267e-3b6d-5795-189c-d15a55d61dbb@nullachtvierzehn.de
2022-07-21 19:56:02 +02:00
/*
* If we find a Var referencing a dropped column, it seems better to
* print something (anything) than to fail. In general this should
Close old gap in dependency checks for functions returning composite. The dependency logic failed to register a column-level dependency when a view or rule contains a reference to a specific column of the result of a function-returning-composite. That meant you could drop the column from the composite type, causing trouble for future executions of the view. We've known about this for years, but never summoned the energy to actually fix it, instead installing various low-level defenses to prevent crashing on references to dropped columns. We had to do that to plug the hole in stable branches, where there might be pre-existing broken references; but let's fix the root cause today. To do that, add some logic (borrowed from get_rte_attribute_is_dropped) to find_expr_references_walker, to check whether a Var referencing an RTE_FUNCTION RTE is referencing a column of a composite type, and if so add the proper dependency. However ... it seems mighty unwise to remove said low-level defenses, since there could be other bugs now or in the future that allow reaching them. By the same token, letting those defenses go untested seems unwise. Hence, rather than just dropping the associated test cases, hack them to continue working by the expedient of manually dropping the pg_depend entries that this fix installs. Back-patch into v15. I don't want to risk changing this behavior in stable branches, but it seems not too late for v15. (Since we have already forced initdb for beta3, we can be sure that all production v15 installations will have these added dependencies.) Discussion: https://postgr.es/m/182492.1658431155@sss.pgh.pa.us
2022-07-22 18:46:42 +02:00
* not happen, but it used to be possible for some cases involving
* functions returning named composite types, and perhaps there are
* still bugs out there.
Fix ruleutils issues with dropped cols in functions-returning-composite. Due to lack of concern for the case in the dependency code, it's possible to drop a column of a composite type even though stored queries have references to the dropped column via functions-in-FROM that return the composite type. There are "soft" references, namely FROM-clause aliases for such columns, and "hard" references, that is actual Vars referring to them. The right fix for hard references is to add dependencies preventing the drop; something we've known for many years and not done (and this commit still doesn't address it). A "soft" reference shouldn't prevent a drop though. We've been around on this before (cf. 9b35ddce9, 2c4debbd0), but nobody had noticed that the current behavior can result in dump/reload failures, because ruleutils.c can print more column aliases than the underlying composite type now has. So we need to rejigger the column-alias-handling code to treat such columns as dropped and not print aliases for them. Rather than writing new code for this, I used expandRTE() which already knows how to figure out which function result columns are dropped. I'd initially thought maybe we could use expandRTE() in all cases, but that fails for EXPLAIN's purposes, because the planner strips a lot of RTE infrastructure that expandRTE() needs. So this patch just uses it for unplanned function RTEs and otherwise does things the old way. If there is a hard reference (Var), then removing the column alias causes us to fail to print the Var, since there's no longer a name to print. Failing seems less desirable than printing a made-up name, so I made it print "?dropped?column?" instead. Per report from Timo Stolz. Back-patch to all supported branches. Discussion: https://postgr.es/m/5c91267e-3b6d-5795-189c-d15a55d61dbb@nullachtvierzehn.de
2022-07-21 19:56:02 +02:00
*/
if (attname == NULL)
attname = "?dropped?column?";
}
else
{
/* System column - name is fixed, get it from the catalog */
attname = get_rte_attribute_name(rte, attnum);
}
if (refname && (context->varprefix || attname == NULL))
{
appendStringInfoString(buf, quote_identifier(refname));
appendStringInfoChar(buf, '.');
}
if (attname)
appendStringInfoString(buf, quote_identifier(attname));
else
{
appendStringInfoChar(buf, '*');
if (istoplevel)
appendStringInfo(buf, "::%s",
format_type_with_typemod(var->vartype,
var->vartypmod));
}
return attname;
}
/*
* Deparse a Var which references OUTER_VAR, INNER_VAR, or INDEX_VAR. This
* routine is actually a callback for resolve_special_varno, which handles
* finding the correct TargetEntry. We get the expression contained in that
* TargetEntry and just need to deparse it, a job we can throw back on
* get_rule_expr.
*/
static void
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
get_special_variable(Node *node, deparse_context *context, void *callback_arg)
{
StringInfo buf = context->buf;
/*
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
* For a non-Var referent, force parentheses because our caller probably
* assumed a Var is a simple expression.
*/
if (!IsA(node, Var))
appendStringInfoChar(buf, '(');
get_rule_expr(node, context, true);
if (!IsA(node, Var))
appendStringInfoChar(buf, ')');
}
/*
* Chase through plan references to special varnos (OUTER_VAR, INNER_VAR,
* INDEX_VAR) until we find a real Var or some kind of non-Var node; then,
* invoke the callback provided.
*/
static void
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
resolve_special_varno(Node *node, deparse_context *context,
rsv_callback callback, void *callback_arg)
{
Var *var;
deparse_namespace *dpns;
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
/* This function is recursive, so let's be paranoid. */
check_stack_depth();
/* If it's not a Var, invoke the callback. */
if (!IsA(node, Var))
{
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
(*callback) (node, context, callback_arg);
return;
}
/* Find appropriate nesting depth */
var = (Var *) node;
dpns = (deparse_namespace *) list_nth(context->namespaces,
var->varlevelsup);
/*
Reconsider the representation of join alias Vars. The core idea of this patch is to make the parser generate join alias Vars (that is, ones with varno pointing to a JOIN RTE) only when the alias Var is actually different from any raw join input, that is a type coercion and/or COALESCE is necessary to generate the join output value. Otherwise just generate varno/varattno pointing to the relevant join input column. In effect, this means that the planner's flatten_join_alias_vars() transformation is already done in the parser, for all cases except (a) columns that are merged by JOIN USING and are transformed in the process, and (b) whole-row join Vars. In principle that would allow us to skip doing flatten_join_alias_vars() in many more queries than we do now, but we don't have quite enough infrastructure to know that we can do so --- in particular there's no cheap way to know whether there are any whole-row join Vars. I'm not sure if it's worth the trouble to add a Query-level flag for that, and in any case it seems like fit material for a separate patch. But even without skipping the work entirely, this should make flatten_join_alias_vars() faster, particularly where there are nested joins that it previously had to flatten recursively. An essential part of this change is to replace Var nodes' varnoold/varoattno fields with varnosyn/varattnosyn, which have considerably more tightly-defined meanings than the old fields: when they differ from varno/varattno, they identify the Var's position in an aliased JOIN RTE, and the join alias is what ruleutils.c should print for the Var. This is necessary because the varno change destroyed ruleutils.c's ability to find the JOIN RTE from the Var's varno. Another way in which this change broke ruleutils.c is that it's no longer feasible to determine, from a JOIN RTE's joinaliasvars list, which join columns correspond to which columns of the join's immediate input relations. (If those are sub-joins, the joinaliasvars entries may point to columns of their base relations, not the sub-joins.) But that was a horrid mess requiring a lot of fragile assumptions already, so let's just bite the bullet and add some more JOIN RTE fields to make it more straightforward to figure that out. I added two integer-List fields containing the relevant column numbers from the left and right input rels, plus a count of how many merged columns there are. This patch depends on the ParseNamespaceColumn infrastructure that I added in commit 5815696bc. The biggest bit of code change is restructuring transformFromClauseItem's handling of JOINs so that the ParseNamespaceColumn data is propagated upward correctly. Other than that and the ruleutils fixes, everything pretty much just works, though some processing is now inessential. I grabbed two pieces of low-hanging fruit in that line: 1. In find_expr_references, we don't need to recurse into join alias Vars anymore. There aren't any except for references to merged USING columns, which are more properly handled when we scan the join's RTE. This change actually fixes an edge-case issue: we will now record a dependency on any type-coercion function present in a USING column's joinaliasvar, even if that join column has no references in the query text. The odds of the missing dependency causing a problem seem quite small: you'd have to posit somebody dropping an implicit cast between two data types, without removing the types themselves, and then having a stored rule containing a whole-row Var for a join whose USING merge depends on that cast. So I don't feel a great need to change this in the back branches. But in theory this way is more correct. 2. markRTEForSelectPriv and markTargetListOrigin don't need to recurse into join alias Vars either, because the cases they care about don't apply to alias Vars for USING columns that are semantically distinct from the underlying columns. This removes the only case in which markVarForSelectPriv could be called with NULL for the RTE, so adjust the comments to describe that hack as being strictly internal to markRTEForSelectPriv. catversion bump required due to changes in stored rules. Discussion: https://postgr.es/m/7115.1577986646@sss.pgh.pa.us
2020-01-09 17:56:59 +01:00
* If varno is special, recurse. (Don't worry about varnosyn; if we're
* here, we already decided not to use that.)
*/
if (var->varno == OUTER_VAR && dpns->outer_tlist)
{
TargetEntry *tle;
deparse_namespace save_dpns;
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
Bitmapset *save_appendparents;
tle = get_tle_by_resno(dpns->outer_tlist, var->varattno);
if (!tle)
elog(ERROR, "bogus varattno for OUTER_VAR var: %d", var->varattno);
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
/*
* If we're descending to the first child of an Append or MergeAppend,
* update appendparents. This will affect deparsing of all Vars
* appearing within the eventually-resolved subexpression.
*/
save_appendparents = context->appendparents;
if (IsA(dpns->plan, Append))
context->appendparents = bms_union(context->appendparents,
((Append *) dpns->plan)->apprelids);
else if (IsA(dpns->plan, MergeAppend))
context->appendparents = bms_union(context->appendparents,
((MergeAppend *) dpns->plan)->apprelids);
push_child_plan(dpns, dpns->outer_plan, &save_dpns);
resolve_special_varno((Node *) tle->expr, context,
callback, callback_arg);
pop_child_plan(dpns, &save_dpns);
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
context->appendparents = save_appendparents;
return;
}
else if (var->varno == INNER_VAR && dpns->inner_tlist)
{
TargetEntry *tle;
deparse_namespace save_dpns;
tle = get_tle_by_resno(dpns->inner_tlist, var->varattno);
if (!tle)
elog(ERROR, "bogus varattno for INNER_VAR var: %d", var->varattno);
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
push_child_plan(dpns, dpns->inner_plan, &save_dpns);
resolve_special_varno((Node *) tle->expr, context,
callback, callback_arg);
pop_child_plan(dpns, &save_dpns);
return;
}
else if (var->varno == INDEX_VAR && dpns->index_tlist)
{
TargetEntry *tle;
tle = get_tle_by_resno(dpns->index_tlist, var->varattno);
if (!tle)
elog(ERROR, "bogus varattno for INDEX_VAR var: %d", var->varattno);
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
resolve_special_varno((Node *) tle->expr, context,
callback, callback_arg);
return;
}
else if (var->varno < 1 || var->varno > list_length(dpns->rtable))
elog(ERROR, "bogus varno: %d", var->varno);
/* Not special. Just invoke the callback. */
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
(*callback) (node, context, callback_arg);
}
/*
* Get the name of a field of an expression of composite type. The
* expression is usually a Var, but we handle other cases too.
*
* levelsup is an extra offset to interpret the Var's varlevelsup correctly.
*
* This is fairly straightforward when the expression has a named composite
* type; we need only look up the type in the catalogs. However, the type
* could also be RECORD. Since no actual table or view column is allowed to
* have type RECORD, a Var of type RECORD must refer to a JOIN or FUNCTION RTE
* or to a subquery output. We drill down to find the ultimate defining
* expression and attempt to infer the field name from it. We ereport if we
* can't determine the name.
*
* Similarly, a PARAM of type RECORD has to refer to some expression of
* a determinable composite type.
*/
static const char *
get_name_for_var_field(Var *var, int fieldno,
int levelsup, deparse_context *context)
{
RangeTblEntry *rte;
AttrNumber attnum;
int netlevelsup;
deparse_namespace *dpns;
Remove arbitrary 64K-or-so limit on rangetable size. Up to now the size of a query's rangetable has been limited by the constants INNER_VAR et al, which mustn't be equal to any real rangetable index. 65000 doubtless seemed like enough for anybody, and it still is orders of magnitude larger than the number of joins we can realistically handle. However, we need a rangetable entry for each child partition that is (or might be) processed by a query. Queries with a few thousand partitions are getting more realistic, so that the day when that limit becomes a problem is in sight, even if it's not here yet. Hence, let's raise the limit. Rather than just increase the values of INNER_VAR et al, this patch adopts the approach of making them small negative values, so that rangetables could theoretically become as long as INT_MAX. The bulk of the patch is concerned with changing Var.varno and some related variables from "Index" (unsigned int) to plain "int". This is basically cosmetic, with little actual effect other than to help debuggers print their values nicely. As such, I've only bothered with changing places that could actually see INNER_VAR et al, which the parser and most of the planner don't. We do have to be careful in places that are performing less/greater comparisons on varnos, but there are very few such places, other than the IS_SPECIAL_VARNO macro itself. A notable side effect of this patch is that while it used to be possible to add INNER_VAR et al to a Bitmapset, that will now draw an error. I don't see any likelihood that it wouldn't be a bug to include these fake varnos in a bitmapset of real varnos, so I think this is all to the good. Although this touches outfuncs/readfuncs, I don't think a catversion bump is required, since stored rules would never contain Vars with these fake varnos. Andrey Lepikhov and Tom Lane, after a suggestion by Peter Eisentraut Discussion: https://postgr.es/m/43c7f2f5-1e27-27aa-8c65-c91859d15190@postgrespro.ru
2021-09-15 20:11:21 +02:00
int varno;
Reconsider the representation of join alias Vars. The core idea of this patch is to make the parser generate join alias Vars (that is, ones with varno pointing to a JOIN RTE) only when the alias Var is actually different from any raw join input, that is a type coercion and/or COALESCE is necessary to generate the join output value. Otherwise just generate varno/varattno pointing to the relevant join input column. In effect, this means that the planner's flatten_join_alias_vars() transformation is already done in the parser, for all cases except (a) columns that are merged by JOIN USING and are transformed in the process, and (b) whole-row join Vars. In principle that would allow us to skip doing flatten_join_alias_vars() in many more queries than we do now, but we don't have quite enough infrastructure to know that we can do so --- in particular there's no cheap way to know whether there are any whole-row join Vars. I'm not sure if it's worth the trouble to add a Query-level flag for that, and in any case it seems like fit material for a separate patch. But even without skipping the work entirely, this should make flatten_join_alias_vars() faster, particularly where there are nested joins that it previously had to flatten recursively. An essential part of this change is to replace Var nodes' varnoold/varoattno fields with varnosyn/varattnosyn, which have considerably more tightly-defined meanings than the old fields: when they differ from varno/varattno, they identify the Var's position in an aliased JOIN RTE, and the join alias is what ruleutils.c should print for the Var. This is necessary because the varno change destroyed ruleutils.c's ability to find the JOIN RTE from the Var's varno. Another way in which this change broke ruleutils.c is that it's no longer feasible to determine, from a JOIN RTE's joinaliasvars list, which join columns correspond to which columns of the join's immediate input relations. (If those are sub-joins, the joinaliasvars entries may point to columns of their base relations, not the sub-joins.) But that was a horrid mess requiring a lot of fragile assumptions already, so let's just bite the bullet and add some more JOIN RTE fields to make it more straightforward to figure that out. I added two integer-List fields containing the relevant column numbers from the left and right input rels, plus a count of how many merged columns there are. This patch depends on the ParseNamespaceColumn infrastructure that I added in commit 5815696bc. The biggest bit of code change is restructuring transformFromClauseItem's handling of JOINs so that the ParseNamespaceColumn data is propagated upward correctly. Other than that and the ruleutils fixes, everything pretty much just works, though some processing is now inessential. I grabbed two pieces of low-hanging fruit in that line: 1. In find_expr_references, we don't need to recurse into join alias Vars anymore. There aren't any except for references to merged USING columns, which are more properly handled when we scan the join's RTE. This change actually fixes an edge-case issue: we will now record a dependency on any type-coercion function present in a USING column's joinaliasvar, even if that join column has no references in the query text. The odds of the missing dependency causing a problem seem quite small: you'd have to posit somebody dropping an implicit cast between two data types, without removing the types themselves, and then having a stored rule containing a whole-row Var for a join whose USING merge depends on that cast. So I don't feel a great need to change this in the back branches. But in theory this way is more correct. 2. markRTEForSelectPriv and markTargetListOrigin don't need to recurse into join alias Vars either, because the cases they care about don't apply to alias Vars for USING columns that are semantically distinct from the underlying columns. This removes the only case in which markVarForSelectPriv could be called with NULL for the RTE, so adjust the comments to describe that hack as being strictly internal to markRTEForSelectPriv. catversion bump required due to changes in stored rules. Discussion: https://postgr.es/m/7115.1577986646@sss.pgh.pa.us
2020-01-09 17:56:59 +01:00
AttrNumber varattno;
TupleDesc tupleDesc;
Node *expr;
/*
* If it's a RowExpr that was expanded from a whole-row Var, use the
* column names attached to it. (We could let get_expr_result_tupdesc()
* handle this, but it's much cheaper to just pull out the name we need.)
*/
if (IsA(var, RowExpr))
{
RowExpr *r = (RowExpr *) var;
if (fieldno > 0 && fieldno <= list_length(r->colnames))
return strVal(list_nth(r->colnames, fieldno - 1));
}
/*
* If it's a Param of type RECORD, try to find what the Param refers to.
*/
if (IsA(var, Param))
{
Param *param = (Param *) var;
ListCell *ancestor_cell;
expr = find_param_referent(param, context, &dpns, &ancestor_cell);
if (expr)
{
/* Found a match, so recurse to decipher the field name */
deparse_namespace save_dpns;
const char *result;
push_ancestor_plan(dpns, ancestor_cell, &save_dpns);
result = get_name_for_var_field((Var *) expr, fieldno,
0, context);
pop_ancestor_plan(dpns, &save_dpns);
return result;
}
}
/*
* If it's a Var of type RECORD, we have to find what the Var refers to;
* if not, we can use get_expr_result_tupdesc().
*/
if (!IsA(var, Var) ||
var->vartype != RECORDOID)
{
tupleDesc = get_expr_result_tupdesc((Node *) var, false);
/* Got the tupdesc, so we can extract the field name */
Assert(fieldno >= 1 && fieldno <= tupleDesc->natts);
return NameStr(TupleDescAttr(tupleDesc, fieldno - 1)->attname);
}
/* Find appropriate nesting depth */
netlevelsup = var->varlevelsup + levelsup;
if (netlevelsup >= list_length(context->namespaces))
elog(ERROR, "bogus varlevelsup: %d offset %d",
var->varlevelsup, levelsup);
dpns = (deparse_namespace *) list_nth(context->namespaces,
netlevelsup);
Reconsider the representation of join alias Vars. The core idea of this patch is to make the parser generate join alias Vars (that is, ones with varno pointing to a JOIN RTE) only when the alias Var is actually different from any raw join input, that is a type coercion and/or COALESCE is necessary to generate the join output value. Otherwise just generate varno/varattno pointing to the relevant join input column. In effect, this means that the planner's flatten_join_alias_vars() transformation is already done in the parser, for all cases except (a) columns that are merged by JOIN USING and are transformed in the process, and (b) whole-row join Vars. In principle that would allow us to skip doing flatten_join_alias_vars() in many more queries than we do now, but we don't have quite enough infrastructure to know that we can do so --- in particular there's no cheap way to know whether there are any whole-row join Vars. I'm not sure if it's worth the trouble to add a Query-level flag for that, and in any case it seems like fit material for a separate patch. But even without skipping the work entirely, this should make flatten_join_alias_vars() faster, particularly where there are nested joins that it previously had to flatten recursively. An essential part of this change is to replace Var nodes' varnoold/varoattno fields with varnosyn/varattnosyn, which have considerably more tightly-defined meanings than the old fields: when they differ from varno/varattno, they identify the Var's position in an aliased JOIN RTE, and the join alias is what ruleutils.c should print for the Var. This is necessary because the varno change destroyed ruleutils.c's ability to find the JOIN RTE from the Var's varno. Another way in which this change broke ruleutils.c is that it's no longer feasible to determine, from a JOIN RTE's joinaliasvars list, which join columns correspond to which columns of the join's immediate input relations. (If those are sub-joins, the joinaliasvars entries may point to columns of their base relations, not the sub-joins.) But that was a horrid mess requiring a lot of fragile assumptions already, so let's just bite the bullet and add some more JOIN RTE fields to make it more straightforward to figure that out. I added two integer-List fields containing the relevant column numbers from the left and right input rels, plus a count of how many merged columns there are. This patch depends on the ParseNamespaceColumn infrastructure that I added in commit 5815696bc. The biggest bit of code change is restructuring transformFromClauseItem's handling of JOINs so that the ParseNamespaceColumn data is propagated upward correctly. Other than that and the ruleutils fixes, everything pretty much just works, though some processing is now inessential. I grabbed two pieces of low-hanging fruit in that line: 1. In find_expr_references, we don't need to recurse into join alias Vars anymore. There aren't any except for references to merged USING columns, which are more properly handled when we scan the join's RTE. This change actually fixes an edge-case issue: we will now record a dependency on any type-coercion function present in a USING column's joinaliasvar, even if that join column has no references in the query text. The odds of the missing dependency causing a problem seem quite small: you'd have to posit somebody dropping an implicit cast between two data types, without removing the types themselves, and then having a stored rule containing a whole-row Var for a join whose USING merge depends on that cast. So I don't feel a great need to change this in the back branches. But in theory this way is more correct. 2. markRTEForSelectPriv and markTargetListOrigin don't need to recurse into join alias Vars either, because the cases they care about don't apply to alias Vars for USING columns that are semantically distinct from the underlying columns. This removes the only case in which markVarForSelectPriv could be called with NULL for the RTE, so adjust the comments to describe that hack as being strictly internal to markRTEForSelectPriv. catversion bump required due to changes in stored rules. Discussion: https://postgr.es/m/7115.1577986646@sss.pgh.pa.us
2020-01-09 17:56:59 +01:00
/*
* If we have a syntactic referent for the Var, and we're working from a
* parse tree, prefer to use the syntactic referent. Otherwise, fall back
* on the semantic referent. (See comments in get_variable().)
*/
if (var->varnosyn > 0 && dpns->plan == NULL)
{
varno = var->varnosyn;
varattno = var->varattnosyn;
}
else
{
varno = var->varno;
varattno = var->varattno;
}
/*
* Try to find the relevant RTE in this rtable. In a plan tree, it's
* likely that varno is OUTER_VAR or INNER_VAR, in which case we must dig
* down into the subplans, or INDEX_VAR, which is resolved similarly.
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
*
* Note: unlike get_variable and resolve_special_varno, we need not worry
* about inheritance mapping: a child Var should have the same datatype as
* its parent, and here we're really only interested in the Var's type.
*/
Reconsider the representation of join alias Vars. The core idea of this patch is to make the parser generate join alias Vars (that is, ones with varno pointing to a JOIN RTE) only when the alias Var is actually different from any raw join input, that is a type coercion and/or COALESCE is necessary to generate the join output value. Otherwise just generate varno/varattno pointing to the relevant join input column. In effect, this means that the planner's flatten_join_alias_vars() transformation is already done in the parser, for all cases except (a) columns that are merged by JOIN USING and are transformed in the process, and (b) whole-row join Vars. In principle that would allow us to skip doing flatten_join_alias_vars() in many more queries than we do now, but we don't have quite enough infrastructure to know that we can do so --- in particular there's no cheap way to know whether there are any whole-row join Vars. I'm not sure if it's worth the trouble to add a Query-level flag for that, and in any case it seems like fit material for a separate patch. But even without skipping the work entirely, this should make flatten_join_alias_vars() faster, particularly where there are nested joins that it previously had to flatten recursively. An essential part of this change is to replace Var nodes' varnoold/varoattno fields with varnosyn/varattnosyn, which have considerably more tightly-defined meanings than the old fields: when they differ from varno/varattno, they identify the Var's position in an aliased JOIN RTE, and the join alias is what ruleutils.c should print for the Var. This is necessary because the varno change destroyed ruleutils.c's ability to find the JOIN RTE from the Var's varno. Another way in which this change broke ruleutils.c is that it's no longer feasible to determine, from a JOIN RTE's joinaliasvars list, which join columns correspond to which columns of the join's immediate input relations. (If those are sub-joins, the joinaliasvars entries may point to columns of their base relations, not the sub-joins.) But that was a horrid mess requiring a lot of fragile assumptions already, so let's just bite the bullet and add some more JOIN RTE fields to make it more straightforward to figure that out. I added two integer-List fields containing the relevant column numbers from the left and right input rels, plus a count of how many merged columns there are. This patch depends on the ParseNamespaceColumn infrastructure that I added in commit 5815696bc. The biggest bit of code change is restructuring transformFromClauseItem's handling of JOINs so that the ParseNamespaceColumn data is propagated upward correctly. Other than that and the ruleutils fixes, everything pretty much just works, though some processing is now inessential. I grabbed two pieces of low-hanging fruit in that line: 1. In find_expr_references, we don't need to recurse into join alias Vars anymore. There aren't any except for references to merged USING columns, which are more properly handled when we scan the join's RTE. This change actually fixes an edge-case issue: we will now record a dependency on any type-coercion function present in a USING column's joinaliasvar, even if that join column has no references in the query text. The odds of the missing dependency causing a problem seem quite small: you'd have to posit somebody dropping an implicit cast between two data types, without removing the types themselves, and then having a stored rule containing a whole-row Var for a join whose USING merge depends on that cast. So I don't feel a great need to change this in the back branches. But in theory this way is more correct. 2. markRTEForSelectPriv and markTargetListOrigin don't need to recurse into join alias Vars either, because the cases they care about don't apply to alias Vars for USING columns that are semantically distinct from the underlying columns. This removes the only case in which markVarForSelectPriv could be called with NULL for the RTE, so adjust the comments to describe that hack as being strictly internal to markRTEForSelectPriv. catversion bump required due to changes in stored rules. Discussion: https://postgr.es/m/7115.1577986646@sss.pgh.pa.us
2020-01-09 17:56:59 +01:00
if (varno >= 1 && varno <= list_length(dpns->rtable))
{
Reconsider the representation of join alias Vars. The core idea of this patch is to make the parser generate join alias Vars (that is, ones with varno pointing to a JOIN RTE) only when the alias Var is actually different from any raw join input, that is a type coercion and/or COALESCE is necessary to generate the join output value. Otherwise just generate varno/varattno pointing to the relevant join input column. In effect, this means that the planner's flatten_join_alias_vars() transformation is already done in the parser, for all cases except (a) columns that are merged by JOIN USING and are transformed in the process, and (b) whole-row join Vars. In principle that would allow us to skip doing flatten_join_alias_vars() in many more queries than we do now, but we don't have quite enough infrastructure to know that we can do so --- in particular there's no cheap way to know whether there are any whole-row join Vars. I'm not sure if it's worth the trouble to add a Query-level flag for that, and in any case it seems like fit material for a separate patch. But even without skipping the work entirely, this should make flatten_join_alias_vars() faster, particularly where there are nested joins that it previously had to flatten recursively. An essential part of this change is to replace Var nodes' varnoold/varoattno fields with varnosyn/varattnosyn, which have considerably more tightly-defined meanings than the old fields: when they differ from varno/varattno, they identify the Var's position in an aliased JOIN RTE, and the join alias is what ruleutils.c should print for the Var. This is necessary because the varno change destroyed ruleutils.c's ability to find the JOIN RTE from the Var's varno. Another way in which this change broke ruleutils.c is that it's no longer feasible to determine, from a JOIN RTE's joinaliasvars list, which join columns correspond to which columns of the join's immediate input relations. (If those are sub-joins, the joinaliasvars entries may point to columns of their base relations, not the sub-joins.) But that was a horrid mess requiring a lot of fragile assumptions already, so let's just bite the bullet and add some more JOIN RTE fields to make it more straightforward to figure that out. I added two integer-List fields containing the relevant column numbers from the left and right input rels, plus a count of how many merged columns there are. This patch depends on the ParseNamespaceColumn infrastructure that I added in commit 5815696bc. The biggest bit of code change is restructuring transformFromClauseItem's handling of JOINs so that the ParseNamespaceColumn data is propagated upward correctly. Other than that and the ruleutils fixes, everything pretty much just works, though some processing is now inessential. I grabbed two pieces of low-hanging fruit in that line: 1. In find_expr_references, we don't need to recurse into join alias Vars anymore. There aren't any except for references to merged USING columns, which are more properly handled when we scan the join's RTE. This change actually fixes an edge-case issue: we will now record a dependency on any type-coercion function present in a USING column's joinaliasvar, even if that join column has no references in the query text. The odds of the missing dependency causing a problem seem quite small: you'd have to posit somebody dropping an implicit cast between two data types, without removing the types themselves, and then having a stored rule containing a whole-row Var for a join whose USING merge depends on that cast. So I don't feel a great need to change this in the back branches. But in theory this way is more correct. 2. markRTEForSelectPriv and markTargetListOrigin don't need to recurse into join alias Vars either, because the cases they care about don't apply to alias Vars for USING columns that are semantically distinct from the underlying columns. This removes the only case in which markVarForSelectPriv could be called with NULL for the RTE, so adjust the comments to describe that hack as being strictly internal to markRTEForSelectPriv. catversion bump required due to changes in stored rules. Discussion: https://postgr.es/m/7115.1577986646@sss.pgh.pa.us
2020-01-09 17:56:59 +01:00
rte = rt_fetch(varno, dpns->rtable);
attnum = varattno;
}
Reconsider the representation of join alias Vars. The core idea of this patch is to make the parser generate join alias Vars (that is, ones with varno pointing to a JOIN RTE) only when the alias Var is actually different from any raw join input, that is a type coercion and/or COALESCE is necessary to generate the join output value. Otherwise just generate varno/varattno pointing to the relevant join input column. In effect, this means that the planner's flatten_join_alias_vars() transformation is already done in the parser, for all cases except (a) columns that are merged by JOIN USING and are transformed in the process, and (b) whole-row join Vars. In principle that would allow us to skip doing flatten_join_alias_vars() in many more queries than we do now, but we don't have quite enough infrastructure to know that we can do so --- in particular there's no cheap way to know whether there are any whole-row join Vars. I'm not sure if it's worth the trouble to add a Query-level flag for that, and in any case it seems like fit material for a separate patch. But even without skipping the work entirely, this should make flatten_join_alias_vars() faster, particularly where there are nested joins that it previously had to flatten recursively. An essential part of this change is to replace Var nodes' varnoold/varoattno fields with varnosyn/varattnosyn, which have considerably more tightly-defined meanings than the old fields: when they differ from varno/varattno, they identify the Var's position in an aliased JOIN RTE, and the join alias is what ruleutils.c should print for the Var. This is necessary because the varno change destroyed ruleutils.c's ability to find the JOIN RTE from the Var's varno. Another way in which this change broke ruleutils.c is that it's no longer feasible to determine, from a JOIN RTE's joinaliasvars list, which join columns correspond to which columns of the join's immediate input relations. (If those are sub-joins, the joinaliasvars entries may point to columns of their base relations, not the sub-joins.) But that was a horrid mess requiring a lot of fragile assumptions already, so let's just bite the bullet and add some more JOIN RTE fields to make it more straightforward to figure that out. I added two integer-List fields containing the relevant column numbers from the left and right input rels, plus a count of how many merged columns there are. This patch depends on the ParseNamespaceColumn infrastructure that I added in commit 5815696bc. The biggest bit of code change is restructuring transformFromClauseItem's handling of JOINs so that the ParseNamespaceColumn data is propagated upward correctly. Other than that and the ruleutils fixes, everything pretty much just works, though some processing is now inessential. I grabbed two pieces of low-hanging fruit in that line: 1. In find_expr_references, we don't need to recurse into join alias Vars anymore. There aren't any except for references to merged USING columns, which are more properly handled when we scan the join's RTE. This change actually fixes an edge-case issue: we will now record a dependency on any type-coercion function present in a USING column's joinaliasvar, even if that join column has no references in the query text. The odds of the missing dependency causing a problem seem quite small: you'd have to posit somebody dropping an implicit cast between two data types, without removing the types themselves, and then having a stored rule containing a whole-row Var for a join whose USING merge depends on that cast. So I don't feel a great need to change this in the back branches. But in theory this way is more correct. 2. markRTEForSelectPriv and markTargetListOrigin don't need to recurse into join alias Vars either, because the cases they care about don't apply to alias Vars for USING columns that are semantically distinct from the underlying columns. This removes the only case in which markVarForSelectPriv could be called with NULL for the RTE, so adjust the comments to describe that hack as being strictly internal to markRTEForSelectPriv. catversion bump required due to changes in stored rules. Discussion: https://postgr.es/m/7115.1577986646@sss.pgh.pa.us
2020-01-09 17:56:59 +01:00
else if (varno == OUTER_VAR && dpns->outer_tlist)
{
TargetEntry *tle;
deparse_namespace save_dpns;
const char *result;
Reconsider the representation of join alias Vars. The core idea of this patch is to make the parser generate join alias Vars (that is, ones with varno pointing to a JOIN RTE) only when the alias Var is actually different from any raw join input, that is a type coercion and/or COALESCE is necessary to generate the join output value. Otherwise just generate varno/varattno pointing to the relevant join input column. In effect, this means that the planner's flatten_join_alias_vars() transformation is already done in the parser, for all cases except (a) columns that are merged by JOIN USING and are transformed in the process, and (b) whole-row join Vars. In principle that would allow us to skip doing flatten_join_alias_vars() in many more queries than we do now, but we don't have quite enough infrastructure to know that we can do so --- in particular there's no cheap way to know whether there are any whole-row join Vars. I'm not sure if it's worth the trouble to add a Query-level flag for that, and in any case it seems like fit material for a separate patch. But even without skipping the work entirely, this should make flatten_join_alias_vars() faster, particularly where there are nested joins that it previously had to flatten recursively. An essential part of this change is to replace Var nodes' varnoold/varoattno fields with varnosyn/varattnosyn, which have considerably more tightly-defined meanings than the old fields: when they differ from varno/varattno, they identify the Var's position in an aliased JOIN RTE, and the join alias is what ruleutils.c should print for the Var. This is necessary because the varno change destroyed ruleutils.c's ability to find the JOIN RTE from the Var's varno. Another way in which this change broke ruleutils.c is that it's no longer feasible to determine, from a JOIN RTE's joinaliasvars list, which join columns correspond to which columns of the join's immediate input relations. (If those are sub-joins, the joinaliasvars entries may point to columns of their base relations, not the sub-joins.) But that was a horrid mess requiring a lot of fragile assumptions already, so let's just bite the bullet and add some more JOIN RTE fields to make it more straightforward to figure that out. I added two integer-List fields containing the relevant column numbers from the left and right input rels, plus a count of how many merged columns there are. This patch depends on the ParseNamespaceColumn infrastructure that I added in commit 5815696bc. The biggest bit of code change is restructuring transformFromClauseItem's handling of JOINs so that the ParseNamespaceColumn data is propagated upward correctly. Other than that and the ruleutils fixes, everything pretty much just works, though some processing is now inessential. I grabbed two pieces of low-hanging fruit in that line: 1. In find_expr_references, we don't need to recurse into join alias Vars anymore. There aren't any except for references to merged USING columns, which are more properly handled when we scan the join's RTE. This change actually fixes an edge-case issue: we will now record a dependency on any type-coercion function present in a USING column's joinaliasvar, even if that join column has no references in the query text. The odds of the missing dependency causing a problem seem quite small: you'd have to posit somebody dropping an implicit cast between two data types, without removing the types themselves, and then having a stored rule containing a whole-row Var for a join whose USING merge depends on that cast. So I don't feel a great need to change this in the back branches. But in theory this way is more correct. 2. markRTEForSelectPriv and markTargetListOrigin don't need to recurse into join alias Vars either, because the cases they care about don't apply to alias Vars for USING columns that are semantically distinct from the underlying columns. This removes the only case in which markVarForSelectPriv could be called with NULL for the RTE, so adjust the comments to describe that hack as being strictly internal to markRTEForSelectPriv. catversion bump required due to changes in stored rules. Discussion: https://postgr.es/m/7115.1577986646@sss.pgh.pa.us
2020-01-09 17:56:59 +01:00
tle = get_tle_by_resno(dpns->outer_tlist, varattno);
if (!tle)
Reconsider the representation of join alias Vars. The core idea of this patch is to make the parser generate join alias Vars (that is, ones with varno pointing to a JOIN RTE) only when the alias Var is actually different from any raw join input, that is a type coercion and/or COALESCE is necessary to generate the join output value. Otherwise just generate varno/varattno pointing to the relevant join input column. In effect, this means that the planner's flatten_join_alias_vars() transformation is already done in the parser, for all cases except (a) columns that are merged by JOIN USING and are transformed in the process, and (b) whole-row join Vars. In principle that would allow us to skip doing flatten_join_alias_vars() in many more queries than we do now, but we don't have quite enough infrastructure to know that we can do so --- in particular there's no cheap way to know whether there are any whole-row join Vars. I'm not sure if it's worth the trouble to add a Query-level flag for that, and in any case it seems like fit material for a separate patch. But even without skipping the work entirely, this should make flatten_join_alias_vars() faster, particularly where there are nested joins that it previously had to flatten recursively. An essential part of this change is to replace Var nodes' varnoold/varoattno fields with varnosyn/varattnosyn, which have considerably more tightly-defined meanings than the old fields: when they differ from varno/varattno, they identify the Var's position in an aliased JOIN RTE, and the join alias is what ruleutils.c should print for the Var. This is necessary because the varno change destroyed ruleutils.c's ability to find the JOIN RTE from the Var's varno. Another way in which this change broke ruleutils.c is that it's no longer feasible to determine, from a JOIN RTE's joinaliasvars list, which join columns correspond to which columns of the join's immediate input relations. (If those are sub-joins, the joinaliasvars entries may point to columns of their base relations, not the sub-joins.) But that was a horrid mess requiring a lot of fragile assumptions already, so let's just bite the bullet and add some more JOIN RTE fields to make it more straightforward to figure that out. I added two integer-List fields containing the relevant column numbers from the left and right input rels, plus a count of how many merged columns there are. This patch depends on the ParseNamespaceColumn infrastructure that I added in commit 5815696bc. The biggest bit of code change is restructuring transformFromClauseItem's handling of JOINs so that the ParseNamespaceColumn data is propagated upward correctly. Other than that and the ruleutils fixes, everything pretty much just works, though some processing is now inessential. I grabbed two pieces of low-hanging fruit in that line: 1. In find_expr_references, we don't need to recurse into join alias Vars anymore. There aren't any except for references to merged USING columns, which are more properly handled when we scan the join's RTE. This change actually fixes an edge-case issue: we will now record a dependency on any type-coercion function present in a USING column's joinaliasvar, even if that join column has no references in the query text. The odds of the missing dependency causing a problem seem quite small: you'd have to posit somebody dropping an implicit cast between two data types, without removing the types themselves, and then having a stored rule containing a whole-row Var for a join whose USING merge depends on that cast. So I don't feel a great need to change this in the back branches. But in theory this way is more correct. 2. markRTEForSelectPriv and markTargetListOrigin don't need to recurse into join alias Vars either, because the cases they care about don't apply to alias Vars for USING columns that are semantically distinct from the underlying columns. This removes the only case in which markVarForSelectPriv could be called with NULL for the RTE, so adjust the comments to describe that hack as being strictly internal to markRTEForSelectPriv. catversion bump required due to changes in stored rules. Discussion: https://postgr.es/m/7115.1577986646@sss.pgh.pa.us
2020-01-09 17:56:59 +01:00
elog(ERROR, "bogus varattno for OUTER_VAR var: %d", varattno);
Assert(netlevelsup == 0);
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
push_child_plan(dpns, dpns->outer_plan, &save_dpns);
result = get_name_for_var_field((Var *) tle->expr, fieldno,
levelsup, context);
pop_child_plan(dpns, &save_dpns);
return result;
}
Reconsider the representation of join alias Vars. The core idea of this patch is to make the parser generate join alias Vars (that is, ones with varno pointing to a JOIN RTE) only when the alias Var is actually different from any raw join input, that is a type coercion and/or COALESCE is necessary to generate the join output value. Otherwise just generate varno/varattno pointing to the relevant join input column. In effect, this means that the planner's flatten_join_alias_vars() transformation is already done in the parser, for all cases except (a) columns that are merged by JOIN USING and are transformed in the process, and (b) whole-row join Vars. In principle that would allow us to skip doing flatten_join_alias_vars() in many more queries than we do now, but we don't have quite enough infrastructure to know that we can do so --- in particular there's no cheap way to know whether there are any whole-row join Vars. I'm not sure if it's worth the trouble to add a Query-level flag for that, and in any case it seems like fit material for a separate patch. But even without skipping the work entirely, this should make flatten_join_alias_vars() faster, particularly where there are nested joins that it previously had to flatten recursively. An essential part of this change is to replace Var nodes' varnoold/varoattno fields with varnosyn/varattnosyn, which have considerably more tightly-defined meanings than the old fields: when they differ from varno/varattno, they identify the Var's position in an aliased JOIN RTE, and the join alias is what ruleutils.c should print for the Var. This is necessary because the varno change destroyed ruleutils.c's ability to find the JOIN RTE from the Var's varno. Another way in which this change broke ruleutils.c is that it's no longer feasible to determine, from a JOIN RTE's joinaliasvars list, which join columns correspond to which columns of the join's immediate input relations. (If those are sub-joins, the joinaliasvars entries may point to columns of their base relations, not the sub-joins.) But that was a horrid mess requiring a lot of fragile assumptions already, so let's just bite the bullet and add some more JOIN RTE fields to make it more straightforward to figure that out. I added two integer-List fields containing the relevant column numbers from the left and right input rels, plus a count of how many merged columns there are. This patch depends on the ParseNamespaceColumn infrastructure that I added in commit 5815696bc. The biggest bit of code change is restructuring transformFromClauseItem's handling of JOINs so that the ParseNamespaceColumn data is propagated upward correctly. Other than that and the ruleutils fixes, everything pretty much just works, though some processing is now inessential. I grabbed two pieces of low-hanging fruit in that line: 1. In find_expr_references, we don't need to recurse into join alias Vars anymore. There aren't any except for references to merged USING columns, which are more properly handled when we scan the join's RTE. This change actually fixes an edge-case issue: we will now record a dependency on any type-coercion function present in a USING column's joinaliasvar, even if that join column has no references in the query text. The odds of the missing dependency causing a problem seem quite small: you'd have to posit somebody dropping an implicit cast between two data types, without removing the types themselves, and then having a stored rule containing a whole-row Var for a join whose USING merge depends on that cast. So I don't feel a great need to change this in the back branches. But in theory this way is more correct. 2. markRTEForSelectPriv and markTargetListOrigin don't need to recurse into join alias Vars either, because the cases they care about don't apply to alias Vars for USING columns that are semantically distinct from the underlying columns. This removes the only case in which markVarForSelectPriv could be called with NULL for the RTE, so adjust the comments to describe that hack as being strictly internal to markRTEForSelectPriv. catversion bump required due to changes in stored rules. Discussion: https://postgr.es/m/7115.1577986646@sss.pgh.pa.us
2020-01-09 17:56:59 +01:00
else if (varno == INNER_VAR && dpns->inner_tlist)
{
TargetEntry *tle;
deparse_namespace save_dpns;
const char *result;
Reconsider the representation of join alias Vars. The core idea of this patch is to make the parser generate join alias Vars (that is, ones with varno pointing to a JOIN RTE) only when the alias Var is actually different from any raw join input, that is a type coercion and/or COALESCE is necessary to generate the join output value. Otherwise just generate varno/varattno pointing to the relevant join input column. In effect, this means that the planner's flatten_join_alias_vars() transformation is already done in the parser, for all cases except (a) columns that are merged by JOIN USING and are transformed in the process, and (b) whole-row join Vars. In principle that would allow us to skip doing flatten_join_alias_vars() in many more queries than we do now, but we don't have quite enough infrastructure to know that we can do so --- in particular there's no cheap way to know whether there are any whole-row join Vars. I'm not sure if it's worth the trouble to add a Query-level flag for that, and in any case it seems like fit material for a separate patch. But even without skipping the work entirely, this should make flatten_join_alias_vars() faster, particularly where there are nested joins that it previously had to flatten recursively. An essential part of this change is to replace Var nodes' varnoold/varoattno fields with varnosyn/varattnosyn, which have considerably more tightly-defined meanings than the old fields: when they differ from varno/varattno, they identify the Var's position in an aliased JOIN RTE, and the join alias is what ruleutils.c should print for the Var. This is necessary because the varno change destroyed ruleutils.c's ability to find the JOIN RTE from the Var's varno. Another way in which this change broke ruleutils.c is that it's no longer feasible to determine, from a JOIN RTE's joinaliasvars list, which join columns correspond to which columns of the join's immediate input relations. (If those are sub-joins, the joinaliasvars entries may point to columns of their base relations, not the sub-joins.) But that was a horrid mess requiring a lot of fragile assumptions already, so let's just bite the bullet and add some more JOIN RTE fields to make it more straightforward to figure that out. I added two integer-List fields containing the relevant column numbers from the left and right input rels, plus a count of how many merged columns there are. This patch depends on the ParseNamespaceColumn infrastructure that I added in commit 5815696bc. The biggest bit of code change is restructuring transformFromClauseItem's handling of JOINs so that the ParseNamespaceColumn data is propagated upward correctly. Other than that and the ruleutils fixes, everything pretty much just works, though some processing is now inessential. I grabbed two pieces of low-hanging fruit in that line: 1. In find_expr_references, we don't need to recurse into join alias Vars anymore. There aren't any except for references to merged USING columns, which are more properly handled when we scan the join's RTE. This change actually fixes an edge-case issue: we will now record a dependency on any type-coercion function present in a USING column's joinaliasvar, even if that join column has no references in the query text. The odds of the missing dependency causing a problem seem quite small: you'd have to posit somebody dropping an implicit cast between two data types, without removing the types themselves, and then having a stored rule containing a whole-row Var for a join whose USING merge depends on that cast. So I don't feel a great need to change this in the back branches. But in theory this way is more correct. 2. markRTEForSelectPriv and markTargetListOrigin don't need to recurse into join alias Vars either, because the cases they care about don't apply to alias Vars for USING columns that are semantically distinct from the underlying columns. This removes the only case in which markVarForSelectPriv could be called with NULL for the RTE, so adjust the comments to describe that hack as being strictly internal to markRTEForSelectPriv. catversion bump required due to changes in stored rules. Discussion: https://postgr.es/m/7115.1577986646@sss.pgh.pa.us
2020-01-09 17:56:59 +01:00
tle = get_tle_by_resno(dpns->inner_tlist, varattno);
if (!tle)
Reconsider the representation of join alias Vars. The core idea of this patch is to make the parser generate join alias Vars (that is, ones with varno pointing to a JOIN RTE) only when the alias Var is actually different from any raw join input, that is a type coercion and/or COALESCE is necessary to generate the join output value. Otherwise just generate varno/varattno pointing to the relevant join input column. In effect, this means that the planner's flatten_join_alias_vars() transformation is already done in the parser, for all cases except (a) columns that are merged by JOIN USING and are transformed in the process, and (b) whole-row join Vars. In principle that would allow us to skip doing flatten_join_alias_vars() in many more queries than we do now, but we don't have quite enough infrastructure to know that we can do so --- in particular there's no cheap way to know whether there are any whole-row join Vars. I'm not sure if it's worth the trouble to add a Query-level flag for that, and in any case it seems like fit material for a separate patch. But even without skipping the work entirely, this should make flatten_join_alias_vars() faster, particularly where there are nested joins that it previously had to flatten recursively. An essential part of this change is to replace Var nodes' varnoold/varoattno fields with varnosyn/varattnosyn, which have considerably more tightly-defined meanings than the old fields: when they differ from varno/varattno, they identify the Var's position in an aliased JOIN RTE, and the join alias is what ruleutils.c should print for the Var. This is necessary because the varno change destroyed ruleutils.c's ability to find the JOIN RTE from the Var's varno. Another way in which this change broke ruleutils.c is that it's no longer feasible to determine, from a JOIN RTE's joinaliasvars list, which join columns correspond to which columns of the join's immediate input relations. (If those are sub-joins, the joinaliasvars entries may point to columns of their base relations, not the sub-joins.) But that was a horrid mess requiring a lot of fragile assumptions already, so let's just bite the bullet and add some more JOIN RTE fields to make it more straightforward to figure that out. I added two integer-List fields containing the relevant column numbers from the left and right input rels, plus a count of how many merged columns there are. This patch depends on the ParseNamespaceColumn infrastructure that I added in commit 5815696bc. The biggest bit of code change is restructuring transformFromClauseItem's handling of JOINs so that the ParseNamespaceColumn data is propagated upward correctly. Other than that and the ruleutils fixes, everything pretty much just works, though some processing is now inessential. I grabbed two pieces of low-hanging fruit in that line: 1. In find_expr_references, we don't need to recurse into join alias Vars anymore. There aren't any except for references to merged USING columns, which are more properly handled when we scan the join's RTE. This change actually fixes an edge-case issue: we will now record a dependency on any type-coercion function present in a USING column's joinaliasvar, even if that join column has no references in the query text. The odds of the missing dependency causing a problem seem quite small: you'd have to posit somebody dropping an implicit cast between two data types, without removing the types themselves, and then having a stored rule containing a whole-row Var for a join whose USING merge depends on that cast. So I don't feel a great need to change this in the back branches. But in theory this way is more correct. 2. markRTEForSelectPriv and markTargetListOrigin don't need to recurse into join alias Vars either, because the cases they care about don't apply to alias Vars for USING columns that are semantically distinct from the underlying columns. This removes the only case in which markVarForSelectPriv could be called with NULL for the RTE, so adjust the comments to describe that hack as being strictly internal to markRTEForSelectPriv. catversion bump required due to changes in stored rules. Discussion: https://postgr.es/m/7115.1577986646@sss.pgh.pa.us
2020-01-09 17:56:59 +01:00
elog(ERROR, "bogus varattno for INNER_VAR var: %d", varattno);
Assert(netlevelsup == 0);
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
push_child_plan(dpns, dpns->inner_plan, &save_dpns);
result = get_name_for_var_field((Var *) tle->expr, fieldno,
levelsup, context);
pop_child_plan(dpns, &save_dpns);
return result;
}
Reconsider the representation of join alias Vars. The core idea of this patch is to make the parser generate join alias Vars (that is, ones with varno pointing to a JOIN RTE) only when the alias Var is actually different from any raw join input, that is a type coercion and/or COALESCE is necessary to generate the join output value. Otherwise just generate varno/varattno pointing to the relevant join input column. In effect, this means that the planner's flatten_join_alias_vars() transformation is already done in the parser, for all cases except (a) columns that are merged by JOIN USING and are transformed in the process, and (b) whole-row join Vars. In principle that would allow us to skip doing flatten_join_alias_vars() in many more queries than we do now, but we don't have quite enough infrastructure to know that we can do so --- in particular there's no cheap way to know whether there are any whole-row join Vars. I'm not sure if it's worth the trouble to add a Query-level flag for that, and in any case it seems like fit material for a separate patch. But even without skipping the work entirely, this should make flatten_join_alias_vars() faster, particularly where there are nested joins that it previously had to flatten recursively. An essential part of this change is to replace Var nodes' varnoold/varoattno fields with varnosyn/varattnosyn, which have considerably more tightly-defined meanings than the old fields: when they differ from varno/varattno, they identify the Var's position in an aliased JOIN RTE, and the join alias is what ruleutils.c should print for the Var. This is necessary because the varno change destroyed ruleutils.c's ability to find the JOIN RTE from the Var's varno. Another way in which this change broke ruleutils.c is that it's no longer feasible to determine, from a JOIN RTE's joinaliasvars list, which join columns correspond to which columns of the join's immediate input relations. (If those are sub-joins, the joinaliasvars entries may point to columns of their base relations, not the sub-joins.) But that was a horrid mess requiring a lot of fragile assumptions already, so let's just bite the bullet and add some more JOIN RTE fields to make it more straightforward to figure that out. I added two integer-List fields containing the relevant column numbers from the left and right input rels, plus a count of how many merged columns there are. This patch depends on the ParseNamespaceColumn infrastructure that I added in commit 5815696bc. The biggest bit of code change is restructuring transformFromClauseItem's handling of JOINs so that the ParseNamespaceColumn data is propagated upward correctly. Other than that and the ruleutils fixes, everything pretty much just works, though some processing is now inessential. I grabbed two pieces of low-hanging fruit in that line: 1. In find_expr_references, we don't need to recurse into join alias Vars anymore. There aren't any except for references to merged USING columns, which are more properly handled when we scan the join's RTE. This change actually fixes an edge-case issue: we will now record a dependency on any type-coercion function present in a USING column's joinaliasvar, even if that join column has no references in the query text. The odds of the missing dependency causing a problem seem quite small: you'd have to posit somebody dropping an implicit cast between two data types, without removing the types themselves, and then having a stored rule containing a whole-row Var for a join whose USING merge depends on that cast. So I don't feel a great need to change this in the back branches. But in theory this way is more correct. 2. markRTEForSelectPriv and markTargetListOrigin don't need to recurse into join alias Vars either, because the cases they care about don't apply to alias Vars for USING columns that are semantically distinct from the underlying columns. This removes the only case in which markVarForSelectPriv could be called with NULL for the RTE, so adjust the comments to describe that hack as being strictly internal to markRTEForSelectPriv. catversion bump required due to changes in stored rules. Discussion: https://postgr.es/m/7115.1577986646@sss.pgh.pa.us
2020-01-09 17:56:59 +01:00
else if (varno == INDEX_VAR && dpns->index_tlist)
{
TargetEntry *tle;
const char *result;
Reconsider the representation of join alias Vars. The core idea of this patch is to make the parser generate join alias Vars (that is, ones with varno pointing to a JOIN RTE) only when the alias Var is actually different from any raw join input, that is a type coercion and/or COALESCE is necessary to generate the join output value. Otherwise just generate varno/varattno pointing to the relevant join input column. In effect, this means that the planner's flatten_join_alias_vars() transformation is already done in the parser, for all cases except (a) columns that are merged by JOIN USING and are transformed in the process, and (b) whole-row join Vars. In principle that would allow us to skip doing flatten_join_alias_vars() in many more queries than we do now, but we don't have quite enough infrastructure to know that we can do so --- in particular there's no cheap way to know whether there are any whole-row join Vars. I'm not sure if it's worth the trouble to add a Query-level flag for that, and in any case it seems like fit material for a separate patch. But even without skipping the work entirely, this should make flatten_join_alias_vars() faster, particularly where there are nested joins that it previously had to flatten recursively. An essential part of this change is to replace Var nodes' varnoold/varoattno fields with varnosyn/varattnosyn, which have considerably more tightly-defined meanings than the old fields: when they differ from varno/varattno, they identify the Var's position in an aliased JOIN RTE, and the join alias is what ruleutils.c should print for the Var. This is necessary because the varno change destroyed ruleutils.c's ability to find the JOIN RTE from the Var's varno. Another way in which this change broke ruleutils.c is that it's no longer feasible to determine, from a JOIN RTE's joinaliasvars list, which join columns correspond to which columns of the join's immediate input relations. (If those are sub-joins, the joinaliasvars entries may point to columns of their base relations, not the sub-joins.) But that was a horrid mess requiring a lot of fragile assumptions already, so let's just bite the bullet and add some more JOIN RTE fields to make it more straightforward to figure that out. I added two integer-List fields containing the relevant column numbers from the left and right input rels, plus a count of how many merged columns there are. This patch depends on the ParseNamespaceColumn infrastructure that I added in commit 5815696bc. The biggest bit of code change is restructuring transformFromClauseItem's handling of JOINs so that the ParseNamespaceColumn data is propagated upward correctly. Other than that and the ruleutils fixes, everything pretty much just works, though some processing is now inessential. I grabbed two pieces of low-hanging fruit in that line: 1. In find_expr_references, we don't need to recurse into join alias Vars anymore. There aren't any except for references to merged USING columns, which are more properly handled when we scan the join's RTE. This change actually fixes an edge-case issue: we will now record a dependency on any type-coercion function present in a USING column's joinaliasvar, even if that join column has no references in the query text. The odds of the missing dependency causing a problem seem quite small: you'd have to posit somebody dropping an implicit cast between two data types, without removing the types themselves, and then having a stored rule containing a whole-row Var for a join whose USING merge depends on that cast. So I don't feel a great need to change this in the back branches. But in theory this way is more correct. 2. markRTEForSelectPriv and markTargetListOrigin don't need to recurse into join alias Vars either, because the cases they care about don't apply to alias Vars for USING columns that are semantically distinct from the underlying columns. This removes the only case in which markVarForSelectPriv could be called with NULL for the RTE, so adjust the comments to describe that hack as being strictly internal to markRTEForSelectPriv. catversion bump required due to changes in stored rules. Discussion: https://postgr.es/m/7115.1577986646@sss.pgh.pa.us
2020-01-09 17:56:59 +01:00
tle = get_tle_by_resno(dpns->index_tlist, varattno);
if (!tle)
Reconsider the representation of join alias Vars. The core idea of this patch is to make the parser generate join alias Vars (that is, ones with varno pointing to a JOIN RTE) only when the alias Var is actually different from any raw join input, that is a type coercion and/or COALESCE is necessary to generate the join output value. Otherwise just generate varno/varattno pointing to the relevant join input column. In effect, this means that the planner's flatten_join_alias_vars() transformation is already done in the parser, for all cases except (a) columns that are merged by JOIN USING and are transformed in the process, and (b) whole-row join Vars. In principle that would allow us to skip doing flatten_join_alias_vars() in many more queries than we do now, but we don't have quite enough infrastructure to know that we can do so --- in particular there's no cheap way to know whether there are any whole-row join Vars. I'm not sure if it's worth the trouble to add a Query-level flag for that, and in any case it seems like fit material for a separate patch. But even without skipping the work entirely, this should make flatten_join_alias_vars() faster, particularly where there are nested joins that it previously had to flatten recursively. An essential part of this change is to replace Var nodes' varnoold/varoattno fields with varnosyn/varattnosyn, which have considerably more tightly-defined meanings than the old fields: when they differ from varno/varattno, they identify the Var's position in an aliased JOIN RTE, and the join alias is what ruleutils.c should print for the Var. This is necessary because the varno change destroyed ruleutils.c's ability to find the JOIN RTE from the Var's varno. Another way in which this change broke ruleutils.c is that it's no longer feasible to determine, from a JOIN RTE's joinaliasvars list, which join columns correspond to which columns of the join's immediate input relations. (If those are sub-joins, the joinaliasvars entries may point to columns of their base relations, not the sub-joins.) But that was a horrid mess requiring a lot of fragile assumptions already, so let's just bite the bullet and add some more JOIN RTE fields to make it more straightforward to figure that out. I added two integer-List fields containing the relevant column numbers from the left and right input rels, plus a count of how many merged columns there are. This patch depends on the ParseNamespaceColumn infrastructure that I added in commit 5815696bc. The biggest bit of code change is restructuring transformFromClauseItem's handling of JOINs so that the ParseNamespaceColumn data is propagated upward correctly. Other than that and the ruleutils fixes, everything pretty much just works, though some processing is now inessential. I grabbed two pieces of low-hanging fruit in that line: 1. In find_expr_references, we don't need to recurse into join alias Vars anymore. There aren't any except for references to merged USING columns, which are more properly handled when we scan the join's RTE. This change actually fixes an edge-case issue: we will now record a dependency on any type-coercion function present in a USING column's joinaliasvar, even if that join column has no references in the query text. The odds of the missing dependency causing a problem seem quite small: you'd have to posit somebody dropping an implicit cast between two data types, without removing the types themselves, and then having a stored rule containing a whole-row Var for a join whose USING merge depends on that cast. So I don't feel a great need to change this in the back branches. But in theory this way is more correct. 2. markRTEForSelectPriv and markTargetListOrigin don't need to recurse into join alias Vars either, because the cases they care about don't apply to alias Vars for USING columns that are semantically distinct from the underlying columns. This removes the only case in which markVarForSelectPriv could be called with NULL for the RTE, so adjust the comments to describe that hack as being strictly internal to markRTEForSelectPriv. catversion bump required due to changes in stored rules. Discussion: https://postgr.es/m/7115.1577986646@sss.pgh.pa.us
2020-01-09 17:56:59 +01:00
elog(ERROR, "bogus varattno for INDEX_VAR var: %d", varattno);
Assert(netlevelsup == 0);
result = get_name_for_var_field((Var *) tle->expr, fieldno,
levelsup, context);
return result;
}
else
{
Reconsider the representation of join alias Vars. The core idea of this patch is to make the parser generate join alias Vars (that is, ones with varno pointing to a JOIN RTE) only when the alias Var is actually different from any raw join input, that is a type coercion and/or COALESCE is necessary to generate the join output value. Otherwise just generate varno/varattno pointing to the relevant join input column. In effect, this means that the planner's flatten_join_alias_vars() transformation is already done in the parser, for all cases except (a) columns that are merged by JOIN USING and are transformed in the process, and (b) whole-row join Vars. In principle that would allow us to skip doing flatten_join_alias_vars() in many more queries than we do now, but we don't have quite enough infrastructure to know that we can do so --- in particular there's no cheap way to know whether there are any whole-row join Vars. I'm not sure if it's worth the trouble to add a Query-level flag for that, and in any case it seems like fit material for a separate patch. But even without skipping the work entirely, this should make flatten_join_alias_vars() faster, particularly where there are nested joins that it previously had to flatten recursively. An essential part of this change is to replace Var nodes' varnoold/varoattno fields with varnosyn/varattnosyn, which have considerably more tightly-defined meanings than the old fields: when they differ from varno/varattno, they identify the Var's position in an aliased JOIN RTE, and the join alias is what ruleutils.c should print for the Var. This is necessary because the varno change destroyed ruleutils.c's ability to find the JOIN RTE from the Var's varno. Another way in which this change broke ruleutils.c is that it's no longer feasible to determine, from a JOIN RTE's joinaliasvars list, which join columns correspond to which columns of the join's immediate input relations. (If those are sub-joins, the joinaliasvars entries may point to columns of their base relations, not the sub-joins.) But that was a horrid mess requiring a lot of fragile assumptions already, so let's just bite the bullet and add some more JOIN RTE fields to make it more straightforward to figure that out. I added two integer-List fields containing the relevant column numbers from the left and right input rels, plus a count of how many merged columns there are. This patch depends on the ParseNamespaceColumn infrastructure that I added in commit 5815696bc. The biggest bit of code change is restructuring transformFromClauseItem's handling of JOINs so that the ParseNamespaceColumn data is propagated upward correctly. Other than that and the ruleutils fixes, everything pretty much just works, though some processing is now inessential. I grabbed two pieces of low-hanging fruit in that line: 1. In find_expr_references, we don't need to recurse into join alias Vars anymore. There aren't any except for references to merged USING columns, which are more properly handled when we scan the join's RTE. This change actually fixes an edge-case issue: we will now record a dependency on any type-coercion function present in a USING column's joinaliasvar, even if that join column has no references in the query text. The odds of the missing dependency causing a problem seem quite small: you'd have to posit somebody dropping an implicit cast between two data types, without removing the types themselves, and then having a stored rule containing a whole-row Var for a join whose USING merge depends on that cast. So I don't feel a great need to change this in the back branches. But in theory this way is more correct. 2. markRTEForSelectPriv and markTargetListOrigin don't need to recurse into join alias Vars either, because the cases they care about don't apply to alias Vars for USING columns that are semantically distinct from the underlying columns. This removes the only case in which markVarForSelectPriv could be called with NULL for the RTE, so adjust the comments to describe that hack as being strictly internal to markRTEForSelectPriv. catversion bump required due to changes in stored rules. Discussion: https://postgr.es/m/7115.1577986646@sss.pgh.pa.us
2020-01-09 17:56:59 +01:00
elog(ERROR, "bogus varno: %d", varno);
return NULL; /* keep compiler quiet */
}
if (attnum == InvalidAttrNumber)
{
/* Var is whole-row reference to RTE, so select the right field */
return get_rte_attribute_name(rte, fieldno);
}
/*
* This part has essentially the same logic as the parser's
* expandRecordVariable() function, but we are dealing with a different
* representation of the input context, and we only need one field name
* not a TupleDesc. Also, we need special cases for finding subquery and
* CTE subplans when deparsing Plan trees.
*/
expr = (Node *) var; /* default if we can't drill down */
switch (rte->rtekind)
{
case RTE_RELATION:
case RTE_VALUES:
case RTE_NAMEDTUPLESTORE:
In the planner, replace an empty FROM clause with a dummy RTE. The fact that "SELECT expression" has no base relations has long been a thorn in the side of the planner. It makes it hard to flatten a sub-query that looks like that, or is a trivial VALUES() item, because the planner generally uses relid sets to identify sub-relations, and such a sub-query would have an empty relid set if we flattened it. prepjointree.c contains some baroque logic that works around this in certain special cases --- but there is a much better answer. We can replace an empty FROM clause with a dummy RTE that acts like a table of one row and no columns, and then there are no such corner cases to worry about. Instead we need some logic to get rid of useless dummy RTEs, but that's simpler and covers more cases than what was there before. For really trivial cases, where the query is just "SELECT expression" and nothing else, there's a hazard that adding the extra RTE makes for a noticeable slowdown; even though it's not much processing, there's not that much for the planner to do overall. However testing says that the penalty is very small, close to the noise level. In more complex queries, this is able to find optimizations that we could not find before. The new RTE type is called RTE_RESULT, since the "scan" plan type it gives rise to is a Result node (the same plan we produced for a "SELECT expression" query before). To avoid confusion, rename the old ResultPath path type to GroupResultPath, reflecting that it's only used in degenerate grouping cases where we know the query produces just one grouped row. (It wouldn't work to unify the two cases, because there are different rules about where the associated quals live during query_planner.) Note: although this touches readfuncs.c, I don't think a catversion bump is required, because the added case can't occur in stored rules, only plans. Patch by me, reviewed by David Rowley and Mark Dilger Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
case RTE_RESULT:
2005-10-15 04:49:52 +02:00
/*
* This case should not occur: a column of a table, values list,
* or ENR shouldn't have type RECORD. Fall through and fail (most
* likely) at the bottom.
*/
break;
case RTE_SUBQUERY:
/* Subselect-in-FROM: examine sub-select's output expr */
{
if (rte->subquery)
{
TargetEntry *ste = get_tle_by_resno(rte->subquery->targetList,
attnum);
if (ste == NULL || ste->resjunk)
elog(ERROR, "subquery %s does not have attribute %d",
rte->eref->aliasname, attnum);
expr = (Node *) ste->expr;
if (IsA(expr, Var))
{
/*
* Recurse into the sub-select to see what its Var
* refers to. We have to build an additional level of
* namespace to keep in step with varlevelsup in the
* subselect; furthermore, the subquery RTE might be
* from an outer query level, in which case the
* namespace for the subselect must have that outer
* level as parent namespace.
*/
List *save_nslist = context->namespaces;
List *parent_namespaces;
deparse_namespace mydpns;
const char *result;
parent_namespaces = list_copy_tail(context->namespaces,
netlevelsup);
set_deparse_for_query(&mydpns, rte->subquery,
parent_namespaces);
context->namespaces = lcons(&mydpns, parent_namespaces);
result = get_name_for_var_field((Var *) expr, fieldno,
0, context);
context->namespaces = save_nslist;
return result;
}
/* else fall through to inspect the expression */
}
else
{
/*
* We're deparsing a Plan tree so we don't have complete
* RTE entries (in particular, rte->subquery is NULL). But
* the only place we'd see a Var directly referencing a
* SUBQUERY RTE is in a SubqueryScan plan node, and we can
* look into the child plan's tlist instead.
*/
TargetEntry *tle;
deparse_namespace save_dpns;
const char *result;
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
if (!dpns->inner_plan)
elog(ERROR, "failed to find plan for subquery %s",
rte->eref->aliasname);
tle = get_tle_by_resno(dpns->inner_tlist, attnum);
if (!tle)
elog(ERROR, "bogus varattno for subquery var: %d",
attnum);
Assert(netlevelsup == 0);
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
push_child_plan(dpns, dpns->inner_plan, &save_dpns);
result = get_name_for_var_field((Var *) tle->expr, fieldno,
levelsup, context);
pop_child_plan(dpns, &save_dpns);
return result;
}
}
break;
case RTE_JOIN:
/* Join RTE --- recursively inspect the alias variable */
if (rte->joinaliasvars == NIL)
elog(ERROR, "cannot decompile join alias var in plan tree");
Assert(attnum > 0 && attnum <= list_length(rte->joinaliasvars));
expr = (Node *) list_nth(rte->joinaliasvars, attnum - 1);
Assert(expr != NULL);
/* we intentionally don't strip implicit coercions here */
if (IsA(expr, Var))
return get_name_for_var_field((Var *) expr, fieldno,
var->varlevelsup + levelsup,
context);
/* else fall through to inspect the expression */
break;
case RTE_FUNCTION:
case RTE_TABLEFUNC:
2005-10-15 04:49:52 +02:00
/*
* We couldn't get here unless a function is declared with one of
* its result columns as RECORD, which is not allowed.
*/
break;
case RTE_CTE:
/* CTE reference: examine subquery's output expr */
{
CommonTableExpr *cte = NULL;
Index ctelevelsup;
ListCell *lc;
/*
* Try to find the referenced CTE using the namespace stack.
*/
ctelevelsup = rte->ctelevelsup + netlevelsup;
if (ctelevelsup >= list_length(context->namespaces))
lc = NULL;
else
{
deparse_namespace *ctedpns;
ctedpns = (deparse_namespace *)
list_nth(context->namespaces, ctelevelsup);
foreach(lc, ctedpns->ctes)
{
cte = (CommonTableExpr *) lfirst(lc);
if (strcmp(cte->ctename, rte->ctename) == 0)
break;
}
}
if (lc != NULL)
{
Query *ctequery = (Query *) cte->ctequery;
TargetEntry *ste = get_tle_by_resno(GetCTETargetList(cte),
attnum);
if (ste == NULL || ste->resjunk)
elog(ERROR, "CTE %s does not have attribute %d",
rte->eref->aliasname, attnum);
expr = (Node *) ste->expr;
if (IsA(expr, Var))
{
/*
* Recurse into the CTE to see what its Var refers to.
* We have to build an additional level of namespace
* to keep in step with varlevelsup in the CTE;
* furthermore it could be an outer CTE (compare
* SUBQUERY case above).
*/
List *save_nslist = context->namespaces;
List *parent_namespaces;
deparse_namespace mydpns;
const char *result;
parent_namespaces = list_copy_tail(context->namespaces,
ctelevelsup);
set_deparse_for_query(&mydpns, ctequery,
parent_namespaces);
context->namespaces = lcons(&mydpns, parent_namespaces);
result = get_name_for_var_field((Var *) expr, fieldno,
0, context);
context->namespaces = save_nslist;
return result;
}
/* else fall through to inspect the expression */
}
else
{
/*
* We're deparsing a Plan tree so we don't have a CTE
* list. But the only places we'd see a Var directly
* referencing a CTE RTE are in CteScan or WorkTableScan
* plan nodes. For those cases, set_deparse_plan arranged
* for dpns->inner_plan to be the plan node that emits the
* CTE or RecursiveUnion result, and we can look at its
* tlist instead.
*/
TargetEntry *tle;
deparse_namespace save_dpns;
const char *result;
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
if (!dpns->inner_plan)
elog(ERROR, "failed to find plan for CTE %s",
rte->eref->aliasname);
tle = get_tle_by_resno(dpns->inner_tlist, attnum);
if (!tle)
elog(ERROR, "bogus varattno for subquery var: %d",
attnum);
Assert(netlevelsup == 0);
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
push_child_plan(dpns, dpns->inner_plan, &save_dpns);
result = get_name_for_var_field((Var *) tle->expr, fieldno,
levelsup, context);
pop_child_plan(dpns, &save_dpns);
return result;
}
}
break;
}
/*
* We now have an expression we can't expand any more, so see if
* get_expr_result_tupdesc() can do anything with it.
*/
tupleDesc = get_expr_result_tupdesc(expr, false);
/* Got the tupdesc, so we can extract the field name */
Assert(fieldno >= 1 && fieldno <= tupleDesc->natts);
return NameStr(TupleDescAttr(tupleDesc, fieldno - 1)->attname);
}
/*
* Try to find the referenced expression for a PARAM_EXEC Param that might
* reference a parameter supplied by an upper NestLoop or SubPlan plan node.
*
* If successful, return the expression and set *dpns_p and *ancestor_cell_p
* appropriately for calling push_ancestor_plan(). If no referent can be
* found, return NULL.
*/
static Node *
find_param_referent(Param *param, deparse_context *context,
deparse_namespace **dpns_p, ListCell **ancestor_cell_p)
{
/* Initialize output parameters to prevent compiler warnings */
*dpns_p = NULL;
*ancestor_cell_p = NULL;
/*
* If it's a PARAM_EXEC parameter, look for a matching NestLoopParam or
* SubPlan argument. This will necessarily be in some ancestor of the
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
* current expression's Plan node.
*/
if (param->paramkind == PARAM_EXEC)
{
deparse_namespace *dpns;
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
Plan *child_plan;
ListCell *lc;
dpns = (deparse_namespace *) linitial(context->namespaces);
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
child_plan = dpns->plan;
foreach(lc, dpns->ancestors)
{
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
Node *ancestor = (Node *) lfirst(lc);
ListCell *lc2;
/*
* NestLoops transmit params to their inner child only.
*/
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
if (IsA(ancestor, NestLoop) &&
child_plan == innerPlan(ancestor))
{
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
NestLoop *nl = (NestLoop *) ancestor;
foreach(lc2, nl->nestParams)
{
NestLoopParam *nlp = (NestLoopParam *) lfirst(lc2);
if (nlp->paramno == param->paramid)
{
/* Found a match, so return it */
*dpns_p = dpns;
*ancestor_cell_p = lc;
return (Node *) nlp->paramval;
}
}
}
/*
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
* If ancestor is a SubPlan, check the arguments it provides.
*/
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
if (IsA(ancestor, SubPlan))
{
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
SubPlan *subplan = (SubPlan *) ancestor;
ListCell *lc3;
ListCell *lc4;
forboth(lc3, subplan->parParam, lc4, subplan->args)
{
int paramid = lfirst_int(lc3);
Node *arg = (Node *) lfirst(lc4);
if (paramid == param->paramid)
{
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
/*
* Found a match, so return it. But, since Vars in
* the arg are to be evaluated in the surrounding
* context, we have to point to the next ancestor item
* that is *not* a SubPlan.
*/
ListCell *rest;
for_each_cell(rest, dpns->ancestors,
lnext(dpns->ancestors, lc))
{
Node *ancestor2 = (Node *) lfirst(rest);
if (!IsA(ancestor2, SubPlan))
{
*dpns_p = dpns;
*ancestor_cell_p = rest;
return arg;
}
}
elog(ERROR, "SubPlan cannot be outermost ancestor");
}
}
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
/* SubPlan isn't a kind of Plan, so skip the rest */
continue;
}
/*
* We need not consider the ancestor's initPlan list, since
* initplans never have any parParams.
*/
/* No luck, crawl up to next ancestor */
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
child_plan = (Plan *) ancestor;
}
}
/* No referent found */
return NULL;
}
/*
* Try to find a subplan/initplan that emits the value for a PARAM_EXEC Param.
*
* If successful, return the generating subplan/initplan and set *column_p
* to the subplan's 0-based output column number.
* Otherwise, return NULL.
*/
static SubPlan *
find_param_generator(Param *param, deparse_context *context, int *column_p)
{
/* Initialize output parameter to prevent compiler warnings */
*column_p = 0;
/*
* If it's a PARAM_EXEC parameter, search the current plan node as well as
* ancestor nodes looking for a subplan or initplan that emits the value
* for the Param. It could appear in the setParams of an initplan or
* MULTIEXPR_SUBLINK subplan, or in the paramIds of an ancestral SubPlan.
*/
if (param->paramkind == PARAM_EXEC)
{
SubPlan *result;
deparse_namespace *dpns;
ListCell *lc;
dpns = (deparse_namespace *) linitial(context->namespaces);
/* First check the innermost plan node's initplans */
result = find_param_generator_initplan(param, dpns->plan, column_p);
if (result)
return result;
/*
* The plan's targetlist might contain MULTIEXPR_SUBLINK SubPlans,
* which can be referenced by Params elsewhere in the targetlist.
* (Such Params should always be in the same targetlist, so there's no
* need to do this work at upper plan nodes.)
*/
foreach_node(TargetEntry, tle, dpns->plan->targetlist)
{
if (tle->expr && IsA(tle->expr, SubPlan))
{
SubPlan *subplan = (SubPlan *) tle->expr;
if (subplan->subLinkType == MULTIEXPR_SUBLINK)
{
foreach_int(paramid, subplan->setParam)
{
if (paramid == param->paramid)
{
/* Found a match, so return it. */
*column_p = foreach_current_index(paramid);
return subplan;
}
}
}
}
}
/* No luck, so check the ancestor nodes */
foreach(lc, dpns->ancestors)
{
Node *ancestor = (Node *) lfirst(lc);
/*
* If ancestor is a SubPlan, check the paramIds it provides.
*/
if (IsA(ancestor, SubPlan))
{
SubPlan *subplan = (SubPlan *) ancestor;
foreach_int(paramid, subplan->paramIds)
{
if (paramid == param->paramid)
{
/* Found a match, so return it. */
*column_p = foreach_current_index(paramid);
return subplan;
}
}
/* SubPlan isn't a kind of Plan, so skip the rest */
continue;
}
/*
* Otherwise, it's some kind of Plan node, so check its initplans.
*/
result = find_param_generator_initplan(param, (Plan *) ancestor,
column_p);
if (result)
return result;
/* No luck, crawl up to next ancestor */
}
}
/* No generator found */
return NULL;
}
/*
* Subroutine for find_param_generator: search one Plan node's initplans
*/
static SubPlan *
find_param_generator_initplan(Param *param, Plan *plan, int *column_p)
{
foreach_node(SubPlan, subplan, plan->initPlan)
{
foreach_int(paramid, subplan->setParam)
{
if (paramid == param->paramid)
{
/* Found a match, so return it. */
*column_p = foreach_current_index(paramid);
return subplan;
}
}
}
return NULL;
}
/*
* Display a Param appropriately.
*/
static void
get_parameter(Param *param, deparse_context *context)
{
Node *expr;
deparse_namespace *dpns;
ListCell *ancestor_cell;
SubPlan *subplan;
int column;
/*
* If it's a PARAM_EXEC parameter, try to locate the expression from which
* the parameter was computed. This stanza handles only cases in which
* the Param represents an input to the subplan we are currently in.
*/
expr = find_param_referent(param, context, &dpns, &ancestor_cell);
if (expr)
{
/* Found a match, so print it */
deparse_namespace save_dpns;
bool save_varprefix;
bool need_paren;
/* Switch attention to the ancestor plan node */
push_ancestor_plan(dpns, ancestor_cell, &save_dpns);
/*
* Force prefixing of Vars, since they won't belong to the relation
* being scanned in the original plan node.
*/
save_varprefix = context->varprefix;
context->varprefix = true;
/*
* A Param's expansion is typically a Var, Aggref, GroupingFunc, or
* upper-level Param, which wouldn't need extra parentheses.
* Otherwise, insert parens to ensure the expression looks atomic.
*/
need_paren = !(IsA(expr, Var) ||
IsA(expr, Aggref) ||
IsA(expr, GroupingFunc) ||
IsA(expr, Param));
if (need_paren)
appendStringInfoChar(context->buf, '(');
get_rule_expr(expr, context, false);
if (need_paren)
appendStringInfoChar(context->buf, ')');
context->varprefix = save_varprefix;
pop_ancestor_plan(dpns, &save_dpns);
return;
}
/*
* Alternatively, maybe it's a subplan output, which we print as a
* reference to the subplan. (We could drill down into the subplan and
* print the relevant targetlist expression, but that has been deemed too
* confusing since it would violate normal SQL scope rules. Also, we're
* relying on this reference to show that the testexpr containing the
* Param has anything to do with that subplan at all.)
*/
subplan = find_param_generator(param, context, &column);
if (subplan)
{
appendStringInfo(context->buf, "(%s%s).col%d",
subplan->useHashTable ? "hashed " : "",
subplan->plan_name, column + 1);
return;
}
SQL-standard function body This adds support for writing CREATE FUNCTION and CREATE PROCEDURE statements for language SQL with a function body that conforms to the SQL standard and is portable to other implementations. Instead of the PostgreSQL-specific AS $$ string literal $$ syntax, this allows writing out the SQL statements making up the body unquoted, either as a single statement: CREATE FUNCTION add(a integer, b integer) RETURNS integer LANGUAGE SQL RETURN a + b; or as a block CREATE PROCEDURE insert_data(a integer, b integer) LANGUAGE SQL BEGIN ATOMIC INSERT INTO tbl VALUES (a); INSERT INTO tbl VALUES (b); END; The function body is parsed at function definition time and stored as expression nodes in a new pg_proc column prosqlbody. So at run time, no further parsing is required. However, this form does not support polymorphic arguments, because there is no more parse analysis done at call time. Dependencies between the function and the objects it uses are fully tracked. A new RETURN statement is introduced. This can only be used inside function bodies. Internally, it is treated much like a SELECT statement. psql needs some new intelligence to keep track of function body boundaries so that it doesn't send off statements when it sees semicolons that are inside a function body. Tested-by: Jaime Casanova <jcasanov@systemguards.com.ec> Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/1c11f1eb-f00c-43b7-799d-2d44132c02d7@2ndquadrant.com
2021-04-07 21:30:08 +02:00
/*
* If it's an external parameter, see if the outermost namespace provides
* function argument names.
*/
if (param->paramkind == PARAM_EXTERN && context->namespaces != NIL)
SQL-standard function body This adds support for writing CREATE FUNCTION and CREATE PROCEDURE statements for language SQL with a function body that conforms to the SQL standard and is portable to other implementations. Instead of the PostgreSQL-specific AS $$ string literal $$ syntax, this allows writing out the SQL statements making up the body unquoted, either as a single statement: CREATE FUNCTION add(a integer, b integer) RETURNS integer LANGUAGE SQL RETURN a + b; or as a block CREATE PROCEDURE insert_data(a integer, b integer) LANGUAGE SQL BEGIN ATOMIC INSERT INTO tbl VALUES (a); INSERT INTO tbl VALUES (b); END; The function body is parsed at function definition time and stored as expression nodes in a new pg_proc column prosqlbody. So at run time, no further parsing is required. However, this form does not support polymorphic arguments, because there is no more parse analysis done at call time. Dependencies between the function and the objects it uses are fully tracked. A new RETURN statement is introduced. This can only be used inside function bodies. Internally, it is treated much like a SELECT statement. psql needs some new intelligence to keep track of function body boundaries so that it doesn't send off statements when it sees semicolons that are inside a function body. Tested-by: Jaime Casanova <jcasanov@systemguards.com.ec> Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/1c11f1eb-f00c-43b7-799d-2d44132c02d7@2ndquadrant.com
2021-04-07 21:30:08 +02:00
{
dpns = llast(context->namespaces);
if (dpns->argnames &&
param->paramid > 0 &&
param->paramid <= dpns->numargs)
SQL-standard function body This adds support for writing CREATE FUNCTION and CREATE PROCEDURE statements for language SQL with a function body that conforms to the SQL standard and is portable to other implementations. Instead of the PostgreSQL-specific AS $$ string literal $$ syntax, this allows writing out the SQL statements making up the body unquoted, either as a single statement: CREATE FUNCTION add(a integer, b integer) RETURNS integer LANGUAGE SQL RETURN a + b; or as a block CREATE PROCEDURE insert_data(a integer, b integer) LANGUAGE SQL BEGIN ATOMIC INSERT INTO tbl VALUES (a); INSERT INTO tbl VALUES (b); END; The function body is parsed at function definition time and stored as expression nodes in a new pg_proc column prosqlbody. So at run time, no further parsing is required. However, this form does not support polymorphic arguments, because there is no more parse analysis done at call time. Dependencies between the function and the objects it uses are fully tracked. A new RETURN statement is introduced. This can only be used inside function bodies. Internally, it is treated much like a SELECT statement. psql needs some new intelligence to keep track of function body boundaries so that it doesn't send off statements when it sees semicolons that are inside a function body. Tested-by: Jaime Casanova <jcasanov@systemguards.com.ec> Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/1c11f1eb-f00c-43b7-799d-2d44132c02d7@2ndquadrant.com
2021-04-07 21:30:08 +02:00
{
char *argname = dpns->argnames[param->paramid - 1];
if (argname)
{
bool should_qualify = false;
ListCell *lc;
/*
* Qualify the parameter name if there are any other deparse
* namespaces with range tables. This avoids qualifying in
* trivial cases like "RETURN a + b", but makes it safe in all
* other cases.
*/
foreach(lc, context->namespaces)
{
deparse_namespace *depns = lfirst(lc);
SQL-standard function body This adds support for writing CREATE FUNCTION and CREATE PROCEDURE statements for language SQL with a function body that conforms to the SQL standard and is portable to other implementations. Instead of the PostgreSQL-specific AS $$ string literal $$ syntax, this allows writing out the SQL statements making up the body unquoted, either as a single statement: CREATE FUNCTION add(a integer, b integer) RETURNS integer LANGUAGE SQL RETURN a + b; or as a block CREATE PROCEDURE insert_data(a integer, b integer) LANGUAGE SQL BEGIN ATOMIC INSERT INTO tbl VALUES (a); INSERT INTO tbl VALUES (b); END; The function body is parsed at function definition time and stored as expression nodes in a new pg_proc column prosqlbody. So at run time, no further parsing is required. However, this form does not support polymorphic arguments, because there is no more parse analysis done at call time. Dependencies between the function and the objects it uses are fully tracked. A new RETURN statement is introduced. This can only be used inside function bodies. Internally, it is treated much like a SELECT statement. psql needs some new intelligence to keep track of function body boundaries so that it doesn't send off statements when it sees semicolons that are inside a function body. Tested-by: Jaime Casanova <jcasanov@systemguards.com.ec> Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/1c11f1eb-f00c-43b7-799d-2d44132c02d7@2ndquadrant.com
2021-04-07 21:30:08 +02:00
if (depns->rtable_names != NIL)
SQL-standard function body This adds support for writing CREATE FUNCTION and CREATE PROCEDURE statements for language SQL with a function body that conforms to the SQL standard and is portable to other implementations. Instead of the PostgreSQL-specific AS $$ string literal $$ syntax, this allows writing out the SQL statements making up the body unquoted, either as a single statement: CREATE FUNCTION add(a integer, b integer) RETURNS integer LANGUAGE SQL RETURN a + b; or as a block CREATE PROCEDURE insert_data(a integer, b integer) LANGUAGE SQL BEGIN ATOMIC INSERT INTO tbl VALUES (a); INSERT INTO tbl VALUES (b); END; The function body is parsed at function definition time and stored as expression nodes in a new pg_proc column prosqlbody. So at run time, no further parsing is required. However, this form does not support polymorphic arguments, because there is no more parse analysis done at call time. Dependencies between the function and the objects it uses are fully tracked. A new RETURN statement is introduced. This can only be used inside function bodies. Internally, it is treated much like a SELECT statement. psql needs some new intelligence to keep track of function body boundaries so that it doesn't send off statements when it sees semicolons that are inside a function body. Tested-by: Jaime Casanova <jcasanov@systemguards.com.ec> Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/1c11f1eb-f00c-43b7-799d-2d44132c02d7@2ndquadrant.com
2021-04-07 21:30:08 +02:00
{
should_qualify = true;
break;
}
}
if (should_qualify)
{
appendStringInfoString(context->buf, quote_identifier(dpns->funcname));
appendStringInfoChar(context->buf, '.');
}
appendStringInfoString(context->buf, quote_identifier(argname));
return;
}
}
}
/*
* Not PARAM_EXEC, or couldn't find referent: just print $N.
*
* It's a bug if we get here for anything except PARAM_EXTERN Params, but
* in production builds printing $N seems more useful than failing.
*/
Assert(param->paramkind == PARAM_EXTERN);
appendStringInfo(context->buf, "$%d", param->paramid);
}
/*
* get_simple_binary_op_name
*
* helper function for isSimpleNode
* will return single char binary operator name, or NULL if it's not
*/
static const char *
get_simple_binary_op_name(OpExpr *expr)
{
List *args = expr->args;
if (list_length(args) == 2)
{
/* binary operator */
Node *arg1 = (Node *) linitial(args);
Node *arg2 = (Node *) lsecond(args);
const char *op;
op = generate_operator_name(expr->opno, exprType(arg1), exprType(arg2));
if (strlen(op) == 1)
return op;
}
return NULL;
}
/*
* isSimpleNode - check if given node is simple (doesn't need parenthesizing)
*
* true : simple in the context of parent node's type
* false : not simple
*/
static bool
isSimpleNode(Node *node, Node *parentNode, int prettyFlags)
{
if (!node)
return false;
switch (nodeTag(node))
{
case T_Var:
case T_Const:
case T_Param:
case T_CoerceToDomainValue:
case T_SetToDefault:
case T_CurrentOfExpr:
/* single words: always simple */
return true;
case T_SubscriptingRef:
case T_ArrayExpr:
case T_RowExpr:
case T_CoalesceExpr:
case T_MinMaxExpr:
case T_SQLValueFunction:
case T_XmlExpr:
Code review for NextValueExpr expression node type. Add missing infrastructure for this node type, notably in ruleutils.c where its lack could demonstrably cause EXPLAIN to fail. Add outfuncs/readfuncs support. (outfuncs support is useful today for debugging purposes. The readfuncs support may never be needed, since at present it would only matter for parallel query and NextValueExpr should never appear in a parallelizable query; but it seems like a bad idea to have a primnode type that isn't fully supported here.) Teach planner infrastructure that NextValueExpr is a volatile, parallel-unsafe, non-leaky expression node with cost cpu_operator_cost. Given its limited scope of usage, there *might* be no live bug today from the lack of that knowledge, but it's certainly going to bite us on the rear someday. Teach pg_stat_statements about the new node type, too. While at it, also teach cost_qual_eval() that MinMaxExpr, SQLValueFunction, XmlExpr, and CoerceToDomain should be charged as cpu_operator_cost. Failing to do this for SQLValueFunction was an oversight in my commit 0bb51aa96. The others are longer-standing oversights, but no time like the present to fix them. (In principle, CoerceToDomain could have cost much higher than this, but it doesn't presently seem worth trying to examine the domain's constraints here.) Modify execExprInterp.c to execute NextValueExpr as an out-of-line function; it seems quite unlikely to me that it's worth insisting that it be inlined in all expression eval methods. Besides, providing the out-of-line function doesn't stop anyone from inlining if they want to. Adjust some places where NextValueExpr support had been inserted with the aid of a dartboard rather than keeping it in the same order as elsewhere. Discussion: https://postgr.es/m/23862.1499981661@sss.pgh.pa.us
2017-07-14 21:25:43 +02:00
case T_NextValueExpr:
case T_NullIfExpr:
case T_Aggref:
case T_GroupingFunc:
case T_WindowFunc:
case T_MergeSupportFunc:
case T_FuncExpr:
case T_JsonConstructorExpr:
Add SQL/JSON query functions This introduces the following SQL/JSON functions for querying JSON data using jsonpath expressions: JSON_EXISTS(), which can be used to apply a jsonpath expression to a JSON value to check if it yields any values. JSON_QUERY(), which can be used to to apply a jsonpath expression to a JSON value to get a JSON object, an array, or a string. There are various options to control whether multi-value result uses array wrappers and whether the singleton scalar strings are quoted or not. JSON_VALUE(), which can be used to apply a jsonpath expression to a JSON value to return a single scalar value, producing an error if it multiple values are matched. Both JSON_VALUE() and JSON_QUERY() functions have options for handling EMPTY and ERROR conditions, which can be used to specify the behavior when no values are matched and when an error occurs during jsonpath evaluation, respectively. Author: Nikita Glukhov <n.gluhov@postgrespro.ru> Author: Teodor Sigaev <teodor@sigaev.ru> Author: Oleg Bartunov <obartunov@gmail.com> Author: Alexander Korotkov <aekorotkov@gmail.com> Author: Andrew Dunstan <andrew@dunslane.net> Author: Amit Langote <amitlangote09@gmail.com> Author: Peter Eisentraut <peter@eisentraut.org> Author: Jian He <jian.universality@gmail.com> Reviewers have included (in no particular order): Andres Freund, Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu, Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby, Álvaro Herrera, Jian He, Anton A. Melnikov, Nikita Malakhov, Peter Eisentraut, Tomas Vondra Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org Discussion: https://postgr.es/m/CA+HiwqHROpf9e644D8BRqYvaAPmgBZVup-xKMDPk-nd4EpgzHw@mail.gmail.com Discussion: https://postgr.es/m/CA+HiwqE4XTdfb1nW=Ojoy_tQSRhYt-q_kb6i5d4xcKyrLC1Nbg@mail.gmail.com
2024-03-21 09:06:27 +01:00
case T_JsonExpr:
/* function-like: name(..) or name[..] */
return true;
/* CASE keywords act as parentheses */
case T_CaseExpr:
return true;
case T_FieldSelect:
2003-08-04 02:43:34 +02:00
/*
* appears simple since . has top precedence, unless parent is
* T_FieldSelect itself!
*/
return !IsA(parentNode, FieldSelect);
case T_FieldStore:
/*
* treat like FieldSelect (probably doesn't matter)
*/
return !IsA(parentNode, FieldStore);
case T_CoerceToDomain:
/* maybe simple, check args */
return isSimpleNode((Node *) ((CoerceToDomain *) node)->arg,
node, prettyFlags);
case T_RelabelType:
return isSimpleNode((Node *) ((RelabelType *) node)->arg,
node, prettyFlags);
case T_CoerceViaIO:
return isSimpleNode((Node *) ((CoerceViaIO *) node)->arg,
node, prettyFlags);
case T_ArrayCoerceExpr:
return isSimpleNode((Node *) ((ArrayCoerceExpr *) node)->arg,
node, prettyFlags);
case T_ConvertRowtypeExpr:
return isSimpleNode((Node *) ((ConvertRowtypeExpr *) node)->arg,
node, prettyFlags);
case T_OpExpr:
{
/* depends on parent node type; needs further checking */
if (prettyFlags & PRETTYFLAG_PAREN && IsA(parentNode, OpExpr))
2003-08-04 02:43:34 +02:00
{
const char *op;
const char *parentOp;
bool is_lopriop;
bool is_hipriop;
bool is_lopriparent;
bool is_hipriparent;
2003-08-04 02:43:34 +02:00
op = get_simple_binary_op_name((OpExpr *) node);
if (!op)
return false;
/* We know only the basic operators + - and * / % */
is_lopriop = (strchr("+-", *op) != NULL);
is_hipriop = (strchr("*/%", *op) != NULL);
if (!(is_lopriop || is_hipriop))
return false;
parentOp = get_simple_binary_op_name((OpExpr *) parentNode);
if (!parentOp)
return false;
is_lopriparent = (strchr("+-", *parentOp) != NULL);
is_hipriparent = (strchr("*/%", *parentOp) != NULL);
if (!(is_lopriparent || is_hipriparent))
return false;
if (is_hipriop && is_lopriparent)
return true; /* op binds tighter than parent */
if (is_lopriop && is_hipriparent)
return false;
/*
* Operators are same priority --- can skip parens only if
* we have (a - b) - c, not a - (b - c).
*/
if (node == (Node *) linitial(((OpExpr *) parentNode)->args))
return true;
return false;
2003-08-04 02:43:34 +02:00
}
/* else do the same stuff as for T_SubLink et al. */
}
/* FALLTHROUGH */
case T_SubLink:
case T_NullTest:
case T_BooleanTest:
case T_DistinctExpr:
case T_JsonIsPredicate:
switch (nodeTag(parentNode))
{
case T_FuncExpr:
{
/* special handling for casts and COERCE_SQL_SYNTAX */
CoercionForm type = ((FuncExpr *) parentNode)->funcformat;
if (type == COERCE_EXPLICIT_CAST ||
type == COERCE_IMPLICIT_CAST ||
type == COERCE_SQL_SYNTAX)
return false;
return true; /* own parentheses */
}
case T_BoolExpr: /* lower precedence */
case T_SubscriptingRef: /* other separators */
case T_ArrayExpr: /* other separators */
case T_RowExpr: /* other separators */
case T_CoalesceExpr: /* own parentheses */
case T_MinMaxExpr: /* own parentheses */
case T_XmlExpr: /* own parentheses */
case T_NullIfExpr: /* other separators */
case T_Aggref: /* own parentheses */
case T_GroupingFunc: /* own parentheses */
case T_WindowFunc: /* own parentheses */
case T_CaseExpr: /* other separators */
return true;
default:
return false;
}
case T_BoolExpr:
switch (nodeTag(parentNode))
{
case T_BoolExpr:
if (prettyFlags & PRETTYFLAG_PAREN)
{
BoolExprType type;
BoolExprType parentType;
type = ((BoolExpr *) node)->boolop;
parentType = ((BoolExpr *) parentNode)->boolop;
switch (type)
{
case NOT_EXPR:
case AND_EXPR:
if (parentType == AND_EXPR || parentType == OR_EXPR)
return true;
break;
case OR_EXPR:
if (parentType == OR_EXPR)
return true;
break;
}
}
return false;
case T_FuncExpr:
{
/* special handling for casts and COERCE_SQL_SYNTAX */
CoercionForm type = ((FuncExpr *) parentNode)->funcformat;
if (type == COERCE_EXPLICIT_CAST ||
type == COERCE_IMPLICIT_CAST ||
type == COERCE_SQL_SYNTAX)
return false;
return true; /* own parentheses */
}
case T_SubscriptingRef: /* other separators */
case T_ArrayExpr: /* other separators */
case T_RowExpr: /* other separators */
case T_CoalesceExpr: /* own parentheses */
case T_MinMaxExpr: /* own parentheses */
case T_XmlExpr: /* own parentheses */
case T_NullIfExpr: /* other separators */
case T_Aggref: /* own parentheses */
case T_GroupingFunc: /* own parentheses */
case T_WindowFunc: /* own parentheses */
case T_CaseExpr: /* other separators */
Add SQL/JSON query functions This introduces the following SQL/JSON functions for querying JSON data using jsonpath expressions: JSON_EXISTS(), which can be used to apply a jsonpath expression to a JSON value to check if it yields any values. JSON_QUERY(), which can be used to to apply a jsonpath expression to a JSON value to get a JSON object, an array, or a string. There are various options to control whether multi-value result uses array wrappers and whether the singleton scalar strings are quoted or not. JSON_VALUE(), which can be used to apply a jsonpath expression to a JSON value to return a single scalar value, producing an error if it multiple values are matched. Both JSON_VALUE() and JSON_QUERY() functions have options for handling EMPTY and ERROR conditions, which can be used to specify the behavior when no values are matched and when an error occurs during jsonpath evaluation, respectively. Author: Nikita Glukhov <n.gluhov@postgrespro.ru> Author: Teodor Sigaev <teodor@sigaev.ru> Author: Oleg Bartunov <obartunov@gmail.com> Author: Alexander Korotkov <aekorotkov@gmail.com> Author: Andrew Dunstan <andrew@dunslane.net> Author: Amit Langote <amitlangote09@gmail.com> Author: Peter Eisentraut <peter@eisentraut.org> Author: Jian He <jian.universality@gmail.com> Reviewers have included (in no particular order): Andres Freund, Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu, Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby, Álvaro Herrera, Jian He, Anton A. Melnikov, Nikita Malakhov, Peter Eisentraut, Tomas Vondra Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org Discussion: https://postgr.es/m/CA+HiwqHROpf9e644D8BRqYvaAPmgBZVup-xKMDPk-nd4EpgzHw@mail.gmail.com Discussion: https://postgr.es/m/CA+HiwqE4XTdfb1nW=Ojoy_tQSRhYt-q_kb6i5d4xcKyrLC1Nbg@mail.gmail.com
2024-03-21 09:06:27 +01:00
case T_JsonExpr: /* own parentheses */
return true;
default:
return false;
}
case T_JsonValueExpr:
/* maybe simple, check args */
return isSimpleNode((Node *) ((JsonValueExpr *) node)->raw_expr,
node, prettyFlags);
default:
break;
}
/* those we don't know: in dubio complexo */
return false;
}
/*
* appendContextKeyword - append a keyword to buffer
*
* If prettyPrint is enabled, perform a line break, and adjust indentation.
* Otherwise, just append the keyword.
*/
static void
appendContextKeyword(deparse_context *context, const char *str,
int indentBefore, int indentAfter, int indentPlus)
{
StringInfo buf = context->buf;
if (PRETTY_INDENT(context))
{
int indentAmount;
context->indentLevel += indentBefore;
/* remove any trailing spaces currently in the buffer ... */
removeStringInfoSpaces(buf);
/* ... then add a newline and some spaces */
appendStringInfoChar(buf, '\n');
if (context->indentLevel < PRETTYINDENT_LIMIT)
indentAmount = Max(context->indentLevel, 0) + indentPlus;
else
{
/*
* If we're indented more than PRETTYINDENT_LIMIT characters, try
* to conserve horizontal space by reducing the per-level
* indentation. For best results the scale factor here should
* divide all the indent amounts that get added to indentLevel
* (PRETTYINDENT_STD, etc). It's important that the indentation
* not grow unboundedly, else deeply-nested trees use O(N^2)
* whitespace; so we also wrap modulo PRETTYINDENT_LIMIT.
*/
indentAmount = PRETTYINDENT_LIMIT +
(context->indentLevel - PRETTYINDENT_LIMIT) /
(PRETTYINDENT_STD / 2);
indentAmount %= PRETTYINDENT_LIMIT;
/* scale/wrap logic affects indentLevel, but not indentPlus */
indentAmount += indentPlus;
}
appendStringInfoSpaces(buf, indentAmount);
appendStringInfoString(buf, str);
context->indentLevel += indentAfter;
if (context->indentLevel < 0)
context->indentLevel = 0;
}
else
appendStringInfoString(buf, str);
}
/*
* removeStringInfoSpaces - delete trailing spaces from a buffer.
*
* Possibly this should move to stringinfo.c at some point.
*/
static void
removeStringInfoSpaces(StringInfo str)
{
while (str->len > 0 && str->data[str->len - 1] == ' ')
str->data[--(str->len)] = '\0';
}
/*
* get_rule_expr_paren - deparse expr using get_rule_expr,
* embracing the string with parentheses if necessary for prettyPrint.
*
* Never embrace if prettyFlags=0, because it's done in the calling node.
*
* Any node that does *not* embrace its argument node by sql syntax (with
* parentheses, non-operator keywords like CASE/WHEN/ON, or comma etc) should
* use get_rule_expr_paren instead of get_rule_expr so parentheses can be
* added.
*/
static void
get_rule_expr_paren(Node *node, deparse_context *context,
bool showimplicit, Node *parentNode)
{
bool need_paren;
need_paren = PRETTY_PAREN(context) &&
!isSimpleNode(node, parentNode, context->prettyFlags);
if (need_paren)
appendStringInfoChar(context->buf, '(');
get_rule_expr(node, context, showimplicit);
if (need_paren)
appendStringInfoChar(context->buf, ')');
}
Add SQL/JSON query functions This introduces the following SQL/JSON functions for querying JSON data using jsonpath expressions: JSON_EXISTS(), which can be used to apply a jsonpath expression to a JSON value to check if it yields any values. JSON_QUERY(), which can be used to to apply a jsonpath expression to a JSON value to get a JSON object, an array, or a string. There are various options to control whether multi-value result uses array wrappers and whether the singleton scalar strings are quoted or not. JSON_VALUE(), which can be used to apply a jsonpath expression to a JSON value to return a single scalar value, producing an error if it multiple values are matched. Both JSON_VALUE() and JSON_QUERY() functions have options for handling EMPTY and ERROR conditions, which can be used to specify the behavior when no values are matched and when an error occurs during jsonpath evaluation, respectively. Author: Nikita Glukhov <n.gluhov@postgrespro.ru> Author: Teodor Sigaev <teodor@sigaev.ru> Author: Oleg Bartunov <obartunov@gmail.com> Author: Alexander Korotkov <aekorotkov@gmail.com> Author: Andrew Dunstan <andrew@dunslane.net> Author: Amit Langote <amitlangote09@gmail.com> Author: Peter Eisentraut <peter@eisentraut.org> Author: Jian He <jian.universality@gmail.com> Reviewers have included (in no particular order): Andres Freund, Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu, Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby, Álvaro Herrera, Jian He, Anton A. Melnikov, Nikita Malakhov, Peter Eisentraut, Tomas Vondra Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org Discussion: https://postgr.es/m/CA+HiwqHROpf9e644D8BRqYvaAPmgBZVup-xKMDPk-nd4EpgzHw@mail.gmail.com Discussion: https://postgr.es/m/CA+HiwqE4XTdfb1nW=Ojoy_tQSRhYt-q_kb6i5d4xcKyrLC1Nbg@mail.gmail.com
2024-03-21 09:06:27 +01:00
static void
get_json_behavior(JsonBehavior *behavior, deparse_context *context,
const char *on)
{
/*
* The order of array elements must correspond to the order of
* JsonBehaviorType members.
*/
const char *behavior_names[] =
{
" NULL",
" ERROR",
" EMPTY",
" TRUE",
" FALSE",
" UNKNOWN",
" EMPTY ARRAY",
" EMPTY OBJECT",
" DEFAULT "
};
if ((int) behavior->btype < 0 || behavior->btype >= lengthof(behavior_names))
elog(ERROR, "invalid json behavior type: %d", behavior->btype);
appendStringInfoString(context->buf, behavior_names[behavior->btype]);
if (behavior->btype == JSON_BEHAVIOR_DEFAULT)
get_rule_expr(behavior->expr, context, false);
appendStringInfo(context->buf, " ON %s", on);
}
/*
* get_json_expr_options
*
Add basic JSON_TABLE() functionality JSON_TABLE() allows JSON data to be converted into a relational view and thus used, for example, in a FROM clause, like other tabular data. Data to show in the view is selected from a source JSON object using a JSON path expression to get a sequence of JSON objects that's called a "row pattern", which becomes the source to compute the SQL/JSON values that populate the view's output columns. Column values themselves are computed using JSON path expressions applied to each of the JSON objects comprising the "row pattern", for which the SQL/JSON query functions added in 6185c9737cf4 are used. To implement JSON_TABLE() as a table function, this augments the TableFunc and TableFuncScanState nodes that are currently used to support XMLTABLE() with some JSON_TABLE()-specific fields. Note that the JSON_TABLE() spec includes NESTED COLUMNS and PLAN clauses, which are required to provide more flexibility to extract data out of nested JSON objects, but they are not implemented here to keep this commit of manageable size. Author: Nikita Glukhov <n.gluhov@postgrespro.ru> Author: Teodor Sigaev <teodor@sigaev.ru> Author: Oleg Bartunov <obartunov@gmail.com> Author: Alexander Korotkov <aekorotkov@gmail.com> Author: Andrew Dunstan <andrew@dunslane.net> Author: Amit Langote <amitlangote09@gmail.com> Author: Jian He <jian.universality@gmail.com> Reviewers have included (in no particular order): Andres Freund, Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu, Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby, Álvaro Herrera, Jian He Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org Discussion: https://postgr.es/m/CA+HiwqE4XTdfb1nW=Ojoy_tQSRhYt-q_kb6i5d4xcKyrLC1Nbg@mail.gmail.com
2024-04-04 12:57:08 +02:00
* Parse back common options for JSON_QUERY, JSON_VALUE, JSON_EXISTS and
* JSON_TABLE columns.
Add SQL/JSON query functions This introduces the following SQL/JSON functions for querying JSON data using jsonpath expressions: JSON_EXISTS(), which can be used to apply a jsonpath expression to a JSON value to check if it yields any values. JSON_QUERY(), which can be used to to apply a jsonpath expression to a JSON value to get a JSON object, an array, or a string. There are various options to control whether multi-value result uses array wrappers and whether the singleton scalar strings are quoted or not. JSON_VALUE(), which can be used to apply a jsonpath expression to a JSON value to return a single scalar value, producing an error if it multiple values are matched. Both JSON_VALUE() and JSON_QUERY() functions have options for handling EMPTY and ERROR conditions, which can be used to specify the behavior when no values are matched and when an error occurs during jsonpath evaluation, respectively. Author: Nikita Glukhov <n.gluhov@postgrespro.ru> Author: Teodor Sigaev <teodor@sigaev.ru> Author: Oleg Bartunov <obartunov@gmail.com> Author: Alexander Korotkov <aekorotkov@gmail.com> Author: Andrew Dunstan <andrew@dunslane.net> Author: Amit Langote <amitlangote09@gmail.com> Author: Peter Eisentraut <peter@eisentraut.org> Author: Jian He <jian.universality@gmail.com> Reviewers have included (in no particular order): Andres Freund, Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu, Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby, Álvaro Herrera, Jian He, Anton A. Melnikov, Nikita Malakhov, Peter Eisentraut, Tomas Vondra Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org Discussion: https://postgr.es/m/CA+HiwqHROpf9e644D8BRqYvaAPmgBZVup-xKMDPk-nd4EpgzHw@mail.gmail.com Discussion: https://postgr.es/m/CA+HiwqE4XTdfb1nW=Ojoy_tQSRhYt-q_kb6i5d4xcKyrLC1Nbg@mail.gmail.com
2024-03-21 09:06:27 +01:00
*/
static void
get_json_expr_options(JsonExpr *jsexpr, deparse_context *context,
JsonBehaviorType default_behavior)
{
if (jsexpr->op == JSON_QUERY_OP)
{
if (jsexpr->wrapper == JSW_CONDITIONAL)
appendStringInfo(context->buf, " WITH CONDITIONAL WRAPPER");
else if (jsexpr->wrapper == JSW_UNCONDITIONAL)
appendStringInfo(context->buf, " WITH UNCONDITIONAL WRAPPER");
/* The default */
else if (jsexpr->wrapper == JSW_NONE || jsexpr->wrapper == JSW_UNSPEC)
appendStringInfo(context->buf, " WITHOUT WRAPPER");
Add SQL/JSON query functions This introduces the following SQL/JSON functions for querying JSON data using jsonpath expressions: JSON_EXISTS(), which can be used to apply a jsonpath expression to a JSON value to check if it yields any values. JSON_QUERY(), which can be used to to apply a jsonpath expression to a JSON value to get a JSON object, an array, or a string. There are various options to control whether multi-value result uses array wrappers and whether the singleton scalar strings are quoted or not. JSON_VALUE(), which can be used to apply a jsonpath expression to a JSON value to return a single scalar value, producing an error if it multiple values are matched. Both JSON_VALUE() and JSON_QUERY() functions have options for handling EMPTY and ERROR conditions, which can be used to specify the behavior when no values are matched and when an error occurs during jsonpath evaluation, respectively. Author: Nikita Glukhov <n.gluhov@postgrespro.ru> Author: Teodor Sigaev <teodor@sigaev.ru> Author: Oleg Bartunov <obartunov@gmail.com> Author: Alexander Korotkov <aekorotkov@gmail.com> Author: Andrew Dunstan <andrew@dunslane.net> Author: Amit Langote <amitlangote09@gmail.com> Author: Peter Eisentraut <peter@eisentraut.org> Author: Jian He <jian.universality@gmail.com> Reviewers have included (in no particular order): Andres Freund, Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu, Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby, Álvaro Herrera, Jian He, Anton A. Melnikov, Nikita Malakhov, Peter Eisentraut, Tomas Vondra Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org Discussion: https://postgr.es/m/CA+HiwqHROpf9e644D8BRqYvaAPmgBZVup-xKMDPk-nd4EpgzHw@mail.gmail.com Discussion: https://postgr.es/m/CA+HiwqE4XTdfb1nW=Ojoy_tQSRhYt-q_kb6i5d4xcKyrLC1Nbg@mail.gmail.com
2024-03-21 09:06:27 +01:00
if (jsexpr->omit_quotes)
appendStringInfo(context->buf, " OMIT QUOTES");
/* The default */
else
appendStringInfo(context->buf, " KEEP QUOTES");
Add SQL/JSON query functions This introduces the following SQL/JSON functions for querying JSON data using jsonpath expressions: JSON_EXISTS(), which can be used to apply a jsonpath expression to a JSON value to check if it yields any values. JSON_QUERY(), which can be used to to apply a jsonpath expression to a JSON value to get a JSON object, an array, or a string. There are various options to control whether multi-value result uses array wrappers and whether the singleton scalar strings are quoted or not. JSON_VALUE(), which can be used to apply a jsonpath expression to a JSON value to return a single scalar value, producing an error if it multiple values are matched. Both JSON_VALUE() and JSON_QUERY() functions have options for handling EMPTY and ERROR conditions, which can be used to specify the behavior when no values are matched and when an error occurs during jsonpath evaluation, respectively. Author: Nikita Glukhov <n.gluhov@postgrespro.ru> Author: Teodor Sigaev <teodor@sigaev.ru> Author: Oleg Bartunov <obartunov@gmail.com> Author: Alexander Korotkov <aekorotkov@gmail.com> Author: Andrew Dunstan <andrew@dunslane.net> Author: Amit Langote <amitlangote09@gmail.com> Author: Peter Eisentraut <peter@eisentraut.org> Author: Jian He <jian.universality@gmail.com> Reviewers have included (in no particular order): Andres Freund, Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu, Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby, Álvaro Herrera, Jian He, Anton A. Melnikov, Nikita Malakhov, Peter Eisentraut, Tomas Vondra Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org Discussion: https://postgr.es/m/CA+HiwqHROpf9e644D8BRqYvaAPmgBZVup-xKMDPk-nd4EpgzHw@mail.gmail.com Discussion: https://postgr.es/m/CA+HiwqE4XTdfb1nW=Ojoy_tQSRhYt-q_kb6i5d4xcKyrLC1Nbg@mail.gmail.com
2024-03-21 09:06:27 +01:00
}
if (jsexpr->on_empty && jsexpr->on_empty->btype != default_behavior)
get_json_behavior(jsexpr->on_empty, context, "EMPTY");
if (jsexpr->on_error && jsexpr->on_error->btype != default_behavior)
get_json_behavior(jsexpr->on_error, context, "ERROR");
}
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
/* ----------
* get_rule_expr - Parse back an expression
*
* Note: showimplicit determines whether we display any implicit cast that
* is present at the top of the expression tree. It is a passed argument,
* not a field of the context struct, because we change the value as we
* recurse down into the expression. In general we suppress implicit casts
* when the result type is known with certainty (eg, the arguments of an
* OR must be boolean). We display implicit casts for arguments of functions
* and operators, since this is needed to be certain that the same function
* or operator will be chosen when the expression is re-parsed.
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
* ----------
*/
static void
get_rule_expr(Node *node, deparse_context *context,
bool showimplicit)
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
{
StringInfo buf = context->buf;
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
if (node == NULL)
return;
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
/* Guard against excessively long or deeply-nested queries */
CHECK_FOR_INTERRUPTS();
check_stack_depth();
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
/*
* Each level of get_rule_expr must emit an indivisible term
* (parenthesized if necessary) to ensure result is reparsed into the same
* expression tree. The only exception is that when the input is a List,
* we emit the component items comma-separated with no surrounding
* decoration; this is convenient for most callers.
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
*/
switch (nodeTag(node))
{
case T_Var:
(void) get_variable((Var *) node, 0, false, context);
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
break;
case T_Const:
get_const_expr((Const *) node, context, 0);
break;
case T_Param:
get_parameter((Param *) node, context);
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
break;
case T_Aggref:
get_agg_expr((Aggref *) node, context, (Aggref *) node);
break;
Support GROUPING SETS, CUBE and ROLLUP. This SQL standard functionality allows to aggregate data by different GROUP BY clauses at once. Each grouping set returns rows with columns grouped by in other sets set to NULL. This could previously be achieved by doing each grouping as a separate query, conjoined by UNION ALLs. Besides being considerably more concise, grouping sets will in many cases be faster, requiring only one scan over the underlying data. The current implementation of grouping sets only supports using sorting for input. Individual sets that share a sort order are computed in one pass. If there are sets that don't share a sort order, additional sort & aggregation steps are performed. These additional passes are sourced by the previous sort step; thus avoiding repeated scans of the source data. The code is structured in a way that adding support for purely using hash aggregation or a mix of hashing and sorting is possible. Sorting was chosen to be supported first, as it is the most generic method of implementation. Instead of, as in an earlier versions of the patch, representing the chain of sort and aggregation steps as full blown planner and executor nodes, all but the first sort are performed inside the aggregation node itself. This avoids the need to do some unusual gymnastics to handle having to return aggregated and non-aggregated tuples from underlying nodes, as well as having to shut down underlying nodes early to limit memory usage. The optimizer still builds Sort/Agg node to describe each phase, but they're not part of the plan tree, but instead additional data for the aggregation node. They're a convenient and preexisting way to describe aggregation and sorting. The first (and possibly only) sort step is still performed as a separate execution step. That retains similarity with existing group by plans, makes rescans fairly simple, avoids very deep plans (leading to slow explains) and easily allows to avoid the sorting step if the underlying data is sorted by other means. A somewhat ugly side of this patch is having to deal with a grammar ambiguity between the new CUBE keyword and the cube extension/functions named cube (and rollup). To avoid breaking existing deployments of the cube extension it has not been renamed, neither has cube been made a reserved keyword. Instead precedence hacking is used to make GROUP BY cube(..) refer to the CUBE grouping sets feature, and not the function cube(). To actually group by a function cube(), unlikely as that might be, the function name has to be quoted. Needs a catversion bump because stored rules may change. Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
2015-05-16 03:40:59 +02:00
case T_GroupingFunc:
{
GroupingFunc *gexpr = (GroupingFunc *) node;
appendStringInfoString(buf, "GROUPING(");
get_rule_expr((Node *) gexpr->args, context, true);
appendStringInfoChar(buf, ')');
}
break;
case T_WindowFunc:
get_windowfunc_expr((WindowFunc *) node, context);
break;
case T_MergeSupportFunc:
appendStringInfoString(buf, "MERGE_ACTION()");
break;
case T_SubscriptingRef:
{
SubscriptingRef *sbsref = (SubscriptingRef *) node;
bool need_parens;
/*
* If the argument is a CaseTestExpr, we must be inside a
* FieldStore, ie, we are assigning to an element of an array
* within a composite column. Since we already punted on
* displaying the FieldStore's target information, just punt
* here too, and display only the assignment source
* expression.
*/
if (IsA(sbsref->refexpr, CaseTestExpr))
{
Assert(sbsref->refassgnexpr);
get_rule_expr((Node *) sbsref->refassgnexpr,
context, showimplicit);
break;
}
/*
* Parenthesize the argument unless it's a simple Var or a
* FieldSelect. (In particular, if it's another
* SubscriptingRef, we *must* parenthesize to avoid
* confusion.)
*/
need_parens = !IsA(sbsref->refexpr, Var) &&
!IsA(sbsref->refexpr, FieldSelect);
if (need_parens)
appendStringInfoChar(buf, '(');
get_rule_expr((Node *) sbsref->refexpr, context, showimplicit);
if (need_parens)
appendStringInfoChar(buf, ')');
2004-08-29 07:07:03 +02:00
/*
* If there's a refassgnexpr, we want to print the node in the
* format "container[subscripts] := refassgnexpr". This is
* not legal SQL, so decompilation of INSERT or UPDATE
* statements should always use processIndirection as part of
* the statement-level syntax. We should only see this when
* EXPLAIN tries to print the targetlist of a plan resulting
* from such a statement.
*/
if (sbsref->refassgnexpr)
{
Node *refassgnexpr;
/*
* Use processIndirection to print this node's subscripts
* as well as any additional field selections or
* subscripting in immediate descendants. It returns the
* RHS expr that is actually being "assigned".
*/
Make INSERT-from-multiple-VALUES-rows handle targetlist indirection better. Previously, if an INSERT with multiple rows of VALUES had indirection (array subscripting or field selection) in its target-columns list, the parser handled that by applying transformAssignedExpr() to each element of each VALUES row independently. This led to having ArrayRef assignment nodes or FieldStore nodes in each row of the VALUES RTE. That works for simple cases, but in bug #14265 Nuri Boardman points out that it fails if there are multiple assignments to elements/fields of the same target column. For such cases to work, rewriteTargetListIU() has to nest the ArrayRefs or FieldStores together to produce a single expression to be assigned to the column. But it failed to find them in the top-level targetlist and issued an error about "multiple assignments to same column". We could possibly fix this by teaching the rewriter to apply rewriteTargetListIU to each VALUES row separately, but that would be messy (it would change the output rowtype of the VALUES RTE, for example) and inefficient. Instead, let's fix the parser so that the VALUES RTE outputs are just the user-specified values, cast to the right type if necessary, and then the ArrayRefs or FieldStores are applied in the top-level targetlist to Vars representing the RTE's outputs. This is the same parsetree representation already used for similar cases with INSERT/SELECT syntax, so it allows simplifications in ruleutils.c, which no longer needs to treat INSERT-from-multiple-VALUES as its own special case. This implementation works by applying transformAssignedExpr to the VALUES entries as before, and then stripping off any ArrayRefs or FieldStores it adds. With lots of VALUES rows it would be noticeably more efficient to not add those nodes in the first place. But that's just an optimization not a bug fix, and there doesn't seem to be any good way to do it without significant refactoring. (A non-invasive answer would be to apply transformAssignedExpr + stripping to just the first VALUES row, and then just forcibly cast remaining rows to the same data types exposed in the first row. But this way would lead to different, not-INSERT-specific errors being reported in casting failure cases, so it doesn't seem very nice.) So leave that for later; this patch at least isn't making the per-row parsing work worse, and it does make the finished parsetree smaller, saving rewriter and planner work. Catversion bump because stored rules containing such INSERTs would need to change. Because of that, no back-patch, even though this is a very long-standing bug. Report: <20160727005725.7438.26021@wrigleys.postgresql.org> Discussion: <9578.1469645245@sss.pgh.pa.us>
2016-08-03 22:37:03 +02:00
refassgnexpr = processIndirection(node, context);
appendStringInfoString(buf, " := ");
get_rule_expr(refassgnexpr, context, showimplicit);
}
else
{
/* Just an ordinary container fetch, so print subscripts */
printSubscripts(sbsref, context);
}
}
break;
case T_FuncExpr:
get_func_expr((FuncExpr *) node, context, showimplicit);
break;
case T_NamedArgExpr:
{
NamedArgExpr *na = (NamedArgExpr *) node;
appendStringInfo(buf, "%s => ", quote_identifier(na->name));
get_rule_expr((Node *) na->arg, context, showimplicit);
}
break;
case T_OpExpr:
get_oper_expr((OpExpr *) node, context);
break;
case T_DistinctExpr:
{
DistinctExpr *expr = (DistinctExpr *) node;
List *args = expr->args;
Node *arg1 = (Node *) linitial(args);
Node *arg2 = (Node *) lsecond(args);
if (!PRETTY_PAREN(context))
appendStringInfoChar(buf, '(');
get_rule_expr_paren(arg1, context, true, node);
appendStringInfoString(buf, " IS DISTINCT FROM ");
get_rule_expr_paren(arg2, context, true, node);
if (!PRETTY_PAREN(context))
appendStringInfoChar(buf, ')');
}
break;
case T_NullIfExpr:
{
NullIfExpr *nullifexpr = (NullIfExpr *) node;
appendStringInfoString(buf, "NULLIF(");
get_rule_expr((Node *) nullifexpr->args, context, true);
appendStringInfoChar(buf, ')');
}
break;
case T_ScalarArrayOpExpr:
{
ScalarArrayOpExpr *expr = (ScalarArrayOpExpr *) node;
List *args = expr->args;
Node *arg1 = (Node *) linitial(args);
Node *arg2 = (Node *) lsecond(args);
if (!PRETTY_PAREN(context))
appendStringInfoChar(buf, '(');
get_rule_expr_paren(arg1, context, true, node);
appendStringInfo(buf, " %s %s (",
generate_operator_name(expr->opno,
exprType(arg1),
Improve handling of domains over arrays. This patch eliminates various bizarre behaviors caused by sloppy thinking about the difference between a domain type and its underlying array type. In particular, the operation of updating one element of such an array has to be considered as yielding a value of the underlying array type, *not* a value of the domain, because there's no assurance that the domain's CHECK constraints are still satisfied. If we're intending to store the result back into a domain column, we have to re-cast to the domain type so that constraints are re-checked. For similar reasons, such a domain can't be blindly matched to an ANYARRAY polymorphic parameter, because the polymorphic function is likely to apply array-ish operations that could invalidate the domain constraints. For the moment, we just forbid such matching. We might later wish to insert an automatic downcast to the underlying array type, but such a change should also change matching of domains to ANYELEMENT for consistency. To ensure that all such logic is rechecked, this patch removes the original hack of setting a domain's pg_type.typelem field to match its base type; the typelem will always be zero instead. In those places where it's really okay to look through the domain type with no other logic changes, use the newly added get_base_element_type function in place of get_element_type. catversion bumped due to change in pg_type contents. Per bug #5717 from Richard Huxton and subsequent discussion.
2010-10-21 22:07:17 +02:00
get_base_element_type(exprType(arg2))),
expr->useOr ? "ANY" : "ALL");
get_rule_expr_paren(arg2, context, true, node);
Fix ruleutils.c's dumping of ScalarArrayOpExpr containing an EXPR_SUBLINK. When we shoehorned "x op ANY (array)" into the SQL syntax, we created a fundamental ambiguity as to the proper treatment of a sub-SELECT on the righthand side: perhaps what's meant is to compare x against each row of the sub-SELECT's result, or perhaps the sub-SELECT is meant as a scalar sub-SELECT that delivers a single array value whose members should be compared against x. The grammar resolves it as the former case whenever the RHS is a select_with_parens, making the latter case hard to reach --- but you can get at it, with tricks such as attaching a no-op cast to the sub-SELECT. Parse analysis would throw away the no-op cast, leaving a parsetree with an EXPR_SUBLINK SubLink directly under a ScalarArrayOpExpr. ruleutils.c was not clued in on this fine point, and would naively emit "x op ANY ((SELECT ...))", which would be parsed as the first alternative, typically leading to errors like "operator does not exist: text = text[]" during dump/reload of a view or rule containing such a construct. To fix, emit a no-op cast when dumping such a parsetree. This might well be exactly what the user wrote to get the construct accepted in the first place; and even if she got there with some other dodge, it is a valid representation of the parsetree. Per report from Karl Czajkowski. He mentioned only a case involving RLS policies, but actually the problem is very old, so back-patch to all supported branches. Report: <20160421001832.GB7976@moraine.isi.edu>
2016-04-21 20:20:18 +02:00
/*
* There's inherent ambiguity in "x op ANY/ALL (y)" when y is
* a bare sub-SELECT. Since we're here, the sub-SELECT must
* be meant as a scalar sub-SELECT yielding an array value to
* be used in ScalarArrayOpExpr; but the grammar will
* preferentially interpret such a construct as an ANY/ALL
* SubLink. To prevent misparsing the output that way, insert
* a dummy coercion (which will be stripped by parse analysis,
* so no inefficiency is added in dump and reload). This is
* indeed most likely what the user wrote to get the construct
* accepted in the first place.
*/
if (IsA(arg2, SubLink) &&
((SubLink *) arg2)->subLinkType == EXPR_SUBLINK)
appendStringInfo(buf, "::%s",
format_type_with_typemod(exprType(arg2),
exprTypmod(arg2)));
appendStringInfoChar(buf, ')');
if (!PRETTY_PAREN(context))
appendStringInfoChar(buf, ')');
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
}
break;
case T_BoolExpr:
{
BoolExpr *expr = (BoolExpr *) node;
Node *first_arg = linitial(expr->args);
Add for_each_from, to simplify loops starting from non-first list cells. We have a dozen or so places that need to iterate over all but the first cell of a List. Prior to v13 this was typically written as for_each_cell(lc, lnext(list_head(list))) Commit 1cff1b95a changed these to for_each_cell(lc, list, list_second_cell(list)) This patch introduces a new macro for_each_from() which expresses the start point as a list index, allowing these to be written as for_each_from(lc, list, 1) This is marginally more efficient, since ForEachState.i can be initialized directly instead of backing into it from a ListCell address. It also seems clearer and less typo-prone. Some of the remaining uses of for_each_cell() look like they could profitably be changed to for_each_from(), but here I confined myself to changing uses of list_second_cell(). Also, fix for_each_cell_setup() and for_both_cell_setup() to const-ify their arguments; that's a simple oversight in 1cff1b95a. Back-patch into v13, on the grounds that (1) the const-ification is a minor bug fix, and (2) it's better for back-patching purposes if we only have two ways to write these loops rather than three. In HEAD, also remove list_third_cell() and list_fourth_cell(), which were also introduced in 1cff1b95a, and are unused as of cc99baa43. It seems unlikely that any third-party code would have started to use them already; anyone who has can be directed to list_nth_cell instead. Discussion: https://postgr.es/m/CAApHDvpo1zj9KhEpU2cCRZfSM3Q6XGdhzuAS2v79PH7WJBkYVA@mail.gmail.com
2020-09-29 02:32:53 +02:00
ListCell *arg;
switch (expr->boolop)
{
case AND_EXPR:
if (!PRETTY_PAREN(context))
appendStringInfoChar(buf, '(');
get_rule_expr_paren(first_arg, context,
false, node);
Add for_each_from, to simplify loops starting from non-first list cells. We have a dozen or so places that need to iterate over all but the first cell of a List. Prior to v13 this was typically written as for_each_cell(lc, lnext(list_head(list))) Commit 1cff1b95a changed these to for_each_cell(lc, list, list_second_cell(list)) This patch introduces a new macro for_each_from() which expresses the start point as a list index, allowing these to be written as for_each_from(lc, list, 1) This is marginally more efficient, since ForEachState.i can be initialized directly instead of backing into it from a ListCell address. It also seems clearer and less typo-prone. Some of the remaining uses of for_each_cell() look like they could profitably be changed to for_each_from(), but here I confined myself to changing uses of list_second_cell(). Also, fix for_each_cell_setup() and for_both_cell_setup() to const-ify their arguments; that's a simple oversight in 1cff1b95a. Back-patch into v13, on the grounds that (1) the const-ification is a minor bug fix, and (2) it's better for back-patching purposes if we only have two ways to write these loops rather than three. In HEAD, also remove list_third_cell() and list_fourth_cell(), which were also introduced in 1cff1b95a, and are unused as of cc99baa43. It seems unlikely that any third-party code would have started to use them already; anyone who has can be directed to list_nth_cell instead. Discussion: https://postgr.es/m/CAApHDvpo1zj9KhEpU2cCRZfSM3Q6XGdhzuAS2v79PH7WJBkYVA@mail.gmail.com
2020-09-29 02:32:53 +02:00
for_each_from(arg, expr->args, 1)
{
appendStringInfoString(buf, " AND ");
get_rule_expr_paren((Node *) lfirst(arg), context,
false, node);
}
if (!PRETTY_PAREN(context))
appendStringInfoChar(buf, ')');
break;
case OR_EXPR:
if (!PRETTY_PAREN(context))
appendStringInfoChar(buf, '(');
get_rule_expr_paren(first_arg, context,
false, node);
Add for_each_from, to simplify loops starting from non-first list cells. We have a dozen or so places that need to iterate over all but the first cell of a List. Prior to v13 this was typically written as for_each_cell(lc, lnext(list_head(list))) Commit 1cff1b95a changed these to for_each_cell(lc, list, list_second_cell(list)) This patch introduces a new macro for_each_from() which expresses the start point as a list index, allowing these to be written as for_each_from(lc, list, 1) This is marginally more efficient, since ForEachState.i can be initialized directly instead of backing into it from a ListCell address. It also seems clearer and less typo-prone. Some of the remaining uses of for_each_cell() look like they could profitably be changed to for_each_from(), but here I confined myself to changing uses of list_second_cell(). Also, fix for_each_cell_setup() and for_both_cell_setup() to const-ify their arguments; that's a simple oversight in 1cff1b95a. Back-patch into v13, on the grounds that (1) the const-ification is a minor bug fix, and (2) it's better for back-patching purposes if we only have two ways to write these loops rather than three. In HEAD, also remove list_third_cell() and list_fourth_cell(), which were also introduced in 1cff1b95a, and are unused as of cc99baa43. It seems unlikely that any third-party code would have started to use them already; anyone who has can be directed to list_nth_cell instead. Discussion: https://postgr.es/m/CAApHDvpo1zj9KhEpU2cCRZfSM3Q6XGdhzuAS2v79PH7WJBkYVA@mail.gmail.com
2020-09-29 02:32:53 +02:00
for_each_from(arg, expr->args, 1)
{
appendStringInfoString(buf, " OR ");
get_rule_expr_paren((Node *) lfirst(arg), context,
false, node);
}
if (!PRETTY_PAREN(context))
appendStringInfoChar(buf, ')');
break;
case NOT_EXPR:
if (!PRETTY_PAREN(context))
appendStringInfoChar(buf, '(');
appendStringInfoString(buf, "NOT ");
get_rule_expr_paren(first_arg, context,
false, node);
if (!PRETTY_PAREN(context))
appendStringInfoChar(buf, ')');
break;
default:
elog(ERROR, "unrecognized boolop: %d",
(int) expr->boolop);
}
}
break;
case T_SubLink:
get_sublink_expr((SubLink *) node, context);
break;
case T_SubPlan:
{
SubPlan *subplan = (SubPlan *) node;
/*
* We cannot see an already-planned subplan in rule deparsing,
* only while EXPLAINing a query plan. We don't try to
* reconstruct the original SQL, just reference the subplan
* that appears elsewhere in EXPLAIN's result. It does seem
* useful to show the subLinkType and testexpr (if any), and
* we also note whether the subplan will be hashed.
*/
switch (subplan->subLinkType)
{
case EXISTS_SUBLINK:
appendStringInfoString(buf, "EXISTS(");
Assert(subplan->testexpr == NULL);
break;
case ALL_SUBLINK:
appendStringInfoString(buf, "(ALL ");
Assert(subplan->testexpr != NULL);
break;
case ANY_SUBLINK:
appendStringInfoString(buf, "(ANY ");
Assert(subplan->testexpr != NULL);
break;
case ROWCOMPARE_SUBLINK:
/* Parenthesizing the testexpr seems sufficient */
appendStringInfoChar(buf, '(');
Assert(subplan->testexpr != NULL);
break;
case EXPR_SUBLINK:
/* No need to decorate these subplan references */
appendStringInfoChar(buf, '(');
Assert(subplan->testexpr == NULL);
break;
case MULTIEXPR_SUBLINK:
/* MULTIEXPR isn't executed in the normal way */
appendStringInfoString(buf, "(rescan ");
Assert(subplan->testexpr == NULL);
break;
case ARRAY_SUBLINK:
appendStringInfoString(buf, "ARRAY(");
Assert(subplan->testexpr == NULL);
break;
case CTE_SUBLINK:
/* This case is unreachable within expressions */
appendStringInfoString(buf, "CTE(");
Assert(subplan->testexpr == NULL);
break;
}
if (subplan->testexpr != NULL)
{
deparse_namespace *dpns;
/*
* Push SubPlan into ancestors list while deparsing
* testexpr, so that we can handle PARAM_EXEC references
* to the SubPlan's paramIds. (This makes it look like
* the SubPlan is an "ancestor" of the current plan node,
* which is a little weird, but it does no harm.) In this
* path, we don't need to mention the SubPlan explicitly,
* because the referencing Params will show its existence.
*/
dpns = (deparse_namespace *) linitial(context->namespaces);
dpns->ancestors = lcons(subplan, dpns->ancestors);
get_rule_expr(subplan->testexpr, context, showimplicit);
appendStringInfoChar(buf, ')');
dpns->ancestors = list_delete_first(dpns->ancestors);
}
else
{
/* No referencing Params, so show the SubPlan's name */
if (subplan->useHashTable)
appendStringInfo(buf, "hashed %s)", subplan->plan_name);
else
appendStringInfo(buf, "%s)", subplan->plan_name);
}
}
break;
case T_AlternativeSubPlan:
{
AlternativeSubPlan *asplan = (AlternativeSubPlan *) node;
ListCell *lc;
Move resolution of AlternativeSubPlan choices to the planner. When commit bd3daddaf introduced AlternativeSubPlans, I had some ambitions towards allowing the choice of subplan to change during execution. That has not happened, or even been thought about, in the ensuing twelve years; so it seems like a failed experiment. So let's rip that out and resolve the choice of subplan at the end of planning (in setrefs.c) rather than during executor startup. This has a number of positive benefits: * Removal of a few hundred lines of executor code, since AlternativeSubPlans need no longer be supported there. * Removal of executor-startup overhead (particularly, initialization of subplans that won't be used). * Removal of incidental costs of having a larger plan tree, such as tree-scanning and copying costs in the plancache; not to mention setrefs.c's own costs of processing the discarded subplans. * EXPLAIN no longer has to print a weird (and undocumented) representation of an AlternativeSubPlan choice; it sees only the subplan actually used. This should mean less confusion for users. * Since setrefs.c knows which subexpression of a plan node it's working on at any instant, it's possible to adjust the estimated number of executions of the subplan based on that. For example, we should usually estimate more executions of a qual expression than a targetlist expression. The implementation used here is pretty simplistic, because we don't want to expend a lot of cycles on the issue; but it's better than ignoring the point entirely, as the executor had to. That last point might possibly result in shifting the choice between hashed and non-hashed EXISTS subplans in a few cases, but in general this patch isn't meant to change planner choices. Since we're doing the resolution so late, it's really impossible to change any plan choices outside the AlternativeSubPlan itself. Patch by me; thanks to David Rowley for review. Discussion: https://postgr.es/m/1992952.1592785225@sss.pgh.pa.us
2020-09-27 18:51:28 +02:00
/*
* This case cannot be reached in normal usage, since no
* AlternativeSubPlan can appear either in parsetrees or
* finished plan trees. We keep it just in case somebody
* wants to use this code to print planner data structures.
*/
appendStringInfoString(buf, "(alternatives: ");
foreach(lc, asplan->subplans)
{
SubPlan *splan = lfirst_node(SubPlan, lc);
if (splan->useHashTable)
appendStringInfo(buf, "hashed %s", splan->plan_name);
else
appendStringInfoString(buf, splan->plan_name);
Represent Lists as expansible arrays, not chains of cons-cells. Originally, Postgres Lists were a more or less exact reimplementation of Lisp lists, which consist of chains of separately-allocated cons cells, each having a value and a next-cell link. We'd hacked that once before (commit d0b4399d8) to add a separate List header, but the data was still in cons cells. That makes some operations -- notably list_nth() -- O(N), and it's bulky because of the next-cell pointers and per-cell palloc overhead, and it's very cache-unfriendly if the cons cells end up scattered around rather than being adjacent. In this rewrite, we still have List headers, but the data is in a resizable array of values, with no next-cell links. Now we need at most two palloc's per List, and often only one, since we can allocate some values in the same palloc call as the List header. (Of course, extending an existing List may require repalloc's to enlarge the array. But this involves just O(log N) allocations not O(N).) Of course this is not without downsides. The key difficulty is that addition or deletion of a list entry may now cause other entries to move, which it did not before. For example, that breaks foreach() and sister macros, which historically used a pointer to the current cons-cell as loop state. We can repair those macros transparently by making their actual loop state be an integer list index; the exposed "ListCell *" pointer is no longer state carried across loop iterations, but is just a derived value. (In practice, modern compilers can optimize things back to having just one loop state value, at least for simple cases with inline loop bodies.) In principle, this is a semantics change for cases where the loop body inserts or deletes list entries ahead of the current loop index; but I found no such cases in the Postgres code. The change is not at all transparent for code that doesn't use foreach() but chases lists "by hand" using lnext(). The largest share of such code in the backend is in loops that were maintaining "prev" and "next" variables in addition to the current-cell pointer, in order to delete list cells efficiently using list_delete_cell(). However, we no longer need a previous-cell pointer to delete a list cell efficiently. Keeping a next-cell pointer doesn't work, as explained above, but we can improve matters by changing such code to use a regular foreach() loop and then using the new macro foreach_delete_current() to delete the current cell. (This macro knows how to update the associated foreach loop's state so that no cells will be missed in the traversal.) There remains a nontrivial risk of code assuming that a ListCell * pointer will remain good over an operation that could now move the list contents. To help catch such errors, list.c can be compiled with a new define symbol DEBUG_LIST_MEMORY_USAGE that forcibly moves list contents whenever that could possibly happen. This makes list operations significantly more expensive so it's not normally turned on (though it is on by default if USE_VALGRIND is on). There are two notable API differences from the previous code: * lnext() now requires the List's header pointer in addition to the current cell's address. * list_delete_cell() no longer requires a previous-cell argument. These changes are somewhat unfortunate, but on the other hand code using either function needs inspection to see if it is assuming anything it shouldn't, so it's not all bad. Programmers should be aware of these significant performance changes: * list_nth() and related functions are now O(1); so there's no major access-speed difference between a list and an array. * Inserting or deleting a list element now takes time proportional to the distance to the end of the list, due to moving the array elements. (However, it typically *doesn't* require palloc or pfree, so except in long lists it's probably still faster than before.) Notably, lcons() used to be about the same cost as lappend(), but that's no longer true if the list is long. Code that uses lcons() and list_delete_first() to maintain a stack might usefully be rewritten to push and pop at the end of the list rather than the beginning. * There are now list_insert_nth...() and list_delete_nth...() functions that add or remove a list cell identified by index. These have the data-movement penalty explained above, but there's no search penalty. * list_concat() and variants now copy the second list's data into storage belonging to the first list, so there is no longer any sharing of cells between the input lists. The second argument is now declared "const List *" to reflect that it isn't changed. This patch just does the minimum needed to get the new implementation in place and fix bugs exposed by the regression tests. As suggested by the foregoing, there's a fair amount of followup work remaining to do. Also, the ENABLE_LIST_COMPAT macros are finally removed in this commit. Code using those should have been gone a dozen years ago. Patch by me; thanks to David Rowley, Jesper Pedersen, and others for review. Discussion: https://postgr.es/m/11587.1550975080@sss.pgh.pa.us
2019-07-15 19:41:58 +02:00
if (lnext(asplan->subplans, lc))
appendStringInfoString(buf, " or ");
}
appendStringInfoChar(buf, ')');
}
break;
case T_FieldSelect:
{
FieldSelect *fselect = (FieldSelect *) node;
Node *arg = (Node *) fselect->arg;
int fno = fselect->fieldnum;
const char *fieldname;
bool need_parens;
/*
* Parenthesize the argument unless it's an SubscriptingRef or
* another FieldSelect. Note in particular that it would be
* WRONG to not parenthesize a Var argument; simplicity is not
* the issue here, having the right number of names is.
*/
need_parens = !IsA(arg, SubscriptingRef) &&
!IsA(arg, FieldSelect);
if (need_parens)
appendStringInfoChar(buf, '(');
get_rule_expr(arg, context, true);
if (need_parens)
appendStringInfoChar(buf, ')');
/*
* Get and print the field name.
*/
fieldname = get_name_for_var_field((Var *) arg, fno,
0, context);
appendStringInfo(buf, ".%s", quote_identifier(fieldname));
}
break;
case T_FieldStore:
{
FieldStore *fstore = (FieldStore *) node;
bool need_parens;
2004-08-29 07:07:03 +02:00
/*
* There is no good way to represent a FieldStore as real SQL,
* so decompilation of INSERT or UPDATE statements should
* always use processIndirection as part of the
* statement-level syntax. We should only get here when
* EXPLAIN tries to print the targetlist of a plan resulting
* from such a statement. The plan case is even harder than
* ordinary rules would be, because the planner tries to
* collapse multiple assignments to the same field or subfield
* into one FieldStore; so we can see a list of target fields
* not just one, and the arguments could be FieldStores
* themselves. We don't bother to try to print the target
* field names; we just print the source arguments, with a
* ROW() around them if there's more than one. This isn't
* terribly complete, but it's probably good enough for
* EXPLAIN's purposes; especially since anything more would be
* either hopelessly confusing or an even poorer
* representation of what the plan is actually doing.
*/
need_parens = (list_length(fstore->newvals) != 1);
if (need_parens)
appendStringInfoString(buf, "ROW(");
get_rule_expr((Node *) fstore->newvals, context, showimplicit);
if (need_parens)
appendStringInfoChar(buf, ')');
}
break;
case T_RelabelType:
{
RelabelType *relabel = (RelabelType *) node;
Node *arg = (Node *) relabel->arg;
if (relabel->relabelformat == COERCE_IMPLICIT_CAST &&
!showimplicit)
{
/* don't show the implicit cast */
get_rule_expr_paren(arg, context, false, node);
}
else
{
get_coercion_expr(arg, context,
relabel->resulttype,
relabel->resulttypmod,
node);
}
}
break;
case T_CoerceViaIO:
{
CoerceViaIO *iocoerce = (CoerceViaIO *) node;
Node *arg = (Node *) iocoerce->arg;
if (iocoerce->coerceformat == COERCE_IMPLICIT_CAST &&
!showimplicit)
{
/* don't show the implicit cast */
get_rule_expr_paren(arg, context, false, node);
}
else
{
get_coercion_expr(arg, context,
iocoerce->resulttype,
-1,
node);
}
}
break;
case T_ArrayCoerceExpr:
{
ArrayCoerceExpr *acoerce = (ArrayCoerceExpr *) node;
Node *arg = (Node *) acoerce->arg;
if (acoerce->coerceformat == COERCE_IMPLICIT_CAST &&
!showimplicit)
{
/* don't show the implicit cast */
get_rule_expr_paren(arg, context, false, node);
}
else
{
get_coercion_expr(arg, context,
acoerce->resulttype,
acoerce->resulttypmod,
node);
}
}
break;
case T_ConvertRowtypeExpr:
{
ConvertRowtypeExpr *convert = (ConvertRowtypeExpr *) node;
Node *arg = (Node *) convert->arg;
if (convert->convertformat == COERCE_IMPLICIT_CAST &&
!showimplicit)
{
/* don't show the implicit cast */
get_rule_expr_paren(arg, context, false, node);
}
else
{
get_coercion_expr(arg, context,
convert->resulttype, -1,
node);
}
}
break;
case T_CollateExpr:
{
CollateExpr *collate = (CollateExpr *) node;
Node *arg = (Node *) collate->arg;
if (!PRETTY_PAREN(context))
appendStringInfoChar(buf, '(');
get_rule_expr_paren(arg, context, showimplicit, node);
appendStringInfo(buf, " COLLATE %s",
generate_collation_name(collate->collOid));
if (!PRETTY_PAREN(context))
appendStringInfoChar(buf, ')');
}
break;
case T_CaseExpr:
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
{
CaseExpr *caseexpr = (CaseExpr *) node;
ListCell *temp;
appendContextKeyword(context, "CASE",
0, PRETTYINDENT_VAR, 0);
if (caseexpr->arg)
{
appendStringInfoChar(buf, ' ');
get_rule_expr((Node *) caseexpr->arg, context, true);
}
foreach(temp, caseexpr->args)
{
CaseWhen *when = (CaseWhen *) lfirst(temp);
Node *w = (Node *) when->expr;
if (caseexpr->arg)
{
/*
* The parser should have produced WHEN clauses of the
* form "CaseTestExpr = RHS", possibly with an
* implicit coercion inserted above the CaseTestExpr.
* For accurate decompilation of rules it's essential
* that we show just the RHS. However in an
* expression that's been through the optimizer, the
* WHEN clause could be almost anything (since the
* equality operator could have been expanded into an
* inline function). If we don't recognize the form
* of the WHEN clause, just punt and display it as-is.
*/
if (IsA(w, OpExpr))
{
List *args = ((OpExpr *) w)->args;
if (list_length(args) == 2 &&
IsA(strip_implicit_coercions(linitial(args)),
CaseTestExpr))
w = (Node *) lsecond(args);
}
}
if (!PRETTY_INDENT(context))
appendStringInfoChar(buf, ' ');
appendContextKeyword(context, "WHEN ",
0, 0, 0);
get_rule_expr(w, context, false);
appendStringInfoString(buf, " THEN ");
get_rule_expr((Node *) when->result, context, true);
}
if (!PRETTY_INDENT(context))
appendStringInfoChar(buf, ' ');
appendContextKeyword(context, "ELSE ",
0, 0, 0);
get_rule_expr((Node *) caseexpr->defresult, context, true);
if (!PRETTY_INDENT(context))
appendStringInfoChar(buf, ' ');
appendContextKeyword(context, "END",
-PRETTYINDENT_VAR, 0, 0);
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
}
break;
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
case T_CaseTestExpr:
{
/*
* Normally we should never get here, since for expressions
* that can contain this node type we attempt to avoid
* recursing to it. But in an optimized expression we might
* be unable to avoid that (see comments for CaseExpr). If we
* do see one, print it as CASE_TEST_EXPR.
*/
appendStringInfoString(buf, "CASE_TEST_EXPR");
}
break;
case T_ArrayExpr:
{
ArrayExpr *arrayexpr = (ArrayExpr *) node;
appendStringInfoString(buf, "ARRAY[");
get_rule_expr((Node *) arrayexpr->elements, context, true);
appendStringInfoChar(buf, ']');
/*
* If the array isn't empty, we assume its elements are
* coerced to the desired type. If it's empty, though, we
* need an explicit coercion to the array type.
*/
if (arrayexpr->elements == NIL)
appendStringInfo(buf, "::%s",
format_type_with_typemod(arrayexpr->array_typeid, -1));
}
break;
2003-08-04 02:43:34 +02:00
case T_RowExpr:
{
RowExpr *rowexpr = (RowExpr *) node;
TupleDesc tupdesc = NULL;
ListCell *arg;
int i;
char *sep;
/*
* If it's a named type and not RECORD, we may have to skip
* dropped columns and/or claim there are NULLs for added
* columns.
*/
if (rowexpr->row_typeid != RECORDOID)
{
tupdesc = lookup_rowtype_tupdesc(rowexpr->row_typeid, -1);
Assert(list_length(rowexpr->args) <= tupdesc->natts);
}
/*
* SQL99 allows "ROW" to be omitted when there is more than
* one column, but for simplicity we always print it.
*/
appendStringInfoString(buf, "ROW(");
sep = "";
i = 0;
foreach(arg, rowexpr->args)
{
Node *e = (Node *) lfirst(arg);
if (tupdesc == NULL ||
!TupleDescAttr(tupdesc, i)->attisdropped)
{
appendStringInfoString(buf, sep);
/* Whole-row Vars need special treatment here */
get_rule_expr_toplevel(e, context, true);
sep = ", ";
}
i++;
}
if (tupdesc != NULL)
{
while (i < tupdesc->natts)
{
if (!TupleDescAttr(tupdesc, i)->attisdropped)
{
appendStringInfoString(buf, sep);
appendStringInfoString(buf, "NULL");
sep = ", ";
}
i++;
}
ReleaseTupleDesc(tupdesc);
}
appendStringInfoChar(buf, ')');
if (rowexpr->row_format == COERCE_EXPLICIT_CAST)
appendStringInfo(buf, "::%s",
format_type_with_typemod(rowexpr->row_typeid, -1));
}
break;
case T_RowCompareExpr:
{
RowCompareExpr *rcexpr = (RowCompareExpr *) node;
/*
* SQL99 allows "ROW" to be omitted when there is more than
* one column, but for simplicity we always print it. Within
* a ROW expression, whole-row Vars need special treatment, so
* use get_rule_list_toplevel.
*/
appendStringInfoString(buf, "(ROW(");
get_rule_list_toplevel(rcexpr->largs, context, true);
2006-10-04 02:30:14 +02:00
/*
* We assume that the name of the first-column operator will
* do for all the rest too. This is definitely open to
* failure, eg if some but not all operators were renamed
* since the construct was parsed, but there seems no way to
* be perfect.
*/
appendStringInfo(buf, ") %s ROW(",
generate_operator_name(linitial_oid(rcexpr->opnos),
exprType(linitial(rcexpr->largs)),
exprType(linitial(rcexpr->rargs))));
get_rule_list_toplevel(rcexpr->rargs, context, true);
appendStringInfoString(buf, "))");
}
break;
case T_CoalesceExpr:
{
CoalesceExpr *coalesceexpr = (CoalesceExpr *) node;
appendStringInfoString(buf, "COALESCE(");
get_rule_expr((Node *) coalesceexpr->args, context, true);
appendStringInfoChar(buf, ')');
}
break;
2003-08-04 02:43:34 +02:00
case T_MinMaxExpr:
{
MinMaxExpr *minmaxexpr = (MinMaxExpr *) node;
switch (minmaxexpr->op)
{
case IS_GREATEST:
appendStringInfoString(buf, "GREATEST(");
break;
case IS_LEAST:
appendStringInfoString(buf, "LEAST(");
break;
}
get_rule_expr((Node *) minmaxexpr->args, context, true);
appendStringInfoChar(buf, ')');
}
break;
case T_SQLValueFunction:
{
SQLValueFunction *svf = (SQLValueFunction *) node;
/*
* Note: this code knows that typmod for time, timestamp, and
* timestamptz just prints as integer.
*/
switch (svf->op)
{
case SVFOP_CURRENT_DATE:
appendStringInfoString(buf, "CURRENT_DATE");
break;
case SVFOP_CURRENT_TIME:
appendStringInfoString(buf, "CURRENT_TIME");
break;
case SVFOP_CURRENT_TIME_N:
appendStringInfo(buf, "CURRENT_TIME(%d)", svf->typmod);
break;
case SVFOP_CURRENT_TIMESTAMP:
appendStringInfoString(buf, "CURRENT_TIMESTAMP");
break;
case SVFOP_CURRENT_TIMESTAMP_N:
appendStringInfo(buf, "CURRENT_TIMESTAMP(%d)",
svf->typmod);
break;
case SVFOP_LOCALTIME:
appendStringInfoString(buf, "LOCALTIME");
break;
case SVFOP_LOCALTIME_N:
appendStringInfo(buf, "LOCALTIME(%d)", svf->typmod);
break;
case SVFOP_LOCALTIMESTAMP:
appendStringInfoString(buf, "LOCALTIMESTAMP");
break;
case SVFOP_LOCALTIMESTAMP_N:
appendStringInfo(buf, "LOCALTIMESTAMP(%d)",
svf->typmod);
break;
case SVFOP_CURRENT_ROLE:
appendStringInfoString(buf, "CURRENT_ROLE");
break;
case SVFOP_CURRENT_USER:
appendStringInfoString(buf, "CURRENT_USER");
break;
case SVFOP_USER:
appendStringInfoString(buf, "USER");
break;
case SVFOP_SESSION_USER:
appendStringInfoString(buf, "SESSION_USER");
break;
case SVFOP_CURRENT_CATALOG:
appendStringInfoString(buf, "CURRENT_CATALOG");
break;
case SVFOP_CURRENT_SCHEMA:
appendStringInfoString(buf, "CURRENT_SCHEMA");
break;
}
}
break;
case T_XmlExpr:
{
XmlExpr *xexpr = (XmlExpr *) node;
bool needcomma = false;
ListCell *arg;
ListCell *narg;
Const *con;
switch (xexpr->op)
{
case IS_XMLCONCAT:
appendStringInfoString(buf, "XMLCONCAT(");
break;
case IS_XMLELEMENT:
appendStringInfoString(buf, "XMLELEMENT(");
break;
case IS_XMLFOREST:
appendStringInfoString(buf, "XMLFOREST(");
break;
case IS_XMLPARSE:
appendStringInfoString(buf, "XMLPARSE(");
break;
case IS_XMLPI:
appendStringInfoString(buf, "XMLPI(");
break;
case IS_XMLROOT:
appendStringInfoString(buf, "XMLROOT(");
break;
case IS_XMLSERIALIZE:
appendStringInfoString(buf, "XMLSERIALIZE(");
break;
case IS_DOCUMENT:
break;
}
if (xexpr->op == IS_XMLPARSE || xexpr->op == IS_XMLSERIALIZE)
{
if (xexpr->xmloption == XMLOPTION_DOCUMENT)
appendStringInfoString(buf, "DOCUMENT ");
else
appendStringInfoString(buf, "CONTENT ");
}
if (xexpr->name)
{
appendStringInfo(buf, "NAME %s",
quote_identifier(map_xml_name_to_sql_identifier(xexpr->name)));
needcomma = true;
}
if (xexpr->named_args)
{
if (xexpr->op != IS_XMLFOREST)
{
if (needcomma)
appendStringInfoString(buf, ", ");
appendStringInfoString(buf, "XMLATTRIBUTES(");
needcomma = false;
}
forboth(arg, xexpr->named_args, narg, xexpr->arg_names)
{
Node *e = (Node *) lfirst(arg);
char *argname = strVal(lfirst(narg));
if (needcomma)
appendStringInfoString(buf, ", ");
get_rule_expr((Node *) e, context, true);
appendStringInfo(buf, " AS %s",
quote_identifier(map_xml_name_to_sql_identifier(argname)));
needcomma = true;
}
if (xexpr->op != IS_XMLFOREST)
appendStringInfoChar(buf, ')');
}
if (xexpr->args)
{
if (needcomma)
appendStringInfoString(buf, ", ");
switch (xexpr->op)
{
case IS_XMLCONCAT:
case IS_XMLELEMENT:
case IS_XMLFOREST:
case IS_XMLPI:
case IS_XMLSERIALIZE:
/* no extra decoration needed */
get_rule_expr((Node *) xexpr->args, context, true);
break;
case IS_XMLPARSE:
Assert(list_length(xexpr->args) == 2);
get_rule_expr((Node *) linitial(xexpr->args),
context, true);
con = lsecond_node(Const, xexpr->args);
Assert(!con->constisnull);
if (DatumGetBool(con->constvalue))
appendStringInfoString(buf,
" PRESERVE WHITESPACE");
else
appendStringInfoString(buf,
" STRIP WHITESPACE");
break;
case IS_XMLROOT:
Assert(list_length(xexpr->args) == 3);
get_rule_expr((Node *) linitial(xexpr->args),
context, true);
appendStringInfoString(buf, ", VERSION ");
con = (Const *) lsecond(xexpr->args);
if (IsA(con, Const) &&
con->constisnull)
appendStringInfoString(buf, "NO VALUE");
else
get_rule_expr((Node *) con, context, false);
con = lthird_node(Const, xexpr->args);
if (con->constisnull)
/* suppress STANDALONE NO VALUE */ ;
else
{
switch (DatumGetInt32(con->constvalue))
{
case XML_STANDALONE_YES:
appendStringInfoString(buf,
", STANDALONE YES");
break;
case XML_STANDALONE_NO:
appendStringInfoString(buf,
", STANDALONE NO");
break;
case XML_STANDALONE_NO_VALUE:
appendStringInfoString(buf,
", STANDALONE NO VALUE");
break;
default:
break;
}
}
break;
case IS_DOCUMENT:
get_rule_expr_paren((Node *) xexpr->args, context, false, node);
break;
}
}
if (xexpr->op == IS_XMLSERIALIZE)
appendStringInfo(buf, " AS %s",
format_type_with_typemod(xexpr->type,
xexpr->typmod));
if (xexpr->op == IS_DOCUMENT)
appendStringInfoString(buf, " IS DOCUMENT");
else
appendStringInfoChar(buf, ')');
}
break;
case T_NullTest:
{
NullTest *ntest = (NullTest *) node;
if (!PRETTY_PAREN(context))
appendStringInfoChar(buf, '(');
get_rule_expr_paren((Node *) ntest->arg, context, true, node);
Fix assorted fallout from IS [NOT] NULL patch. Commits 4452000f3 et al established semantics for NullTest.argisrow that are a bit different from its initial conception: rather than being merely a cache of whether we've determined the input to have composite type, the flag now has the further meaning that we should apply field-by-field testing as per the standard's definition of IS [NOT] NULL. If argisrow is false and yet the input has composite type, the construct instead has the semantics of IS [NOT] DISTINCT FROM NULL. Update the comments in primnodes.h to clarify this, and fix ruleutils.c and deparse.c to print such cases correctly. In the case of ruleutils.c, this merely results in cosmetic changes in EXPLAIN output, since the case can't currently arise in stored rules. However, it represents a live bug for deparse.c, which would formerly have sent a remote query that had semantics different from the local behavior. (From the user's standpoint, this means that testing a remote nested-composite column for null-ness could have had unexpected recursive behavior much like that fixed in 4452000f3.) In a related but somewhat independent fix, make plancat.c set argisrow to false in all NullTest expressions constructed to represent "attnotnull" constructs. Since attnotnull is actually enforced as a simple null-value check, this is a more accurate representation of the semantics; we were previously overpromising what it meant for composite columns, which might possibly lead to incorrect planner optimizations. (It seems that what the SQL spec expects a NOT NULL constraint to mean is an IS NOT NULL test, so arguably we are violating the spec and should fix attnotnull to do the other thing. If we ever do, this part should get reverted.) Back-patch, same as the previous commit. Discussion: <10682.1469566308@sss.pgh.pa.us>
2016-07-28 22:09:15 +02:00
/*
* For scalar inputs, we prefer to print as IS [NOT] NULL,
* which is shorter and traditional. If it's a rowtype input
* but we're applying a scalar test, must print IS [NOT]
* DISTINCT FROM NULL to be semantically correct.
*/
if (ntest->argisrow ||
!type_is_rowtype(exprType((Node *) ntest->arg)))
{
Fix assorted fallout from IS [NOT] NULL patch. Commits 4452000f3 et al established semantics for NullTest.argisrow that are a bit different from its initial conception: rather than being merely a cache of whether we've determined the input to have composite type, the flag now has the further meaning that we should apply field-by-field testing as per the standard's definition of IS [NOT] NULL. If argisrow is false and yet the input has composite type, the construct instead has the semantics of IS [NOT] DISTINCT FROM NULL. Update the comments in primnodes.h to clarify this, and fix ruleutils.c and deparse.c to print such cases correctly. In the case of ruleutils.c, this merely results in cosmetic changes in EXPLAIN output, since the case can't currently arise in stored rules. However, it represents a live bug for deparse.c, which would formerly have sent a remote query that had semantics different from the local behavior. (From the user's standpoint, this means that testing a remote nested-composite column for null-ness could have had unexpected recursive behavior much like that fixed in 4452000f3.) In a related but somewhat independent fix, make plancat.c set argisrow to false in all NullTest expressions constructed to represent "attnotnull" constructs. Since attnotnull is actually enforced as a simple null-value check, this is a more accurate representation of the semantics; we were previously overpromising what it meant for composite columns, which might possibly lead to incorrect planner optimizations. (It seems that what the SQL spec expects a NOT NULL constraint to mean is an IS NOT NULL test, so arguably we are violating the spec and should fix attnotnull to do the other thing. If we ever do, this part should get reverted.) Back-patch, same as the previous commit. Discussion: <10682.1469566308@sss.pgh.pa.us>
2016-07-28 22:09:15 +02:00
switch (ntest->nulltesttype)
{
case IS_NULL:
appendStringInfoString(buf, " IS NULL");
break;
case IS_NOT_NULL:
appendStringInfoString(buf, " IS NOT NULL");
break;
default:
elog(ERROR, "unrecognized nulltesttype: %d",
(int) ntest->nulltesttype);
}
}
else
{
switch (ntest->nulltesttype)
{
case IS_NULL:
appendStringInfoString(buf, " IS NOT DISTINCT FROM NULL");
break;
case IS_NOT_NULL:
appendStringInfoString(buf, " IS DISTINCT FROM NULL");
break;
default:
elog(ERROR, "unrecognized nulltesttype: %d",
(int) ntest->nulltesttype);
}
}
if (!PRETTY_PAREN(context))
appendStringInfoChar(buf, ')');
}
break;
case T_BooleanTest:
{
BooleanTest *btest = (BooleanTest *) node;
if (!PRETTY_PAREN(context))
appendStringInfoChar(buf, '(');
get_rule_expr_paren((Node *) btest->arg, context, false, node);
switch (btest->booltesttype)
{
case IS_TRUE:
appendStringInfoString(buf, " IS TRUE");
break;
case IS_NOT_TRUE:
appendStringInfoString(buf, " IS NOT TRUE");
break;
case IS_FALSE:
appendStringInfoString(buf, " IS FALSE");
break;
case IS_NOT_FALSE:
appendStringInfoString(buf, " IS NOT FALSE");
break;
case IS_UNKNOWN:
appendStringInfoString(buf, " IS UNKNOWN");
break;
case IS_NOT_UNKNOWN:
appendStringInfoString(buf, " IS NOT UNKNOWN");
break;
default:
elog(ERROR, "unrecognized booltesttype: %d",
(int) btest->booltesttype);
}
if (!PRETTY_PAREN(context))
appendStringInfoChar(buf, ')');
}
break;
case T_CoerceToDomain:
{
CoerceToDomain *ctest = (CoerceToDomain *) node;
Node *arg = (Node *) ctest->arg;
if (ctest->coercionformat == COERCE_IMPLICIT_CAST &&
!showimplicit)
{
/* don't show the implicit cast */
get_rule_expr(arg, context, false);
}
else
{
get_coercion_expr(arg, context,
ctest->resulttype,
ctest->resulttypmod,
node);
}
}
break;
case T_CoerceToDomainValue:
appendStringInfoString(buf, "VALUE");
break;
case T_SetToDefault:
appendStringInfoString(buf, "DEFAULT");
break;
case T_CurrentOfExpr:
{
CurrentOfExpr *cexpr = (CurrentOfExpr *) node;
if (cexpr->cursor_name)
appendStringInfo(buf, "CURRENT OF %s",
quote_identifier(cexpr->cursor_name));
else
appendStringInfo(buf, "CURRENT OF $%d",
cexpr->cursor_param);
}
break;
Code review for NextValueExpr expression node type. Add missing infrastructure for this node type, notably in ruleutils.c where its lack could demonstrably cause EXPLAIN to fail. Add outfuncs/readfuncs support. (outfuncs support is useful today for debugging purposes. The readfuncs support may never be needed, since at present it would only matter for parallel query and NextValueExpr should never appear in a parallelizable query; but it seems like a bad idea to have a primnode type that isn't fully supported here.) Teach planner infrastructure that NextValueExpr is a volatile, parallel-unsafe, non-leaky expression node with cost cpu_operator_cost. Given its limited scope of usage, there *might* be no live bug today from the lack of that knowledge, but it's certainly going to bite us on the rear someday. Teach pg_stat_statements about the new node type, too. While at it, also teach cost_qual_eval() that MinMaxExpr, SQLValueFunction, XmlExpr, and CoerceToDomain should be charged as cpu_operator_cost. Failing to do this for SQLValueFunction was an oversight in my commit 0bb51aa96. The others are longer-standing oversights, but no time like the present to fix them. (In principle, CoerceToDomain could have cost much higher than this, but it doesn't presently seem worth trying to examine the domain's constraints here.) Modify execExprInterp.c to execute NextValueExpr as an out-of-line function; it seems quite unlikely to me that it's worth insisting that it be inlined in all expression eval methods. Besides, providing the out-of-line function doesn't stop anyone from inlining if they want to. Adjust some places where NextValueExpr support had been inserted with the aid of a dartboard rather than keeping it in the same order as elsewhere. Discussion: https://postgr.es/m/23862.1499981661@sss.pgh.pa.us
2017-07-14 21:25:43 +02:00
case T_NextValueExpr:
{
NextValueExpr *nvexpr = (NextValueExpr *) node;
/*
* This isn't exactly nextval(), but that seems close enough
* for EXPLAIN's purposes.
*/
appendStringInfoString(buf, "nextval(");
simple_quote_literal(buf,
generate_relation_name(nvexpr->seqid,
NIL));
appendStringInfoChar(buf, ')');
}
break;
case T_InferenceElem:
{
InferenceElem *iexpr = (InferenceElem *) node;
bool save_varprefix;
bool need_parens;
/*
* InferenceElem can only refer to target relation, so a
* prefix is not useful, and indeed would cause parse errors.
*/
save_varprefix = context->varprefix;
context->varprefix = false;
/*
* Parenthesize the element unless it's a simple Var or a bare
* function call. Follows pg_get_indexdef_worker().
*/
need_parens = !IsA(iexpr->expr, Var);
if (IsA(iexpr->expr, FuncExpr) &&
((FuncExpr *) iexpr->expr)->funcformat ==
COERCE_EXPLICIT_CALL)
need_parens = false;
if (need_parens)
appendStringInfoChar(buf, '(');
get_rule_expr((Node *) iexpr->expr,
context, false);
if (need_parens)
appendStringInfoChar(buf, ')');
context->varprefix = save_varprefix;
if (iexpr->infercollid)
appendStringInfo(buf, " COLLATE %s",
generate_collation_name(iexpr->infercollid));
/* Add the operator class name, if not default */
if (iexpr->inferopclass)
{
Oid inferopclass = iexpr->inferopclass;
Oid inferopcinputtype = get_opclass_input_type(iexpr->inferopclass);
get_opclass_name(inferopclass, inferopcinputtype, buf);
}
}
break;
Implement table partitioning. Table partitioning is like table inheritance and reuses much of the existing infrastructure, but there are some important differences. The parent is called a partitioned table and is always empty; it may not have indexes or non-inherited constraints, since those make no sense for a relation with no data of its own. The children are called partitions and contain all of the actual data. Each partition has an implicit partitioning constraint. Multiple inheritance is not allowed, and partitioning and inheritance can't be mixed. Partitions can't have extra columns and may not allow nulls unless the parent does. Tuples inserted into the parent are automatically routed to the correct partition, so tuple-routing ON INSERT triggers are not needed. Tuple routing isn't yet supported for partitions which are foreign tables, and it doesn't handle updates that cross partition boundaries. Currently, tables can be range-partitioned or list-partitioned. List partitioning is limited to a single column, but range partitioning can involve multiple columns. A partitioning "column" can be an expression. Because table partitioning is less general than table inheritance, it is hoped that it will be easier to reason about properties of partitions, and therefore that this will serve as a better foundation for a variety of possible optimizations, including query planner optimizations. The tuple routing based which this patch does based on the implicit partitioning constraints is an example of this, but it seems likely that many other useful optimizations are also possible. Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat, Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova, Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
case T_PartitionBoundSpec:
{
PartitionBoundSpec *spec = (PartitionBoundSpec *) node;
ListCell *cell;
char *sep;
Implement table partitioning. Table partitioning is like table inheritance and reuses much of the existing infrastructure, but there are some important differences. The parent is called a partitioned table and is always empty; it may not have indexes or non-inherited constraints, since those make no sense for a relation with no data of its own. The children are called partitions and contain all of the actual data. Each partition has an implicit partitioning constraint. Multiple inheritance is not allowed, and partitioning and inheritance can't be mixed. Partitions can't have extra columns and may not allow nulls unless the parent does. Tuples inserted into the parent are automatically routed to the correct partition, so tuple-routing ON INSERT triggers are not needed. Tuple routing isn't yet supported for partitions which are foreign tables, and it doesn't handle updates that cross partition boundaries. Currently, tables can be range-partitioned or list-partitioned. List partitioning is limited to a single column, but range partitioning can involve multiple columns. A partitioning "column" can be an expression. Because table partitioning is less general than table inheritance, it is hoped that it will be easier to reason about properties of partitions, and therefore that this will serve as a better foundation for a variety of possible optimizations, including query planner optimizations. The tuple routing based which this patch does based on the implicit partitioning constraints is an example of this, but it seems likely that many other useful optimizations are also possible. Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat, Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova, Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
if (spec->is_default)
{
appendStringInfoString(buf, "DEFAULT");
break;
}
Implement table partitioning. Table partitioning is like table inheritance and reuses much of the existing infrastructure, but there are some important differences. The parent is called a partitioned table and is always empty; it may not have indexes or non-inherited constraints, since those make no sense for a relation with no data of its own. The children are called partitions and contain all of the actual data. Each partition has an implicit partitioning constraint. Multiple inheritance is not allowed, and partitioning and inheritance can't be mixed. Partitions can't have extra columns and may not allow nulls unless the parent does. Tuples inserted into the parent are automatically routed to the correct partition, so tuple-routing ON INSERT triggers are not needed. Tuple routing isn't yet supported for partitions which are foreign tables, and it doesn't handle updates that cross partition boundaries. Currently, tables can be range-partitioned or list-partitioned. List partitioning is limited to a single column, but range partitioning can involve multiple columns. A partitioning "column" can be an expression. Because table partitioning is less general than table inheritance, it is hoped that it will be easier to reason about properties of partitions, and therefore that this will serve as a better foundation for a variety of possible optimizations, including query planner optimizations. The tuple routing based which this patch does based on the implicit partitioning constraints is an example of this, but it seems likely that many other useful optimizations are also possible. Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat, Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova, Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
switch (spec->strategy)
{
case PARTITION_STRATEGY_HASH:
Assert(spec->modulus > 0 && spec->remainder >= 0);
Assert(spec->modulus > spec->remainder);
appendStringInfoString(buf, "FOR VALUES");
appendStringInfo(buf, " WITH (modulus %d, remainder %d)",
spec->modulus, spec->remainder);
break;
Implement table partitioning. Table partitioning is like table inheritance and reuses much of the existing infrastructure, but there are some important differences. The parent is called a partitioned table and is always empty; it may not have indexes or non-inherited constraints, since those make no sense for a relation with no data of its own. The children are called partitions and contain all of the actual data. Each partition has an implicit partitioning constraint. Multiple inheritance is not allowed, and partitioning and inheritance can't be mixed. Partitions can't have extra columns and may not allow nulls unless the parent does. Tuples inserted into the parent are automatically routed to the correct partition, so tuple-routing ON INSERT triggers are not needed. Tuple routing isn't yet supported for partitions which are foreign tables, and it doesn't handle updates that cross partition boundaries. Currently, tables can be range-partitioned or list-partitioned. List partitioning is limited to a single column, but range partitioning can involve multiple columns. A partitioning "column" can be an expression. Because table partitioning is less general than table inheritance, it is hoped that it will be easier to reason about properties of partitions, and therefore that this will serve as a better foundation for a variety of possible optimizations, including query planner optimizations. The tuple routing based which this patch does based on the implicit partitioning constraints is an example of this, but it seems likely that many other useful optimizations are also possible. Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat, Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova, Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
case PARTITION_STRATEGY_LIST:
Assert(spec->listdatums != NIL);
Code review focused on new node types added by partitioning support. Fix failure to check that we got a plain Const from const-simplification of a coercion request. This is the cause of bug #14666 from Tian Bing: there is an int4 to money cast, but it's only stable not immutable (because of dependence on lc_monetary), resulting in a FuncExpr that the code was miserably unequipped to deal with, or indeed even to notice that it was failing to deal with. Add test cases around this coercion behavior. In view of the above, sprinkle the code liberally with castNode() macros, in hope of catching the next such bug a bit sooner. Also, change some functions that were randomly declared to take Node* to take more specific pointer types. And change some struct fields that were declared Node* but could be given more specific types, allowing removal of assorted explicit casts. Place PARTITION_MAX_KEYS check a bit closer to the code it's protecting. Likewise check only-one-key-for-list-partitioning restriction in a less random place. Avoid not-per-project-style usages like !strcmp(...). Fix assorted failures to avoid scribbling on the input of parse transformation. I'm not sure how necessary this is, but it's entirely silly for these functions to be expending cycles to avoid that and not getting it right. Add guards against partitioning on system columns. Put backend/nodes/ support code into an order that matches handling of these node types elsewhere. Annotate the fact that somebody added location fields to PartitionBoundSpec and PartitionRangeDatum but forgot to handle them in outfuncs.c/readfuncs.c. This is fairly harmless for production purposes (since readfuncs.c would just substitute -1 anyway) but it's still bogus. It's not worth forcing a post-beta1 initdb just to fix this, but if we have another reason to force initdb before 10.0, we should go back and clean this up. Contrariwise, somebody added location fields to PartitionElem and PartitionSpec but forgot to teach exprLocation() about them. Consolidate duplicative code in transformPartitionBound(). Improve a couple of error messages. Improve assorted commentary. Re-pgindent the files touched by this patch; this affects a few comment blocks that must have been added quite recently. Report: https://postgr.es/m/20170524024550.29935.14396@wrigleys.postgresql.org
2017-05-29 05:20:28 +02:00
appendStringInfoString(buf, "FOR VALUES IN (");
Implement table partitioning. Table partitioning is like table inheritance and reuses much of the existing infrastructure, but there are some important differences. The parent is called a partitioned table and is always empty; it may not have indexes or non-inherited constraints, since those make no sense for a relation with no data of its own. The children are called partitions and contain all of the actual data. Each partition has an implicit partitioning constraint. Multiple inheritance is not allowed, and partitioning and inheritance can't be mixed. Partitions can't have extra columns and may not allow nulls unless the parent does. Tuples inserted into the parent are automatically routed to the correct partition, so tuple-routing ON INSERT triggers are not needed. Tuple routing isn't yet supported for partitions which are foreign tables, and it doesn't handle updates that cross partition boundaries. Currently, tables can be range-partitioned or list-partitioned. List partitioning is limited to a single column, but range partitioning can involve multiple columns. A partitioning "column" can be an expression. Because table partitioning is less general than table inheritance, it is hoped that it will be easier to reason about properties of partitions, and therefore that this will serve as a better foundation for a variety of possible optimizations, including query planner optimizations. The tuple routing based which this patch does based on the implicit partitioning constraints is an example of this, but it seems likely that many other useful optimizations are also possible. Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat, Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova, Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
sep = "";
foreach(cell, spec->listdatums)
Implement table partitioning. Table partitioning is like table inheritance and reuses much of the existing infrastructure, but there are some important differences. The parent is called a partitioned table and is always empty; it may not have indexes or non-inherited constraints, since those make no sense for a relation with no data of its own. The children are called partitions and contain all of the actual data. Each partition has an implicit partitioning constraint. Multiple inheritance is not allowed, and partitioning and inheritance can't be mixed. Partitions can't have extra columns and may not allow nulls unless the parent does. Tuples inserted into the parent are automatically routed to the correct partition, so tuple-routing ON INSERT triggers are not needed. Tuple routing isn't yet supported for partitions which are foreign tables, and it doesn't handle updates that cross partition boundaries. Currently, tables can be range-partitioned or list-partitioned. List partitioning is limited to a single column, but range partitioning can involve multiple columns. A partitioning "column" can be an expression. Because table partitioning is less general than table inheritance, it is hoped that it will be easier to reason about properties of partitions, and therefore that this will serve as a better foundation for a variety of possible optimizations, including query planner optimizations. The tuple routing based which this patch does based on the implicit partitioning constraints is an example of this, but it seems likely that many other useful optimizations are also possible. Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat, Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova, Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
{
Const *val = lfirst_node(Const, cell);
Implement table partitioning. Table partitioning is like table inheritance and reuses much of the existing infrastructure, but there are some important differences. The parent is called a partitioned table and is always empty; it may not have indexes or non-inherited constraints, since those make no sense for a relation with no data of its own. The children are called partitions and contain all of the actual data. Each partition has an implicit partitioning constraint. Multiple inheritance is not allowed, and partitioning and inheritance can't be mixed. Partitions can't have extra columns and may not allow nulls unless the parent does. Tuples inserted into the parent are automatically routed to the correct partition, so tuple-routing ON INSERT triggers are not needed. Tuple routing isn't yet supported for partitions which are foreign tables, and it doesn't handle updates that cross partition boundaries. Currently, tables can be range-partitioned or list-partitioned. List partitioning is limited to a single column, but range partitioning can involve multiple columns. A partitioning "column" can be an expression. Because table partitioning is less general than table inheritance, it is hoped that it will be easier to reason about properties of partitions, and therefore that this will serve as a better foundation for a variety of possible optimizations, including query planner optimizations. The tuple routing based which this patch does based on the implicit partitioning constraints is an example of this, but it seems likely that many other useful optimizations are also possible. Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat, Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova, Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
appendStringInfoString(buf, sep);
get_const_expr(val, context, -1);
sep = ", ";
}
appendStringInfoChar(buf, ')');
Implement table partitioning. Table partitioning is like table inheritance and reuses much of the existing infrastructure, but there are some important differences. The parent is called a partitioned table and is always empty; it may not have indexes or non-inherited constraints, since those make no sense for a relation with no data of its own. The children are called partitions and contain all of the actual data. Each partition has an implicit partitioning constraint. Multiple inheritance is not allowed, and partitioning and inheritance can't be mixed. Partitions can't have extra columns and may not allow nulls unless the parent does. Tuples inserted into the parent are automatically routed to the correct partition, so tuple-routing ON INSERT triggers are not needed. Tuple routing isn't yet supported for partitions which are foreign tables, and it doesn't handle updates that cross partition boundaries. Currently, tables can be range-partitioned or list-partitioned. List partitioning is limited to a single column, but range partitioning can involve multiple columns. A partitioning "column" can be an expression. Because table partitioning is less general than table inheritance, it is hoped that it will be easier to reason about properties of partitions, and therefore that this will serve as a better foundation for a variety of possible optimizations, including query planner optimizations. The tuple routing based which this patch does based on the implicit partitioning constraints is an example of this, but it seems likely that many other useful optimizations are also possible. Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat, Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova, Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
break;
case PARTITION_STRATEGY_RANGE:
Assert(spec->lowerdatums != NIL &&
spec->upperdatums != NIL &&
list_length(spec->lowerdatums) ==
list_length(spec->upperdatums));
appendStringInfo(buf, "FOR VALUES FROM %s TO %s",
get_range_partbound_string(spec->lowerdatums),
get_range_partbound_string(spec->upperdatums));
Implement table partitioning. Table partitioning is like table inheritance and reuses much of the existing infrastructure, but there are some important differences. The parent is called a partitioned table and is always empty; it may not have indexes or non-inherited constraints, since those make no sense for a relation with no data of its own. The children are called partitions and contain all of the actual data. Each partition has an implicit partitioning constraint. Multiple inheritance is not allowed, and partitioning and inheritance can't be mixed. Partitions can't have extra columns and may not allow nulls unless the parent does. Tuples inserted into the parent are automatically routed to the correct partition, so tuple-routing ON INSERT triggers are not needed. Tuple routing isn't yet supported for partitions which are foreign tables, and it doesn't handle updates that cross partition boundaries. Currently, tables can be range-partitioned or list-partitioned. List partitioning is limited to a single column, but range partitioning can involve multiple columns. A partitioning "column" can be an expression. Because table partitioning is less general than table inheritance, it is hoped that it will be easier to reason about properties of partitions, and therefore that this will serve as a better foundation for a variety of possible optimizations, including query planner optimizations. The tuple routing based which this patch does based on the implicit partitioning constraints is an example of this, but it seems likely that many other useful optimizations are also possible. Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat, Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova, Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
break;
default:
elog(ERROR, "unrecognized partition strategy: %d",
(int) spec->strategy);
break;
}
}
break;
case T_JsonValueExpr:
{
JsonValueExpr *jve = (JsonValueExpr *) node;
get_rule_expr((Node *) jve->raw_expr, context, false);
get_json_format(jve->format, context->buf);
}
break;
case T_JsonConstructorExpr:
get_json_constructor((JsonConstructorExpr *) node, context, false);
break;
case T_JsonIsPredicate:
{
JsonIsPredicate *pred = (JsonIsPredicate *) node;
if (!PRETTY_PAREN(context))
appendStringInfoChar(context->buf, '(');
get_rule_expr_paren(pred->expr, context, true, node);
appendStringInfoString(context->buf, " IS JSON");
/* TODO: handle FORMAT clause */
switch (pred->item_type)
{
case JS_TYPE_SCALAR:
appendStringInfoString(context->buf, " SCALAR");
break;
case JS_TYPE_ARRAY:
appendStringInfoString(context->buf, " ARRAY");
break;
case JS_TYPE_OBJECT:
appendStringInfoString(context->buf, " OBJECT");
break;
default:
break;
}
if (pred->unique_keys)
appendStringInfoString(context->buf, " WITH UNIQUE KEYS");
if (!PRETTY_PAREN(context))
appendStringInfoChar(context->buf, ')');
}
break;
Add SQL/JSON query functions This introduces the following SQL/JSON functions for querying JSON data using jsonpath expressions: JSON_EXISTS(), which can be used to apply a jsonpath expression to a JSON value to check if it yields any values. JSON_QUERY(), which can be used to to apply a jsonpath expression to a JSON value to get a JSON object, an array, or a string. There are various options to control whether multi-value result uses array wrappers and whether the singleton scalar strings are quoted or not. JSON_VALUE(), which can be used to apply a jsonpath expression to a JSON value to return a single scalar value, producing an error if it multiple values are matched. Both JSON_VALUE() and JSON_QUERY() functions have options for handling EMPTY and ERROR conditions, which can be used to specify the behavior when no values are matched and when an error occurs during jsonpath evaluation, respectively. Author: Nikita Glukhov <n.gluhov@postgrespro.ru> Author: Teodor Sigaev <teodor@sigaev.ru> Author: Oleg Bartunov <obartunov@gmail.com> Author: Alexander Korotkov <aekorotkov@gmail.com> Author: Andrew Dunstan <andrew@dunslane.net> Author: Amit Langote <amitlangote09@gmail.com> Author: Peter Eisentraut <peter@eisentraut.org> Author: Jian He <jian.universality@gmail.com> Reviewers have included (in no particular order): Andres Freund, Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu, Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby, Álvaro Herrera, Jian He, Anton A. Melnikov, Nikita Malakhov, Peter Eisentraut, Tomas Vondra Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org Discussion: https://postgr.es/m/CA+HiwqHROpf9e644D8BRqYvaAPmgBZVup-xKMDPk-nd4EpgzHw@mail.gmail.com Discussion: https://postgr.es/m/CA+HiwqE4XTdfb1nW=Ojoy_tQSRhYt-q_kb6i5d4xcKyrLC1Nbg@mail.gmail.com
2024-03-21 09:06:27 +01:00
case T_JsonExpr:
{
JsonExpr *jexpr = (JsonExpr *) node;
switch (jexpr->op)
{
case JSON_EXISTS_OP:
appendStringInfoString(buf, "JSON_EXISTS(");
break;
case JSON_QUERY_OP:
appendStringInfoString(buf, "JSON_QUERY(");
break;
case JSON_VALUE_OP:
appendStringInfoString(buf, "JSON_VALUE(");
break;
default:
elog(ERROR, "unrecognized JsonExpr op: %d",
(int) jexpr->op);
}
get_rule_expr(jexpr->formatted_expr, context, showimplicit);
appendStringInfoString(buf, ", ");
get_json_path_spec(jexpr->path_spec, context, showimplicit);
if (jexpr->passing_values)
{
ListCell *lc1,
*lc2;
bool needcomma = false;
appendStringInfoString(buf, " PASSING ");
forboth(lc1, jexpr->passing_names,
lc2, jexpr->passing_values)
{
if (needcomma)
appendStringInfoString(buf, ", ");
needcomma = true;
get_rule_expr((Node *) lfirst(lc2), context, showimplicit);
appendStringInfo(buf, " AS %s",
((String *) lfirst_node(String, lc1))->sval);
}
}
if (jexpr->op != JSON_EXISTS_OP ||
jexpr->returning->typid != BOOLOID)
get_json_returning(jexpr->returning, context->buf,
jexpr->op == JSON_QUERY_OP);
get_json_expr_options(jexpr, context,
jexpr->op != JSON_EXISTS_OP ?
JSON_BEHAVIOR_NULL :
JSON_BEHAVIOR_FALSE);
appendStringInfoString(buf, ")");
}
break;
case T_List:
{
char *sep;
ListCell *l;
sep = "";
foreach(l, (List *) node)
{
appendStringInfoString(buf, sep);
get_rule_expr((Node *) lfirst(l), context, showimplicit);
sep = ", ";
}
}
break;
case T_TableFunc:
get_tablefunc((TableFunc *) node, context, showimplicit);
break;
default:
elog(ERROR, "unrecognized node type: %d", (int) nodeTag(node));
break;
}
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
}
/*
* get_rule_expr_toplevel - Parse back a toplevel expression
*
* Same as get_rule_expr(), except that if the expr is just a Var, we pass
* istoplevel = true not false to get_variable(). This causes whole-row Vars
* to get printed with decoration that will prevent expansion of "*".
* We need to use this in contexts such as ROW() and VALUES(), where the
* parser would expand "foo.*" appearing at top level. (In principle we'd
* use this in get_target_list() too, but that has additional worries about
* whether to print AS, so it needs to invoke get_variable() directly anyway.)
*/
static void
get_rule_expr_toplevel(Node *node, deparse_context *context,
bool showimplicit)
{
if (node && IsA(node, Var))
(void) get_variable((Var *) node, 0, true, context);
else
get_rule_expr(node, context, showimplicit);
}
/*
* get_rule_list_toplevel - Parse back a list of toplevel expressions
*
* Apply get_rule_expr_toplevel() to each element of a List.
*
* This adds commas between the expressions, but caller is responsible
* for printing surrounding decoration.
*/
static void
get_rule_list_toplevel(List *lst, deparse_context *context,
bool showimplicit)
{
const char *sep;
ListCell *lc;
sep = "";
foreach(lc, lst)
{
Node *e = (Node *) lfirst(lc);
appendStringInfoString(context->buf, sep);
get_rule_expr_toplevel(e, context, showimplicit);
sep = ", ";
}
}
/*
* get_rule_expr_funccall - Parse back a function-call expression
*
* Same as get_rule_expr(), except that we guarantee that the output will
* look like a function call, or like one of the things the grammar treats as
* equivalent to a function call (see the func_expr_windowless production).
* This is needed in places where the grammar uses func_expr_windowless and
* you can't substitute a parenthesized a_expr. If what we have isn't going
* to look like a function call, wrap it in a dummy CAST() expression, which
* will satisfy the grammar --- and, indeed, is likely what the user wrote to
* produce such a thing.
*/
static void
get_rule_expr_funccall(Node *node, deparse_context *context,
bool showimplicit)
{
if (looks_like_function(node))
get_rule_expr(node, context, showimplicit);
else
{
StringInfo buf = context->buf;
appendStringInfoString(buf, "CAST(");
/* no point in showing any top-level implicit cast */
get_rule_expr(node, context, false);
appendStringInfo(buf, " AS %s)",
format_type_with_typemod(exprType(node),
exprTypmod(node)));
}
}
/*
* Helper function to identify node types that satisfy func_expr_windowless.
* If in doubt, "false" is always a safe answer.
*/
static bool
looks_like_function(Node *node)
{
if (node == NULL)
return false; /* probably shouldn't happen */
switch (nodeTag(node))
{
case T_FuncExpr:
/* OK, unless it's going to deparse as a cast */
Improve our ability to regurgitate SQL-syntax function calls. The SQL spec calls out nonstandard syntax for certain function calls, for example substring() with numeric position info is supposed to be spelled "SUBSTRING(string FROM start FOR count)". We accept many of these things, but up to now would not print them in the same format, instead simplifying down to "substring"(string, start, count). That's long annoyed me because it creates an interoperability problem: we're gratuitously injecting Postgres-specific syntax into what might otherwise be a perfectly spec-compliant view definition. However, the real reason for addressing it right now is to support a planned change in the semantics of EXTRACT() a/k/a date_part(). When we switch that to returning numeric, we'll have the parser translate EXTRACT() to some new function name (might as well be "extract" if you ask me) and then teach ruleutils.c to reverse-list that per SQL spec. In this way existing calls to date_part() will continue to have the old semantics. To implement this, invent a new CoercionForm value COERCE_SQL_SYNTAX, and make the parser insert that rather than COERCE_EXPLICIT_CALL when the input has SQL-spec decoration. (But if the input has the form of a plain function call, continue to mark it COERCE_EXPLICIT_CALL, even if it's calling one of these functions.) Then ruleutils.c recognizes COERCE_SQL_SYNTAX as a cue to emit SQL call syntax. It can know which decoration to emit using hard-wired knowledge about the functions that could be called this way. (While this solution isn't extensible without manual additions, neither is the grammar, so this doesn't seem unmaintainable.) Notice that this solution will reverse-list a function call with SQL decoration only if it was entered that way; so dump-and-reload will not by itself produce any changes in the appearance of views. This requires adding a CoercionForm field to struct FuncCall. (I couldn't resist the temptation to rearrange that struct's field order a tad while I was at it.) FuncCall doesn't appear in stored rules, so that change isn't a reason for a catversion bump, but I did one anyway because the new enum value for CoercionForm fields could confuse old backend code. Possible future work: * Perhaps CoercionForm should now be renamed to DisplayForm, or something like that, to reflect its more general meaning. This'd require touching a couple hundred places, so it's not clear it's worth the code churn. * The SQLValueFunction node type, which was invented partly for the same goal of improving SQL-compatibility of view output, could perhaps be replaced with regular function calls marked with COERCE_SQL_SYNTAX. It's unclear if this would be a net code savings, however. Discussion: https://postgr.es/m/42b73d2d-da12-ba9f-570a-420e0cce19d9@phystech.edu
2020-11-04 18:34:50 +01:00
return (((FuncExpr *) node)->funcformat == COERCE_EXPLICIT_CALL ||
((FuncExpr *) node)->funcformat == COERCE_SQL_SYNTAX);
case T_NullIfExpr:
case T_CoalesceExpr:
case T_MinMaxExpr:
case T_SQLValueFunction:
case T_XmlExpr:
Add SQL/JSON query functions This introduces the following SQL/JSON functions for querying JSON data using jsonpath expressions: JSON_EXISTS(), which can be used to apply a jsonpath expression to a JSON value to check if it yields any values. JSON_QUERY(), which can be used to to apply a jsonpath expression to a JSON value to get a JSON object, an array, or a string. There are various options to control whether multi-value result uses array wrappers and whether the singleton scalar strings are quoted or not. JSON_VALUE(), which can be used to apply a jsonpath expression to a JSON value to return a single scalar value, producing an error if it multiple values are matched. Both JSON_VALUE() and JSON_QUERY() functions have options for handling EMPTY and ERROR conditions, which can be used to specify the behavior when no values are matched and when an error occurs during jsonpath evaluation, respectively. Author: Nikita Glukhov <n.gluhov@postgrespro.ru> Author: Teodor Sigaev <teodor@sigaev.ru> Author: Oleg Bartunov <obartunov@gmail.com> Author: Alexander Korotkov <aekorotkov@gmail.com> Author: Andrew Dunstan <andrew@dunslane.net> Author: Amit Langote <amitlangote09@gmail.com> Author: Peter Eisentraut <peter@eisentraut.org> Author: Jian He <jian.universality@gmail.com> Reviewers have included (in no particular order): Andres Freund, Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu, Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby, Álvaro Herrera, Jian He, Anton A. Melnikov, Nikita Malakhov, Peter Eisentraut, Tomas Vondra Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org Discussion: https://postgr.es/m/CA+HiwqHROpf9e644D8BRqYvaAPmgBZVup-xKMDPk-nd4EpgzHw@mail.gmail.com Discussion: https://postgr.es/m/CA+HiwqE4XTdfb1nW=Ojoy_tQSRhYt-q_kb6i5d4xcKyrLC1Nbg@mail.gmail.com
2024-03-21 09:06:27 +01:00
case T_JsonExpr:
/* these are all accepted by func_expr_common_subexpr */
return true;
default:
break;
}
return false;
}
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
/*
* get_oper_expr - Parse back an OpExpr node
*/
static void
get_oper_expr(OpExpr *expr, deparse_context *context)
{
StringInfo buf = context->buf;
Oid opno = expr->opno;
List *args = expr->args;
if (!PRETTY_PAREN(context))
appendStringInfoChar(buf, '(');
if (list_length(args) == 2)
{
/* binary operator */
Node *arg1 = (Node *) linitial(args);
Node *arg2 = (Node *) lsecond(args);
2003-08-04 02:43:34 +02:00
get_rule_expr_paren(arg1, context, true, (Node *) expr);
appendStringInfo(buf, " %s ",
generate_operator_name(opno,
exprType(arg1),
exprType(arg2)));
get_rule_expr_paren(arg2, context, true, (Node *) expr);
}
else
{
/* prefix operator */
Node *arg = (Node *) linitial(args);
appendStringInfo(buf, "%s ",
generate_operator_name(opno,
InvalidOid,
exprType(arg)));
get_rule_expr_paren(arg, context, true, (Node *) expr);
}
if (!PRETTY_PAREN(context))
appendStringInfoChar(buf, ')');
}
/*
* get_func_expr - Parse back a FuncExpr node
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
*/
static void
get_func_expr(FuncExpr *expr, deparse_context *context,
bool showimplicit)
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
{
StringInfo buf = context->buf;
Oid funcoid = expr->funcid;
Oid argtypes[FUNC_MAX_ARGS];
int nargs;
List *argnames;
bool use_variadic;
ListCell *l;
/*
* If the function call came from an implicit coercion, then just show the
* first argument --- unless caller wants to see implicit coercions.
*/
if (expr->funcformat == COERCE_IMPLICIT_CAST && !showimplicit)
{
get_rule_expr_paren((Node *) linitial(expr->args), context,
false, (Node *) expr);
return;
}
/*
* If the function call came from a cast, then show the first argument
* plus an explicit cast operation.
*/
if (expr->funcformat == COERCE_EXPLICIT_CAST ||
expr->funcformat == COERCE_IMPLICIT_CAST)
{
Node *arg = linitial(expr->args);
Oid rettype = expr->funcresulttype;
int32 coercedTypmod;
/* Get the typmod if this is a length-coercion function */
(void) exprIsLengthCoercion((Node *) expr, &coercedTypmod);
get_coercion_expr(arg, context,
rettype, coercedTypmod,
(Node *) expr);
return;
}
Improve our ability to regurgitate SQL-syntax function calls. The SQL spec calls out nonstandard syntax for certain function calls, for example substring() with numeric position info is supposed to be spelled "SUBSTRING(string FROM start FOR count)". We accept many of these things, but up to now would not print them in the same format, instead simplifying down to "substring"(string, start, count). That's long annoyed me because it creates an interoperability problem: we're gratuitously injecting Postgres-specific syntax into what might otherwise be a perfectly spec-compliant view definition. However, the real reason for addressing it right now is to support a planned change in the semantics of EXTRACT() a/k/a date_part(). When we switch that to returning numeric, we'll have the parser translate EXTRACT() to some new function name (might as well be "extract" if you ask me) and then teach ruleutils.c to reverse-list that per SQL spec. In this way existing calls to date_part() will continue to have the old semantics. To implement this, invent a new CoercionForm value COERCE_SQL_SYNTAX, and make the parser insert that rather than COERCE_EXPLICIT_CALL when the input has SQL-spec decoration. (But if the input has the form of a plain function call, continue to mark it COERCE_EXPLICIT_CALL, even if it's calling one of these functions.) Then ruleutils.c recognizes COERCE_SQL_SYNTAX as a cue to emit SQL call syntax. It can know which decoration to emit using hard-wired knowledge about the functions that could be called this way. (While this solution isn't extensible without manual additions, neither is the grammar, so this doesn't seem unmaintainable.) Notice that this solution will reverse-list a function call with SQL decoration only if it was entered that way; so dump-and-reload will not by itself produce any changes in the appearance of views. This requires adding a CoercionForm field to struct FuncCall. (I couldn't resist the temptation to rearrange that struct's field order a tad while I was at it.) FuncCall doesn't appear in stored rules, so that change isn't a reason for a catversion bump, but I did one anyway because the new enum value for CoercionForm fields could confuse old backend code. Possible future work: * Perhaps CoercionForm should now be renamed to DisplayForm, or something like that, to reflect its more general meaning. This'd require touching a couple hundred places, so it's not clear it's worth the code churn. * The SQLValueFunction node type, which was invented partly for the same goal of improving SQL-compatibility of view output, could perhaps be replaced with regular function calls marked with COERCE_SQL_SYNTAX. It's unclear if this would be a net code savings, however. Discussion: https://postgr.es/m/42b73d2d-da12-ba9f-570a-420e0cce19d9@phystech.edu
2020-11-04 18:34:50 +01:00
/*
* If the function was called using one of the SQL spec's random special
* syntaxes, try to reproduce that. If we don't recognize the function,
* fall through.
*/
if (expr->funcformat == COERCE_SQL_SYNTAX)
{
if (get_func_sql_syntax(expr, context))
return;
}
/*
* Normal function: display as proname(args). First we need to extract
* the argument datatypes.
*/
if (list_length(expr->args) > FUNC_MAX_ARGS)
ereport(ERROR,
(errcode(ERRCODE_TOO_MANY_ARGUMENTS),
errmsg("too many arguments")));
nargs = 0;
argnames = NIL;
foreach(l, expr->args)
{
Node *arg = (Node *) lfirst(l);
if (IsA(arg, NamedArgExpr))
argnames = lappend(argnames, ((NamedArgExpr *) arg)->name);
argtypes[nargs] = exprType(arg);
nargs++;
}
2002-09-04 22:31:48 +02:00
appendStringInfo(buf, "%s(",
generate_function_name(funcoid, nargs,
argnames, argtypes,
expr->funcvariadic,
Support GROUPING SETS, CUBE and ROLLUP. This SQL standard functionality allows to aggregate data by different GROUP BY clauses at once. Each grouping set returns rows with columns grouped by in other sets set to NULL. This could previously be achieved by doing each grouping as a separate query, conjoined by UNION ALLs. Besides being considerably more concise, grouping sets will in many cases be faster, requiring only one scan over the underlying data. The current implementation of grouping sets only supports using sorting for input. Individual sets that share a sort order are computed in one pass. If there are sets that don't share a sort order, additional sort & aggregation steps are performed. These additional passes are sourced by the previous sort step; thus avoiding repeated scans of the source data. The code is structured in a way that adding support for purely using hash aggregation or a mix of hashing and sorting is possible. Sorting was chosen to be supported first, as it is the most generic method of implementation. Instead of, as in an earlier versions of the patch, representing the chain of sort and aggregation steps as full blown planner and executor nodes, all but the first sort are performed inside the aggregation node itself. This avoids the need to do some unusual gymnastics to handle having to return aggregated and non-aggregated tuples from underlying nodes, as well as having to shut down underlying nodes early to limit memory usage. The optimizer still builds Sort/Agg node to describe each phase, but they're not part of the plan tree, but instead additional data for the aggregation node. They're a convenient and preexisting way to describe aggregation and sorting. The first (and possibly only) sort step is still performed as a separate execution step. That retains similarity with existing group by plans, makes rescans fairly simple, avoids very deep plans (leading to slow explains) and easily allows to avoid the sorting step if the underlying data is sorted by other means. A somewhat ugly side of this patch is having to deal with a grammar ambiguity between the new CUBE keyword and the cube extension/functions named cube (and rollup). To avoid breaking existing deployments of the cube extension it has not been renamed, neither has cube been made a reserved keyword. Instead precedence hacking is used to make GROUP BY cube(..) refer to the CUBE grouping sets feature, and not the function cube(). To actually group by a function cube(), unlikely as that might be, the function name has to be quoted. Needs a catversion bump because stored rules may change. Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
2015-05-16 03:40:59 +02:00
&use_variadic,
context->special_exprkind));
nargs = 0;
foreach(l, expr->args)
{
if (nargs++ > 0)
appendStringInfoString(buf, ", ");
Represent Lists as expansible arrays, not chains of cons-cells. Originally, Postgres Lists were a more or less exact reimplementation of Lisp lists, which consist of chains of separately-allocated cons cells, each having a value and a next-cell link. We'd hacked that once before (commit d0b4399d8) to add a separate List header, but the data was still in cons cells. That makes some operations -- notably list_nth() -- O(N), and it's bulky because of the next-cell pointers and per-cell palloc overhead, and it's very cache-unfriendly if the cons cells end up scattered around rather than being adjacent. In this rewrite, we still have List headers, but the data is in a resizable array of values, with no next-cell links. Now we need at most two palloc's per List, and often only one, since we can allocate some values in the same palloc call as the List header. (Of course, extending an existing List may require repalloc's to enlarge the array. But this involves just O(log N) allocations not O(N).) Of course this is not without downsides. The key difficulty is that addition or deletion of a list entry may now cause other entries to move, which it did not before. For example, that breaks foreach() and sister macros, which historically used a pointer to the current cons-cell as loop state. We can repair those macros transparently by making their actual loop state be an integer list index; the exposed "ListCell *" pointer is no longer state carried across loop iterations, but is just a derived value. (In practice, modern compilers can optimize things back to having just one loop state value, at least for simple cases with inline loop bodies.) In principle, this is a semantics change for cases where the loop body inserts or deletes list entries ahead of the current loop index; but I found no such cases in the Postgres code. The change is not at all transparent for code that doesn't use foreach() but chases lists "by hand" using lnext(). The largest share of such code in the backend is in loops that were maintaining "prev" and "next" variables in addition to the current-cell pointer, in order to delete list cells efficiently using list_delete_cell(). However, we no longer need a previous-cell pointer to delete a list cell efficiently. Keeping a next-cell pointer doesn't work, as explained above, but we can improve matters by changing such code to use a regular foreach() loop and then using the new macro foreach_delete_current() to delete the current cell. (This macro knows how to update the associated foreach loop's state so that no cells will be missed in the traversal.) There remains a nontrivial risk of code assuming that a ListCell * pointer will remain good over an operation that could now move the list contents. To help catch such errors, list.c can be compiled with a new define symbol DEBUG_LIST_MEMORY_USAGE that forcibly moves list contents whenever that could possibly happen. This makes list operations significantly more expensive so it's not normally turned on (though it is on by default if USE_VALGRIND is on). There are two notable API differences from the previous code: * lnext() now requires the List's header pointer in addition to the current cell's address. * list_delete_cell() no longer requires a previous-cell argument. These changes are somewhat unfortunate, but on the other hand code using either function needs inspection to see if it is assuming anything it shouldn't, so it's not all bad. Programmers should be aware of these significant performance changes: * list_nth() and related functions are now O(1); so there's no major access-speed difference between a list and an array. * Inserting or deleting a list element now takes time proportional to the distance to the end of the list, due to moving the array elements. (However, it typically *doesn't* require palloc or pfree, so except in long lists it's probably still faster than before.) Notably, lcons() used to be about the same cost as lappend(), but that's no longer true if the list is long. Code that uses lcons() and list_delete_first() to maintain a stack might usefully be rewritten to push and pop at the end of the list rather than the beginning. * There are now list_insert_nth...() and list_delete_nth...() functions that add or remove a list cell identified by index. These have the data-movement penalty explained above, but there's no search penalty. * list_concat() and variants now copy the second list's data into storage belonging to the first list, so there is no longer any sharing of cells between the input lists. The second argument is now declared "const List *" to reflect that it isn't changed. This patch just does the minimum needed to get the new implementation in place and fix bugs exposed by the regression tests. As suggested by the foregoing, there's a fair amount of followup work remaining to do. Also, the ENABLE_LIST_COMPAT macros are finally removed in this commit. Code using those should have been gone a dozen years ago. Patch by me; thanks to David Rowley, Jesper Pedersen, and others for review. Discussion: https://postgr.es/m/11587.1550975080@sss.pgh.pa.us
2019-07-15 19:41:58 +02:00
if (use_variadic && lnext(expr->args, l) == NULL)
appendStringInfoString(buf, "VARIADIC ");
get_rule_expr((Node *) lfirst(l), context, true);
}
appendStringInfoChar(buf, ')');
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
}
/*
Revert SQL/JSON features The reverts the following and makes some associated cleanups: commit f79b803dc: Common SQL/JSON clauses commit f4fb45d15: SQL/JSON constructors commit 5f0adec25: Make STRING an unreserved_keyword. commit 33a377608: IS JSON predicate commit 1a36bc9db: SQL/JSON query functions commit 606948b05: SQL JSON functions commit 49082c2cc: RETURNING clause for JSON() and JSON_SCALAR() commit 4e34747c8: JSON_TABLE commit fadb48b00: PLAN clauses for JSON_TABLE commit 2ef6f11b0: Reduce running time of jsonb_sqljson test commit 14d3f24fa: Further improve jsonb_sqljson parallel test commit a6baa4bad: Documentation for SQL/JSON features commit b46bcf7a4: Improve readability of SQL/JSON documentation. commit 112fdb352: Fix finalization for json_objectagg and friends commit fcdb35c32: Fix transformJsonBehavior commit 4cd8717af: Improve a couple of sql/json error messages commit f7a605f63: Small cleanups in SQL/JSON code commit 9c3d25e17: Fix JSON_OBJECTAGG uniquefying bug commit a79153b7a: Claim SQL standard compliance for SQL/JSON features commit a1e7616d6: Rework SQL/JSON documentation commit 8d9f9634e: Fix errors in copyfuncs/equalfuncs support for JSON node types. commit 3c633f32b: Only allow returning string types or bytea from json_serialize commit 67b26703b: expression eval: Fix EEOP_JSON_CONSTRUCTOR and EEOP_JSONEXPR size. The release notes are also adjusted. Backpatch to release 15. Discussion: https://postgr.es/m/40d2c882-bcac-19a9-754d-4299e1d87ac7@postgresql.org
2022-09-01 23:07:14 +02:00
* get_agg_expr - Parse back an Aggref node
*/
static void
Revert SQL/JSON features The reverts the following and makes some associated cleanups: commit f79b803dc: Common SQL/JSON clauses commit f4fb45d15: SQL/JSON constructors commit 5f0adec25: Make STRING an unreserved_keyword. commit 33a377608: IS JSON predicate commit 1a36bc9db: SQL/JSON query functions commit 606948b05: SQL JSON functions commit 49082c2cc: RETURNING clause for JSON() and JSON_SCALAR() commit 4e34747c8: JSON_TABLE commit fadb48b00: PLAN clauses for JSON_TABLE commit 2ef6f11b0: Reduce running time of jsonb_sqljson test commit 14d3f24fa: Further improve jsonb_sqljson parallel test commit a6baa4bad: Documentation for SQL/JSON features commit b46bcf7a4: Improve readability of SQL/JSON documentation. commit 112fdb352: Fix finalization for json_objectagg and friends commit fcdb35c32: Fix transformJsonBehavior commit 4cd8717af: Improve a couple of sql/json error messages commit f7a605f63: Small cleanups in SQL/JSON code commit 9c3d25e17: Fix JSON_OBJECTAGG uniquefying bug commit a79153b7a: Claim SQL standard compliance for SQL/JSON features commit a1e7616d6: Rework SQL/JSON documentation commit 8d9f9634e: Fix errors in copyfuncs/equalfuncs support for JSON node types. commit 3c633f32b: Only allow returning string types or bytea from json_serialize commit 67b26703b: expression eval: Fix EEOP_JSON_CONSTRUCTOR and EEOP_JSONEXPR size. The release notes are also adjusted. Backpatch to release 15. Discussion: https://postgr.es/m/40d2c882-bcac-19a9-754d-4299e1d87ac7@postgresql.org
2022-09-01 23:07:14 +02:00
get_agg_expr(Aggref *aggref, deparse_context *context,
Aggref *original_aggref)
{
get_agg_expr_helper(aggref, context, original_aggref, NULL, NULL,
false);
}
/*
* get_agg_expr_helper - subroutine for get_agg_expr and
* get_json_agg_constructor
*/
static void
get_agg_expr_helper(Aggref *aggref, deparse_context *context,
Aggref *original_aggref, const char *funcname,
const char *options, bool is_json_objectagg)
{
StringInfo buf = context->buf;
Oid argtypes[FUNC_MAX_ARGS];
int nargs;
bool use_variadic = false;
/*
* For a combining aggregate, we look up and deparse the corresponding
* partial aggregate instead. This is necessary because our input
* argument list has been replaced; the new argument list always has just
* one element, which will point to a partial Aggref that supplies us with
* transition states to combine.
*/
if (DO_AGGSPLIT_COMBINE(aggref->aggsplit))
{
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
TargetEntry *tle;
Assert(list_length(aggref->args) == 1);
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
tle = linitial_node(TargetEntry, aggref->args);
resolve_special_varno((Node *) tle->expr, context,
get_agg_combine_expr, original_aggref);
return;
}
/*
* Mark as PARTIAL, if appropriate. We look to the original aggref so as
* to avoid printing this when recursing from the code just above.
*/
if (DO_AGGSPLIT_SKIPFINAL(original_aggref->aggsplit))
appendStringInfoString(buf, "PARTIAL ");
Support ordered-set (WITHIN GROUP) aggregates. This patch introduces generic support for ordered-set and hypothetical-set aggregate functions, as well as implementations of the instances defined in SQL:2008 (percentile_cont(), percentile_disc(), rank(), dense_rank(), percent_rank(), cume_dist()). We also added mode() though it is not in the spec, as well as versions of percentile_cont() and percentile_disc() that can compute multiple percentile values in one pass over the data. Unlike the original submission, this patch puts full control of the sorting process in the hands of the aggregate's support functions. To allow the support functions to find out how they're supposed to sort, a new API function AggGetAggref() is added to nodeAgg.c. This allows retrieval of the aggregate call's Aggref node, which may have other uses beyond the immediate need. There is also support for ordered-set aggregates to install cleanup callback functions, so that they can be sure that infrastructure such as tuplesort objects gets cleaned up. In passing, make some fixes in the recently-added support for variadic aggregates, and make some editorial adjustments in the recent FILTER additions for aggregates. Also, simplify use of IsBinaryCoercible() by allowing it to succeed whenever the target type is ANY or ANYELEMENT. It was inconsistent that it dealt with other polymorphic target types but not these. Atri Sharma and Andrew Gierth; reviewed by Pavel Stehule and Vik Fearing, and rather heavily editorialized upon by Tom Lane
2013-12-23 22:11:35 +01:00
/* Extract the argument types as seen by the parser */
nargs = get_aggregate_argtypes(aggref, argtypes);
if (!funcname)
funcname = generate_function_name(aggref->aggfnoid, nargs, NIL,
argtypes, aggref->aggvariadic,
&use_variadic,
context->special_exprkind);
Support ordered-set (WITHIN GROUP) aggregates. This patch introduces generic support for ordered-set and hypothetical-set aggregate functions, as well as implementations of the instances defined in SQL:2008 (percentile_cont(), percentile_disc(), rank(), dense_rank(), percent_rank(), cume_dist()). We also added mode() though it is not in the spec, as well as versions of percentile_cont() and percentile_disc() that can compute multiple percentile values in one pass over the data. Unlike the original submission, this patch puts full control of the sorting process in the hands of the aggregate's support functions. To allow the support functions to find out how they're supposed to sort, a new API function AggGetAggref() is added to nodeAgg.c. This allows retrieval of the aggregate call's Aggref node, which may have other uses beyond the immediate need. There is also support for ordered-set aggregates to install cleanup callback functions, so that they can be sure that infrastructure such as tuplesort objects gets cleaned up. In passing, make some fixes in the recently-added support for variadic aggregates, and make some editorial adjustments in the recent FILTER additions for aggregates. Also, simplify use of IsBinaryCoercible() by allowing it to succeed whenever the target type is ANY or ANYELEMENT. It was inconsistent that it dealt with other polymorphic target types but not these. Atri Sharma and Andrew Gierth; reviewed by Pavel Stehule and Vik Fearing, and rather heavily editorialized upon by Tom Lane
2013-12-23 22:11:35 +01:00
/* Print the aggregate name, schema-qualified if needed */
appendStringInfo(buf, "%s(%s", funcname,
(aggref->aggdistinct != NIL) ? "DISTINCT " : "");
Support ordered-set (WITHIN GROUP) aggregates. This patch introduces generic support for ordered-set and hypothetical-set aggregate functions, as well as implementations of the instances defined in SQL:2008 (percentile_cont(), percentile_disc(), rank(), dense_rank(), percent_rank(), cume_dist()). We also added mode() though it is not in the spec, as well as versions of percentile_cont() and percentile_disc() that can compute multiple percentile values in one pass over the data. Unlike the original submission, this patch puts full control of the sorting process in the hands of the aggregate's support functions. To allow the support functions to find out how they're supposed to sort, a new API function AggGetAggref() is added to nodeAgg.c. This allows retrieval of the aggregate call's Aggref node, which may have other uses beyond the immediate need. There is also support for ordered-set aggregates to install cleanup callback functions, so that they can be sure that infrastructure such as tuplesort objects gets cleaned up. In passing, make some fixes in the recently-added support for variadic aggregates, and make some editorial adjustments in the recent FILTER additions for aggregates. Also, simplify use of IsBinaryCoercible() by allowing it to succeed whenever the target type is ANY or ANYELEMENT. It was inconsistent that it dealt with other polymorphic target types but not these. Atri Sharma and Andrew Gierth; reviewed by Pavel Stehule and Vik Fearing, and rather heavily editorialized upon by Tom Lane
2013-12-23 22:11:35 +01:00
if (AGGKIND_IS_ORDERED_SET(aggref->aggkind))
{
/*
* Ordered-set aggregates do not use "*" syntax. Also, we needn't
* worry about inserting VARIADIC. So we can just dump the direct
* args as-is.
*/
Assert(!aggref->aggvariadic);
get_rule_expr((Node *) aggref->aggdirectargs, context, true);
Assert(aggref->aggorder != NIL);
appendStringInfoString(buf, ") WITHIN GROUP (ORDER BY ");
get_rule_orderby(aggref->aggorder, aggref->args, false, context);
}
else
{
Support ordered-set (WITHIN GROUP) aggregates. This patch introduces generic support for ordered-set and hypothetical-set aggregate functions, as well as implementations of the instances defined in SQL:2008 (percentile_cont(), percentile_disc(), rank(), dense_rank(), percent_rank(), cume_dist()). We also added mode() though it is not in the spec, as well as versions of percentile_cont() and percentile_disc() that can compute multiple percentile values in one pass over the data. Unlike the original submission, this patch puts full control of the sorting process in the hands of the aggregate's support functions. To allow the support functions to find out how they're supposed to sort, a new API function AggGetAggref() is added to nodeAgg.c. This allows retrieval of the aggregate call's Aggref node, which may have other uses beyond the immediate need. There is also support for ordered-set aggregates to install cleanup callback functions, so that they can be sure that infrastructure such as tuplesort objects gets cleaned up. In passing, make some fixes in the recently-added support for variadic aggregates, and make some editorial adjustments in the recent FILTER additions for aggregates. Also, simplify use of IsBinaryCoercible() by allowing it to succeed whenever the target type is ANY or ANYELEMENT. It was inconsistent that it dealt with other polymorphic target types but not these. Atri Sharma and Andrew Gierth; reviewed by Pavel Stehule and Vik Fearing, and rather heavily editorialized upon by Tom Lane
2013-12-23 22:11:35 +01:00
/* aggstar can be set only in zero-argument aggregates */
if (aggref->aggstar)
appendStringInfoChar(buf, '*');
else
{
Support ordered-set (WITHIN GROUP) aggregates. This patch introduces generic support for ordered-set and hypothetical-set aggregate functions, as well as implementations of the instances defined in SQL:2008 (percentile_cont(), percentile_disc(), rank(), dense_rank(), percent_rank(), cume_dist()). We also added mode() though it is not in the spec, as well as versions of percentile_cont() and percentile_disc() that can compute multiple percentile values in one pass over the data. Unlike the original submission, this patch puts full control of the sorting process in the hands of the aggregate's support functions. To allow the support functions to find out how they're supposed to sort, a new API function AggGetAggref() is added to nodeAgg.c. This allows retrieval of the aggregate call's Aggref node, which may have other uses beyond the immediate need. There is also support for ordered-set aggregates to install cleanup callback functions, so that they can be sure that infrastructure such as tuplesort objects gets cleaned up. In passing, make some fixes in the recently-added support for variadic aggregates, and make some editorial adjustments in the recent FILTER additions for aggregates. Also, simplify use of IsBinaryCoercible() by allowing it to succeed whenever the target type is ANY or ANYELEMENT. It was inconsistent that it dealt with other polymorphic target types but not these. Atri Sharma and Andrew Gierth; reviewed by Pavel Stehule and Vik Fearing, and rather heavily editorialized upon by Tom Lane
2013-12-23 22:11:35 +01:00
ListCell *l;
int i;
i = 0;
foreach(l, aggref->args)
{
TargetEntry *tle = (TargetEntry *) lfirst(l);
Node *arg = (Node *) tle->expr;
Assert(!IsA(arg, NamedArgExpr));
if (tle->resjunk)
continue;
if (i++ > 0)
{
if (is_json_objectagg)
{
/*
* the ABSENT ON NULL and WITH UNIQUE args are printed
* separately, so ignore them here
*/
if (i > 2)
break;
appendStringInfoString(buf, " : ");
}
else
appendStringInfoString(buf, ", ");
}
Support ordered-set (WITHIN GROUP) aggregates. This patch introduces generic support for ordered-set and hypothetical-set aggregate functions, as well as implementations of the instances defined in SQL:2008 (percentile_cont(), percentile_disc(), rank(), dense_rank(), percent_rank(), cume_dist()). We also added mode() though it is not in the spec, as well as versions of percentile_cont() and percentile_disc() that can compute multiple percentile values in one pass over the data. Unlike the original submission, this patch puts full control of the sorting process in the hands of the aggregate's support functions. To allow the support functions to find out how they're supposed to sort, a new API function AggGetAggref() is added to nodeAgg.c. This allows retrieval of the aggregate call's Aggref node, which may have other uses beyond the immediate need. There is also support for ordered-set aggregates to install cleanup callback functions, so that they can be sure that infrastructure such as tuplesort objects gets cleaned up. In passing, make some fixes in the recently-added support for variadic aggregates, and make some editorial adjustments in the recent FILTER additions for aggregates. Also, simplify use of IsBinaryCoercible() by allowing it to succeed whenever the target type is ANY or ANYELEMENT. It was inconsistent that it dealt with other polymorphic target types but not these. Atri Sharma and Andrew Gierth; reviewed by Pavel Stehule and Vik Fearing, and rather heavily editorialized upon by Tom Lane
2013-12-23 22:11:35 +01:00
if (use_variadic && i == nargs)
appendStringInfoString(buf, "VARIADIC ");
get_rule_expr(arg, context, true);
}
}
Support ordered-set (WITHIN GROUP) aggregates. This patch introduces generic support for ordered-set and hypothetical-set aggregate functions, as well as implementations of the instances defined in SQL:2008 (percentile_cont(), percentile_disc(), rank(), dense_rank(), percent_rank(), cume_dist()). We also added mode() though it is not in the spec, as well as versions of percentile_cont() and percentile_disc() that can compute multiple percentile values in one pass over the data. Unlike the original submission, this patch puts full control of the sorting process in the hands of the aggregate's support functions. To allow the support functions to find out how they're supposed to sort, a new API function AggGetAggref() is added to nodeAgg.c. This allows retrieval of the aggregate call's Aggref node, which may have other uses beyond the immediate need. There is also support for ordered-set aggregates to install cleanup callback functions, so that they can be sure that infrastructure such as tuplesort objects gets cleaned up. In passing, make some fixes in the recently-added support for variadic aggregates, and make some editorial adjustments in the recent FILTER additions for aggregates. Also, simplify use of IsBinaryCoercible() by allowing it to succeed whenever the target type is ANY or ANYELEMENT. It was inconsistent that it dealt with other polymorphic target types but not these. Atri Sharma and Andrew Gierth; reviewed by Pavel Stehule and Vik Fearing, and rather heavily editorialized upon by Tom Lane
2013-12-23 22:11:35 +01:00
if (aggref->aggorder != NIL)
{
appendStringInfoString(buf, " ORDER BY ");
get_rule_orderby(aggref->aggorder, aggref->args, false, context);
}
}
if (options)
appendStringInfoString(buf, options);
if (aggref->aggfilter != NULL)
{
appendStringInfoString(buf, ") FILTER (WHERE ");
get_rule_expr((Node *) aggref->aggfilter, context, false);
}
appendStringInfoChar(buf, ')');
}
/*
* This is a helper function for get_agg_expr(). It's used when we deparse
* a combining Aggref; resolve_special_varno locates the corresponding partial
* Aggref and then calls this.
*/
static void
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
get_agg_combine_expr(Node *node, deparse_context *context, void *callback_arg)
{
Aggref *aggref;
Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit 55a1954da), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit 1cc29fe7c, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
2019-12-11 23:05:18 +01:00
Aggref *original_aggref = callback_arg;
if (!IsA(node, Aggref))
elog(ERROR, "combining Aggref does not point to an Aggref");
aggref = (Aggref *) node;
get_agg_expr(aggref, context, original_aggref);
}
/*
Revert SQL/JSON features The reverts the following and makes some associated cleanups: commit f79b803dc: Common SQL/JSON clauses commit f4fb45d15: SQL/JSON constructors commit 5f0adec25: Make STRING an unreserved_keyword. commit 33a377608: IS JSON predicate commit 1a36bc9db: SQL/JSON query functions commit 606948b05: SQL JSON functions commit 49082c2cc: RETURNING clause for JSON() and JSON_SCALAR() commit 4e34747c8: JSON_TABLE commit fadb48b00: PLAN clauses for JSON_TABLE commit 2ef6f11b0: Reduce running time of jsonb_sqljson test commit 14d3f24fa: Further improve jsonb_sqljson parallel test commit a6baa4bad: Documentation for SQL/JSON features commit b46bcf7a4: Improve readability of SQL/JSON documentation. commit 112fdb352: Fix finalization for json_objectagg and friends commit fcdb35c32: Fix transformJsonBehavior commit 4cd8717af: Improve a couple of sql/json error messages commit f7a605f63: Small cleanups in SQL/JSON code commit 9c3d25e17: Fix JSON_OBJECTAGG uniquefying bug commit a79153b7a: Claim SQL standard compliance for SQL/JSON features commit a1e7616d6: Rework SQL/JSON documentation commit 8d9f9634e: Fix errors in copyfuncs/equalfuncs support for JSON node types. commit 3c633f32b: Only allow returning string types or bytea from json_serialize commit 67b26703b: expression eval: Fix EEOP_JSON_CONSTRUCTOR and EEOP_JSONEXPR size. The release notes are also adjusted. Backpatch to release 15. Discussion: https://postgr.es/m/40d2c882-bcac-19a9-754d-4299e1d87ac7@postgresql.org
2022-09-01 23:07:14 +02:00
* get_windowfunc_expr - Parse back a WindowFunc node
*/
static void
Revert SQL/JSON features The reverts the following and makes some associated cleanups: commit f79b803dc: Common SQL/JSON clauses commit f4fb45d15: SQL/JSON constructors commit 5f0adec25: Make STRING an unreserved_keyword. commit 33a377608: IS JSON predicate commit 1a36bc9db: SQL/JSON query functions commit 606948b05: SQL JSON functions commit 49082c2cc: RETURNING clause for JSON() and JSON_SCALAR() commit 4e34747c8: JSON_TABLE commit fadb48b00: PLAN clauses for JSON_TABLE commit 2ef6f11b0: Reduce running time of jsonb_sqljson test commit 14d3f24fa: Further improve jsonb_sqljson parallel test commit a6baa4bad: Documentation for SQL/JSON features commit b46bcf7a4: Improve readability of SQL/JSON documentation. commit 112fdb352: Fix finalization for json_objectagg and friends commit fcdb35c32: Fix transformJsonBehavior commit 4cd8717af: Improve a couple of sql/json error messages commit f7a605f63: Small cleanups in SQL/JSON code commit 9c3d25e17: Fix JSON_OBJECTAGG uniquefying bug commit a79153b7a: Claim SQL standard compliance for SQL/JSON features commit a1e7616d6: Rework SQL/JSON documentation commit 8d9f9634e: Fix errors in copyfuncs/equalfuncs support for JSON node types. commit 3c633f32b: Only allow returning string types or bytea from json_serialize commit 67b26703b: expression eval: Fix EEOP_JSON_CONSTRUCTOR and EEOP_JSONEXPR size. The release notes are also adjusted. Backpatch to release 15. Discussion: https://postgr.es/m/40d2c882-bcac-19a9-754d-4299e1d87ac7@postgresql.org
2022-09-01 23:07:14 +02:00
get_windowfunc_expr(WindowFunc *wfunc, deparse_context *context)
{
get_windowfunc_expr_helper(wfunc, context, NULL, NULL, false);
}
/*
* get_windowfunc_expr_helper - subroutine for get_windowfunc_expr and
* get_json_agg_constructor
*/
static void
get_windowfunc_expr_helper(WindowFunc *wfunc, deparse_context *context,
const char *funcname, const char *options,
bool is_json_objectagg)
{
StringInfo buf = context->buf;
Oid argtypes[FUNC_MAX_ARGS];
int nargs;
List *argnames;
ListCell *l;
if (list_length(wfunc->args) > FUNC_MAX_ARGS)
ereport(ERROR,
(errcode(ERRCODE_TOO_MANY_ARGUMENTS),
errmsg("too many arguments")));
nargs = 0;
argnames = NIL;
foreach(l, wfunc->args)
{
Node *arg = (Node *) lfirst(l);
if (IsA(arg, NamedArgExpr))
argnames = lappend(argnames, ((NamedArgExpr *) arg)->name);
argtypes[nargs] = exprType(arg);
nargs++;
}
if (!funcname)
funcname = generate_function_name(wfunc->winfnoid, nargs, argnames,
argtypes, false, NULL,
context->special_exprkind);
appendStringInfo(buf, "%s(", funcname);
/* winstar can be set only in zero-argument aggregates */
if (wfunc->winstar)
appendStringInfoChar(buf, '*');
else
{
if (is_json_objectagg)
{
get_rule_expr((Node *) linitial(wfunc->args), context, false);
appendStringInfoString(buf, " : ");
get_rule_expr((Node *) lsecond(wfunc->args), context, false);
}
else
get_rule_expr((Node *) wfunc->args, context, true);
}
if (options)
appendStringInfoString(buf, options);
if (wfunc->aggfilter != NULL)
{
appendStringInfoString(buf, ") FILTER (WHERE ");
get_rule_expr((Node *) wfunc->aggfilter, context, false);
}
appendStringInfoString(buf, ") OVER ");
foreach(l, context->windowClause)
{
WindowClause *wc = (WindowClause *) lfirst(l);
if (wc->winref == wfunc->winref)
{
if (wc->name)
appendStringInfoString(buf, quote_identifier(wc->name));
else
get_rule_windowspec(wc, context->windowTList, context);
break;
}
}
if (l == NULL)
{
if (context->windowClause)
elog(ERROR, "could not find window clause for winref %u",
wfunc->winref);
/*
* In EXPLAIN, we don't have window context information available, so
* we have to settle for this:
*/
appendStringInfoString(buf, "(?)");
}
}
Improve our ability to regurgitate SQL-syntax function calls. The SQL spec calls out nonstandard syntax for certain function calls, for example substring() with numeric position info is supposed to be spelled "SUBSTRING(string FROM start FOR count)". We accept many of these things, but up to now would not print them in the same format, instead simplifying down to "substring"(string, start, count). That's long annoyed me because it creates an interoperability problem: we're gratuitously injecting Postgres-specific syntax into what might otherwise be a perfectly spec-compliant view definition. However, the real reason for addressing it right now is to support a planned change in the semantics of EXTRACT() a/k/a date_part(). When we switch that to returning numeric, we'll have the parser translate EXTRACT() to some new function name (might as well be "extract" if you ask me) and then teach ruleutils.c to reverse-list that per SQL spec. In this way existing calls to date_part() will continue to have the old semantics. To implement this, invent a new CoercionForm value COERCE_SQL_SYNTAX, and make the parser insert that rather than COERCE_EXPLICIT_CALL when the input has SQL-spec decoration. (But if the input has the form of a plain function call, continue to mark it COERCE_EXPLICIT_CALL, even if it's calling one of these functions.) Then ruleutils.c recognizes COERCE_SQL_SYNTAX as a cue to emit SQL call syntax. It can know which decoration to emit using hard-wired knowledge about the functions that could be called this way. (While this solution isn't extensible without manual additions, neither is the grammar, so this doesn't seem unmaintainable.) Notice that this solution will reverse-list a function call with SQL decoration only if it was entered that way; so dump-and-reload will not by itself produce any changes in the appearance of views. This requires adding a CoercionForm field to struct FuncCall. (I couldn't resist the temptation to rearrange that struct's field order a tad while I was at it.) FuncCall doesn't appear in stored rules, so that change isn't a reason for a catversion bump, but I did one anyway because the new enum value for CoercionForm fields could confuse old backend code. Possible future work: * Perhaps CoercionForm should now be renamed to DisplayForm, or something like that, to reflect its more general meaning. This'd require touching a couple hundred places, so it's not clear it's worth the code churn. * The SQLValueFunction node type, which was invented partly for the same goal of improving SQL-compatibility of view output, could perhaps be replaced with regular function calls marked with COERCE_SQL_SYNTAX. It's unclear if this would be a net code savings, however. Discussion: https://postgr.es/m/42b73d2d-da12-ba9f-570a-420e0cce19d9@phystech.edu
2020-11-04 18:34:50 +01:00
/*
* get_func_sql_syntax - Parse back a SQL-syntax function call
*
* Returns true if we successfully deparsed, false if we did not
* recognize the function.
*/
static bool
get_func_sql_syntax(FuncExpr *expr, deparse_context *context)
{
StringInfo buf = context->buf;
Oid funcoid = expr->funcid;
switch (funcoid)
{
case F_TIMEZONE_INTERVAL_TIMESTAMP:
case F_TIMEZONE_INTERVAL_TIMESTAMPTZ:
case F_TIMEZONE_INTERVAL_TIMETZ:
case F_TIMEZONE_TEXT_TIMESTAMP:
case F_TIMEZONE_TEXT_TIMESTAMPTZ:
case F_TIMEZONE_TEXT_TIMETZ:
/* AT TIME ZONE ... note reversed argument order */
appendStringInfoChar(buf, '(');
get_rule_expr_paren((Node *) lsecond(expr->args), context, false,
(Node *) expr);
Improve our ability to regurgitate SQL-syntax function calls. The SQL spec calls out nonstandard syntax for certain function calls, for example substring() with numeric position info is supposed to be spelled "SUBSTRING(string FROM start FOR count)". We accept many of these things, but up to now would not print them in the same format, instead simplifying down to "substring"(string, start, count). That's long annoyed me because it creates an interoperability problem: we're gratuitously injecting Postgres-specific syntax into what might otherwise be a perfectly spec-compliant view definition. However, the real reason for addressing it right now is to support a planned change in the semantics of EXTRACT() a/k/a date_part(). When we switch that to returning numeric, we'll have the parser translate EXTRACT() to some new function name (might as well be "extract" if you ask me) and then teach ruleutils.c to reverse-list that per SQL spec. In this way existing calls to date_part() will continue to have the old semantics. To implement this, invent a new CoercionForm value COERCE_SQL_SYNTAX, and make the parser insert that rather than COERCE_EXPLICIT_CALL when the input has SQL-spec decoration. (But if the input has the form of a plain function call, continue to mark it COERCE_EXPLICIT_CALL, even if it's calling one of these functions.) Then ruleutils.c recognizes COERCE_SQL_SYNTAX as a cue to emit SQL call syntax. It can know which decoration to emit using hard-wired knowledge about the functions that could be called this way. (While this solution isn't extensible without manual additions, neither is the grammar, so this doesn't seem unmaintainable.) Notice that this solution will reverse-list a function call with SQL decoration only if it was entered that way; so dump-and-reload will not by itself produce any changes in the appearance of views. This requires adding a CoercionForm field to struct FuncCall. (I couldn't resist the temptation to rearrange that struct's field order a tad while I was at it.) FuncCall doesn't appear in stored rules, so that change isn't a reason for a catversion bump, but I did one anyway because the new enum value for CoercionForm fields could confuse old backend code. Possible future work: * Perhaps CoercionForm should now be renamed to DisplayForm, or something like that, to reflect its more general meaning. This'd require touching a couple hundred places, so it's not clear it's worth the code churn. * The SQLValueFunction node type, which was invented partly for the same goal of improving SQL-compatibility of view output, could perhaps be replaced with regular function calls marked with COERCE_SQL_SYNTAX. It's unclear if this would be a net code savings, however. Discussion: https://postgr.es/m/42b73d2d-da12-ba9f-570a-420e0cce19d9@phystech.edu
2020-11-04 18:34:50 +01:00
appendStringInfoString(buf, " AT TIME ZONE ");
get_rule_expr_paren((Node *) linitial(expr->args), context, false,
(Node *) expr);
Improve our ability to regurgitate SQL-syntax function calls. The SQL spec calls out nonstandard syntax for certain function calls, for example substring() with numeric position info is supposed to be spelled "SUBSTRING(string FROM start FOR count)". We accept many of these things, but up to now would not print them in the same format, instead simplifying down to "substring"(string, start, count). That's long annoyed me because it creates an interoperability problem: we're gratuitously injecting Postgres-specific syntax into what might otherwise be a perfectly spec-compliant view definition. However, the real reason for addressing it right now is to support a planned change in the semantics of EXTRACT() a/k/a date_part(). When we switch that to returning numeric, we'll have the parser translate EXTRACT() to some new function name (might as well be "extract" if you ask me) and then teach ruleutils.c to reverse-list that per SQL spec. In this way existing calls to date_part() will continue to have the old semantics. To implement this, invent a new CoercionForm value COERCE_SQL_SYNTAX, and make the parser insert that rather than COERCE_EXPLICIT_CALL when the input has SQL-spec decoration. (But if the input has the form of a plain function call, continue to mark it COERCE_EXPLICIT_CALL, even if it's calling one of these functions.) Then ruleutils.c recognizes COERCE_SQL_SYNTAX as a cue to emit SQL call syntax. It can know which decoration to emit using hard-wired knowledge about the functions that could be called this way. (While this solution isn't extensible without manual additions, neither is the grammar, so this doesn't seem unmaintainable.) Notice that this solution will reverse-list a function call with SQL decoration only if it was entered that way; so dump-and-reload will not by itself produce any changes in the appearance of views. This requires adding a CoercionForm field to struct FuncCall. (I couldn't resist the temptation to rearrange that struct's field order a tad while I was at it.) FuncCall doesn't appear in stored rules, so that change isn't a reason for a catversion bump, but I did one anyway because the new enum value for CoercionForm fields could confuse old backend code. Possible future work: * Perhaps CoercionForm should now be renamed to DisplayForm, or something like that, to reflect its more general meaning. This'd require touching a couple hundred places, so it's not clear it's worth the code churn. * The SQLValueFunction node type, which was invented partly for the same goal of improving SQL-compatibility of view output, could perhaps be replaced with regular function calls marked with COERCE_SQL_SYNTAX. It's unclear if this would be a net code savings, however. Discussion: https://postgr.es/m/42b73d2d-da12-ba9f-570a-420e0cce19d9@phystech.edu
2020-11-04 18:34:50 +01:00
appendStringInfoChar(buf, ')');
return true;
case F_TIMEZONE_TIMESTAMP:
case F_TIMEZONE_TIMESTAMPTZ:
case F_TIMEZONE_TIMETZ:
/* AT LOCAL */
appendStringInfoChar(buf, '(');
get_rule_expr_paren((Node *) linitial(expr->args), context, false,
(Node *) expr);
appendStringInfoString(buf, " AT LOCAL)");
return true;
Improve our ability to regurgitate SQL-syntax function calls. The SQL spec calls out nonstandard syntax for certain function calls, for example substring() with numeric position info is supposed to be spelled "SUBSTRING(string FROM start FOR count)". We accept many of these things, but up to now would not print them in the same format, instead simplifying down to "substring"(string, start, count). That's long annoyed me because it creates an interoperability problem: we're gratuitously injecting Postgres-specific syntax into what might otherwise be a perfectly spec-compliant view definition. However, the real reason for addressing it right now is to support a planned change in the semantics of EXTRACT() a/k/a date_part(). When we switch that to returning numeric, we'll have the parser translate EXTRACT() to some new function name (might as well be "extract" if you ask me) and then teach ruleutils.c to reverse-list that per SQL spec. In this way existing calls to date_part() will continue to have the old semantics. To implement this, invent a new CoercionForm value COERCE_SQL_SYNTAX, and make the parser insert that rather than COERCE_EXPLICIT_CALL when the input has SQL-spec decoration. (But if the input has the form of a plain function call, continue to mark it COERCE_EXPLICIT_CALL, even if it's calling one of these functions.) Then ruleutils.c recognizes COERCE_SQL_SYNTAX as a cue to emit SQL call syntax. It can know which decoration to emit using hard-wired knowledge about the functions that could be called this way. (While this solution isn't extensible without manual additions, neither is the grammar, so this doesn't seem unmaintainable.) Notice that this solution will reverse-list a function call with SQL decoration only if it was entered that way; so dump-and-reload will not by itself produce any changes in the appearance of views. This requires adding a CoercionForm field to struct FuncCall. (I couldn't resist the temptation to rearrange that struct's field order a tad while I was at it.) FuncCall doesn't appear in stored rules, so that change isn't a reason for a catversion bump, but I did one anyway because the new enum value for CoercionForm fields could confuse old backend code. Possible future work: * Perhaps CoercionForm should now be renamed to DisplayForm, or something like that, to reflect its more general meaning. This'd require touching a couple hundred places, so it's not clear it's worth the code churn. * The SQLValueFunction node type, which was invented partly for the same goal of improving SQL-compatibility of view output, could perhaps be replaced with regular function calls marked with COERCE_SQL_SYNTAX. It's unclear if this would be a net code savings, however. Discussion: https://postgr.es/m/42b73d2d-da12-ba9f-570a-420e0cce19d9@phystech.edu
2020-11-04 18:34:50 +01:00
case F_OVERLAPS_TIMESTAMPTZ_INTERVAL_TIMESTAMPTZ_INTERVAL:
case F_OVERLAPS_TIMESTAMPTZ_INTERVAL_TIMESTAMPTZ_TIMESTAMPTZ:
case F_OVERLAPS_TIMESTAMPTZ_TIMESTAMPTZ_TIMESTAMPTZ_INTERVAL:
case F_OVERLAPS_TIMESTAMPTZ_TIMESTAMPTZ_TIMESTAMPTZ_TIMESTAMPTZ:
case F_OVERLAPS_TIMESTAMP_INTERVAL_TIMESTAMP_INTERVAL:
case F_OVERLAPS_TIMESTAMP_INTERVAL_TIMESTAMP_TIMESTAMP:
case F_OVERLAPS_TIMESTAMP_TIMESTAMP_TIMESTAMP_INTERVAL:
case F_OVERLAPS_TIMESTAMP_TIMESTAMP_TIMESTAMP_TIMESTAMP:
case F_OVERLAPS_TIMETZ_TIMETZ_TIMETZ_TIMETZ:
case F_OVERLAPS_TIME_INTERVAL_TIME_INTERVAL:
case F_OVERLAPS_TIME_INTERVAL_TIME_TIME:
case F_OVERLAPS_TIME_TIME_TIME_INTERVAL:
case F_OVERLAPS_TIME_TIME_TIME_TIME:
/* (x1, x2) OVERLAPS (y1, y2) */
appendStringInfoString(buf, "((");
get_rule_expr((Node *) linitial(expr->args), context, false);
appendStringInfoString(buf, ", ");
get_rule_expr((Node *) lsecond(expr->args), context, false);
appendStringInfoString(buf, ") OVERLAPS (");
get_rule_expr((Node *) lthird(expr->args), context, false);
appendStringInfoString(buf, ", ");
get_rule_expr((Node *) lfourth(expr->args), context, false);
appendStringInfoString(buf, "))");
return true;
Change return type of EXTRACT to numeric The previous implementation of EXTRACT mapped internally to date_part(), which returned type double precision (since it was implemented long before the numeric type existed). This can lead to imprecise output in some cases, so returning numeric would be preferrable. Changing the return type of an existing function is a bit risky, so instead we do the following: We implement a new set of functions, which are now called "extract", in parallel to the existing date_part functions. They work the same way internally but use numeric instead of float8. The EXTRACT construct is now mapped by the parser to these new extract functions. That way, dumps of views etc. from old versions (which would use date_part) continue to work unchanged, but new uses will map to the new extract functions. Additionally, the reverse compilation of EXTRACT now reproduces the original syntax, using the new mechanism introduced in 40c24bfef92530bd846e111c1742c2a54441c62c. The following minor changes of behavior result from the new implementation: - The column name from an isolated EXTRACT call is now "extract" instead of "date_part". - Extract from date now rejects inappropriate field names such as HOUR. It was previously mapped internally to extract from timestamp, so it would silently accept everything appropriate for timestamp. - Return values when extracting fields with possibly fractional values, such as second and epoch, now have the full scale that the value has internally (so, for example, '1.000000' instead of just '1'). Reported-by: Petr Fedorov <petr.fedorov@phystech.edu> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/42b73d2d-da12-ba9f-570a-420e0cce19d9@phystech.edu
2021-04-06 07:17:13 +02:00
case F_EXTRACT_TEXT_DATE:
case F_EXTRACT_TEXT_TIME:
case F_EXTRACT_TEXT_TIMETZ:
case F_EXTRACT_TEXT_TIMESTAMP:
case F_EXTRACT_TEXT_TIMESTAMPTZ:
case F_EXTRACT_TEXT_INTERVAL:
/* EXTRACT (x FROM y) */
appendStringInfoString(buf, "EXTRACT(");
{
Const *con = (Const *) linitial(expr->args);
Assert(IsA(con, Const) &&
con->consttype == TEXTOID &&
!con->constisnull);
appendStringInfoString(buf, TextDatumGetCString(con->constvalue));
}
appendStringInfoString(buf, " FROM ");
get_rule_expr((Node *) lsecond(expr->args), context, false);
appendStringInfoChar(buf, ')');
return true;
Improve our ability to regurgitate SQL-syntax function calls. The SQL spec calls out nonstandard syntax for certain function calls, for example substring() with numeric position info is supposed to be spelled "SUBSTRING(string FROM start FOR count)". We accept many of these things, but up to now would not print them in the same format, instead simplifying down to "substring"(string, start, count). That's long annoyed me because it creates an interoperability problem: we're gratuitously injecting Postgres-specific syntax into what might otherwise be a perfectly spec-compliant view definition. However, the real reason for addressing it right now is to support a planned change in the semantics of EXTRACT() a/k/a date_part(). When we switch that to returning numeric, we'll have the parser translate EXTRACT() to some new function name (might as well be "extract" if you ask me) and then teach ruleutils.c to reverse-list that per SQL spec. In this way existing calls to date_part() will continue to have the old semantics. To implement this, invent a new CoercionForm value COERCE_SQL_SYNTAX, and make the parser insert that rather than COERCE_EXPLICIT_CALL when the input has SQL-spec decoration. (But if the input has the form of a plain function call, continue to mark it COERCE_EXPLICIT_CALL, even if it's calling one of these functions.) Then ruleutils.c recognizes COERCE_SQL_SYNTAX as a cue to emit SQL call syntax. It can know which decoration to emit using hard-wired knowledge about the functions that could be called this way. (While this solution isn't extensible without manual additions, neither is the grammar, so this doesn't seem unmaintainable.) Notice that this solution will reverse-list a function call with SQL decoration only if it was entered that way; so dump-and-reload will not by itself produce any changes in the appearance of views. This requires adding a CoercionForm field to struct FuncCall. (I couldn't resist the temptation to rearrange that struct's field order a tad while I was at it.) FuncCall doesn't appear in stored rules, so that change isn't a reason for a catversion bump, but I did one anyway because the new enum value for CoercionForm fields could confuse old backend code. Possible future work: * Perhaps CoercionForm should now be renamed to DisplayForm, or something like that, to reflect its more general meaning. This'd require touching a couple hundred places, so it's not clear it's worth the code churn. * The SQLValueFunction node type, which was invented partly for the same goal of improving SQL-compatibility of view output, could perhaps be replaced with regular function calls marked with COERCE_SQL_SYNTAX. It's unclear if this would be a net code savings, however. Discussion: https://postgr.es/m/42b73d2d-da12-ba9f-570a-420e0cce19d9@phystech.edu
2020-11-04 18:34:50 +01:00
case F_IS_NORMALIZED:
/* IS xxx NORMALIZED */
appendStringInfoChar(buf, '(');
get_rule_expr_paren((Node *) linitial(expr->args), context, false,
(Node *) expr);
appendStringInfoString(buf, " IS");
Improve our ability to regurgitate SQL-syntax function calls. The SQL spec calls out nonstandard syntax for certain function calls, for example substring() with numeric position info is supposed to be spelled "SUBSTRING(string FROM start FOR count)". We accept many of these things, but up to now would not print them in the same format, instead simplifying down to "substring"(string, start, count). That's long annoyed me because it creates an interoperability problem: we're gratuitously injecting Postgres-specific syntax into what might otherwise be a perfectly spec-compliant view definition. However, the real reason for addressing it right now is to support a planned change in the semantics of EXTRACT() a/k/a date_part(). When we switch that to returning numeric, we'll have the parser translate EXTRACT() to some new function name (might as well be "extract" if you ask me) and then teach ruleutils.c to reverse-list that per SQL spec. In this way existing calls to date_part() will continue to have the old semantics. To implement this, invent a new CoercionForm value COERCE_SQL_SYNTAX, and make the parser insert that rather than COERCE_EXPLICIT_CALL when the input has SQL-spec decoration. (But if the input has the form of a plain function call, continue to mark it COERCE_EXPLICIT_CALL, even if it's calling one of these functions.) Then ruleutils.c recognizes COERCE_SQL_SYNTAX as a cue to emit SQL call syntax. It can know which decoration to emit using hard-wired knowledge about the functions that could be called this way. (While this solution isn't extensible without manual additions, neither is the grammar, so this doesn't seem unmaintainable.) Notice that this solution will reverse-list a function call with SQL decoration only if it was entered that way; so dump-and-reload will not by itself produce any changes in the appearance of views. This requires adding a CoercionForm field to struct FuncCall. (I couldn't resist the temptation to rearrange that struct's field order a tad while I was at it.) FuncCall doesn't appear in stored rules, so that change isn't a reason for a catversion bump, but I did one anyway because the new enum value for CoercionForm fields could confuse old backend code. Possible future work: * Perhaps CoercionForm should now be renamed to DisplayForm, or something like that, to reflect its more general meaning. This'd require touching a couple hundred places, so it's not clear it's worth the code churn. * The SQLValueFunction node type, which was invented partly for the same goal of improving SQL-compatibility of view output, could perhaps be replaced with regular function calls marked with COERCE_SQL_SYNTAX. It's unclear if this would be a net code savings, however. Discussion: https://postgr.es/m/42b73d2d-da12-ba9f-570a-420e0cce19d9@phystech.edu
2020-11-04 18:34:50 +01:00
if (list_length(expr->args) == 2)
{
Const *con = (Const *) lsecond(expr->args);
Assert(IsA(con, Const) &&
con->consttype == TEXTOID &&
!con->constisnull);
appendStringInfo(buf, " %s",
TextDatumGetCString(con->constvalue));
}
appendStringInfoString(buf, " NORMALIZED)");
return true;
case F_PG_COLLATION_FOR:
/* COLLATION FOR */
appendStringInfoString(buf, "COLLATION FOR (");
get_rule_expr((Node *) linitial(expr->args), context, false);
appendStringInfoChar(buf, ')');
return true;
case F_NORMALIZE:
/* NORMALIZE() */
appendStringInfoString(buf, "NORMALIZE(");
get_rule_expr((Node *) linitial(expr->args), context, false);
if (list_length(expr->args) == 2)
{
Const *con = (Const *) lsecond(expr->args);
Assert(IsA(con, Const) &&
con->consttype == TEXTOID &&
!con->constisnull);
appendStringInfo(buf, ", %s",
TextDatumGetCString(con->constvalue));
}
appendStringInfoChar(buf, ')');
return true;
case F_OVERLAY_BIT_BIT_INT4:
case F_OVERLAY_BIT_BIT_INT4_INT4:
case F_OVERLAY_BYTEA_BYTEA_INT4:
case F_OVERLAY_BYTEA_BYTEA_INT4_INT4:
case F_OVERLAY_TEXT_TEXT_INT4:
case F_OVERLAY_TEXT_TEXT_INT4_INT4:
/* OVERLAY() */
appendStringInfoString(buf, "OVERLAY(");
get_rule_expr((Node *) linitial(expr->args), context, false);
appendStringInfoString(buf, " PLACING ");
get_rule_expr((Node *) lsecond(expr->args), context, false);
appendStringInfoString(buf, " FROM ");
get_rule_expr((Node *) lthird(expr->args), context, false);
if (list_length(expr->args) == 4)
{
appendStringInfoString(buf, " FOR ");
get_rule_expr((Node *) lfourth(expr->args), context, false);
}
appendStringInfoChar(buf, ')');
return true;
case F_POSITION_BIT_BIT:
case F_POSITION_BYTEA_BYTEA:
case F_POSITION_TEXT_TEXT:
/* POSITION() ... extra parens since args are b_expr not a_expr */
appendStringInfoString(buf, "POSITION((");
get_rule_expr((Node *) lsecond(expr->args), context, false);
appendStringInfoString(buf, ") IN (");
get_rule_expr((Node *) linitial(expr->args), context, false);
appendStringInfoString(buf, "))");
return true;
case F_SUBSTRING_BIT_INT4:
case F_SUBSTRING_BIT_INT4_INT4:
case F_SUBSTRING_BYTEA_INT4:
case F_SUBSTRING_BYTEA_INT4_INT4:
case F_SUBSTRING_TEXT_INT4:
case F_SUBSTRING_TEXT_INT4_INT4:
/* SUBSTRING FROM/FOR (i.e., integer-position variants) */
appendStringInfoString(buf, "SUBSTRING(");
get_rule_expr((Node *) linitial(expr->args), context, false);
appendStringInfoString(buf, " FROM ");
get_rule_expr((Node *) lsecond(expr->args), context, false);
if (list_length(expr->args) == 3)
{
appendStringInfoString(buf, " FOR ");
get_rule_expr((Node *) lthird(expr->args), context, false);
}
appendStringInfoChar(buf, ')');
return true;
case F_SUBSTRING_TEXT_TEXT_TEXT:
/* SUBSTRING SIMILAR/ESCAPE */
appendStringInfoString(buf, "SUBSTRING(");
get_rule_expr((Node *) linitial(expr->args), context, false);
appendStringInfoString(buf, " SIMILAR ");
get_rule_expr((Node *) lsecond(expr->args), context, false);
appendStringInfoString(buf, " ESCAPE ");
get_rule_expr((Node *) lthird(expr->args), context, false);
appendStringInfoChar(buf, ')');
return true;
case F_BTRIM_BYTEA_BYTEA:
case F_BTRIM_TEXT:
case F_BTRIM_TEXT_TEXT:
/* TRIM() */
appendStringInfoString(buf, "TRIM(BOTH");
if (list_length(expr->args) == 2)
{
appendStringInfoChar(buf, ' ');
get_rule_expr((Node *) lsecond(expr->args), context, false);
}
appendStringInfoString(buf, " FROM ");
get_rule_expr((Node *) linitial(expr->args), context, false);
appendStringInfoChar(buf, ')');
return true;
case F_LTRIM_BYTEA_BYTEA:
Improve our ability to regurgitate SQL-syntax function calls. The SQL spec calls out nonstandard syntax for certain function calls, for example substring() with numeric position info is supposed to be spelled "SUBSTRING(string FROM start FOR count)". We accept many of these things, but up to now would not print them in the same format, instead simplifying down to "substring"(string, start, count). That's long annoyed me because it creates an interoperability problem: we're gratuitously injecting Postgres-specific syntax into what might otherwise be a perfectly spec-compliant view definition. However, the real reason for addressing it right now is to support a planned change in the semantics of EXTRACT() a/k/a date_part(). When we switch that to returning numeric, we'll have the parser translate EXTRACT() to some new function name (might as well be "extract" if you ask me) and then teach ruleutils.c to reverse-list that per SQL spec. In this way existing calls to date_part() will continue to have the old semantics. To implement this, invent a new CoercionForm value COERCE_SQL_SYNTAX, and make the parser insert that rather than COERCE_EXPLICIT_CALL when the input has SQL-spec decoration. (But if the input has the form of a plain function call, continue to mark it COERCE_EXPLICIT_CALL, even if it's calling one of these functions.) Then ruleutils.c recognizes COERCE_SQL_SYNTAX as a cue to emit SQL call syntax. It can know which decoration to emit using hard-wired knowledge about the functions that could be called this way. (While this solution isn't extensible without manual additions, neither is the grammar, so this doesn't seem unmaintainable.) Notice that this solution will reverse-list a function call with SQL decoration only if it was entered that way; so dump-and-reload will not by itself produce any changes in the appearance of views. This requires adding a CoercionForm field to struct FuncCall. (I couldn't resist the temptation to rearrange that struct's field order a tad while I was at it.) FuncCall doesn't appear in stored rules, so that change isn't a reason for a catversion bump, but I did one anyway because the new enum value for CoercionForm fields could confuse old backend code. Possible future work: * Perhaps CoercionForm should now be renamed to DisplayForm, or something like that, to reflect its more general meaning. This'd require touching a couple hundred places, so it's not clear it's worth the code churn. * The SQLValueFunction node type, which was invented partly for the same goal of improving SQL-compatibility of view output, could perhaps be replaced with regular function calls marked with COERCE_SQL_SYNTAX. It's unclear if this would be a net code savings, however. Discussion: https://postgr.es/m/42b73d2d-da12-ba9f-570a-420e0cce19d9@phystech.edu
2020-11-04 18:34:50 +01:00
case F_LTRIM_TEXT:
case F_LTRIM_TEXT_TEXT:
/* TRIM() */
appendStringInfoString(buf, "TRIM(LEADING");
if (list_length(expr->args) == 2)
{
appendStringInfoChar(buf, ' ');
get_rule_expr((Node *) lsecond(expr->args), context, false);
}
appendStringInfoString(buf, " FROM ");
get_rule_expr((Node *) linitial(expr->args), context, false);
appendStringInfoChar(buf, ')');
return true;
case F_RTRIM_BYTEA_BYTEA:
Improve our ability to regurgitate SQL-syntax function calls. The SQL spec calls out nonstandard syntax for certain function calls, for example substring() with numeric position info is supposed to be spelled "SUBSTRING(string FROM start FOR count)". We accept many of these things, but up to now would not print them in the same format, instead simplifying down to "substring"(string, start, count). That's long annoyed me because it creates an interoperability problem: we're gratuitously injecting Postgres-specific syntax into what might otherwise be a perfectly spec-compliant view definition. However, the real reason for addressing it right now is to support a planned change in the semantics of EXTRACT() a/k/a date_part(). When we switch that to returning numeric, we'll have the parser translate EXTRACT() to some new function name (might as well be "extract" if you ask me) and then teach ruleutils.c to reverse-list that per SQL spec. In this way existing calls to date_part() will continue to have the old semantics. To implement this, invent a new CoercionForm value COERCE_SQL_SYNTAX, and make the parser insert that rather than COERCE_EXPLICIT_CALL when the input has SQL-spec decoration. (But if the input has the form of a plain function call, continue to mark it COERCE_EXPLICIT_CALL, even if it's calling one of these functions.) Then ruleutils.c recognizes COERCE_SQL_SYNTAX as a cue to emit SQL call syntax. It can know which decoration to emit using hard-wired knowledge about the functions that could be called this way. (While this solution isn't extensible without manual additions, neither is the grammar, so this doesn't seem unmaintainable.) Notice that this solution will reverse-list a function call with SQL decoration only if it was entered that way; so dump-and-reload will not by itself produce any changes in the appearance of views. This requires adding a CoercionForm field to struct FuncCall. (I couldn't resist the temptation to rearrange that struct's field order a tad while I was at it.) FuncCall doesn't appear in stored rules, so that change isn't a reason for a catversion bump, but I did one anyway because the new enum value for CoercionForm fields could confuse old backend code. Possible future work: * Perhaps CoercionForm should now be renamed to DisplayForm, or something like that, to reflect its more general meaning. This'd require touching a couple hundred places, so it's not clear it's worth the code churn. * The SQLValueFunction node type, which was invented partly for the same goal of improving SQL-compatibility of view output, could perhaps be replaced with regular function calls marked with COERCE_SQL_SYNTAX. It's unclear if this would be a net code savings, however. Discussion: https://postgr.es/m/42b73d2d-da12-ba9f-570a-420e0cce19d9@phystech.edu
2020-11-04 18:34:50 +01:00
case F_RTRIM_TEXT:
case F_RTRIM_TEXT_TEXT:
/* TRIM() */
appendStringInfoString(buf, "TRIM(TRAILING");
if (list_length(expr->args) == 2)
{
appendStringInfoChar(buf, ' ');
get_rule_expr((Node *) lsecond(expr->args), context, false);
}
appendStringInfoString(buf, " FROM ");
get_rule_expr((Node *) linitial(expr->args), context, false);
appendStringInfoChar(buf, ')');
return true;
case F_SYSTEM_USER:
appendStringInfoString(buf, "SYSTEM_USER");
return true;
Improve our ability to regurgitate SQL-syntax function calls. The SQL spec calls out nonstandard syntax for certain function calls, for example substring() with numeric position info is supposed to be spelled "SUBSTRING(string FROM start FOR count)". We accept many of these things, but up to now would not print them in the same format, instead simplifying down to "substring"(string, start, count). That's long annoyed me because it creates an interoperability problem: we're gratuitously injecting Postgres-specific syntax into what might otherwise be a perfectly spec-compliant view definition. However, the real reason for addressing it right now is to support a planned change in the semantics of EXTRACT() a/k/a date_part(). When we switch that to returning numeric, we'll have the parser translate EXTRACT() to some new function name (might as well be "extract" if you ask me) and then teach ruleutils.c to reverse-list that per SQL spec. In this way existing calls to date_part() will continue to have the old semantics. To implement this, invent a new CoercionForm value COERCE_SQL_SYNTAX, and make the parser insert that rather than COERCE_EXPLICIT_CALL when the input has SQL-spec decoration. (But if the input has the form of a plain function call, continue to mark it COERCE_EXPLICIT_CALL, even if it's calling one of these functions.) Then ruleutils.c recognizes COERCE_SQL_SYNTAX as a cue to emit SQL call syntax. It can know which decoration to emit using hard-wired knowledge about the functions that could be called this way. (While this solution isn't extensible without manual additions, neither is the grammar, so this doesn't seem unmaintainable.) Notice that this solution will reverse-list a function call with SQL decoration only if it was entered that way; so dump-and-reload will not by itself produce any changes in the appearance of views. This requires adding a CoercionForm field to struct FuncCall. (I couldn't resist the temptation to rearrange that struct's field order a tad while I was at it.) FuncCall doesn't appear in stored rules, so that change isn't a reason for a catversion bump, but I did one anyway because the new enum value for CoercionForm fields could confuse old backend code. Possible future work: * Perhaps CoercionForm should now be renamed to DisplayForm, or something like that, to reflect its more general meaning. This'd require touching a couple hundred places, so it's not clear it's worth the code churn. * The SQLValueFunction node type, which was invented partly for the same goal of improving SQL-compatibility of view output, could perhaps be replaced with regular function calls marked with COERCE_SQL_SYNTAX. It's unclear if this would be a net code savings, however. Discussion: https://postgr.es/m/42b73d2d-da12-ba9f-570a-420e0cce19d9@phystech.edu
2020-11-04 18:34:50 +01:00
case F_XMLEXISTS:
/* XMLEXISTS ... extra parens because args are c_expr */
appendStringInfoString(buf, "XMLEXISTS((");
get_rule_expr((Node *) linitial(expr->args), context, false);
appendStringInfoString(buf, ") PASSING (");
get_rule_expr((Node *) lsecond(expr->args), context, false);
appendStringInfoString(buf, "))");
return true;
}
return false;
}
/* ----------
* get_coercion_expr
*
* Make a string representation of a value coerced to a specific type
* ----------
*/
static void
get_coercion_expr(Node *arg, deparse_context *context,
Oid resulttype, int32 resulttypmod,
Node *parentNode)
{
StringInfo buf = context->buf;
/*
* Since parse_coerce.c doesn't immediately collapse application of
* length-coercion functions to constants, what we'll typically see in
* such cases is a Const with typmod -1 and a length-coercion function
* right above it. Avoid generating redundant output. However, beware of
* suppressing casts when the user actually wrote something like
* 'foo'::text::char(3).
*
* Note: it might seem that we are missing the possibility of needing to
* print a COLLATE clause for such a Const. However, a Const could only
* have nondefault collation in a post-constant-folding tree, in which the
* length coercion would have been folded too. See also the special
* handling of CollateExpr in coerce_to_target_type(): any collation
* marking will be above the coercion node, not below it.
*/
if (arg && IsA(arg, Const) &&
((Const *) arg)->consttype == resulttype &&
((Const *) arg)->consttypmod == -1)
{
/* Show the constant without normal ::typename decoration */
get_const_expr((Const *) arg, context, -1);
}
else
{
if (!PRETTY_PAREN(context))
appendStringInfoChar(buf, '(');
get_rule_expr_paren(arg, context, false, parentNode);
if (!PRETTY_PAREN(context))
appendStringInfoChar(buf, ')');
}
/*
* Never emit resulttype(arg) functional notation. A pg_proc entry could
* take precedence, and a resulttype in pg_temp would require schema
* qualification that format_type_with_typemod() would usually omit. We've
* standardized on arg::resulttype, but CAST(arg AS resulttype) notation
* would work fine.
*/
appendStringInfo(buf, "::%s",
format_type_with_typemod(resulttype, resulttypmod));
}
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
/* ----------
* get_const_expr
*
* Make a string representation of a Const
*
* showtype can be -1 to never show "::typename" decoration, or +1 to always
* show it, or 0 to show it only if the constant wouldn't be assumed to be
* the right type by default.
*
* If the Const's collation isn't default for its type, show that too.
* We mustn't do this when showtype is -1 (since that means the caller will
* print "::typename", and we can't put a COLLATE clause in between). It's
* caller's responsibility that collation isn't missed in such cases.
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
* ----------
*/
static void
get_const_expr(Const *constval, deparse_context *context, int showtype)
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
{
StringInfo buf = context->buf;
Oid typoutput;
bool typIsVarlena;
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
char *extval;
bool needlabel = false;
if (constval->constisnull)
{
/*
* Always label the type of a NULL constant to prevent misdecisions
* about type when reparsing.
*/
appendStringInfoString(buf, "NULL");
if (showtype >= 0)
{
appendStringInfo(buf, "::%s",
format_type_with_typemod(constval->consttype,
constval->consttypmod));
get_const_collation(constval, context);
}
return;
}
getTypeOutputInfo(constval->consttype,
&typoutput, &typIsVarlena);
extval = OidOutputFunctionCall(typoutput, constval->constvalue);
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
switch (constval->consttype)
{
case INT4OID:
/*
* INT4 can be printed without any decoration, unless it is
* negative; in that case print it as '-nnn'::integer to ensure
* that the output will re-parse as a constant, not as a constant
* plus operator. In most cases we could get away with printing
* (-nnn) instead, because of the way that gram.y handles negative
* literals; but that doesn't work for INT_MIN, and it doesn't
* seem that much prettier anyway.
*/
if (extval[0] != '-')
appendStringInfoString(buf, extval);
else
{
appendStringInfo(buf, "'%s'", extval);
needlabel = true; /* we must attach a cast */
}
break;
case NUMERICOID:
/*
* NUMERIC can be printed without quotes if it looks like a float
* constant (not an integer, and not Infinity or NaN) and doesn't
* have a leading sign (for the same reason as for INT4).
*/
if (isdigit((unsigned char) extval[0]) &&
strcspn(extval, "eE.") != strlen(extval))
{
appendStringInfoString(buf, extval);
}
else
{
appendStringInfo(buf, "'%s'", extval);
needlabel = true; /* we must attach a cast */
}
break;
case BOOLOID:
if (strcmp(extval, "t") == 0)
appendStringInfoString(buf, "true");
else
appendStringInfoString(buf, "false");
break;
default:
simple_quote_literal(buf, extval);
break;
}
pfree(extval);
if (showtype < 0)
return;
/*
* For showtype == 0, append ::typename unless the constant will be
* implicitly typed as the right type when it is read in.
*
* XXX this code has to be kept in sync with the behavior of the parser,
* especially make_const.
*/
switch (constval->consttype)
{
case BOOLOID:
case UNKNOWNOID:
/* These types can be left unlabeled */
needlabel = false;
break;
case INT4OID:
/* We determined above whether a label is needed */
break;
case NUMERICOID:
2007-11-15 22:14:46 +01:00
/*
* Float-looking constants will be typed as numeric, which we
* checked above; but if there's a nondefault typmod we need to
* show it.
*/
needlabel |= (constval->consttypmod >= 0);
break;
default:
needlabel = true;
break;
}
if (needlabel || showtype > 0)
appendStringInfo(buf, "::%s",
format_type_with_typemod(constval->consttype,
constval->consttypmod));
get_const_collation(constval, context);
}
/*
* helper for get_const_expr: append COLLATE if needed
*/
static void
get_const_collation(Const *constval, deparse_context *context)
{
StringInfo buf = context->buf;
if (OidIsValid(constval->constcollid))
{
Oid typcollation = get_typcollation(constval->consttype);
if (constval->constcollid != typcollation)
{
appendStringInfo(buf, " COLLATE %s",
generate_collation_name(constval->constcollid));
}
}
}
Add SQL/JSON query functions This introduces the following SQL/JSON functions for querying JSON data using jsonpath expressions: JSON_EXISTS(), which can be used to apply a jsonpath expression to a JSON value to check if it yields any values. JSON_QUERY(), which can be used to to apply a jsonpath expression to a JSON value to get a JSON object, an array, or a string. There are various options to control whether multi-value result uses array wrappers and whether the singleton scalar strings are quoted or not. JSON_VALUE(), which can be used to apply a jsonpath expression to a JSON value to return a single scalar value, producing an error if it multiple values are matched. Both JSON_VALUE() and JSON_QUERY() functions have options for handling EMPTY and ERROR conditions, which can be used to specify the behavior when no values are matched and when an error occurs during jsonpath evaluation, respectively. Author: Nikita Glukhov <n.gluhov@postgrespro.ru> Author: Teodor Sigaev <teodor@sigaev.ru> Author: Oleg Bartunov <obartunov@gmail.com> Author: Alexander Korotkov <aekorotkov@gmail.com> Author: Andrew Dunstan <andrew@dunslane.net> Author: Amit Langote <amitlangote09@gmail.com> Author: Peter Eisentraut <peter@eisentraut.org> Author: Jian He <jian.universality@gmail.com> Reviewers have included (in no particular order): Andres Freund, Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu, Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby, Álvaro Herrera, Jian He, Anton A. Melnikov, Nikita Malakhov, Peter Eisentraut, Tomas Vondra Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org Discussion: https://postgr.es/m/CA+HiwqHROpf9e644D8BRqYvaAPmgBZVup-xKMDPk-nd4EpgzHw@mail.gmail.com Discussion: https://postgr.es/m/CA+HiwqE4XTdfb1nW=Ojoy_tQSRhYt-q_kb6i5d4xcKyrLC1Nbg@mail.gmail.com
2024-03-21 09:06:27 +01:00
/*
* get_json_path_spec - Parse back a JSON path specification
*/
static void
get_json_path_spec(Node *path_spec, deparse_context *context, bool showimplicit)
{
if (IsA(path_spec, Const))
get_const_expr((Const *) path_spec, context, -1);
else
get_rule_expr(path_spec, context, showimplicit);
}
/*
* get_json_format - Parse back a JsonFormat node
*/
static void
get_json_format(JsonFormat *format, StringInfo buf)
{
if (format->format_type == JS_FORMAT_DEFAULT)
return;
appendStringInfoString(buf,
format->format_type == JS_FORMAT_JSONB ?
" FORMAT JSONB" : " FORMAT JSON");
if (format->encoding != JS_ENC_DEFAULT)
{
const char *encoding;
encoding =
format->encoding == JS_ENC_UTF16 ? "UTF16" :
format->encoding == JS_ENC_UTF32 ? "UTF32" : "UTF8";
appendStringInfo(buf, " ENCODING %s", encoding);
}
}
/*
* get_json_returning - Parse back a JsonReturning structure
*/
static void
get_json_returning(JsonReturning *returning, StringInfo buf,
bool json_format_by_default)
{
if (!OidIsValid(returning->typid))
return;
appendStringInfo(buf, " RETURNING %s",
format_type_with_typemod(returning->typid,
returning->typmod));
if (!json_format_by_default ||
returning->format->format_type !=
(returning->typid == JSONBOID ? JS_FORMAT_JSONB : JS_FORMAT_JSON))
get_json_format(returning->format, buf);
}
/*
* get_json_constructor - Parse back a JsonConstructorExpr node
*/
static void
get_json_constructor(JsonConstructorExpr *ctor, deparse_context *context,
bool showimplicit)
{
StringInfo buf = context->buf;
const char *funcname;
bool is_json_object;
int curridx;
ListCell *lc;
if (ctor->type == JSCTOR_JSON_OBJECTAGG)
{
get_json_agg_constructor(ctor, context, "JSON_OBJECTAGG", true);
return;
}
else if (ctor->type == JSCTOR_JSON_ARRAYAGG)
{
get_json_agg_constructor(ctor, context, "JSON_ARRAYAGG", false);
return;
}
switch (ctor->type)
{
case JSCTOR_JSON_OBJECT:
funcname = "JSON_OBJECT";
break;
case JSCTOR_JSON_ARRAY:
funcname = "JSON_ARRAY";
break;
2023-07-20 15:21:43 +02:00
case JSCTOR_JSON_PARSE:
funcname = "JSON";
break;
case JSCTOR_JSON_SCALAR:
funcname = "JSON_SCALAR";
break;
case JSCTOR_JSON_SERIALIZE:
funcname = "JSON_SERIALIZE";
break;
default:
elog(ERROR, "invalid JsonConstructorType %d", ctor->type);
}
appendStringInfo(buf, "%s(", funcname);
is_json_object = ctor->type == JSCTOR_JSON_OBJECT;
foreach(lc, ctor->args)
{
curridx = foreach_current_index(lc);
if (curridx > 0)
{
const char *sep;
sep = (is_json_object && (curridx % 2) != 0) ? " : " : ", ";
appendStringInfoString(buf, sep);
}
get_rule_expr((Node *) lfirst(lc), context, true);
}
get_json_constructor_options(ctor, buf);
appendStringInfoChar(buf, ')');
}
/*
* Append options, if any, to the JSON constructor being deparsed
*/
static void
get_json_constructor_options(JsonConstructorExpr *ctor, StringInfo buf)
{
if (ctor->absent_on_null)
{
if (ctor->type == JSCTOR_JSON_OBJECT ||
ctor->type == JSCTOR_JSON_OBJECTAGG)
appendStringInfoString(buf, " ABSENT ON NULL");
}
else
{
if (ctor->type == JSCTOR_JSON_ARRAY ||
ctor->type == JSCTOR_JSON_ARRAYAGG)
appendStringInfoString(buf, " NULL ON NULL");
}
if (ctor->unique)
appendStringInfoString(buf, " WITH UNIQUE KEYS");
2023-07-20 15:21:43 +02:00
/*
* Append RETURNING clause if needed; JSON() and JSON_SCALAR() don't
* support one.
*/
if (ctor->type != JSCTOR_JSON_PARSE && ctor->type != JSCTOR_JSON_SCALAR)
get_json_returning(ctor->returning, buf, true);
}
/*
* get_json_agg_constructor - Parse back an aggregate JsonConstructorExpr node
*/
static void
get_json_agg_constructor(JsonConstructorExpr *ctor, deparse_context *context,
const char *funcname, bool is_json_objectagg)
{
StringInfoData options;
initStringInfo(&options);
get_json_constructor_options(ctor, &options);
if (IsA(ctor->func, Aggref))
get_agg_expr_helper((Aggref *) ctor->func, context,
(Aggref *) ctor->func,
funcname, options.data, is_json_objectagg);
else if (IsA(ctor->func, WindowFunc))
get_windowfunc_expr_helper((WindowFunc *) ctor->func, context,
funcname, options.data,
is_json_objectagg);
else
elog(ERROR, "invalid JsonConstructorExpr underlying node type: %d",
nodeTag(ctor->func));
}
/*
* simple_quote_literal - Format a string as a SQL literal, append to buf
*/
static void
simple_quote_literal(StringInfo buf, const char *val)
{
const char *valptr;
/*
* We form the string literal according to the prevailing setting of
* standard_conforming_strings; we never use E''. User is responsible for
* making sure result is used correctly.
*/
appendStringInfoChar(buf, '\'');
for (valptr = val; *valptr; valptr++)
{
char ch = *valptr;
if (SQL_STR_DOUBLE(ch, !standard_conforming_strings))
appendStringInfoChar(buf, ch);
appendStringInfoChar(buf, ch);
}
appendStringInfoChar(buf, '\'');
}
/* ----------
* get_sublink_expr - Parse back a sublink
* ----------
*/
static void
get_sublink_expr(SubLink *sublink, deparse_context *context)
{
StringInfo buf = context->buf;
Query *query = (Query *) (sublink->subselect);
char *opname = NULL;
bool need_paren;
if (sublink->subLinkType == ARRAY_SUBLINK)
appendStringInfoString(buf, "ARRAY(");
else
appendStringInfoChar(buf, '(');
/*
* Note that we print the name of only the first operator, when there are
* multiple combining operators. This is an approximation that could go
* wrong in various scenarios (operators in different schemas, renamed
* operators, etc) but there is not a whole lot we can do about it, since
* the syntax allows only one operator to be shown.
*/
if (sublink->testexpr)
{
if (IsA(sublink->testexpr, OpExpr))
{
/* single combining operator */
OpExpr *opexpr = (OpExpr *) sublink->testexpr;
get_rule_expr(linitial(opexpr->args), context, true);
opname = generate_operator_name(opexpr->opno,
exprType(linitial(opexpr->args)),
exprType(lsecond(opexpr->args)));
}
else if (IsA(sublink->testexpr, BoolExpr))
{
/* multiple combining operators, = or <> cases */
char *sep;
ListCell *l;
appendStringInfoChar(buf, '(');
sep = "";
foreach(l, ((BoolExpr *) sublink->testexpr)->args)
{
OpExpr *opexpr = lfirst_node(OpExpr, l);
appendStringInfoString(buf, sep);
get_rule_expr(linitial(opexpr->args), context, true);
if (!opname)
opname = generate_operator_name(opexpr->opno,
exprType(linitial(opexpr->args)),
exprType(lsecond(opexpr->args)));
sep = ", ";
}
appendStringInfoChar(buf, ')');
}
else if (IsA(sublink->testexpr, RowCompareExpr))
{
/* multiple combining operators, < <= > >= cases */
RowCompareExpr *rcexpr = (RowCompareExpr *) sublink->testexpr;
appendStringInfoChar(buf, '(');
get_rule_expr((Node *) rcexpr->largs, context, true);
opname = generate_operator_name(linitial_oid(rcexpr->opnos),
exprType(linitial(rcexpr->largs)),
exprType(linitial(rcexpr->rargs)));
appendStringInfoChar(buf, ')');
}
else
elog(ERROR, "unrecognized testexpr type: %d",
(int) nodeTag(sublink->testexpr));
}
need_paren = true;
switch (sublink->subLinkType)
{
case EXISTS_SUBLINK:
appendStringInfoString(buf, "EXISTS ");
break;
case ANY_SUBLINK:
if (strcmp(opname, "=") == 0) /* Represent = ANY as IN */
appendStringInfoString(buf, " IN ");
else
appendStringInfo(buf, " %s ANY ", opname);
break;
case ALL_SUBLINK:
appendStringInfo(buf, " %s ALL ", opname);
break;
case ROWCOMPARE_SUBLINK:
appendStringInfo(buf, " %s ", opname);
break;
case EXPR_SUBLINK:
case MULTIEXPR_SUBLINK:
case ARRAY_SUBLINK:
need_paren = false;
break;
case CTE_SUBLINK: /* shouldn't occur in a SubLink */
default:
elog(ERROR, "unrecognized sublink type: %d",
(int) sublink->subLinkType);
break;
}
if (need_paren)
appendStringInfoChar(buf, '(');
get_query_def(query, buf, context->namespaces, NULL, false,
context->prettyFlags, context->wrapColumn,
context->indentLevel);
if (need_paren)
appendStringInfoString(buf, "))");
else
appendStringInfoChar(buf, ')');
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
}
/* ----------
Add basic JSON_TABLE() functionality JSON_TABLE() allows JSON data to be converted into a relational view and thus used, for example, in a FROM clause, like other tabular data. Data to show in the view is selected from a source JSON object using a JSON path expression to get a sequence of JSON objects that's called a "row pattern", which becomes the source to compute the SQL/JSON values that populate the view's output columns. Column values themselves are computed using JSON path expressions applied to each of the JSON objects comprising the "row pattern", for which the SQL/JSON query functions added in 6185c9737cf4 are used. To implement JSON_TABLE() as a table function, this augments the TableFunc and TableFuncScanState nodes that are currently used to support XMLTABLE() with some JSON_TABLE()-specific fields. Note that the JSON_TABLE() spec includes NESTED COLUMNS and PLAN clauses, which are required to provide more flexibility to extract data out of nested JSON objects, but they are not implemented here to keep this commit of manageable size. Author: Nikita Glukhov <n.gluhov@postgrespro.ru> Author: Teodor Sigaev <teodor@sigaev.ru> Author: Oleg Bartunov <obartunov@gmail.com> Author: Alexander Korotkov <aekorotkov@gmail.com> Author: Andrew Dunstan <andrew@dunslane.net> Author: Amit Langote <amitlangote09@gmail.com> Author: Jian He <jian.universality@gmail.com> Reviewers have included (in no particular order): Andres Freund, Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu, Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby, Álvaro Herrera, Jian He Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org Discussion: https://postgr.es/m/CA+HiwqE4XTdfb1nW=Ojoy_tQSRhYt-q_kb6i5d4xcKyrLC1Nbg@mail.gmail.com
2024-04-04 12:57:08 +02:00
* get_xmltable - Parse back a XMLTABLE function
* ----------
*/
static void
Add basic JSON_TABLE() functionality JSON_TABLE() allows JSON data to be converted into a relational view and thus used, for example, in a FROM clause, like other tabular data. Data to show in the view is selected from a source JSON object using a JSON path expression to get a sequence of JSON objects that's called a "row pattern", which becomes the source to compute the SQL/JSON values that populate the view's output columns. Column values themselves are computed using JSON path expressions applied to each of the JSON objects comprising the "row pattern", for which the SQL/JSON query functions added in 6185c9737cf4 are used. To implement JSON_TABLE() as a table function, this augments the TableFunc and TableFuncScanState nodes that are currently used to support XMLTABLE() with some JSON_TABLE()-specific fields. Note that the JSON_TABLE() spec includes NESTED COLUMNS and PLAN clauses, which are required to provide more flexibility to extract data out of nested JSON objects, but they are not implemented here to keep this commit of manageable size. Author: Nikita Glukhov <n.gluhov@postgrespro.ru> Author: Teodor Sigaev <teodor@sigaev.ru> Author: Oleg Bartunov <obartunov@gmail.com> Author: Alexander Korotkov <aekorotkov@gmail.com> Author: Andrew Dunstan <andrew@dunslane.net> Author: Amit Langote <amitlangote09@gmail.com> Author: Jian He <jian.universality@gmail.com> Reviewers have included (in no particular order): Andres Freund, Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu, Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby, Álvaro Herrera, Jian He Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org Discussion: https://postgr.es/m/CA+HiwqE4XTdfb1nW=Ojoy_tQSRhYt-q_kb6i5d4xcKyrLC1Nbg@mail.gmail.com
2024-04-04 12:57:08 +02:00
get_xmltable(TableFunc *tf, deparse_context *context, bool showimplicit)
{
StringInfo buf = context->buf;
appendStringInfoString(buf, "XMLTABLE(");
if (tf->ns_uris != NIL)
{
ListCell *lc1,
*lc2;
bool first = true;
appendStringInfoString(buf, "XMLNAMESPACES (");
forboth(lc1, tf->ns_uris, lc2, tf->ns_names)
{
Node *expr = (Node *) lfirst(lc1);
String *ns_node = lfirst_node(String, lc2);
if (!first)
appendStringInfoString(buf, ", ");
else
first = false;
if (ns_node != NULL)
{
get_rule_expr(expr, context, showimplicit);
appendStringInfo(buf, " AS %s", strVal(ns_node));
}
else
{
appendStringInfoString(buf, "DEFAULT ");
get_rule_expr(expr, context, showimplicit);
}
}
appendStringInfoString(buf, "), ");
}
appendStringInfoChar(buf, '(');
get_rule_expr((Node *) tf->rowexpr, context, showimplicit);
appendStringInfoString(buf, ") PASSING (");
get_rule_expr((Node *) tf->docexpr, context, showimplicit);
appendStringInfoChar(buf, ')');
if (tf->colexprs != NIL)
{
ListCell *l1;
ListCell *l2;
ListCell *l3;
ListCell *l4;
ListCell *l5;
int colnum = 0;
appendStringInfoString(buf, " COLUMNS ");
forfive(l1, tf->colnames, l2, tf->coltypes, l3, tf->coltypmods,
l4, tf->colexprs, l5, tf->coldefexprs)
{
char *colname = strVal(lfirst(l1));
Oid typid = lfirst_oid(l2);
int32 typmod = lfirst_int(l3);
Node *colexpr = (Node *) lfirst(l4);
Node *coldefexpr = (Node *) lfirst(l5);
bool ordinality = (tf->ordinalitycol == colnum);
bool notnull = bms_is_member(colnum, tf->notnulls);
if (colnum > 0)
appendStringInfoString(buf, ", ");
colnum++;
appendStringInfo(buf, "%s %s", quote_identifier(colname),
ordinality ? "FOR ORDINALITY" :
format_type_with_typemod(typid, typmod));
if (ordinality)
continue;
if (coldefexpr != NULL)
{
appendStringInfoString(buf, " DEFAULT (");
get_rule_expr((Node *) coldefexpr, context, showimplicit);
appendStringInfoChar(buf, ')');
}
if (colexpr != NULL)
{
appendStringInfoString(buf, " PATH (");
get_rule_expr((Node *) colexpr, context, showimplicit);
appendStringInfoChar(buf, ')');
}
if (notnull)
appendStringInfoString(buf, " NOT NULL");
}
}
appendStringInfoChar(buf, ')');
}
Add basic JSON_TABLE() functionality JSON_TABLE() allows JSON data to be converted into a relational view and thus used, for example, in a FROM clause, like other tabular data. Data to show in the view is selected from a source JSON object using a JSON path expression to get a sequence of JSON objects that's called a "row pattern", which becomes the source to compute the SQL/JSON values that populate the view's output columns. Column values themselves are computed using JSON path expressions applied to each of the JSON objects comprising the "row pattern", for which the SQL/JSON query functions added in 6185c9737cf4 are used. To implement JSON_TABLE() as a table function, this augments the TableFunc and TableFuncScanState nodes that are currently used to support XMLTABLE() with some JSON_TABLE()-specific fields. Note that the JSON_TABLE() spec includes NESTED COLUMNS and PLAN clauses, which are required to provide more flexibility to extract data out of nested JSON objects, but they are not implemented here to keep this commit of manageable size. Author: Nikita Glukhov <n.gluhov@postgrespro.ru> Author: Teodor Sigaev <teodor@sigaev.ru> Author: Oleg Bartunov <obartunov@gmail.com> Author: Alexander Korotkov <aekorotkov@gmail.com> Author: Andrew Dunstan <andrew@dunslane.net> Author: Amit Langote <amitlangote09@gmail.com> Author: Jian He <jian.universality@gmail.com> Reviewers have included (in no particular order): Andres Freund, Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu, Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby, Álvaro Herrera, Jian He Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org Discussion: https://postgr.es/m/CA+HiwqE4XTdfb1nW=Ojoy_tQSRhYt-q_kb6i5d4xcKyrLC1Nbg@mail.gmail.com
2024-04-04 12:57:08 +02:00
/*
* get_json_table_columns - Parse back JSON_TABLE columns
*/
static void
get_json_table_columns(TableFunc *tf, deparse_context *context,
bool showimplicit)
{
StringInfo buf = context->buf;
JsonExpr *jexpr = castNode(JsonExpr, tf->docexpr);
ListCell *lc_colname;
ListCell *lc_coltype;
ListCell *lc_coltypmod;
ListCell *lc_colvalexpr;
int colnum = 0;
appendStringInfoChar(buf, ' ');
appendContextKeyword(context, "COLUMNS (", 0, 0, 0);
if (PRETTY_INDENT(context))
context->indentLevel += PRETTYINDENT_VAR;
forfour(lc_colname, tf->colnames,
lc_coltype, tf->coltypes,
lc_coltypmod, tf->coltypmods,
lc_colvalexpr, tf->colvalexprs)
{
char *colname = strVal(lfirst(lc_colname));
JsonExpr *colexpr;
Oid typid;
int32 typmod;
bool ordinality;
JsonBehaviorType default_behavior;
typid = lfirst_oid(lc_coltype);
typmod = lfirst_int(lc_coltypmod);
colexpr = castNode(JsonExpr, lfirst(lc_colvalexpr));
if (colnum > 0)
appendStringInfoString(buf, ", ");
colnum++;
ordinality = !colexpr;
appendContextKeyword(context, "", 0, 0, 0);
appendStringInfo(buf, "%s %s", quote_identifier(colname),
ordinality ? "FOR ORDINALITY" :
format_type_with_typemod(typid, typmod));
if (ordinality)
continue;
if (colexpr->op == JSON_EXISTS_OP)
{
appendStringInfoString(buf, " EXISTS");
default_behavior = JSON_BEHAVIOR_FALSE;
}
else
{
if (colexpr->op == JSON_QUERY_OP)
{
char typcategory;
bool typispreferred;
get_type_category_preferred(typid, &typcategory, &typispreferred);
if (typcategory == TYPCATEGORY_STRING)
appendStringInfoString(buf,
colexpr->format->format_type == JS_FORMAT_JSONB ?
" FORMAT JSONB" : " FORMAT JSON");
}
default_behavior = JSON_BEHAVIOR_NULL;
}
if (jexpr->on_error->btype == JSON_BEHAVIOR_ERROR)
default_behavior = JSON_BEHAVIOR_ERROR;
appendStringInfoString(buf, " PATH ");
get_json_path_spec(colexpr->path_spec, context, showimplicit);
get_json_expr_options(colexpr, context, default_behavior);
}
if (PRETTY_INDENT(context))
context->indentLevel -= PRETTYINDENT_VAR;
appendContextKeyword(context, ")", 0, 0, 0);
}
/* ----------
* get_json_table - Parse back a JSON_TABLE function
* ----------
*/
static void
get_json_table(TableFunc *tf, deparse_context *context, bool showimplicit)
{
StringInfo buf = context->buf;
JsonExpr *jexpr = castNode(JsonExpr, tf->docexpr);
JsonTablePathScan *root = castNode(JsonTablePathScan, tf->plan);
appendStringInfoString(buf, "JSON_TABLE(");
if (PRETTY_INDENT(context))
context->indentLevel += PRETTYINDENT_VAR;
appendContextKeyword(context, "", 0, 0, 0);
get_rule_expr(jexpr->formatted_expr, context, showimplicit);
appendStringInfoString(buf, ", ");
get_const_expr(root->path->value, context, -1);
appendStringInfo(buf, " AS %s", quote_identifier(root->path->name));
if (jexpr->passing_values)
{
ListCell *lc1,
*lc2;
bool needcomma = false;
appendStringInfoChar(buf, ' ');
appendContextKeyword(context, "PASSING ", 0, 0, 0);
if (PRETTY_INDENT(context))
context->indentLevel += PRETTYINDENT_VAR;
forboth(lc1, jexpr->passing_names,
lc2, jexpr->passing_values)
{
if (needcomma)
appendStringInfoString(buf, ", ");
needcomma = true;
appendContextKeyword(context, "", 0, 0, 0);
get_rule_expr((Node *) lfirst(lc2), context, false);
appendStringInfo(buf, " AS %s",
quote_identifier((lfirst_node(String, lc1))->sval)
);
}
if (PRETTY_INDENT(context))
context->indentLevel -= PRETTYINDENT_VAR;
}
get_json_table_columns(tf, context, showimplicit);
if (jexpr->on_error->btype != JSON_BEHAVIOR_EMPTY)
get_json_behavior(jexpr->on_error, context, "ERROR");
if (PRETTY_INDENT(context))
context->indentLevel -= PRETTYINDENT_VAR;
appendContextKeyword(context, ")", 0, 0, 0);
}
/* ----------
* get_tablefunc - Parse back a table function
* ----------
*/
static void
get_tablefunc(TableFunc *tf, deparse_context *context, bool showimplicit)
{
/* XMLTABLE and JSON_TABLE are the only existing implementations. */
if (tf->functype == TFT_XMLTABLE)
get_xmltable(tf, context, showimplicit);
else if (tf->functype == TFT_JSON_TABLE)
get_json_table(tf, context, showimplicit);
}
/* ----------
* get_from_clause - Parse back a FROM clause
*
* "prefix" is the keyword that denotes the start of the list of FROM
* elements. It is FROM when used to parse back SELECT and UPDATE, but
* is USING when parsing back DELETE.
* ----------
*/
static void
get_from_clause(Query *query, const char *prefix, deparse_context *context)
{
StringInfo buf = context->buf;
bool first = true;
ListCell *l;
/*
* We use the query's jointree as a guide to what to print. However, we
* must ignore auto-added RTEs that are marked not inFromCl. (These can
* only appear at the top level of the jointree, so it's sufficient to
* check here.) This check also ensures we ignore the rule pseudo-RTEs
* for NEW and OLD.
*/
foreach(l, query->jointree->fromlist)
{
Node *jtnode = (Node *) lfirst(l);
if (IsA(jtnode, RangeTblRef))
{
int varno = ((RangeTblRef *) jtnode)->rtindex;
RangeTblEntry *rte = rt_fetch(varno, query->rtable);
if (!rte->inFromCl)
continue;
}
if (first)
{
appendContextKeyword(context, prefix,
-PRETTYINDENT_STD, PRETTYINDENT_STD, 2);
first = false;
get_from_clause_item(jtnode, query, context);
}
else
{
StringInfoData itembuf;
appendStringInfoString(buf, ", ");
/*
* Put the new FROM item's text into itembuf so we can decide
* after we've got it whether or not it needs to go on a new line.
*/
initStringInfo(&itembuf);
context->buf = &itembuf;
get_from_clause_item(jtnode, query, context);
/* Restore context's output buffer */
context->buf = buf;
/* Consider line-wrapping if enabled */
if (PRETTY_INDENT(context) && context->wrapColumn >= 0)
{
/* Does the new item start with a new line? */
if (itembuf.len > 0 && itembuf.data[0] == '\n')
{
/* If so, we shouldn't add anything */
/* instead, remove any trailing spaces currently in buf */
removeStringInfoSpaces(buf);
}
else
{
char *trailing_nl;
/* Locate the start of the current line in the buffer */
trailing_nl = strrchr(buf->data, '\n');
if (trailing_nl == NULL)
trailing_nl = buf->data;
else
trailing_nl++;
/*
* Add a newline, plus some indentation, if the new item
* would cause an overflow.
*/
if (strlen(trailing_nl) + itembuf.len > context->wrapColumn)
appendContextKeyword(context, "", -PRETTYINDENT_STD,
PRETTYINDENT_STD,
PRETTYINDENT_VAR);
}
}
/* Add the new item */
appendBinaryStringInfo(buf, itembuf.data, itembuf.len);
/* clean up */
pfree(itembuf.data);
}
}
}
static void
get_from_clause_item(Node *jtnode, Query *query, deparse_context *context)
{
StringInfo buf = context->buf;
deparse_namespace *dpns = (deparse_namespace *) linitial(context->namespaces);
if (IsA(jtnode, RangeTblRef))
{
int varno = ((RangeTblRef *) jtnode)->rtindex;
RangeTblEntry *rte = rt_fetch(varno, query->rtable);
deparse_columns *colinfo = deparse_columns_fetch(varno, dpns);
RangeTblFunction *rtfunc1 = NULL;
if (rte->lateral)
appendStringInfoString(buf, "LATERAL ");
/* Print the FROM item proper */
switch (rte->rtekind)
{
case RTE_RELATION:
/* Normal relation RTE */
appendStringInfo(buf, "%s%s",
only_marker(rte),
generate_relation_name(rte->relid,
context->namespaces));
break;
case RTE_SUBQUERY:
/* Subquery RTE */
appendStringInfoChar(buf, '(');
get_query_def(rte->subquery, buf, context->namespaces, NULL,
true,
context->prettyFlags, context->wrapColumn,
context->indentLevel);
appendStringInfoChar(buf, ')');
break;
case RTE_FUNCTION:
/* Function RTE */
rtfunc1 = (RangeTblFunction *) linitial(rte->functions);
/*
* Omit ROWS FROM() syntax for just one function, unless it
* has both a coldeflist and WITH ORDINALITY. If it has both,
* we must use ROWS FROM() syntax to avoid ambiguity about
* whether the coldeflist includes the ordinality column.
*/
if (list_length(rte->functions) == 1 &&
(rtfunc1->funccolnames == NIL || !rte->funcordinality))
{
get_rule_expr_funccall(rtfunc1->funcexpr, context, true);
/* we'll print the coldeflist below, if it has one */
}
else
{
bool all_unnest;
ListCell *lc;
/*
* If all the function calls in the list are to unnest,
* and none need a coldeflist, then collapse the list back
* down to UNNEST(args). (If we had more than one
* built-in unnest function, this would get more
* difficult.)
*
* XXX This is pretty ugly, since it makes not-terribly-
* future-proof assumptions about what the parser would do
* with the output; but the alternative is to emit our
* nonstandard ROWS FROM() notation for what might have
* been a perfectly spec-compliant multi-argument
* UNNEST().
*/
all_unnest = true;
foreach(lc, rte->functions)
{
RangeTblFunction *rtfunc = (RangeTblFunction *) lfirst(lc);
if (!IsA(rtfunc->funcexpr, FuncExpr) ||
((FuncExpr *) rtfunc->funcexpr)->funcid != F_UNNEST_ANYARRAY ||
rtfunc->funccolnames != NIL)
{
all_unnest = false;
break;
}
}
if (all_unnest)
{
List *allargs = NIL;
foreach(lc, rte->functions)
{
RangeTblFunction *rtfunc = (RangeTblFunction *) lfirst(lc);
List *args = ((FuncExpr *) rtfunc->funcexpr)->args;
allargs = list_concat(allargs, args);
}
appendStringInfoString(buf, "UNNEST(");
get_rule_expr((Node *) allargs, context, true);
appendStringInfoChar(buf, ')');
}
else
{
int funcno = 0;
appendStringInfoString(buf, "ROWS FROM(");
foreach(lc, rte->functions)
{
RangeTblFunction *rtfunc = (RangeTblFunction *) lfirst(lc);
if (funcno > 0)
appendStringInfoString(buf, ", ");
get_rule_expr_funccall(rtfunc->funcexpr, context, true);
if (rtfunc->funccolnames != NIL)
{
/* Reconstruct the column definition list */
appendStringInfoString(buf, " AS ");
get_from_clause_coldeflist(rtfunc,
NULL,
context);
}
funcno++;
}
appendStringInfoChar(buf, ')');
}
/* prevent printing duplicate coldeflist below */
rtfunc1 = NULL;
}
if (rte->funcordinality)
appendStringInfoString(buf, " WITH ORDINALITY");
break;
case RTE_TABLEFUNC:
get_tablefunc(rte->tablefunc, context, true);
break;
case RTE_VALUES:
/* Values list RTE */
appendStringInfoChar(buf, '(');
get_values_def(rte->values_lists, context);
appendStringInfoChar(buf, ')');
break;
case RTE_CTE:
appendStringInfoString(buf, quote_identifier(rte->ctename));
break;
default:
elog(ERROR, "unrecognized RTE kind: %d", (int) rte->rtekind);
break;
}
/* Print the relation alias, if needed */
get_rte_alias(rte, varno, false, context);
/* Print the column definitions or aliases, if needed */
if (rtfunc1 && rtfunc1->funccolnames != NIL)
{
/* Reconstruct the columndef list, which is also the aliases */
get_from_clause_coldeflist(rtfunc1, colinfo, context);
}
else
{
/* Else print column aliases as needed */
get_column_alias_list(colinfo, context);
}
Redesign tablesample method API, and do extensive code review. The original implementation of TABLESAMPLE modeled the tablesample method API on index access methods, which wasn't a good choice because, without specialized DDL commands, there's no way to build an extension that can implement a TSM. (Raw inserts into system catalogs are not an acceptable thing to do, because we can't undo them during DROP EXTENSION, nor will pg_upgrade behave sanely.) Instead adopt an API more like procedural language handlers or foreign data wrappers, wherein the only SQL-level support object needed is a single handler function identified by having a special return type. This lets us get rid of the supporting catalog altogether, so that no custom DDL support is needed for the feature. Adjust the API so that it can support non-constant tablesample arguments (the original coding assumed we could evaluate the argument expressions at ExecInitSampleScan time, which is undesirable even if it weren't outright unsafe), and discourage sampling methods from looking at invisible tuples. Make sure that the BERNOULLI and SYSTEM methods are genuinely repeatable within and across queries, as required by the SQL standard, and deal more honestly with methods that can't support that requirement. Make a full code-review pass over the tablesample additions, and fix assorted bugs, omissions, infelicities, and cosmetic issues (such as failure to put the added code stanzas in a consistent ordering). Improve EXPLAIN's output of tablesample plans, too. Back-patch to 9.5 so that we don't have to support the original API in production.
2015-07-25 20:39:00 +02:00
/* Tablesample clause must go after any alias */
if (rte->rtekind == RTE_RELATION && rte->tablesample)
get_tablesample_def(rte->tablesample, context);
}
else if (IsA(jtnode, JoinExpr))
{
JoinExpr *j = (JoinExpr *) jtnode;
deparse_columns *colinfo = deparse_columns_fetch(j->rtindex, dpns);
bool need_paren_on_right;
need_paren_on_right = PRETTY_PAREN(context) &&
!IsA(j->rarg, RangeTblRef) &&
!(IsA(j->rarg, JoinExpr) && ((JoinExpr *) j->rarg)->alias != NULL);
if (!PRETTY_PAREN(context) || j->alias != NULL)
appendStringInfoChar(buf, '(');
get_from_clause_item(j->larg, query, context);
switch (j->jointype)
{
case JOIN_INNER:
if (j->quals)
appendContextKeyword(context, " JOIN ",
-PRETTYINDENT_STD,
PRETTYINDENT_STD,
PRETTYINDENT_JOIN);
else
appendContextKeyword(context, " CROSS JOIN ",
-PRETTYINDENT_STD,
PRETTYINDENT_STD,
PRETTYINDENT_JOIN);
break;
case JOIN_LEFT:
appendContextKeyword(context, " LEFT JOIN ",
-PRETTYINDENT_STD,
PRETTYINDENT_STD,
PRETTYINDENT_JOIN);
break;
case JOIN_FULL:
appendContextKeyword(context, " FULL JOIN ",
-PRETTYINDENT_STD,
PRETTYINDENT_STD,
PRETTYINDENT_JOIN);
break;
case JOIN_RIGHT:
appendContextKeyword(context, " RIGHT JOIN ",
-PRETTYINDENT_STD,
PRETTYINDENT_STD,
PRETTYINDENT_JOIN);
break;
default:
elog(ERROR, "unrecognized join type: %d",
(int) j->jointype);
}
if (need_paren_on_right)
appendStringInfoChar(buf, '(');
get_from_clause_item(j->rarg, query, context);
if (need_paren_on_right)
appendStringInfoChar(buf, ')');
if (j->usingClause)
{
ListCell *lc;
bool first = true;
appendStringInfoString(buf, " USING (");
/* Use the assigned names, not what's in usingClause */
foreach(lc, colinfo->usingNames)
{
char *colname = (char *) lfirst(lc);
if (first)
first = false;
else
appendStringInfoString(buf, ", ");
appendStringInfoString(buf, quote_identifier(colname));
}
appendStringInfoChar(buf, ')');
if (j->join_using_alias)
appendStringInfo(buf, " AS %s",
quote_identifier(j->join_using_alias->aliasname));
}
else if (j->quals)
{
appendStringInfoString(buf, " ON ");
if (!PRETTY_PAREN(context))
appendStringInfoChar(buf, '(');
get_rule_expr(j->quals, context, false);
if (!PRETTY_PAREN(context))
appendStringInfoChar(buf, ')');
}
else if (j->jointype != JOIN_INNER)
{
/* If we didn't say CROSS JOIN above, we must provide an ON */
appendStringInfoString(buf, " ON TRUE");
}
if (!PRETTY_PAREN(context) || j->alias != NULL)
appendStringInfoChar(buf, ')');
/* Yes, it's correct to put alias after the right paren ... */
if (j->alias != NULL)
{
/*
* Note that it's correct to emit an alias clause if and only if
* there was one originally. Otherwise we'd be converting a named
* join to unnamed or vice versa, which creates semantic
* subtleties we don't want. However, we might print a different
* alias name than was there originally.
*/
appendStringInfo(buf, " %s",
quote_identifier(get_rtable_name(j->rtindex,
context)));
get_column_alias_list(colinfo, context);
}
}
else
elog(ERROR, "unrecognized node type: %d",
(int) nodeTag(jtnode));
}
/*
* get_rte_alias - print the relation's alias, if needed
*
* If printed, the alias is preceded by a space, or by " AS " if use_as is true.
*/
static void
get_rte_alias(RangeTblEntry *rte, int varno, bool use_as,
deparse_context *context)
{
deparse_namespace *dpns = (deparse_namespace *) linitial(context->namespaces);
char *refname = get_rtable_name(varno, context);
deparse_columns *colinfo = deparse_columns_fetch(varno, dpns);
bool printalias = false;
if (rte->alias != NULL)
{
/* Always print alias if user provided one */
printalias = true;
}
else if (colinfo->printaliases)
{
/* Always print alias if we need to print column aliases */
printalias = true;
}
else if (rte->rtekind == RTE_RELATION)
{
/*
* No need to print alias if it's same as relation name (this would
* normally be the case, but not if set_rtable_names had to resolve a
* conflict).
*/
if (strcmp(refname, get_relation_name(rte->relid)) != 0)
printalias = true;
}
else if (rte->rtekind == RTE_FUNCTION)
{
/*
* For a function RTE, always print alias. This covers possible
* renaming of the function and/or instability of the FigureColname
* rules for things that aren't simple functions. Note we'd need to
* force it anyway for the columndef list case.
*/
printalias = true;
}
else if (rte->rtekind == RTE_SUBQUERY ||
rte->rtekind == RTE_VALUES)
{
/*
* For a subquery, always print alias. This makes the output
* SQL-spec-compliant, even though we allow such aliases to be omitted
* on input.
*/
printalias = true;
}
else if (rte->rtekind == RTE_CTE)
{
/*
* No need to print alias if it's same as CTE name (this would
* normally be the case, but not if set_rtable_names had to resolve a
* conflict).
*/
if (strcmp(refname, rte->ctename) != 0)
printalias = true;
}
if (printalias)
appendStringInfo(context->buf, "%s%s",
use_as ? " AS " : " ",
quote_identifier(refname));
}
/*
* get_column_alias_list - print column alias list for an RTE
*
* Caller must already have printed the relation's alias name.
*/
static void
get_column_alias_list(deparse_columns *colinfo, deparse_context *context)
{
StringInfo buf = context->buf;
int i;
bool first = true;
/* Don't print aliases if not needed */
if (!colinfo->printaliases)
return;
for (i = 0; i < colinfo->num_new_cols; i++)
{
char *colname = colinfo->new_colnames[i];
if (first)
{
appendStringInfoChar(buf, '(');
first = false;
}
else
appendStringInfoString(buf, ", ");
appendStringInfoString(buf, quote_identifier(colname));
}
if (!first)
appendStringInfoChar(buf, ')');
}
/*
* get_from_clause_coldeflist - reproduce FROM clause coldeflist
*
* When printing a top-level coldeflist (which is syntactically also the
* relation's column alias list), use column names from colinfo. But when
* printing a coldeflist embedded inside ROWS FROM(), we prefer to use the
* original coldeflist's names, which are available in rtfunc->funccolnames.
* Pass NULL for colinfo to select the latter behavior.
*
* The coldeflist is appended immediately (no space) to buf. Caller is
* responsible for ensuring that an alias or AS is present before it.
*/
static void
get_from_clause_coldeflist(RangeTblFunction *rtfunc,
deparse_columns *colinfo,
deparse_context *context)
{
StringInfo buf = context->buf;
ListCell *l1;
ListCell *l2;
ListCell *l3;
ListCell *l4;
int i;
appendStringInfoChar(buf, '(');
i = 0;
forfour(l1, rtfunc->funccoltypes,
l2, rtfunc->funccoltypmods,
l3, rtfunc->funccolcollations,
l4, rtfunc->funccolnames)
{
Oid atttypid = lfirst_oid(l1);
int32 atttypmod = lfirst_int(l2);
Oid attcollation = lfirst_oid(l3);
char *attname;
if (colinfo)
attname = colinfo->colnames[i];
else
attname = strVal(lfirst(l4));
Assert(attname); /* shouldn't be any dropped columns here */
if (i > 0)
appendStringInfoString(buf, ", ");
appendStringInfo(buf, "%s %s",
quote_identifier(attname),
format_type_with_typemod(atttypid, atttypmod));
if (OidIsValid(attcollation) &&
attcollation != get_typcollation(atttypid))
appendStringInfo(buf, " COLLATE %s",
generate_collation_name(attcollation));
i++;
}
appendStringInfoChar(buf, ')');
}
Redesign tablesample method API, and do extensive code review. The original implementation of TABLESAMPLE modeled the tablesample method API on index access methods, which wasn't a good choice because, without specialized DDL commands, there's no way to build an extension that can implement a TSM. (Raw inserts into system catalogs are not an acceptable thing to do, because we can't undo them during DROP EXTENSION, nor will pg_upgrade behave sanely.) Instead adopt an API more like procedural language handlers or foreign data wrappers, wherein the only SQL-level support object needed is a single handler function identified by having a special return type. This lets us get rid of the supporting catalog altogether, so that no custom DDL support is needed for the feature. Adjust the API so that it can support non-constant tablesample arguments (the original coding assumed we could evaluate the argument expressions at ExecInitSampleScan time, which is undesirable even if it weren't outright unsafe), and discourage sampling methods from looking at invisible tuples. Make sure that the BERNOULLI and SYSTEM methods are genuinely repeatable within and across queries, as required by the SQL standard, and deal more honestly with methods that can't support that requirement. Make a full code-review pass over the tablesample additions, and fix assorted bugs, omissions, infelicities, and cosmetic issues (such as failure to put the added code stanzas in a consistent ordering). Improve EXPLAIN's output of tablesample plans, too. Back-patch to 9.5 so that we don't have to support the original API in production.
2015-07-25 20:39:00 +02:00
/*
* get_tablesample_def - print a TableSampleClause
*/
static void
get_tablesample_def(TableSampleClause *tablesample, deparse_context *context)
{
StringInfo buf = context->buf;
Oid argtypes[1];
int nargs;
ListCell *l;
/*
* We should qualify the handler's function name if it wouldn't be
* resolved by lookup in the current search path.
*/
argtypes[0] = INTERNALOID;
appendStringInfo(buf, " TABLESAMPLE %s (",
generate_function_name(tablesample->tsmhandler, 1,
NIL, argtypes,
false, NULL, EXPR_KIND_NONE));
nargs = 0;
foreach(l, tablesample->args)
{
if (nargs++ > 0)
appendStringInfoString(buf, ", ");
get_rule_expr((Node *) lfirst(l), context, false);
}
appendStringInfoChar(buf, ')');
if (tablesample->repeatable != NULL)
{
appendStringInfoString(buf, " REPEATABLE (");
get_rule_expr((Node *) tablesample->repeatable, context, false);
appendStringInfoChar(buf, ')');
}
}
/*
* get_opclass_name - fetch name of an index operator class
*
* The opclass name is appended (after a space) to buf.
*
* Output is suppressed if the opclass is the default for the given
* actual_datatype. (If you don't want this behavior, just pass
* InvalidOid for actual_datatype.)
*/
static void
get_opclass_name(Oid opclass, Oid actual_datatype,
StringInfo buf)
{
HeapTuple ht_opc;
Form_pg_opclass opcrec;
char *opcname;
char *nspname;
ht_opc = SearchSysCache1(CLAOID, ObjectIdGetDatum(opclass));
if (!HeapTupleIsValid(ht_opc))
elog(ERROR, "cache lookup failed for opclass %u", opclass);
opcrec = (Form_pg_opclass) GETSTRUCT(ht_opc);
if (!OidIsValid(actual_datatype) ||
GetDefaultOpClass(actual_datatype, opcrec->opcmethod) != opclass)
{
/* Okay, we need the opclass name. Do we need to qualify it? */
opcname = NameStr(opcrec->opcname);
if (OpclassIsVisible(opclass))
appendStringInfo(buf, " %s", quote_identifier(opcname));
else
{
nspname = get_namespace_name_or_temp(opcrec->opcnamespace);
appendStringInfo(buf, " %s.%s",
quote_identifier(nspname),
quote_identifier(opcname));
}
}
ReleaseSysCache(ht_opc);
}
Implement operator class parameters PostgreSQL provides set of template index access methods, where opclasses have much freedom in the semantics of indexing. These index AMs are GiST, GIN, SP-GiST and BRIN. There opclasses define representation of keys, operations on them and supported search strategies. So, it's natural that opclasses may be faced some tradeoffs, which require user-side decision. This commit implements opclass parameters allowing users to set some values, which tell opclass how to index the particular dataset. This commit doesn't introduce new storage in system catalog. Instead it uses pg_attribute.attoptions, which is used for table column storage options but unused for index attributes. In order to evade changing signature of each opclass support function, we implement unified way to pass options to opclass support functions. Options are set to fn_expr as the constant bytea expression. It's possible due to the fact that opclass support functions are executed outside of expressions, so fn_expr is unused for them. This commit comes with some examples of opclass options usage. We parametrize signature length in GiST. That applies to multiple opclasses: tsvector_ops, gist__intbig_ops, gist_ltree_ops, gist__ltree_ops, gist_trgm_ops and gist_hstore_ops. Also we parametrize maximum number of integer ranges for gist__int_ops. However, the main future usage of this feature is expected to be json, where users would be able to specify which way to index particular json parts. Catversion is bumped. Discussion: https://postgr.es/m/d22c3a18-31c7-1879-fc11-4c1ce2f5e5af%40postgrespro.ru Author: Nikita Glukhov, revised by me Reviwed-by: Nikolay Shaplov, Robert Haas, Tom Lane, Tomas Vondra, Alvaro Herrera
2020-03-30 18:17:11 +02:00
/*
* generate_opclass_name
* Compute the name to display for an opclass specified by OID
Implement operator class parameters PostgreSQL provides set of template index access methods, where opclasses have much freedom in the semantics of indexing. These index AMs are GiST, GIN, SP-GiST and BRIN. There opclasses define representation of keys, operations on them and supported search strategies. So, it's natural that opclasses may be faced some tradeoffs, which require user-side decision. This commit implements opclass parameters allowing users to set some values, which tell opclass how to index the particular dataset. This commit doesn't introduce new storage in system catalog. Instead it uses pg_attribute.attoptions, which is used for table column storage options but unused for index attributes. In order to evade changing signature of each opclass support function, we implement unified way to pass options to opclass support functions. Options are set to fn_expr as the constant bytea expression. It's possible due to the fact that opclass support functions are executed outside of expressions, so fn_expr is unused for them. This commit comes with some examples of opclass options usage. We parametrize signature length in GiST. That applies to multiple opclasses: tsvector_ops, gist__intbig_ops, gist_ltree_ops, gist__ltree_ops, gist_trgm_ops and gist_hstore_ops. Also we parametrize maximum number of integer ranges for gist__int_ops. However, the main future usage of this feature is expected to be json, where users would be able to specify which way to index particular json parts. Catversion is bumped. Discussion: https://postgr.es/m/d22c3a18-31c7-1879-fc11-4c1ce2f5e5af%40postgrespro.ru Author: Nikita Glukhov, revised by me Reviwed-by: Nikolay Shaplov, Robert Haas, Tom Lane, Tomas Vondra, Alvaro Herrera
2020-03-30 18:17:11 +02:00
*
* The result includes all necessary quoting and schema-prefixing.
*/
char *
generate_opclass_name(Oid opclass)
{
StringInfoData buf;
initStringInfo(&buf);
get_opclass_name(opclass, InvalidOid, &buf);
return &buf.data[1]; /* get_opclass_name() prepends space */
}
/*
* processIndirection - take care of array and subfield assignment
*
* We strip any top-level FieldStore or assignment SubscriptingRef nodes that
Make INSERT-from-multiple-VALUES-rows handle targetlist indirection better. Previously, if an INSERT with multiple rows of VALUES had indirection (array subscripting or field selection) in its target-columns list, the parser handled that by applying transformAssignedExpr() to each element of each VALUES row independently. This led to having ArrayRef assignment nodes or FieldStore nodes in each row of the VALUES RTE. That works for simple cases, but in bug #14265 Nuri Boardman points out that it fails if there are multiple assignments to elements/fields of the same target column. For such cases to work, rewriteTargetListIU() has to nest the ArrayRefs or FieldStores together to produce a single expression to be assigned to the column. But it failed to find them in the top-level targetlist and issued an error about "multiple assignments to same column". We could possibly fix this by teaching the rewriter to apply rewriteTargetListIU to each VALUES row separately, but that would be messy (it would change the output rowtype of the VALUES RTE, for example) and inefficient. Instead, let's fix the parser so that the VALUES RTE outputs are just the user-specified values, cast to the right type if necessary, and then the ArrayRefs or FieldStores are applied in the top-level targetlist to Vars representing the RTE's outputs. This is the same parsetree representation already used for similar cases with INSERT/SELECT syntax, so it allows simplifications in ruleutils.c, which no longer needs to treat INSERT-from-multiple-VALUES as its own special case. This implementation works by applying transformAssignedExpr to the VALUES entries as before, and then stripping off any ArrayRefs or FieldStores it adds. With lots of VALUES rows it would be noticeably more efficient to not add those nodes in the first place. But that's just an optimization not a bug fix, and there doesn't seem to be any good way to do it without significant refactoring. (A non-invasive answer would be to apply transformAssignedExpr + stripping to just the first VALUES row, and then just forcibly cast remaining rows to the same data types exposed in the first row. But this way would lead to different, not-INSERT-specific errors being reported in casting failure cases, so it doesn't seem very nice.) So leave that for later; this patch at least isn't making the per-row parsing work worse, and it does make the finished parsetree smaller, saving rewriter and planner work. Catversion bump because stored rules containing such INSERTs would need to change. Because of that, no back-patch, even though this is a very long-standing bug. Report: <20160727005725.7438.26021@wrigleys.postgresql.org> Discussion: <9578.1469645245@sss.pgh.pa.us>
2016-08-03 22:37:03 +02:00
* appear in the input, printing them as decoration for the base column
* name (which we assume the caller just printed). We might also need to
* strip CoerceToDomain nodes, but only ones that appear above assignment
* nodes.
*
* Returns the subexpression that's to be assigned.
*/
static Node *
Make INSERT-from-multiple-VALUES-rows handle targetlist indirection better. Previously, if an INSERT with multiple rows of VALUES had indirection (array subscripting or field selection) in its target-columns list, the parser handled that by applying transformAssignedExpr() to each element of each VALUES row independently. This led to having ArrayRef assignment nodes or FieldStore nodes in each row of the VALUES RTE. That works for simple cases, but in bug #14265 Nuri Boardman points out that it fails if there are multiple assignments to elements/fields of the same target column. For such cases to work, rewriteTargetListIU() has to nest the ArrayRefs or FieldStores together to produce a single expression to be assigned to the column. But it failed to find them in the top-level targetlist and issued an error about "multiple assignments to same column". We could possibly fix this by teaching the rewriter to apply rewriteTargetListIU to each VALUES row separately, but that would be messy (it would change the output rowtype of the VALUES RTE, for example) and inefficient. Instead, let's fix the parser so that the VALUES RTE outputs are just the user-specified values, cast to the right type if necessary, and then the ArrayRefs or FieldStores are applied in the top-level targetlist to Vars representing the RTE's outputs. This is the same parsetree representation already used for similar cases with INSERT/SELECT syntax, so it allows simplifications in ruleutils.c, which no longer needs to treat INSERT-from-multiple-VALUES as its own special case. This implementation works by applying transformAssignedExpr to the VALUES entries as before, and then stripping off any ArrayRefs or FieldStores it adds. With lots of VALUES rows it would be noticeably more efficient to not add those nodes in the first place. But that's just an optimization not a bug fix, and there doesn't seem to be any good way to do it without significant refactoring. (A non-invasive answer would be to apply transformAssignedExpr + stripping to just the first VALUES row, and then just forcibly cast remaining rows to the same data types exposed in the first row. But this way would lead to different, not-INSERT-specific errors being reported in casting failure cases, so it doesn't seem very nice.) So leave that for later; this patch at least isn't making the per-row parsing work worse, and it does make the finished parsetree smaller, saving rewriter and planner work. Catversion bump because stored rules containing such INSERTs would need to change. Because of that, no back-patch, even though this is a very long-standing bug. Report: <20160727005725.7438.26021@wrigleys.postgresql.org> Discussion: <9578.1469645245@sss.pgh.pa.us>
2016-08-03 22:37:03 +02:00
processIndirection(Node *node, deparse_context *context)
{
StringInfo buf = context->buf;
CoerceToDomain *cdomain = NULL;
for (;;)
{
if (node == NULL)
break;
if (IsA(node, FieldStore))
{
FieldStore *fstore = (FieldStore *) node;
Oid typrelid;
char *fieldname;
/* lookup tuple type */
typrelid = get_typ_typrelid(fstore->resulttype);
if (!OidIsValid(typrelid))
elog(ERROR, "argument type %s of FieldStore is not a tuple type",
format_type_be(fstore->resulttype));
2004-08-29 07:07:03 +02:00
/*
* Print the field name. There should only be one target field in
* stored rules. There could be more than that in executable
* target lists, but this function cannot be used for that case.
*/
Assert(list_length(fstore->fieldnums) == 1);
fieldname = get_attname(typrelid,
linitial_int(fstore->fieldnums), false);
Make INSERT-from-multiple-VALUES-rows handle targetlist indirection better. Previously, if an INSERT with multiple rows of VALUES had indirection (array subscripting or field selection) in its target-columns list, the parser handled that by applying transformAssignedExpr() to each element of each VALUES row independently. This led to having ArrayRef assignment nodes or FieldStore nodes in each row of the VALUES RTE. That works for simple cases, but in bug #14265 Nuri Boardman points out that it fails if there are multiple assignments to elements/fields of the same target column. For such cases to work, rewriteTargetListIU() has to nest the ArrayRefs or FieldStores together to produce a single expression to be assigned to the column. But it failed to find them in the top-level targetlist and issued an error about "multiple assignments to same column". We could possibly fix this by teaching the rewriter to apply rewriteTargetListIU to each VALUES row separately, but that would be messy (it would change the output rowtype of the VALUES RTE, for example) and inefficient. Instead, let's fix the parser so that the VALUES RTE outputs are just the user-specified values, cast to the right type if necessary, and then the ArrayRefs or FieldStores are applied in the top-level targetlist to Vars representing the RTE's outputs. This is the same parsetree representation already used for similar cases with INSERT/SELECT syntax, so it allows simplifications in ruleutils.c, which no longer needs to treat INSERT-from-multiple-VALUES as its own special case. This implementation works by applying transformAssignedExpr to the VALUES entries as before, and then stripping off any ArrayRefs or FieldStores it adds. With lots of VALUES rows it would be noticeably more efficient to not add those nodes in the first place. But that's just an optimization not a bug fix, and there doesn't seem to be any good way to do it without significant refactoring. (A non-invasive answer would be to apply transformAssignedExpr + stripping to just the first VALUES row, and then just forcibly cast remaining rows to the same data types exposed in the first row. But this way would lead to different, not-INSERT-specific errors being reported in casting failure cases, so it doesn't seem very nice.) So leave that for later; this patch at least isn't making the per-row parsing work worse, and it does make the finished parsetree smaller, saving rewriter and planner work. Catversion bump because stored rules containing such INSERTs would need to change. Because of that, no back-patch, even though this is a very long-standing bug. Report: <20160727005725.7438.26021@wrigleys.postgresql.org> Discussion: <9578.1469645245@sss.pgh.pa.us>
2016-08-03 22:37:03 +02:00
appendStringInfo(buf, ".%s", quote_identifier(fieldname));
2004-08-29 07:07:03 +02:00
/*
* We ignore arg since it should be an uninteresting reference to
* the target column or subcolumn.
*/
node = (Node *) linitial(fstore->newvals);
}
else if (IsA(node, SubscriptingRef))
{
SubscriptingRef *sbsref = (SubscriptingRef *) node;
2001-03-22 05:01:46 +01:00
if (sbsref->refassgnexpr == NULL)
break;
printSubscripts(sbsref, context);
2004-08-29 07:07:03 +02:00
/*
* We ignore refexpr since it should be an uninteresting reference
* to the target column or subcolumn.
*/
node = (Node *) sbsref->refassgnexpr;
}
else if (IsA(node, CoerceToDomain))
{
cdomain = (CoerceToDomain *) node;
/* If it's an explicit domain coercion, we're done */
if (cdomain->coercionformat != COERCE_IMPLICIT_CAST)
break;
/* Tentatively descend past the CoerceToDomain */
node = (Node *) cdomain->arg;
}
else
break;
}
/*
* If we descended past a CoerceToDomain whose argument turned out not to
* be a FieldStore or array assignment, back up to the CoerceToDomain.
* (This is not enough to be fully correct if there are nested implicit
* CoerceToDomains, but such cases shouldn't ever occur.)
*/
if (cdomain && node == (Node *) cdomain->arg)
node = (Node *) cdomain;
return node;
}
static void
printSubscripts(SubscriptingRef *sbsref, deparse_context *context)
{
StringInfo buf = context->buf;
ListCell *lowlist_item;
ListCell *uplist_item;
lowlist_item = list_head(sbsref->reflowerindexpr); /* could be NULL */
foreach(uplist_item, sbsref->refupperindexpr)
{
appendStringInfoChar(buf, '[');
if (lowlist_item)
{
/* If subexpression is NULL, get_rule_expr prints nothing */
get_rule_expr((Node *) lfirst(lowlist_item), context, false);
appendStringInfoChar(buf, ':');
Represent Lists as expansible arrays, not chains of cons-cells. Originally, Postgres Lists were a more or less exact reimplementation of Lisp lists, which consist of chains of separately-allocated cons cells, each having a value and a next-cell link. We'd hacked that once before (commit d0b4399d8) to add a separate List header, but the data was still in cons cells. That makes some operations -- notably list_nth() -- O(N), and it's bulky because of the next-cell pointers and per-cell palloc overhead, and it's very cache-unfriendly if the cons cells end up scattered around rather than being adjacent. In this rewrite, we still have List headers, but the data is in a resizable array of values, with no next-cell links. Now we need at most two palloc's per List, and often only one, since we can allocate some values in the same palloc call as the List header. (Of course, extending an existing List may require repalloc's to enlarge the array. But this involves just O(log N) allocations not O(N).) Of course this is not without downsides. The key difficulty is that addition or deletion of a list entry may now cause other entries to move, which it did not before. For example, that breaks foreach() and sister macros, which historically used a pointer to the current cons-cell as loop state. We can repair those macros transparently by making their actual loop state be an integer list index; the exposed "ListCell *" pointer is no longer state carried across loop iterations, but is just a derived value. (In practice, modern compilers can optimize things back to having just one loop state value, at least for simple cases with inline loop bodies.) In principle, this is a semantics change for cases where the loop body inserts or deletes list entries ahead of the current loop index; but I found no such cases in the Postgres code. The change is not at all transparent for code that doesn't use foreach() but chases lists "by hand" using lnext(). The largest share of such code in the backend is in loops that were maintaining "prev" and "next" variables in addition to the current-cell pointer, in order to delete list cells efficiently using list_delete_cell(). However, we no longer need a previous-cell pointer to delete a list cell efficiently. Keeping a next-cell pointer doesn't work, as explained above, but we can improve matters by changing such code to use a regular foreach() loop and then using the new macro foreach_delete_current() to delete the current cell. (This macro knows how to update the associated foreach loop's state so that no cells will be missed in the traversal.) There remains a nontrivial risk of code assuming that a ListCell * pointer will remain good over an operation that could now move the list contents. To help catch such errors, list.c can be compiled with a new define symbol DEBUG_LIST_MEMORY_USAGE that forcibly moves list contents whenever that could possibly happen. This makes list operations significantly more expensive so it's not normally turned on (though it is on by default if USE_VALGRIND is on). There are two notable API differences from the previous code: * lnext() now requires the List's header pointer in addition to the current cell's address. * list_delete_cell() no longer requires a previous-cell argument. These changes are somewhat unfortunate, but on the other hand code using either function needs inspection to see if it is assuming anything it shouldn't, so it's not all bad. Programmers should be aware of these significant performance changes: * list_nth() and related functions are now O(1); so there's no major access-speed difference between a list and an array. * Inserting or deleting a list element now takes time proportional to the distance to the end of the list, due to moving the array elements. (However, it typically *doesn't* require palloc or pfree, so except in long lists it's probably still faster than before.) Notably, lcons() used to be about the same cost as lappend(), but that's no longer true if the list is long. Code that uses lcons() and list_delete_first() to maintain a stack might usefully be rewritten to push and pop at the end of the list rather than the beginning. * There are now list_insert_nth...() and list_delete_nth...() functions that add or remove a list cell identified by index. These have the data-movement penalty explained above, but there's no search penalty. * list_concat() and variants now copy the second list's data into storage belonging to the first list, so there is no longer any sharing of cells between the input lists. The second argument is now declared "const List *" to reflect that it isn't changed. This patch just does the minimum needed to get the new implementation in place and fix bugs exposed by the regression tests. As suggested by the foregoing, there's a fair amount of followup work remaining to do. Also, the ENABLE_LIST_COMPAT macros are finally removed in this commit. Code using those should have been gone a dozen years ago. Patch by me; thanks to David Rowley, Jesper Pedersen, and others for review. Discussion: https://postgr.es/m/11587.1550975080@sss.pgh.pa.us
2019-07-15 19:41:58 +02:00
lowlist_item = lnext(sbsref->reflowerindexpr, lowlist_item);
}
/* If subexpression is NULL, get_rule_expr prints nothing */
get_rule_expr((Node *) lfirst(uplist_item), context, false);
appendStringInfoChar(buf, ']');
}
}
/*
* quote_identifier - Quote an identifier only if needed
*
* When quotes are needed, we palloc the required space; slightly
* space-wasteful but well worth it for notational simplicity.
*/
const char *
quote_identifier(const char *ident)
{
/*
* Can avoid quoting if ident starts with a lowercase letter or underscore
* and contains only lowercase letters, digits, and underscores, *and* is
* not any SQL keyword. Otherwise, supply quotes.
*/
int nquotes = 0;
bool safe;
const char *ptr;
char *result;
char *optr;
/*
* would like to use <ctype.h> macros here, but they might yield unwanted
* locale-specific results...
*/
safe = ((ident[0] >= 'a' && ident[0] <= 'z') || ident[0] == '_');
for (ptr = ident; *ptr; ptr++)
{
char ch = *ptr;
if ((ch >= 'a' && ch <= 'z') ||
(ch >= '0' && ch <= '9') ||
(ch == '_'))
{
/* okay */
}
else
{
safe = false;
if (ch == '"')
nquotes++;
}
}
if (quote_all_identifiers)
safe = false;
if (safe)
{
/*
* Check for keyword. We quote keywords except for unreserved ones.
* (In some cases we could avoid quoting a col_name or type_func_name
* keyword, but it seems much harder than it's worth to tell that.)
*
* Note: ScanKeywordLookup() does case-insensitive comparison, but
* that's fine, since we already know we have all-lower-case.
*/
Replace the data structure used for keyword lookup. Previously, ScanKeywordLookup was passed an array of string pointers. This had some performance deficiencies: the strings themselves might be scattered all over the place depending on the compiler (and some quick checking shows that at least with gcc-on-Linux, they indeed weren't reliably close together). That led to very cache-unfriendly behavior as the binary search touched strings in many different pages. Also, depending on the platform, the string pointers might need to be adjusted at program start, so that they couldn't be simple constant data. And the ScanKeyword struct had been designed with an eye to 32-bit machines originally; on 64-bit it requires 16 bytes per keyword, making it even more cache-unfriendly. Redesign so that the keyword strings themselves are allocated consecutively (as part of one big char-string constant), thereby eliminating the touch-lots-of-unrelated-pages syndrome. And get rid of the ScanKeyword array in favor of three separate arrays: uint16 offsets into the keyword array, uint16 token codes, and uint8 keyword categories. That reduces the overhead per keyword to 5 bytes instead of 16 (even less in programs that only need one of the token codes and categories); moreover, the binary search only touches the offsets array, further reducing its cache footprint. This also lets us put the token codes somewhere else than the keyword strings are, which avoids some unpleasant build dependencies. While we're at it, wrap the data used by ScanKeywordLookup into a struct that can be treated as an opaque type by most callers. That doesn't change things much right now, but it will make it less painful to switch to a hash-based lookup method, as is being discussed in the mailing list thread. Most of the change here is associated with adding a generator script that can build the new data structure from the same list-of-PG_KEYWORD header representation we used before. The PG_KEYWORD lists that plpgsql and ecpg used to embed in their scanner .c files have to be moved into headers, and the Makefiles have to be taught to invoke the generator script. This work is also necessary if we're to consider hash-based lookup, since the generator script is what would be responsible for constructing a hash table. Aside from saving a few kilobytes in each program that includes the keyword table, this seems to speed up raw parsing (flex+bison) by a few percent. So it's worth doing even as it stands, though we think we can gain even more with a follow-on patch to switch to hash-based lookup. John Naylor, with further hacking by me Discussion: https://postgr.es/m/CAJVSVGXdFVU2sgym89XPL=Lv1zOS5=EHHQ8XWNzFL=mTXkKMLw@mail.gmail.com
2019-01-06 23:02:57 +01:00
int kwnum = ScanKeywordLookup(ident, &ScanKeywords);
Replace the data structure used for keyword lookup. Previously, ScanKeywordLookup was passed an array of string pointers. This had some performance deficiencies: the strings themselves might be scattered all over the place depending on the compiler (and some quick checking shows that at least with gcc-on-Linux, they indeed weren't reliably close together). That led to very cache-unfriendly behavior as the binary search touched strings in many different pages. Also, depending on the platform, the string pointers might need to be adjusted at program start, so that they couldn't be simple constant data. And the ScanKeyword struct had been designed with an eye to 32-bit machines originally; on 64-bit it requires 16 bytes per keyword, making it even more cache-unfriendly. Redesign so that the keyword strings themselves are allocated consecutively (as part of one big char-string constant), thereby eliminating the touch-lots-of-unrelated-pages syndrome. And get rid of the ScanKeyword array in favor of three separate arrays: uint16 offsets into the keyword array, uint16 token codes, and uint8 keyword categories. That reduces the overhead per keyword to 5 bytes instead of 16 (even less in programs that only need one of the token codes and categories); moreover, the binary search only touches the offsets array, further reducing its cache footprint. This also lets us put the token codes somewhere else than the keyword strings are, which avoids some unpleasant build dependencies. While we're at it, wrap the data used by ScanKeywordLookup into a struct that can be treated as an opaque type by most callers. That doesn't change things much right now, but it will make it less painful to switch to a hash-based lookup method, as is being discussed in the mailing list thread. Most of the change here is associated with adding a generator script that can build the new data structure from the same list-of-PG_KEYWORD header representation we used before. The PG_KEYWORD lists that plpgsql and ecpg used to embed in their scanner .c files have to be moved into headers, and the Makefiles have to be taught to invoke the generator script. This work is also necessary if we're to consider hash-based lookup, since the generator script is what would be responsible for constructing a hash table. Aside from saving a few kilobytes in each program that includes the keyword table, this seems to speed up raw parsing (flex+bison) by a few percent. So it's worth doing even as it stands, though we think we can gain even more with a follow-on patch to switch to hash-based lookup. John Naylor, with further hacking by me Discussion: https://postgr.es/m/CAJVSVGXdFVU2sgym89XPL=Lv1zOS5=EHHQ8XWNzFL=mTXkKMLw@mail.gmail.com
2019-01-06 23:02:57 +01:00
if (kwnum >= 0 && ScanKeywordCategories[kwnum] != UNRESERVED_KEYWORD)
safe = false;
}
if (safe)
return ident; /* no change needed */
result = (char *) palloc(strlen(ident) + nquotes + 2 + 1);
optr = result;
*optr++ = '"';
for (ptr = ident; *ptr; ptr++)
{
char ch = *ptr;
if (ch == '"')
*optr++ = '"';
*optr++ = ch;
}
*optr++ = '"';
*optr = '\0';
return result;
}
This is the final state of the rule system for 6.4 after the patch is applied: Rewrite rules on relation level work fine now. Event qualifications on insert/update/delete rules work fine now. I added the new keyword OLD to reference the CURRENT tuple. CURRENT will be removed in 6.5. Update rules can reference NEW and OLD in the rule qualification and the actions. Insert/update/delete rules on views can be established to let them behave like real tables. For insert/update/delete rules multiple actions are supported now. The actions can also be surrounded by parantheses to make psql happy. Multiple actions are required if update to a view requires updates to multiple tables. Regular users are permitted to create/drop rules on tables they have RULE permissions for (DefineQueryRewrite() is now able to get around the access restrictions on pg_rewrite). This enables view creation for regular users too. This required an extra boolean parameter to pg_parse_and_plan() that tells to set skipAcl on all rangetable entries of the resulting queries. There is a new function pg_exec_query_acl_override() that could be used by backend utilities to use this facility. All rule actions (not only views) inherit the permissions of the event relations owner. Sample: User A creates tables T1 and T2, creates rules that log INSERT/UPDATE/DELETE on T1 in T2 (like in the regression tests for rules I created) and grants ALL but RULE on T1 to user B. User B can now fully access T1 and the logging happens in T2. But user B cannot access T2 at all, only the rule actions can. And due to missing RULE permissions on T1, user B cannot disable logging. Rules on the attribute level are disabled (they don't work properly and since regular users are now permitted to create rules I decided to disable them). Rules on select must have exactly one action that is a select (so select rules must be a view definition). UPDATE NEW/OLD rules are disabled (still broken, but triggers can do it). There are two new system views (pg_rule and pg_view) that show the definition of the rules or views so the db admin can see what the users do. They use two new functions pg_get_ruledef() and pg_get_viewdef() that are builtins. The functions pg_get_ruledef() and pg_get_viewdef() could be used to implement rule and view support in pg_dump. PostgreSQL is now the only database system I know, that has rewrite rules on the query level. All others (where I found a rule statement at all) use stored database procedures or the like (triggers as we call them) for active rules (as some call them). Future of the rule system: The now disabled parts of the rule system (attribute level, multiple actions on select and update new stuff) require a complete new rewrite handler from scratch. The old one is too badly wired up. After 6.4 I'll start to work on a new rewrite handler, that fully supports the attribute level rules, multiple actions on select and update new. This will be available for 6.5 so we get full rewrite rule capabilities. Jan
1998-08-24 03:38:11 +02:00
/*
* quote_qualified_identifier - Quote a possibly-qualified identifier
*
* Return a name of the form qualifier.ident, or just ident if qualifier
* is NULL, quoting each component if necessary. The result is palloc'd.
*/
char *
quote_qualified_identifier(const char *qualifier,
const char *ident)
{
StringInfoData buf;
initStringInfo(&buf);
if (qualifier)
appendStringInfo(&buf, "%s.", quote_identifier(qualifier));
appendStringInfoString(&buf, quote_identifier(ident));
return buf.data;
}
/*
* get_relation_name
* Get the unqualified name of a relation specified by OID
*
* This differs from the underlying get_rel_name() function in that it will
* throw error instead of silently returning NULL if the OID is bad.
*/
static char *
get_relation_name(Oid relid)
{
char *relname = get_rel_name(relid);
if (!relname)
elog(ERROR, "cache lookup failed for relation %u", relid);
return relname;
}
/*
* generate_relation_name
* Compute the name to display for a relation specified by OID
*
* The result includes all necessary quoting and schema-prefixing.
*
* If namespaces isn't NIL, it must be a list of deparse_namespace nodes.
* We will forcibly qualify the relation name if it equals any CTE name
* visible in the namespace list.
*/
static char *
generate_relation_name(Oid relid, List *namespaces)
{
HeapTuple tp;
Form_pg_class reltup;
bool need_qual;
ListCell *nslist;
char *relname;
char *nspname;
char *result;
tp = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
if (!HeapTupleIsValid(tp))
elog(ERROR, "cache lookup failed for relation %u", relid);
reltup = (Form_pg_class) GETSTRUCT(tp);
relname = NameStr(reltup->relname);
/* Check for conflicting CTE name */
need_qual = false;
foreach(nslist, namespaces)
{
deparse_namespace *dpns = (deparse_namespace *) lfirst(nslist);
ListCell *ctlist;
foreach(ctlist, dpns->ctes)
{
CommonTableExpr *cte = (CommonTableExpr *) lfirst(ctlist);
if (strcmp(cte->ctename, relname) == 0)
{
need_qual = true;
break;
}
}
if (need_qual)
break;
}
/* Otherwise, qualify the name if not visible in search path */
if (!need_qual)
need_qual = !RelationIsVisible(relid);
if (need_qual)
nspname = get_namespace_name_or_temp(reltup->relnamespace);
else
nspname = NULL;
result = quote_qualified_identifier(nspname, relname);
ReleaseSysCache(tp);
return result;
}
Fix handling of inherited check constraints in ALTER COLUMN TYPE (again). The previous way of reconstructing check constraints was to do a separate "ALTER TABLE ONLY tab ADD CONSTRAINT" for each table in an inheritance hierarchy. However, that way has no hope of reconstructing the check constraints' own inheritance properties correctly, as pointed out in bug #13779 from Jan Dirk Zijlstra. What we should do instead is to do a regular "ALTER TABLE", allowing recursion, at the topmost table that has a particular constraint, and then suppress the work queue entries for inherited instances of the constraint. Annoyingly, we'd tried to fix this behavior before, in commit 5ed6546cf, but we failed to notice that it wasn't reconstructing the pg_constraint field values correctly. As long as I'm touching pg_get_constraintdef_worker anyway, tweak it to always schema-qualify the target table name; this seems like useful backup to the protections installed by commit 5f173040. In HEAD/9.5, get rid of get_constraint_relation_oids, which is now unused. (I could alternatively have modified it to also return conislocal, but that seemed like a pretty single-purpose API, so let's not pretend it has some other use.) It's unused in the back branches as well, but I left it in place just in case some third-party code has decided to use it. In HEAD/9.5, also rename pg_get_constraintdef_string to pg_get_constraintdef_command, as the previous name did nothing to explain what that entry point did differently from others (and its comment was equally useless). Again, that change doesn't seem like material for back-patching. I did a bit of re-pgindenting in tablecmds.c in HEAD/9.5, as well. Otherwise, back-patch to all supported branches.
2015-11-20 20:55:28 +01:00
/*
* generate_qualified_relation_name
* Compute the name to display for a relation specified by OID
*
* As above, but unconditionally schema-qualify the name.
*/
static char *
generate_qualified_relation_name(Oid relid)
{
HeapTuple tp;
Form_pg_class reltup;
char *relname;
char *nspname;
char *result;
tp = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
if (!HeapTupleIsValid(tp))
elog(ERROR, "cache lookup failed for relation %u", relid);
reltup = (Form_pg_class) GETSTRUCT(tp);
relname = NameStr(reltup->relname);
nspname = get_namespace_name_or_temp(reltup->relnamespace);
Fix handling of inherited check constraints in ALTER COLUMN TYPE (again). The previous way of reconstructing check constraints was to do a separate "ALTER TABLE ONLY tab ADD CONSTRAINT" for each table in an inheritance hierarchy. However, that way has no hope of reconstructing the check constraints' own inheritance properties correctly, as pointed out in bug #13779 from Jan Dirk Zijlstra. What we should do instead is to do a regular "ALTER TABLE", allowing recursion, at the topmost table that has a particular constraint, and then suppress the work queue entries for inherited instances of the constraint. Annoyingly, we'd tried to fix this behavior before, in commit 5ed6546cf, but we failed to notice that it wasn't reconstructing the pg_constraint field values correctly. As long as I'm touching pg_get_constraintdef_worker anyway, tweak it to always schema-qualify the target table name; this seems like useful backup to the protections installed by commit 5f173040. In HEAD/9.5, get rid of get_constraint_relation_oids, which is now unused. (I could alternatively have modified it to also return conislocal, but that seemed like a pretty single-purpose API, so let's not pretend it has some other use.) It's unused in the back branches as well, but I left it in place just in case some third-party code has decided to use it. In HEAD/9.5, also rename pg_get_constraintdef_string to pg_get_constraintdef_command, as the previous name did nothing to explain what that entry point did differently from others (and its comment was equally useless). Again, that change doesn't seem like material for back-patching. I did a bit of re-pgindenting in tablecmds.c in HEAD/9.5, as well. Otherwise, back-patch to all supported branches.
2015-11-20 20:55:28 +01:00
if (!nspname)
elog(ERROR, "cache lookup failed for namespace %u",
reltup->relnamespace);
result = quote_qualified_identifier(nspname, relname);
ReleaseSysCache(tp);
return result;
}
/*
* generate_function_name
* Compute the name to display for a function specified by OID,
* given that it is being called with the specified actual arg names and
* types. (Those matter because of ambiguous-function resolution rules.)
*
* If we're dealing with a potentially variadic function (in practice, this
* means a FuncExpr or Aggref, not some other way of calling a function), then
Fix non-equivalence of VARIADIC and non-VARIADIC function call formats. For variadic functions (other than VARIADIC ANY), the syntaxes foo(x,y,...) and foo(VARIADIC ARRAY[x,y,...]) should be considered equivalent, since the former is converted to the latter at parse time. They have indeed been equivalent, in all releases before 9.3. However, commit 75b39e790 made an ill-considered decision to record which syntax had been used in FuncExpr nodes, and then to make equal() test that in checking node equality --- which caused the syntaxes to not be seen as equivalent by the planner. This is the underlying cause of bug #9817 from Dmitry Ryabov. It might seem that a quick fix would be to make equal() disregard FuncExpr.funcvariadic, but the same commit made that untenable, because the field actually *is* semantically significant for some VARIADIC ANY functions. This patch instead adopts the approach of redefining funcvariadic (and aggvariadic, in HEAD) as meaning that the last argument is a variadic array, whether it got that way by parser intervention or was supplied explicitly by the user. Therefore the value will always be true for non-ANY variadic functions, restoring the principle of equivalence. (However, the planner will continue to consider use of VARIADIC as a meaningful difference for VARIADIC ANY functions, even though some such functions might disregard it.) In HEAD, this change lets us simplify the decompilation logic in ruleutils.c, since the funcvariadic/aggvariadic flag tells directly whether to print VARIADIC. However, in 9.3 we have to continue to cope with existing stored rules/views that might contain the previous definition. Fortunately, this just means no change in ruleutils.c, since its existing behavior effectively ignores funcvariadic for all cases other than VARIADIC ANY functions. In HEAD, bump catversion to reflect the fact that FuncExpr.funcvariadic changed meanings; this is sort of pro forma, since I don't believe any built-in views are affected. Unfortunately, this patch doesn't magically fix everything for affected 9.3 users. After installing 9.3.5, they might need to recreate their rules/views/indexes containing variadic function calls in order to get everything consistent with the new definition. As in the cited bug, the symptom of a problem would be failure to use a nominally matching index that has a variadic function call in its definition. We'll need to mention this in the 9.3.5 release notes.
2014-04-04 04:02:24 +02:00
* has_variadic must specify whether variadic arguments have been merged,
* and *use_variadic_p will be set to indicate whether to print VARIADIC in
* the output. For non-FuncExpr cases, has_variadic should be false and
* use_variadic_p can be NULL.
*
* The result includes all necessary quoting and schema-prefixing.
*/
static char *
generate_function_name(Oid funcid, int nargs, List *argnames, Oid *argtypes,
Support GROUPING SETS, CUBE and ROLLUP. This SQL standard functionality allows to aggregate data by different GROUP BY clauses at once. Each grouping set returns rows with columns grouped by in other sets set to NULL. This could previously be achieved by doing each grouping as a separate query, conjoined by UNION ALLs. Besides being considerably more concise, grouping sets will in many cases be faster, requiring only one scan over the underlying data. The current implementation of grouping sets only supports using sorting for input. Individual sets that share a sort order are computed in one pass. If there are sets that don't share a sort order, additional sort & aggregation steps are performed. These additional passes are sourced by the previous sort step; thus avoiding repeated scans of the source data. The code is structured in a way that adding support for purely using hash aggregation or a mix of hashing and sorting is possible. Sorting was chosen to be supported first, as it is the most generic method of implementation. Instead of, as in an earlier versions of the patch, representing the chain of sort and aggregation steps as full blown planner and executor nodes, all but the first sort are performed inside the aggregation node itself. This avoids the need to do some unusual gymnastics to handle having to return aggregated and non-aggregated tuples from underlying nodes, as well as having to shut down underlying nodes early to limit memory usage. The optimizer still builds Sort/Agg node to describe each phase, but they're not part of the plan tree, but instead additional data for the aggregation node. They're a convenient and preexisting way to describe aggregation and sorting. The first (and possibly only) sort step is still performed as a separate execution step. That retains similarity with existing group by plans, makes rescans fairly simple, avoids very deep plans (leading to slow explains) and easily allows to avoid the sorting step if the underlying data is sorted by other means. A somewhat ugly side of this patch is having to deal with a grammar ambiguity between the new CUBE keyword and the cube extension/functions named cube (and rollup). To avoid breaking existing deployments of the cube extension it has not been renamed, neither has cube been made a reserved keyword. Instead precedence hacking is used to make GROUP BY cube(..) refer to the CUBE grouping sets feature, and not the function cube(). To actually group by a function cube(), unlikely as that might be, the function name has to be quoted. Needs a catversion bump because stored rules may change. Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
2015-05-16 03:40:59 +02:00
bool has_variadic, bool *use_variadic_p,
ParseExprKind special_exprkind)
{
char *result;
HeapTuple proctup;
Form_pg_proc procform;
char *proname;
bool use_variadic;
char *nspname;
FuncDetailCode p_result;
Oid p_funcid;
Oid p_rettype;
bool p_retset;
int p_nvargs;
Oid p_vatype;
Oid *p_true_typeids;
Support GROUPING SETS, CUBE and ROLLUP. This SQL standard functionality allows to aggregate data by different GROUP BY clauses at once. Each grouping set returns rows with columns grouped by in other sets set to NULL. This could previously be achieved by doing each grouping as a separate query, conjoined by UNION ALLs. Besides being considerably more concise, grouping sets will in many cases be faster, requiring only one scan over the underlying data. The current implementation of grouping sets only supports using sorting for input. Individual sets that share a sort order are computed in one pass. If there are sets that don't share a sort order, additional sort & aggregation steps are performed. These additional passes are sourced by the previous sort step; thus avoiding repeated scans of the source data. The code is structured in a way that adding support for purely using hash aggregation or a mix of hashing and sorting is possible. Sorting was chosen to be supported first, as it is the most generic method of implementation. Instead of, as in an earlier versions of the patch, representing the chain of sort and aggregation steps as full blown planner and executor nodes, all but the first sort are performed inside the aggregation node itself. This avoids the need to do some unusual gymnastics to handle having to return aggregated and non-aggregated tuples from underlying nodes, as well as having to shut down underlying nodes early to limit memory usage. The optimizer still builds Sort/Agg node to describe each phase, but they're not part of the plan tree, but instead additional data for the aggregation node. They're a convenient and preexisting way to describe aggregation and sorting. The first (and possibly only) sort step is still performed as a separate execution step. That retains similarity with existing group by plans, makes rescans fairly simple, avoids very deep plans (leading to slow explains) and easily allows to avoid the sorting step if the underlying data is sorted by other means. A somewhat ugly side of this patch is having to deal with a grammar ambiguity between the new CUBE keyword and the cube extension/functions named cube (and rollup). To avoid breaking existing deployments of the cube extension it has not been renamed, neither has cube been made a reserved keyword. Instead precedence hacking is used to make GROUP BY cube(..) refer to the CUBE grouping sets feature, and not the function cube(). To actually group by a function cube(), unlikely as that might be, the function name has to be quoted. Needs a catversion bump because stored rules may change. Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
2015-05-16 03:40:59 +02:00
bool force_qualify = false;
proctup = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcid));
if (!HeapTupleIsValid(proctup))
elog(ERROR, "cache lookup failed for function %u", funcid);
procform = (Form_pg_proc) GETSTRUCT(proctup);
proname = NameStr(procform->proname);
Support GROUPING SETS, CUBE and ROLLUP. This SQL standard functionality allows to aggregate data by different GROUP BY clauses at once. Each grouping set returns rows with columns grouped by in other sets set to NULL. This could previously be achieved by doing each grouping as a separate query, conjoined by UNION ALLs. Besides being considerably more concise, grouping sets will in many cases be faster, requiring only one scan over the underlying data. The current implementation of grouping sets only supports using sorting for input. Individual sets that share a sort order are computed in one pass. If there are sets that don't share a sort order, additional sort & aggregation steps are performed. These additional passes are sourced by the previous sort step; thus avoiding repeated scans of the source data. The code is structured in a way that adding support for purely using hash aggregation or a mix of hashing and sorting is possible. Sorting was chosen to be supported first, as it is the most generic method of implementation. Instead of, as in an earlier versions of the patch, representing the chain of sort and aggregation steps as full blown planner and executor nodes, all but the first sort are performed inside the aggregation node itself. This avoids the need to do some unusual gymnastics to handle having to return aggregated and non-aggregated tuples from underlying nodes, as well as having to shut down underlying nodes early to limit memory usage. The optimizer still builds Sort/Agg node to describe each phase, but they're not part of the plan tree, but instead additional data for the aggregation node. They're a convenient and preexisting way to describe aggregation and sorting. The first (and possibly only) sort step is still performed as a separate execution step. That retains similarity with existing group by plans, makes rescans fairly simple, avoids very deep plans (leading to slow explains) and easily allows to avoid the sorting step if the underlying data is sorted by other means. A somewhat ugly side of this patch is having to deal with a grammar ambiguity between the new CUBE keyword and the cube extension/functions named cube (and rollup). To avoid breaking existing deployments of the cube extension it has not been renamed, neither has cube been made a reserved keyword. Instead precedence hacking is used to make GROUP BY cube(..) refer to the CUBE grouping sets feature, and not the function cube(). To actually group by a function cube(), unlikely as that might be, the function name has to be quoted. Needs a catversion bump because stored rules may change. Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
2015-05-16 03:40:59 +02:00
/*
* Due to parser hacks to avoid needing to reserve CUBE, we need to force
* qualification in some special cases.
*/
if (special_exprkind == EXPR_KIND_GROUP_BY)
{
if (strcmp(proname, "cube") == 0 || strcmp(proname, "rollup") == 0)
force_qualify = true;
}
/*
* Determine whether VARIADIC should be printed. We must do this first
* since it affects the lookup rules in func_get_detail().
*
* We always print VARIADIC if the function has a merged variadic-array
* argument. Note that this is always the case for functions taking a
* VARIADIC argument type other than VARIADIC ANY. If we omitted VARIADIC
* and printed the array elements as separate arguments, the call could
* match a newer non-VARIADIC function.
*/
if (use_variadic_p)
{
Fix non-equivalence of VARIADIC and non-VARIADIC function call formats. For variadic functions (other than VARIADIC ANY), the syntaxes foo(x,y,...) and foo(VARIADIC ARRAY[x,y,...]) should be considered equivalent, since the former is converted to the latter at parse time. They have indeed been equivalent, in all releases before 9.3. However, commit 75b39e790 made an ill-considered decision to record which syntax had been used in FuncExpr nodes, and then to make equal() test that in checking node equality --- which caused the syntaxes to not be seen as equivalent by the planner. This is the underlying cause of bug #9817 from Dmitry Ryabov. It might seem that a quick fix would be to make equal() disregard FuncExpr.funcvariadic, but the same commit made that untenable, because the field actually *is* semantically significant for some VARIADIC ANY functions. This patch instead adopts the approach of redefining funcvariadic (and aggvariadic, in HEAD) as meaning that the last argument is a variadic array, whether it got that way by parser intervention or was supplied explicitly by the user. Therefore the value will always be true for non-ANY variadic functions, restoring the principle of equivalence. (However, the planner will continue to consider use of VARIADIC as a meaningful difference for VARIADIC ANY functions, even though some such functions might disregard it.) In HEAD, this change lets us simplify the decompilation logic in ruleutils.c, since the funcvariadic/aggvariadic flag tells directly whether to print VARIADIC. However, in 9.3 we have to continue to cope with existing stored rules/views that might contain the previous definition. Fortunately, this just means no change in ruleutils.c, since its existing behavior effectively ignores funcvariadic for all cases other than VARIADIC ANY functions. In HEAD, bump catversion to reflect the fact that FuncExpr.funcvariadic changed meanings; this is sort of pro forma, since I don't believe any built-in views are affected. Unfortunately, this patch doesn't magically fix everything for affected 9.3 users. After installing 9.3.5, they might need to recreate their rules/views/indexes containing variadic function calls in order to get everything consistent with the new definition. As in the cited bug, the symptom of a problem would be failure to use a nominally matching index that has a variadic function call in its definition. We'll need to mention this in the 9.3.5 release notes.
2014-04-04 04:02:24 +02:00
/* Parser should not have set funcvariadic unless fn is variadic */
Assert(!has_variadic || OidIsValid(procform->provariadic));
use_variadic = has_variadic;
*use_variadic_p = use_variadic;
}
else
{
Fix non-equivalence of VARIADIC and non-VARIADIC function call formats. For variadic functions (other than VARIADIC ANY), the syntaxes foo(x,y,...) and foo(VARIADIC ARRAY[x,y,...]) should be considered equivalent, since the former is converted to the latter at parse time. They have indeed been equivalent, in all releases before 9.3. However, commit 75b39e790 made an ill-considered decision to record which syntax had been used in FuncExpr nodes, and then to make equal() test that in checking node equality --- which caused the syntaxes to not be seen as equivalent by the planner. This is the underlying cause of bug #9817 from Dmitry Ryabov. It might seem that a quick fix would be to make equal() disregard FuncExpr.funcvariadic, but the same commit made that untenable, because the field actually *is* semantically significant for some VARIADIC ANY functions. This patch instead adopts the approach of redefining funcvariadic (and aggvariadic, in HEAD) as meaning that the last argument is a variadic array, whether it got that way by parser intervention or was supplied explicitly by the user. Therefore the value will always be true for non-ANY variadic functions, restoring the principle of equivalence. (However, the planner will continue to consider use of VARIADIC as a meaningful difference for VARIADIC ANY functions, even though some such functions might disregard it.) In HEAD, this change lets us simplify the decompilation logic in ruleutils.c, since the funcvariadic/aggvariadic flag tells directly whether to print VARIADIC. However, in 9.3 we have to continue to cope with existing stored rules/views that might contain the previous definition. Fortunately, this just means no change in ruleutils.c, since its existing behavior effectively ignores funcvariadic for all cases other than VARIADIC ANY functions. In HEAD, bump catversion to reflect the fact that FuncExpr.funcvariadic changed meanings; this is sort of pro forma, since I don't believe any built-in views are affected. Unfortunately, this patch doesn't magically fix everything for affected 9.3 users. After installing 9.3.5, they might need to recreate their rules/views/indexes containing variadic function calls in order to get everything consistent with the new definition. As in the cited bug, the symptom of a problem would be failure to use a nominally matching index that has a variadic function call in its definition. We'll need to mention this in the 9.3.5 release notes.
2014-04-04 04:02:24 +02:00
Assert(!has_variadic);
use_variadic = false;
}
/*
* The idea here is to schema-qualify only if the parser would fail to
* resolve the correct function given the unqualified func name with the
Support GROUPING SETS, CUBE and ROLLUP. This SQL standard functionality allows to aggregate data by different GROUP BY clauses at once. Each grouping set returns rows with columns grouped by in other sets set to NULL. This could previously be achieved by doing each grouping as a separate query, conjoined by UNION ALLs. Besides being considerably more concise, grouping sets will in many cases be faster, requiring only one scan over the underlying data. The current implementation of grouping sets only supports using sorting for input. Individual sets that share a sort order are computed in one pass. If there are sets that don't share a sort order, additional sort & aggregation steps are performed. These additional passes are sourced by the previous sort step; thus avoiding repeated scans of the source data. The code is structured in a way that adding support for purely using hash aggregation or a mix of hashing and sorting is possible. Sorting was chosen to be supported first, as it is the most generic method of implementation. Instead of, as in an earlier versions of the patch, representing the chain of sort and aggregation steps as full blown planner and executor nodes, all but the first sort are performed inside the aggregation node itself. This avoids the need to do some unusual gymnastics to handle having to return aggregated and non-aggregated tuples from underlying nodes, as well as having to shut down underlying nodes early to limit memory usage. The optimizer still builds Sort/Agg node to describe each phase, but they're not part of the plan tree, but instead additional data for the aggregation node. They're a convenient and preexisting way to describe aggregation and sorting. The first (and possibly only) sort step is still performed as a separate execution step. That retains similarity with existing group by plans, makes rescans fairly simple, avoids very deep plans (leading to slow explains) and easily allows to avoid the sorting step if the underlying data is sorted by other means. A somewhat ugly side of this patch is having to deal with a grammar ambiguity between the new CUBE keyword and the cube extension/functions named cube (and rollup). To avoid breaking existing deployments of the cube extension it has not been renamed, neither has cube been made a reserved keyword. Instead precedence hacking is used to make GROUP BY cube(..) refer to the CUBE grouping sets feature, and not the function cube(). To actually group by a function cube(), unlikely as that might be, the function name has to be quoted. Needs a catversion bump because stored rules may change. Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
2015-05-16 03:40:59 +02:00
* specified argtypes and VARIADIC flag. But if we already decided to
* force qualification, then we can skip the lookup and pretend we didn't
* find it.
*/
Support GROUPING SETS, CUBE and ROLLUP. This SQL standard functionality allows to aggregate data by different GROUP BY clauses at once. Each grouping set returns rows with columns grouped by in other sets set to NULL. This could previously be achieved by doing each grouping as a separate query, conjoined by UNION ALLs. Besides being considerably more concise, grouping sets will in many cases be faster, requiring only one scan over the underlying data. The current implementation of grouping sets only supports using sorting for input. Individual sets that share a sort order are computed in one pass. If there are sets that don't share a sort order, additional sort & aggregation steps are performed. These additional passes are sourced by the previous sort step; thus avoiding repeated scans of the source data. The code is structured in a way that adding support for purely using hash aggregation or a mix of hashing and sorting is possible. Sorting was chosen to be supported first, as it is the most generic method of implementation. Instead of, as in an earlier versions of the patch, representing the chain of sort and aggregation steps as full blown planner and executor nodes, all but the first sort are performed inside the aggregation node itself. This avoids the need to do some unusual gymnastics to handle having to return aggregated and non-aggregated tuples from underlying nodes, as well as having to shut down underlying nodes early to limit memory usage. The optimizer still builds Sort/Agg node to describe each phase, but they're not part of the plan tree, but instead additional data for the aggregation node. They're a convenient and preexisting way to describe aggregation and sorting. The first (and possibly only) sort step is still performed as a separate execution step. That retains similarity with existing group by plans, makes rescans fairly simple, avoids very deep plans (leading to slow explains) and easily allows to avoid the sorting step if the underlying data is sorted by other means. A somewhat ugly side of this patch is having to deal with a grammar ambiguity between the new CUBE keyword and the cube extension/functions named cube (and rollup). To avoid breaking existing deployments of the cube extension it has not been renamed, neither has cube been made a reserved keyword. Instead precedence hacking is used to make GROUP BY cube(..) refer to the CUBE grouping sets feature, and not the function cube(). To actually group by a function cube(), unlikely as that might be, the function name has to be quoted. Needs a catversion bump because stored rules may change. Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
2015-05-16 03:40:59 +02:00
if (!force_qualify)
p_result = func_get_detail(list_make1(makeString(proname)),
NIL, argnames, nargs, argtypes,
Reconsider the handling of procedure OUT parameters. Commit 2453ea142 redefined pg_proc.proargtypes to include the types of OUT parameters, for procedures only. While that had some advantages for implementing the SQL-spec behavior of DROP PROCEDURE, it was pretty disastrous from a number of other perspectives. Notably, since the primary key of pg_proc is name + proargtypes, this made it possible to have multiple procedures with identical names + input arguments and differing output argument types. That would make it impossible to call any one of the procedures by writing just NULL (or "?", or any other data-type-free notation) for the output argument(s). The change also seems likely to cause grave confusion for client applications that examine pg_proc and expect the traditional definition of proargtypes. Hence, revert the definition of proargtypes to what it was, and undo a number of complications that had been added to support that. To support the SQL-spec behavior of DROP PROCEDURE, when there are no argmode markers in the command's parameter list, we perform the lookup both ways (that is, matching against both proargtypes and proallargtypes), succeeding if we get just one unique match. In principle this could result in ambiguous-function failures that would not happen when using only one of the two rules. However, overloading of procedure names is thought to be a pretty rare usage, so this shouldn't cause many problems in practice. Postgres-specific code such as pg_dump can defend against any possibility of such failures by being careful to specify argmodes for all procedure arguments. This also fixes a few other bugs in the area of CALL statements with named parameters, and improves the documentation a little. catversion bump forced because the representation of procedures with OUT arguments changes. Discussion: https://postgr.es/m/3742981.1621533210@sss.pgh.pa.us
2021-06-10 23:11:36 +02:00
!use_variadic, true, false,
Support GROUPING SETS, CUBE and ROLLUP. This SQL standard functionality allows to aggregate data by different GROUP BY clauses at once. Each grouping set returns rows with columns grouped by in other sets set to NULL. This could previously be achieved by doing each grouping as a separate query, conjoined by UNION ALLs. Besides being considerably more concise, grouping sets will in many cases be faster, requiring only one scan over the underlying data. The current implementation of grouping sets only supports using sorting for input. Individual sets that share a sort order are computed in one pass. If there are sets that don't share a sort order, additional sort & aggregation steps are performed. These additional passes are sourced by the previous sort step; thus avoiding repeated scans of the source data. The code is structured in a way that adding support for purely using hash aggregation or a mix of hashing and sorting is possible. Sorting was chosen to be supported first, as it is the most generic method of implementation. Instead of, as in an earlier versions of the patch, representing the chain of sort and aggregation steps as full blown planner and executor nodes, all but the first sort are performed inside the aggregation node itself. This avoids the need to do some unusual gymnastics to handle having to return aggregated and non-aggregated tuples from underlying nodes, as well as having to shut down underlying nodes early to limit memory usage. The optimizer still builds Sort/Agg node to describe each phase, but they're not part of the plan tree, but instead additional data for the aggregation node. They're a convenient and preexisting way to describe aggregation and sorting. The first (and possibly only) sort step is still performed as a separate execution step. That retains similarity with existing group by plans, makes rescans fairly simple, avoids very deep plans (leading to slow explains) and easily allows to avoid the sorting step if the underlying data is sorted by other means. A somewhat ugly side of this patch is having to deal with a grammar ambiguity between the new CUBE keyword and the cube extension/functions named cube (and rollup). To avoid breaking existing deployments of the cube extension it has not been renamed, neither has cube been made a reserved keyword. Instead precedence hacking is used to make GROUP BY cube(..) refer to the CUBE grouping sets feature, and not the function cube(). To actually group by a function cube(), unlikely as that might be, the function name has to be quoted. Needs a catversion bump because stored rules may change. Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
2015-05-16 03:40:59 +02:00
&p_funcid, &p_rettype,
&p_retset, &p_nvargs, &p_vatype,
&p_true_typeids, NULL);
else
{
p_result = FUNCDETAIL_NOTFOUND;
p_funcid = InvalidOid;
}
if ((p_result == FUNCDETAIL_NORMAL ||
p_result == FUNCDETAIL_AGGREGATE ||
p_result == FUNCDETAIL_WINDOWFUNC) &&
p_funcid == funcid)
nspname = NULL;
else
nspname = get_namespace_name_or_temp(procform->pronamespace);
result = quote_qualified_identifier(nspname, proname);
ReleaseSysCache(proctup);
return result;
}
/*
* generate_operator_name
* Compute the name to display for an operator specified by OID,
* given that it is being called with the specified actual arg types.
* (Arg types matter because of ambiguous-operator resolution rules.
* Pass InvalidOid for unused arg of a unary operator.)
*
* The result includes all necessary quoting and schema-prefixing,
* plus the OPERATOR() decoration needed to use a qualified operator name
* in an expression.
*/
static char *
generate_operator_name(Oid operid, Oid arg1, Oid arg2)
{
StringInfoData buf;
HeapTuple opertup;
Form_pg_operator operform;
char *oprname;
char *nspname;
Operator p_result;
initStringInfo(&buf);
opertup = SearchSysCache1(OPEROID, ObjectIdGetDatum(operid));
if (!HeapTupleIsValid(opertup))
elog(ERROR, "cache lookup failed for operator %u", operid);
operform = (Form_pg_operator) GETSTRUCT(opertup);
oprname = NameStr(operform->oprname);
/*
* The idea here is to schema-qualify only if the parser would fail to
* resolve the correct operator given the unqualified op name with the
* specified argtypes.
*/
switch (operform->oprkind)
{
case 'b':
p_result = oper(NULL, list_make1(makeString(oprname)), arg1, arg2,
true, -1);
break;
case 'l':
p_result = left_oper(NULL, list_make1(makeString(oprname)), arg2,
true, -1);
break;
default:
elog(ERROR, "unrecognized oprkind: %d", operform->oprkind);
p_result = NULL; /* keep compiler quiet */
break;
}
if (p_result != NULL && oprid(p_result) == operid)
nspname = NULL;
else
{
nspname = get_namespace_name_or_temp(operform->oprnamespace);
appendStringInfo(&buf, "OPERATOR(%s.", quote_identifier(nspname));
}
appendStringInfoString(&buf, oprname);
if (nspname)
appendStringInfoChar(&buf, ')');
if (p_result != NULL)
ReleaseSysCache(p_result);
ReleaseSysCache(opertup);
return buf.data;
}
/*
* generate_operator_clause --- generate a binary-operator WHERE clause
*
* This is used for internally-generated-and-executed SQL queries, where
* precision is essential and readability is secondary. The basic
* requirement is to append "leftop op rightop" to buf, where leftop and
* rightop are given as strings and are assumed to yield types leftoptype
* and rightoptype; the operator is identified by OID. The complexity
* comes from needing to be sure that the parser will select the desired
* operator when the query is parsed. We always name the operator using
* OPERATOR(schema.op) syntax, so as to avoid search-path uncertainties.
* We have to emit casts too, if either input isn't already the input type
* of the operator; else we are at the mercy of the parser's heuristics for
* ambiguous-operator resolution. The caller must ensure that leftop and
* rightop are suitable arguments for a cast operation; it's best to insert
* parentheses if they aren't just variables or parameters.
*/
void
generate_operator_clause(StringInfo buf,
const char *leftop, Oid leftoptype,
Oid opoid,
const char *rightop, Oid rightoptype)
{
HeapTuple opertup;
Form_pg_operator operform;
char *oprname;
char *nspname;
opertup = SearchSysCache1(OPEROID, ObjectIdGetDatum(opoid));
if (!HeapTupleIsValid(opertup))
elog(ERROR, "cache lookup failed for operator %u", opoid);
operform = (Form_pg_operator) GETSTRUCT(opertup);
Assert(operform->oprkind == 'b');
oprname = NameStr(operform->oprname);
nspname = get_namespace_name(operform->oprnamespace);
appendStringInfoString(buf, leftop);
if (leftoptype != operform->oprleft)
add_cast_to(buf, operform->oprleft);
appendStringInfo(buf, " OPERATOR(%s.", quote_identifier(nspname));
appendStringInfoString(buf, oprname);
appendStringInfo(buf, ") %s", rightop);
if (rightoptype != operform->oprright)
add_cast_to(buf, operform->oprright);
ReleaseSysCache(opertup);
}
/*
* Add a cast specification to buf. We spell out the type name the hard way,
* intentionally not using format_type_be(). This is to avoid corner cases
* for CHARACTER, BIT, and perhaps other types, where specifying the type
* using SQL-standard syntax results in undesirable data truncation. By
* doing it this way we can be certain that the cast will have default (-1)
* target typmod.
*/
static void
add_cast_to(StringInfo buf, Oid typid)
{
HeapTuple typetup;
Form_pg_type typform;
char *typname;
char *nspname;
typetup = SearchSysCache1(TYPEOID, ObjectIdGetDatum(typid));
if (!HeapTupleIsValid(typetup))
elog(ERROR, "cache lookup failed for type %u", typid);
typform = (Form_pg_type) GETSTRUCT(typetup);
typname = NameStr(typform->typname);
nspname = get_namespace_name_or_temp(typform->typnamespace);
appendStringInfo(buf, "::%s.%s",
quote_identifier(nspname), quote_identifier(typname));
ReleaseSysCache(typetup);
}
/*
* generate_qualified_type_name
* Compute the name to display for a type specified by OID
*
* This is different from format_type_be() in that we unconditionally
* schema-qualify the name. That also means no special syntax for
* SQL-standard type names ... although in current usage, this should
* only get used for domains, so such cases wouldn't occur anyway.
*/
static char *
generate_qualified_type_name(Oid typid)
{
HeapTuple tp;
Form_pg_type typtup;
char *typname;
char *nspname;
char *result;
tp = SearchSysCache1(TYPEOID, ObjectIdGetDatum(typid));
if (!HeapTupleIsValid(tp))
elog(ERROR, "cache lookup failed for type %u", typid);
typtup = (Form_pg_type) GETSTRUCT(tp);
typname = NameStr(typtup->typname);
nspname = get_namespace_name_or_temp(typtup->typnamespace);
if (!nspname)
elog(ERROR, "cache lookup failed for namespace %u",
typtup->typnamespace);
result = quote_qualified_identifier(nspname, typname);
ReleaseSysCache(tp);
return result;
}
/*
* generate_collation_name
* Compute the name to display for a collation specified by OID
*
* The result includes all necessary quoting and schema-prefixing.
*/
char *
generate_collation_name(Oid collid)
{
HeapTuple tp;
Form_pg_collation colltup;
char *collname;
char *nspname;
char *result;
tp = SearchSysCache1(COLLOID, ObjectIdGetDatum(collid));
if (!HeapTupleIsValid(tp))
elog(ERROR, "cache lookup failed for collation %u", collid);
colltup = (Form_pg_collation) GETSTRUCT(tp);
collname = NameStr(colltup->collname);
if (!CollationIsVisible(collid))
nspname = get_namespace_name_or_temp(colltup->collnamespace);
else
nspname = NULL;
result = quote_qualified_identifier(nspname, collname);
ReleaseSysCache(tp);
return result;
}
/*
* Given a C string, produce a TEXT datum.
*
* We assume that the input was palloc'd and may be freed.
*/
static text *
string_to_text(char *str)
{
text *result;
result = cstring_to_text(str);
pfree(str);
return result;
}
Implement operator class parameters PostgreSQL provides set of template index access methods, where opclasses have much freedom in the semantics of indexing. These index AMs are GiST, GIN, SP-GiST and BRIN. There opclasses define representation of keys, operations on them and supported search strategies. So, it's natural that opclasses may be faced some tradeoffs, which require user-side decision. This commit implements opclass parameters allowing users to set some values, which tell opclass how to index the particular dataset. This commit doesn't introduce new storage in system catalog. Instead it uses pg_attribute.attoptions, which is used for table column storage options but unused for index attributes. In order to evade changing signature of each opclass support function, we implement unified way to pass options to opclass support functions. Options are set to fn_expr as the constant bytea expression. It's possible due to the fact that opclass support functions are executed outside of expressions, so fn_expr is unused for them. This commit comes with some examples of opclass options usage. We parametrize signature length in GiST. That applies to multiple opclasses: tsvector_ops, gist__intbig_ops, gist_ltree_ops, gist__ltree_ops, gist_trgm_ops and gist_hstore_ops. Also we parametrize maximum number of integer ranges for gist__int_ops. However, the main future usage of this feature is expected to be json, where users would be able to specify which way to index particular json parts. Catversion is bumped. Discussion: https://postgr.es/m/d22c3a18-31c7-1879-fc11-4c1ce2f5e5af%40postgrespro.ru Author: Nikita Glukhov, revised by me Reviwed-by: Nikolay Shaplov, Robert Haas, Tom Lane, Tomas Vondra, Alvaro Herrera
2020-03-30 18:17:11 +02:00
/*
* Generate a C string representing a relation options from text[] datum.
*/
static void
get_reloptions(StringInfo buf, Datum reloptions)
{
Datum *options;
int noptions;
int i;
deconstruct_array_builtin(DatumGetArrayTypeP(reloptions), TEXTOID,
&options, NULL, &noptions);
Implement operator class parameters PostgreSQL provides set of template index access methods, where opclasses have much freedom in the semantics of indexing. These index AMs are GiST, GIN, SP-GiST and BRIN. There opclasses define representation of keys, operations on them and supported search strategies. So, it's natural that opclasses may be faced some tradeoffs, which require user-side decision. This commit implements opclass parameters allowing users to set some values, which tell opclass how to index the particular dataset. This commit doesn't introduce new storage in system catalog. Instead it uses pg_attribute.attoptions, which is used for table column storage options but unused for index attributes. In order to evade changing signature of each opclass support function, we implement unified way to pass options to opclass support functions. Options are set to fn_expr as the constant bytea expression. It's possible due to the fact that opclass support functions are executed outside of expressions, so fn_expr is unused for them. This commit comes with some examples of opclass options usage. We parametrize signature length in GiST. That applies to multiple opclasses: tsvector_ops, gist__intbig_ops, gist_ltree_ops, gist__ltree_ops, gist_trgm_ops and gist_hstore_ops. Also we parametrize maximum number of integer ranges for gist__int_ops. However, the main future usage of this feature is expected to be json, where users would be able to specify which way to index particular json parts. Catversion is bumped. Discussion: https://postgr.es/m/d22c3a18-31c7-1879-fc11-4c1ce2f5e5af%40postgrespro.ru Author: Nikita Glukhov, revised by me Reviwed-by: Nikolay Shaplov, Robert Haas, Tom Lane, Tomas Vondra, Alvaro Herrera
2020-03-30 18:17:11 +02:00
for (i = 0; i < noptions; i++)
{
char *option = TextDatumGetCString(options[i]);
char *name;
char *separator;
char *value;
/*
* Each array element should have the form name=value. If the "=" is
* missing for some reason, treat it like an empty value.
*/
name = option;
separator = strchr(option, '=');
if (separator)
{
*separator = '\0';
value = separator + 1;
}
else
value = "";
if (i > 0)
appendStringInfoString(buf, ", ");
appendStringInfo(buf, "%s=", quote_identifier(name));
/*
* In general we need to quote the value; but to avoid unnecessary
* clutter, do not quote if it is an identifier that would not need
* quoting. (We could also allow numbers, but that is a bit trickier
* than it looks --- for example, are leading zeroes significant? We
* don't want to assume very much here about what custom reloptions
* might mean.)
*/
if (quote_identifier(value) == value)
appendStringInfoString(buf, value);
else
simple_quote_literal(buf, value);
pfree(option);
}
}
/*
* Generate a C string representing a relation's reloptions, or NULL if none.
*/
static char *
flatten_reloptions(Oid relid)
{
char *result = NULL;
HeapTuple tuple;
Datum reloptions;
bool isnull;
tuple = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
if (!HeapTupleIsValid(tuple))
elog(ERROR, "cache lookup failed for relation %u", relid);
reloptions = SysCacheGetAttr(RELOID, tuple,
Anum_pg_class_reloptions, &isnull);
if (!isnull)
{
StringInfoData buf;
initStringInfo(&buf);
Implement operator class parameters PostgreSQL provides set of template index access methods, where opclasses have much freedom in the semantics of indexing. These index AMs are GiST, GIN, SP-GiST and BRIN. There opclasses define representation of keys, operations on them and supported search strategies. So, it's natural that opclasses may be faced some tradeoffs, which require user-side decision. This commit implements opclass parameters allowing users to set some values, which tell opclass how to index the particular dataset. This commit doesn't introduce new storage in system catalog. Instead it uses pg_attribute.attoptions, which is used for table column storage options but unused for index attributes. In order to evade changing signature of each opclass support function, we implement unified way to pass options to opclass support functions. Options are set to fn_expr as the constant bytea expression. It's possible due to the fact that opclass support functions are executed outside of expressions, so fn_expr is unused for them. This commit comes with some examples of opclass options usage. We parametrize signature length in GiST. That applies to multiple opclasses: tsvector_ops, gist__intbig_ops, gist_ltree_ops, gist__ltree_ops, gist_trgm_ops and gist_hstore_ops. Also we parametrize maximum number of integer ranges for gist__int_ops. However, the main future usage of this feature is expected to be json, where users would be able to specify which way to index particular json parts. Catversion is bumped. Discussion: https://postgr.es/m/d22c3a18-31c7-1879-fc11-4c1ce2f5e5af%40postgrespro.ru Author: Nikita Glukhov, revised by me Reviwed-by: Nikolay Shaplov, Robert Haas, Tom Lane, Tomas Vondra, Alvaro Herrera
2020-03-30 18:17:11 +02:00
get_reloptions(&buf, reloptions);
result = buf.data;
}
ReleaseSysCache(tuple);
return result;
}
/*
* get_range_partbound_string
* A C string representation of one range partition bound
*/
char *
get_range_partbound_string(List *bound_datums)
{
deparse_context context;
StringInfo buf = makeStringInfo();
ListCell *cell;
char *sep;
memset(&context, 0, sizeof(deparse_context));
context.buf = buf;
appendStringInfoChar(buf, '(');
sep = "";
foreach(cell, bound_datums)
{
PartitionRangeDatum *datum =
lfirst_node(PartitionRangeDatum, cell);
appendStringInfoString(buf, sep);
if (datum->kind == PARTITION_RANGE_DATUM_MINVALUE)
appendStringInfoString(buf, "MINVALUE");
else if (datum->kind == PARTITION_RANGE_DATUM_MAXVALUE)
appendStringInfoString(buf, "MAXVALUE");
else
{
Const *val = castNode(Const, datum->value);
get_const_expr(val, &context, -1);
}
sep = ", ";
}
appendStringInfoChar(buf, ')');
return buf->data;
}
/*
* get_list_partvalue_string
* A C string representation of one list partition value
*/
char *
get_list_partvalue_string(Const *val)
{
deparse_context context;
StringInfo buf = makeStringInfo();
memset(&context, 0, sizeof(deparse_context));
context.buf = buf;
get_const_expr(val, &context, -1);
return buf->data;
}