postgresql/src/test/regress/expected/update.out

945 lines
37 KiB
Plaintext
Raw Normal View History

--
-- UPDATE syntax tests
--
CREATE TABLE update_test (
a INT DEFAULT 10,
b INT,
c TEXT
);
Add support for INSERT ... ON CONFLICT DO NOTHING/UPDATE. The newly added ON CONFLICT clause allows to specify an alternative to raising a unique or exclusion constraint violation error when inserting. ON CONFLICT refers to constraints that can either be specified using a inference clause (by specifying the columns of a unique constraint) or by naming a unique or exclusion constraint. DO NOTHING avoids the constraint violation, without touching the pre-existing row. DO UPDATE SET ... [WHERE ...] updates the pre-existing tuple, and has access to both the tuple proposed for insertion and the existing tuple; the optional WHERE clause can be used to prevent an update from being executed. The UPDATE SET and WHERE clauses have access to the tuple proposed for insertion using the "magic" EXCLUDED alias, and to the pre-existing tuple using the table name or its alias. This feature is often referred to as upsert. This is implemented using a new infrastructure called "speculative insertion". It is an optimistic variant of regular insertion that first does a pre-check for existing tuples and then attempts an insert. If a violating tuple was inserted concurrently, the speculatively inserted tuple is deleted and a new attempt is made. If the pre-check finds a matching tuple the alternative DO NOTHING or DO UPDATE action is taken. If the insertion succeeds without detecting a conflict, the tuple is deemed inserted. To handle the possible ambiguity between the excluded alias and a table named excluded, and for convenience with long relation names, INSERT INTO now can alias its target table. Bumps catversion as stored rules change. Author: Peter Geoghegan, with significant contributions from Heikki Linnakangas and Andres Freund. Testing infrastructure by Jeff Janes. Reviewed-By: Heikki Linnakangas, Andres Freund, Robert Haas, Simon Riggs, Dean Rasheed, Stephen Frost and many others.
2015-05-08 05:31:36 +02:00
CREATE TABLE upsert_test (
a INT PRIMARY KEY,
b TEXT
);
INSERT INTO update_test VALUES (5, 10, 'foo');
INSERT INTO update_test(b, a) VALUES (15, 10);
SELECT * FROM update_test;
a | b | c
----+----+-----
5 | 10 | foo
10 | 15 |
(2 rows)
UPDATE update_test SET a = DEFAULT, b = DEFAULT;
SELECT * FROM update_test;
a | b | c
----+---+-----
10 | | foo
10 | |
(2 rows)
-- aliases for the UPDATE target table
UPDATE update_test AS t SET b = 10 WHERE t.a = 10;
SELECT * FROM update_test;
a | b | c
----+----+-----
10 | 10 | foo
10 | 10 |
(2 rows)
UPDATE update_test t SET b = t.b + 10 WHERE t.a = 10;
SELECT * FROM update_test;
a | b | c
----+----+-----
10 | 20 | foo
10 | 20 |
(2 rows)
--
-- Test VALUES in FROM
--
UPDATE update_test SET a=v.i FROM (VALUES(100, 20)) AS v(i, j)
WHERE update_test.b = v.j;
SELECT * FROM update_test;
a | b | c
-----+----+-----
100 | 20 | foo
100 | 20 |
(2 rows)
-- fail, wrong data type:
UPDATE update_test SET a = v.* FROM (VALUES(100, 20)) AS v(i, j)
WHERE update_test.b = v.j;
ERROR: column "a" is of type integer but expression is of type record
LINE 1: UPDATE update_test SET a = v.* FROM (VALUES(100, 20)) AS v(i...
^
HINT: You will need to rewrite or cast the expression.
--
-- Test multiple-set-clause syntax
--
INSERT INTO update_test SELECT a,b+1,c FROM update_test;
SELECT * FROM update_test;
a | b | c
-----+----+-----
100 | 20 | foo
100 | 20 |
100 | 21 | foo
100 | 21 |
(4 rows)
UPDATE update_test SET (c,b,a) = ('bugle', b+11, DEFAULT) WHERE c = 'foo';
SELECT * FROM update_test;
a | b | c
-----+----+-------
100 | 20 |
100 | 21 |
10 | 31 | bugle
10 | 32 | bugle
(4 rows)
UPDATE update_test SET (c,b) = ('car', a+b), a = a + 1 WHERE a = 10;
SELECT * FROM update_test;
a | b | c
-----+----+-----
100 | 20 |
100 | 21 |
11 | 41 | car
11 | 42 | car
(4 rows)
-- fail, multi assignment to same column:
UPDATE update_test SET (c,b) = ('car', a+b), b = a + 1 WHERE a = 10;
ERROR: multiple assignments to same column "b"
-- uncorrelated sub-select:
UPDATE update_test
SET (b,a) = (select a,b from update_test where b = 41 and c = 'car')
WHERE a = 100 AND b = 20;
SELECT * FROM update_test;
a | b | c
-----+----+-----
100 | 21 |
11 | 41 | car
11 | 42 | car
41 | 11 |
(4 rows)
-- correlated sub-select:
UPDATE update_test o
SET (b,a) = (select a+1,b from update_test i
where i.a=o.a and i.b=o.b and i.c is not distinct from o.c);
SELECT * FROM update_test;
a | b | c
----+-----+-----
21 | 101 |
41 | 12 | car
42 | 12 | car
11 | 42 |
(4 rows)
-- fail, multiple rows supplied:
UPDATE update_test SET (b,a) = (select a+1,b from update_test);
ERROR: more than one row returned by a subquery used as an expression
-- set to null if no rows supplied:
UPDATE update_test SET (b,a) = (select a+1,b from update_test where a = 1000)
WHERE a = 11;
SELECT * FROM update_test;
a | b | c
----+-----+-----
21 | 101 |
41 | 12 | car
42 | 12 | car
| |
(4 rows)
Improve handling of "UPDATE ... SET (column_list) = row_constructor". Previously, the right-hand side of a multiple-column assignment, if it wasn't a sub-SELECT, had to be a simple parenthesized expression list, because gram.y was responsible for "bursting" the construct into independent column assignments. This had the minor defect that you couldn't write ROW (though you should be able to, since the standard says this is a row constructor), and the rather larger defect that unlike other uses of row constructors, we would not expand a "foo.*" item into multiple columns. Fix that by changing the RHS to be just "a_expr" in the grammar, leaving it to transformMultiAssignRef to separate the elements of a RowExpr; which it will do only after performing standard transformation of the RowExpr, so that "foo.*" behaves as expected. The key reason we didn't do that before was the hard-wired handling of DEFAULT tokens (SetToDefault nodes). This patch deals with that issue by allowing DEFAULT in any a_expr and having parse analysis throw an error if SetToDefault is found in an unexpected place. That's an improvement anyway since the error can be more specific than just "syntax error". The SQL standard suggests that the RHS could be any a_expr yielding a suitable row value. This patch doesn't really move the goal posts in that respect --- you're still limited to RowExpr or a sub-SELECT --- but it does fix the grammar restriction, so it provides some tangible progress towards a full implementation. And the limitation is now documented by an explicit error message rather than an unhelpful "syntax error". Discussion: <8542.1479742008@sss.pgh.pa.us>
2016-11-22 21:19:57 +01:00
-- *-expansion should work in this context:
UPDATE update_test SET (a,b) = ROW(v.*) FROM (VALUES(21, 100)) AS v(i, j)
WHERE update_test.a = v.i;
Improve handling of "UPDATE ... SET (column_list) = row_constructor". Previously, the right-hand side of a multiple-column assignment, if it wasn't a sub-SELECT, had to be a simple parenthesized expression list, because gram.y was responsible for "bursting" the construct into independent column assignments. This had the minor defect that you couldn't write ROW (though you should be able to, since the standard says this is a row constructor), and the rather larger defect that unlike other uses of row constructors, we would not expand a "foo.*" item into multiple columns. Fix that by changing the RHS to be just "a_expr" in the grammar, leaving it to transformMultiAssignRef to separate the elements of a RowExpr; which it will do only after performing standard transformation of the RowExpr, so that "foo.*" behaves as expected. The key reason we didn't do that before was the hard-wired handling of DEFAULT tokens (SetToDefault nodes). This patch deals with that issue by allowing DEFAULT in any a_expr and having parse analysis throw an error if SetToDefault is found in an unexpected place. That's an improvement anyway since the error can be more specific than just "syntax error". The SQL standard suggests that the RHS could be any a_expr yielding a suitable row value. This patch doesn't really move the goal posts in that respect --- you're still limited to RowExpr or a sub-SELECT --- but it does fix the grammar restriction, so it provides some tangible progress towards a full implementation. And the limitation is now documented by an explicit error message rather than an unhelpful "syntax error". Discussion: <8542.1479742008@sss.pgh.pa.us>
2016-11-22 21:19:57 +01:00
-- you might expect this to work, but syntactically it's not a RowExpr:
UPDATE update_test SET (a,b) = (v.*) FROM (VALUES(21, 101)) AS v(i, j)
WHERE update_test.a = v.i;
Improve handling of "UPDATE ... SET (column_list) = row_constructor". Previously, the right-hand side of a multiple-column assignment, if it wasn't a sub-SELECT, had to be a simple parenthesized expression list, because gram.y was responsible for "bursting" the construct into independent column assignments. This had the minor defect that you couldn't write ROW (though you should be able to, since the standard says this is a row constructor), and the rather larger defect that unlike other uses of row constructors, we would not expand a "foo.*" item into multiple columns. Fix that by changing the RHS to be just "a_expr" in the grammar, leaving it to transformMultiAssignRef to separate the elements of a RowExpr; which it will do only after performing standard transformation of the RowExpr, so that "foo.*" behaves as expected. The key reason we didn't do that before was the hard-wired handling of DEFAULT tokens (SetToDefault nodes). This patch deals with that issue by allowing DEFAULT in any a_expr and having parse analysis throw an error if SetToDefault is found in an unexpected place. That's an improvement anyway since the error can be more specific than just "syntax error". The SQL standard suggests that the RHS could be any a_expr yielding a suitable row value. This patch doesn't really move the goal posts in that respect --- you're still limited to RowExpr or a sub-SELECT --- but it does fix the grammar restriction, so it provides some tangible progress towards a full implementation. And the limitation is now documented by an explicit error message rather than an unhelpful "syntax error". Discussion: <8542.1479742008@sss.pgh.pa.us>
2016-11-22 21:19:57 +01:00
ERROR: source for a multiple-column UPDATE item must be a sub-SELECT or ROW() expression
LINE 1: UPDATE update_test SET (a,b) = (v.*) FROM (VALUES(21, 101)) ...
^
-- if an alias for the target table is specified, don't allow references
-- to the original table name
UPDATE update_test AS t SET b = update_test.b + 10 WHERE t.a = 10;
ERROR: invalid reference to FROM-clause entry for table "update_test"
LINE 1: UPDATE update_test AS t SET b = update_test.b + 10 WHERE t.a...
^
HINT: Perhaps you meant to reference the table alias "t".
-- Make sure that we can update to a TOASTed value.
UPDATE update_test SET c = repeat('x', 10000) WHERE c = 'car';
SELECT a, b, char_length(c) FROM update_test;
a | b | char_length
----+-----+-------------
| |
Improve handling of "UPDATE ... SET (column_list) = row_constructor". Previously, the right-hand side of a multiple-column assignment, if it wasn't a sub-SELECT, had to be a simple parenthesized expression list, because gram.y was responsible for "bursting" the construct into independent column assignments. This had the minor defect that you couldn't write ROW (though you should be able to, since the standard says this is a row constructor), and the rather larger defect that unlike other uses of row constructors, we would not expand a "foo.*" item into multiple columns. Fix that by changing the RHS to be just "a_expr" in the grammar, leaving it to transformMultiAssignRef to separate the elements of a RowExpr; which it will do only after performing standard transformation of the RowExpr, so that "foo.*" behaves as expected. The key reason we didn't do that before was the hard-wired handling of DEFAULT tokens (SetToDefault nodes). This patch deals with that issue by allowing DEFAULT in any a_expr and having parse analysis throw an error if SetToDefault is found in an unexpected place. That's an improvement anyway since the error can be more specific than just "syntax error". The SQL standard suggests that the RHS could be any a_expr yielding a suitable row value. This patch doesn't really move the goal posts in that respect --- you're still limited to RowExpr or a sub-SELECT --- but it does fix the grammar restriction, so it provides some tangible progress towards a full implementation. And the limitation is now documented by an explicit error message rather than an unhelpful "syntax error". Discussion: <8542.1479742008@sss.pgh.pa.us>
2016-11-22 21:19:57 +01:00
21 | 100 |
41 | 12 | 10000
42 | 12 | 10000
(4 rows)
Repair bogus handling of multi-assignment Params in upper plan levels. Our support for multiple-set-clauses in UPDATE assumes that the Params referencing a MULTIEXPR_SUBLINK SubPlan will appear before that SubPlan in the targetlist of the plan node that calculates the updated row. (Yeah, it's a hack...) In some PG branches it's possible that a Result node gets inserted between the primary calculation of the update tlist and the ModifyTable node. setrefs.c did the wrong thing in this case and left the upper-level Params as Params, causing a crash at runtime. What it should do is replace them with "outer" Vars referencing the child plan node's output. That's a result of careless ordering of operations in fix_upper_expr_mutator, so we can fix it just by reordering the code. Fix fix_join_expr_mutator similarly for consistency, even though join nodes could never appear in such a context. (In general, it seems likely to be a bit cheaper to use Vars than Params in such situations anyway, so this patch might offer a tiny performance improvement.) The hazard extends back to 9.5 where the MULTIEXPR_SUBLINK stuff was introduced, so back-patch that far. However, this may be a live bug only in 9.6.x and 10.x, as the other branches don't seem to want to calculate the final tlist below the Result node. (That plan shape change between branches might be a mini-bug in itself, but I'm not really interested in digging into the reasons for that right now. Still, add a regression test memorializing what we expect there, so we'll notice if it changes again.) Per bug report from Eduards Bezverhijs. Discussion: https://postgr.es/m/b6cd572a-3e44-8785-75e9-c512a5a17a73@tieto.com
2018-12-12 19:49:41 +01:00
-- Check multi-assignment with a Result node to handle a one-time filter.
EXPLAIN (VERBOSE, COSTS OFF)
UPDATE update_test t
SET (a, b) = (SELECT b, a FROM update_test s WHERE s.a = t.a)
WHERE CURRENT_USER = SESSION_USER;
Rework planning and execution of UPDATE and DELETE. This patch makes two closely related sets of changes: 1. For UPDATE, the subplan of the ModifyTable node now only delivers the new values of the changed columns (i.e., the expressions computed in the query's SET clause) plus row identity information such as CTID. ModifyTable must re-fetch the original tuple to merge in the old values of any unchanged columns. The core advantage of this is that the changed columns are uniform across all tables of an inherited or partitioned target relation, whereas the other columns might not be. A secondary advantage, when the UPDATE involves joins, is that less data needs to pass through the plan tree. The disadvantage of course is an extra fetch of each tuple to be updated. However, that seems to be very nearly free in context; even worst-case tests don't show it to add more than a couple percent to the total query cost. At some point it might be interesting to combine the re-fetch with the tuple access that ModifyTable must do anyway to mark the old tuple dead; but that would require a good deal of refactoring and it seems it wouldn't buy all that much, so this patch doesn't attempt it. 2. For inherited UPDATE/DELETE, instead of generating a separate subplan for each target relation, we now generate a single subplan that is just exactly like a SELECT's plan, then stick ModifyTable on top of that. To let ModifyTable know which target relation a given incoming row refers to, a tableoid junk column is added to the row identity information. This gets rid of the horrid hack that was inheritance_planner(), eliminating O(N^2) planning cost and memory consumption in cases where there were many unprunable target relations. Point 2 of course requires point 1, so that there is a uniform definition of the non-junk columns to be returned by the subplan. We can't insist on uniform definition of the row identity junk columns however, if we want to keep the ability to have both plain and foreign tables in a partitioning hierarchy. Since it wouldn't scale very far to have every child table have its own row identity column, this patch includes provisions to merge similar row identity columns into one column of the subplan result. In particular, we can merge the whole-row Vars typically used as row identity by FDWs into one column by pretending they are type RECORD. (It's still okay for the actual composite Datums to be labeled with the table's rowtype OID, though.) There is more that can be done to file down residual inefficiencies in this patch, but it seems to be committable now. FDW authors should note several API changes: * The argument list for AddForeignUpdateTargets() has changed, and so has the method it must use for adding junk columns to the query. Call add_row_identity_var() instead of manipulating the parse tree directly. You might want to reconsider exactly what you're adding, too. * PlanDirectModify() must now work a little harder to find the ForeignScan plan node; if the foreign table is part of a partitioning hierarchy then the ForeignScan might not be the direct child of ModifyTable. See postgres_fdw for sample code. * To check whether a relation is a target relation, it's no longer sufficient to compare its relid to root->parse->resultRelation. Instead, check it against all_result_relids or leaf_result_relids, as appropriate. Amit Langote and Tom Lane Discussion: https://postgr.es/m/CA+HiwqHpHdqdDn48yCEhynnniahH78rwcrv1rEX65-fsZGBOLQ@mail.gmail.com
2021-03-31 17:52:34 +02:00
QUERY PLAN
-------------------------------------------------------------
Repair bogus handling of multi-assignment Params in upper plan levels. Our support for multiple-set-clauses in UPDATE assumes that the Params referencing a MULTIEXPR_SUBLINK SubPlan will appear before that SubPlan in the targetlist of the plan node that calculates the updated row. (Yeah, it's a hack...) In some PG branches it's possible that a Result node gets inserted between the primary calculation of the update tlist and the ModifyTable node. setrefs.c did the wrong thing in this case and left the upper-level Params as Params, causing a crash at runtime. What it should do is replace them with "outer" Vars referencing the child plan node's output. That's a result of careless ordering of operations in fix_upper_expr_mutator, so we can fix it just by reordering the code. Fix fix_join_expr_mutator similarly for consistency, even though join nodes could never appear in such a context. (In general, it seems likely to be a bit cheaper to use Vars than Params in such situations anyway, so this patch might offer a tiny performance improvement.) The hazard extends back to 9.5 where the MULTIEXPR_SUBLINK stuff was introduced, so back-patch that far. However, this may be a live bug only in 9.6.x and 10.x, as the other branches don't seem to want to calculate the final tlist below the Result node. (That plan shape change between branches might be a mini-bug in itself, but I'm not really interested in digging into the reasons for that right now. Still, add a regression test memorializing what we expect there, so we'll notice if it changes again.) Per bug report from Eduards Bezverhijs. Discussion: https://postgr.es/m/b6cd572a-3e44-8785-75e9-c512a5a17a73@tieto.com
2018-12-12 19:49:41 +01:00
Update on public.update_test t
-> Result
Rework planning and execution of UPDATE and DELETE. This patch makes two closely related sets of changes: 1. For UPDATE, the subplan of the ModifyTable node now only delivers the new values of the changed columns (i.e., the expressions computed in the query's SET clause) plus row identity information such as CTID. ModifyTable must re-fetch the original tuple to merge in the old values of any unchanged columns. The core advantage of this is that the changed columns are uniform across all tables of an inherited or partitioned target relation, whereas the other columns might not be. A secondary advantage, when the UPDATE involves joins, is that less data needs to pass through the plan tree. The disadvantage of course is an extra fetch of each tuple to be updated. However, that seems to be very nearly free in context; even worst-case tests don't show it to add more than a couple percent to the total query cost. At some point it might be interesting to combine the re-fetch with the tuple access that ModifyTable must do anyway to mark the old tuple dead; but that would require a good deal of refactoring and it seems it wouldn't buy all that much, so this patch doesn't attempt it. 2. For inherited UPDATE/DELETE, instead of generating a separate subplan for each target relation, we now generate a single subplan that is just exactly like a SELECT's plan, then stick ModifyTable on top of that. To let ModifyTable know which target relation a given incoming row refers to, a tableoid junk column is added to the row identity information. This gets rid of the horrid hack that was inheritance_planner(), eliminating O(N^2) planning cost and memory consumption in cases where there were many unprunable target relations. Point 2 of course requires point 1, so that there is a uniform definition of the non-junk columns to be returned by the subplan. We can't insist on uniform definition of the row identity junk columns however, if we want to keep the ability to have both plain and foreign tables in a partitioning hierarchy. Since it wouldn't scale very far to have every child table have its own row identity column, this patch includes provisions to merge similar row identity columns into one column of the subplan result. In particular, we can merge the whole-row Vars typically used as row identity by FDWs into one column by pretending they are type RECORD. (It's still okay for the actual composite Datums to be labeled with the table's rowtype OID, though.) There is more that can be done to file down residual inefficiencies in this patch, but it seems to be committable now. FDW authors should note several API changes: * The argument list for AddForeignUpdateTargets() has changed, and so has the method it must use for adding junk columns to the query. Call add_row_identity_var() instead of manipulating the parse tree directly. You might want to reconsider exactly what you're adding, too. * PlanDirectModify() must now work a little harder to find the ForeignScan plan node; if the foreign table is part of a partitioning hierarchy then the ForeignScan might not be the direct child of ModifyTable. See postgres_fdw for sample code. * To check whether a relation is a target relation, it's no longer sufficient to compare its relid to root->parse->resultRelation. Instead, check it against all_result_relids or leaf_result_relids, as appropriate. Amit Langote and Tom Lane Discussion: https://postgr.es/m/CA+HiwqHpHdqdDn48yCEhynnniahH78rwcrv1rEX65-fsZGBOLQ@mail.gmail.com
2021-03-31 17:52:34 +02:00
Output: $1, $2, (SubPlan 1 (returns $1,$2)), t.ctid
Repair bogus handling of multi-assignment Params in upper plan levels. Our support for multiple-set-clauses in UPDATE assumes that the Params referencing a MULTIEXPR_SUBLINK SubPlan will appear before that SubPlan in the targetlist of the plan node that calculates the updated row. (Yeah, it's a hack...) In some PG branches it's possible that a Result node gets inserted between the primary calculation of the update tlist and the ModifyTable node. setrefs.c did the wrong thing in this case and left the upper-level Params as Params, causing a crash at runtime. What it should do is replace them with "outer" Vars referencing the child plan node's output. That's a result of careless ordering of operations in fix_upper_expr_mutator, so we can fix it just by reordering the code. Fix fix_join_expr_mutator similarly for consistency, even though join nodes could never appear in such a context. (In general, it seems likely to be a bit cheaper to use Vars than Params in such situations anyway, so this patch might offer a tiny performance improvement.) The hazard extends back to 9.5 where the MULTIEXPR_SUBLINK stuff was introduced, so back-patch that far. However, this may be a live bug only in 9.6.x and 10.x, as the other branches don't seem to want to calculate the final tlist below the Result node. (That plan shape change between branches might be a mini-bug in itself, but I'm not really interested in digging into the reasons for that right now. Still, add a regression test memorializing what we expect there, so we'll notice if it changes again.) Per bug report from Eduards Bezverhijs. Discussion: https://postgr.es/m/b6cd572a-3e44-8785-75e9-c512a5a17a73@tieto.com
2018-12-12 19:49:41 +01:00
One-Time Filter: (CURRENT_USER = SESSION_USER)
-> Seq Scan on public.update_test t
Rework planning and execution of UPDATE and DELETE. This patch makes two closely related sets of changes: 1. For UPDATE, the subplan of the ModifyTable node now only delivers the new values of the changed columns (i.e., the expressions computed in the query's SET clause) plus row identity information such as CTID. ModifyTable must re-fetch the original tuple to merge in the old values of any unchanged columns. The core advantage of this is that the changed columns are uniform across all tables of an inherited or partitioned target relation, whereas the other columns might not be. A secondary advantage, when the UPDATE involves joins, is that less data needs to pass through the plan tree. The disadvantage of course is an extra fetch of each tuple to be updated. However, that seems to be very nearly free in context; even worst-case tests don't show it to add more than a couple percent to the total query cost. At some point it might be interesting to combine the re-fetch with the tuple access that ModifyTable must do anyway to mark the old tuple dead; but that would require a good deal of refactoring and it seems it wouldn't buy all that much, so this patch doesn't attempt it. 2. For inherited UPDATE/DELETE, instead of generating a separate subplan for each target relation, we now generate a single subplan that is just exactly like a SELECT's plan, then stick ModifyTable on top of that. To let ModifyTable know which target relation a given incoming row refers to, a tableoid junk column is added to the row identity information. This gets rid of the horrid hack that was inheritance_planner(), eliminating O(N^2) planning cost and memory consumption in cases where there were many unprunable target relations. Point 2 of course requires point 1, so that there is a uniform definition of the non-junk columns to be returned by the subplan. We can't insist on uniform definition of the row identity junk columns however, if we want to keep the ability to have both plain and foreign tables in a partitioning hierarchy. Since it wouldn't scale very far to have every child table have its own row identity column, this patch includes provisions to merge similar row identity columns into one column of the subplan result. In particular, we can merge the whole-row Vars typically used as row identity by FDWs into one column by pretending they are type RECORD. (It's still okay for the actual composite Datums to be labeled with the table's rowtype OID, though.) There is more that can be done to file down residual inefficiencies in this patch, but it seems to be committable now. FDW authors should note several API changes: * The argument list for AddForeignUpdateTargets() has changed, and so has the method it must use for adding junk columns to the query. Call add_row_identity_var() instead of manipulating the parse tree directly. You might want to reconsider exactly what you're adding, too. * PlanDirectModify() must now work a little harder to find the ForeignScan plan node; if the foreign table is part of a partitioning hierarchy then the ForeignScan might not be the direct child of ModifyTable. See postgres_fdw for sample code. * To check whether a relation is a target relation, it's no longer sufficient to compare its relid to root->parse->resultRelation. Instead, check it against all_result_relids or leaf_result_relids, as appropriate. Amit Langote and Tom Lane Discussion: https://postgr.es/m/CA+HiwqHpHdqdDn48yCEhynnniahH78rwcrv1rEX65-fsZGBOLQ@mail.gmail.com
2021-03-31 17:52:34 +02:00
Output: t.a, t.ctid
Repair bogus handling of multi-assignment Params in upper plan levels. Our support for multiple-set-clauses in UPDATE assumes that the Params referencing a MULTIEXPR_SUBLINK SubPlan will appear before that SubPlan in the targetlist of the plan node that calculates the updated row. (Yeah, it's a hack...) In some PG branches it's possible that a Result node gets inserted between the primary calculation of the update tlist and the ModifyTable node. setrefs.c did the wrong thing in this case and left the upper-level Params as Params, causing a crash at runtime. What it should do is replace them with "outer" Vars referencing the child plan node's output. That's a result of careless ordering of operations in fix_upper_expr_mutator, so we can fix it just by reordering the code. Fix fix_join_expr_mutator similarly for consistency, even though join nodes could never appear in such a context. (In general, it seems likely to be a bit cheaper to use Vars than Params in such situations anyway, so this patch might offer a tiny performance improvement.) The hazard extends back to 9.5 where the MULTIEXPR_SUBLINK stuff was introduced, so back-patch that far. However, this may be a live bug only in 9.6.x and 10.x, as the other branches don't seem to want to calculate the final tlist below the Result node. (That plan shape change between branches might be a mini-bug in itself, but I'm not really interested in digging into the reasons for that right now. Still, add a regression test memorializing what we expect there, so we'll notice if it changes again.) Per bug report from Eduards Bezverhijs. Discussion: https://postgr.es/m/b6cd572a-3e44-8785-75e9-c512a5a17a73@tieto.com
2018-12-12 19:49:41 +01:00
SubPlan 1 (returns $1,$2)
-> Seq Scan on public.update_test s
Output: s.b, s.a
Filter: (s.a = t.a)
(10 rows)
UPDATE update_test t
SET (a, b) = (SELECT b, a FROM update_test s WHERE s.a = t.a)
WHERE CURRENT_USER = SESSION_USER;
SELECT a, b, char_length(c) FROM update_test;
a | b | char_length
-----+----+-------------
| |
100 | 21 |
12 | 41 | 10000
12 | 42 | 10000
(4 rows)
Add support for INSERT ... ON CONFLICT DO NOTHING/UPDATE. The newly added ON CONFLICT clause allows to specify an alternative to raising a unique or exclusion constraint violation error when inserting. ON CONFLICT refers to constraints that can either be specified using a inference clause (by specifying the columns of a unique constraint) or by naming a unique or exclusion constraint. DO NOTHING avoids the constraint violation, without touching the pre-existing row. DO UPDATE SET ... [WHERE ...] updates the pre-existing tuple, and has access to both the tuple proposed for insertion and the existing tuple; the optional WHERE clause can be used to prevent an update from being executed. The UPDATE SET and WHERE clauses have access to the tuple proposed for insertion using the "magic" EXCLUDED alias, and to the pre-existing tuple using the table name or its alias. This feature is often referred to as upsert. This is implemented using a new infrastructure called "speculative insertion". It is an optimistic variant of regular insertion that first does a pre-check for existing tuples and then attempts an insert. If a violating tuple was inserted concurrently, the speculatively inserted tuple is deleted and a new attempt is made. If the pre-check finds a matching tuple the alternative DO NOTHING or DO UPDATE action is taken. If the insertion succeeds without detecting a conflict, the tuple is deemed inserted. To handle the possible ambiguity between the excluded alias and a table named excluded, and for convenience with long relation names, INSERT INTO now can alias its target table. Bumps catversion as stored rules change. Author: Peter Geoghegan, with significant contributions from Heikki Linnakangas and Andres Freund. Testing infrastructure by Jeff Janes. Reviewed-By: Heikki Linnakangas, Andres Freund, Robert Haas, Simon Riggs, Dean Rasheed, Stephen Frost and many others.
2015-05-08 05:31:36 +02:00
-- Test ON CONFLICT DO UPDATE
INSERT INTO upsert_test VALUES(1, 'Boo');
-- uncorrelated sub-select:
WITH aaa AS (SELECT 1 AS a, 'Foo' AS b) INSERT INTO upsert_test
VALUES (1, 'Bar') ON CONFLICT(a)
DO UPDATE SET (b, a) = (SELECT b, a FROM aaa) RETURNING *;
a | b
---+-----
1 | Foo
(1 row)
-- correlated sub-select:
INSERT INTO upsert_test VALUES (1, 'Baz') ON CONFLICT(a)
DO UPDATE SET (b, a) = (SELECT b || ', Correlated', a from upsert_test i WHERE i.a = upsert_test.a)
RETURNING *;
a | b
---+-----------------
1 | Foo, Correlated
(1 row)
-- correlated sub-select (EXCLUDED.* alias):
INSERT INTO upsert_test VALUES (1, 'Bat') ON CONFLICT(a)
DO UPDATE SET (b, a) = (SELECT b || ', Excluded', a from upsert_test i WHERE i.a = excluded.a)
RETURNING *;
a | b
---+---------------------------
1 | Foo, Correlated, Excluded
(1 row)
-- ON CONFLICT using system attributes in RETURNING, testing both the
-- inserting and updating paths. See bug report at:
-- https://www.postgresql.org/message-id/73436355-6432-49B1-92ED-1FE4F7E7E100%40finefun.com.au
INSERT INTO upsert_test VALUES (2, 'Beeble') ON CONFLICT(a)
DO UPDATE SET (b, a) = (SELECT b || ', Excluded', a from upsert_test i WHERE i.a = excluded.a)
RETURNING tableoid::regclass, xmin = pg_current_xact_id()::xid AS xmin_correct, xmax = 0 AS xmax_correct;
tableoid | xmin_correct | xmax_correct
-------------+--------------+--------------
upsert_test | t | t
(1 row)
-- currently xmax is set after a conflict - that's probably not good,
-- but it seems worthwhile to have to be explicit if that changes.
INSERT INTO upsert_test VALUES (2, 'Brox') ON CONFLICT(a)
DO UPDATE SET (b, a) = (SELECT b || ', Excluded', a from upsert_test i WHERE i.a = excluded.a)
RETURNING tableoid::regclass, xmin = pg_current_xact_id()::xid AS xmin_correct, xmax = pg_current_xact_id()::xid AS xmax_correct;
tableoid | xmin_correct | xmax_correct
-------------+--------------+--------------
upsert_test | t | t
(1 row)
DROP TABLE update_test;
Add support for INSERT ... ON CONFLICT DO NOTHING/UPDATE. The newly added ON CONFLICT clause allows to specify an alternative to raising a unique or exclusion constraint violation error when inserting. ON CONFLICT refers to constraints that can either be specified using a inference clause (by specifying the columns of a unique constraint) or by naming a unique or exclusion constraint. DO NOTHING avoids the constraint violation, without touching the pre-existing row. DO UPDATE SET ... [WHERE ...] updates the pre-existing tuple, and has access to both the tuple proposed for insertion and the existing tuple; the optional WHERE clause can be used to prevent an update from being executed. The UPDATE SET and WHERE clauses have access to the tuple proposed for insertion using the "magic" EXCLUDED alias, and to the pre-existing tuple using the table name or its alias. This feature is often referred to as upsert. This is implemented using a new infrastructure called "speculative insertion". It is an optimistic variant of regular insertion that first does a pre-check for existing tuples and then attempts an insert. If a violating tuple was inserted concurrently, the speculatively inserted tuple is deleted and a new attempt is made. If the pre-check finds a matching tuple the alternative DO NOTHING or DO UPDATE action is taken. If the insertion succeeds without detecting a conflict, the tuple is deemed inserted. To handle the possible ambiguity between the excluded alias and a table named excluded, and for convenience with long relation names, INSERT INTO now can alias its target table. Bumps catversion as stored rules change. Author: Peter Geoghegan, with significant contributions from Heikki Linnakangas and Andres Freund. Testing infrastructure by Jeff Janes. Reviewed-By: Heikki Linnakangas, Andres Freund, Robert Haas, Simon Riggs, Dean Rasheed, Stephen Frost and many others.
2015-05-08 05:31:36 +02:00
DROP TABLE upsert_test;
---------------------------
-- UPDATE with row movement
---------------------------
-- When a partitioned table receives an UPDATE to the partitioned key and the
-- new values no longer meet the partition's bound, the row must be moved to
-- the correct partition for the new partition key (if one exists). We must
-- also ensure that updatable views on partitioned tables properly enforce any
-- WITH CHECK OPTION that is defined. The situation with triggers in this case
-- also requires thorough testing as partition key updates causing row
-- movement convert UPDATEs into DELETE+INSERT.
CREATE TABLE range_parted (
Implement table partitioning. Table partitioning is like table inheritance and reuses much of the existing infrastructure, but there are some important differences. The parent is called a partitioned table and is always empty; it may not have indexes or non-inherited constraints, since those make no sense for a relation with no data of its own. The children are called partitions and contain all of the actual data. Each partition has an implicit partitioning constraint. Multiple inheritance is not allowed, and partitioning and inheritance can't be mixed. Partitions can't have extra columns and may not allow nulls unless the parent does. Tuples inserted into the parent are automatically routed to the correct partition, so tuple-routing ON INSERT triggers are not needed. Tuple routing isn't yet supported for partitions which are foreign tables, and it doesn't handle updates that cross partition boundaries. Currently, tables can be range-partitioned or list-partitioned. List partitioning is limited to a single column, but range partitioning can involve multiple columns. A partitioning "column" can be an expression. Because table partitioning is less general than table inheritance, it is hoped that it will be easier to reason about properties of partitions, and therefore that this will serve as a better foundation for a variety of possible optimizations, including query planner optimizations. The tuple routing based which this patch does based on the implicit partitioning constraints is an example of this, but it seems likely that many other useful optimizations are also possible. Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat, Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova, Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
a text,
b bigint,
c numeric,
d int,
e varchar
) PARTITION BY RANGE (a, b);
-- Create partitions intentionally in descending bound order, so as to test
-- that update-row-movement works with the leaf partitions not in bound order.
CREATE TABLE part_b_20_b_30 (e varchar, c numeric, a text, b bigint, d int);
ALTER TABLE range_parted ATTACH PARTITION part_b_20_b_30 FOR VALUES FROM ('b', 20) TO ('b', 30);
CREATE TABLE part_b_10_b_20 (e varchar, c numeric, a text, b bigint, d int) PARTITION BY RANGE (c);
CREATE TABLE part_b_1_b_10 PARTITION OF range_parted FOR VALUES FROM ('b', 1) TO ('b', 10);
ALTER TABLE range_parted ATTACH PARTITION part_b_10_b_20 FOR VALUES FROM ('b', 10) TO ('b', 20);
CREATE TABLE part_a_10_a_20 PARTITION OF range_parted FOR VALUES FROM ('a', 10) TO ('a', 20);
CREATE TABLE part_a_1_a_10 PARTITION OF range_parted FOR VALUES FROM ('a', 1) TO ('a', 10);
-- Check that partition-key UPDATE works sanely on a partitioned table that
-- does not have any child partitions.
UPDATE part_b_10_b_20 set b = b - 6;
-- Create some more partitions following the above pattern of descending bound
-- order, but let's make the situation a bit more complex by having the
-- attribute numbers of the columns vary from their parent partition.
CREATE TABLE part_c_100_200 (e varchar, c numeric, a text, b bigint, d int) PARTITION BY range (abs(d));
ALTER TABLE part_c_100_200 DROP COLUMN e, DROP COLUMN c, DROP COLUMN a;
ALTER TABLE part_c_100_200 ADD COLUMN c numeric, ADD COLUMN e varchar, ADD COLUMN a text;
ALTER TABLE part_c_100_200 DROP COLUMN b;
ALTER TABLE part_c_100_200 ADD COLUMN b bigint;
CREATE TABLE part_d_1_15 PARTITION OF part_c_100_200 FOR VALUES FROM (1) TO (15);
CREATE TABLE part_d_15_20 PARTITION OF part_c_100_200 FOR VALUES FROM (15) TO (20);
ALTER TABLE part_b_10_b_20 ATTACH PARTITION part_c_100_200 FOR VALUES FROM (100) TO (200);
CREATE TABLE part_c_1_100 (e varchar, d int, c numeric, b bigint, a text);
ALTER TABLE part_b_10_b_20 ATTACH PARTITION part_c_1_100 FOR VALUES FROM (1) TO (100);
\set init_range_parted 'truncate range_parted; insert into range_parted VALUES (''a'', 1, 1, 1), (''a'', 10, 200, 1), (''b'', 12, 96, 1), (''b'', 13, 97, 2), (''b'', 15, 105, 16), (''b'', 17, 105, 19)'
\set show_data 'select tableoid::regclass::text COLLATE "C" partname, * from range_parted ORDER BY 1, 2, 3, 4, 5, 6'
:init_range_parted;
:show_data;
partname | a | b | c | d | e
----------------+---+----+-----+----+---
part_a_10_a_20 | a | 10 | 200 | 1 |
part_a_1_a_10 | a | 1 | 1 | 1 |
part_c_1_100 | b | 12 | 96 | 1 |
part_c_1_100 | b | 13 | 97 | 2 |
part_d_15_20 | b | 15 | 105 | 16 |
part_d_15_20 | b | 17 | 105 | 19 |
(6 rows)
-- The order of subplans should be in bound order
EXPLAIN (costs off) UPDATE range_parted set c = c - 50 WHERE c > 97;
Rework planning and execution of UPDATE and DELETE. This patch makes two closely related sets of changes: 1. For UPDATE, the subplan of the ModifyTable node now only delivers the new values of the changed columns (i.e., the expressions computed in the query's SET clause) plus row identity information such as CTID. ModifyTable must re-fetch the original tuple to merge in the old values of any unchanged columns. The core advantage of this is that the changed columns are uniform across all tables of an inherited or partitioned target relation, whereas the other columns might not be. A secondary advantage, when the UPDATE involves joins, is that less data needs to pass through the plan tree. The disadvantage of course is an extra fetch of each tuple to be updated. However, that seems to be very nearly free in context; even worst-case tests don't show it to add more than a couple percent to the total query cost. At some point it might be interesting to combine the re-fetch with the tuple access that ModifyTable must do anyway to mark the old tuple dead; but that would require a good deal of refactoring and it seems it wouldn't buy all that much, so this patch doesn't attempt it. 2. For inherited UPDATE/DELETE, instead of generating a separate subplan for each target relation, we now generate a single subplan that is just exactly like a SELECT's plan, then stick ModifyTable on top of that. To let ModifyTable know which target relation a given incoming row refers to, a tableoid junk column is added to the row identity information. This gets rid of the horrid hack that was inheritance_planner(), eliminating O(N^2) planning cost and memory consumption in cases where there were many unprunable target relations. Point 2 of course requires point 1, so that there is a uniform definition of the non-junk columns to be returned by the subplan. We can't insist on uniform definition of the row identity junk columns however, if we want to keep the ability to have both plain and foreign tables in a partitioning hierarchy. Since it wouldn't scale very far to have every child table have its own row identity column, this patch includes provisions to merge similar row identity columns into one column of the subplan result. In particular, we can merge the whole-row Vars typically used as row identity by FDWs into one column by pretending they are type RECORD. (It's still okay for the actual composite Datums to be labeled with the table's rowtype OID, though.) There is more that can be done to file down residual inefficiencies in this patch, but it seems to be committable now. FDW authors should note several API changes: * The argument list for AddForeignUpdateTargets() has changed, and so has the method it must use for adding junk columns to the query. Call add_row_identity_var() instead of manipulating the parse tree directly. You might want to reconsider exactly what you're adding, too. * PlanDirectModify() must now work a little harder to find the ForeignScan plan node; if the foreign table is part of a partitioning hierarchy then the ForeignScan might not be the direct child of ModifyTable. See postgres_fdw for sample code. * To check whether a relation is a target relation, it's no longer sufficient to compare its relid to root->parse->resultRelation. Instead, check it against all_result_relids or leaf_result_relids, as appropriate. Amit Langote and Tom Lane Discussion: https://postgr.es/m/CA+HiwqHpHdqdDn48yCEhynnniahH78rwcrv1rEX65-fsZGBOLQ@mail.gmail.com
2021-03-31 17:52:34 +02:00
QUERY PLAN
-------------------------------------------------------
Update on range_parted
Fix EXPLAIN's column alias output for mismatched child tables. If an inheritance/partitioning parent table is assigned some column alias names in the query, EXPLAIN mapped those aliases onto the child tables' columns by physical position, resulting in bogus output if a child table's columns aren't one-for-one with the parent's. To fix, make expand_single_inheritance_child() generate a correctly re-mapped column alias list, rather than just copying the parent RTE's alias node. (We have to fill the alias field, not just adjust the eref field, because ruleutils.c will ignore eref in favor of looking at the real column names.) This means that child tables will now always have alias fields in plan rtables, where before they might not have. That results in a rather substantial set of regression test output changes: EXPLAIN will now always show child tables with aliases that match the parent table (usually with "_N" appended for uniqueness). But that seems like a net positive for understandability, since the parent alias corresponds to something that actually appeared in the original query, while the child table names didn't. (Note that this does not change anything for cases where an explicit table alias was written in the query for the parent table; it just makes cases without such aliases behave similarly to that.) Hence, while we could avoid these subsidiary changes if we made inherit.c more complicated, we choose not to. Discussion: https://postgr.es/m/12424.1575168015@sss.pgh.pa.us
2019-12-03 01:08:10 +01:00
Update on part_a_1_a_10 range_parted_1
Update on part_a_10_a_20 range_parted_2
Update on part_b_1_b_10 range_parted_3
Update on part_c_1_100 range_parted_4
Update on part_d_1_15 range_parted_5
Update on part_d_15_20 range_parted_6
Update on part_b_20_b_30 range_parted_7
Rework planning and execution of UPDATE and DELETE. This patch makes two closely related sets of changes: 1. For UPDATE, the subplan of the ModifyTable node now only delivers the new values of the changed columns (i.e., the expressions computed in the query's SET clause) plus row identity information such as CTID. ModifyTable must re-fetch the original tuple to merge in the old values of any unchanged columns. The core advantage of this is that the changed columns are uniform across all tables of an inherited or partitioned target relation, whereas the other columns might not be. A secondary advantage, when the UPDATE involves joins, is that less data needs to pass through the plan tree. The disadvantage of course is an extra fetch of each tuple to be updated. However, that seems to be very nearly free in context; even worst-case tests don't show it to add more than a couple percent to the total query cost. At some point it might be interesting to combine the re-fetch with the tuple access that ModifyTable must do anyway to mark the old tuple dead; but that would require a good deal of refactoring and it seems it wouldn't buy all that much, so this patch doesn't attempt it. 2. For inherited UPDATE/DELETE, instead of generating a separate subplan for each target relation, we now generate a single subplan that is just exactly like a SELECT's plan, then stick ModifyTable on top of that. To let ModifyTable know which target relation a given incoming row refers to, a tableoid junk column is added to the row identity information. This gets rid of the horrid hack that was inheritance_planner(), eliminating O(N^2) planning cost and memory consumption in cases where there were many unprunable target relations. Point 2 of course requires point 1, so that there is a uniform definition of the non-junk columns to be returned by the subplan. We can't insist on uniform definition of the row identity junk columns however, if we want to keep the ability to have both plain and foreign tables in a partitioning hierarchy. Since it wouldn't scale very far to have every child table have its own row identity column, this patch includes provisions to merge similar row identity columns into one column of the subplan result. In particular, we can merge the whole-row Vars typically used as row identity by FDWs into one column by pretending they are type RECORD. (It's still okay for the actual composite Datums to be labeled with the table's rowtype OID, though.) There is more that can be done to file down residual inefficiencies in this patch, but it seems to be committable now. FDW authors should note several API changes: * The argument list for AddForeignUpdateTargets() has changed, and so has the method it must use for adding junk columns to the query. Call add_row_identity_var() instead of manipulating the parse tree directly. You might want to reconsider exactly what you're adding, too. * PlanDirectModify() must now work a little harder to find the ForeignScan plan node; if the foreign table is part of a partitioning hierarchy then the ForeignScan might not be the direct child of ModifyTable. See postgres_fdw for sample code. * To check whether a relation is a target relation, it's no longer sufficient to compare its relid to root->parse->resultRelation. Instead, check it against all_result_relids or leaf_result_relids, as appropriate. Amit Langote and Tom Lane Discussion: https://postgr.es/m/CA+HiwqHpHdqdDn48yCEhynnniahH78rwcrv1rEX65-fsZGBOLQ@mail.gmail.com
2021-03-31 17:52:34 +02:00
-> Append
-> Seq Scan on part_a_1_a_10 range_parted_1
Filter: (c > '97'::numeric)
-> Seq Scan on part_a_10_a_20 range_parted_2
Filter: (c > '97'::numeric)
-> Seq Scan on part_b_1_b_10 range_parted_3
Filter: (c > '97'::numeric)
-> Seq Scan on part_c_1_100 range_parted_4
Filter: (c > '97'::numeric)
-> Seq Scan on part_d_1_15 range_parted_5
Filter: (c > '97'::numeric)
-> Seq Scan on part_d_15_20 range_parted_6
Filter: (c > '97'::numeric)
-> Seq Scan on part_b_20_b_30 range_parted_7
Filter: (c > '97'::numeric)
(23 rows)
-- fail, row movement happens only within the partition subtree.
UPDATE part_c_100_200 set c = c - 20, d = c WHERE c = 105;
ERROR: new row for relation "part_c_100_200" violates partition constraint
DETAIL: Failing row contains (105, 85, null, b, 15).
-- fail, no partition key update, so no attempt to move tuple,
-- but "a = 'a'" violates partition constraint enforced by root partition)
UPDATE part_b_10_b_20 set a = 'a';
ERROR: new row for relation "part_c_1_100" violates partition constraint
DETAIL: Failing row contains (null, 1, 96, 12, a).
-- ok, partition key update, no constraint violation
UPDATE range_parted set d = d - 10 WHERE d > 10;
-- ok, no partition key update, no constraint violation
UPDATE range_parted set e = d;
-- No row found
UPDATE part_c_1_100 set c = c + 20 WHERE c = 98;
-- ok, row movement
UPDATE part_b_10_b_20 set c = c + 20 returning c, b, a;
c | b | a
-----+----+---
116 | 12 | b
117 | 13 | b
125 | 15 | b
125 | 17 | b
(4 rows)
:show_data;
partname | a | b | c | d | e
----------------+---+----+-----+---+---
part_a_10_a_20 | a | 10 | 200 | 1 | 1
part_a_1_a_10 | a | 1 | 1 | 1 | 1
part_d_1_15 | b | 12 | 116 | 1 | 1
part_d_1_15 | b | 13 | 117 | 2 | 2
part_d_1_15 | b | 15 | 125 | 6 | 6
part_d_1_15 | b | 17 | 125 | 9 | 9
(6 rows)
-- fail, row movement happens only within the partition subtree.
UPDATE part_b_10_b_20 set b = b - 6 WHERE c > 116 returning *;
ERROR: new row for relation "part_d_1_15" violates partition constraint
DETAIL: Failing row contains (2, 117, 2, b, 7).
-- ok, row movement, with subset of rows moved into different partition.
UPDATE range_parted set b = b - 6 WHERE c > 116 returning a, b + c;
a | ?column?
---+----------
a | 204
b | 124
b | 134
b | 136
(4 rows)
:show_data;
partname | a | b | c | d | e
---------------+---+----+-----+---+---
part_a_1_a_10 | a | 1 | 1 | 1 | 1
part_a_1_a_10 | a | 4 | 200 | 1 | 1
part_b_1_b_10 | b | 7 | 117 | 2 | 2
part_b_1_b_10 | b | 9 | 125 | 6 | 6
part_d_1_15 | b | 11 | 125 | 9 | 9
part_d_1_15 | b | 12 | 116 | 1 | 1
(6 rows)
-- Common table needed for multiple test scenarios.
CREATE TABLE mintab(c1 int);
INSERT into mintab VALUES (120);
-- update partition key using updatable view.
CREATE VIEW upview AS SELECT * FROM range_parted WHERE (select c > c1 FROM mintab) WITH CHECK OPTION;
-- ok
UPDATE upview set c = 199 WHERE b = 4;
-- fail, check option violation
UPDATE upview set c = 120 WHERE b = 4;
ERROR: new row violates check option for view "upview"
DETAIL: Failing row contains (a, 4, 120, 1, 1).
-- fail, row movement with check option violation
UPDATE upview set a = 'b', b = 15, c = 120 WHERE b = 4;
ERROR: new row violates check option for view "upview"
DETAIL: Failing row contains (b, 15, 120, 1, 1).
-- ok, row movement, check option passes
UPDATE upview set a = 'b', b = 15 WHERE b = 4;
:show_data;
partname | a | b | c | d | e
---------------+---+----+-----+---+---
part_a_1_a_10 | a | 1 | 1 | 1 | 1
part_b_1_b_10 | b | 7 | 117 | 2 | 2
part_b_1_b_10 | b | 9 | 125 | 6 | 6
part_d_1_15 | b | 11 | 125 | 9 | 9
part_d_1_15 | b | 12 | 116 | 1 | 1
part_d_1_15 | b | 15 | 199 | 1 | 1
(6 rows)
-- cleanup
DROP VIEW upview;
-- RETURNING having whole-row vars.
:init_range_parted;
UPDATE range_parted set c = 95 WHERE a = 'b' and b > 10 and c > 100 returning (range_parted), *;
range_parted | a | b | c | d | e
---------------+---+----+----+----+---
(b,15,95,16,) | b | 15 | 95 | 16 |
(b,17,95,19,) | b | 17 | 95 | 19 |
(2 rows)
:show_data;
partname | a | b | c | d | e
----------------+---+----+-----+----+---
part_a_10_a_20 | a | 10 | 200 | 1 |
part_a_1_a_10 | a | 1 | 1 | 1 |
part_c_1_100 | b | 12 | 96 | 1 |
part_c_1_100 | b | 13 | 97 | 2 |
part_c_1_100 | b | 15 | 95 | 16 |
part_c_1_100 | b | 17 | 95 | 19 |
(6 rows)
-- Transition tables with update row movement
:init_range_parted;
CREATE FUNCTION trans_updatetrigfunc() RETURNS trigger LANGUAGE plpgsql AS
$$
begin
raise notice 'trigger = %, old table = %, new table = %',
TG_NAME,
(select string_agg(old_table::text, ', ' ORDER BY a) FROM old_table),
(select string_agg(new_table::text, ', ' ORDER BY a) FROM new_table);
return null;
end;
$$;
CREATE TRIGGER trans_updatetrig
AFTER UPDATE ON range_parted REFERENCING OLD TABLE AS old_table NEW TABLE AS new_table
FOR EACH STATEMENT EXECUTE PROCEDURE trans_updatetrigfunc();
UPDATE range_parted set c = (case when c = 96 then 110 else c + 1 end ) WHERE a = 'b' and b > 10 and c >= 96;
NOTICE: trigger = trans_updatetrig, old table = (b,12,96,1,), (b,13,97,2,), (b,15,105,16,), (b,17,105,19,), new table = (b,12,110,1,), (b,13,98,2,), (b,15,106,16,), (b,17,106,19,)
:show_data;
partname | a | b | c | d | e
----------------+---+----+-----+----+---
part_a_10_a_20 | a | 10 | 200 | 1 |
part_a_1_a_10 | a | 1 | 1 | 1 |
part_c_1_100 | b | 13 | 98 | 2 |
part_d_15_20 | b | 15 | 106 | 16 |
part_d_15_20 | b | 17 | 106 | 19 |
part_d_1_15 | b | 12 | 110 | 1 |
(6 rows)
:init_range_parted;
-- Enabling OLD TABLE capture for both DELETE as well as UPDATE stmt triggers
-- should not cause DELETEd rows to be captured twice. Similar thing for
-- INSERT triggers and inserted rows.
CREATE TRIGGER trans_deletetrig
AFTER DELETE ON range_parted REFERENCING OLD TABLE AS old_table
FOR EACH STATEMENT EXECUTE PROCEDURE trans_updatetrigfunc();
CREATE TRIGGER trans_inserttrig
AFTER INSERT ON range_parted REFERENCING NEW TABLE AS new_table
FOR EACH STATEMENT EXECUTE PROCEDURE trans_updatetrigfunc();
UPDATE range_parted set c = c + 50 WHERE a = 'b' and b > 10 and c >= 96;
NOTICE: trigger = trans_updatetrig, old table = (b,12,96,1,), (b,13,97,2,), (b,15,105,16,), (b,17,105,19,), new table = (b,12,146,1,), (b,13,147,2,), (b,15,155,16,), (b,17,155,19,)
:show_data;
partname | a | b | c | d | e
----------------+---+----+-----+----+---
part_a_10_a_20 | a | 10 | 200 | 1 |
part_a_1_a_10 | a | 1 | 1 | 1 |
part_d_15_20 | b | 15 | 155 | 16 |
part_d_15_20 | b | 17 | 155 | 19 |
part_d_1_15 | b | 12 | 146 | 1 |
part_d_1_15 | b | 13 | 147 | 2 |
(6 rows)
DROP TRIGGER trans_deletetrig ON range_parted;
DROP TRIGGER trans_inserttrig ON range_parted;
-- Don't drop trans_updatetrig yet. It is required below.
-- Test with transition tuple conversion happening for rows moved into the
-- new partition. This requires a trigger that references transition table
-- (we already have trans_updatetrig). For inserted rows, the conversion
-- is not usually needed, because the original tuple is already compatible with
-- the desired transition tuple format. But conversion happens when there is a
-- BR trigger because the trigger can change the inserted row. So install a
-- BR triggers on those child partitions where the rows will be moved.
CREATE FUNCTION func_parted_mod_b() RETURNS trigger AS $$
BEGIN
NEW.b = NEW.b + 1;
return NEW;
END $$ language plpgsql;
CREATE TRIGGER trig_c1_100 BEFORE UPDATE OR INSERT ON part_c_1_100
FOR EACH ROW EXECUTE PROCEDURE func_parted_mod_b();
CREATE TRIGGER trig_d1_15 BEFORE UPDATE OR INSERT ON part_d_1_15
FOR EACH ROW EXECUTE PROCEDURE func_parted_mod_b();
CREATE TRIGGER trig_d15_20 BEFORE UPDATE OR INSERT ON part_d_15_20
FOR EACH ROW EXECUTE PROCEDURE func_parted_mod_b();
:init_range_parted;
UPDATE range_parted set c = (case when c = 96 then 110 else c + 1 end) WHERE a = 'b' and b > 10 and c >= 96;
NOTICE: trigger = trans_updatetrig, old table = (b,13,96,1,), (b,14,97,2,), (b,16,105,16,), (b,18,105,19,), new table = (b,15,110,1,), (b,15,98,2,), (b,17,106,16,), (b,19,106,19,)
:show_data;
partname | a | b | c | d | e
----------------+---+----+-----+----+---
part_a_10_a_20 | a | 10 | 200 | 1 |
part_a_1_a_10 | a | 1 | 1 | 1 |
part_c_1_100 | b | 15 | 98 | 2 |
part_d_15_20 | b | 17 | 106 | 16 |
part_d_15_20 | b | 19 | 106 | 19 |
part_d_1_15 | b | 15 | 110 | 1 |
(6 rows)
:init_range_parted;
UPDATE range_parted set c = c + 50 WHERE a = 'b' and b > 10 and c >= 96;
NOTICE: trigger = trans_updatetrig, old table = (b,13,96,1,), (b,14,97,2,), (b,16,105,16,), (b,18,105,19,), new table = (b,15,146,1,), (b,16,147,2,), (b,17,155,16,), (b,19,155,19,)
:show_data;
partname | a | b | c | d | e
----------------+---+----+-----+----+---
part_a_10_a_20 | a | 10 | 200 | 1 |
part_a_1_a_10 | a | 1 | 1 | 1 |
part_d_15_20 | b | 17 | 155 | 16 |
part_d_15_20 | b | 19 | 155 | 19 |
part_d_1_15 | b | 15 | 146 | 1 |
part_d_1_15 | b | 16 | 147 | 2 |
(6 rows)
-- Case where per-partition tuple conversion map array is allocated, but the
-- map is not required for the particular tuple that is routed, thanks to
-- matching table attributes of the partition and the target table.
:init_range_parted;
UPDATE range_parted set b = 15 WHERE b = 1;
NOTICE: trigger = trans_updatetrig, old table = (a,1,1,1,), new table = (a,15,1,1,)
:show_data;
partname | a | b | c | d | e
----------------+---+----+-----+----+---
part_a_10_a_20 | a | 10 | 200 | 1 |
part_a_10_a_20 | a | 15 | 1 | 1 |
part_c_1_100 | b | 13 | 96 | 1 |
part_c_1_100 | b | 14 | 97 | 2 |
part_d_15_20 | b | 16 | 105 | 16 |
part_d_15_20 | b | 18 | 105 | 19 |
(6 rows)
DROP TRIGGER trans_updatetrig ON range_parted;
DROP TRIGGER trig_c1_100 ON part_c_1_100;
DROP TRIGGER trig_d1_15 ON part_d_1_15;
DROP TRIGGER trig_d15_20 ON part_d_15_20;
DROP FUNCTION func_parted_mod_b();
-- RLS policies with update-row-movement
-----------------------------------------
ALTER TABLE range_parted ENABLE ROW LEVEL SECURITY;
CREATE USER regress_range_parted_user;
GRANT ALL ON range_parted, mintab TO regress_range_parted_user;
CREATE POLICY seeall ON range_parted AS PERMISSIVE FOR SELECT USING (true);
CREATE POLICY policy_range_parted ON range_parted for UPDATE USING (true) WITH CHECK (c % 2 = 0);
:init_range_parted;
SET SESSION AUTHORIZATION regress_range_parted_user;
-- This should fail with RLS violation error while moving row from
-- part_a_10_a_20 to part_d_1_15, because we are setting 'c' to an odd number.
UPDATE range_parted set a = 'b', c = 151 WHERE a = 'a' and c = 200;
ERROR: new row violates row-level security policy for table "range_parted"
RESET SESSION AUTHORIZATION;
-- Create a trigger on part_d_1_15
CREATE FUNCTION func_d_1_15() RETURNS trigger AS $$
BEGIN
NEW.c = NEW.c + 1; -- Make even numbers odd, or vice versa
return NEW;
END $$ LANGUAGE plpgsql;
CREATE TRIGGER trig_d_1_15 BEFORE INSERT ON part_d_1_15
FOR EACH ROW EXECUTE PROCEDURE func_d_1_15();
:init_range_parted;
SET SESSION AUTHORIZATION regress_range_parted_user;
-- Here, RLS checks should succeed while moving row from part_a_10_a_20 to
-- part_d_1_15. Even though the UPDATE is setting 'c' to an odd number, the
-- trigger at the destination partition again makes it an even number.
UPDATE range_parted set a = 'b', c = 151 WHERE a = 'a' and c = 200;
RESET SESSION AUTHORIZATION;
:init_range_parted;
SET SESSION AUTHORIZATION regress_range_parted_user;
-- This should fail with RLS violation error. Even though the UPDATE is setting
-- 'c' to an even number, the trigger at the destination partition again makes
-- it an odd number.
UPDATE range_parted set a = 'b', c = 150 WHERE a = 'a' and c = 200;
ERROR: new row violates row-level security policy for table "range_parted"
-- Cleanup
RESET SESSION AUTHORIZATION;
DROP TRIGGER trig_d_1_15 ON part_d_1_15;
DROP FUNCTION func_d_1_15();
-- Policy expression contains SubPlan
RESET SESSION AUTHORIZATION;
:init_range_parted;
CREATE POLICY policy_range_parted_subplan on range_parted
AS RESTRICTIVE for UPDATE USING (true)
WITH CHECK ((SELECT range_parted.c <= c1 FROM mintab));
SET SESSION AUTHORIZATION regress_range_parted_user;
-- fail, mintab has row with c1 = 120
UPDATE range_parted set a = 'b', c = 122 WHERE a = 'a' and c = 200;
ERROR: new row violates row-level security policy "policy_range_parted_subplan" for table "range_parted"
Implement table partitioning. Table partitioning is like table inheritance and reuses much of the existing infrastructure, but there are some important differences. The parent is called a partitioned table and is always empty; it may not have indexes or non-inherited constraints, since those make no sense for a relation with no data of its own. The children are called partitions and contain all of the actual data. Each partition has an implicit partitioning constraint. Multiple inheritance is not allowed, and partitioning and inheritance can't be mixed. Partitions can't have extra columns and may not allow nulls unless the parent does. Tuples inserted into the parent are automatically routed to the correct partition, so tuple-routing ON INSERT triggers are not needed. Tuple routing isn't yet supported for partitions which are foreign tables, and it doesn't handle updates that cross partition boundaries. Currently, tables can be range-partitioned or list-partitioned. List partitioning is limited to a single column, but range partitioning can involve multiple columns. A partitioning "column" can be an expression. Because table partitioning is less general than table inheritance, it is hoped that it will be easier to reason about properties of partitions, and therefore that this will serve as a better foundation for a variety of possible optimizations, including query planner optimizations. The tuple routing based which this patch does based on the implicit partitioning constraints is an example of this, but it seems likely that many other useful optimizations are also possible. Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat, Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova, Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
-- ok
UPDATE range_parted set a = 'b', c = 120 WHERE a = 'a' and c = 200;
-- RLS policy expression contains whole row.
RESET SESSION AUTHORIZATION;
:init_range_parted;
CREATE POLICY policy_range_parted_wholerow on range_parted AS RESTRICTIVE for UPDATE USING (true)
WITH CHECK (range_parted = row('b', 10, 112, 1, NULL)::range_parted);
SET SESSION AUTHORIZATION regress_range_parted_user;
-- ok, should pass the RLS check
UPDATE range_parted set a = 'b', c = 112 WHERE a = 'a' and c = 200;
RESET SESSION AUTHORIZATION;
:init_range_parted;
SET SESSION AUTHORIZATION regress_range_parted_user;
-- fail, the whole row RLS check should fail
UPDATE range_parted set a = 'b', c = 116 WHERE a = 'a' and c = 200;
ERROR: new row violates row-level security policy "policy_range_parted_wholerow" for table "range_parted"
-- Cleanup
RESET SESSION AUTHORIZATION;
DROP POLICY policy_range_parted ON range_parted;
DROP POLICY policy_range_parted_subplan ON range_parted;
DROP POLICY policy_range_parted_wholerow ON range_parted;
REVOKE ALL ON range_parted, mintab FROM regress_range_parted_user;
DROP USER regress_range_parted_user;
DROP TABLE mintab;
-- statement triggers with update row movement
---------------------------------------------------
:init_range_parted;
CREATE FUNCTION trigfunc() returns trigger language plpgsql as
$$
begin
raise notice 'trigger = % fired on table % during %',
TG_NAME, TG_TABLE_NAME, TG_OP;
return null;
end;
$$;
-- Triggers on root partition
CREATE TRIGGER parent_delete_trig
AFTER DELETE ON range_parted for each statement execute procedure trigfunc();
CREATE TRIGGER parent_update_trig
AFTER UPDATE ON range_parted for each statement execute procedure trigfunc();
CREATE TRIGGER parent_insert_trig
AFTER INSERT ON range_parted for each statement execute procedure trigfunc();
-- Triggers on leaf partition part_c_1_100
CREATE TRIGGER c1_delete_trig
AFTER DELETE ON part_c_1_100 for each statement execute procedure trigfunc();
CREATE TRIGGER c1_update_trig
AFTER UPDATE ON part_c_1_100 for each statement execute procedure trigfunc();
CREATE TRIGGER c1_insert_trig
AFTER INSERT ON part_c_1_100 for each statement execute procedure trigfunc();
-- Triggers on leaf partition part_d_1_15
CREATE TRIGGER d1_delete_trig
AFTER DELETE ON part_d_1_15 for each statement execute procedure trigfunc();
CREATE TRIGGER d1_update_trig
AFTER UPDATE ON part_d_1_15 for each statement execute procedure trigfunc();
CREATE TRIGGER d1_insert_trig
AFTER INSERT ON part_d_1_15 for each statement execute procedure trigfunc();
-- Triggers on leaf partition part_d_15_20
CREATE TRIGGER d15_delete_trig
AFTER DELETE ON part_d_15_20 for each statement execute procedure trigfunc();
CREATE TRIGGER d15_update_trig
AFTER UPDATE ON part_d_15_20 for each statement execute procedure trigfunc();
CREATE TRIGGER d15_insert_trig
AFTER INSERT ON part_d_15_20 for each statement execute procedure trigfunc();
-- Move all rows from part_c_100_200 to part_c_1_100. None of the delete or
-- insert statement triggers should be fired.
UPDATE range_parted set c = c - 50 WHERE c > 97;
NOTICE: trigger = parent_update_trig fired on table range_parted during UPDATE
:show_data;
partname | a | b | c | d | e
----------------+---+----+-----+----+---
part_a_10_a_20 | a | 10 | 150 | 1 |
part_a_1_a_10 | a | 1 | 1 | 1 |
part_c_1_100 | b | 12 | 96 | 1 |
part_c_1_100 | b | 13 | 97 | 2 |
part_c_1_100 | b | 15 | 55 | 16 |
part_c_1_100 | b | 17 | 55 | 19 |
(6 rows)
DROP TRIGGER parent_delete_trig ON range_parted;
DROP TRIGGER parent_update_trig ON range_parted;
DROP TRIGGER parent_insert_trig ON range_parted;
DROP TRIGGER c1_delete_trig ON part_c_1_100;
DROP TRIGGER c1_update_trig ON part_c_1_100;
DROP TRIGGER c1_insert_trig ON part_c_1_100;
DROP TRIGGER d1_delete_trig ON part_d_1_15;
DROP TRIGGER d1_update_trig ON part_d_1_15;
DROP TRIGGER d1_insert_trig ON part_d_1_15;
DROP TRIGGER d15_delete_trig ON part_d_15_20;
DROP TRIGGER d15_update_trig ON part_d_15_20;
DROP TRIGGER d15_insert_trig ON part_d_15_20;
-- Creating default partition for range
:init_range_parted;
create table part_def partition of range_parted default;
\d+ part_def
Table "public.part_def"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
--------+-------------------+-----------+----------+---------+----------+--------------+-------------
a | text | | | | extended | |
b | bigint | | | | plain | |
c | numeric | | | | main | |
d | integer | | | | plain | |
e | character varying | | | | extended | |
Partition of: range_parted DEFAULT
Partition constraint: (NOT ((a IS NOT NULL) AND (b IS NOT NULL) AND (((a = 'a'::text) AND (b >= '1'::bigint) AND (b < '10'::bigint)) OR ((a = 'a'::text) AND (b >= '10'::bigint) AND (b < '20'::bigint)) OR ((a = 'b'::text) AND (b >= '1'::bigint) AND (b < '10'::bigint)) OR ((a = 'b'::text) AND (b >= '10'::bigint) AND (b < '20'::bigint)) OR ((a = 'b'::text) AND (b >= '20'::bigint) AND (b < '30'::bigint)))))
insert into range_parted values ('c', 9);
-- ok
update part_def set a = 'd' where a = 'c';
-- fail
update part_def set a = 'a' where a = 'd';
ERROR: new row for relation "part_def" violates partition constraint
DETAIL: Failing row contains (a, 9, null, null, null).
:show_data;
partname | a | b | c | d | e
----------------+---+----+-----+----+---
part_a_10_a_20 | a | 10 | 200 | 1 |
part_a_1_a_10 | a | 1 | 1 | 1 |
part_c_1_100 | b | 12 | 96 | 1 |
part_c_1_100 | b | 13 | 97 | 2 |
part_d_15_20 | b | 15 | 105 | 16 |
part_d_15_20 | b | 17 | 105 | 19 |
part_def | d | 9 | | |
(7 rows)
-- Update row movement from non-default to default partition.
-- fail, default partition is not under part_a_10_a_20;
UPDATE part_a_10_a_20 set a = 'ad' WHERE a = 'a';
ERROR: new row for relation "part_a_10_a_20" violates partition constraint
DETAIL: Failing row contains (ad, 10, 200, 1, null).
-- ok
UPDATE range_parted set a = 'ad' WHERE a = 'a';
UPDATE range_parted set a = 'bd' WHERE a = 'b';
:show_data;
partname | a | b | c | d | e
----------+----+----+-----+----+---
part_def | ad | 1 | 1 | 1 |
part_def | ad | 10 | 200 | 1 |
part_def | bd | 12 | 96 | 1 |
part_def | bd | 13 | 97 | 2 |
part_def | bd | 15 | 105 | 16 |
part_def | bd | 17 | 105 | 19 |
part_def | d | 9 | | |
(7 rows)
-- Update row movement from default to non-default partitions.
-- ok
UPDATE range_parted set a = 'a' WHERE a = 'ad';
UPDATE range_parted set a = 'b' WHERE a = 'bd';
:show_data;
partname | a | b | c | d | e
----------------+---+----+-----+----+---
part_a_10_a_20 | a | 10 | 200 | 1 |
part_a_1_a_10 | a | 1 | 1 | 1 |
part_c_1_100 | b | 12 | 96 | 1 |
part_c_1_100 | b | 13 | 97 | 2 |
part_d_15_20 | b | 15 | 105 | 16 |
part_d_15_20 | b | 17 | 105 | 19 |
part_def | d | 9 | | |
(7 rows)
-- Cleanup: range_parted no longer needed.
DROP TABLE range_parted;
CREATE TABLE list_parted (
a text,
b int
) PARTITION BY list (a);
CREATE TABLE list_part1 PARTITION OF list_parted for VALUES in ('a', 'b');
CREATE TABLE list_default PARTITION OF list_parted default;
INSERT into list_part1 VALUES ('a', 1);
INSERT into list_default VALUES ('d', 10);
-- fail
UPDATE list_default set a = 'a' WHERE a = 'd';
ERROR: new row for relation "list_default" violates partition constraint
DETAIL: Failing row contains (a, 10).
-- ok
UPDATE list_default set a = 'x' WHERE a = 'd';
DROP TABLE list_parted;
--------------
-- Some more update-partition-key test scenarios below. This time use list
-- partitions.
--------------
-- Setup for list partitions
CREATE TABLE list_parted (a numeric, b int, c int8) PARTITION BY list (a);
CREATE TABLE sub_parted PARTITION OF list_parted for VALUES in (1) PARTITION BY list (b);
CREATE TABLE sub_part1(b int, c int8, a numeric);
ALTER TABLE sub_parted ATTACH PARTITION sub_part1 for VALUES in (1);
CREATE TABLE sub_part2(b int, c int8, a numeric);
ALTER TABLE sub_parted ATTACH PARTITION sub_part2 for VALUES in (2);
CREATE TABLE list_part1(a numeric, b int, c int8);
ALTER TABLE list_parted ATTACH PARTITION list_part1 for VALUES in (2,3);
INSERT into list_parted VALUES (2,5,50);
INSERT into list_parted VALUES (3,6,60);
INSERT into sub_parted VALUES (1,1,60);
INSERT into sub_parted VALUES (1,2,10);
-- Test partition constraint violation when intermediate ancestor is used and
-- constraint is inherited from upper root.
UPDATE sub_parted set a = 2 WHERE c = 10;
ERROR: new row for relation "sub_part2" violates partition constraint
DETAIL: Failing row contains (2, 10, 2).
-- Test update-partition-key, where the unpruned partitions do not have their
-- partition keys updated.
SELECT tableoid::regclass::text, * FROM list_parted WHERE a = 2 ORDER BY 1;
tableoid | a | b | c
------------+---+---+----
list_part1 | 2 | 5 | 50
(1 row)
UPDATE list_parted set b = c + a WHERE a = 2;
SELECT tableoid::regclass::text, * FROM list_parted WHERE a = 2 ORDER BY 1;
tableoid | a | b | c
------------+---+----+----
list_part1 | 2 | 52 | 50
(1 row)
-- Test the case where BR UPDATE triggers change the partition key.
CREATE FUNCTION func_parted_mod_b() returns trigger as $$
BEGIN
NEW.b = 2; -- This is changing partition key column.
return NEW;
END $$ LANGUAGE plpgsql;
CREATE TRIGGER parted_mod_b before update on sub_part1
for each row execute procedure func_parted_mod_b();
SELECT tableoid::regclass::text, * FROM list_parted ORDER BY 1, 2, 3, 4;
tableoid | a | b | c
------------+---+----+----
list_part1 | 2 | 52 | 50
list_part1 | 3 | 6 | 60
sub_part1 | 1 | 1 | 60
sub_part2 | 1 | 2 | 10
(4 rows)
-- This should do the tuple routing even though there is no explicit
-- partition-key update, because there is a trigger on sub_part1.
UPDATE list_parted set c = 70 WHERE b = 1;
SELECT tableoid::regclass::text, * FROM list_parted ORDER BY 1, 2, 3, 4;
tableoid | a | b | c
------------+---+----+----
list_part1 | 2 | 52 | 50
list_part1 | 3 | 6 | 60
sub_part2 | 1 | 2 | 10
sub_part2 | 1 | 2 | 70
(4 rows)
DROP TRIGGER parted_mod_b ON sub_part1;
-- If BR DELETE trigger prevented DELETE from happening, we should also skip
-- the INSERT if that delete is part of UPDATE=>DELETE+INSERT.
CREATE OR REPLACE FUNCTION func_parted_mod_b() returns trigger as $$
BEGIN
raise notice 'Trigger: Got OLD row %, but returning NULL', OLD;
return NULL;
END $$ LANGUAGE plpgsql;
CREATE TRIGGER trig_skip_delete before delete on sub_part2
for each row execute procedure func_parted_mod_b();
UPDATE list_parted set b = 1 WHERE c = 70;
NOTICE: Trigger: Got OLD row (2,70,1), but returning NULL
SELECT tableoid::regclass::text, * FROM list_parted ORDER BY 1, 2, 3, 4;
tableoid | a | b | c
------------+---+----+----
list_part1 | 2 | 52 | 50
list_part1 | 3 | 6 | 60
sub_part2 | 1 | 2 | 10
sub_part2 | 1 | 2 | 70
(4 rows)
-- Drop the trigger. Now the row should be moved.
DROP TRIGGER trig_skip_delete ON sub_part2;
UPDATE list_parted set b = 1 WHERE c = 70;
SELECT tableoid::regclass::text, * FROM list_parted ORDER BY 1, 2, 3, 4;
tableoid | a | b | c
------------+---+----+----
list_part1 | 2 | 52 | 50
list_part1 | 3 | 6 | 60
sub_part1 | 1 | 1 | 70
sub_part2 | 1 | 2 | 10
(4 rows)
DROP FUNCTION func_parted_mod_b();
-- UPDATE partition-key with FROM clause. If join produces multiple output
-- rows for the same row to be modified, we should tuple-route the row only
-- once. There should not be any rows inserted.
CREATE TABLE non_parted (id int);
INSERT into non_parted VALUES (1), (1), (1), (2), (2), (2), (3), (3), (3);
UPDATE list_parted t1 set a = 2 FROM non_parted t2 WHERE t1.a = t2.id and a = 1;
SELECT tableoid::regclass::text, * FROM list_parted ORDER BY 1, 2, 3, 4;
tableoid | a | b | c
------------+---+----+----
list_part1 | 2 | 1 | 70
list_part1 | 2 | 2 | 10
list_part1 | 2 | 52 | 50
list_part1 | 3 | 6 | 60
(4 rows)
DROP TABLE non_parted;
-- Cleanup: list_parted no longer needed.
DROP TABLE list_parted;
-- create custom operator class and hash function, for the same reason
-- explained in alter_table.sql
create or replace function dummy_hashint4(a int4, seed int8) returns int8 as
$$ begin return (a + seed); end; $$ language 'plpgsql' immutable;
create operator class custom_opclass for type int4 using hash as
operator 1 = , function 2 dummy_hashint4(int4, int8);
create table hash_parted (
a int,
b int
) partition by hash (a custom_opclass, b custom_opclass);
create table hpart1 partition of hash_parted for values with (modulus 2, remainder 1);
create table hpart2 partition of hash_parted for values with (modulus 4, remainder 2);
create table hpart3 partition of hash_parted for values with (modulus 8, remainder 0);
create table hpart4 partition of hash_parted for values with (modulus 8, remainder 4);
insert into hpart1 values (1, 1);
insert into hpart2 values (2, 5);
insert into hpart4 values (3, 4);
-- fail
update hpart1 set a = 3, b=4 where a = 1;
ERROR: new row for relation "hpart1" violates partition constraint
DETAIL: Failing row contains (3, 4).
-- ok, row movement
update hash_parted set b = b - 1 where b = 1;
-- ok
update hash_parted set b = b + 8 where b = 1;
Implement table partitioning. Table partitioning is like table inheritance and reuses much of the existing infrastructure, but there are some important differences. The parent is called a partitioned table and is always empty; it may not have indexes or non-inherited constraints, since those make no sense for a relation with no data of its own. The children are called partitions and contain all of the actual data. Each partition has an implicit partitioning constraint. Multiple inheritance is not allowed, and partitioning and inheritance can't be mixed. Partitions can't have extra columns and may not allow nulls unless the parent does. Tuples inserted into the parent are automatically routed to the correct partition, so tuple-routing ON INSERT triggers are not needed. Tuple routing isn't yet supported for partitions which are foreign tables, and it doesn't handle updates that cross partition boundaries. Currently, tables can be range-partitioned or list-partitioned. List partitioning is limited to a single column, but range partitioning can involve multiple columns. A partitioning "column" can be an expression. Because table partitioning is less general than table inheritance, it is hoped that it will be easier to reason about properties of partitions, and therefore that this will serve as a better foundation for a variety of possible optimizations, including query planner optimizations. The tuple routing based which this patch does based on the implicit partitioning constraints is an example of this, but it seems likely that many other useful optimizations are also possible. Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat, Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova, Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
-- cleanup
drop table hash_parted;
drop operator class custom_opclass using hash;
drop function dummy_hashint4(a int4, seed int8);