2003-08-26 20:32:23 +02:00
--
2006-09-04 00:37:06 +02:00
-- UPDATE syntax tests
2003-08-26 20:32:23 +02:00
--
CREATE TABLE update_test (
a INT DEFAULT 10,
2006-09-04 00:37:06 +02:00
b INT,
c TEXT
2003-08-26 20:32:23 +02:00
);
Add support for INSERT ... ON CONFLICT DO NOTHING/UPDATE.
The newly added ON CONFLICT clause allows to specify an alternative to
raising a unique or exclusion constraint violation error when inserting.
ON CONFLICT refers to constraints that can either be specified using a
inference clause (by specifying the columns of a unique constraint) or
by naming a unique or exclusion constraint. DO NOTHING avoids the
constraint violation, without touching the pre-existing row. DO UPDATE
SET ... [WHERE ...] updates the pre-existing tuple, and has access to
both the tuple proposed for insertion and the existing tuple; the
optional WHERE clause can be used to prevent an update from being
executed. The UPDATE SET and WHERE clauses have access to the tuple
proposed for insertion using the "magic" EXCLUDED alias, and to the
pre-existing tuple using the table name or its alias.
This feature is often referred to as upsert.
This is implemented using a new infrastructure called "speculative
insertion". It is an optimistic variant of regular insertion that first
does a pre-check for existing tuples and then attempts an insert. If a
violating tuple was inserted concurrently, the speculatively inserted
tuple is deleted and a new attempt is made. If the pre-check finds a
matching tuple the alternative DO NOTHING or DO UPDATE action is taken.
If the insertion succeeds without detecting a conflict, the tuple is
deemed inserted.
To handle the possible ambiguity between the excluded alias and a table
named excluded, and for convenience with long relation names, INSERT
INTO now can alias its target table.
Bumps catversion as stored rules change.
Author: Peter Geoghegan, with significant contributions from Heikki
Linnakangas and Andres Freund. Testing infrastructure by Jeff Janes.
Reviewed-By: Heikki Linnakangas, Andres Freund, Robert Haas, Simon Riggs,
Dean Rasheed, Stephen Frost and many others.
2015-05-08 05:31:36 +02:00
CREATE TABLE upsert_test (
a INT PRIMARY KEY,
b TEXT
);
2006-09-04 00:37:06 +02:00
INSERT INTO update_test VALUES (5, 10, 'foo');
INSERT INTO update_test(b, a) VALUES (15, 10);
2003-08-26 20:32:23 +02:00
SELECT * FROM update_test;
2006-09-04 00:37:06 +02:00
a | b | c
----+----+-----
5 | 10 | foo
10 | 15 |
2003-08-26 20:32:23 +02:00
(2 rows)
UPDATE update_test SET a = DEFAULT, b = DEFAULT;
SELECT * FROM update_test;
2006-09-04 00:37:06 +02:00
a | b | c
----+---+-----
10 | | foo
10 | |
2003-08-26 20:32:23 +02:00
(2 rows)
2006-01-22 06:20:35 +01:00
-- aliases for the UPDATE target table
UPDATE update_test AS t SET b = 10 WHERE t.a = 10;
SELECT * FROM update_test;
2006-09-04 00:37:06 +02:00
a | b | c
----+----+-----
10 | 10 | foo
10 | 10 |
2006-01-22 06:20:35 +01:00
(2 rows)
UPDATE update_test t SET b = t.b + 10 WHERE t.a = 10;
SELECT * FROM update_test;
2006-09-04 00:37:06 +02:00
a | b | c
----+----+-----
10 | 20 | foo
10 | 20 |
2006-01-22 06:20:35 +01:00
(2 rows)
2006-08-03 16:54:44 +02:00
--
-- Test VALUES in FROM
--
UPDATE update_test SET a=v.i FROM (VALUES(100, 20)) AS v(i, j)
2006-09-04 00:37:06 +02:00
WHERE update_test.b = v.j;
2006-08-03 16:54:44 +02:00
SELECT * FROM update_test;
2006-09-04 00:37:06 +02:00
a | b | c
-----+----+-----
100 | 20 | foo
100 | 20 |
2006-08-03 16:54:44 +02:00
(2 rows)
2016-11-20 20:26:19 +01:00
-- fail, wrong data type:
UPDATE update_test SET a = v.* FROM (VALUES(100, 20)) AS v(i, j)
WHERE update_test.b = v.j;
ERROR: column "a" is of type integer but expression is of type record
LINE 1: UPDATE update_test SET a = v.* FROM (VALUES(100, 20)) AS v(i...
^
HINT: You will need to rewrite or cast the expression.
2006-09-04 00:37:06 +02:00
--
-- Test multiple-set-clause syntax
--
Implement UPDATE tab SET (col1,col2,...) = (SELECT ...), ...
This SQL-standard feature allows a sub-SELECT yielding multiple columns
(but only one row) to be used to compute the new values of several columns
to be updated. While the same results can be had with an independent
sub-SELECT per column, such a workaround can require a great deal of
duplicated computation.
The standard actually says that the source for a multi-column assignment
could be any row-valued expression. The implementation used here is
tightly tied to our existing sub-SELECT support and can't handle other
cases; the Bison grammar would have some issues with them too. However,
I don't feel too bad about this since other cases can be converted into
sub-SELECTs. For instance, "SET (a,b,c) = row_valued_function(x)" could
be written "SET (a,b,c) = (SELECT * FROM row_valued_function(x))".
2014-06-18 19:22:25 +02:00
INSERT INTO update_test SELECT a,b+1,c FROM update_test;
SELECT * FROM update_test;
a | b | c
-----+----+-----
100 | 20 | foo
100 | 20 |
100 | 21 | foo
100 | 21 |
(4 rows)
2006-09-04 00:37:06 +02:00
UPDATE update_test SET (c,b,a) = ('bugle', b+11, DEFAULT) WHERE c = 'foo';
SELECT * FROM update_test;
a | b | c
-----+----+-------
100 | 20 |
Implement UPDATE tab SET (col1,col2,...) = (SELECT ...), ...
This SQL-standard feature allows a sub-SELECT yielding multiple columns
(but only one row) to be used to compute the new values of several columns
to be updated. While the same results can be had with an independent
sub-SELECT per column, such a workaround can require a great deal of
duplicated computation.
The standard actually says that the source for a multi-column assignment
could be any row-valued expression. The implementation used here is
tightly tied to our existing sub-SELECT support and can't handle other
cases; the Bison grammar would have some issues with them too. However,
I don't feel too bad about this since other cases can be converted into
sub-SELECTs. For instance, "SET (a,b,c) = row_valued_function(x)" could
be written "SET (a,b,c) = (SELECT * FROM row_valued_function(x))".
2014-06-18 19:22:25 +02:00
100 | 21 |
2006-09-04 00:37:06 +02:00
10 | 31 | bugle
Implement UPDATE tab SET (col1,col2,...) = (SELECT ...), ...
This SQL-standard feature allows a sub-SELECT yielding multiple columns
(but only one row) to be used to compute the new values of several columns
to be updated. While the same results can be had with an independent
sub-SELECT per column, such a workaround can require a great deal of
duplicated computation.
The standard actually says that the source for a multi-column assignment
could be any row-valued expression. The implementation used here is
tightly tied to our existing sub-SELECT support and can't handle other
cases; the Bison grammar would have some issues with them too. However,
I don't feel too bad about this since other cases can be converted into
sub-SELECTs. For instance, "SET (a,b,c) = row_valued_function(x)" could
be written "SET (a,b,c) = (SELECT * FROM row_valued_function(x))".
2014-06-18 19:22:25 +02:00
10 | 32 | bugle
(4 rows)
2006-09-04 00:37:06 +02:00
UPDATE update_test SET (c,b) = ('car', a+b), a = a + 1 WHERE a = 10;
SELECT * FROM update_test;
a | b | c
-----+----+-----
100 | 20 |
Implement UPDATE tab SET (col1,col2,...) = (SELECT ...), ...
This SQL-standard feature allows a sub-SELECT yielding multiple columns
(but only one row) to be used to compute the new values of several columns
to be updated. While the same results can be had with an independent
sub-SELECT per column, such a workaround can require a great deal of
duplicated computation.
The standard actually says that the source for a multi-column assignment
could be any row-valued expression. The implementation used here is
tightly tied to our existing sub-SELECT support and can't handle other
cases; the Bison grammar would have some issues with them too. However,
I don't feel too bad about this since other cases can be converted into
sub-SELECTs. For instance, "SET (a,b,c) = row_valued_function(x)" could
be written "SET (a,b,c) = (SELECT * FROM row_valued_function(x))".
2014-06-18 19:22:25 +02:00
100 | 21 |
2006-09-04 00:37:06 +02:00
11 | 41 | car
Implement UPDATE tab SET (col1,col2,...) = (SELECT ...), ...
This SQL-standard feature allows a sub-SELECT yielding multiple columns
(but only one row) to be used to compute the new values of several columns
to be updated. While the same results can be had with an independent
sub-SELECT per column, such a workaround can require a great deal of
duplicated computation.
The standard actually says that the source for a multi-column assignment
could be any row-valued expression. The implementation used here is
tightly tied to our existing sub-SELECT support and can't handle other
cases; the Bison grammar would have some issues with them too. However,
I don't feel too bad about this since other cases can be converted into
sub-SELECTs. For instance, "SET (a,b,c) = row_valued_function(x)" could
be written "SET (a,b,c) = (SELECT * FROM row_valued_function(x))".
2014-06-18 19:22:25 +02:00
11 | 42 | car
(4 rows)
2006-09-04 00:37:06 +02:00
-- fail, multi assignment to same column:
UPDATE update_test SET (c,b) = ('car', a+b), b = a + 1 WHERE a = 10;
ERROR: multiple assignments to same column "b"
Implement UPDATE tab SET (col1,col2,...) = (SELECT ...), ...
This SQL-standard feature allows a sub-SELECT yielding multiple columns
(but only one row) to be used to compute the new values of several columns
to be updated. While the same results can be had with an independent
sub-SELECT per column, such a workaround can require a great deal of
duplicated computation.
The standard actually says that the source for a multi-column assignment
could be any row-valued expression. The implementation used here is
tightly tied to our existing sub-SELECT support and can't handle other
cases; the Bison grammar would have some issues with them too. However,
I don't feel too bad about this since other cases can be converted into
sub-SELECTs. For instance, "SET (a,b,c) = row_valued_function(x)" could
be written "SET (a,b,c) = (SELECT * FROM row_valued_function(x))".
2014-06-18 19:22:25 +02:00
-- uncorrelated sub-select:
UPDATE update_test
SET (b,a) = (select a,b from update_test where b = 41 and c = 'car')
WHERE a = 100 AND b = 20;
SELECT * FROM update_test;
a | b | c
-----+----+-----
100 | 21 |
11 | 41 | car
11 | 42 | car
41 | 11 |
(4 rows)
-- correlated sub-select:
UPDATE update_test o
SET (b,a) = (select a+1,b from update_test i
where i.a=o.a and i.b=o.b and i.c is not distinct from o.c);
SELECT * FROM update_test;
a | b | c
----+-----+-----
21 | 101 |
41 | 12 | car
42 | 12 | car
11 | 42 |
(4 rows)
-- fail, multiple rows supplied:
UPDATE update_test SET (b,a) = (select a+1,b from update_test);
ERROR: more than one row returned by a subquery used as an expression
-- set to null if no rows supplied:
UPDATE update_test SET (b,a) = (select a+1,b from update_test where a = 1000)
WHERE a = 11;
SELECT * FROM update_test;
a | b | c
----+-----+-----
21 | 101 |
41 | 12 | car
42 | 12 | car
| |
(4 rows)
Improve handling of "UPDATE ... SET (column_list) = row_constructor".
Previously, the right-hand side of a multiple-column assignment, if it
wasn't a sub-SELECT, had to be a simple parenthesized expression list,
because gram.y was responsible for "bursting" the construct into
independent column assignments. This had the minor defect that you
couldn't write ROW (though you should be able to, since the standard says
this is a row constructor), and the rather larger defect that unlike other
uses of row constructors, we would not expand a "foo.*" item into multiple
columns.
Fix that by changing the RHS to be just "a_expr" in the grammar, leaving
it to transformMultiAssignRef to separate the elements of a RowExpr;
which it will do only after performing standard transformation of the
RowExpr, so that "foo.*" behaves as expected.
The key reason we didn't do that before was the hard-wired handling of
DEFAULT tokens (SetToDefault nodes). This patch deals with that issue by
allowing DEFAULT in any a_expr and having parse analysis throw an error
if SetToDefault is found in an unexpected place. That's an improvement
anyway since the error can be more specific than just "syntax error".
The SQL standard suggests that the RHS could be any a_expr yielding a
suitable row value. This patch doesn't really move the goal posts in that
respect --- you're still limited to RowExpr or a sub-SELECT --- but it does
fix the grammar restriction, so it provides some tangible progress towards
a full implementation. And the limitation is now documented by an explicit
error message rather than an unhelpful "syntax error".
Discussion: <8542.1479742008@sss.pgh.pa.us>
2016-11-22 21:19:57 +01:00
-- *-expansion should work in this context:
UPDATE update_test SET (a,b) = ROW(v.*) FROM (VALUES(21, 100)) AS v(i, j)
2016-11-20 20:26:19 +01:00
WHERE update_test.a = v.i;
Improve handling of "UPDATE ... SET (column_list) = row_constructor".
Previously, the right-hand side of a multiple-column assignment, if it
wasn't a sub-SELECT, had to be a simple parenthesized expression list,
because gram.y was responsible for "bursting" the construct into
independent column assignments. This had the minor defect that you
couldn't write ROW (though you should be able to, since the standard says
this is a row constructor), and the rather larger defect that unlike other
uses of row constructors, we would not expand a "foo.*" item into multiple
columns.
Fix that by changing the RHS to be just "a_expr" in the grammar, leaving
it to transformMultiAssignRef to separate the elements of a RowExpr;
which it will do only after performing standard transformation of the
RowExpr, so that "foo.*" behaves as expected.
The key reason we didn't do that before was the hard-wired handling of
DEFAULT tokens (SetToDefault nodes). This patch deals with that issue by
allowing DEFAULT in any a_expr and having parse analysis throw an error
if SetToDefault is found in an unexpected place. That's an improvement
anyway since the error can be more specific than just "syntax error".
The SQL standard suggests that the RHS could be any a_expr yielding a
suitable row value. This patch doesn't really move the goal posts in that
respect --- you're still limited to RowExpr or a sub-SELECT --- but it does
fix the grammar restriction, so it provides some tangible progress towards
a full implementation. And the limitation is now documented by an explicit
error message rather than an unhelpful "syntax error".
Discussion: <8542.1479742008@sss.pgh.pa.us>
2016-11-22 21:19:57 +01:00
-- you might expect this to work, but syntactically it's not a RowExpr:
UPDATE update_test SET (a,b) = (v.*) FROM (VALUES(21, 101)) AS v(i, j)
2016-11-20 20:26:19 +01:00
WHERE update_test.a = v.i;
Improve handling of "UPDATE ... SET (column_list) = row_constructor".
Previously, the right-hand side of a multiple-column assignment, if it
wasn't a sub-SELECT, had to be a simple parenthesized expression list,
because gram.y was responsible for "bursting" the construct into
independent column assignments. This had the minor defect that you
couldn't write ROW (though you should be able to, since the standard says
this is a row constructor), and the rather larger defect that unlike other
uses of row constructors, we would not expand a "foo.*" item into multiple
columns.
Fix that by changing the RHS to be just "a_expr" in the grammar, leaving
it to transformMultiAssignRef to separate the elements of a RowExpr;
which it will do only after performing standard transformation of the
RowExpr, so that "foo.*" behaves as expected.
The key reason we didn't do that before was the hard-wired handling of
DEFAULT tokens (SetToDefault nodes). This patch deals with that issue by
allowing DEFAULT in any a_expr and having parse analysis throw an error
if SetToDefault is found in an unexpected place. That's an improvement
anyway since the error can be more specific than just "syntax error".
The SQL standard suggests that the RHS could be any a_expr yielding a
suitable row value. This patch doesn't really move the goal posts in that
respect --- you're still limited to RowExpr or a sub-SELECT --- but it does
fix the grammar restriction, so it provides some tangible progress towards
a full implementation. And the limitation is now documented by an explicit
error message rather than an unhelpful "syntax error".
Discussion: <8542.1479742008@sss.pgh.pa.us>
2016-11-22 21:19:57 +01:00
ERROR: source for a multiple-column UPDATE item must be a sub-SELECT or ROW() expression
LINE 1: UPDATE update_test SET (a,b) = (v.*) FROM (VALUES(21, 101)) ...
^
2006-01-22 06:20:35 +01:00
-- if an alias for the target table is specified, don't allow references
-- to the original table name
UPDATE update_test AS t SET b = update_test.b + 10 WHERE t.a = 10;
ERROR: invalid reference to FROM-clause entry for table "update_test"
2006-03-14 23:48:25 +01:00
LINE 1: UPDATE update_test AS t SET b = update_test.b + 10 WHERE t.a...
^
2006-01-22 06:20:35 +01:00
HINT: Perhaps you meant to reference the table alias "t".
2011-04-25 15:46:53 +02:00
-- Make sure that we can update to a TOASTed value.
UPDATE update_test SET c = repeat('x', 10000) WHERE c = 'car';
SELECT a, b, char_length(c) FROM update_test;
Implement UPDATE tab SET (col1,col2,...) = (SELECT ...), ...
This SQL-standard feature allows a sub-SELECT yielding multiple columns
(but only one row) to be used to compute the new values of several columns
to be updated. While the same results can be had with an independent
sub-SELECT per column, such a workaround can require a great deal of
duplicated computation.
The standard actually says that the source for a multi-column assignment
could be any row-valued expression. The implementation used here is
tightly tied to our existing sub-SELECT support and can't handle other
cases; the Bison grammar would have some issues with them too. However,
I don't feel too bad about this since other cases can be converted into
sub-SELECTs. For instance, "SET (a,b,c) = row_valued_function(x)" could
be written "SET (a,b,c) = (SELECT * FROM row_valued_function(x))".
2014-06-18 19:22:25 +02:00
a | b | char_length
----+-----+-------------
| |
Improve handling of "UPDATE ... SET (column_list) = row_constructor".
Previously, the right-hand side of a multiple-column assignment, if it
wasn't a sub-SELECT, had to be a simple parenthesized expression list,
because gram.y was responsible for "bursting" the construct into
independent column assignments. This had the minor defect that you
couldn't write ROW (though you should be able to, since the standard says
this is a row constructor), and the rather larger defect that unlike other
uses of row constructors, we would not expand a "foo.*" item into multiple
columns.
Fix that by changing the RHS to be just "a_expr" in the grammar, leaving
it to transformMultiAssignRef to separate the elements of a RowExpr;
which it will do only after performing standard transformation of the
RowExpr, so that "foo.*" behaves as expected.
The key reason we didn't do that before was the hard-wired handling of
DEFAULT tokens (SetToDefault nodes). This patch deals with that issue by
allowing DEFAULT in any a_expr and having parse analysis throw an error
if SetToDefault is found in an unexpected place. That's an improvement
anyway since the error can be more specific than just "syntax error".
The SQL standard suggests that the RHS could be any a_expr yielding a
suitable row value. This patch doesn't really move the goal posts in that
respect --- you're still limited to RowExpr or a sub-SELECT --- but it does
fix the grammar restriction, so it provides some tangible progress towards
a full implementation. And the limitation is now documented by an explicit
error message rather than an unhelpful "syntax error".
Discussion: <8542.1479742008@sss.pgh.pa.us>
2016-11-22 21:19:57 +01:00
21 | 100 |
Implement UPDATE tab SET (col1,col2,...) = (SELECT ...), ...
This SQL-standard feature allows a sub-SELECT yielding multiple columns
(but only one row) to be used to compute the new values of several columns
to be updated. While the same results can be had with an independent
sub-SELECT per column, such a workaround can require a great deal of
duplicated computation.
The standard actually says that the source for a multi-column assignment
could be any row-valued expression. The implementation used here is
tightly tied to our existing sub-SELECT support and can't handle other
cases; the Bison grammar would have some issues with them too. However,
I don't feel too bad about this since other cases can be converted into
sub-SELECTs. For instance, "SET (a,b,c) = row_valued_function(x)" could
be written "SET (a,b,c) = (SELECT * FROM row_valued_function(x))".
2014-06-18 19:22:25 +02:00
41 | 12 | 10000
42 | 12 | 10000
(4 rows)
2011-04-25 15:46:53 +02:00
Repair bogus handling of multi-assignment Params in upper plan levels.
Our support for multiple-set-clauses in UPDATE assumes that the Params
referencing a MULTIEXPR_SUBLINK SubPlan will appear before that SubPlan
in the targetlist of the plan node that calculates the updated row.
(Yeah, it's a hack...) In some PG branches it's possible that a Result
node gets inserted between the primary calculation of the update tlist
and the ModifyTable node. setrefs.c did the wrong thing in this case
and left the upper-level Params as Params, causing a crash at runtime.
What it should do is replace them with "outer" Vars referencing the child
plan node's output. That's a result of careless ordering of operations
in fix_upper_expr_mutator, so we can fix it just by reordering the code.
Fix fix_join_expr_mutator similarly for consistency, even though join
nodes could never appear in such a context. (In general, it seems
likely to be a bit cheaper to use Vars than Params in such situations
anyway, so this patch might offer a tiny performance improvement.)
The hazard extends back to 9.5 where the MULTIEXPR_SUBLINK stuff
was introduced, so back-patch that far. However, this may be a live
bug only in 9.6.x and 10.x, as the other branches don't seem to want
to calculate the final tlist below the Result node. (That plan shape
change between branches might be a mini-bug in itself, but I'm not
really interested in digging into the reasons for that right now.
Still, add a regression test memorializing what we expect there,
so we'll notice if it changes again.)
Per bug report from Eduards Bezverhijs.
Discussion: https://postgr.es/m/b6cd572a-3e44-8785-75e9-c512a5a17a73@tieto.com
2018-12-12 19:49:41 +01:00
-- Check multi-assignment with a Result node to handle a one-time filter.
EXPLAIN (VERBOSE, COSTS OFF)
UPDATE update_test t
SET (a, b) = (SELECT b, a FROM update_test s WHERE s.a = t.a)
WHERE CURRENT_USER = SESSION_USER;
Rework planning and execution of UPDATE and DELETE.
This patch makes two closely related sets of changes:
1. For UPDATE, the subplan of the ModifyTable node now only delivers
the new values of the changed columns (i.e., the expressions computed
in the query's SET clause) plus row identity information such as CTID.
ModifyTable must re-fetch the original tuple to merge in the old
values of any unchanged columns. The core advantage of this is that
the changed columns are uniform across all tables of an inherited or
partitioned target relation, whereas the other columns might not be.
A secondary advantage, when the UPDATE involves joins, is that less
data needs to pass through the plan tree. The disadvantage of course
is an extra fetch of each tuple to be updated. However, that seems to
be very nearly free in context; even worst-case tests don't show it to
add more than a couple percent to the total query cost. At some point
it might be interesting to combine the re-fetch with the tuple access
that ModifyTable must do anyway to mark the old tuple dead; but that
would require a good deal of refactoring and it seems it wouldn't buy
all that much, so this patch doesn't attempt it.
2. For inherited UPDATE/DELETE, instead of generating a separate
subplan for each target relation, we now generate a single subplan
that is just exactly like a SELECT's plan, then stick ModifyTable
on top of that. To let ModifyTable know which target relation a
given incoming row refers to, a tableoid junk column is added to
the row identity information. This gets rid of the horrid hack
that was inheritance_planner(), eliminating O(N^2) planning cost
and memory consumption in cases where there were many unprunable
target relations.
Point 2 of course requires point 1, so that there is a uniform
definition of the non-junk columns to be returned by the subplan.
We can't insist on uniform definition of the row identity junk
columns however, if we want to keep the ability to have both
plain and foreign tables in a partitioning hierarchy. Since
it wouldn't scale very far to have every child table have its
own row identity column, this patch includes provisions to merge
similar row identity columns into one column of the subplan result.
In particular, we can merge the whole-row Vars typically used as
row identity by FDWs into one column by pretending they are type
RECORD. (It's still okay for the actual composite Datums to be
labeled with the table's rowtype OID, though.)
There is more that can be done to file down residual inefficiencies
in this patch, but it seems to be committable now.
FDW authors should note several API changes:
* The argument list for AddForeignUpdateTargets() has changed, and so
has the method it must use for adding junk columns to the query. Call
add_row_identity_var() instead of manipulating the parse tree directly.
You might want to reconsider exactly what you're adding, too.
* PlanDirectModify() must now work a little harder to find the
ForeignScan plan node; if the foreign table is part of a partitioning
hierarchy then the ForeignScan might not be the direct child of
ModifyTable. See postgres_fdw for sample code.
* To check whether a relation is a target relation, it's no
longer sufficient to compare its relid to root->parse->resultRelation.
Instead, check it against all_result_relids or leaf_result_relids,
as appropriate.
Amit Langote and Tom Lane
Discussion: https://postgr.es/m/CA+HiwqHpHdqdDn48yCEhynnniahH78rwcrv1rEX65-fsZGBOLQ@mail.gmail.com
2021-03-31 17:52:34 +02:00
QUERY PLAN
-------------------------------------------------------------
Repair bogus handling of multi-assignment Params in upper plan levels.
Our support for multiple-set-clauses in UPDATE assumes that the Params
referencing a MULTIEXPR_SUBLINK SubPlan will appear before that SubPlan
in the targetlist of the plan node that calculates the updated row.
(Yeah, it's a hack...) In some PG branches it's possible that a Result
node gets inserted between the primary calculation of the update tlist
and the ModifyTable node. setrefs.c did the wrong thing in this case
and left the upper-level Params as Params, causing a crash at runtime.
What it should do is replace them with "outer" Vars referencing the child
plan node's output. That's a result of careless ordering of operations
in fix_upper_expr_mutator, so we can fix it just by reordering the code.
Fix fix_join_expr_mutator similarly for consistency, even though join
nodes could never appear in such a context. (In general, it seems
likely to be a bit cheaper to use Vars than Params in such situations
anyway, so this patch might offer a tiny performance improvement.)
The hazard extends back to 9.5 where the MULTIEXPR_SUBLINK stuff
was introduced, so back-patch that far. However, this may be a live
bug only in 9.6.x and 10.x, as the other branches don't seem to want
to calculate the final tlist below the Result node. (That plan shape
change between branches might be a mini-bug in itself, but I'm not
really interested in digging into the reasons for that right now.
Still, add a regression test memorializing what we expect there,
so we'll notice if it changes again.)
Per bug report from Eduards Bezverhijs.
Discussion: https://postgr.es/m/b6cd572a-3e44-8785-75e9-c512a5a17a73@tieto.com
2018-12-12 19:49:41 +01:00
Update on public.update_test t
-> Result
Rework planning and execution of UPDATE and DELETE.
This patch makes two closely related sets of changes:
1. For UPDATE, the subplan of the ModifyTable node now only delivers
the new values of the changed columns (i.e., the expressions computed
in the query's SET clause) plus row identity information such as CTID.
ModifyTable must re-fetch the original tuple to merge in the old
values of any unchanged columns. The core advantage of this is that
the changed columns are uniform across all tables of an inherited or
partitioned target relation, whereas the other columns might not be.
A secondary advantage, when the UPDATE involves joins, is that less
data needs to pass through the plan tree. The disadvantage of course
is an extra fetch of each tuple to be updated. However, that seems to
be very nearly free in context; even worst-case tests don't show it to
add more than a couple percent to the total query cost. At some point
it might be interesting to combine the re-fetch with the tuple access
that ModifyTable must do anyway to mark the old tuple dead; but that
would require a good deal of refactoring and it seems it wouldn't buy
all that much, so this patch doesn't attempt it.
2. For inherited UPDATE/DELETE, instead of generating a separate
subplan for each target relation, we now generate a single subplan
that is just exactly like a SELECT's plan, then stick ModifyTable
on top of that. To let ModifyTable know which target relation a
given incoming row refers to, a tableoid junk column is added to
the row identity information. This gets rid of the horrid hack
that was inheritance_planner(), eliminating O(N^2) planning cost
and memory consumption in cases where there were many unprunable
target relations.
Point 2 of course requires point 1, so that there is a uniform
definition of the non-junk columns to be returned by the subplan.
We can't insist on uniform definition of the row identity junk
columns however, if we want to keep the ability to have both
plain and foreign tables in a partitioning hierarchy. Since
it wouldn't scale very far to have every child table have its
own row identity column, this patch includes provisions to merge
similar row identity columns into one column of the subplan result.
In particular, we can merge the whole-row Vars typically used as
row identity by FDWs into one column by pretending they are type
RECORD. (It's still okay for the actual composite Datums to be
labeled with the table's rowtype OID, though.)
There is more that can be done to file down residual inefficiencies
in this patch, but it seems to be committable now.
FDW authors should note several API changes:
* The argument list for AddForeignUpdateTargets() has changed, and so
has the method it must use for adding junk columns to the query. Call
add_row_identity_var() instead of manipulating the parse tree directly.
You might want to reconsider exactly what you're adding, too.
* PlanDirectModify() must now work a little harder to find the
ForeignScan plan node; if the foreign table is part of a partitioning
hierarchy then the ForeignScan might not be the direct child of
ModifyTable. See postgres_fdw for sample code.
* To check whether a relation is a target relation, it's no
longer sufficient to compare its relid to root->parse->resultRelation.
Instead, check it against all_result_relids or leaf_result_relids,
as appropriate.
Amit Langote and Tom Lane
Discussion: https://postgr.es/m/CA+HiwqHpHdqdDn48yCEhynnniahH78rwcrv1rEX65-fsZGBOLQ@mail.gmail.com
2021-03-31 17:52:34 +02:00
Output: $1, $2, (SubPlan 1 (returns $1,$2)), t.ctid
Repair bogus handling of multi-assignment Params in upper plan levels.
Our support for multiple-set-clauses in UPDATE assumes that the Params
referencing a MULTIEXPR_SUBLINK SubPlan will appear before that SubPlan
in the targetlist of the plan node that calculates the updated row.
(Yeah, it's a hack...) In some PG branches it's possible that a Result
node gets inserted between the primary calculation of the update tlist
and the ModifyTable node. setrefs.c did the wrong thing in this case
and left the upper-level Params as Params, causing a crash at runtime.
What it should do is replace them with "outer" Vars referencing the child
plan node's output. That's a result of careless ordering of operations
in fix_upper_expr_mutator, so we can fix it just by reordering the code.
Fix fix_join_expr_mutator similarly for consistency, even though join
nodes could never appear in such a context. (In general, it seems
likely to be a bit cheaper to use Vars than Params in such situations
anyway, so this patch might offer a tiny performance improvement.)
The hazard extends back to 9.5 where the MULTIEXPR_SUBLINK stuff
was introduced, so back-patch that far. However, this may be a live
bug only in 9.6.x and 10.x, as the other branches don't seem to want
to calculate the final tlist below the Result node. (That plan shape
change between branches might be a mini-bug in itself, but I'm not
really interested in digging into the reasons for that right now.
Still, add a regression test memorializing what we expect there,
so we'll notice if it changes again.)
Per bug report from Eduards Bezverhijs.
Discussion: https://postgr.es/m/b6cd572a-3e44-8785-75e9-c512a5a17a73@tieto.com
2018-12-12 19:49:41 +01:00
One-Time Filter: (CURRENT_USER = SESSION_USER)
-> Seq Scan on public.update_test t
Rework planning and execution of UPDATE and DELETE.
This patch makes two closely related sets of changes:
1. For UPDATE, the subplan of the ModifyTable node now only delivers
the new values of the changed columns (i.e., the expressions computed
in the query's SET clause) plus row identity information such as CTID.
ModifyTable must re-fetch the original tuple to merge in the old
values of any unchanged columns. The core advantage of this is that
the changed columns are uniform across all tables of an inherited or
partitioned target relation, whereas the other columns might not be.
A secondary advantage, when the UPDATE involves joins, is that less
data needs to pass through the plan tree. The disadvantage of course
is an extra fetch of each tuple to be updated. However, that seems to
be very nearly free in context; even worst-case tests don't show it to
add more than a couple percent to the total query cost. At some point
it might be interesting to combine the re-fetch with the tuple access
that ModifyTable must do anyway to mark the old tuple dead; but that
would require a good deal of refactoring and it seems it wouldn't buy
all that much, so this patch doesn't attempt it.
2. For inherited UPDATE/DELETE, instead of generating a separate
subplan for each target relation, we now generate a single subplan
that is just exactly like a SELECT's plan, then stick ModifyTable
on top of that. To let ModifyTable know which target relation a
given incoming row refers to, a tableoid junk column is added to
the row identity information. This gets rid of the horrid hack
that was inheritance_planner(), eliminating O(N^2) planning cost
and memory consumption in cases where there were many unprunable
target relations.
Point 2 of course requires point 1, so that there is a uniform
definition of the non-junk columns to be returned by the subplan.
We can't insist on uniform definition of the row identity junk
columns however, if we want to keep the ability to have both
plain and foreign tables in a partitioning hierarchy. Since
it wouldn't scale very far to have every child table have its
own row identity column, this patch includes provisions to merge
similar row identity columns into one column of the subplan result.
In particular, we can merge the whole-row Vars typically used as
row identity by FDWs into one column by pretending they are type
RECORD. (It's still okay for the actual composite Datums to be
labeled with the table's rowtype OID, though.)
There is more that can be done to file down residual inefficiencies
in this patch, but it seems to be committable now.
FDW authors should note several API changes:
* The argument list for AddForeignUpdateTargets() has changed, and so
has the method it must use for adding junk columns to the query. Call
add_row_identity_var() instead of manipulating the parse tree directly.
You might want to reconsider exactly what you're adding, too.
* PlanDirectModify() must now work a little harder to find the
ForeignScan plan node; if the foreign table is part of a partitioning
hierarchy then the ForeignScan might not be the direct child of
ModifyTable. See postgres_fdw for sample code.
* To check whether a relation is a target relation, it's no
longer sufficient to compare its relid to root->parse->resultRelation.
Instead, check it against all_result_relids or leaf_result_relids,
as appropriate.
Amit Langote and Tom Lane
Discussion: https://postgr.es/m/CA+HiwqHpHdqdDn48yCEhynnniahH78rwcrv1rEX65-fsZGBOLQ@mail.gmail.com
2021-03-31 17:52:34 +02:00
Output: t.a, t.ctid
Repair bogus handling of multi-assignment Params in upper plan levels.
Our support for multiple-set-clauses in UPDATE assumes that the Params
referencing a MULTIEXPR_SUBLINK SubPlan will appear before that SubPlan
in the targetlist of the plan node that calculates the updated row.
(Yeah, it's a hack...) In some PG branches it's possible that a Result
node gets inserted between the primary calculation of the update tlist
and the ModifyTable node. setrefs.c did the wrong thing in this case
and left the upper-level Params as Params, causing a crash at runtime.
What it should do is replace them with "outer" Vars referencing the child
plan node's output. That's a result of careless ordering of operations
in fix_upper_expr_mutator, so we can fix it just by reordering the code.
Fix fix_join_expr_mutator similarly for consistency, even though join
nodes could never appear in such a context. (In general, it seems
likely to be a bit cheaper to use Vars than Params in such situations
anyway, so this patch might offer a tiny performance improvement.)
The hazard extends back to 9.5 where the MULTIEXPR_SUBLINK stuff
was introduced, so back-patch that far. However, this may be a live
bug only in 9.6.x and 10.x, as the other branches don't seem to want
to calculate the final tlist below the Result node. (That plan shape
change between branches might be a mini-bug in itself, but I'm not
really interested in digging into the reasons for that right now.
Still, add a regression test memorializing what we expect there,
so we'll notice if it changes again.)
Per bug report from Eduards Bezverhijs.
Discussion: https://postgr.es/m/b6cd572a-3e44-8785-75e9-c512a5a17a73@tieto.com
2018-12-12 19:49:41 +01:00
SubPlan 1 (returns $1,$2)
-> Seq Scan on public.update_test s
Output: s.b, s.a
Filter: (s.a = t.a)
(10 rows)
UPDATE update_test t
SET (a, b) = (SELECT b, a FROM update_test s WHERE s.a = t.a)
WHERE CURRENT_USER = SESSION_USER;
SELECT a, b, char_length(c) FROM update_test;
a | b | char_length
-----+----+-------------
| |
100 | 21 |
12 | 41 | 10000
12 | 42 | 10000
(4 rows)
Add support for INSERT ... ON CONFLICT DO NOTHING/UPDATE.
The newly added ON CONFLICT clause allows to specify an alternative to
raising a unique or exclusion constraint violation error when inserting.
ON CONFLICT refers to constraints that can either be specified using a
inference clause (by specifying the columns of a unique constraint) or
by naming a unique or exclusion constraint. DO NOTHING avoids the
constraint violation, without touching the pre-existing row. DO UPDATE
SET ... [WHERE ...] updates the pre-existing tuple, and has access to
both the tuple proposed for insertion and the existing tuple; the
optional WHERE clause can be used to prevent an update from being
executed. The UPDATE SET and WHERE clauses have access to the tuple
proposed for insertion using the "magic" EXCLUDED alias, and to the
pre-existing tuple using the table name or its alias.
This feature is often referred to as upsert.
This is implemented using a new infrastructure called "speculative
insertion". It is an optimistic variant of regular insertion that first
does a pre-check for existing tuples and then attempts an insert. If a
violating tuple was inserted concurrently, the speculatively inserted
tuple is deleted and a new attempt is made. If the pre-check finds a
matching tuple the alternative DO NOTHING or DO UPDATE action is taken.
If the insertion succeeds without detecting a conflict, the tuple is
deemed inserted.
To handle the possible ambiguity between the excluded alias and a table
named excluded, and for convenience with long relation names, INSERT
INTO now can alias its target table.
Bumps catversion as stored rules change.
Author: Peter Geoghegan, with significant contributions from Heikki
Linnakangas and Andres Freund. Testing infrastructure by Jeff Janes.
Reviewed-By: Heikki Linnakangas, Andres Freund, Robert Haas, Simon Riggs,
Dean Rasheed, Stephen Frost and many others.
2015-05-08 05:31:36 +02:00
-- Test ON CONFLICT DO UPDATE
INSERT INTO upsert_test VALUES(1, 'Boo');
-- uncorrelated sub-select:
WITH aaa AS (SELECT 1 AS a, 'Foo' AS b) INSERT INTO upsert_test
VALUES (1, 'Bar') ON CONFLICT(a)
DO UPDATE SET (b, a) = (SELECT b, a FROM aaa) RETURNING *;
a | b
---+-----
1 | Foo
(1 row)
-- correlated sub-select:
INSERT INTO upsert_test VALUES (1, 'Baz') ON CONFLICT(a)
DO UPDATE SET (b, a) = (SELECT b || ', Correlated', a from upsert_test i WHERE i.a = upsert_test.a)
RETURNING *;
a | b
---+-----------------
1 | Foo, Correlated
(1 row)
-- correlated sub-select (EXCLUDED.* alias):
INSERT INTO upsert_test VALUES (1, 'Bat') ON CONFLICT(a)
DO UPDATE SET (b, a) = (SELECT b || ', Excluded', a from upsert_test i WHERE i.a = excluded.a)
RETURNING *;
a | b
---+---------------------------
1 | Foo, Correlated, Excluded
(1 row)
2019-07-25 03:45:58 +02:00
-- ON CONFLICT using system attributes in RETURNING, testing both the
-- inserting and updating paths. See bug report at:
-- https://www.postgresql.org/message-id/73436355-6432-49B1-92ED-1FE4F7E7E100%40finefun.com.au
INSERT INTO upsert_test VALUES (2, 'Beeble') ON CONFLICT(a)
DO UPDATE SET (b, a) = (SELECT b || ', Excluded', a from upsert_test i WHERE i.a = excluded.a)
2020-04-07 01:33:56 +02:00
RETURNING tableoid::regclass, xmin = pg_current_xact_id()::xid AS xmin_correct, xmax = 0 AS xmax_correct;
2019-07-25 03:45:58 +02:00
tableoid | xmin_correct | xmax_correct
-------------+--------------+--------------
upsert_test | t | t
(1 row)
-- currently xmax is set after a conflict - that's probably not good,
-- but it seems worthwhile to have to be explicit if that changes.
INSERT INTO upsert_test VALUES (2, 'Brox') ON CONFLICT(a)
DO UPDATE SET (b, a) = (SELECT b || ', Excluded', a from upsert_test i WHERE i.a = excluded.a)
2020-04-07 01:33:56 +02:00
RETURNING tableoid::regclass, xmin = pg_current_xact_id()::xid AS xmin_correct, xmax = pg_current_xact_id()::xid AS xmax_correct;
2019-07-25 03:45:58 +02:00
tableoid | xmin_correct | xmax_correct
-------------+--------------+--------------
upsert_test | t | t
(1 row)
2003-08-26 20:32:23 +02:00
DROP TABLE update_test;
Add support for INSERT ... ON CONFLICT DO NOTHING/UPDATE.
The newly added ON CONFLICT clause allows to specify an alternative to
raising a unique or exclusion constraint violation error when inserting.
ON CONFLICT refers to constraints that can either be specified using a
inference clause (by specifying the columns of a unique constraint) or
by naming a unique or exclusion constraint. DO NOTHING avoids the
constraint violation, without touching the pre-existing row. DO UPDATE
SET ... [WHERE ...] updates the pre-existing tuple, and has access to
both the tuple proposed for insertion and the existing tuple; the
optional WHERE clause can be used to prevent an update from being
executed. The UPDATE SET and WHERE clauses have access to the tuple
proposed for insertion using the "magic" EXCLUDED alias, and to the
pre-existing tuple using the table name or its alias.
This feature is often referred to as upsert.
This is implemented using a new infrastructure called "speculative
insertion". It is an optimistic variant of regular insertion that first
does a pre-check for existing tuples and then attempts an insert. If a
violating tuple was inserted concurrently, the speculatively inserted
tuple is deleted and a new attempt is made. If the pre-check finds a
matching tuple the alternative DO NOTHING or DO UPDATE action is taken.
If the insertion succeeds without detecting a conflict, the tuple is
deemed inserted.
To handle the possible ambiguity between the excluded alias and a table
named excluded, and for convenience with long relation names, INSERT
INTO now can alias its target table.
Bumps catversion as stored rules change.
Author: Peter Geoghegan, with significant contributions from Heikki
Linnakangas and Andres Freund. Testing infrastructure by Jeff Janes.
Reviewed-By: Heikki Linnakangas, Andres Freund, Robert Haas, Simon Riggs,
Dean Rasheed, Stephen Frost and many others.
2015-05-08 05:31:36 +02:00
DROP TABLE upsert_test;
Allow UPDATE to move rows between partitions.
When an UPDATE causes a row to no longer match the partition
constraint, try to move it to a different partition where it does
match the partition constraint. In essence, the UPDATE is split into
a DELETE from the old partition and an INSERT into the new one. This
can lead to surprising behavior in concurrency scenarios because
EvalPlanQual rechecks won't work as they normally did; the known
problems are documented. (There is a pending patch to improve the
situation further, but it needs more review.)
Amit Khandekar, reviewed and tested by Amit Langote, David Rowley,
Rajkumar Raghuwanshi, Dilip Kumar, Amul Sul, Thomas Munro, Álvaro
Herrera, Amit Kapila, and me. A few final revisions by me.
Discussion: http://postgr.es/m/CAJ3gD9do9o2ccQ7j7+tSgiE1REY65XRiMb=yJO3u3QhyP8EEPQ@mail.gmail.com
2018-01-19 21:33:06 +01:00
---------------------------
-- UPDATE with row movement
---------------------------
-- When a partitioned table receives an UPDATE to the partitioned key and the
-- new values no longer meet the partition's bound, the row must be moved to
-- the correct partition for the new partition key (if one exists). We must
-- also ensure that updatable views on partitioned tables properly enforce any
-- WITH CHECK OPTION that is defined. The situation with triggers in this case
-- also requires thorough testing as partition key updates causing row
-- movement convert UPDATEs into DELETE+INSERT.
CREATE TABLE range_parted (
Implement table partitioning.
Table partitioning is like table inheritance and reuses much of the
existing infrastructure, but there are some important differences.
The parent is called a partitioned table and is always empty; it may
not have indexes or non-inherited constraints, since those make no
sense for a relation with no data of its own. The children are called
partitions and contain all of the actual data. Each partition has an
implicit partitioning constraint. Multiple inheritance is not
allowed, and partitioning and inheritance can't be mixed. Partitions
can't have extra columns and may not allow nulls unless the parent
does. Tuples inserted into the parent are automatically routed to the
correct partition, so tuple-routing ON INSERT triggers are not needed.
Tuple routing isn't yet supported for partitions which are foreign
tables, and it doesn't handle updates that cross partition boundaries.
Currently, tables can be range-partitioned or list-partitioned. List
partitioning is limited to a single column, but range partitioning can
involve multiple columns. A partitioning "column" can be an
expression.
Because table partitioning is less general than table inheritance, it
is hoped that it will be easier to reason about properties of
partitions, and therefore that this will serve as a better foundation
for a variety of possible optimizations, including query planner
optimizations. The tuple routing based which this patch does based on
the implicit partitioning constraints is an example of this, but it
seems likely that many other useful optimizations are also possible.
Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat,
Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova,
Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
a text,
Allow UPDATE to move rows between partitions.
When an UPDATE causes a row to no longer match the partition
constraint, try to move it to a different partition where it does
match the partition constraint. In essence, the UPDATE is split into
a DELETE from the old partition and an INSERT into the new one. This
can lead to surprising behavior in concurrency scenarios because
EvalPlanQual rechecks won't work as they normally did; the known
problems are documented. (There is a pending patch to improve the
situation further, but it needs more review.)
Amit Khandekar, reviewed and tested by Amit Langote, David Rowley,
Rajkumar Raghuwanshi, Dilip Kumar, Amul Sul, Thomas Munro, Álvaro
Herrera, Amit Kapila, and me. A few final revisions by me.
Discussion: http://postgr.es/m/CAJ3gD9do9o2ccQ7j7+tSgiE1REY65XRiMb=yJO3u3QhyP8EEPQ@mail.gmail.com
2018-01-19 21:33:06 +01:00
b bigint,
c numeric,
d int,
e varchar
) PARTITION BY RANGE (a, b);
-- Create partitions intentionally in descending bound order, so as to test
-- that update-row-movement works with the leaf partitions not in bound order.
CREATE TABLE part_b_20_b_30 (e varchar, c numeric, a text, b bigint, d int);
ALTER TABLE range_parted ATTACH PARTITION part_b_20_b_30 FOR VALUES FROM ('b', 20) TO ('b', 30);
CREATE TABLE part_b_10_b_20 (e varchar, c numeric, a text, b bigint, d int) PARTITION BY RANGE (c);
CREATE TABLE part_b_1_b_10 PARTITION OF range_parted FOR VALUES FROM ('b', 1) TO ('b', 10);
ALTER TABLE range_parted ATTACH PARTITION part_b_10_b_20 FOR VALUES FROM ('b', 10) TO ('b', 20);
CREATE TABLE part_a_10_a_20 PARTITION OF range_parted FOR VALUES FROM ('a', 10) TO ('a', 20);
CREATE TABLE part_a_1_a_10 PARTITION OF range_parted FOR VALUES FROM ('a', 1) TO ('a', 10);
-- Check that partition-key UPDATE works sanely on a partitioned table that
-- does not have any child partitions.
UPDATE part_b_10_b_20 set b = b - 6;
-- Create some more partitions following the above pattern of descending bound
-- order, but let's make the situation a bit more complex by having the
-- attribute numbers of the columns vary from their parent partition.
CREATE TABLE part_c_100_200 (e varchar, c numeric, a text, b bigint, d int) PARTITION BY range (abs(d));
ALTER TABLE part_c_100_200 DROP COLUMN e, DROP COLUMN c, DROP COLUMN a;
ALTER TABLE part_c_100_200 ADD COLUMN c numeric, ADD COLUMN e varchar, ADD COLUMN a text;
ALTER TABLE part_c_100_200 DROP COLUMN b;
ALTER TABLE part_c_100_200 ADD COLUMN b bigint;
CREATE TABLE part_d_1_15 PARTITION OF part_c_100_200 FOR VALUES FROM (1) TO (15);
CREATE TABLE part_d_15_20 PARTITION OF part_c_100_200 FOR VALUES FROM (15) TO (20);
ALTER TABLE part_b_10_b_20 ATTACH PARTITION part_c_100_200 FOR VALUES FROM (100) TO (200);
CREATE TABLE part_c_1_100 (e varchar, d int, c numeric, b bigint, a text);
ALTER TABLE part_b_10_b_20 ATTACH PARTITION part_c_1_100 FOR VALUES FROM (1) TO (100);
\set init_range_parted 'truncate range_parted; insert into range_parted VALUES (''a'', 1, 1, 1), (''a'', 10, 200, 1), (''b'', 12, 96, 1), (''b'', 13, 97, 2), (''b'', 15, 105, 16), (''b'', 17, 105, 19)'
\set show_data 'select tableoid::regclass::text COLLATE "C" partname, * from range_parted ORDER BY 1, 2, 3, 4, 5, 6'
:init_range_parted;
:show_data;
partname | a | b | c | d | e
----------------+---+----+-----+----+---
part_a_10_a_20 | a | 10 | 200 | 1 |
part_a_1_a_10 | a | 1 | 1 | 1 |
part_c_1_100 | b | 12 | 96 | 1 |
part_c_1_100 | b | 13 | 97 | 2 |
part_d_15_20 | b | 15 | 105 | 16 |
part_d_15_20 | b | 17 | 105 | 19 |
(6 rows)
-- The order of subplans should be in bound order
EXPLAIN (costs off) UPDATE range_parted set c = c - 50 WHERE c > 97;
Rework planning and execution of UPDATE and DELETE.
This patch makes two closely related sets of changes:
1. For UPDATE, the subplan of the ModifyTable node now only delivers
the new values of the changed columns (i.e., the expressions computed
in the query's SET clause) plus row identity information such as CTID.
ModifyTable must re-fetch the original tuple to merge in the old
values of any unchanged columns. The core advantage of this is that
the changed columns are uniform across all tables of an inherited or
partitioned target relation, whereas the other columns might not be.
A secondary advantage, when the UPDATE involves joins, is that less
data needs to pass through the plan tree. The disadvantage of course
is an extra fetch of each tuple to be updated. However, that seems to
be very nearly free in context; even worst-case tests don't show it to
add more than a couple percent to the total query cost. At some point
it might be interesting to combine the re-fetch with the tuple access
that ModifyTable must do anyway to mark the old tuple dead; but that
would require a good deal of refactoring and it seems it wouldn't buy
all that much, so this patch doesn't attempt it.
2. For inherited UPDATE/DELETE, instead of generating a separate
subplan for each target relation, we now generate a single subplan
that is just exactly like a SELECT's plan, then stick ModifyTable
on top of that. To let ModifyTable know which target relation a
given incoming row refers to, a tableoid junk column is added to
the row identity information. This gets rid of the horrid hack
that was inheritance_planner(), eliminating O(N^2) planning cost
and memory consumption in cases where there were many unprunable
target relations.
Point 2 of course requires point 1, so that there is a uniform
definition of the non-junk columns to be returned by the subplan.
We can't insist on uniform definition of the row identity junk
columns however, if we want to keep the ability to have both
plain and foreign tables in a partitioning hierarchy. Since
it wouldn't scale very far to have every child table have its
own row identity column, this patch includes provisions to merge
similar row identity columns into one column of the subplan result.
In particular, we can merge the whole-row Vars typically used as
row identity by FDWs into one column by pretending they are type
RECORD. (It's still okay for the actual composite Datums to be
labeled with the table's rowtype OID, though.)
There is more that can be done to file down residual inefficiencies
in this patch, but it seems to be committable now.
FDW authors should note several API changes:
* The argument list for AddForeignUpdateTargets() has changed, and so
has the method it must use for adding junk columns to the query. Call
add_row_identity_var() instead of manipulating the parse tree directly.
You might want to reconsider exactly what you're adding, too.
* PlanDirectModify() must now work a little harder to find the
ForeignScan plan node; if the foreign table is part of a partitioning
hierarchy then the ForeignScan might not be the direct child of
ModifyTable. See postgres_fdw for sample code.
* To check whether a relation is a target relation, it's no
longer sufficient to compare its relid to root->parse->resultRelation.
Instead, check it against all_result_relids or leaf_result_relids,
as appropriate.
Amit Langote and Tom Lane
Discussion: https://postgr.es/m/CA+HiwqHpHdqdDn48yCEhynnniahH78rwcrv1rEX65-fsZGBOLQ@mail.gmail.com
2021-03-31 17:52:34 +02:00
QUERY PLAN
-------------------------------------------------------
Allow UPDATE to move rows between partitions.
When an UPDATE causes a row to no longer match the partition
constraint, try to move it to a different partition where it does
match the partition constraint. In essence, the UPDATE is split into
a DELETE from the old partition and an INSERT into the new one. This
can lead to surprising behavior in concurrency scenarios because
EvalPlanQual rechecks won't work as they normally did; the known
problems are documented. (There is a pending patch to improve the
situation further, but it needs more review.)
Amit Khandekar, reviewed and tested by Amit Langote, David Rowley,
Rajkumar Raghuwanshi, Dilip Kumar, Amul Sul, Thomas Munro, Álvaro
Herrera, Amit Kapila, and me. A few final revisions by me.
Discussion: http://postgr.es/m/CAJ3gD9do9o2ccQ7j7+tSgiE1REY65XRiMb=yJO3u3QhyP8EEPQ@mail.gmail.com
2018-01-19 21:33:06 +01:00
Update on range_parted
Fix EXPLAIN's column alias output for mismatched child tables.
If an inheritance/partitioning parent table is assigned some column
alias names in the query, EXPLAIN mapped those aliases onto the
child tables' columns by physical position, resulting in bogus output
if a child table's columns aren't one-for-one with the parent's.
To fix, make expand_single_inheritance_child() generate a correctly
re-mapped column alias list, rather than just copying the parent
RTE's alias node. (We have to fill the alias field, not just
adjust the eref field, because ruleutils.c will ignore eref in
favor of looking at the real column names.)
This means that child tables will now always have alias fields in
plan rtables, where before they might not have. That results in
a rather substantial set of regression test output changes:
EXPLAIN will now always show child tables with aliases that match
the parent table (usually with "_N" appended for uniqueness).
But that seems like a net positive for understandability, since
the parent alias corresponds to something that actually appeared
in the original query, while the child table names didn't.
(Note that this does not change anything for cases where an explicit
table alias was written in the query for the parent table; it
just makes cases without such aliases behave similarly to that.)
Hence, while we could avoid these subsidiary changes if we made
inherit.c more complicated, we choose not to.
Discussion: https://postgr.es/m/12424.1575168015@sss.pgh.pa.us
2019-12-03 01:08:10 +01:00
Update on part_a_1_a_10 range_parted_1
Update on part_a_10_a_20 range_parted_2
Update on part_b_1_b_10 range_parted_3
Update on part_c_1_100 range_parted_4
Update on part_d_1_15 range_parted_5
Update on part_d_15_20 range_parted_6
Update on part_b_20_b_30 range_parted_7
Rework planning and execution of UPDATE and DELETE.
This patch makes two closely related sets of changes:
1. For UPDATE, the subplan of the ModifyTable node now only delivers
the new values of the changed columns (i.e., the expressions computed
in the query's SET clause) plus row identity information such as CTID.
ModifyTable must re-fetch the original tuple to merge in the old
values of any unchanged columns. The core advantage of this is that
the changed columns are uniform across all tables of an inherited or
partitioned target relation, whereas the other columns might not be.
A secondary advantage, when the UPDATE involves joins, is that less
data needs to pass through the plan tree. The disadvantage of course
is an extra fetch of each tuple to be updated. However, that seems to
be very nearly free in context; even worst-case tests don't show it to
add more than a couple percent to the total query cost. At some point
it might be interesting to combine the re-fetch with the tuple access
that ModifyTable must do anyway to mark the old tuple dead; but that
would require a good deal of refactoring and it seems it wouldn't buy
all that much, so this patch doesn't attempt it.
2. For inherited UPDATE/DELETE, instead of generating a separate
subplan for each target relation, we now generate a single subplan
that is just exactly like a SELECT's plan, then stick ModifyTable
on top of that. To let ModifyTable know which target relation a
given incoming row refers to, a tableoid junk column is added to
the row identity information. This gets rid of the horrid hack
that was inheritance_planner(), eliminating O(N^2) planning cost
and memory consumption in cases where there were many unprunable
target relations.
Point 2 of course requires point 1, so that there is a uniform
definition of the non-junk columns to be returned by the subplan.
We can't insist on uniform definition of the row identity junk
columns however, if we want to keep the ability to have both
plain and foreign tables in a partitioning hierarchy. Since
it wouldn't scale very far to have every child table have its
own row identity column, this patch includes provisions to merge
similar row identity columns into one column of the subplan result.
In particular, we can merge the whole-row Vars typically used as
row identity by FDWs into one column by pretending they are type
RECORD. (It's still okay for the actual composite Datums to be
labeled with the table's rowtype OID, though.)
There is more that can be done to file down residual inefficiencies
in this patch, but it seems to be committable now.
FDW authors should note several API changes:
* The argument list for AddForeignUpdateTargets() has changed, and so
has the method it must use for adding junk columns to the query. Call
add_row_identity_var() instead of manipulating the parse tree directly.
You might want to reconsider exactly what you're adding, too.
* PlanDirectModify() must now work a little harder to find the
ForeignScan plan node; if the foreign table is part of a partitioning
hierarchy then the ForeignScan might not be the direct child of
ModifyTable. See postgres_fdw for sample code.
* To check whether a relation is a target relation, it's no
longer sufficient to compare its relid to root->parse->resultRelation.
Instead, check it against all_result_relids or leaf_result_relids,
as appropriate.
Amit Langote and Tom Lane
Discussion: https://postgr.es/m/CA+HiwqHpHdqdDn48yCEhynnniahH78rwcrv1rEX65-fsZGBOLQ@mail.gmail.com
2021-03-31 17:52:34 +02:00
-> Append
-> Seq Scan on part_a_1_a_10 range_parted_1
Filter: (c > '97'::numeric)
-> Seq Scan on part_a_10_a_20 range_parted_2
Filter: (c > '97'::numeric)
-> Seq Scan on part_b_1_b_10 range_parted_3
Filter: (c > '97'::numeric)
-> Seq Scan on part_c_1_100 range_parted_4
Filter: (c > '97'::numeric)
-> Seq Scan on part_d_1_15 range_parted_5
Filter: (c > '97'::numeric)
-> Seq Scan on part_d_15_20 range_parted_6
Filter: (c > '97'::numeric)
-> Seq Scan on part_b_20_b_30 range_parted_7
Filter: (c > '97'::numeric)
(23 rows)
Allow UPDATE to move rows between partitions.
When an UPDATE causes a row to no longer match the partition
constraint, try to move it to a different partition where it does
match the partition constraint. In essence, the UPDATE is split into
a DELETE from the old partition and an INSERT into the new one. This
can lead to surprising behavior in concurrency scenarios because
EvalPlanQual rechecks won't work as they normally did; the known
problems are documented. (There is a pending patch to improve the
situation further, but it needs more review.)
Amit Khandekar, reviewed and tested by Amit Langote, David Rowley,
Rajkumar Raghuwanshi, Dilip Kumar, Amul Sul, Thomas Munro, Álvaro
Herrera, Amit Kapila, and me. A few final revisions by me.
Discussion: http://postgr.es/m/CAJ3gD9do9o2ccQ7j7+tSgiE1REY65XRiMb=yJO3u3QhyP8EEPQ@mail.gmail.com
2018-01-19 21:33:06 +01:00
-- fail, row movement happens only within the partition subtree.
UPDATE part_c_100_200 set c = c - 20, d = c WHERE c = 105;
ERROR: new row for relation "part_c_100_200" violates partition constraint
DETAIL: Failing row contains (105, 85, null, b, 15).
-- fail, no partition key update, so no attempt to move tuple,
-- but "a = 'a'" violates partition constraint enforced by root partition)
UPDATE part_b_10_b_20 set a = 'a';
ERROR: new row for relation "part_c_1_100" violates partition constraint
DETAIL: Failing row contains (null, 1, 96, 12, a).
-- ok, partition key update, no constraint violation
UPDATE range_parted set d = d - 10 WHERE d > 10;
-- ok, no partition key update, no constraint violation
UPDATE range_parted set e = d;
-- No row found
UPDATE part_c_1_100 set c = c + 20 WHERE c = 98;
-- ok, row movement
UPDATE part_b_10_b_20 set c = c + 20 returning c, b, a;
c | b | a
-----+----+---
116 | 12 | b
117 | 13 | b
125 | 15 | b
125 | 17 | b
(4 rows)
:show_data;
partname | a | b | c | d | e
----------------+---+----+-----+---+---
part_a_10_a_20 | a | 10 | 200 | 1 | 1
part_a_1_a_10 | a | 1 | 1 | 1 | 1
part_d_1_15 | b | 12 | 116 | 1 | 1
part_d_1_15 | b | 13 | 117 | 2 | 2
part_d_1_15 | b | 15 | 125 | 6 | 6
part_d_1_15 | b | 17 | 125 | 9 | 9
(6 rows)
-- fail, row movement happens only within the partition subtree.
UPDATE part_b_10_b_20 set b = b - 6 WHERE c > 116 returning *;
ERROR: new row for relation "part_d_1_15" violates partition constraint
DETAIL: Failing row contains (2, 117, 2, b, 7).
-- ok, row movement, with subset of rows moved into different partition.
UPDATE range_parted set b = b - 6 WHERE c > 116 returning a, b + c;
a | ?column?
---+----------
a | 204
b | 124
b | 134
b | 136
(4 rows)
:show_data;
partname | a | b | c | d | e
---------------+---+----+-----+---+---
part_a_1_a_10 | a | 1 | 1 | 1 | 1
part_a_1_a_10 | a | 4 | 200 | 1 | 1
part_b_1_b_10 | b | 7 | 117 | 2 | 2
part_b_1_b_10 | b | 9 | 125 | 6 | 6
part_d_1_15 | b | 11 | 125 | 9 | 9
part_d_1_15 | b | 12 | 116 | 1 | 1
(6 rows)
-- Common table needed for multiple test scenarios.
CREATE TABLE mintab(c1 int);
INSERT into mintab VALUES (120);
-- update partition key using updatable view.
CREATE VIEW upview AS SELECT * FROM range_parted WHERE (select c > c1 FROM mintab) WITH CHECK OPTION;
-- ok
UPDATE upview set c = 199 WHERE b = 4;
-- fail, check option violation
UPDATE upview set c = 120 WHERE b = 4;
ERROR: new row violates check option for view "upview"
DETAIL: Failing row contains (a, 4, 120, 1, 1).
-- fail, row movement with check option violation
UPDATE upview set a = 'b', b = 15, c = 120 WHERE b = 4;
ERROR: new row violates check option for view "upview"
DETAIL: Failing row contains (b, 15, 120, 1, 1).
-- ok, row movement, check option passes
UPDATE upview set a = 'b', b = 15 WHERE b = 4;
:show_data;
partname | a | b | c | d | e
---------------+---+----+-----+---+---
part_a_1_a_10 | a | 1 | 1 | 1 | 1
part_b_1_b_10 | b | 7 | 117 | 2 | 2
part_b_1_b_10 | b | 9 | 125 | 6 | 6
part_d_1_15 | b | 11 | 125 | 9 | 9
part_d_1_15 | b | 12 | 116 | 1 | 1
part_d_1_15 | b | 15 | 199 | 1 | 1
(6 rows)
-- cleanup
DROP VIEW upview;
-- RETURNING having whole-row vars.
:init_range_parted;
UPDATE range_parted set c = 95 WHERE a = 'b' and b > 10 and c > 100 returning (range_parted), *;
range_parted | a | b | c | d | e
---------------+---+----+----+----+---
(b,15,95,16,) | b | 15 | 95 | 16 |
(b,17,95,19,) | b | 17 | 95 | 19 |
(2 rows)
:show_data;
partname | a | b | c | d | e
----------------+---+----+-----+----+---
part_a_10_a_20 | a | 10 | 200 | 1 |
part_a_1_a_10 | a | 1 | 1 | 1 |
part_c_1_100 | b | 12 | 96 | 1 |
part_c_1_100 | b | 13 | 97 | 2 |
part_c_1_100 | b | 15 | 95 | 16 |
part_c_1_100 | b | 17 | 95 | 19 |
(6 rows)
-- Transition tables with update row movement
:init_range_parted;
CREATE FUNCTION trans_updatetrigfunc() RETURNS trigger LANGUAGE plpgsql AS
$$
begin
raise notice 'trigger = %, old table = %, new table = %',
TG_NAME,
(select string_agg(old_table::text, ', ' ORDER BY a) FROM old_table),
(select string_agg(new_table::text, ', ' ORDER BY a) FROM new_table);
return null;
end;
$$;
CREATE TRIGGER trans_updatetrig
AFTER UPDATE ON range_parted REFERENCING OLD TABLE AS old_table NEW TABLE AS new_table
FOR EACH STATEMENT EXECUTE PROCEDURE trans_updatetrigfunc();
UPDATE range_parted set c = (case when c = 96 then 110 else c + 1 end ) WHERE a = 'b' and b > 10 and c >= 96;
NOTICE: trigger = trans_updatetrig, old table = (b,12,96,1,), (b,13,97,2,), (b,15,105,16,), (b,17,105,19,), new table = (b,12,110,1,), (b,13,98,2,), (b,15,106,16,), (b,17,106,19,)
:show_data;
partname | a | b | c | d | e
----------------+---+----+-----+----+---
part_a_10_a_20 | a | 10 | 200 | 1 |
part_a_1_a_10 | a | 1 | 1 | 1 |
part_c_1_100 | b | 13 | 98 | 2 |
part_d_15_20 | b | 15 | 106 | 16 |
part_d_15_20 | b | 17 | 106 | 19 |
part_d_1_15 | b | 12 | 110 | 1 |
(6 rows)
:init_range_parted;
-- Enabling OLD TABLE capture for both DELETE as well as UPDATE stmt triggers
-- should not cause DELETEd rows to be captured twice. Similar thing for
-- INSERT triggers and inserted rows.
CREATE TRIGGER trans_deletetrig
AFTER DELETE ON range_parted REFERENCING OLD TABLE AS old_table
FOR EACH STATEMENT EXECUTE PROCEDURE trans_updatetrigfunc();
CREATE TRIGGER trans_inserttrig
AFTER INSERT ON range_parted REFERENCING NEW TABLE AS new_table
FOR EACH STATEMENT EXECUTE PROCEDURE trans_updatetrigfunc();
UPDATE range_parted set c = c + 50 WHERE a = 'b' and b > 10 and c >= 96;
NOTICE: trigger = trans_updatetrig, old table = (b,12,96,1,), (b,13,97,2,), (b,15,105,16,), (b,17,105,19,), new table = (b,12,146,1,), (b,13,147,2,), (b,15,155,16,), (b,17,155,19,)
:show_data;
partname | a | b | c | d | e
----------------+---+----+-----+----+---
part_a_10_a_20 | a | 10 | 200 | 1 |
part_a_1_a_10 | a | 1 | 1 | 1 |
part_d_15_20 | b | 15 | 155 | 16 |
part_d_15_20 | b | 17 | 155 | 19 |
part_d_1_15 | b | 12 | 146 | 1 |
part_d_1_15 | b | 13 | 147 | 2 |
(6 rows)
DROP TRIGGER trans_deletetrig ON range_parted;
DROP TRIGGER trans_inserttrig ON range_parted;
-- Don't drop trans_updatetrig yet. It is required below.
-- Test with transition tuple conversion happening for rows moved into the
-- new partition. This requires a trigger that references transition table
-- (we already have trans_updatetrig). For inserted rows, the conversion
-- is not usually needed, because the original tuple is already compatible with
-- the desired transition tuple format. But conversion happens when there is a
-- BR trigger because the trigger can change the inserted row. So install a
-- BR triggers on those child partitions where the rows will be moved.
CREATE FUNCTION func_parted_mod_b() RETURNS trigger AS $$
BEGIN
NEW.b = NEW.b + 1;
return NEW;
END $$ language plpgsql;
CREATE TRIGGER trig_c1_100 BEFORE UPDATE OR INSERT ON part_c_1_100
FOR EACH ROW EXECUTE PROCEDURE func_parted_mod_b();
CREATE TRIGGER trig_d1_15 BEFORE UPDATE OR INSERT ON part_d_1_15
FOR EACH ROW EXECUTE PROCEDURE func_parted_mod_b();
CREATE TRIGGER trig_d15_20 BEFORE UPDATE OR INSERT ON part_d_15_20
FOR EACH ROW EXECUTE PROCEDURE func_parted_mod_b();
:init_range_parted;
UPDATE range_parted set c = (case when c = 96 then 110 else c + 1 end) WHERE a = 'b' and b > 10 and c >= 96;
NOTICE: trigger = trans_updatetrig, old table = (b,13,96,1,), (b,14,97,2,), (b,16,105,16,), (b,18,105,19,), new table = (b,15,110,1,), (b,15,98,2,), (b,17,106,16,), (b,19,106,19,)
:show_data;
partname | a | b | c | d | e
----------------+---+----+-----+----+---
part_a_10_a_20 | a | 10 | 200 | 1 |
part_a_1_a_10 | a | 1 | 1 | 1 |
part_c_1_100 | b | 15 | 98 | 2 |
part_d_15_20 | b | 17 | 106 | 16 |
part_d_15_20 | b | 19 | 106 | 19 |
part_d_1_15 | b | 15 | 110 | 1 |
(6 rows)
:init_range_parted;
UPDATE range_parted set c = c + 50 WHERE a = 'b' and b > 10 and c >= 96;
NOTICE: trigger = trans_updatetrig, old table = (b,13,96,1,), (b,14,97,2,), (b,16,105,16,), (b,18,105,19,), new table = (b,15,146,1,), (b,16,147,2,), (b,17,155,16,), (b,19,155,19,)
:show_data;
partname | a | b | c | d | e
----------------+---+----+-----+----+---
part_a_10_a_20 | a | 10 | 200 | 1 |
part_a_1_a_10 | a | 1 | 1 | 1 |
part_d_15_20 | b | 17 | 155 | 16 |
part_d_15_20 | b | 19 | 155 | 19 |
part_d_1_15 | b | 15 | 146 | 1 |
part_d_1_15 | b | 16 | 147 | 2 |
(6 rows)
-- Case where per-partition tuple conversion map array is allocated, but the
-- map is not required for the particular tuple that is routed, thanks to
-- matching table attributes of the partition and the target table.
:init_range_parted;
UPDATE range_parted set b = 15 WHERE b = 1;
NOTICE: trigger = trans_updatetrig, old table = (a,1,1,1,), new table = (a,15,1,1,)
:show_data;
partname | a | b | c | d | e
----------------+---+----+-----+----+---
part_a_10_a_20 | a | 10 | 200 | 1 |
part_a_10_a_20 | a | 15 | 1 | 1 |
part_c_1_100 | b | 13 | 96 | 1 |
part_c_1_100 | b | 14 | 97 | 2 |
part_d_15_20 | b | 16 | 105 | 16 |
part_d_15_20 | b | 18 | 105 | 19 |
(6 rows)
DROP TRIGGER trans_updatetrig ON range_parted;
DROP TRIGGER trig_c1_100 ON part_c_1_100;
DROP TRIGGER trig_d1_15 ON part_d_1_15;
DROP TRIGGER trig_d15_20 ON part_d_15_20;
DROP FUNCTION func_parted_mod_b();
-- RLS policies with update-row-movement
-----------------------------------------
ALTER TABLE range_parted ENABLE ROW LEVEL SECURITY;
CREATE USER regress_range_parted_user;
GRANT ALL ON range_parted, mintab TO regress_range_parted_user;
CREATE POLICY seeall ON range_parted AS PERMISSIVE FOR SELECT USING (true);
CREATE POLICY policy_range_parted ON range_parted for UPDATE USING (true) WITH CHECK (c % 2 = 0);
:init_range_parted;
SET SESSION AUTHORIZATION regress_range_parted_user;
-- This should fail with RLS violation error while moving row from
-- part_a_10_a_20 to part_d_1_15, because we are setting 'c' to an odd number.
UPDATE range_parted set a = 'b', c = 151 WHERE a = 'a' and c = 200;
ERROR: new row violates row-level security policy for table "range_parted"
RESET SESSION AUTHORIZATION;
-- Create a trigger on part_d_1_15
CREATE FUNCTION func_d_1_15() RETURNS trigger AS $$
BEGIN
NEW.c = NEW.c + 1; -- Make even numbers odd, or vice versa
return NEW;
END $$ LANGUAGE plpgsql;
CREATE TRIGGER trig_d_1_15 BEFORE INSERT ON part_d_1_15
FOR EACH ROW EXECUTE PROCEDURE func_d_1_15();
:init_range_parted;
SET SESSION AUTHORIZATION regress_range_parted_user;
-- Here, RLS checks should succeed while moving row from part_a_10_a_20 to
-- part_d_1_15. Even though the UPDATE is setting 'c' to an odd number, the
-- trigger at the destination partition again makes it an even number.
UPDATE range_parted set a = 'b', c = 151 WHERE a = 'a' and c = 200;
RESET SESSION AUTHORIZATION;
:init_range_parted;
SET SESSION AUTHORIZATION regress_range_parted_user;
-- This should fail with RLS violation error. Even though the UPDATE is setting
-- 'c' to an even number, the trigger at the destination partition again makes
-- it an odd number.
UPDATE range_parted set a = 'b', c = 150 WHERE a = 'a' and c = 200;
ERROR: new row violates row-level security policy for table "range_parted"
-- Cleanup
RESET SESSION AUTHORIZATION;
DROP TRIGGER trig_d_1_15 ON part_d_1_15;
DROP FUNCTION func_d_1_15();
-- Policy expression contains SubPlan
RESET SESSION AUTHORIZATION;
:init_range_parted;
CREATE POLICY policy_range_parted_subplan on range_parted
AS RESTRICTIVE for UPDATE USING (true)
WITH CHECK ((SELECT range_parted.c <= c1 FROM mintab));
SET SESSION AUTHORIZATION regress_range_parted_user;
-- fail, mintab has row with c1 = 120
UPDATE range_parted set a = 'b', c = 122 WHERE a = 'a' and c = 200;
ERROR: new row violates row-level security policy "policy_range_parted_subplan" for table "range_parted"
Implement table partitioning.
Table partitioning is like table inheritance and reuses much of the
existing infrastructure, but there are some important differences.
The parent is called a partitioned table and is always empty; it may
not have indexes or non-inherited constraints, since those make no
sense for a relation with no data of its own. The children are called
partitions and contain all of the actual data. Each partition has an
implicit partitioning constraint. Multiple inheritance is not
allowed, and partitioning and inheritance can't be mixed. Partitions
can't have extra columns and may not allow nulls unless the parent
does. Tuples inserted into the parent are automatically routed to the
correct partition, so tuple-routing ON INSERT triggers are not needed.
Tuple routing isn't yet supported for partitions which are foreign
tables, and it doesn't handle updates that cross partition boundaries.
Currently, tables can be range-partitioned or list-partitioned. List
partitioning is limited to a single column, but range partitioning can
involve multiple columns. A partitioning "column" can be an
expression.
Because table partitioning is less general than table inheritance, it
is hoped that it will be easier to reason about properties of
partitions, and therefore that this will serve as a better foundation
for a variety of possible optimizations, including query planner
optimizations. The tuple routing based which this patch does based on
the implicit partitioning constraints is an example of this, but it
seems likely that many other useful optimizations are also possible.
Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat,
Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova,
Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
-- ok
Allow UPDATE to move rows between partitions.
When an UPDATE causes a row to no longer match the partition
constraint, try to move it to a different partition where it does
match the partition constraint. In essence, the UPDATE is split into
a DELETE from the old partition and an INSERT into the new one. This
can lead to surprising behavior in concurrency scenarios because
EvalPlanQual rechecks won't work as they normally did; the known
problems are documented. (There is a pending patch to improve the
situation further, but it needs more review.)
Amit Khandekar, reviewed and tested by Amit Langote, David Rowley,
Rajkumar Raghuwanshi, Dilip Kumar, Amul Sul, Thomas Munro, Álvaro
Herrera, Amit Kapila, and me. A few final revisions by me.
Discussion: http://postgr.es/m/CAJ3gD9do9o2ccQ7j7+tSgiE1REY65XRiMb=yJO3u3QhyP8EEPQ@mail.gmail.com
2018-01-19 21:33:06 +01:00
UPDATE range_parted set a = 'b', c = 120 WHERE a = 'a' and c = 200;
-- RLS policy expression contains whole row.
RESET SESSION AUTHORIZATION;
:init_range_parted;
CREATE POLICY policy_range_parted_wholerow on range_parted AS RESTRICTIVE for UPDATE USING (true)
WITH CHECK (range_parted = row('b', 10, 112, 1, NULL)::range_parted);
SET SESSION AUTHORIZATION regress_range_parted_user;
-- ok, should pass the RLS check
UPDATE range_parted set a = 'b', c = 112 WHERE a = 'a' and c = 200;
RESET SESSION AUTHORIZATION;
:init_range_parted;
SET SESSION AUTHORIZATION regress_range_parted_user;
-- fail, the whole row RLS check should fail
UPDATE range_parted set a = 'b', c = 116 WHERE a = 'a' and c = 200;
ERROR: new row violates row-level security policy "policy_range_parted_wholerow" for table "range_parted"
-- Cleanup
RESET SESSION AUTHORIZATION;
DROP POLICY policy_range_parted ON range_parted;
DROP POLICY policy_range_parted_subplan ON range_parted;
DROP POLICY policy_range_parted_wholerow ON range_parted;
REVOKE ALL ON range_parted, mintab FROM regress_range_parted_user;
DROP USER regress_range_parted_user;
DROP TABLE mintab;
-- statement triggers with update row movement
---------------------------------------------------
:init_range_parted;
CREATE FUNCTION trigfunc() returns trigger language plpgsql as
$$
begin
raise notice 'trigger = % fired on table % during %',
TG_NAME, TG_TABLE_NAME, TG_OP;
return null;
end;
$$;
-- Triggers on root partition
CREATE TRIGGER parent_delete_trig
AFTER DELETE ON range_parted for each statement execute procedure trigfunc();
CREATE TRIGGER parent_update_trig
AFTER UPDATE ON range_parted for each statement execute procedure trigfunc();
CREATE TRIGGER parent_insert_trig
AFTER INSERT ON range_parted for each statement execute procedure trigfunc();
-- Triggers on leaf partition part_c_1_100
CREATE TRIGGER c1_delete_trig
AFTER DELETE ON part_c_1_100 for each statement execute procedure trigfunc();
CREATE TRIGGER c1_update_trig
AFTER UPDATE ON part_c_1_100 for each statement execute procedure trigfunc();
CREATE TRIGGER c1_insert_trig
AFTER INSERT ON part_c_1_100 for each statement execute procedure trigfunc();
-- Triggers on leaf partition part_d_1_15
CREATE TRIGGER d1_delete_trig
AFTER DELETE ON part_d_1_15 for each statement execute procedure trigfunc();
CREATE TRIGGER d1_update_trig
AFTER UPDATE ON part_d_1_15 for each statement execute procedure trigfunc();
CREATE TRIGGER d1_insert_trig
AFTER INSERT ON part_d_1_15 for each statement execute procedure trigfunc();
-- Triggers on leaf partition part_d_15_20
CREATE TRIGGER d15_delete_trig
AFTER DELETE ON part_d_15_20 for each statement execute procedure trigfunc();
CREATE TRIGGER d15_update_trig
AFTER UPDATE ON part_d_15_20 for each statement execute procedure trigfunc();
CREATE TRIGGER d15_insert_trig
AFTER INSERT ON part_d_15_20 for each statement execute procedure trigfunc();
-- Move all rows from part_c_100_200 to part_c_1_100. None of the delete or
-- insert statement triggers should be fired.
UPDATE range_parted set c = c - 50 WHERE c > 97;
NOTICE: trigger = parent_update_trig fired on table range_parted during UPDATE
:show_data;
partname | a | b | c | d | e
----------------+---+----+-----+----+---
part_a_10_a_20 | a | 10 | 150 | 1 |
part_a_1_a_10 | a | 1 | 1 | 1 |
part_c_1_100 | b | 12 | 96 | 1 |
part_c_1_100 | b | 13 | 97 | 2 |
part_c_1_100 | b | 15 | 55 | 16 |
part_c_1_100 | b | 17 | 55 | 19 |
(6 rows)
DROP TRIGGER parent_delete_trig ON range_parted;
DROP TRIGGER parent_update_trig ON range_parted;
DROP TRIGGER parent_insert_trig ON range_parted;
DROP TRIGGER c1_delete_trig ON part_c_1_100;
DROP TRIGGER c1_update_trig ON part_c_1_100;
DROP TRIGGER c1_insert_trig ON part_c_1_100;
DROP TRIGGER d1_delete_trig ON part_d_1_15;
DROP TRIGGER d1_update_trig ON part_d_1_15;
DROP TRIGGER d1_insert_trig ON part_d_1_15;
DROP TRIGGER d15_delete_trig ON part_d_15_20;
DROP TRIGGER d15_update_trig ON part_d_15_20;
DROP TRIGGER d15_insert_trig ON part_d_15_20;
Allow a partitioned table to have a default partition.
Any tuples that don't route to any other partition will route to the
default partition.
Jeevan Ladhe, Beena Emerson, Ashutosh Bapat, Rahila Syed, and Robert
Haas, with review and testing at various stages by (at least) Rushabh
Lathia, Keith Fiske, Amit Langote, Amul Sul, Rajkumar Raghuanshi, Sven
Kunze, Kyotaro Horiguchi, Thom Brown, Rafia Sabih, and Dilip Kumar.
Discussion: http://postgr.es/m/CAH2L28tbN4SYyhS7YV1YBWcitkqbhSWfQCy0G=apRcC_PEO-bg@mail.gmail.com
Discussion: http://postgr.es/m/CAOG9ApEYj34fWMcvBMBQ-YtqR9fTdXhdN82QEKG0SVZ6zeL1xg@mail.gmail.com
2017-09-08 23:28:04 +02:00
-- Creating default partition for range
Allow UPDATE to move rows between partitions.
When an UPDATE causes a row to no longer match the partition
constraint, try to move it to a different partition where it does
match the partition constraint. In essence, the UPDATE is split into
a DELETE from the old partition and an INSERT into the new one. This
can lead to surprising behavior in concurrency scenarios because
EvalPlanQual rechecks won't work as they normally did; the known
problems are documented. (There is a pending patch to improve the
situation further, but it needs more review.)
Amit Khandekar, reviewed and tested by Amit Langote, David Rowley,
Rajkumar Raghuwanshi, Dilip Kumar, Amul Sul, Thomas Munro, Álvaro
Herrera, Amit Kapila, and me. A few final revisions by me.
Discussion: http://postgr.es/m/CAJ3gD9do9o2ccQ7j7+tSgiE1REY65XRiMb=yJO3u3QhyP8EEPQ@mail.gmail.com
2018-01-19 21:33:06 +01:00
:init_range_parted;
Allow a partitioned table to have a default partition.
Any tuples that don't route to any other partition will route to the
default partition.
Jeevan Ladhe, Beena Emerson, Ashutosh Bapat, Rahila Syed, and Robert
Haas, with review and testing at various stages by (at least) Rushabh
Lathia, Keith Fiske, Amit Langote, Amul Sul, Rajkumar Raghuanshi, Sven
Kunze, Kyotaro Horiguchi, Thom Brown, Rafia Sabih, and Dilip Kumar.
Discussion: http://postgr.es/m/CAH2L28tbN4SYyhS7YV1YBWcitkqbhSWfQCy0G=apRcC_PEO-bg@mail.gmail.com
Discussion: http://postgr.es/m/CAOG9ApEYj34fWMcvBMBQ-YtqR9fTdXhdN82QEKG0SVZ6zeL1xg@mail.gmail.com
2017-09-08 23:28:04 +02:00
create table part_def partition of range_parted default;
\d+ part_def
Allow UPDATE to move rows between partitions.
When an UPDATE causes a row to no longer match the partition
constraint, try to move it to a different partition where it does
match the partition constraint. In essence, the UPDATE is split into
a DELETE from the old partition and an INSERT into the new one. This
can lead to surprising behavior in concurrency scenarios because
EvalPlanQual rechecks won't work as they normally did; the known
problems are documented. (There is a pending patch to improve the
situation further, but it needs more review.)
Amit Khandekar, reviewed and tested by Amit Langote, David Rowley,
Rajkumar Raghuwanshi, Dilip Kumar, Amul Sul, Thomas Munro, Álvaro
Herrera, Amit Kapila, and me. A few final revisions by me.
Discussion: http://postgr.es/m/CAJ3gD9do9o2ccQ7j7+tSgiE1REY65XRiMb=yJO3u3QhyP8EEPQ@mail.gmail.com
2018-01-19 21:33:06 +01:00
Table "public.part_def"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
--------+-------------------+-----------+----------+---------+----------+--------------+-------------
a | text | | | | extended | |
b | bigint | | | | plain | |
c | numeric | | | | main | |
d | integer | | | | plain | |
e | character varying | | | | extended | |
Allow a partitioned table to have a default partition.
Any tuples that don't route to any other partition will route to the
default partition.
Jeevan Ladhe, Beena Emerson, Ashutosh Bapat, Rahila Syed, and Robert
Haas, with review and testing at various stages by (at least) Rushabh
Lathia, Keith Fiske, Amit Langote, Amul Sul, Rajkumar Raghuanshi, Sven
Kunze, Kyotaro Horiguchi, Thom Brown, Rafia Sabih, and Dilip Kumar.
Discussion: http://postgr.es/m/CAH2L28tbN4SYyhS7YV1YBWcitkqbhSWfQCy0G=apRcC_PEO-bg@mail.gmail.com
Discussion: http://postgr.es/m/CAOG9ApEYj34fWMcvBMBQ-YtqR9fTdXhdN82QEKG0SVZ6zeL1xg@mail.gmail.com
2017-09-08 23:28:04 +02:00
Partition of: range_parted DEFAULT
Allow UPDATE to move rows between partitions.
When an UPDATE causes a row to no longer match the partition
constraint, try to move it to a different partition where it does
match the partition constraint. In essence, the UPDATE is split into
a DELETE from the old partition and an INSERT into the new one. This
can lead to surprising behavior in concurrency scenarios because
EvalPlanQual rechecks won't work as they normally did; the known
problems are documented. (There is a pending patch to improve the
situation further, but it needs more review.)
Amit Khandekar, reviewed and tested by Amit Langote, David Rowley,
Rajkumar Raghuwanshi, Dilip Kumar, Amul Sul, Thomas Munro, Álvaro
Herrera, Amit Kapila, and me. A few final revisions by me.
Discussion: http://postgr.es/m/CAJ3gD9do9o2ccQ7j7+tSgiE1REY65XRiMb=yJO3u3QhyP8EEPQ@mail.gmail.com
2018-01-19 21:33:06 +01:00
Partition constraint: (NOT ((a IS NOT NULL) AND (b IS NOT NULL) AND (((a = 'a'::text) AND (b >= '1'::bigint) AND (b < '10'::bigint)) OR ((a = 'a'::text) AND (b >= '10'::bigint) AND (b < '20'::bigint)) OR ((a = 'b'::text) AND (b >= '1'::bigint) AND (b < '10'::bigint)) OR ((a = 'b'::text) AND (b >= '10'::bigint) AND (b < '20'::bigint)) OR ((a = 'b'::text) AND (b >= '20'::bigint) AND (b < '30'::bigint)))))
Allow a partitioned table to have a default partition.
Any tuples that don't route to any other partition will route to the
default partition.
Jeevan Ladhe, Beena Emerson, Ashutosh Bapat, Rahila Syed, and Robert
Haas, with review and testing at various stages by (at least) Rushabh
Lathia, Keith Fiske, Amit Langote, Amul Sul, Rajkumar Raghuanshi, Sven
Kunze, Kyotaro Horiguchi, Thom Brown, Rafia Sabih, and Dilip Kumar.
Discussion: http://postgr.es/m/CAH2L28tbN4SYyhS7YV1YBWcitkqbhSWfQCy0G=apRcC_PEO-bg@mail.gmail.com
Discussion: http://postgr.es/m/CAOG9ApEYj34fWMcvBMBQ-YtqR9fTdXhdN82QEKG0SVZ6zeL1xg@mail.gmail.com
2017-09-08 23:28:04 +02:00
insert into range_parted values ('c', 9);
-- ok
update part_def set a = 'd' where a = 'c';
-- fail
update part_def set a = 'a' where a = 'd';
ERROR: new row for relation "part_def" violates partition constraint
Allow UPDATE to move rows between partitions.
When an UPDATE causes a row to no longer match the partition
constraint, try to move it to a different partition where it does
match the partition constraint. In essence, the UPDATE is split into
a DELETE from the old partition and an INSERT into the new one. This
can lead to surprising behavior in concurrency scenarios because
EvalPlanQual rechecks won't work as they normally did; the known
problems are documented. (There is a pending patch to improve the
situation further, but it needs more review.)
Amit Khandekar, reviewed and tested by Amit Langote, David Rowley,
Rajkumar Raghuwanshi, Dilip Kumar, Amul Sul, Thomas Munro, Álvaro
Herrera, Amit Kapila, and me. A few final revisions by me.
Discussion: http://postgr.es/m/CAJ3gD9do9o2ccQ7j7+tSgiE1REY65XRiMb=yJO3u3QhyP8EEPQ@mail.gmail.com
2018-01-19 21:33:06 +01:00
DETAIL: Failing row contains (a, 9, null, null, null).
:show_data;
partname | a | b | c | d | e
----------------+---+----+-----+----+---
part_a_10_a_20 | a | 10 | 200 | 1 |
part_a_1_a_10 | a | 1 | 1 | 1 |
part_c_1_100 | b | 12 | 96 | 1 |
part_c_1_100 | b | 13 | 97 | 2 |
part_d_15_20 | b | 15 | 105 | 16 |
part_d_15_20 | b | 17 | 105 | 19 |
part_def | d | 9 | | |
(7 rows)
-- Update row movement from non-default to default partition.
-- fail, default partition is not under part_a_10_a_20;
UPDATE part_a_10_a_20 set a = 'ad' WHERE a = 'a';
ERROR: new row for relation "part_a_10_a_20" violates partition constraint
DETAIL: Failing row contains (ad, 10, 200, 1, null).
-- ok
UPDATE range_parted set a = 'ad' WHERE a = 'a';
UPDATE range_parted set a = 'bd' WHERE a = 'b';
:show_data;
partname | a | b | c | d | e
----------+----+----+-----+----+---
part_def | ad | 1 | 1 | 1 |
part_def | ad | 10 | 200 | 1 |
part_def | bd | 12 | 96 | 1 |
part_def | bd | 13 | 97 | 2 |
part_def | bd | 15 | 105 | 16 |
part_def | bd | 17 | 105 | 19 |
part_def | d | 9 | | |
(7 rows)
-- Update row movement from default to non-default partitions.
-- ok
UPDATE range_parted set a = 'a' WHERE a = 'ad';
UPDATE range_parted set a = 'b' WHERE a = 'bd';
:show_data;
partname | a | b | c | d | e
----------------+---+----+-----+----+---
part_a_10_a_20 | a | 10 | 200 | 1 |
part_a_1_a_10 | a | 1 | 1 | 1 |
part_c_1_100 | b | 12 | 96 | 1 |
part_c_1_100 | b | 13 | 97 | 2 |
part_d_15_20 | b | 15 | 105 | 16 |
part_d_15_20 | b | 17 | 105 | 19 |
part_def | d | 9 | | |
(7 rows)
-- Cleanup: range_parted no longer needed.
DROP TABLE range_parted;
CREATE TABLE list_parted (
Allow a partitioned table to have a default partition.
Any tuples that don't route to any other partition will route to the
default partition.
Jeevan Ladhe, Beena Emerson, Ashutosh Bapat, Rahila Syed, and Robert
Haas, with review and testing at various stages by (at least) Rushabh
Lathia, Keith Fiske, Amit Langote, Amul Sul, Rajkumar Raghuanshi, Sven
Kunze, Kyotaro Horiguchi, Thom Brown, Rafia Sabih, and Dilip Kumar.
Discussion: http://postgr.es/m/CAH2L28tbN4SYyhS7YV1YBWcitkqbhSWfQCy0G=apRcC_PEO-bg@mail.gmail.com
Discussion: http://postgr.es/m/CAOG9ApEYj34fWMcvBMBQ-YtqR9fTdXhdN82QEKG0SVZ6zeL1xg@mail.gmail.com
2017-09-08 23:28:04 +02:00
a text,
b int
Allow UPDATE to move rows between partitions.
When an UPDATE causes a row to no longer match the partition
constraint, try to move it to a different partition where it does
match the partition constraint. In essence, the UPDATE is split into
a DELETE from the old partition and an INSERT into the new one. This
can lead to surprising behavior in concurrency scenarios because
EvalPlanQual rechecks won't work as they normally did; the known
problems are documented. (There is a pending patch to improve the
situation further, but it needs more review.)
Amit Khandekar, reviewed and tested by Amit Langote, David Rowley,
Rajkumar Raghuwanshi, Dilip Kumar, Amul Sul, Thomas Munro, Álvaro
Herrera, Amit Kapila, and me. A few final revisions by me.
Discussion: http://postgr.es/m/CAJ3gD9do9o2ccQ7j7+tSgiE1REY65XRiMb=yJO3u3QhyP8EEPQ@mail.gmail.com
2018-01-19 21:33:06 +01:00
) PARTITION BY list (a);
CREATE TABLE list_part1 PARTITION OF list_parted for VALUES in ('a', 'b');
CREATE TABLE list_default PARTITION OF list_parted default;
INSERT into list_part1 VALUES ('a', 1);
INSERT into list_default VALUES ('d', 10);
Allow a partitioned table to have a default partition.
Any tuples that don't route to any other partition will route to the
default partition.
Jeevan Ladhe, Beena Emerson, Ashutosh Bapat, Rahila Syed, and Robert
Haas, with review and testing at various stages by (at least) Rushabh
Lathia, Keith Fiske, Amit Langote, Amul Sul, Rajkumar Raghuanshi, Sven
Kunze, Kyotaro Horiguchi, Thom Brown, Rafia Sabih, and Dilip Kumar.
Discussion: http://postgr.es/m/CAH2L28tbN4SYyhS7YV1YBWcitkqbhSWfQCy0G=apRcC_PEO-bg@mail.gmail.com
Discussion: http://postgr.es/m/CAOG9ApEYj34fWMcvBMBQ-YtqR9fTdXhdN82QEKG0SVZ6zeL1xg@mail.gmail.com
2017-09-08 23:28:04 +02:00
-- fail
Allow UPDATE to move rows between partitions.
When an UPDATE causes a row to no longer match the partition
constraint, try to move it to a different partition where it does
match the partition constraint. In essence, the UPDATE is split into
a DELETE from the old partition and an INSERT into the new one. This
can lead to surprising behavior in concurrency scenarios because
EvalPlanQual rechecks won't work as they normally did; the known
problems are documented. (There is a pending patch to improve the
situation further, but it needs more review.)
Amit Khandekar, reviewed and tested by Amit Langote, David Rowley,
Rajkumar Raghuwanshi, Dilip Kumar, Amul Sul, Thomas Munro, Álvaro
Herrera, Amit Kapila, and me. A few final revisions by me.
Discussion: http://postgr.es/m/CAJ3gD9do9o2ccQ7j7+tSgiE1REY65XRiMb=yJO3u3QhyP8EEPQ@mail.gmail.com
2018-01-19 21:33:06 +01:00
UPDATE list_default set a = 'a' WHERE a = 'd';
Allow a partitioned table to have a default partition.
Any tuples that don't route to any other partition will route to the
default partition.
Jeevan Ladhe, Beena Emerson, Ashutosh Bapat, Rahila Syed, and Robert
Haas, with review and testing at various stages by (at least) Rushabh
Lathia, Keith Fiske, Amit Langote, Amul Sul, Rajkumar Raghuanshi, Sven
Kunze, Kyotaro Horiguchi, Thom Brown, Rafia Sabih, and Dilip Kumar.
Discussion: http://postgr.es/m/CAH2L28tbN4SYyhS7YV1YBWcitkqbhSWfQCy0G=apRcC_PEO-bg@mail.gmail.com
Discussion: http://postgr.es/m/CAOG9ApEYj34fWMcvBMBQ-YtqR9fTdXhdN82QEKG0SVZ6zeL1xg@mail.gmail.com
2017-09-08 23:28:04 +02:00
ERROR: new row for relation "list_default" violates partition constraint
DETAIL: Failing row contains (a, 10).
-- ok
Allow UPDATE to move rows between partitions.
When an UPDATE causes a row to no longer match the partition
constraint, try to move it to a different partition where it does
match the partition constraint. In essence, the UPDATE is split into
a DELETE from the old partition and an INSERT into the new one. This
can lead to surprising behavior in concurrency scenarios because
EvalPlanQual rechecks won't work as they normally did; the known
problems are documented. (There is a pending patch to improve the
situation further, but it needs more review.)
Amit Khandekar, reviewed and tested by Amit Langote, David Rowley,
Rajkumar Raghuwanshi, Dilip Kumar, Amul Sul, Thomas Munro, Álvaro
Herrera, Amit Kapila, and me. A few final revisions by me.
Discussion: http://postgr.es/m/CAJ3gD9do9o2ccQ7j7+tSgiE1REY65XRiMb=yJO3u3QhyP8EEPQ@mail.gmail.com
2018-01-19 21:33:06 +01:00
UPDATE list_default set a = 'x' WHERE a = 'd';
DROP TABLE list_parted;
--------------
-- Some more update-partition-key test scenarios below. This time use list
-- partitions.
--------------
-- Setup for list partitions
CREATE TABLE list_parted (a numeric, b int, c int8) PARTITION BY list (a);
CREATE TABLE sub_parted PARTITION OF list_parted for VALUES in (1) PARTITION BY list (b);
CREATE TABLE sub_part1(b int, c int8, a numeric);
ALTER TABLE sub_parted ATTACH PARTITION sub_part1 for VALUES in (1);
CREATE TABLE sub_part2(b int, c int8, a numeric);
ALTER TABLE sub_parted ATTACH PARTITION sub_part2 for VALUES in (2);
CREATE TABLE list_part1(a numeric, b int, c int8);
ALTER TABLE list_parted ATTACH PARTITION list_part1 for VALUES in (2,3);
INSERT into list_parted VALUES (2,5,50);
INSERT into list_parted VALUES (3,6,60);
INSERT into sub_parted VALUES (1,1,60);
INSERT into sub_parted VALUES (1,2,10);
-- Test partition constraint violation when intermediate ancestor is used and
-- constraint is inherited from upper root.
UPDATE sub_parted set a = 2 WHERE c = 10;
ERROR: new row for relation "sub_part2" violates partition constraint
DETAIL: Failing row contains (2, 10, 2).
-- Test update-partition-key, where the unpruned partitions do not have their
-- partition keys updated.
SELECT tableoid::regclass::text, * FROM list_parted WHERE a = 2 ORDER BY 1;
tableoid | a | b | c
------------+---+---+----
list_part1 | 2 | 5 | 50
(1 row)
UPDATE list_parted set b = c + a WHERE a = 2;
SELECT tableoid::regclass::text, * FROM list_parted WHERE a = 2 ORDER BY 1;
tableoid | a | b | c
------------+---+----+----
list_part1 | 2 | 52 | 50
(1 row)
-- Test the case where BR UPDATE triggers change the partition key.
CREATE FUNCTION func_parted_mod_b() returns trigger as $$
BEGIN
NEW.b = 2; -- This is changing partition key column.
return NEW;
END $$ LANGUAGE plpgsql;
CREATE TRIGGER parted_mod_b before update on sub_part1
for each row execute procedure func_parted_mod_b();
SELECT tableoid::regclass::text, * FROM list_parted ORDER BY 1, 2, 3, 4;
tableoid | a | b | c
------------+---+----+----
list_part1 | 2 | 52 | 50
list_part1 | 3 | 6 | 60
sub_part1 | 1 | 1 | 60
sub_part2 | 1 | 2 | 10
(4 rows)
-- This should do the tuple routing even though there is no explicit
-- partition-key update, because there is a trigger on sub_part1.
UPDATE list_parted set c = 70 WHERE b = 1;
SELECT tableoid::regclass::text, * FROM list_parted ORDER BY 1, 2, 3, 4;
tableoid | a | b | c
------------+---+----+----
list_part1 | 2 | 52 | 50
list_part1 | 3 | 6 | 60
sub_part2 | 1 | 2 | 10
sub_part2 | 1 | 2 | 70
(4 rows)
DROP TRIGGER parted_mod_b ON sub_part1;
-- If BR DELETE trigger prevented DELETE from happening, we should also skip
-- the INSERT if that delete is part of UPDATE=>DELETE+INSERT.
CREATE OR REPLACE FUNCTION func_parted_mod_b() returns trigger as $$
BEGIN
raise notice 'Trigger: Got OLD row %, but returning NULL', OLD;
return NULL;
END $$ LANGUAGE plpgsql;
CREATE TRIGGER trig_skip_delete before delete on sub_part2
for each row execute procedure func_parted_mod_b();
UPDATE list_parted set b = 1 WHERE c = 70;
NOTICE: Trigger: Got OLD row (2,70,1), but returning NULL
SELECT tableoid::regclass::text, * FROM list_parted ORDER BY 1, 2, 3, 4;
tableoid | a | b | c
------------+---+----+----
list_part1 | 2 | 52 | 50
list_part1 | 3 | 6 | 60
sub_part2 | 1 | 2 | 10
sub_part2 | 1 | 2 | 70
(4 rows)
-- Drop the trigger. Now the row should be moved.
DROP TRIGGER trig_skip_delete ON sub_part2;
UPDATE list_parted set b = 1 WHERE c = 70;
SELECT tableoid::regclass::text, * FROM list_parted ORDER BY 1, 2, 3, 4;
tableoid | a | b | c
------------+---+----+----
list_part1 | 2 | 52 | 50
list_part1 | 3 | 6 | 60
sub_part1 | 1 | 1 | 70
sub_part2 | 1 | 2 | 10
(4 rows)
DROP FUNCTION func_parted_mod_b();
-- UPDATE partition-key with FROM clause. If join produces multiple output
-- rows for the same row to be modified, we should tuple-route the row only
-- once. There should not be any rows inserted.
CREATE TABLE non_parted (id int);
INSERT into non_parted VALUES (1), (1), (1), (2), (2), (2), (3), (3), (3);
UPDATE list_parted t1 set a = 2 FROM non_parted t2 WHERE t1.a = t2.id and a = 1;
SELECT tableoid::regclass::text, * FROM list_parted ORDER BY 1, 2, 3, 4;
tableoid | a | b | c
------------+---+----+----
list_part1 | 2 | 1 | 70
list_part1 | 2 | 2 | 10
list_part1 | 2 | 52 | 50
list_part1 | 3 | 6 | 60
(4 rows)
DROP TABLE non_parted;
-- Cleanup: list_parted no longer needed.
DROP TABLE list_parted;
Add hash partitioning.
Hash partitioning is useful when you want to partition a growing data
set evenly. This can be useful to keep table sizes reasonable, which
makes maintenance operations such as VACUUM faster, or to enable
partition-wise join.
At present, we still depend on constraint exclusion for partitioning
pruning, and the shape of the partition constraints for hash
partitioning is such that that doesn't work. Work is underway to fix
that, which should both improve performance and make partitioning
pruning work with hash partitioning.
Amul Sul, reviewed and tested by Dilip Kumar, Ashutosh Bapat, Yugo
Nagata, Rajkumar Raghuwanshi, Jesper Pedersen, and by me. A few
final tweaks also by me.
Discussion: http://postgr.es/m/CAAJ_b96fhpJAP=ALbETmeLk1Uni_GFZD938zgenhF49qgDTjaQ@mail.gmail.com
2017-11-10 00:07:25 +01:00
-- create custom operator class and hash function, for the same reason
-- explained in alter_table.sql
create or replace function dummy_hashint4(a int4, seed int8) returns int8 as
$$ begin return (a + seed); end; $$ language 'plpgsql' immutable;
create operator class custom_opclass for type int4 using hash as
operator 1 = , function 2 dummy_hashint4(int4, int8);
create table hash_parted (
a int,
b int
) partition by hash (a custom_opclass, b custom_opclass);
create table hpart1 partition of hash_parted for values with (modulus 2, remainder 1);
create table hpart2 partition of hash_parted for values with (modulus 4, remainder 2);
create table hpart3 partition of hash_parted for values with (modulus 8, remainder 0);
create table hpart4 partition of hash_parted for values with (modulus 8, remainder 4);
insert into hpart1 values (1, 1);
insert into hpart2 values (2, 5);
insert into hpart4 values (3, 4);
-- fail
update hpart1 set a = 3, b=4 where a = 1;
ERROR: new row for relation "hpart1" violates partition constraint
DETAIL: Failing row contains (3, 4).
Allow UPDATE to move rows between partitions.
When an UPDATE causes a row to no longer match the partition
constraint, try to move it to a different partition where it does
match the partition constraint. In essence, the UPDATE is split into
a DELETE from the old partition and an INSERT into the new one. This
can lead to surprising behavior in concurrency scenarios because
EvalPlanQual rechecks won't work as they normally did; the known
problems are documented. (There is a pending patch to improve the
situation further, but it needs more review.)
Amit Khandekar, reviewed and tested by Amit Langote, David Rowley,
Rajkumar Raghuwanshi, Dilip Kumar, Amul Sul, Thomas Munro, Álvaro
Herrera, Amit Kapila, and me. A few final revisions by me.
Discussion: http://postgr.es/m/CAJ3gD9do9o2ccQ7j7+tSgiE1REY65XRiMb=yJO3u3QhyP8EEPQ@mail.gmail.com
2018-01-19 21:33:06 +01:00
-- ok, row movement
Add hash partitioning.
Hash partitioning is useful when you want to partition a growing data
set evenly. This can be useful to keep table sizes reasonable, which
makes maintenance operations such as VACUUM faster, or to enable
partition-wise join.
At present, we still depend on constraint exclusion for partitioning
pruning, and the shape of the partition constraints for hash
partitioning is such that that doesn't work. Work is underway to fix
that, which should both improve performance and make partitioning
pruning work with hash partitioning.
Amul Sul, reviewed and tested by Dilip Kumar, Ashutosh Bapat, Yugo
Nagata, Rajkumar Raghuwanshi, Jesper Pedersen, and by me. A few
final tweaks also by me.
Discussion: http://postgr.es/m/CAAJ_b96fhpJAP=ALbETmeLk1Uni_GFZD938zgenhF49qgDTjaQ@mail.gmail.com
2017-11-10 00:07:25 +01:00
update hash_parted set b = b - 1 where b = 1;
-- ok
update hash_parted set b = b + 8 where b = 1;
Implement table partitioning.
Table partitioning is like table inheritance and reuses much of the
existing infrastructure, but there are some important differences.
The parent is called a partitioned table and is always empty; it may
not have indexes or non-inherited constraints, since those make no
sense for a relation with no data of its own. The children are called
partitions and contain all of the actual data. Each partition has an
implicit partitioning constraint. Multiple inheritance is not
allowed, and partitioning and inheritance can't be mixed. Partitions
can't have extra columns and may not allow nulls unless the parent
does. Tuples inserted into the parent are automatically routed to the
correct partition, so tuple-routing ON INSERT triggers are not needed.
Tuple routing isn't yet supported for partitions which are foreign
tables, and it doesn't handle updates that cross partition boundaries.
Currently, tables can be range-partitioned or list-partitioned. List
partitioning is limited to a single column, but range partitioning can
involve multiple columns. A partitioning "column" can be an
expression.
Because table partitioning is less general than table inheritance, it
is hoped that it will be easier to reason about properties of
partitions, and therefore that this will serve as a better foundation
for a variety of possible optimizations, including query planner
optimizations. The tuple routing based which this patch does based on
the implicit partitioning constraints is an example of this, but it
seems likely that many other useful optimizations are also possible.
Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat,
Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova,
Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
-- cleanup
Add hash partitioning.
Hash partitioning is useful when you want to partition a growing data
set evenly. This can be useful to keep table sizes reasonable, which
makes maintenance operations such as VACUUM faster, or to enable
partition-wise join.
At present, we still depend on constraint exclusion for partitioning
pruning, and the shape of the partition constraints for hash
partitioning is such that that doesn't work. Work is underway to fix
that, which should both improve performance and make partitioning
pruning work with hash partitioning.
Amul Sul, reviewed and tested by Dilip Kumar, Ashutosh Bapat, Yugo
Nagata, Rajkumar Raghuwanshi, Jesper Pedersen, and by me. A few
final tweaks also by me.
Discussion: http://postgr.es/m/CAAJ_b96fhpJAP=ALbETmeLk1Uni_GFZD938zgenhF49qgDTjaQ@mail.gmail.com
2017-11-10 00:07:25 +01:00
drop table hash_parted;
drop operator class custom_opclass using hash;
drop function dummy_hashint4(a int4, seed int8);