/*-------------------------------------------------------------------------
 *
 * nodeModifyTable.c
 *	  routines to handle ModifyTable nodes.
 *
 * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
 *
 *
 * IDENTIFICATION
 *	  src/backend/executor/nodeModifyTable.c
 *
 *-------------------------------------------------------------------------
 */
/* INTERFACE ROUTINES
 *		ExecInitModifyTable - initialize the ModifyTable node
 *		ExecModifyTable		- retrieve the next tuple from the node
 *		ExecEndModifyTable	- shut down the ModifyTable node
 *		ExecReScanModifyTable - rescan the ModifyTable node
 *
 *	 NOTES
 *		Each ModifyTable node contains a list of one or more subplans,
 *		much like an Append node. There is one subplan per result relation.
 *		The key reason for this is that in an inherited UPDATE command, each
 *		result relation could have a different schema (more or different
 *		columns) requiring a different plan tree to produce it. In an
 *		inherited DELETE, all the subplans should produce the same output
 *		rowtype, but we might still find that different plans are appropriate
 *		for different child relations.
 *
 *		If the query specifies RETURNING, then the ModifyTable returns a
 *		RETURNING tuple after completing each row insert, update, or delete.
 *		It must be called again to continue the operation. Without RETURNING,
 *		we just loop within the node until all the work is done, then
 *		return NULL. This avoids useless call/return overhead.
 */
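The call protocol described in the NOTES above can be sketched with a toy node. This is a minimal illustration under stated assumptions, not PostgreSQL code: `ToyModifyNode`, `toy_modify_next`, and the integer "rows" are hypothetical stand-ins for the executor's plan state and tuple slots.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a ModifyTable node's run-time state. */
typedef struct ToyModifyNode
{
	const int  *rows;			/* rows produced by the (single) subplan */
	size_t		nrows;			/* number of rows in the subplan output */
	size_t		next;			/* index of the next row to process */
	bool		has_returning;	/* does the query specify RETURNING? */
	size_t		processed;		/* rows modified so far */
} ToyModifyNode;

/*
 * Process rows. With RETURNING, hand back one row per call; the caller
 * must call again to continue. Without RETURNING, loop until all rows
 * are done and then return NULL.
 */
static const int *
toy_modify_next(ToyModifyNode *node)
{
	while (node->next < node->nrows)
	{
		const int  *row = &node->rows[node->next++];

		node->processed++;		/* the "insert/update/delete" happens here */
		if (node->has_returning)
			return row;			/* caller resumes us for the next row */
	}
	return NULL;				/* all work done */
}
```

With `has_returning` set, each call modifies one row and returns it; with it cleared, a single call processes every row and returns NULL, which is the overhead saving the NOTES mention.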

#include "postgres.h"

#include "access/htup_details.h"
#include "access/xact.h"
#include "commands/trigger.h"
#include "executor/execPartition.h"
#include "executor/executor.h"
MERGE SQL Command following SQL:2016
MERGE performs actions that modify rows in the target table
using a source table or query. MERGE provides a single SQL
statement that can conditionally INSERT/UPDATE/DELETE rows,
a task that would otherwise require multiple PL statements.
e.g.
MERGE INTO target AS t
USING source AS s
ON t.tid = s.sid
WHEN MATCHED AND t.balance > s.delta THEN
UPDATE SET balance = t.balance - s.delta
WHEN MATCHED THEN
DELETE
WHEN NOT MATCHED AND s.delta > 0 THEN
INSERT VALUES (s.sid, s.delta)
WHEN NOT MATCHED THEN
DO NOTHING;
MERGE works with regular and partitioned tables, including
column and row security enforcement, as well as support for
row, statement and transition triggers.
MERGE is optimized for OLTP and is parameterizable, though
also useful for large scale ETL/ELT. MERGE is not intended
to be used in preference to existing single SQL commands
for INSERT, UPDATE or DELETE since there is some overhead.
MERGE can be used statically from PL/pgSQL.
MERGE does not yet support inheritance, write rules,
RETURNING clauses, updatable views or foreign tables.
MERGE follows SQL Standard per the most recent SQL:2016.
Includes full tests and documentation, including full
isolation tests to demonstrate the concurrent behavior.
This version written from scratch in 2017 by Simon Riggs,
using docs and tests originally written in 2009. Later work
from Pavan Deolasee has been both complex and deep, leaving
the lead author credit now in his hands.
Extensive discussion of concurrency from Peter Geoghegan,
with thanks for the time and effort contributed.
Various issues reported via sqlsmith by Andreas Seltenreich
Authors: Pavan Deolasee, Simon Riggs
Reviewer: Peter Geoghegan, Amit Langote, Tomas Vondra, Simon Riggs
Discussion:
https://postgr.es/m/CANP8+jKitBSrB7oTgT9CY2i1ObfOt36z0XMraQc+Xrz8QB0nXA@mail.gmail.com
https://postgr.es/m/CAH2-WzkJdBuxj9PO=2QaO9-3h3xGbQPZ34kJH=HukRekwM-GZg@mail.gmail.com
2018-04-03 10:28:16 +02:00
#include "executor/nodeMerge.h"
#include "executor/nodeModifyTable.h"
#include "foreign/fdwapi.h"
#include "miscadmin.h"
#include "nodes/nodeFuncs.h"
#include "storage/bufmgr.h"
Add support for INSERT ... ON CONFLICT DO NOTHING/UPDATE.
The newly added ON CONFLICT clause allows specifying an alternative to
raising a unique or exclusion constraint violation error when inserting.
ON CONFLICT refers to constraints that can either be specified using an
inference clause (by specifying the columns of a unique constraint) or
by naming a unique or exclusion constraint. DO NOTHING avoids the
constraint violation, without touching the pre-existing row. DO UPDATE
SET ... [WHERE ...] updates the pre-existing tuple, and has access to
both the tuple proposed for insertion and the existing tuple; the
optional WHERE clause can be used to prevent an update from being
executed. The UPDATE SET and WHERE clauses have access to the tuple
proposed for insertion using the "magic" EXCLUDED alias, and to the
pre-existing tuple using the table name or its alias.
This feature is often referred to as upsert.
This is implemented using a new infrastructure called "speculative
insertion". It is an optimistic variant of regular insertion that first
does a pre-check for existing tuples and then attempts an insert. If a
violating tuple was inserted concurrently, the speculatively inserted
tuple is deleted and a new attempt is made. If the pre-check finds a
matching tuple the alternative DO NOTHING or DO UPDATE action is taken.
If the insertion succeeds without detecting a conflict, the tuple is
deemed inserted.
To handle the possible ambiguity between the excluded alias and a table
named excluded, and for convenience with long relation names, INSERT
INTO now can alias its target table.
Bumps catversion as stored rules change.
Author: Peter Geoghegan, with significant contributions from Heikki
Linnakangas and Andres Freund. Testing infrastructure by Jeff Janes.
Reviewed-By: Heikki Linnakangas, Andres Freund, Robert Haas, Simon Riggs,
Dean Rasheed, Stephen Frost and many others.
2015-05-08 05:31:36 +02:00
#include "storage/lmgr.h"
#include "utils/builtins.h"
#include "utils/memutils.h"
#include "utils/rel.h"
#include "utils/tqual.h"

static bool ExecOnConflictUpdate(ModifyTableState *mtstate,
					 ResultRelInfo *resultRelInfo,
					 ItemPointer conflictTid,
					 TupleTableSlot *planSlot,
					 TupleTableSlot *excludedSlot,
					 EState *estate,
					 bool canSetTag,
					 TupleTableSlot **returning);
Allow UPDATE to move rows between partitions.
When an UPDATE causes a row to no longer match the partition
constraint, try to move it to a different partition where it does
match the partition constraint. In essence, the UPDATE is split into
a DELETE from the old partition and an INSERT into the new one. This
can lead to surprising behavior in concurrency scenarios because
EvalPlanQual rechecks won't work as they normally did; the known
problems are documented. (There is a pending patch to improve the
situation further, but it needs more review.)
Amit Khandekar, reviewed and tested by Amit Langote, David Rowley,
Rajkumar Raghuwanshi, Dilip Kumar, Amul Sul, Thomas Munro, Álvaro
Herrera, Amit Kapila, and me. A few final revisions by me.
Discussion: http://postgr.es/m/CAJ3gD9do9o2ccQ7j7+tSgiE1REY65XRiMb=yJO3u3QhyP8EEPQ@mail.gmail.com
2018-01-19 21:33:06 +01:00
static ResultRelInfo *getTargetResultRelInfo(ModifyTableState *node);
static void ExecSetupChildParentMapForTcs(ModifyTableState *mtstate);
static void ExecSetupChildParentMapForSubplan(ModifyTableState *mtstate);
static TupleConversionMap *tupconv_map_for_subplan(ModifyTableState *node,
						int whichplan);

/* flags for mt_merge_subcommands */
#define MERGE_INSERT	0x01
#define MERGE_UPDATE	0x02
#define MERGE_DELETE	0x04
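
Because the three flags above are distinct bits, any combination of MERGE action types fits in one integer. The following is a minimal sketch of that bitmask idiom; the helpers `merge_note_action` and `merge_has_action` are hypothetical, and how `mt_merge_subcommands` is actually set and tested is not shown in this excerpt.

```c
#include <assert.h>
#include <stdbool.h>

/* Same values as the flags defined above. */
#define MERGE_INSERT	0x01
#define MERGE_UPDATE	0x02
#define MERGE_DELETE	0x04

/* Hypothetical helper: record that an action type occurs in the command. */
static int
merge_note_action(int subcommands, int flag)
{
	return subcommands | flag;
}

/* Hypothetical helper: test whether an action type was recorded. */
static bool
merge_has_action(int subcommands, int flag)
{
	return (subcommands & flag) != 0;
}
```

A command with UPDATE and DELETE actions would carry the value `MERGE_UPDATE | MERGE_DELETE`, and a test for `MERGE_INSERT` on that value would be false.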

/*
 * Verify that the tuples to be produced by INSERT or UPDATE match the
 * target relation's rowtype
 *
 * We do this to guard against stale plans. If plan invalidation is
 * functioning properly then we should never get a failure here, but better
 * safe than sorry. Note that this is called after we have obtained lock
 * on the target rel, so the rowtype can't change underneath us.
 *
 * The plan output is represented by its targetlist, because that makes
 * handling the dropped-column case easier.
 */
static void
ExecCheckPlanOutput(Relation resultRel, List *targetList)
{
	TupleDesc	resultDesc = RelationGetDescr(resultRel);
	int			attno = 0;
	ListCell   *lc;

	foreach(lc, targetList)
	{
		TargetEntry *tle = (TargetEntry *) lfirst(lc);
		Form_pg_attribute attr;

		if (tle->resjunk)
			continue;			/* ignore junk tlist items */

		if (attno >= resultDesc->natts)
			ereport(ERROR,
					(errcode(ERRCODE_DATATYPE_MISMATCH),
					 errmsg("table row type and query-specified row type do not match"),
					 errdetail("Query has too many columns.")));
		attr = TupleDescAttr(resultDesc, attno);
		attno++;

		if (!attr->attisdropped)
		{
			/* Normal case: demand type match */
			if (exprType((Node *) tle->expr) != attr->atttypid)
				ereport(ERROR,
						(errcode(ERRCODE_DATATYPE_MISMATCH),
						 errmsg("table row type and query-specified row type do not match"),
						 errdetail("Table has type %s at ordinal position %d, but query expects %s.",
								   format_type_be(attr->atttypid),
								   attno,
Phase 3 of pgindent updates.
Don't move parenthesized lines to the left, even if that means they
flow past the right margin.
By default, BSD indent lines up statement continuation lines that are
within parentheses so that they start just to the right of the preceding
left parenthesis. However, traditionally, if that resulted in the
continuation line extending to the right of the desired right margin,
then indent would push it left just far enough to not overrun the margin,
if it could do so without making the continuation line start to the left of
the current statement indent. That makes for a weird mix of indentations
unless one has been completely rigid about never violating the 80-column
limit.
This behavior has been pretty universally panned by Postgres developers.
Hence, disable it with indent's new -lpl switch, so that parenthesized
lines are always lined up with the preceding left paren.
This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.
Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 21:35:54 +02:00
								   format_type_be(exprType((Node *) tle->expr)))));
		}
		else
		{
			/*
			 * For a dropped column, we can't check atttypid (it's likely 0).
			 * In any case the planner has most likely inserted an INT4 null.
			 * What we insist on is just *some* NULL constant.
			 */
			if (!IsA(tle->expr, Const) ||
				!((Const *) tle->expr)->constisnull)
				ereport(ERROR,
						(errcode(ERRCODE_DATATYPE_MISMATCH),
						 errmsg("table row type and query-specified row type do not match"),
						 errdetail("Query provides a value for a dropped column at ordinal position %d.",
								   attno)));
		}
	}
	if (attno != resultDesc->natts)
		ereport(ERROR,
				(errcode(ERRCODE_DATATYPE_MISMATCH),
				 errmsg("table row type and query-specified row type do not match"),
				 errdetail("Query has too few columns.")));
}
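
The checks made by ExecCheckPlanOutput above can be restated on simplified data. This is a sketch under stated assumptions, not backend code: `ToyTargetEntry`, `ToyAttribute`, `ToyCheckResult`, and `toy_check_plan_output` are hypothetical, type OIDs are plain integers, and errors are returned as codes instead of being raised with ereport.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for TargetEntry and pg_attribute. */
typedef struct ToyTargetEntry
{
	bool		resjunk;		/* junk entry, not part of the row */
	int			type_id;		/* type of the expression */
	bool		is_null_const;	/* expression is a NULL constant */
} ToyTargetEntry;

typedef struct ToyAttribute
{
	int			type_id;		/* column type */
	bool		is_dropped;		/* column has been dropped */
} ToyAttribute;

typedef enum ToyCheckResult
{
	TOY_OK,
	TOY_TOO_MANY_COLUMNS,
	TOY_TYPE_MISMATCH,
	TOY_VALUE_FOR_DROPPED_COLUMN,
	TOY_TOO_FEW_COLUMNS
} ToyCheckResult;

/* Walk the target list against the table's columns, in order. */
static ToyCheckResult
toy_check_plan_output(const ToyAttribute *attrs, size_t natts,
					  const ToyTargetEntry *tlist, size_t ntles)
{
	size_t		attno = 0;

	for (size_t i = 0; i < ntles; i++)
	{
		const ToyTargetEntry *tle = &tlist[i];
		const ToyAttribute *attr;

		if (tle->resjunk)
			continue;			/* ignore junk tlist items */
		if (attno >= natts)
			return TOY_TOO_MANY_COLUMNS;
		attr = &attrs[attno++];

		if (!attr->is_dropped)
		{
			/* live column: demand a type match */
			if (tle->type_id != attr->type_id)
				return TOY_TYPE_MISMATCH;
		}
		else if (!tle->is_null_const)
			return TOY_VALUE_FOR_DROPPED_COLUMN;	/* need some NULL constant */
	}
	return (attno == natts) ? TOY_OK : TOY_TOO_FEW_COLUMNS;
}
```

Junk entries do not consume a column position, and for a dropped column only the "is a NULL constant" property is checked, never the type.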

/*
 * ExecProcessReturning --- evaluate a RETURNING list
 *
 * resultRelInfo: current result rel
 * tupleSlot: slot holding tuple actually inserted/updated/deleted
 * planSlot: slot holding tuple returned by top subplan node
 *
 * Note: If tupleSlot is NULL, the FDW should have already provided econtext's
 * scan tuple.
 *
 * Returns a slot holding the result tuple
 */
static TupleTableSlot *
ExecProcessReturning(ResultRelInfo *resultRelInfo,
					 TupleTableSlot *tupleSlot,
					 TupleTableSlot *planSlot)
{
	ProjectionInfo *projectReturning = resultRelInfo->ri_projectReturning;
	ExprContext *econtext = projectReturning->pi_exprContext;

	/*
	 * Reset per-tuple memory context to free any expression evaluation
	 * storage allocated in the previous cycle.
	 */
	ResetExprContext(econtext);

	/* Make tuple and any needed join variables available to ExecProject */
	if (tupleSlot)
		econtext->ecxt_scantuple = tupleSlot;
	else
	{
		HeapTuple	tuple;

		/*
		 * RETURNING expressions might reference the tableoid column, so
		 * initialize t_tableOid before evaluating them.
		 */
		Assert(!TupIsNull(econtext->ecxt_scantuple));
		tuple = ExecMaterializeSlot(econtext->ecxt_scantuple);
		tuple->t_tableOid = RelationGetRelid(resultRelInfo->ri_RelationDesc);
	}
	econtext->ecxt_outertuple = planSlot;

	/* Compute the RETURNING expressions */
	return ExecProject(projectReturning);
}

/*
 * ExecCheckHeapTupleVisible -- verify heap tuple is visible
 *
 * It would not be consistent with guarantees of the higher isolation levels to
 * proceed with avoiding insertion (taking speculative insertion's alternative
 * path) on the basis of another tuple that is not visible to MVCC snapshot.
 * Check for the need to raise a serialization failure, and do so as necessary.
 */
static void
ExecCheckHeapTupleVisible(EState *estate,
						  HeapTuple tuple,
						  Buffer buffer)
{
	if (!IsolationUsesXactSnapshot())
		return;

	/*
	 * We need buffer pin and lock to call HeapTupleSatisfiesVisibility.
	 * Caller should be holding pin, but not lock.
	 */
	LockBuffer(buffer, BUFFER_LOCK_SHARE);
	if (!HeapTupleSatisfiesVisibility(tuple, estate->es_snapshot, buffer))
	{
		/*
		 * We should not raise a serialization failure if the conflict is
		 * against a tuple inserted by our own transaction, even if it's not
		 * visible to our snapshot. (This would happen, for example, if
		 * conflicting keys are proposed for insertion in a single command.)
		 */
		if (!TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXmin(tuple->t_data)))
			ereport(ERROR,
					(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
					 errmsg("could not serialize access due to concurrent update")));
	}
	LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
}
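
The decision made by ExecCheckHeapTupleVisible above reduces to three conditions. The following is a minimal sketch that restates it as a pure function; `toy_must_raise_serialization_failure` is hypothetical, and each boolean argument stands in for the result of the corresponding backend call (`IsolationUsesXactSnapshot`, `HeapTupleSatisfiesVisibility`, `TransactionIdIsCurrentTransactionId`).

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical restatement of the decision: a serialization failure is
 * needed only when the isolation level uses a transaction snapshot, the
 * conflicting tuple is not visible to that snapshot, and the tuple was
 * not inserted by our own transaction.
 */
static bool
toy_must_raise_serialization_failure(bool uses_xact_snapshot,
									 bool visible_to_snapshot,
									 bool inserted_by_current_xact)
{
	if (!uses_xact_snapshot)
		return false;			/* lower isolation levels: nothing to check */
	if (visible_to_snapshot)
		return false;			/* conflict is against a visible tuple */
	return !inserted_by_current_xact;
}
```

The last condition covers the case named in the source comment, where conflicting keys are proposed for insertion within a single command.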

/*
 * ExecCheckTIDVisible -- convenience variant of ExecCheckHeapTupleVisible()
 */
static void
ExecCheckTIDVisible(EState *estate,
					ResultRelInfo *relinfo,
					ItemPointer tid)
{
	Relation	rel = relinfo->ri_RelationDesc;
	Buffer		buffer;
	HeapTupleData tuple;

	/* Redundantly check isolation level */
	if (!IsolationUsesXactSnapshot())
		return;

	tuple.t_self = *tid;
	if (!heap_fetch(rel, SnapshotAny, &tuple, &buffer, false, NULL))
		elog(ERROR, "failed to fetch conflicting tuple for ON CONFLICT");
	ExecCheckHeapTupleVisible(estate, &tuple, buffer);
	ReleaseBuffer(buffer);
}

/* ----------------------------------------------------------------
 *		ExecInsert
 *
 *		For INSERT, we have to insert the tuple into the target relation
 *		and insert appropriate tuples into the index relations.
 *
 *		Returns RETURNING result if any, otherwise NULL.
 * ----------------------------------------------------------------
 */
extern TupleTableSlot *
|
Add support for INSERT ... ON CONFLICT DO NOTHING/UPDATE.
The newly added ON CONFLICT clause allows specifying an alternative to
raising a unique or exclusion constraint violation error when inserting.
ON CONFLICT refers to constraints that can either be specified using an
inference clause (by specifying the columns of a unique constraint) or
by naming a unique or exclusion constraint. DO NOTHING avoids the
constraint violation, without touching the pre-existing row. DO UPDATE
SET ... [WHERE ...] updates the pre-existing tuple, and has access to
both the tuple proposed for insertion and the existing tuple; the
optional WHERE clause can be used to prevent an update from being
executed. The UPDATE SET and WHERE clauses have access to the tuple
proposed for insertion using the "magic" EXCLUDED alias, and to the
pre-existing tuple using the table name or its alias.
This feature is often referred to as upsert.
This is implemented using a new infrastructure called "speculative
insertion". It is an optimistic variant of regular insertion that first
does a pre-check for existing tuples and then attempts an insert. If a
violating tuple was inserted concurrently, the speculatively inserted
tuple is deleted and a new attempt is made. If the pre-check finds a
matching tuple the alternative DO NOTHING or DO UPDATE action is taken.
If the insertion succeeds without detecting a conflict, the tuple is
deemed inserted.
To handle the possible ambiguity between the excluded alias and a table
named excluded, and for convenience with long relation names, INSERT
INTO now can alias its target table.
Bumps catversion as stored rules change.
Author: Peter Geoghegan, with significant contributions from Heikki
Linnakangas and Andres Freund. Testing infrastructure by Jeff Janes.
Reviewed-By: Heikki Linnakangas, Andres Freund, Robert Haas, Simon Riggs,
Dean Rasheed, Stephen Frost and many others.
2015-05-08 05:31:36 +02:00
|
|
|
ExecInsert(ModifyTableState *mtstate,
|
|
|
|
TupleTableSlot *slot,
|
2009-10-10 03:43:50 +02:00
|
|
|
TupleTableSlot *planSlot,
|
2011-02-26 00:56:23 +01:00
|
|
|
EState *estate,
|
2018-04-03 10:28:16 +02:00
|
|
|
MergeActionState *actionState,
|
2011-02-26 00:56:23 +01:00
|
|
|
bool canSetTag)
|
2009-10-10 03:43:50 +02:00
|
|
|
{
|
|
|
|
HeapTuple tuple;
|
|
|
|
ResultRelInfo *resultRelInfo;
|
|
|
|
Relation resultRelationDesc;
|
|
|
|
Oid newId;
|
|
|
|
List *recheckIndexes = NIL;
|
2017-04-10 18:20:08 +02:00
|
|
|
TupleTableSlot *result = NULL;
|
Allow UPDATE to move rows between partitions.
When an UPDATE causes a row to no longer match the partition
constraint, try to move it to a different partition where it does
match the partition constraint. In essence, the UPDATE is split into
a DELETE from the old partition and an INSERT into the new one. This
can lead to surprising behavior in concurrency scenarios because
EvalPlanQual rechecks won't work as they normally did; the known
problems are documented. (There is a pending patch to improve the
situation further, but it needs more review.)
Amit Khandekar, reviewed and tested by Amit Langote, David Rowley,
Rajkumar Raghuwanshi, Dilip Kumar, Amul Sul, Thomas Munro, Álvaro
Herrera, Amit Kapila, and me. A few final revisions by me.
Discussion: http://postgr.es/m/CAJ3gD9do9o2ccQ7j7+tSgiE1REY65XRiMb=yJO3u3QhyP8EEPQ@mail.gmail.com
2018-01-19 21:33:06 +01:00
|
|
|
TransitionCaptureState *ar_insert_trig_tcs;
|
2018-03-19 22:09:43 +01:00
|
|
|
ModifyTable *node = (ModifyTable *) mtstate->ps.plan;
|
|
|
|
OnConflictAction onconflict = node->onConflictAction;
|
2009-10-10 03:43:50 +02:00
|
|
|
|
|
|
|
/*
|
|
|
|
* get the heap tuple out of the tuple table slot, making sure we have a
|
|
|
|
* writable copy
|
|
|
|
*/
|
|
|
|
tuple = ExecMaterializeSlot(slot);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* get information on the (current) result relation
|
|
|
|
*/
|
|
|
|
resultRelInfo = estate->es_result_relation_info;
|
|
|
|
resultRelationDesc = resultRelInfo->ri_RelationDesc;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* If the result relation has OIDs, force the tuple's OID to zero so that
|
|
|
|
* heap_insert will assign a fresh OID. Usually the OID already will be
|
|
|
|
* zero at this point, but there are corner cases where the plan tree can
|
|
|
|
* return a tuple extracted literally from some table with the same
|
|
|
|
* rowtype.
|
|
|
|
*
|
|
|
|
* XXX if we ever wanted to allow users to assign their own OIDs to new
|
|
|
|
* rows, this'd be the place to do it. For the moment, we make a point of
|
|
|
|
* doing this before calling triggers, so that a user-supplied trigger
|
|
|
|
* could hack the OID if desired.
|
|
|
|
*/
|
|
|
|
if (resultRelationDesc->rd_rel->relhasoids)
|
|
|
|
HeapTupleSetOid(tuple, InvalidOid);
|
|
|
|
|
2015-05-08 05:31:36 +02:00
|
|
|
/*
|
|
|
|
* BEFORE ROW INSERT Triggers.
|
|
|
|
*
|
|
|
|
* Note: We fire BEFORE ROW TRIGGERS for every attempted insertion in an
|
|
|
|
* INSERT ... ON CONFLICT statement. We cannot check for constraint
|
|
|
|
* violations before firing these triggers, because they can change the
|
|
|
|
* values to insert. Also, they can run arbitrary user-defined code with
|
|
|
|
* side-effects that we can't cancel by just not inserting the tuple.
|
|
|
|
*/
|
2009-10-10 03:43:50 +02:00
|
|
|
if (resultRelInfo->ri_TrigDesc &&
|
2010-10-10 19:43:33 +02:00
|
|
|
resultRelInfo->ri_TrigDesc->trig_insert_before_row)
|
2009-10-10 03:43:50 +02:00
|
|
|
{
|
2011-02-22 03:18:04 +01:00
|
|
|
slot = ExecBRInsertTriggers(estate, resultRelInfo, slot);
|
2009-10-10 03:43:50 +02:00
|
|
|
|
2011-02-22 03:18:04 +01:00
|
|
|
if (slot == NULL) /* "do nothing" */
|
2009-10-10 03:43:50 +02:00
|
|
|
return NULL;
|
|
|
|
|
2011-02-22 03:18:04 +01:00
|
|
|
/* trigger might have changed tuple */
|
|
|
|
tuple = ExecMaterializeSlot(slot);
|
2009-10-10 03:43:50 +02:00
|
|
|
}
|
|
|
|
|
2010-10-10 19:43:33 +02:00
|
|
|
/* INSTEAD OF ROW INSERT Triggers */
|
|
|
|
if (resultRelInfo->ri_TrigDesc &&
|
|
|
|
resultRelInfo->ri_TrigDesc->trig_insert_instead_row)
|
|
|
|
{
|
2011-02-22 03:18:04 +01:00
|
|
|
slot = ExecIRInsertTriggers(estate, resultRelInfo, slot);
|
2009-10-10 03:43:50 +02:00
|
|
|
|
2011-02-22 03:18:04 +01:00
|
|
|
if (slot == NULL) /* "do nothing" */
|
2010-10-10 19:43:33 +02:00
|
|
|
return NULL;
|
|
|
|
|
2011-02-22 03:18:04 +01:00
|
|
|
/* trigger might have changed tuple */
|
|
|
|
tuple = ExecMaterializeSlot(slot);
|
2010-10-10 19:43:33 +02:00
|
|
|
|
|
|
|
newId = InvalidOid;
|
|
|
|
}
|
2013-03-10 19:14:53 +01:00
|
|
|
else if (resultRelInfo->ri_FdwRoutine)
|
|
|
|
{
|
|
|
|
/*
|
|
|
|
* insert into foreign table: let the FDW do it
|
|
|
|
*/
|
|
|
|
slot = resultRelInfo->ri_FdwRoutine->ExecForeignInsert(estate,
|
|
|
|
resultRelInfo,
|
|
|
|
slot,
|
|
|
|
planSlot);
|
|
|
|
|
|
|
|
if (slot == NULL) /* "do nothing" */
|
|
|
|
return NULL;
|
|
|
|
|
|
|
|
/* FDW might have changed tuple */
|
|
|
|
tuple = ExecMaterializeSlot(slot);
|
|
|
|
|
2016-02-05 03:15:57 +01:00
|
|
|
/*
|
|
|
|
* AFTER ROW Triggers or RETURNING expressions might reference the
|
|
|
|
* tableoid column, so initialize t_tableOid before evaluating them.
|
|
|
|
*/
|
|
|
|
tuple->t_tableOid = RelationGetRelid(resultRelationDesc);
|
|
|
|
|
2013-03-10 19:14:53 +01:00
|
|
|
newId = InvalidOid;
|
|
|
|
}
|
2010-10-10 19:43:33 +02:00
|
|
|
else
|
|
|
|
{
|
2018-01-19 21:33:06 +01:00
|
|
|
WCOKind wco_kind;
|
2018-03-19 22:09:43 +01:00
|
|
|
bool check_partition_constr;
|
2018-01-19 21:33:06 +01:00
|
|
|
|
2017-06-07 18:45:32 +02:00
|
|
|
/*
|
|
|
|
* We always check the partition constraint, including when the tuple
|
|
|
|
* got here via tuple-routing. However we don't need to in the latter
|
|
|
|
* case if no BR trigger is defined on the partition. Note that a BR
|
|
|
|
* trigger might modify the tuple such that the partition constraint
|
|
|
|
* is no longer satisfied, so we need to check in that case.
|
|
|
|
*/
|
2018-03-19 22:09:43 +01:00
|
|
|
check_partition_constr = (resultRelInfo->ri_PartitionCheck != NIL);
|
2017-06-07 18:45:32 +02:00
|
|
|
|
Don't allow system columns in CHECK constraints, except tableoid.
Previously, arbitrary system columns could be mentioned in table
constraints, but they were not correctly checked at runtime, because
the values weren't actually set correctly in the tuple. Since it
seems easy enough to initialize the table OID properly, do that,
and continue allowing that column, but disallow the rest unless and
until someone figures out a way to make them work properly.
No back-patch, because this doesn't seem important enough to take the
risk of destabilizing the back branches. In fact, this will pose a
dump-and-reload hazard for those upgrading from previous versions:
constraints that were accepted before but were not correctly enforced
will now either be enforced correctly or not accepted at all. Either
could result in restore failures, but in practice I think very few
users will notice the difference, since the use case is pretty
marginal anyway and few users will be relying on features that have
not historically worked.
Amit Kapila, reviewed by Rushabh Lathia, with doc changes by me.
2013-09-23 19:31:22 +02:00
|
|
|
/*
|
|
|
|
* Constraints might reference the tableoid column, so initialize
|
|
|
|
* t_tableOid before evaluating them.
|
|
|
|
*/
|
|
|
|
tuple->t_tableOid = RelationGetRelid(resultRelationDesc);
|
|
|
|
|
2015-04-25 02:34:26 +02:00
|
|
|
/*
|
2018-01-19 21:33:06 +01:00
|
|
|
* Check any RLS WITH CHECK policies.
|
2015-04-25 02:34:26 +02:00
|
|
|
*
|
2018-01-19 21:33:06 +01:00
|
|
|
* Normally we should check INSERT policies. But if the insert is the
|
|
|
|
* result of a partition key update that moved the tuple to a new
|
|
|
|
* partition, we should instead check UPDATE policies, because we are
|
|
|
|
* executing policies defined on the target table, and not those
|
|
|
|
* defined on the child partitions.
|
2018-04-03 10:28:16 +02:00
|
|
|
*
|
|
|
|
* If we're running MERGE, we refer to the action that we're executing
|
|
|
|
* to know if we're doing an INSERT or UPDATE to a partitioned table.
|
2018-01-19 21:33:06 +01:00
|
|
|
*/
|
2018-04-03 10:28:16 +02:00
|
|
|
if (mtstate->operation == CMD_UPDATE)
|
|
|
|
wco_kind = WCO_RLS_UPDATE_CHECK;
|
|
|
|
else if (mtstate->operation == CMD_MERGE)
|
|
|
|
wco_kind = (actionState->commandType == CMD_UPDATE) ?
|
|
|
|
WCO_RLS_UPDATE_CHECK : WCO_RLS_INSERT_CHECK;
|
|
|
|
else
|
|
|
|
wco_kind = WCO_RLS_INSERT_CHECK;
|
2018-01-19 21:33:06 +01:00
|
|
|
|
|
|
|
/*
|
2015-05-24 03:35:49 +02:00
|
|
|
* ExecWithCheckOptions() will skip any WCOs which are not of the kind
|
|
|
|
* we are looking for at this point.
|
2015-04-25 02:34:26 +02:00
|
|
|
*/
|
|
|
|
if (resultRelInfo->ri_WithCheckOptions != NIL)
|
2018-01-19 21:33:06 +01:00
|
|
|
ExecWithCheckOptions(wco_kind, resultRelInfo, slot, estate);
|
2015-04-25 02:34:26 +02:00
|
|
|
|
2010-10-10 19:43:33 +02:00
|
|
|
/*
|
2017-06-07 18:45:32 +02:00
|
|
|
* No need though if the tuple has been routed, and a BR trigger
|
|
|
|
* doesn't exist.
|
2010-10-10 19:43:33 +02:00
|
|
|
*/
|
2018-03-19 21:43:57 +01:00
|
|
|
if (resultRelInfo->ri_PartitionRoot != NULL &&
|
2017-06-07 18:45:32 +02:00
|
|
|
!(resultRelInfo->ri_TrigDesc &&
|
|
|
|
resultRelInfo->ri_TrigDesc->trig_insert_before_row))
|
|
|
|
check_partition_constr = false;
|
|
|
|
|
|
|
|
/* Check the constraints of the tuple */
|
|
|
|
if (resultRelationDesc->rd_att->constr || check_partition_constr)
|
2018-01-05 21:18:03 +01:00
|
|
|
ExecConstraints(resultRelInfo, slot, estate, true);
|
2010-10-10 19:43:33 +02:00
|
|
|
|
2015-05-08 05:31:36 +02:00
|
|
|
if (onconflict != ONCONFLICT_NONE && resultRelInfo->ri_NumIndices > 0)
|
|
|
|
{
|
|
|
|
/* Perform a speculative insertion. */
|
|
|
|
uint32 specToken;
|
|
|
|
ItemPointerData conflictTid;
|
|
|
|
bool specConflict;
|
2018-03-19 22:09:43 +01:00
|
|
|
List *arbiterIndexes;
|
|
|
|
|
2018-03-26 15:43:54 +02:00
|
|
|
arbiterIndexes = resultRelInfo->ri_onConflictArbiterIndexes;
|
2010-10-10 19:43:33 +02:00
|
|
|
|
2015-05-08 05:31:36 +02:00
|
|
|
/*
|
|
|
|
* Do a non-conclusive check for conflicts first.
|
|
|
|
*
|
|
|
|
* We're not holding any locks yet, so this doesn't guarantee that
|
|
|
|
* the later insert won't conflict. But it avoids leaving behind
|
|
|
|
* a lot of canceled speculative insertions, if you run a lot of
|
|
|
|
* INSERT ON CONFLICT statements that do conflict.
|
|
|
|
*
|
|
|
|
* We loop back here if we find a conflict below, either during
|
|
|
|
* the pre-check, or when we re-check after inserting the tuple
|
2015-07-27 10:46:11 +02:00
|
|
|
* speculatively.
|
2015-05-08 05:31:36 +02:00
|
|
|
*/
|
|
|
|
vlock:
|
|
|
|
specConflict = false;
|
|
|
|
if (!ExecCheckIndexConstraints(slot, estate, &conflictTid,
|
|
|
|
arbiterIndexes))
|
|
|
|
{
|
|
|
|
/* committed conflict tuple found */
|
|
|
|
if (onconflict == ONCONFLICT_UPDATE)
|
|
|
|
{
|
|
|
|
/*
|
|
|
|
* In case of ON CONFLICT DO UPDATE, execute the UPDATE
|
|
|
|
* part. Be prepared to retry if the UPDATE fails because
|
|
|
|
* of another concurrent UPDATE/DELETE to the conflict
|
|
|
|
* tuple.
|
|
|
|
*/
|
|
|
|
TupleTableSlot *returning = NULL;
|
|
|
|
|
|
|
|
if (ExecOnConflictUpdate(mtstate, resultRelInfo,
|
|
|
|
&conflictTid, planSlot, slot,
|
|
|
|
estate, canSetTag, &returning))
|
|
|
|
{
|
|
|
|
InstrCountFiltered2(&mtstate->ps, 1);
|
|
|
|
return returning;
|
|
|
|
}
|
|
|
|
else
|
|
|
|
goto vlock;
|
|
|
|
}
|
|
|
|
else
|
|
|
|
{
|
|
|
|
/*
|
2015-05-24 03:35:49 +02:00
|
|
|
* In case of ON CONFLICT DO NOTHING, do nothing. However,
|
|
|
|
* verify that the tuple is visible to the executor's MVCC
|
|
|
|
* snapshot at higher isolation levels.
|
2015-05-08 05:31:36 +02:00
|
|
|
*/
|
|
|
|
Assert(onconflict == ONCONFLICT_NOTHING);
|
|
|
|
ExecCheckTIDVisible(estate, resultRelInfo, &conflictTid);
|
|
|
|
InstrCountFiltered2(&mtstate->ps, 1);
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Before we start insertion proper, acquire our "speculative
|
|
|
|
* insertion lock". Others can use that to wait for us to decide
|
|
|
|
* if we're going to go ahead with the insertion, instead of
|
|
|
|
* waiting for the whole transaction to complete.
|
|
|
|
*/
|
|
|
|
specToken = SpeculativeInsertionLockAcquire(GetCurrentTransactionId());
|
|
|
|
HeapTupleHeaderSetSpeculativeToken(tuple->t_data, specToken);
|
|
|
|
|
|
|
|
/* insert the tuple, with the speculative token */
|
|
|
|
newId = heap_insert(resultRelationDesc, tuple,
|
|
|
|
estate->es_output_cid,
|
|
|
|
HEAP_INSERT_SPECULATIVE,
|
|
|
|
NULL);
|
|
|
|
|
|
|
|
/* insert index entries for tuple */
|
2010-10-10 19:43:33 +02:00
|
|
|
recheckIndexes = ExecInsertIndexTuples(slot, &(tuple->t_self),
|
Phase 3 of pgindent updates.
Don't move parenthesized lines to the left, even if that means they
flow past the right margin.
By default, BSD indent lines up statement continuation lines that are
within parentheses so that they start just to the right of the preceding
left parenthesis. However, traditionally, if that resulted in the
continuation line extending to the right of the desired right margin,
then indent would push it left just far enough to not overrun the margin,
if it could do so without making the continuation line start to the left of
the current statement indent. That makes for a weird mix of indentations
unless one has been completely rigid about never violating the 80-column
limit.
This behavior has been pretty universally panned by Postgres developers.
Hence, disable it with indent's new -lpl switch, so that parenthesized
lines are always lined up with the preceding left paren.
This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.
Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 21:35:54 +02:00
|
|
|
estate, true, &specConflict,
|
2015-05-08 05:31:36 +02:00
|
|
|
arbiterIndexes);
|
|
|
|
|
|
|
|
/* adjust the tuple's state accordingly */
|
|
|
|
if (!specConflict)
|
|
|
|
heap_finish_speculative(resultRelationDesc, tuple);
|
|
|
|
else
|
|
|
|
heap_abort_speculative(resultRelationDesc, tuple);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Wake up anyone waiting for our decision. They will re-check
|
|
|
|
* the tuple, see that it's no longer speculative, and wait on our
|
|
|
|
* XID as if this was a regularly inserted tuple all along. Or if
|
|
|
|
* we killed the tuple, they will see it's dead, and proceed as if
|
|
|
|
* the tuple never existed.
|
|
|
|
*/
|
|
|
|
SpeculativeInsertionLockRelease(GetCurrentTransactionId());
|
|
|
|
|
|
|
|
/*
|
|
|
|
* If there was a conflict, start from the beginning. We'll do
|
|
|
|
* the pre-check again, which will now find the conflicting tuple
|
|
|
|
* (unless it aborts before we get there).
|
|
|
|
*/
|
|
|
|
if (specConflict)
|
|
|
|
{
|
|
|
|
list_free(recheckIndexes);
|
|
|
|
goto vlock;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* Since there was no insertion conflict, we're done */
|
|
|
|
}
|
|
|
|
else
|
|
|
|
{
|
|
|
|
/*
|
|
|
|
* insert the tuple normally.
|
|
|
|
*
|
|
|
|
* Note: heap_insert returns the tid (location) of the new tuple
|
|
|
|
* in the t_self field.
|
|
|
|
*/
|
|
|
|
newId = heap_insert(resultRelationDesc, tuple,
|
|
|
|
estate->es_output_cid,
|
|
|
|
0, NULL);
|
|
|
|
|
|
|
|
/* insert index entries for tuple */
|
|
|
|
if (resultRelInfo->ri_NumIndices > 0)
|
|
|
|
recheckIndexes = ExecInsertIndexTuples(slot, &(tuple->t_self),
|
|
|
|
estate, false, NULL,
|
2018-03-19 22:09:43 +01:00
|
|
|
NIL);
|
2015-05-08 05:31:36 +02:00
|
|
|
}
|
2010-10-10 19:43:33 +02:00
|
|
|
}
|
2009-10-10 03:43:50 +02:00
|
|
|
|
2011-02-26 00:56:23 +01:00
|
|
|
if (canSetTag)
|
|
|
|
{
|
|
|
|
(estate->es_processed)++;
|
|
|
|
estate->es_lastoid = newId;
|
|
|
|
setLastTid(&(tuple->t_self));
|
|
|
|
}
|
2009-10-10 03:43:50 +02:00
|
|
|
|
Allow UPDATE to move rows between partitions.
When an UPDATE causes a row to no longer match the partition
constraint, try to move it to a different partition where it does
match the partition constraint. In essence, the UPDATE is split into
a DELETE from the old partition and an INSERT into the new one. This
can lead to surprising behavior in concurrency scenarios because
EvalPlanQual rechecks won't work as they normally did; the known
problems are documented. (There is a pending patch to improve the
situation further, but it needs more review.)
Amit Khandekar, reviewed and tested by Amit Langote, David Rowley,
Rajkumar Raghuwanshi, Dilip Kumar, Amul Sul, Thomas Munro, Álvaro
Herrera, Amit Kapila, and me. A few final revisions by me.
Discussion: http://postgr.es/m/CAJ3gD9do9o2ccQ7j7+tSgiE1REY65XRiMb=yJO3u3QhyP8EEPQ@mail.gmail.com
2018-01-19 21:33:06 +01:00
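The commit message above describes a partition-key UPDATE as a DELETE from the old partition followed by an INSERT into the new one. The following is a minimal sketch of that idea on a toy in-memory model, not the executor's actual code. Every name in it (`ToyPartition`, `ToyPartitioned`, `toy_route`, `toy_update_key`, the counters) is hypothetical, and the range-partition layout is an assumption made for illustration. The `new_table_rows` counter mimics the point made in the code below: an updated row should reach the transition NEW TABLE once, whether or not it moved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NPARTS   2
#define TOY_PART_CAP 4

/* Hypothetical range partition accepting keys in [lo, hi). */
typedef struct ToyPartition
{
    int     lo;
    int     hi;
    int     rows[TOY_PART_CAP];
    size_t  nrows;
} ToyPartition;

typedef struct ToyPartitioned
{
    ToyPartition parts[TOY_NPARTS];
    size_t  moved_rows;      /* updates that crossed partitions */
    size_t  new_table_rows;  /* rows captured for the NEW TABLE */
} ToyPartitioned;

/* Find the partition whose constraint accepts the key, or NULL. */
static ToyPartition *
toy_route(ToyPartitioned *t, int key)
{
    for (size_t i = 0; i < TOY_NPARTS; i++)
    {
        if (key >= t->parts[i].lo && key < t->parts[i].hi)
            return &t->parts[i];
    }
    return NULL;
}

/* Change a row's key; move it if the old partition no longer accepts it. */
static bool
toy_update_key(ToyPartitioned *t, int old_key, int new_key)
{
    ToyPartition *src = toy_route(t, old_key);
    ToyPartition *dst = toy_route(t, new_key);
    size_t  i;

    if (src == NULL || dst == NULL)
        return false;           /* no partition accepts one of the keys */

    for (i = 0; i < src->nrows; i++)
    {
        if (src->rows[i] == old_key)
            break;
    }
    if (i == src->nrows)
        return false;           /* row not found */

    if (src == dst)
    {
        /* Ordinary UPDATE: the row still satisfies its partition constraint. */
        src->rows[i] = new_key;
    }
    else
    {
        if (dst->nrows == TOY_PART_CAP)
            return false;       /* toy limitation: destination is full */

        /* DELETE from the old partition (swap the last row into the hole). */
        src->nrows--;
        src->rows[i] = src->rows[src->nrows];

        /* INSERT into the partition that accepts the new key. */
        dst->rows[dst->nrows] = new_key;
        dst->nrows++;
        t->moved_rows++;
    }

    /* Capture the new row version exactly once, moved or not. */
    t->new_table_rows++;
    return true;
}
```

With two partitions covering [0, 10) and [10, 20), updating key 3 to 5 stays in place, while updating 5 to 12 takes the delete-then-insert path. The sketch ignores concurrency entirely, which is where the commit message says the surprising behavior arises.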
|
|
|
/*
|
|
|
|
* If this insert is the result of a partition key update that moved the
|
|
|
|
* tuple to a new partition, put this row into the transition NEW TABLE,
|
|
|
|
* if there is one. We need to do this separately for DELETE and INSERT
|
|
|
|
* because they happen on different tables.
|
|
|
|
*/
|
|
|
|
ar_insert_trig_tcs = mtstate->mt_transition_capture;
|
|
|
|
if (mtstate->operation == CMD_UPDATE && mtstate->mt_transition_capture
|
|
|
|
&& mtstate->mt_transition_capture->tcs_update_new_table)
|
|
|
|
{
|
|
|
|
ExecARUpdateTriggers(estate, resultRelInfo, NULL,
|
|
|
|
NULL,
|
|
|
|
tuple,
|
|
|
|
NULL,
|
|
|
|
mtstate->mt_transition_capture);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* We've already captured the NEW TABLE row, so make sure any AR
|
|
|
|
* INSERT trigger fired below doesn't capture it again.
|
|
|
|
*/
|
|
|
|
ar_insert_trig_tcs = NULL;
|
|
|
|
}
|
|
|
|
|
2009-10-10 03:43:50 +02:00
|
|
|
/* AFTER ROW INSERT Triggers */
|
2017-06-28 19:55:03 +02:00
|
|
|
ExecARInsertTriggers(estate, resultRelInfo, tuple, recheckIndexes,
|
2018-01-19 21:33:06 +01:00
|
|
|
ar_insert_trig_tcs);
|
2009-10-10 03:43:50 +02:00
|
|
|
|
2010-01-31 19:15:39 +01:00
|
|
|
list_free(recheckIndexes);
|
|
|
|
|
2015-04-25 02:34:26 +02:00
|
|
|
/*
|
2015-05-24 03:35:49 +02:00
|
|
|
* Check any WITH CHECK OPTION constraints from parent views. We are
|
|
|
|
* required to do this after testing all constraints and uniqueness
|
|
|
|
* violations per the SQL spec, so we do it after actually inserting the
|
|
|
|
* record into the heap and all indexes.
|
2015-04-25 02:34:26 +02:00
|
|
|
*
|
2015-05-24 03:35:49 +02:00
|
|
|
* ExecWithCheckOptions will elog(ERROR) if a violation is found, so the
|
|
|
|
* tuple will never be seen, if it violates the WITH CHECK OPTION.
|
2015-04-25 02:34:26 +02:00
|
|
|
*
|
2015-05-24 03:35:49 +02:00
|
|
|
* ExecWithCheckOptions() will skip any WCOs which are not of the kind we
|
|
|
|
* are looking for at this point.
|
2015-04-25 02:34:26 +02:00
|
|
|
*/
|
2013-07-18 23:10:16 +02:00
|
|
|
if (resultRelInfo->ri_WithCheckOptions != NIL)
|
2015-04-25 02:34:26 +02:00
|
|
|
ExecWithCheckOptions(WCO_VIEW_CHECK, resultRelInfo, slot, estate);
|
2013-07-18 23:10:16 +02:00
|
|
|
|
2009-10-10 03:43:50 +02:00
|
|
|
/* Process RETURNING if present */
|
|
|
|
if (resultRelInfo->ri_projectReturning)
|
2017-01-19 19:20:11 +01:00
|
|
|
result = ExecProcessReturning(resultRelInfo, slot, planSlot);
|
2009-10-10 03:43:50 +02:00
|
|
|
|
2017-01-19 19:20:11 +01:00
|
|
|
return result;
|
2009-10-10 03:43:50 +02:00
|
|
|
}
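The speculative-insertion path in the function above follows a check, insert, re-check, retry pattern. The sketch below illustrates that control flow on a toy in-memory table; it is an illustration under stated assumptions, not PostgreSQL code. All names (`ToyTable`, `toy_find`, `toy_append`, `toy_insert_on_conflict`) are hypothetical. The `rival_armed` flag is an invented device that simulates another session inserting a conflicting key between the pre-check and the speculative insert, and the real locking, tokens and visibility rules are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_CAPACITY 8

/* Hypothetical toy table holding keys that must stay unique. */
typedef struct ToyTable
{
    int     keys[TOY_CAPACITY];
    size_t  nkeys;
    bool    rival_armed;    /* simulate one concurrent insert */
    int     rival_key;      /* key the simulated rival inserts */
} ToyTable;

typedef enum ToyOutcome
{
    TOY_INSERTED,           /* speculative insertion was confirmed */
    TOY_CONFLICT_ACTION     /* take the DO NOTHING / DO UPDATE path */
} ToyOutcome;

/* Look for key among the first 'limit' entries. */
static bool
toy_find(const ToyTable *t, size_t limit, int key)
{
    for (size_t i = 0; i < limit; i++)
    {
        if (t->keys[i] == key)
            return true;
    }
    return false;
}

static void
toy_append(ToyTable *t, int key)
{
    assert(t->nkeys < TOY_CAPACITY);
    t->keys[t->nkeys++] = key;
}

static ToyOutcome
toy_insert_on_conflict(ToyTable *t, int key, int *attempts)
{
    *attempts = 0;

    for (;;)
    {
        size_t  spec_pos;

        (*attempts)++;

        /* Step 1: non-conclusive pre-check, made without any lock. */
        if (toy_find(t, t->nkeys, key))
            return TOY_CONFLICT_ACTION;

        /* Simulated race: a rival inserts right after the pre-check. */
        if (t->rival_armed)
        {
            t->rival_armed = false;
            toy_append(t, t->rival_key);
        }

        /* Step 2: insert the tuple speculatively. */
        spec_pos = t->nkeys;
        toy_append(t, key);

        /* Step 3: re-check for a conflict with any other entry. */
        if (!toy_find(t, spec_pos, key))
            return TOY_INSERTED;    /* finish: the tuple stays */

        /* Abort: remove the speculative tuple and start over. */
        t->nkeys = spec_pos;
    }
}
```

Without a rival, the first attempt either inserts or reports an existing conflict. With the rival armed on the same key, the first attempt is aborted and the second attempt's pre-check finds the rival's row, which matches the "start from the beginning" comment in the code above.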
|
|
|
|
|
|
|
|
/* ----------------------------------------------------------------
|
|
|
|
* ExecDelete
|
|
|
|
*
|
|
|
|
* DELETE is like UPDATE, except that we delete the tuple and no
|
2010-10-10 19:43:33 +02:00
|
|
|
* index modifications are needed.
|
|
|
|
*
|
|
|
|
* When deleting from a table, tupleid identifies the tuple to
|
|
|
|
* delete and oldtuple is NULL. When deleting from a view,
|
|
|
|
* oldtuple is passed to the INSTEAD OF triggers and identifies
|
2013-03-10 19:14:53 +01:00
|
|
|
* what to delete, and tupleid is invalid. When deleting from a
|
2014-03-23 07:16:34 +01:00
|
|
|
* foreign table, tupleid is invalid; the FDW has to figure out
|
|
|
|
* which row to delete using data from the planSlot. oldtuple is
|
|
|
|
* passed to foreign table triggers; it is NULL when the foreign
|
|
|
|
* table has no relevant triggers.
|
2009-10-10 03:43:50 +02:00
|
|
|
*
|
MERGE SQL Command following SQL:2016
MERGE performs actions that modify rows in the target table
using a source table or query. MERGE provides a single SQL
statement that can conditionally INSERT/UPDATE/DELETE rows,
a task that would otherwise require multiple PL statements.
For example:
    MERGE INTO target AS t
    USING source AS s
    ON t.tid = s.sid
    WHEN MATCHED AND t.balance > s.delta THEN
      UPDATE SET balance = t.balance - s.delta
    WHEN MATCHED THEN
      DELETE
    WHEN NOT MATCHED AND s.delta > 0 THEN
      INSERT VALUES (s.sid, s.delta)
    WHEN NOT MATCHED THEN
      DO NOTHING;
MERGE works with regular and partitioned tables, including
column and row security enforcement, as well as support for
row, statement and transition triggers.
MERGE is optimized for OLTP and is parameterizable, though
also useful for large scale ETL/ELT. MERGE is not intended
to be used in preference to existing single SQL commands
for INSERT, UPDATE or DELETE since there is some overhead.
MERGE can be used statically from PL/pgSQL.
MERGE does not yet support inheritance, write rules,
RETURNING clauses, updatable views or foreign tables.
MERGE follows SQL Standard per the most recent SQL:2016.
Includes full tests and documentation, including full
isolation tests to demonstrate the concurrent behavior.
This version written from scratch in 2017 by Simon Riggs,
using docs and tests originally written in 2009. Later work
from Pavan Deolasee has been both complex and deep, leaving
the lead author credit now in his hands.
Extensive discussion of concurrency from Peter Geoghegan,
with thanks for the time and effort contributed.
Various issues reported via sqlsmith by Andreas Seltenreich
Authors: Pavan Deolasee, Simon Riggs
Reviewer: Peter Geoghegan, Amit Langote, Tomas Vondra, Simon Riggs
Discussion:
https://postgr.es/m/CANP8+jKitBSrB7oTgT9CY2i1ObfOt36z0XMraQc+Xrz8QB0nXA@mail.gmail.com
https://postgr.es/m/CAH2-WzkJdBuxj9PO=2QaO9-3h3xGbQPZ34kJH=HukRekwM-GZg@mail.gmail.com
2018-04-03 10:28:16 +02:00
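The MERGE example in the commit message above lists four WHEN clauses. Assuming the clauses are tried in the order written and the first one whose condition holds is the one executed, the choice of action for a single source row can be sketched as below. This is only an illustration of the example's decision logic. The names `ToyMergeAction` and `toy_merge_choose` are hypothetical and do not appear in the source file.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the four outcomes in the example statement. */
typedef enum ToyMergeAction
{
    TOY_MERGE_UPDATE,
    TOY_MERGE_DELETE,
    TOY_MERGE_INSERT,
    TOY_MERGE_NOTHING
} ToyMergeAction;

/*
 * Pick the action for one source row.
 *
 * matched: whether the join condition t.tid = s.sid found a target row
 * balance: t.balance of the matched target row (ignored if not matched)
 * delta:   s.delta of the source row
 */
static ToyMergeAction
toy_merge_choose(bool matched, long balance, long delta)
{
    if (matched)
    {
        /* WHEN MATCHED AND t.balance > s.delta THEN UPDATE */
        if (balance > delta)
            return TOY_MERGE_UPDATE;

        /* WHEN MATCHED THEN DELETE */
        return TOY_MERGE_DELETE;
    }

    /* WHEN NOT MATCHED AND s.delta > 0 THEN INSERT */
    if (delta > 0)
        return TOY_MERGE_INSERT;

    /* WHEN NOT MATCHED THEN DO NOTHING */
    return TOY_MERGE_NOTHING;
}
```

For instance, a matched row with balance 100 and delta 30 is updated, a matched row with balance 10 and delta 30 is deleted, and an unmatched source row is inserted only when its delta is positive.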
|
|
|
* MERGE passes actionState of the action it's currently executing;
|
|
|
|
* regular DELETE passes NULL. This is used by ExecDelete to know if it's
|
|
|
|
* being called from MERGE or regular DELETE operation.
|
|
|
|
*
|
|
|
|
* If the DELETE fails because the tuple is concurrently updated/deleted
|
|
|
|
* by this or some other transaction, hufdp is filled with the reason as
|
|
|
|
* well as other important information. Currently only MERGE needs this
|
|
|
|
* information.
|
|
|
|
*
|
2009-10-10 03:43:50 +02:00
|
|
|
* Returns RETURNING result if any, otherwise NULL.
|
|
|
|
* ----------------------------------------------------------------
|
|
|
|
*/
|
2018-04-03 10:28:16 +02:00
|
|
|
TupleTableSlot *
|
2017-06-28 19:55:03 +02:00
|
|
|
ExecDelete(ModifyTableState *mtstate,
|
|
|
|
ItemPointer tupleid,
|
2014-03-23 07:16:34 +01:00
|
|
|
HeapTuple oldtuple,
|
2009-10-10 03:43:50 +02:00
|
|
|
TupleTableSlot *planSlot,
|
Re-implement EvalPlanQual processing to improve its performance and eliminate
a lot of strange behaviors that occurred in join cases. We now identify the
"current" row for every joined relation in UPDATE, DELETE, and SELECT FOR
UPDATE/SHARE queries. If an EvalPlanQual recheck is necessary, we jam the
appropriate row into each scan node in the rechecking plan, forcing it to emit
only that one row. The former behavior could rescan the whole of each joined
relation for each recheck, which was terrible for performance, and what's much
worse could result in duplicated output tuples.
Also, the original implementation of EvalPlanQual could not re-use the recheck
execution tree --- it had to go through a full executor init and shutdown for
every row to be tested. To avoid this overhead, I've associated a special
runtime Param with each LockRows or ModifyTable plan node, and arranged to
make every scan node below such a node depend on that Param. Thus, by
signaling a change in that Param, the EPQ machinery can just rescan the
already-built test plan.
This patch also adds a prohibition on set-returning functions in the
targetlist of SELECT FOR UPDATE/SHARE. This is needed to avoid the
duplicate-output-tuple problem. It seems fairly reasonable since the
other restrictions on SELECT FOR UPDATE are meant to ensure that there
is a unique correspondence between source tuples and result tuples,
which an output SRF destroys as much as anything else does.
2009-10-26 03:26:45 +01:00
|
|
|
EPQState *epqstate,
|
2011-02-26 00:56:23 +01:00
|
|
|
EState *estate,
|
2018-01-19 21:33:06 +01:00
|
|
|
bool *tupleDeleted,
|
|
|
|
bool processReturning,
|
2018-04-03 10:28:16 +02:00
|
|
|
HeapUpdateFailureData *hufdp,
|
|
|
|
MergeActionState *actionState,
|
2011-02-26 00:56:23 +01:00
|
|
|
bool canSetTag)
|
2009-10-10 03:43:50 +02:00
|
|
|
{
|
|
|
|
ResultRelInfo *resultRelInfo;
|
|
|
|
Relation resultRelationDesc;
|
|
|
|
HTSU_Result result;
|
2012-10-26 21:55:36 +02:00
|
|
|
HeapUpdateFailureData hufd;
|
2013-03-10 19:14:53 +01:00
|
|
|
TupleTableSlot *slot = NULL;
|
2018-01-19 21:33:06 +01:00
|
|
|
TransitionCaptureState *ar_delete_trig_tcs;
|
|
|
|
|
|
|
|
if (tupleDeleted)
|
|
|
|
*tupleDeleted = false;
|
2009-10-10 03:43:50 +02:00
|
|
|
|
2018-04-03 10:28:16 +02:00
|
|
|
/*
|
|
|
|
* Initialize hufdp. Since the caller is only interested in the failure
|
|
|
|
* status, initialize with the state that is used to indicate successful
|
|
|
|
* operation.
|
|
|
|
*/
|
|
|
|
if (hufdp)
|
|
|
|
hufdp->result = HeapTupleMayBeUpdated;
|
|
|
|
|
2009-10-10 03:43:50 +02:00
|
|
|
/*
|
|
|
|
* get information on the (current) result relation
|
|
|
|
*/
|
|
|
|
resultRelInfo = estate->es_result_relation_info;
|
|
|
|
resultRelationDesc = resultRelInfo->ri_RelationDesc;
|
|
|
|
|
|
|
|
/* BEFORE ROW DELETE Triggers */
|
|
|
|
if (resultRelInfo->ri_TrigDesc &&
|
2010-10-10 19:43:33 +02:00
|
|
|
resultRelInfo->ri_TrigDesc->trig_delete_before_row)
|
2009-10-10 03:43:50 +02:00
|
|
|
{
|
|
|
|
bool dodelete;
|
|
|
|
|
2009-10-26 03:26:45 +01:00
		dodelete = ExecBRDeleteTriggers(estate, epqstate, resultRelInfo,
MERGE SQL Command following SQL:2016
MERGE performs actions that modify rows in the target table
using a source table or query. MERGE provides a single SQL
statement that can conditionally INSERT/UPDATE/DELETE rows,
a task that would otherwise require multiple PL statements.
e.g.
MERGE INTO target AS t
USING source AS s
ON t.tid = s.sid
WHEN MATCHED AND t.balance > s.delta THEN
UPDATE SET balance = t.balance - s.delta
WHEN MATCHED THEN
DELETE
WHEN NOT MATCHED AND s.delta > 0 THEN
INSERT VALUES (s.sid, s.delta)
WHEN NOT MATCHED THEN
DO NOTHING;
MERGE works with regular and partitioned tables, including
column and row security enforcement, as well as support for
row, statement and transition triggers.
MERGE is optimized for OLTP and is parameterizable, though
also useful for large scale ETL/ELT. MERGE is not intended
to be used in preference to existing single SQL commands
for INSERT, UPDATE or DELETE since there is some overhead.
MERGE can be used statically from PL/pgSQL.
MERGE does not yet support inheritance, write rules,
RETURNING clauses, updatable views or foreign tables.
MERGE follows SQL Standard per the most recent SQL:2016.
Includes full tests and documentation, including full
isolation tests to demonstrate the concurrent behavior.
This version written from scratch in 2017 by Simon Riggs,
using docs and tests originally written in 2009. Later work
from Pavan Deolasee has been both complex and deep, leaving
the lead author credit now in his hands.
Extensive discussion of concurrency from Peter Geoghegan,
with thanks for the time and effort contributed.
Various issues reported via sqlsmith by Andreas Seltenreich
Authors: Pavan Deolasee, Simon Riggs
Reviewer: Peter Geoghegan, Amit Langote, Tomas Vondra, Simon Riggs
Discussion:
https://postgr.es/m/CANP8+jKitBSrB7oTgT9CY2i1ObfOt36z0XMraQc+Xrz8QB0nXA@mail.gmail.com
https://postgr.es/m/CAH2-WzkJdBuxj9PO=2QaO9-3h3xGbQPZ34kJH=HukRekwM-GZg@mail.gmail.com
2018-04-03 10:28:16 +02:00
										tupleid, oldtuple, hufdp);

		if (!dodelete)			/* "do nothing" */
			return NULL;
	}

	/* INSTEAD OF ROW DELETE Triggers */
	if (resultRelInfo->ri_TrigDesc &&
		resultRelInfo->ri_TrigDesc->trig_delete_instead_row)
	{
		bool		dodelete;

		Assert(oldtuple != NULL);
		dodelete = ExecIRDeleteTriggers(estate, resultRelInfo, oldtuple);

		if (!dodelete)			/* "do nothing" */
			return NULL;
	}
	else if (resultRelInfo->ri_FdwRoutine)
	{
		HeapTuple	tuple;

		/*
		 * delete from foreign table: let the FDW do it
		 *
		 * We offer the trigger tuple slot as a place to store RETURNING data,
		 * although the FDW can return some other slot if it wants. Set up
		 * the slot's tupdesc so the FDW doesn't need to do that for itself.
		 */
		slot = estate->es_trig_tuple_slot;
		if (slot->tts_tupleDescriptor != RelationGetDescr(resultRelationDesc))
			ExecSetSlotDescriptor(slot, RelationGetDescr(resultRelationDesc));

		slot = resultRelInfo->ri_FdwRoutine->ExecForeignDelete(estate,
															   resultRelInfo,
															   slot,
															   planSlot);

		if (slot == NULL)		/* "do nothing" */
			return NULL;

		/*
		 * RETURNING expressions might reference the tableoid column, so
		 * initialize t_tableOid before evaluating them.
		 */
		if (slot->tts_isempty)
			ExecStoreAllNullTuple(slot);
		tuple = ExecMaterializeSlot(slot);
		tuple->t_tableOid = RelationGetRelid(resultRelationDesc);
	}
	else
	{
		/*
		 * delete the tuple
		 *
		 * Note: if es_crosscheck_snapshot isn't InvalidSnapshot, we check
		 * that the row to be deleted is visible to that snapshot, and throw a
		 * can't-serialize error if not. This is a special-case behavior
		 * needed for referential integrity updates in transaction-snapshot
		 * mode transactions.
		 */
ldelete:;
		result = heap_delete(resultRelationDesc, tupleid,
							 estate->es_output_cid,
							 estate->es_crosscheck_snapshot,
							 true /* wait for commit */ ,
							 &hufd);

		/*
		 * Copy the necessary information, if the caller has asked for it. We
		 * must do this irrespective of whether the tuple was updated or
		 * deleted.
		 */
		if (hufdp)
			*hufdp = hufd;

		switch (result)
		{
			case HeapTupleSelfUpdated:

				/*
				 * The target tuple was already updated or deleted by the
				 * current command, or by a later command in the current
				 * transaction. The former case is possible in a join DELETE
				 * where multiple tuples join to the same target tuple. This
				 * is somewhat questionable, but Postgres has always allowed
				 * it: we just ignore additional deletion attempts.
				 *
				 * The latter case arises if the tuple is modified by a
				 * command in a BEFORE trigger, or perhaps by a command in a
				 * volatile function used in the query. In such situations we
				 * should not ignore the deletion, but it is equally unsafe to
				 * proceed. We don't want to discard the original DELETE
				 * while keeping the triggered actions based on its deletion;
				 * and it would be no better to allow the original DELETE
				 * while discarding updates that it triggered. The row update
				 * carries some information that might be important according
				 * to business rules; so throwing an error is the only safe
				 * course.
				 *
				 * If a trigger actually intends this type of interaction, it
				 * can re-execute the DELETE and then return NULL to cancel
				 * the outer delete.
				 */
				if (hufd.cmax != estate->es_output_cid)
					ereport(ERROR,
							(errcode(ERRCODE_TRIGGERED_DATA_CHANGE_VIOLATION),
							 errmsg("tuple to be updated was already modified by an operation triggered by the current command"),
							 errhint("Consider using an AFTER trigger instead of a BEFORE trigger to propagate changes to other rows.")));

				/*
				 * Else, already deleted by self; nothing to do but inform
				 * MERGE about it anyway so that it can take necessary
				 * action.
				 */
				return NULL;

			case HeapTupleMayBeUpdated:
				break;

			case HeapTupleUpdated:
				if (IsolationUsesXactSnapshot())
					ereport(ERROR,
							(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
							 errmsg("could not serialize access due to concurrent update")));

				if (!ItemPointerEquals(tupleid, &hufd.ctid))
				{
					TupleTableSlot *epqslot;

					/*
					 * If we're executing MERGE, then the onus of running
					 * EvalPlanQual() and handling its outcome lies with the
					 * caller.
					 */
					if (actionState != NULL)
						return NULL;

					/* Normal DELETE path. */
					epqslot = EvalPlanQual(estate,
										   epqstate,
										   resultRelationDesc,
										   GetEPQRangeTableIndex(resultRelInfo),
Improve concurrency of foreign key locking
This patch introduces two additional lock modes for tuples: "SELECT FOR
KEY SHARE" and "SELECT FOR NO KEY UPDATE". These don't block each
other, in contrast with already existing "SELECT FOR SHARE" and "SELECT
FOR UPDATE". UPDATE commands that do not modify the values stored in
the columns that are part of the key of the tuple now grab a SELECT FOR
NO KEY UPDATE lock on the tuple, allowing them to proceed concurrently
with tuple locks of the FOR KEY SHARE variety.
Foreign key triggers now use FOR KEY SHARE instead of FOR SHARE; this
means the concurrency improvement applies to them, which is the whole
point of this patch.
The added tuple lock semantics require some rejiggering of the multixact
module, so that the locking level that each transaction is holding can
be stored alongside its Xid. Also, multixacts now need to persist
across server restarts and crashes, because they can now represent not
only tuple locks, but also tuple updates. This means we need more
careful tracking of lifetime of pg_multixact SLRU files; since they now
persist longer, we require more infrastructure to figure out when they
can be removed. pg_upgrade also needs to be careful to copy
pg_multixact files over from the old server to the new, or at least part
of multixact.c state, depending on the versions of the old and new
servers.
Tuple time qualification rules (HeapTupleSatisfies routines) need to be
careful not to consider tuples with the "is multi" infomask bit set as
being only locked; they might need to look up MultiXact values (i.e.
possibly do pg_multixact I/O) to find out the Xid that updated a tuple,
whereas they previously were assured to only use information readily
available from the tuple header. This is considered acceptable, because
the extra I/O would involve cases that would previously cause some
commands to block waiting for concurrent transactions to finish.
Another important change is the fact that locking tuples that have
previously been updated causes the future versions to be marked as
locked, too; this is essential for correctness of foreign key checks.
This causes additional WAL-logging, also (there was previously a single
WAL record for a locked tuple; now there are as many as there are updated
copies of the tuple.)
With all this in place, contention related to tuples being checked by
foreign key rules should be much reduced.
As a bonus, the old behavior that a subtransaction grabbing a stronger
tuple lock than the parent (sub)transaction held on a given tuple and
later aborting caused the weaker lock to be lost, has been fixed.
Many new spec files were added for isolation tester framework, to ensure
overall behavior is sane. There's probably room for several more tests.
There were several reviewers of this patch; in particular, Noah Misch
and Andres Freund spent considerable time in it. Original idea for the
patch came from Simon Riggs, after a problem report by Joel Jacobson.
Most code is from me, with contributions from Marti Raudsepp, Alexander
Shulgin, Noah Misch and Andres Freund.
This patch was discussed in several pgsql-hackers threads; the most
important start at the following message-ids:
AANLkTimo9XVcEzfiBR-ut3KVNDkjm2Vxh+t8kAmWjPuv@mail.gmail.com
1290721684-sup-3951@alvh.no-ip.org
1294953201-sup-2099@alvh.no-ip.org
1320343602-sup-2290@alvh.no-ip.org
1339690386-sup-8927@alvh.no-ip.org
4FE5FF020200002500048A3D@gw.wicourts.gov
4FEAB90A0200002500048B7D@gw.wicourts.gov
2013-01-23 16:04:59 +01:00
										   LockTupleExclusive,
										   &hufd.ctid,
										   hufd.xmax);
					if (!TupIsNull(epqslot))
					{
						*tupleid = hufd.ctid;
						goto ldelete;
					}
				}

				/*
				 * tuple already deleted; nothing to do. But MERGE might want
				 * to handle it differently. We've already filled-in hufdp
				 * with sufficient information for MERGE to look at.
				 */
				return NULL;

			default:
				elog(ERROR, "unrecognized heap_delete status: %u", result);
				return NULL;
		}

		/*
		 * Note: Normally one would think that we have to delete index tuples
		 * associated with the heap tuple now...
		 *
		 * ... but in POSTGRES, we have no need to do this because VACUUM will
		 * take care of it later. We can't delete index tuples immediately
		 * anyway, since the tuple is still visible to other transactions.
		 */
	}

	if (canSetTag)
		(estate->es_processed)++;

Allow UPDATE to move rows between partitions.
When an UPDATE causes a row to no longer match the partition
constraint, try to move it to a different partition where it does
match the partition constraint. In essence, the UPDATE is split into
a DELETE from the old partition and an INSERT into the new one. This
can lead to surprising behavior in concurrency scenarios because
EvalPlanQual rechecks won't work as they normally did; the known
problems are documented. (There is a pending patch to improve the
situation further, but it needs more review.)
Amit Khandekar, reviewed and tested by Amit Langote, David Rowley,
Rajkumar Raghuwanshi, Dilip Kumar, Amul Sul, Thomas Munro, Álvaro
Herrera, Amit Kapila, and me. A few final revisions by me.
Discussion: http://postgr.es/m/CAJ3gD9do9o2ccQ7j7+tSgiE1REY65XRiMb=yJO3u3QhyP8EEPQ@mail.gmail.com
2018-01-19 21:33:06 +01:00
	/* Tell caller that the delete actually happened. */
	if (tupleDeleted)
		*tupleDeleted = true;

	/*
	 * If this delete is the result of a partition key update that moved the
	 * tuple to a new partition, put this row into the transition OLD TABLE,
	 * if there is one. We need to do this separately for DELETE and INSERT
	 * because they happen on different tables.
	 */
	ar_delete_trig_tcs = mtstate->mt_transition_capture;
	if (mtstate->operation == CMD_UPDATE && mtstate->mt_transition_capture
		&& mtstate->mt_transition_capture->tcs_update_old_table)
	{
		ExecARUpdateTriggers(estate, resultRelInfo,
							 tupleid,
							 oldtuple,
							 NULL,
							 NULL,
							 mtstate->mt_transition_capture);

		/*
		 * We've already captured the OLD TABLE row, so make sure any AR
		 * DELETE trigger fired below doesn't capture it again.
		 */
		ar_delete_trig_tcs = NULL;
	}

	/* AFTER ROW DELETE Triggers */
	ExecARDeleteTriggers(estate, resultRelInfo, tupleid, oldtuple,
						 ar_delete_trig_tcs);

	/* Process RETURNING if present and if requested */
	if (processReturning && resultRelInfo->ri_projectReturning)
	{
		/*
		 * We have to put the target tuple into a slot, which means first we
		 * gotta fetch it. We can use the trigger tuple slot.
		 */
		TupleTableSlot *rslot;
		HeapTupleData deltuple;
		Buffer		delbuffer;

		if (resultRelInfo->ri_FdwRoutine)
		{
			/* FDW must have provided a slot containing the deleted row */
			Assert(!TupIsNull(slot));
			delbuffer = InvalidBuffer;
		}
		else
		{
			slot = estate->es_trig_tuple_slot;
			if (oldtuple != NULL)
			{
				deltuple = *oldtuple;
				delbuffer = InvalidBuffer;
			}
			else
			{
				deltuple.t_self = *tupleid;
				if (!heap_fetch(resultRelationDesc, SnapshotAny,
								&deltuple, &delbuffer, false, NULL))
					elog(ERROR, "failed to fetch deleted tuple for DELETE RETURNING");
			}

			if (slot->tts_tupleDescriptor != RelationGetDescr(resultRelationDesc))
				ExecSetSlotDescriptor(slot, RelationGetDescr(resultRelationDesc));
			ExecStoreTuple(&deltuple, slot, InvalidBuffer, false);
		}

		rslot = ExecProcessReturning(resultRelInfo, slot, planSlot);

		/*
		 * Before releasing the target tuple again, make sure rslot has a
		 * local copy of any pass-by-reference values.
		 */
		ExecMaterializeSlot(rslot);

		ExecClearTuple(slot);
		if (BufferIsValid(delbuffer))
			ReleaseBuffer(delbuffer);

		return rslot;
	}

	return NULL;
}
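
The way ExecDelete reacts to each heap_delete() status, including the retry via "goto ldelete", can be illustrated outside the executor. What follows is a minimal sketch under stated assumptions, not PostgreSQL code: DeleteStatus, Outcome, RowVersion, try_delete and delete_with_retry are hypothetical names invented for this illustration, and a single hop to the successor version stands in for the work that EvalPlanQual() does in the real function.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the HTSU_Result codes; not the real enum. */
typedef enum
{
	DEL_OK,						/* plays the role of HeapTupleMayBeUpdated */
	DEL_SELF_UPDATED,			/* plays the role of HeapTupleSelfUpdated */
	DEL_UPDATED					/* plays the role of HeapTupleUpdated */
} DeleteStatus;

typedef enum
{
	OUT_DELETED,
	OUT_SKIPPED,
	OUT_ERROR
} Outcome;

/* Toy row version; 'next' plays the role of hufd.ctid. */
typedef struct RowVersion
{
	DeleteStatus state;			/* what a delete attempt on this version reports */
	unsigned	cmax;			/* command id that modified it (self case only) */
	int			next;			/* successor index; own index if there is none */
	bool		recheck_passes; /* would the quals still match this version? */
} RowVersion;

/* Attempt the delete; on success, stamp the version with our command id. */
static DeleteStatus
try_delete(RowVersion *v, unsigned my_cid)
{
	if (v->state != DEL_OK)
		return v->state;
	v->state = DEL_SELF_UPDATED;
	v->cmax = my_cid;
	return DEL_OK;
}

/* Dispatch on the delete status, retrying at the successor when allowed. */
Outcome
delete_with_retry(RowVersion *versions, int idx, unsigned my_cid)
{
	for (;;)
	{
		RowVersion *v = &versions[idx];

		assert(v->next >= 0);

		switch (try_delete(v, my_cid))
		{
			case DEL_OK:
				return OUT_DELETED;

			case DEL_SELF_UPDATED:
				/* Modified by another command of ours: unsafe either way. */
				if (v->cmax != my_cid)
					return OUT_ERROR;
				/* Deleted earlier by this same command: ignore the repeat. */
				return OUT_SKIPPED;

			case DEL_UPDATED:
				/* No successor version: the row was deleted concurrently. */
				if (v->next == idx)
					return OUT_SKIPPED;
				/* Successor fails the recheck: nothing left to delete. */
				if (!versions[v->next].recheck_passes)
					return OUT_SKIPPED;
				/* Retry against the successor, like "goto ldelete". */
				idx = v->next;
				break;
		}
	}
}
```

A caller would build an array of RowVersion entries and pass the index of the version it originally saw. The sketch leaves out the serialization error raised under transaction-snapshot isolation and the MERGE early return, both of which the real function handles before any retry.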

/* ----------------------------------------------------------------
 *		ExecUpdate
 *
 *		note: we can't run UPDATE queries with transactions
 *		off because UPDATEs are actually INSERTs and our
 *		scan will mistakenly loop forever, updating the tuple
 *		it just inserted. This should be fixed but until it
 *		is, we don't want to get stuck in an infinite loop
 *		which corrupts your database.
 *
 *		When updating a table, tupleid identifies the tuple to
 *		update and oldtuple is NULL. When updating a view, oldtuple
 *		is passed to the INSTEAD OF triggers and identifies what to
 *		update, and tupleid is invalid. When updating a foreign table,
 *		tupleid is invalid; the FDW has to figure out which row to
 *		update using data from the planSlot. oldtuple is passed to
 *		foreign table triggers; it is NULL when the foreign table has
 *		no relevant triggers.
 *
 *		MERGE passes actionState of the action it's currently executing;
 *		regular UPDATE passes NULL. This is used by ExecUpdate to know if it's
 *		being called from MERGE or regular UPDATE operation. ExecUpdate may
 *		pass this information to ExecInsert if it ends up running DELETE+INSERT
 *		for partition key updates.
 *
 *		If the UPDATE fails because the tuple is concurrently updated/deleted
 *		by this or some other transaction, hufdp is filled with the reason as
 *		well as other important information. Currently only MERGE needs this
 *		information.
 *
 *		Returns RETURNING result if any, otherwise NULL.
 * ----------------------------------------------------------------
 */
extern TupleTableSlot *
ExecUpdate(ModifyTableState *mtstate,
		   ItemPointer tupleid,
		   HeapTuple oldtuple,
		   TupleTableSlot *slot,
		   TupleTableSlot *planSlot,
		   EPQState *epqstate,
		   EState *estate,
		   bool *tuple_updated,
		   HeapUpdateFailureData *hufdp,
		   MergeActionState *actionState,
		   bool canSetTag)
{
	HeapTuple	tuple;
	ResultRelInfo *resultRelInfo;
	Relation	resultRelationDesc;
	HTSU_Result result;
	HeapUpdateFailureData hufd;
	List	   *recheckIndexes = NIL;
	TupleConversionMap *saved_tcs_map = NULL;

	/*
	 * abort the operation if not running transactions
	 */
	if (IsBootstrapProcessingMode())
		elog(ERROR, "cannot UPDATE during bootstrap");

	if (tuple_updated)
		*tuple_updated = false;

	/*
	 * Initialize hufdp. Since the caller is only interested in the failure
	 * status, initialize with the state that is used to indicate successful
	 * operation.
	 */
	if (hufdp)
		hufdp->result = HeapTupleMayBeUpdated;

	/*
	 * get the heap tuple out of the tuple table slot, making sure we have a
	 * writable copy
	 */
	tuple = ExecMaterializeSlot(slot);

	/*
	 * get information on the (current) result relation
	 */
	resultRelInfo = estate->es_result_relation_info;
	resultRelationDesc = resultRelInfo->ri_RelationDesc;

	/* BEFORE ROW UPDATE Triggers */
	if (resultRelInfo->ri_TrigDesc &&
		resultRelInfo->ri_TrigDesc->trig_update_before_row)
	{
		slot = ExecBRUpdateTriggers(estate, epqstate, resultRelInfo,
									tupleid, oldtuple, slot, hufdp);

		if (slot == NULL)		/* "do nothing" */
			return NULL;

		/* trigger might have changed tuple */
		tuple = ExecMaterializeSlot(slot);
	}

	/* INSTEAD OF ROW UPDATE Triggers */
	if (resultRelInfo->ri_TrigDesc &&
		resultRelInfo->ri_TrigDesc->trig_update_instead_row)
	{
		slot = ExecIRUpdateTriggers(estate, resultRelInfo,
									oldtuple, slot);

		if (slot == NULL)		/* "do nothing" */
			return NULL;

		/* trigger might have changed tuple */
		tuple = ExecMaterializeSlot(slot);
	}
	else if (resultRelInfo->ri_FdwRoutine)
	{
		/*
		 * update in foreign table: let the FDW do it
		 */
		slot = resultRelInfo->ri_FdwRoutine->ExecForeignUpdate(estate,
															   resultRelInfo,
															   slot,
															   planSlot);

		if (slot == NULL)		/* "do nothing" */
			return NULL;

		/* FDW might have changed tuple */
		tuple = ExecMaterializeSlot(slot);

		/*
		 * AFTER ROW Triggers or RETURNING expressions might reference the
		 * tableoid column, so initialize t_tableOid before evaluating them.
		 */
		tuple->t_tableOid = RelationGetRelid(resultRelationDesc);
	}
	else
	{
		bool		partition_constraint_failed;

		/*
		 * Constraints might reference the tableoid column, so initialize
		 * t_tableOid before evaluating them.
		 */
		tuple->t_tableOid = RelationGetRelid(resultRelationDesc);

		/*
		 * Check any RLS UPDATE WITH CHECK policies
		 *
		 * If we generate a new candidate tuple after EvalPlanQual testing, we
		 * must loop back here and recheck any RLS policies and constraints.
		 * (We don't need to redo triggers, however. If there are any BEFORE
		 * triggers then trigger.c will have done heap_lock_tuple to lock the
		 * correct tuple, so there's no need to do them again.)
		 */
lreplace:;

		/*
		 * If partition constraint fails, this row might get moved to another
		 * partition, in which case we should check the RLS CHECK policy just
		 * before inserting into the new partition, rather than doing it here.
		 * This is because a trigger on that partition might again change the
		 * row. So skip the WCO checks if the partition constraint fails.
		 */
		partition_constraint_failed =
			resultRelInfo->ri_PartitionCheck &&
			!ExecPartitionCheck(resultRelInfo, slot, estate);

		if (!partition_constraint_failed &&
			resultRelInfo->ri_WithCheckOptions != NIL)
		{
			/*
			 * ExecWithCheckOptions() will skip any WCOs which are not of the
			 * kind we are looking for at this point.
			 */
			ExecWithCheckOptions(WCO_RLS_UPDATE_CHECK,
								 resultRelInfo, slot, estate);
		}

		/*
		 * If a partition check failed, try to move the row into the right
		 * partition.
		 */
		if (partition_constraint_failed)
		{
			bool		tuple_deleted;
			TupleTableSlot *ret_slot;
			PartitionTupleRouting *proute = mtstate->mt_partition_tuple_routing;
			int			map_index;
			TupleConversionMap *tupconv_map;

			/*
			 * Disallow an INSERT ON CONFLICT DO UPDATE that causes the
			 * original row to migrate to a different partition. Maybe this
			 * can be implemented some day, but it seems a fringe feature with
			 * little redeeming value.
			 */
			if (((ModifyTable *) mtstate->ps.plan)->onConflictAction == ONCONFLICT_UPDATE)
				ereport(ERROR,
						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
						 errmsg("invalid ON UPDATE specification"),
						 errdetail("The result tuple would appear in a different partition than the original tuple.")));

			/*
			 * When an UPDATE is run on a leaf partition, we will not have
			 * partition tuple routing set up. In that case, fail with
			 * partition constraint violation error.
			 */
			if (proute == NULL)
				ExecPartitionCheckEmitError(resultRelInfo, slot, estate);

			/*
			 * Row movement, part 1. Delete the tuple, but skip RETURNING
			 * processing. We want to return rows from INSERT.
			 */
			ExecDelete(mtstate, tupleid, oldtuple, planSlot, epqstate,
					   estate, &tuple_deleted, false, hufdp, NULL,
					   false);
|
|
|
|
|
|
|
/*
|
|
|
|
* For some reason if DELETE didn't happen (e.g. trigger prevented
|
|
|
|
* it, or it was already deleted by self, or it was concurrently
|
|
|
|
* deleted by another transaction), then we should skip the insert
|
|
|
|
* as well; otherwise, an UPDATE could cause an increase in the
|
|
|
|
* total number of rows across all partitions, which is clearly
|
|
|
|
* wrong.
|
|
|
|
*
|
|
|
|
* For a normal UPDATE, the case where the tuple has been the
|
|
|
|
* subject of a concurrent UPDATE or DELETE would be handled by
|
|
|
|
* the EvalPlanQual machinery, but for an UPDATE that we've
|
|
|
|
* translated into a DELETE from this partition and an INSERT into
|
|
|
|
* some other partition, that's not available, because CTID chains
|
|
|
|
* can't span relation boundaries. We mimic the semantics to a
|
|
|
|
* limited extent by skipping the INSERT if the DELETE fails to
|
|
|
|
* find a tuple. This ensures that two concurrent attempts to
|
|
|
|
* UPDATE the same tuple at the same time can't turn one tuple
|
|
|
|
* into two, and that an UPDATE of a just-deleted tuple can't
|
|
|
|
* resurrect it.
|
|
|
|
*/
|
|
|
|
if (!tuple_deleted)
|
|
|
|
return NULL;
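The rule described in the comment above can be illustrated with a toy model. This is only a sketch, not PostgreSQL code: `ToyPartition`, `toy_delete`, `toy_insert`, and `toy_move_row` are hypothetical names, and each "partition" is just a small array of integer keys. It shows that when the DELETE step finds nothing to delete, the INSERT step is skipped, so a row movement can never increase the total number of rows.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_CAPACITY 8

/* Hypothetical stand-in for a partition: a small bag of integer keys. */
typedef struct ToyPartition
{
	int			keys[TOY_CAPACITY];
	size_t		nkeys;
} ToyPartition;

/* Remove one occurrence of key; report whether anything was deleted. */
static bool
toy_delete(ToyPartition *part, int key)
{
	for (size_t i = 0; i < part->nkeys; i++)
	{
		if (part->keys[i] == key)
		{
			/* Fill the hole with the last element and shrink the bag. */
			part->keys[i] = part->keys[part->nkeys - 1];
			part->nkeys--;
			return true;
		}
	}
	return false;
}

/* Append key if there is room; report success. */
static bool
toy_insert(ToyPartition *part, int key)
{
	if (part->nkeys >= TOY_CAPACITY)
		return false;
	part->keys[part->nkeys++] = key;
	return true;
}

/*
 * Move a row by DELETE + INSERT.  If the DELETE found no row (for
 * example because it was already removed), skip the INSERT as well.
 */
static bool
toy_move_row(ToyPartition *from, ToyPartition *to, int old_key, int new_key)
{
	bool		tuple_deleted;

	assert(from != NULL && to != NULL);

	tuple_deleted = toy_delete(from, old_key);
	if (!tuple_deleted)
		return false;			/* nothing deleted, so nothing inserted */

	return toy_insert(to, new_key);
}
```

Calling `toy_move_row` twice for the same `old_key` moves the row only once; the second call finds nothing to delete and leaves both partitions unchanged.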
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Updates set the transition capture map only when a new subplan
|
|
|
|
* is chosen. But for inserts, it is set for each row. So after
|
|
|
|
* INSERT, we need to revert back to the map created for UPDATE;
|
|
|
|
* otherwise the next UPDATE will incorrectly use the one created
|
|
|
|
* for INSERT. So first save the one created for UPDATE.
|
|
|
|
*/
|
|
|
|
if (mtstate->mt_transition_capture)
|
|
|
|
saved_tcs_map = mtstate->mt_transition_capture->tcs_map;
|
|
|
|
|
|
|
|
/*
|
MERGE SQL Command following SQL:2016
MERGE performs actions that modify rows in the target table
using a source table or query. MERGE provides a single SQL
statement that can conditionally INSERT/UPDATE/DELETE rows,
a task that would otherwise require multiple PL statements.
e.g.
MERGE INTO target AS t
USING source AS s
ON t.tid = s.sid
WHEN MATCHED AND t.balance > s.delta THEN
UPDATE SET balance = t.balance - s.delta
WHEN MATCHED THEN
DELETE
WHEN NOT MATCHED AND s.delta > 0 THEN
INSERT VALUES (s.sid, s.delta)
WHEN NOT MATCHED THEN
DO NOTHING;
MERGE works with regular and partitioned tables, including
column and row security enforcement, as well as support for
row, statement and transition triggers.
MERGE is optimized for OLTP and is parameterizable, though
also useful for large scale ETL/ELT. MERGE is not intended
to be used in preference to existing single SQL commands
for INSERT, UPDATE or DELETE since there is some overhead.
MERGE can be used statically from PL/pgSQL.
MERGE does not yet support inheritance, write rules,
RETURNING clauses, updatable views or foreign tables.
MERGE follows SQL Standard per the most recent SQL:2016.
Includes full tests and documentation, including full
isolation tests to demonstrate the concurrent behavior.
This version written from scratch in 2017 by Simon Riggs,
using docs and tests originally written in 2009. Later work
from Pavan Deolasee has been both complex and deep, leaving
the lead author credit now in his hands.
Extensive discussion of concurrency from Peter Geoghegan,
with thanks for the time and effort contributed.
Various issues reported via sqlsmith by Andreas Seltenreich
Authors: Pavan Deolasee, Simon Riggs
Reviewer: Peter Geoghegan, Amit Langote, Tomas Vondra, Simon Riggs
Discussion:
https://postgr.es/m/CANP8+jKitBSrB7oTgT9CY2i1ObfOt36z0XMraQc+Xrz8QB0nXA@mail.gmail.com
https://postgr.es/m/CAH2-WzkJdBuxj9PO=2QaO9-3h3xGbQPZ34kJH=HukRekwM-GZg@mail.gmail.com
2018-04-03 10:28:16 +02:00
|
|
|
* We should convert the tuple into root's tuple descriptor, since
|
|
|
|
* ExecInsert() starts the search from root. To do that, we need to
|
|
|
|
* retrieve the tuple conversion map for this resultRelInfo.
|
|
|
|
*
|
|
|
|
* If we're running MERGE then resultRelInfo is per-partition
|
|
|
|
* resultRelInfo as initialized in ExecInitPartitionInfo(). Note
|
|
|
|
* that we don't expand inheritance for the resultRelation in case
|
|
|
|
* of MERGE and hence there is just one subplan. Whereas for
|
|
|
|
* regular UPDATE, resultRelInfo is one of the per-subplan
|
|
|
|
* resultRelInfos. In either case the position of this partition is
|
|
|
|
* tracked in ri_PartitionLeafIndex;
|
|
|
|
*
|
|
|
|
* Retrieve the map either by looking at the resultRelInfo's
|
|
|
|
* position in mtstate->resultRelInfo[] (for UPDATE) or by simply
|
|
|
|
* using the ri_PartitionLeafIndex value (for MERGE).
|
2018-01-19 21:33:06 +01:00
|
|
|
*/
|
2018-04-03 10:28:16 +02:00
|
|
|
if (mtstate->operation == CMD_MERGE)
|
|
|
|
{
|
|
|
|
map_index = resultRelInfo->ri_PartitionLeafIndex;
|
|
|
|
Assert(mtstate->rootResultRelInfo == NULL);
|
|
|
|
tupconv_map = TupConvMapForLeaf(proute,
|
|
|
|
mtstate->resultRelInfo,
|
|
|
|
map_index);
|
|
|
|
}
|
|
|
|
else
|
|
|
|
{
|
|
|
|
map_index = resultRelInfo - mtstate->resultRelInfo;
|
|
|
|
Assert(map_index >= 0 && map_index < mtstate->mt_nplans);
|
|
|
|
tupconv_map = tupconv_map_for_subplan(mtstate, map_index);
|
|
|
|
}
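In the non-MERGE branch above, the index comes from plain C pointer arithmetic: subtracting the base address of an array from a pointer to one of its elements yields that element's position. A minimal sketch, using a hypothetical `ToyRelInfo` type and `toy_map_index` helper (neither is part of PostgreSQL):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for ResultRelInfo; only its array position matters. */
typedef struct ToyRelInfo
{
	int			relid;
} ToyRelInfo;

/*
 * Return the position of elem within the array starting at base.
 * elem must point into that array; the assertion mirrors the bounds
 * check on map_index in the code above.
 */
static int
toy_map_index(const ToyRelInfo *base, int nplans, const ToyRelInfo *elem)
{
	ptrdiff_t	map_index = elem - base;

	assert(map_index >= 0 && map_index < nplans);
	return (int) map_index;
}
```

The subtraction is only meaningful when both pointers refer to elements of the same array, which is why the MERGE branch, whose `resultRelInfo` is not an element of that array, takes its index from a stored field instead.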
|
2018-01-19 21:33:06 +01:00
|
|
|
tuple = ConvertPartitionTupleSlot(tupconv_map,
|
|
|
|
tuple,
|
|
|
|
proute->root_tuple_slot,
|
|
|
|
&slot);
|
|
|
|
|
2018-03-19 22:01:14 +01:00
|
|
|
/*
|
|
|
|
* Prepare for tuple routing, making it look like we're inserting
|
|
|
|
* into the root.
|
|
|
|
*/
|
2018-03-19 21:43:57 +01:00
|
|
|
slot = ExecPrepareTupleRouting(mtstate, estate, proute,
|
2018-04-03 10:28:16 +02:00
|
|
|
getTargetResultRelInfo(mtstate),
|
|
|
|
slot);
|
2018-01-19 21:33:06 +01:00
|
|
|
|
2018-03-19 22:09:43 +01:00
|
|
|
ret_slot = ExecInsert(mtstate, slot, planSlot,
|
2018-04-03 10:28:16 +02:00
|
|
|
estate, actionState, canSetTag);
|
|
|
|
|
|
|
|
/* Update is successful. */
|
|
|
|
if (tuple_updated)
|
|
|
|
*tuple_updated = true;
|
2018-01-19 21:33:06 +01:00
|
|
|
|
2018-03-19 21:43:57 +01:00
|
|
|
/* Revert ExecPrepareTupleRouting's node change. */
|
2018-01-19 21:33:06 +01:00
|
|
|
estate->es_result_relation_info = resultRelInfo;
|
|
|
|
if (mtstate->mt_transition_capture)
|
|
|
|
{
|
|
|
|
mtstate->mt_transition_capture->tcs_original_insert_tuple = NULL;
|
|
|
|
mtstate->mt_transition_capture->tcs_map = saved_tcs_map;
|
|
|
|
}
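The save-and-restore of `tcs_map` around the INSERT can be sketched in isolation. The type and functions below (`ToyCapture`, `toy_insert_sets_map`, `toy_update_with_move`) are hypothetical and only mimic the shape of the logic: a callee overwrites a shared field, so the caller keeps a copy and puts it back afterwards.

```c
#include <stddef.h>

/* Hypothetical stand-in for the transition capture state. */
typedef struct ToyCapture
{
	const char *map;			/* plays the role of tcs_map */
} ToyCapture;

/* Like the per-row INSERT path, this overwrites the shared map. */
static void
toy_insert_sets_map(ToyCapture *capture)
{
	if (capture != NULL)
		capture->map = "insert-map";
}

/* Run the INSERT step, then restore the map that UPDATE had set up. */
static void
toy_update_with_move(ToyCapture *capture)
{
	const char *saved_map = NULL;

	if (capture != NULL)
		saved_map = capture->map;	/* save before the callee clobbers it */

	toy_insert_sets_map(capture);

	if (capture != NULL)
		capture->map = saved_map;	/* restore for the next UPDATE row */
}
```

Both the save and the restore are guarded by the same NULL check, so the sketch is a no-op when no transition capture is active.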
|
2018-03-19 21:43:57 +01:00
|
|
|
|
2018-01-19 21:33:06 +01:00
|
|
|
return ret_slot;
|
|
|
|
}
|
2015-04-25 02:34:26 +02:00
|
|
|
|
|
|
|
/*
|
2017-01-04 20:36:34 +01:00
|
|
|
* Check the constraints of the tuple. Note that we pass the same
|
|
|
|
* slot for the orig_slot argument, because unlike ExecInsert(), no
|
|
|
|
* tuple-routing is performed here, hence the slot remains unchanged.
|
2018-01-19 21:33:06 +01:00
|
|
|
* We've already checked the partition constraint above; however, we
|
|
|
|
* must still ensure the tuple passes all other constraints, so we
|
|
|
|
* will call ExecConstraints() and have it validate all remaining
|
|
|
|
* checks.
|
2015-04-25 02:34:26 +02:00
|
|
|
*/
|
2018-01-19 21:33:06 +01:00
|
|
|
if (resultRelationDesc->rd_att->constr)
|
|
|
|
ExecConstraints(resultRelInfo, slot, estate, false);
|
2010-10-10 19:43:33 +02:00
|
|
|
|
|
|
|
/*
|
|
|
|
* replace the heap tuple
|
|
|
|
*
|
|
|
|
* Note: if es_crosscheck_snapshot isn't InvalidSnapshot, we check
|
|
|
|
* that the row to be updated is visible to that snapshot, and throw a
|
|
|
|
* can't-serialize error if not. This is a special-case behavior
|
|
|
|
* needed for referential integrity updates in transaction-snapshot
|
|
|
|
* mode transactions.
|
|
|
|
*/
|
|
|
|
result = heap_update(resultRelationDesc, tupleid, tuple,
|
|
|
|
estate->es_output_cid,
|
|
|
|
estate->es_crosscheck_snapshot,
|
2013-05-29 22:58:43 +02:00
|
|
|
true /* wait for commit */ ,
|
2018-04-03 10:28:16 +02:00
|
|
|
&hufd);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Copy the necessary information, if the caller has asked for it. We
|
|
|
|
* must do this irrespective of whether the tuple was updated or
|
|
|
|
* deleted.
|
|
|
|
*/
|
|
|
|
if (hufdp)
|
|
|
|
*hufdp = hufd;
|
|
|
|
|
2010-10-10 19:43:33 +02:00
|
|
|
switch (result)
|
|
|
|
{
|
|
|
|
case HeapTupleSelfUpdated:
|
2013-05-29 22:58:43 +02:00
|
|
|
|
2012-10-26 21:55:36 +02:00
|
|
|
/*
|
|
|
|
* The target tuple was already updated or deleted by the
|
|
|
|
* current command, or by a later command in the current
|
|
|
|
* transaction. The former case is possible in a join UPDATE
|
2013-05-29 22:58:43 +02:00
|
|
|
* where multiple tuples join to the same target tuple. This
|
|
|
|
* is pretty questionable, but Postgres has always allowed it:
|
|
|
|
* we just execute the first update action and ignore
|
|
|
|
* additional update attempts.
|
2012-10-26 21:55:36 +02:00
|
|
|
*
|
|
|
|
* The latter case arises if the tuple is modified by a
|
|
|
|
* command in a BEFORE trigger, or perhaps by a command in a
|
|
|
|
* volatile function used in the query. In such situations we
|
|
|
|
* should not ignore the update, but it is equally unsafe to
|
|
|
|
* proceed. We don't want to discard the original UPDATE
|
|
|
|
* while keeping the triggered actions based on it; and we
|
|
|
|
* have no principled way to merge this update with the
|
|
|
|
* previous ones. So throwing an error is the only safe
|
|
|
|
* course.
|
|
|
|
*
|
2013-05-29 22:58:43 +02:00
|
|
|
* If a trigger actually intends this type of interaction, it
|
|
|
|
* can re-execute the UPDATE (assuming it can figure out how)
|
|
|
|
* and then return NULL to cancel the outer update.
|
2012-10-26 21:55:36 +02:00
|
|
|
*/
|
|
|
|
if (hufd.cmax != estate->es_output_cid)
|
|
|
|
ereport(ERROR,
|
|
|
|
(errcode(ERRCODE_TRIGGERED_DATA_CHANGE_VIOLATION),
|
|
|
|
errmsg("tuple to be updated was already modified by an operation triggered by the current command"),
|
|
|
|
errhint("Consider using an AFTER trigger instead of a BEFORE trigger to propagate changes to other rows.")));
|
|
|
|
|
|
|
|
/* Else, already updated by self; nothing to do */
|
2010-10-10 19:43:33 +02:00
|
|
|
return NULL;
|
|
|
|
|
|
|
|
case HeapTupleMayBeUpdated:
|
|
|
|
break;
|
|
|
|
|
|
|
|
case HeapTupleUpdated:
|
|
|
|
if (IsolationUsesXactSnapshot())
|
|
|
|
ereport(ERROR,
|
|
|
|
(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
|
|
|
|
errmsg("could not serialize access due to concurrent update")));
|
2018-04-03 10:28:16 +02:00
|
|
|
|
2012-10-26 21:55:36 +02:00
|
|
|
if (!ItemPointerEquals(tupleid, &hufd.ctid))
|
2009-10-10 03:43:50 +02:00
|
|
|
{
|
2010-10-10 19:43:33 +02:00
|
|
|
TupleTableSlot *epqslot;
|
|
|
|
|
2018-04-03 10:28:16 +02:00
|
|
|
/*
|
|
|
|
* If we're executing MERGE, then the onus of running
|
|
|
|
* EvalPlanQual() and handling its outcome lies with the
|
|
|
|
* caller.
|
|
|
|
*/
|
|
|
|
if (actionState != NULL)
|
|
|
|
return NULL;
|
|
|
|
|
|
|
|
/* Regular UPDATE path. */
|
2010-10-10 19:43:33 +02:00
|
|
|
epqslot = EvalPlanQual(estate,
|
|
|
|
epqstate,
|
|
|
|
resultRelationDesc,
|
2018-04-03 10:28:16 +02:00
|
|
|
GetEPQRangeTableIndex(resultRelInfo),
|
|
|
|
hufd.lockmode,
|
2012-10-26 21:55:36 +02:00
|
|
|
&hufd.ctid,
|
|
|
|
hufd.xmax);
|
2010-10-10 19:43:33 +02:00
|
|
|
if (!TupIsNull(epqslot))
|
|
|
|
{
|
2012-10-26 21:55:36 +02:00
|
|
|
*tupleid = hufd.ctid;
|
2018-04-03 10:28:16 +02:00
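To make the WHEN-clause example in the commit message above concrete, here is a minimal C sketch of how an action is chosen for one source row: the clauses are tried in the order written, and the first one whose condition holds is applied. This is only a toy model of the SQL-level semantics. The names `MergeAction` and `choose_action` are hypothetical and do not correspond to anything in the executor.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical action codes, one per WHEN clause in the example. */
typedef enum
{
    ACT_UPDATE,     /* WHEN MATCHED AND t.balance > s.delta */
    ACT_DELETE,     /* WHEN MATCHED */
    ACT_INSERT,     /* WHEN NOT MATCHED AND s.delta > 0 */
    ACT_NOTHING     /* WHEN NOT MATCHED */
} MergeAction;

/*
 * Pick the action for one source row.  "matched" says whether the ON
 * condition found a target row; "balance" is only meaningful if it did.
 * Clauses are tested in the order they appear in the statement.
 */
MergeAction
choose_action(bool matched, int balance, int delta)
{
    if (matched)
    {
        if (balance > delta)
            return ACT_UPDATE;
        return ACT_DELETE;
    }
    if (delta > 0)
        return ACT_INSERT;
    return ACT_NOTHING;
}
```

With this model, a matched row with balance 100 and delta 30 is updated, a matched row with balance 10 and delta 30 is deleted, and an unmatched row is inserted only when its delta is positive.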
|
|
|
/* Normal UPDATE path */
|
2010-10-10 19:43:33 +02:00
|
|
|
slot = ExecFilterJunk(resultRelInfo->ri_junkFilter, epqslot);
|
|
|
|
tuple = ExecMaterializeSlot(slot);
|
|
|
|
goto lreplace;
|
|
|
|
}
|
2009-10-10 03:43:50 +02:00
|
|
|
}
|
2018-04-03 10:28:16 +02:00
|
|
|
|
|
|
|
/*
|
|
|
|
* tuple already deleted; nothing to do. But MERGE might want
|
|
|
|
* to handle it differently. We've already filled-in hufdp
|
|
|
|
* with sufficient information for MERGE to look at.
|
|
|
|
*/
|
2010-10-10 19:43:33 +02:00
|
|
|
return NULL;
|
2009-10-10 03:43:50 +02:00
|
|
|
|
2010-10-10 19:43:33 +02:00
|
|
|
default:
|
|
|
|
elog(ERROR, "unrecognized heap_update status: %u", result);
|
|
|
|
return NULL;
|
|
|
|
}
|
2009-10-10 03:43:50 +02:00
|
|
|
|
2010-10-10 19:43:33 +02:00
|
|
|
/*
|
|
|
|
* Note: instead of having to update the old index tuples associated
|
2011-04-10 17:42:00 +02:00
|
|
|
* with the heap tuple, all we do is form and insert new index tuples.
|
|
|
|
* This is because UPDATEs are actually DELETEs and INSERTs, and index
|
|
|
|
* tuple deletion is done later by VACUUM (see notes in ExecDelete).
|
|
|
|
* All we do here is insert new index tuples. -cim 9/27/89
|
2010-10-10 19:43:33 +02:00
|
|
|
*/
|
2009-10-10 03:43:50 +02:00
|
|
|
|
2010-10-10 19:43:33 +02:00
|
|
|
/*
|
|
|
|
* insert index entries for tuple
|
|
|
|
*
|
|
|
|
* Note: heap_update returns the tid (location) of the new tuple in
|
|
|
|
* the t_self field.
|
|
|
|
*
|
|
|
|
* If it's a HOT update, we mustn't insert new index entries.
|
|
|
|
*/
|
|
|
|
if (resultRelInfo->ri_NumIndices > 0 && !HeapTupleIsHeapOnly(tuple))
|
|
|
|
recheckIndexes = ExecInsertIndexTuples(slot, &(tuple->t_self),
|
Add support for INSERT ... ON CONFLICT DO NOTHING/UPDATE.
The newly added ON CONFLICT clause allows specifying an alternative to
raising a unique or exclusion constraint violation error when inserting.
ON CONFLICT refers to constraints that can either be specified using an
inference clause (by specifying the columns of a unique constraint) or
by naming a unique or exclusion constraint. DO NOTHING avoids the
constraint violation, without touching the pre-existing row. DO UPDATE
SET ... [WHERE ...] updates the pre-existing tuple, and has access to
both the tuple proposed for insertion and the existing tuple; the
optional WHERE clause can be used to prevent an update from being
executed. The UPDATE SET and WHERE clauses have access to the tuple
proposed for insertion using the "magic" EXCLUDED alias, and to the
pre-existing tuple using the table name or its alias.
This feature is often referred to as upsert.
This is implemented using a new infrastructure called "speculative
insertion". It is an optimistic variant of regular insertion that first
does a pre-check for existing tuples and then attempts an insert. If a
violating tuple was inserted concurrently, the speculatively inserted
tuple is deleted and a new attempt is made. If the pre-check finds a
matching tuple the alternative DO NOTHING or DO UPDATE action is taken.
If the insertion succeeds without detecting a conflict, the tuple is
deemed inserted.
To handle the possible ambiguity between the excluded alias and a table
named excluded, and for convenience with long relation names, INSERT
INTO now can alias its target table.
Bumps catversion as stored rules change.
Author: Peter Geoghegan, with significant contributions from Heikki
Linnakangas and Andres Freund. Testing infrastructure by Jeff Janes.
Reviewed-By: Heikki Linnakangas, Andres Freund, Robert Haas, Simon Riggs,
Dean Rasheed, Stephen Frost and many others.
2015-05-08 05:31:36 +02:00
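The pre-check, insert, and retry cycle described in the commit message above can be sketched with a toy in-memory table. This is an illustration under stated assumptions, not the executor's code: `ToyTable`, `toy_upsert`, and `concurrent_insert_7` are hypothetical names, and concurrency is simulated by a callback that runs once between the pre-check and the insert.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ROWS 16

/* Hypothetical table with a unique constraint on "keys". */
typedef struct
{
    int     keys[MAX_ROWS];
    int     values[MAX_ROWS];
    size_t  nrows;
} ToyTable;

typedef enum { UPSERT_INSERTED, UPSERT_UPDATED } UpsertResult;

/* Index of a row with this key, ignoring row "skip"; -1 if there is none. */
static int
find_key(const ToyTable *t, int key, int skip)
{
    for (size_t i = 0; i < t->nrows; i++)
        if ((int) i != skip && t->keys[i] == key)
            return (int) i;
    return -1;
}

/* Stand-in for another session inserting key 7 at an awkward moment. */
void
concurrent_insert_7(ToyTable *t)
{
    assert(t->nrows < MAX_ROWS);
    t->keys[t->nrows] = 7;
    t->values[t->nrows] = 100;
    t->nrows++;
}

UpsertResult
toy_upsert(ToyTable *t, int key, int value, void (*interloper)(ToyTable *))
{
    for (;;)
    {
        /* Pre-check: an existing row means we take the DO UPDATE path. */
        int existing = find_key(t, key, -1);

        if (existing >= 0)
        {
            t->values[existing] = value;
            return UPSERT_UPDATED;
        }

        /* Simulated concurrent activity, at most once. */
        if (interloper != NULL)
        {
            interloper(t);
            interloper = NULL;
        }

        /* Speculative insert: append our row as the last one. */
        assert(t->nrows < MAX_ROWS);
        int mine = (int) t->nrows++;

        t->keys[mine] = key;
        t->values[mine] = value;

        /* Conflict appeared meanwhile: delete our row and start over. */
        if (find_key(t, key, mine) >= 0)
        {
            t->nrows--;
            continue;
        }
        return UPSERT_INSERTED;
    }
}
```

When `concurrent_insert_7` is passed for key 7, the first attempt is withdrawn and the retry finds the other session's row, so the call ends on the update path with a single row for that key.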
|
|
|
estate, false, NULL, NIL);
|
2010-10-10 19:43:33 +02:00
|
|
|
}
|
2009-10-10 03:43:50 +02:00
|
|
|
|
2018-04-03 10:28:16 +02:00
|
|
|
if (tuple_updated)
|
|
|
|
*tuple_updated = true;
|
|
|
|
|
2011-02-26 00:56:23 +01:00
|
|
|
if (canSetTag)
|
|
|
|
(estate->es_processed)++;
|
2009-10-10 03:43:50 +02:00
|
|
|
|
|
|
|
/* AFTER ROW UPDATE Triggers */
|
2014-03-23 07:16:34 +01:00
|
|
|
ExecARUpdateTriggers(estate, resultRelInfo, tupleid, oldtuple, tuple,
|
2017-06-28 19:55:03 +02:00
|
|
|
recheckIndexes,
|
Fix SQL-spec incompatibilities in new transition table feature.
The standard says that all changes of the same kind (insert, update, or
delete) caused in one table by a single SQL statement should be reported
in a single transition table; and by that, they mean to include foreign key
enforcement actions cascading from the statement's direct effects. It's
also reasonable to conclude that if the standard had wCTEs, they would say
that effects of wCTEs applying to the same table as each other or the outer
statement should be merged into one transition table. We weren't doing it
like that.
Hence, arrange to merge tuples from multiple update actions into a single
transition table as much as we can. There is a problem, which is that if
the firing of FK enforcement triggers and after-row triggers with
transition tables is interspersed, we might need to report more tuples
after some triggers have already seen the transition table. It seems like
a bad idea for the transition table to be mutable between trigger calls.
There's no good way around this without a major redesign of the FK logic,
so for now, resolve it by opening a new transition table each time this
happens.
Also, ensure that AFTER STATEMENT triggers fire just once per statement,
or once per transition table when we're forced to make more than one.
Previous versions of Postgres have allowed each FK enforcement query
to cause an additional firing of the AFTER STATEMENT triggers for the
referencing table, but that's certainly not per spec. (We're still
doing multiple firings of BEFORE STATEMENT triggers, though; is that
something worth changing?)
Also, forbid using transition tables with column-specific UPDATE triggers.
The spec requires such transition tables to show only the tuples for which
the UPDATE trigger would have fired, which means maintaining multiple
transition tables or else somehow filtering the contents at readout.
Maybe someday we'll bother to support that option, but it looks like a
lot of trouble for a marginal feature.
The transition tables are now managed by the AfterTriggers data structures,
rather than being directly the responsibility of ModifyTable nodes. This
removes a subtransaction-lifespan memory leak introduced by my previous
band-aid patch 3c4359521.
In passing, refactor the AfterTriggers data structures to reduce the
management overhead for them, by using arrays of structs rather than
several parallel arrays for per-query-level and per-subtransaction state.
I failed to resist the temptation to do some copy-editing on the SGML
docs about triggers, above and beyond merely documenting the effects
of this patch.
Back-patch to v10, because we don't want the semantics of transition
tables to change post-release.
Patch by me, with help and review from Thomas Munro.
Discussion: https://postgr.es/m/20170909064853.25630.12825@wrigleys.postgresql.org
2017-09-16 19:20:32 +02:00
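The refactoring mentioned in the commit message above, replacing several parallel arrays with one array of structs, can be illustrated as follows. This is a generic sketch of the technique only. The struct and field names (`ToyQueryLevel`, `ToyTriggerState`, `events_count`, `fired_count`) are invented for the example and are not the actual AfterTriggers definitions.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * With parallel arrays, each per-level field lives in its own array and
 * every array must be enlarged in step.  With an array of structs there
 * is one allocation to grow and one element to initialize.
 */
typedef struct
{
    int     events_count;   /* hypothetical per-level field */
    int     fired_count;    /* hypothetical per-level field */
} ToyQueryLevel;

typedef struct
{
    ToyQueryLevel *levels;  /* one entry per query nesting level */
    int     nlevels;        /* entries in use */
    int     maxlevels;      /* entries allocated */
} ToyTriggerState;

/* Append a zero-initialized level, growing the single array if needed. */
ToyQueryLevel *
toy_push_level(ToyTriggerState *s)
{
    assert(s != NULL);

    if (s->nlevels == s->maxlevels)
    {
        int     newmax = s->maxlevels ? s->maxlevels * 2 : 4;
        ToyQueryLevel *p = realloc(s->levels,
                                   (size_t) newmax * sizeof(ToyQueryLevel));

        if (p == NULL)
            abort();
        s->levels = p;
        s->maxlevels = newmax;
    }

    ToyQueryLevel *lvl = &s->levels[s->nlevels++];

    lvl->events_count = 0;
    lvl->fired_count = 0;
    return lvl;
}
```

The caller starts from a zeroed `ToyTriggerState` and frees `levels` when done.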
|
|
|
mtstate->operation == CMD_INSERT ?
|
|
|
|
mtstate->mt_oc_transition_capture :
|
2017-06-28 19:55:03 +02:00
|
|
|
mtstate->mt_transition_capture);
|
2010-01-31 19:15:39 +01:00
|
|
|
|
|
|
|
list_free(recheckIndexes);
|
2009-10-10 03:43:50 +02:00
|
|
|
|
2015-04-25 02:34:26 +02:00
|
|
|
/*
|
2015-05-24 03:35:49 +02:00
|
|
|
* Check any WITH CHECK OPTION constraints from parent views. We are
|
|
|
|
* required to do this after testing all constraints and uniqueness
|
|
|
|
* violations per the SQL spec, so we do it after actually updating the
|
|
|
|
* record in the heap and all indexes.
|
2015-04-25 02:34:26 +02:00
|
|
|
*
|
2015-05-24 03:35:49 +02:00
|
|
|
* ExecWithCheckOptions() will skip any WCOs which are not of the kind we
|
|
|
|
* are looking for at this point.
|
2015-04-25 02:34:26 +02:00
|
|
|
*/
|
2013-07-18 23:10:16 +02:00
|
|
|
if (resultRelInfo->ri_WithCheckOptions != NIL)
|
2015-04-25 02:34:26 +02:00
|
|
|
ExecWithCheckOptions(WCO_VIEW_CHECK, resultRelInfo, slot, estate);
|
2013-07-18 23:10:16 +02:00
|
|
|
|
2009-10-10 03:43:50 +02:00
|
|
|
/* Process RETURNING if present */
|
|
|
|
if (resultRelInfo->ri_projectReturning)
|
2016-03-18 18:48:58 +01:00
|
|
|
return ExecProcessReturning(resultRelInfo, slot, planSlot);
|
2009-10-10 03:43:50 +02:00
|
|
|
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
|
2015-05-08 05:31:36 +02:00
|
|
|
/*
|
|
|
|
* ExecOnConflictUpdate --- execute UPDATE of INSERT ON CONFLICT DO UPDATE
|
|
|
|
*
|
|
|
|
* Try to lock tuple for update as part of speculative insertion. If
|
|
|
|
* a qual originating from ON CONFLICT DO UPDATE is satisfied, update
|
|
|
|
* (but still lock row, even though it may not satisfy estate's
|
|
|
|
* snapshot).
|
|
|
|
*
|
|
|
|
* Returns true if we're done (with or without an update), or false if
|
|
|
|
* the caller must retry the INSERT from scratch.
|
|
|
|
*/
|
|
|
|
static bool
|
|
|
|
ExecOnConflictUpdate(ModifyTableState *mtstate,
|
|
|
|
ResultRelInfo *resultRelInfo,
|
|
|
|
ItemPointer conflictTid,
|
|
|
|
TupleTableSlot *planSlot,
|
|
|
|
TupleTableSlot *excludedSlot,
|
|
|
|
EState *estate,
|
|
|
|
bool canSetTag,
|
|
|
|
TupleTableSlot **returning)
|
|
|
|
{
|
|
|
|
ExprContext *econtext = mtstate->ps.ps_ExprContext;
|
|
|
|
Relation relation = resultRelInfo->ri_RelationDesc;
|
2018-03-26 15:43:54 +02:00
|
|
|
ExprState *onConflictSetWhere = resultRelInfo->ri_onConflict->oc_WhereClause;
|
2015-05-08 05:31:36 +02:00
|
|
|
HeapTupleData tuple;
|
|
|
|
HeapUpdateFailureData hufd;
|
|
|
|
LockTupleMode lockmode;
|
|
|
|
HTSU_Result test;
|
|
|
|
Buffer buffer;
|
|
|
|
|
|
|
|
/* Determine lock mode to use */
|
|
|
|
lockmode = ExecUpdateLockMode(estate, resultRelInfo);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Lock tuple for update. Don't follow updates when tuple cannot be
|
|
|
|
* locked without doing so. A row locking conflict here means our
|
|
|
|
* previous conclusion that the tuple is conclusively committed is not
|
|
|
|
* true anymore.
|
|
|
|
*/
|
|
|
|
tuple.t_self = *conflictTid;
|
|
|
|
test = heap_lock_tuple(relation, &tuple, estate->es_output_cid,
|
|
|
|
lockmode, LockWaitBlock, false, &buffer,
|
|
|
|
&hufd);
|
|
|
|
switch (test)
|
|
|
|
{
|
|
|
|
case HeapTupleMayBeUpdated:
|
|
|
|
/* success! */
|
|
|
|
break;
|
|
|
|
|
|
|
|
case HeapTupleInvisible:
|
|
|
|
|
|
|
|
/*
|
|
|
|
* This can occur when a just inserted tuple is updated again in
|
|
|
|
* the same command. E.g. because multiple rows with the same
|
|
|
|
* conflicting key values are inserted.
|
|
|
|
*
|
|
|
|
* This is somewhat similar to the ExecUpdate()
|
|
|
|
* HeapTupleSelfUpdated case. We do not want to proceed because
|
|
|
|
* it would lead to the same row being updated a second time in
|
|
|
|
* some unspecified order, and in contrast to plain UPDATEs
|
|
|
|
* there's no historical behavior to break.
|
|
|
|
*
|
|
|
|
* It is the user's responsibility to prevent this situation from
|
2018-04-03 10:28:16 +02:00
|
|
|
* occurring. These problems are why SQL Standard similarly
|
|
|
|
* specifies that for SQL MERGE, an exception must be raised in
|
|
|
|
* the event of an attempt to update the same row twice.
|
2015-05-08 05:31:36 +02:00
|
|
|
*/
|
|
|
|
if (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXmin(tuple.t_data)))
|
|
|
|
ereport(ERROR,
|
|
|
|
(errcode(ERRCODE_CARDINALITY_VIOLATION),
|
|
|
|
errmsg("ON CONFLICT DO UPDATE command cannot affect row a second time"),
|
|
|
|
errhint("Ensure that no rows proposed for insertion within the same command have duplicate constrained values.")));
|
|
|
|
|
|
|
|
/* This shouldn't happen */
|
|
|
|
elog(ERROR, "attempted to lock invisible tuple");
|
|
|
|
|
|
|
|
case HeapTupleSelfUpdated:
|
|
|
|
|
|
|
|
/*
|
|
|
|
* This state should never be reached. As a dirty snapshot is used
|
|
|
|
* to find conflicting tuples, speculative insertion wouldn't have
|
|
|
|
* seen this row to conflict with.
|
|
|
|
*/
|
|
|
|
elog(ERROR, "unexpected self-updated tuple");
|
|
|
|
|
|
|
|
case HeapTupleUpdated:
|
|
|
|
if (IsolationUsesXactSnapshot())
|
|
|
|
ereport(ERROR,
|
|
|
|
(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
|
|
|
|
errmsg("could not serialize access due to concurrent update")));
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Tell caller to try again from the very start.
|
|
|
|
*
|
|
|
|
* It does not make sense to use the usual EvalPlanQual() style
|
|
|
|
* loop here, as the new version of the row might not conflict
|
|
|
|
* anymore, or the conflicting tuple has actually been deleted.
|
|
|
|
*/
|
|
|
|
ReleaseBuffer(buffer);
|
|
|
|
return false;
|
|
|
|
|
|
|
|
default:
|
|
|
|
elog(ERROR, "unrecognized heap_lock_tuple status: %u", test);
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Success, the tuple is locked.
|
|
|
|
*
|
|
|
|
* Reset per-tuple memory context to free any expression evaluation
|
|
|
|
* storage allocated in the previous cycle.
|
|
|
|
*/
|
|
|
|
ResetExprContext(econtext);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Verify that the tuple is visible to our MVCC snapshot if the current
|
|
|
|
* isolation level mandates that.
|
|
|
|
*
|
|
|
|
* It's not sufficient to rely on the check within ExecUpdate() as e.g.
|
|
|
|
* an ON CONFLICT ... WHERE clause may prevent us from reaching that.
|
|
|
|
*
|
|
|
|
* This means we only ever continue when a new command in the current
|
|
|
|
* transaction could see the row, even though in READ COMMITTED mode the
|
|
|
|
* tuple will not be visible according to the current statement's
|
|
|
|
* snapshot. This is in line with the way UPDATE deals with newer tuple
|
|
|
|
* versions.
|
|
|
|
*/
|
|
|
|
ExecCheckHeapTupleVisible(estate, &tuple, buffer);
|
|
|
|
|
|
|
|
/* Store target's existing tuple in the state's dedicated slot */
|
|
|
|
ExecStoreTuple(&tuple, mtstate->mt_existing, buffer, false);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Make tuple and any needed join variables available to ExecQual and
|
|
|
|
* ExecProject. The EXCLUDED tuple is installed in ecxt_innertuple, while
|
2015-05-24 03:35:49 +02:00
|
|
|
* the target's existing tuple is installed in the scantuple. EXCLUDED
|
|
|
|
* has been made to reference INNER_VAR in setrefs.c, but there is no
|
|
|
|
* other redirection.
|
2015-05-08 05:31:36 +02:00
|
|
|
*/
|
|
|
|
econtext->ecxt_scantuple = mtstate->mt_existing;
|
|
|
|
econtext->ecxt_innertuple = excludedSlot;
|
|
|
|
econtext->ecxt_outertuple = NULL;
|
|
|
|
|
Faster expression evaluation and targetlist projection.
This replaces the old, recursive tree-walk based evaluation, with
non-recursive, opcode dispatch based, expression evaluation.
Projection is now implemented as part of expression evaluation.
This both leads to significant performance improvements, and makes
future just-in-time compilation of expressions easier.
The speed gains primarily come from:
- non-recursive implementation reduces stack usage / overhead
- simple sub-expressions are implemented with a single jump, without
function calls
- sharing some state between different sub-expressions
- reduced amount of indirect/hard to predict memory accesses by laying
out operation metadata sequentially; including the avoidance of
nearly all of the previously used linked lists
- more code has been moved to expression initialization, avoiding
constant re-checks at evaluation time
Future just-in-time compilation (JIT) has become easier, as
demonstrated by released patches intended to be merged in a later
release, for primarily two reasons: Firstly, due to a stricter split
between expression initialization and evaluation, less code has to be
handled by the JIT. Secondly, due to the non-recursive nature of the
generated "instructions", less performance-critical code-paths can
easily be shared between interpreted and compiled evaluation.
The new framework allows for significant future optimizations. E.g.:
- basic infrastructure to later reduce the per executor-startup
overhead of expression evaluation, by caching state in prepared
statements. That'd be helpful in OLTPish scenarios where
initialization overhead is measurable.
- optimizing the generated "code". A number of proposals for potential
work has already been made.
- optimizing the interpreter. Similarly a number of proposals have
been made here too.
The move of logic into the expression initialization step leads to some
backward-incompatible changes:
- Function permission checks are now done during expression
initialization, whereas previously they were done during
execution. In edge cases this can lead to errors being raised that
previously wouldn't have been, e.g. a NULL array being coerced to a
different array type previously didn't perform checks.
- The set of domain constraints to be checked, is now evaluated once
during expression initialization, previously it was re-built
every time a domain check was evaluated. For normal queries this
doesn't change much, but e.g. for plpgsql functions, which cache
ExprStates, the old set could stick around longer. The behavior
around this might still change.
Author: Andres Freund, with significant changes by Tom Lane,
changes by Heikki Linnakangas
Reviewed-By: Tom Lane, Heikki Linnakangas
Discussion: https://postgr.es/m/20161206034955.bh33paeralxbtluv@alap3.anarazel.de
2017-03-14 23:45:36 +01:00
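The idea of non-recursive, opcode-dispatch evaluation described in the commit message above can be shown with a very small interpreter: the expression is flattened into a linear array of steps at initialization time, and evaluation is a single loop with a switch. This is a sketch of the general technique only. The step layout, the opcode names, and the use of a value stack are assumptions made for brevity and are not a description of the executor's actual step format.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes for a toy integer expression language. */
typedef enum
{
    STEP_CONST,     /* push a constant */
    STEP_VAR,       /* push a variable by index */
    STEP_ADD,       /* pop two values, push their sum */
    STEP_MUL,       /* pop two values, push their product */
    STEP_DONE       /* return the single remaining value */
} ToyOpcode;

typedef struct
{
    ToyOpcode   op;
    int         arg;    /* constant value or variable index */
} ToyStep;

#define TOY_STACK_DEPTH 32

/* Evaluate a flattened expression without any recursion. */
int
toy_eval(const ToyStep *steps, const int *vars)
{
    int     stack[TOY_STACK_DEPTH];
    size_t  sp = 0;

    for (const ToyStep *s = steps;; s++)
    {
        switch (s->op)
        {
            case STEP_CONST:
                assert(sp < TOY_STACK_DEPTH);
                stack[sp++] = s->arg;
                break;
            case STEP_VAR:
                assert(sp < TOY_STACK_DEPTH);
                stack[sp++] = vars[s->arg];
                break;
            case STEP_ADD:
                assert(sp >= 2);
                sp--;
                stack[sp - 1] += stack[sp];
                break;
            case STEP_MUL:
                assert(sp >= 2);
                sp--;
                stack[sp - 1] *= stack[sp];
                break;
            case STEP_DONE:
                assert(sp == 1);
                return stack[0];
        }
    }
}
```

For instance, the expression `(v0 + 2) * v1` flattens to the steps VAR 0, CONST 2, ADD, VAR 1, MUL, DONE, and evaluates to 20 for `v0 = 3`, `v1 = 4`. Because the steps sit next to each other in memory, the loop touches them sequentially instead of chasing pointers through a tree.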
|
|
|
if (!ExecQual(onConflictSetWhere, econtext))
|
Add support for INSERT ... ON CONFLICT DO NOTHING/UPDATE.
The newly added ON CONFLICT clause allows to specify an alternative to
raising a unique or exclusion constraint violation error when inserting.
ON CONFLICT refers to constraints that can either be specified using a
inference clause (by specifying the columns of a unique constraint) or
by naming a unique or exclusion constraint. DO NOTHING avoids the
constraint violation, without touching the pre-existing row. DO UPDATE
SET ... [WHERE ...] updates the pre-existing tuple, and has access to
both the tuple proposed for insertion and the existing tuple; the
optional WHERE clause can be used to prevent an update from being
executed. The UPDATE SET and WHERE clauses have access to the tuple
proposed for insertion using the "magic" EXCLUDED alias, and to the
pre-existing tuple using the table name or its alias.
This feature is often referred to as upsert.
This is implemented using a new infrastructure called "speculative
insertion". It is an optimistic variant of regular insertion that first
does a pre-check for existing tuples and then attempts an insert. If a
violating tuple was inserted concurrently, the speculatively inserted
tuple is deleted and a new attempt is made. If the pre-check finds a
matching tuple the alternative DO NOTHING or DO UPDATE action is taken.
If the insertion succeeds without detecting a conflict, the tuple is
deemed inserted.
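The pre-check, insert, recheck, and retry cycle described above can be sketched with a toy in-memory table. This is only an illustration under stated assumptions: `ToyTable`, `toy_find`, `toy_upsert` and the `racer` flag are hypothetical names invented here, not PostgreSQL structures or functions, and the "concurrent session" is simulated inside a single thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_CAPACITY 8

/* Hypothetical in-memory table with a unique key; not a PostgreSQL structure. */
typedef struct ToyTable
{
	int			keys[TOY_CAPACITY];
	int			vals[TOY_CAPACITY];
	size_t		nrows;
} ToyTable;

typedef enum ToyOutcome
{
	TOY_INSERTED,
	TOY_UPDATED,				/* DO UPDATE was taken */
	TOY_SKIPPED					/* DO NOTHING was taken */
} ToyOutcome;

/* Pre-check: return the index of a row with this key, or -1 if none. */
static int
toy_find(const ToyTable *t, int key)
{
	for (size_t i = 0; i < t->nrows; i++)
		if (t->keys[i] == key)
			return (int) i;
	return -1;
}

/*
 * Toy upsert.  do_update selects DO UPDATE over DO NOTHING.  racer (may be
 * NULL) simulates another session inserting the same key once, between our
 * pre-check and our own insert.
 */
static ToyOutcome
toy_upsert(ToyTable *t, int key, int val, bool do_update, bool *racer)
{
	for (;;)
	{
		int			existing = toy_find(t, key);

		if (existing >= 0)
		{
			if (!do_update)
				return TOY_SKIPPED;
			t->vals[existing] = val;	/* like SET val = EXCLUDED.val */
			return TOY_UPDATED;
		}

		/* Simulated concurrent insert of the same key. */
		if (racer != NULL && *racer)
		{
			*racer = false;
			assert(t->nrows < TOY_CAPACITY);
			t->keys[t->nrows] = key;
			t->vals[t->nrows] = -1;
			t->nrows++;
		}

		/* Speculative insert of our own tuple. */
		assert(t->nrows < TOY_CAPACITY);
		size_t		mine = t->nrows;

		t->keys[mine] = key;
		t->vals[mine] = val;
		t->nrows++;

		/* Recheck: did anybody else get the same key in before us? */
		bool		conflict = false;

		for (size_t i = 0; i < mine; i++)
			if (t->keys[i] == key)
				conflict = true;

		if (!conflict)
			return TOY_INSERTED;	/* the tuple is deemed inserted */

		/* Delete the speculatively inserted tuple and make a new attempt. */
		t->nrows--;
	}
}
```

With `racer` set, the first attempt inserts speculatively, detects the competing row, backs out, and the second attempt takes the DO UPDATE or DO NOTHING path through the pre-check.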
To handle the possible ambiguity between the excluded alias and a table
named excluded, and for convenience with long relation names, INSERT
INTO now can alias its target table.
Bumps catversion as stored rules change.
Author: Peter Geoghegan, with significant contributions from Heikki
Linnakangas and Andres Freund. Testing infrastructure by Jeff Janes.
Reviewed-By: Heikki Linnakangas, Andres Freund, Robert Haas, Simon Riggs,
Dean Rasheed, Stephen Frost and many others.
2015-05-08 05:31:36 +02:00
|
|
|
{
|
|
|
|
ReleaseBuffer(buffer);
|
|
|
|
InstrCountFiltered1(&mtstate->ps, 1);
|
|
|
|
return true; /* done with the tuple */
|
|
|
|
}
|
|
|
|
|
|
|
|
if (resultRelInfo->ri_WithCheckOptions != NIL)
|
|
|
|
{
|
|
|
|
/*
|
|
|
|
* Check target's existing tuple against UPDATE-applicable USING
|
|
|
|
* security barrier quals (if any), enforced here as RLS checks/WCOs.
|
|
|
|
*
|
|
|
|
* The rewriter creates UPDATE RLS checks/WCOs for UPDATE security
|
|
|
|
* quals, and stores them as WCOs of "kind" WCO_RLS_CONFLICT_CHECK,
|
|
|
|
* but that's almost the extent of its special handling for ON
|
|
|
|
* CONFLICT DO UPDATE.
|
|
|
|
*
|
|
|
|
* The rewriter will also have associated UPDATE applicable straight
|
|
|
|
* RLS checks/WCOs for the benefit of the ExecUpdate() call that
|
|
|
|
* follows. INSERTs and UPDATEs naturally have mutually exclusive WCO
|
|
|
|
* kinds, so there is no danger of spurious over-enforcement in the
|
|
|
|
* INSERT or UPDATE path.
|
|
|
|
*/
|
|
|
|
ExecWithCheckOptions(WCO_RLS_CONFLICT_CHECK, resultRelInfo,
|
|
|
|
mtstate->mt_existing,
|
|
|
|
mtstate->ps.state);
|
|
|
|
}
|
|
|
|
|
|
|
|
/* Project the new tuple version */
|
2018-03-26 15:43:54 +02:00
|
|
|
ExecProject(resultRelInfo->ri_onConflict->oc_ProjInfo);
|
2015-05-08 05:31:36 +02:00
|
|
|
|
Fix ON CONFLICT UPDATE bug breaking AFTER UPDATE triggers.
ExecOnConflictUpdate() passed t_ctid of the to-be-updated tuple to
ExecUpdate(). That's problematic primarily because of two reasons: First
and foremost t_ctid could point to a different tuple. Secondly, and
that's what triggered the complaint by Stanislav, t_ctid is changed by
heap_update() to point to the new tuple version. The behavior of AFTER
UPDATE triggers was therefore broken, with NEW.* and OLD.* tuples
spuriously identical within AFTER UPDATE triggers.
To fix both issues, pass a pointer to t_self of an on-stack HeapTuple
instead.
Fixing this bug led to one change in regression tests, which previously
failed due to the first issue mentioned above. There's a reasonable
expectation that the test fails, as it updates one row repeatedly within one
INSERT ... ON CONFLICT statement. That is only possible if the second
update is triggered via ON CONFLICT ... SET, ON CONFLICT ... WHERE, or
by a WITH CHECK expression, as those are executed after
ExecOnConflictUpdate() does a visibility check. That could easily be
prohibited, but given it's allowed for plain UPDATEs and a rare corner
case, it doesn't seem worthwhile.
Reported-By: Stanislav Grozev
Author: Andres Freund and Peter Geoghegan
Discussion: CAA78GVqy1+LisN-8DygekD_Ldfy=BJLarSpjGhytOsgkpMavfQ@mail.gmail.com
Backpatch: 9.5, where ON CONFLICT was introduced
2015-12-10 16:26:45 +01:00
|
|
|
/*
|
|
|
|
* Note that it is possible that the target tuple has been modified in
|
|
|
|
* this session, after the above heap_lock_tuple. We choose to not error
|
2016-06-10 00:02:36 +02:00
|
|
|
* out in that case, in line with ExecUpdate's treatment of similar cases.
|
|
|
|
* This can happen if an UPDATE is triggered from within ExecQual(),
|
|
|
|
* ExecWithCheckOptions() or ExecProject() above, e.g. by selecting from a
|
|
|
|
* wCTE in the ON CONFLICT's SET.
|
2015-12-10 16:26:45 +01:00
|
|
|
*/
|
|
|
|
|
2015-05-08 05:31:36 +02:00
|
|
|
/* Execute UPDATE with projection */
|
2017-06-28 19:55:03 +02:00
|
|
|
*returning = ExecUpdate(mtstate, &tuple.t_self, NULL,
|
2015-05-08 05:31:36 +02:00
|
|
|
mtstate->mt_conflproj, planSlot,
|
|
|
|
&mtstate->mt_epqstate, mtstate->ps.state,
|
MERGE SQL Command following SQL:2016
MERGE performs actions that modify rows in the target table
using a source table or query. MERGE provides a single SQL
statement that can conditionally INSERT/UPDATE/DELETE rows,
a task that would otherwise require multiple PL statements.
e.g.
MERGE INTO target AS t
USING source AS s
ON t.tid = s.sid
WHEN MATCHED AND t.balance > s.delta THEN
UPDATE SET balance = t.balance - s.delta
WHEN MATCHED THEN
DELETE
WHEN NOT MATCHED AND s.delta > 0 THEN
INSERT VALUES (s.sid, s.delta)
WHEN NOT MATCHED THEN
DO NOTHING;
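The WHEN clauses of the example statement are evaluated in order for each source row, and the first one whose condition holds decides the action. A toy C model of that decision logic, assuming each source key appears at most once, might look as follows. All names (`ToyAccount`, `ToyDelta`, `ToyTarget`, `toy_merge`) are hypothetical and only mirror the columns `tid`, `balance`, `sid` and `delta` from the example; this is not how the executor implements MERGE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_ROWS 16

/* Hypothetical stand-ins for the "target" and "source" tables above. */
typedef struct ToyAccount
{
	int			tid;
	int			balance;
	bool		live;			/* false once the row has been deleted */
} ToyAccount;

typedef struct ToyDelta
{
	int			sid;
	int			delta;
} ToyDelta;

typedef struct ToyTarget
{
	ToyAccount	rows[TOY_MAX_ROWS];
	size_t		nrows;
} ToyTarget;

/* Find a live target row among the first "visible" rows, or return NULL. */
static ToyAccount *
toy_lookup(ToyTarget *t, size_t visible, int tid)
{
	for (size_t i = 0; i < visible; i++)
		if (t->rows[i].live && t->rows[i].tid == tid)
			return &t->rows[i];
	return NULL;
}

/* Apply the four WHEN clauses of the example, in order, per source row. */
static void
toy_merge(ToyTarget *t, const ToyDelta *src, size_t nsrc)
{
	/* Rows inserted by this call are not matched by later source rows. */
	size_t		visible = t->nrows;

	for (size_t i = 0; i < nsrc; i++)
	{
		ToyAccount *m = toy_lookup(t, visible, src[i].sid);

		if (m != NULL)
		{
			if (m->balance > src[i].delta)
				m->balance -= src[i].delta; /* WHEN MATCHED AND ... UPDATE */
			else
				m->live = false;	/* WHEN MATCHED THEN DELETE */
		}
		else if (src[i].delta > 0)
		{
			/* WHEN NOT MATCHED AND s.delta > 0 THEN INSERT */
			assert(t->nrows < TOY_MAX_ROWS);
			t->rows[t->nrows].tid = src[i].sid;
			t->rows[t->nrows].balance = src[i].delta;
			t->rows[t->nrows].live = true;
			t->nrows++;
		}
		/* otherwise: WHEN NOT MATCHED THEN DO NOTHING */
	}
}
```

For example, with target rows (1, 100) and (2, 5) and source rows (1, 30), (2, 10), (3, 7) and (4, -1), the model updates account 1 to a balance of 70, deletes account 2, inserts account 3 with a balance of 7, and ignores the last source row.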
MERGE works with regular and partitioned tables, including
column and row security enforcement, as well as support for
row, statement and transition triggers.
MERGE is optimized for OLTP and is parameterizable, though
also useful for large scale ETL/ELT. MERGE is not intended
to be used in preference to existing single SQL commands
for INSERT, UPDATE or DELETE since there is some overhead.
MERGE can be used statically from PL/pgSQL.
MERGE does not yet support inheritance, write rules,
RETURNING clauses, updatable views or foreign tables.
MERGE follows SQL Standard per the most recent SQL:2016.
Includes full tests and documentation, including full
isolation tests to demonstrate the concurrent behavior.
This version written from scratch in 2017 by Simon Riggs,
using docs and tests originally written in 2009. Later work
from Pavan Deolasee has been both complex and deep, leaving
the lead author credit now in his hands.
Extensive discussion of concurrency from Peter Geoghegan,
with thanks for the time and effort contributed.
Various issues reported via sqlsmith by Andreas Seltenreich
Authors: Pavan Deolasee, Simon Riggs
Reviewer: Peter Geoghegan, Amit Langote, Tomas Vondra, Simon Riggs
Discussion:
https://postgr.es/m/CANP8+jKitBSrB7oTgT9CY2i1ObfOt36z0XMraQc+Xrz8QB0nXA@mail.gmail.com
https://postgr.es/m/CAH2-WzkJdBuxj9PO=2QaO9-3h3xGbQPZ34kJH=HukRekwM-GZg@mail.gmail.com
2018-04-03 10:28:16 +02:00
|
|
|
NULL, NULL, NULL, canSetTag);
|
2015-05-08 05:31:36 +02:00
|
|
|
|
|
|
|
ReleaseBuffer(buffer);
|
|
|
|
return true;
|
|
|
|
}
|
|
|
|
|
2009-10-10 03:43:50 +02:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Process BEFORE EACH STATEMENT triggers
|
|
|
|
*/
|
|
|
|
static void
|
|
|
|
fireBSTriggers(ModifyTableState *node)
|
|
|
|
{
|
2018-03-19 22:09:43 +01:00
|
|
|
ModifyTable *plan = (ModifyTable *) node->ps.plan;
|
2017-05-17 22:31:56 +02:00
|
|
|
ResultRelInfo *resultRelInfo = node->resultRelInfo;
|
2017-05-01 14:23:01 +02:00
|
|
|
|
|
|
|
/*
|
|
|
|
* If the node modifies a partitioned table, we must fire its triggers.
|
|
|
|
* Note that in that case, node->resultRelInfo points to the first leaf
|
|
|
|
* partition, not the root table.
|
|
|
|
*/
|
|
|
|
if (node->rootResultRelInfo != NULL)
|
|
|
|
resultRelInfo = node->rootResultRelInfo;
|
|
|
|
|
2009-10-10 03:43:50 +02:00
|
|
|
switch (node->operation)
|
|
|
|
{
|
|
|
|
case CMD_INSERT:
|
2017-05-01 14:23:01 +02:00
|
|
|
ExecBSInsertTriggers(node->ps.state, resultRelInfo);
|
2018-03-19 22:09:43 +01:00
|
|
|
if (plan->onConflictAction == ONCONFLICT_UPDATE)
|
2015-05-08 05:31:36 +02:00
|
|
|
ExecBSUpdateTriggers(node->ps.state,
|
2017-05-01 14:23:01 +02:00
|
|
|
resultRelInfo);
|
2009-10-10 03:43:50 +02:00
|
|
|
break;
|
|
|
|
case CMD_UPDATE:
|
2017-05-01 14:23:01 +02:00
|
|
|
ExecBSUpdateTriggers(node->ps.state, resultRelInfo);
|
2009-10-10 03:43:50 +02:00
|
|
|
break;
|
|
|
|
case CMD_DELETE:
|
2017-05-01 14:23:01 +02:00
|
|
|
ExecBSDeleteTriggers(node->ps.state, resultRelInfo);
|
2009-10-10 03:43:50 +02:00
|
|
|
break;
|
2018-04-03 10:28:16 +02:00
|
|
|
case CMD_MERGE:
|
|
|
|
if (node->mt_merge_subcommands & MERGE_INSERT)
|
|
|
|
ExecBSInsertTriggers(node->ps.state, resultRelInfo);
|
|
|
|
if (node->mt_merge_subcommands & MERGE_UPDATE)
|
|
|
|
ExecBSUpdateTriggers(node->ps.state, resultRelInfo);
|
|
|
|
if (node->mt_merge_subcommands & MERGE_DELETE)
|
|
|
|
ExecBSDeleteTriggers(node->ps.state, resultRelInfo);
|
|
|
|
break;
|
2009-10-10 03:43:50 +02:00
|
|
|
default:
|
|
|
|
elog(ERROR, "unknown operation");
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
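The dispatch in fireBSTriggers() can be modelled outside the executor to show which statement-level trigger kinds fire for each operation. The sketch below is a simplified stand-in, not the real code: `ToyCmd`, `ToyFired`, `toy_fire_bs` and the `TOY_MERGE_*` bit values are hypothetical, and counters replace the actual ExecBS*Triggers() calls.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical command tags standing in for CMD_INSERT and friends. */
typedef enum ToyCmd
{
	TOY_CMD_INSERT,
	TOY_CMD_UPDATE,
	TOY_CMD_DELETE,
	TOY_CMD_MERGE
} ToyCmd;

/* Assumed bit values; the real MERGE_* flag values are not shown above. */
#define TOY_MERGE_INSERT 0x01
#define TOY_MERGE_UPDATE 0x02
#define TOY_MERGE_DELETE 0x04

/* How many times each kind of BEFORE STATEMENT trigger set was fired. */
typedef struct ToyFired
{
	int			insert;
	int			update;
	int			del;
} ToyFired;

static ToyFired
toy_fire_bs(ToyCmd op, bool on_conflict_update, int merge_subcommands)
{
	ToyFired	fired = {0, 0, 0};

	switch (op)
	{
		case TOY_CMD_INSERT:
			fired.insert++;
			/* INSERT ... ON CONFLICT DO UPDATE may also update rows. */
			if (on_conflict_update)
				fired.update++;
			break;
		case TOY_CMD_UPDATE:
			fired.update++;
			break;
		case TOY_CMD_DELETE:
			fired.del++;
			break;
		case TOY_CMD_MERGE:
			/* Fire only for the action kinds the statement contains. */
			if (merge_subcommands & TOY_MERGE_INSERT)
				fired.insert++;
			if (merge_subcommands & TOY_MERGE_UPDATE)
				fired.update++;
			if (merge_subcommands & TOY_MERGE_DELETE)
				fired.del++;
			break;
		default:
			assert(!"unknown operation");
			break;
	}
	return fired;
}
```

Calling `toy_fire_bs(TOY_CMD_INSERT, true, 0)` reports both an insert and an update firing, which mirrors the ONCONFLICT_UPDATE branch above.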
|
|
|
|
|
|
|
|
/*
|
Allow UPDATE to move rows between partitions.
When an UPDATE causes a row to no longer match the partition
constraint, try to move it to a different partition where it does
match the partition constraint. In essence, the UPDATE is split into
a DELETE from the old partition and an INSERT into the new one. This
can lead to surprising behavior in concurrency scenarios because
EvalPlanQual rechecks won't work as they normally did; the known
problems are documented. (There is a pending patch to improve the
situation further, but it needs more review.)
Amit Khandekar, reviewed and tested by Amit Langote, David Rowley,
Rajkumar Raghuwanshi, Dilip Kumar, Amul Sul, Thomas Munro, Álvaro
Herrera, Amit Kapila, and me. A few final revisions by me.
Discussion: http://postgr.es/m/CAJ3gD9do9o2ccQ7j7+tSgiE1REY65XRiMb=yJO3u3QhyP8EEPQ@mail.gmail.com
2018-01-19 21:33:06 +01:00
|
|
|
* Return the target rel ResultRelInfo.
|
|
|
|
*
|
|
|
|
* This relation is the same as:
|
|
|
|
* - the relation for which we will fire AFTER STATEMENT triggers.
|
|
|
|
* - the relation into whose tuple format all captured transition tuples must
|
|
|
|
* be converted.
|
|
|
|
* - the root partitioned table.
|
2009-10-10 03:43:50 +02:00
|
|
|
*/
|
2017-06-28 19:55:03 +02:00
|
|
|
static ResultRelInfo *
|
2018-01-19 21:33:06 +01:00
|
|
|
getTargetResultRelInfo(ModifyTableState *node)
|
2009-10-10 03:43:50 +02:00
|
|
|
{
|
2017-05-01 14:23:01 +02:00
|
|
|
/*
|
2018-01-19 21:33:06 +01:00
|
|
|
* Note that if the node modifies a partitioned table, node->resultRelInfo
|
|
|
|
* points to the first leaf partition, not the root table.
|
2017-05-01 14:23:01 +02:00
|
|
|
*/
|
|
|
|
if (node->rootResultRelInfo != NULL)
|
2017-06-28 19:55:03 +02:00
|
|
|
return node->rootResultRelInfo;
|
|
|
|
else
|
|
|
|
return node->resultRelInfo;
|
|
|
|
}
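The root-versus-leaf choice made by getTargetResultRelInfo() reduces to a NULL check. A minimal stand-alone model, assuming hypothetical `ToyRel` and `ToyNode` types in place of ResultRelInfo and ModifyTableState:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for ResultRelInfo and ModifyTableState. */
typedef struct ToyRel
{
	const char *name;
} ToyRel;

typedef struct ToyNode
{
	ToyRel	   *leaf;			/* plays the role of resultRelInfo */
	ToyRel	   *root;			/* plays the role of rootResultRelInfo */
} ToyNode;

/* Prefer the root partitioned table when there is one. */
static ToyRel *
toy_target_rel(const ToyNode *node)
{
	assert(node != NULL);
	return node->root != NULL ? node->root : node->leaf;
}
```

For a plain table `root` is NULL and the function falls back to `leaf`; for a partitioned table it returns the root, which is the relation whose statement triggers fire.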
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Process AFTER EACH STATEMENT triggers
|
|
|
|
*/
|
|
|
|
static void
|
|
|
|
fireASTriggers(ModifyTableState *node)
|
|
|
|
{
|
2018-03-19 22:09:43 +01:00
|
|
|
ModifyTable *plan = (ModifyTable *) node->ps.plan;
|
2018-01-19 21:33:06 +01:00
|
|
|
ResultRelInfo *resultRelInfo = getTargetResultRelInfo(node);
|
2017-05-01 14:23:01 +02:00
|
|
|
|
2009-10-10 03:43:50 +02:00
|
|
|
switch (node->operation)
|
|
|
|
{
|
|
|
|
case CMD_INSERT:
|
2018-03-19 22:09:43 +01:00
|
|
|
if (plan->onConflictAction == ONCONFLICT_UPDATE)
|
2015-05-08 05:31:36 +02:00
|
|
|
ExecASUpdateTriggers(node->ps.state,
|
2017-06-28 19:59:01 +02:00
|
|
|
resultRelInfo,
|
Fix SQL-spec incompatibilities in new transition table feature.
The standard says that all changes of the same kind (insert, update, or
delete) caused in one table by a single SQL statement should be reported
in a single transition table; and by that, they mean to include foreign key
enforcement actions cascading from the statement's direct effects. It's
also reasonable to conclude that if the standard had wCTEs, they would say
that effects of wCTEs applying to the same table as each other or the outer
statement should be merged into one transition table. We weren't doing it
like that.
Hence, arrange to merge tuples from multiple update actions into a single
transition table as much as we can. There is a problem, which is that if
the firing of FK enforcement triggers and after-row triggers with
transition tables is interspersed, we might need to report more tuples
after some triggers have already seen the transition table. It seems like
a bad idea for the transition table to be mutable between trigger calls.
There's no good way around this without a major redesign of the FK logic,
so for now, resolve it by opening a new transition table each time this
happens.
Also, ensure that AFTER STATEMENT triggers fire just once per statement,
or once per transition table when we're forced to make more than one.
Previous versions of Postgres have allowed each FK enforcement query
to cause an additional firing of the AFTER STATEMENT triggers for the
referencing table, but that's certainly not per spec. (We're still
doing multiple firings of BEFORE STATEMENT triggers, though; is that
something worth changing?)
Also, forbid using transition tables with column-specific UPDATE triggers.
The spec requires such transition tables to show only the tuples for which
the UPDATE trigger would have fired, which means maintaining multiple
transition tables or else somehow filtering the contents at readout.
Maybe someday we'll bother to support that option, but it looks like a
lot of trouble for a marginal feature.
The transition tables are now managed by the AfterTriggers data structures,
rather than being directly the responsibility of ModifyTable nodes. This
removes a subtransaction-lifespan memory leak introduced by my previous
band-aid patch 3c4359521.
In passing, refactor the AfterTriggers data structures to reduce the
management overhead for them, by using arrays of structs rather than
several parallel arrays for per-query-level and per-subtransaction state.
I failed to resist the temptation to do some copy-editing on the SGML
docs about triggers, above and beyond merely documenting the effects
of this patch.
Back-patch to v10, because we don't want the semantics of transition
tables to change post-release.
Patch by me, with help and review from Thomas Munro.
Discussion: https://postgr.es/m/20170909064853.25630.12825@wrigleys.postgresql.org
2017-09-16 19:20:32 +02:00
|
|
|
node->mt_oc_transition_capture);
|
2017-06-28 19:59:01 +02:00
|
|
|
ExecASInsertTriggers(node->ps.state, resultRelInfo,
|
|
|
|
node->mt_transition_capture);
|
2009-10-10 03:43:50 +02:00
|
|
|
break;
|
|
|
|
case CMD_UPDATE:
|
2017-06-28 19:59:01 +02:00
|
|
|
ExecASUpdateTriggers(node->ps.state, resultRelInfo,
|
|
|
|
node->mt_transition_capture);
|
2009-10-10 03:43:50 +02:00
|
|
|
break;
|
|
|
|
case CMD_DELETE:
|
2017-06-28 19:59:01 +02:00
|
|
|
ExecASDeleteTriggers(node->ps.state, resultRelInfo,
|
|
|
|
node->mt_transition_capture);
|
2009-10-10 03:43:50 +02:00
|
|
|
break;
|
MERGE SQL Command following SQL:2016
MERGE performs actions that modify rows in the target table
using a source table or query. MERGE provides a single SQL
statement that can conditionally INSERT/UPDATE/DELETE rows,
a task that would otherwise require multiple PL statements.
e.g.
MERGE INTO target AS t
USING source AS s
ON t.tid = s.sid
WHEN MATCHED AND t.balance > s.delta THEN
UPDATE SET balance = t.balance - s.delta
WHEN MATCHED THEN
DELETE
WHEN NOT MATCHED AND s.delta > 0 THEN
INSERT VALUES (s.sid, s.delta)
WHEN NOT MATCHED THEN
DO NOTHING;
MERGE works with regular and partitioned tables, including
column and row security enforcement, as well as support for
row, statement and transition triggers.
MERGE is optimized for OLTP and is parameterizable, though
also useful for large scale ETL/ELT. MERGE is not intended
to be used in preference to existing single SQL commands
for INSERT, UPDATE or DELETE since there is some overhead.
MERGE can be used statically from PL/pgSQL.
MERGE does not yet support inheritance, write rules,
RETURNING clauses, updatable views or foreign tables.
MERGE follows SQL Standard per the most recent SQL:2016.
Includes full tests and documentation, including full
isolation tests to demonstrate the concurrent behavior.
This version written from scratch in 2017 by Simon Riggs,
using docs and tests originally written in 2009. Later work
from Pavan Deolasee has been both complex and deep, leaving
the lead author credit now in his hands.
Extensive discussion of concurrency from Peter Geoghegan,
with thanks for the time and effort contributed.
Various issues reported via sqlsmith by Andreas Seltenreich
Authors: Pavan Deolasee, Simon Riggs
Reviewer: Peter Geoghegan, Amit Langote, Tomas Vondra, Simon Riggs
Discussion:
https://postgr.es/m/CANP8+jKitBSrB7oTgT9CY2i1ObfOt36z0XMraQc+Xrz8QB0nXA@mail.gmail.com
https://postgr.es/m/CAH2-WzkJdBuxj9PO=2QaO9-3h3xGbQPZ34kJH=HukRekwM-GZg@mail.gmail.com
2018-04-03 10:28:16 +02:00
|
|
|
case CMD_MERGE:
|
|
|
|
if (node->mt_merge_subcommands & MERGE_DELETE)
|
|
|
|
ExecASDeleteTriggers(node->ps.state, resultRelInfo,
|
|
|
|
node->mt_transition_capture);
|
|
|
|
if (node->mt_merge_subcommands & MERGE_UPDATE)
|
|
|
|
ExecASUpdateTriggers(node->ps.state, resultRelInfo,
|
|
|
|
node->mt_transition_capture);
|
|
|
|
if (node->mt_merge_subcommands & MERGE_INSERT)
|
|
|
|
ExecASInsertTriggers(node->ps.state, resultRelInfo,
|
|
|
|
node->mt_transition_capture);
|
|
|
|
break;
|
2009-10-10 03:43:50 +02:00
|
|
|
default:
|
|
|
|
elog(ERROR, "unknown operation");
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
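The CMD_MERGE branch above tests a bitmask and fires each statement-level trigger group at most once, in the order DELETE, UPDATE, INSERT. The following is a minimal standalone sketch of that dispatch pattern, not PostgreSQL code: the `SKETCH_MERGE_*` flag values and the `sketch_fire_merge_as_triggers` helper are hypothetical, and the "firing" is only recorded as a letter in a caller-supplied buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag values; the real MERGE_* definitions are not shown here. */
#define SKETCH_MERGE_INSERT 0x01
#define SKETCH_MERGE_UPDATE 0x02
#define SKETCH_MERGE_DELETE 0x04

/*
 * Record which statement-level trigger groups would fire for the given
 * mask, in the same order as the CMD_MERGE branch: DELETE, UPDATE, INSERT.
 * Each group is appended to buf as a single letter.  Returns the number of
 * groups fired.
 */
int
sketch_fire_merge_as_triggers(int subcommands, char *buf, size_t buflen)
{
	int			fired = 0;

	if (buflen == 0)
		return 0;
	buf[0] = '\0';

	if (subcommands & SKETCH_MERGE_DELETE)
	{
		strncat(buf, "D", buflen - strlen(buf) - 1);
		fired++;
	}
	if (subcommands & SKETCH_MERGE_UPDATE)
	{
		strncat(buf, "U", buflen - strlen(buf) - 1);
		fired++;
	}
	if (subcommands & SKETCH_MERGE_INSERT)
	{
		strncat(buf, "I", buflen - strlen(buf) - 1);
		fired++;
	}
	return fired;
}
```

A mask with only the insert and delete bits set would record two groups, with the delete group first, regardless of the order in which the WHEN clauses appeared in the statement.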
|
|
|
|
|
2017-06-28 19:55:03 +02:00
|
|
|
/*
|
|
|
|
* Set up the state needed for collecting transition tuples for AFTER
|
|
|
|
* triggers.
|
|
|
|
*/
|
|
|
|
static void
|
|
|
|
ExecSetupTransitionCaptureState(ModifyTableState *mtstate, EState *estate)
|
|
|
|
{
|
2018-03-19 22:09:43 +01:00
|
|
|
ModifyTable *plan = (ModifyTable *) mtstate->ps.plan;
|
Allow UPDATE to move rows between partitions.
When an UPDATE causes a row to no longer match the partition
constraint, try to move it to a different partition where it does
match the partition constraint. In essence, the UPDATE is split into
a DELETE from the old partition and an INSERT into the new one. This
can lead to surprising behavior in concurrency scenarios because
EvalPlanQual rechecks won't work as they normally did; the known
problems are documented. (There is a pending patch to improve the
situation further, but it needs more review.)
Amit Khandekar, reviewed and tested by Amit Langote, David Rowley,
Rajkumar Raghuwanshi, Dilip Kumar, Amul Sul, Thomas Munro, Álvaro
Herrera, Amit Kapila, and me. A few final revisions by me.
Discussion: http://postgr.es/m/CAJ3gD9do9o2ccQ7j7+tSgiE1REY65XRiMb=yJO3u3QhyP8EEPQ@mail.gmail.com
2018-01-19 21:33:06 +01:00
|
|
|
ResultRelInfo *targetRelInfo = getTargetResultRelInfo(mtstate);
|
2017-06-28 19:55:03 +02:00
|
|
|
|
|
|
|
/* Check for transition tables on the directly targeted relation. */
|
|
|
|
mtstate->mt_transition_capture =
|
Fix SQL-spec incompatibilities in new transition table feature.
The standard says that all changes of the same kind (insert, update, or
delete) caused in one table by a single SQL statement should be reported
in a single transition table; and by that, they mean to include foreign key
enforcement actions cascading from the statement's direct effects. It's
also reasonable to conclude that if the standard had wCTEs, they would say
that effects of wCTEs applying to the same table as each other or the outer
statement should be merged into one transition table. We weren't doing it
like that.
Hence, arrange to merge tuples from multiple update actions into a single
transition table as much as we can. There is a problem, which is that if
the firing of FK enforcement triggers and after-row triggers with
transition tables is interspersed, we might need to report more tuples
after some triggers have already seen the transition table. It seems like
a bad idea for the transition table to be mutable between trigger calls.
There's no good way around this without a major redesign of the FK logic,
so for now, resolve it by opening a new transition table each time this
happens.
Also, ensure that AFTER STATEMENT triggers fire just once per statement,
or once per transition table when we're forced to make more than one.
Previous versions of Postgres have allowed each FK enforcement query
to cause an additional firing of the AFTER STATEMENT triggers for the
referencing table, but that's certainly not per spec. (We're still
doing multiple firings of BEFORE STATEMENT triggers, though; is that
something worth changing?)
Also, forbid using transition tables with column-specific UPDATE triggers.
The spec requires such transition tables to show only the tuples for which
the UPDATE trigger would have fired, which means maintaining multiple
transition tables or else somehow filtering the contents at readout.
Maybe someday we'll bother to support that option, but it looks like a
lot of trouble for a marginal feature.
The transition tables are now managed by the AfterTriggers data structures,
rather than being directly the responsibility of ModifyTable nodes. This
removes a subtransaction-lifespan memory leak introduced by my previous
band-aid patch 3c4359521.
In passing, refactor the AfterTriggers data structures to reduce the
management overhead for them, by using arrays of structs rather than
several parallel arrays for per-query-level and per-subtransaction state.
I failed to resist the temptation to do some copy-editing on the SGML
docs about triggers, above and beyond merely documenting the effects
of this patch.
Back-patch to v10, because we don't want the semantics of transition
tables to change post-release.
Patch by me, with help and review from Thomas Munro.
Discussion: https://postgr.es/m/20170909064853.25630.12825@wrigleys.postgresql.org
2017-09-16 19:20:32 +02:00
|
|
|
MakeTransitionCaptureState(targetRelInfo->ri_TrigDesc,
|
|
|
|
RelationGetRelid(targetRelInfo->ri_RelationDesc),
|
|
|
|
mtstate->operation);
|
2018-03-19 22:09:43 +01:00
|
|
|
if (plan->operation == CMD_INSERT &&
|
|
|
|
plan->onConflictAction == ONCONFLICT_UPDATE)
|
2017-09-16 19:20:32 +02:00
|
|
|
mtstate->mt_oc_transition_capture =
|
|
|
|
MakeTransitionCaptureState(targetRelInfo->ri_TrigDesc,
|
|
|
|
RelationGetRelid(targetRelInfo->ri_RelationDesc),
|
|
|
|
CMD_UPDATE);
|
2017-06-28 19:55:03 +02:00
|
|
|
|
|
|
|
/*
|
|
|
|
* If we found that we need to collect transition tuples then we may also
|
|
|
|
* need tuple conversion maps for any children that have TupleDescs that
|
2017-09-16 19:20:32 +02:00
|
|
|
* aren't compatible with the tuplestores. (We can share these maps
|
|
|
|
* between the regular and ON CONFLICT cases.)
|
2017-06-28 19:55:03 +02:00
|
|
|
*/
|
2017-09-16 19:20:32 +02:00
|
|
|
if (mtstate->mt_transition_capture != NULL ||
|
|
|
|
mtstate->mt_oc_transition_capture != NULL)
|
2017-06-28 19:55:03 +02:00
|
|
|
{
|
2018-01-19 21:33:06 +01:00
|
|
|
ExecSetupChildParentMapForTcs(mtstate);
|
2017-06-28 19:55:03 +02:00
|
|
|
|
|
|
|
/*
|
2018-01-19 21:33:06 +01:00
|
|
|
* Install the conversion map for the first plan for UPDATE and DELETE
|
|
|
|
* operations. It will be advanced each time we switch to the next
|
|
|
|
* plan. (INSERT operations set it every time, so we need not update
|
|
|
|
* mtstate->mt_oc_transition_capture here.)
|
2017-06-28 19:55:03 +02:00
|
|
|
*/
|
2018-01-19 21:33:06 +01:00
|
|
|
if (mtstate->mt_transition_capture && mtstate->operation != CMD_INSERT)
|
|
|
|
mtstate->mt_transition_capture->tcs_map =
|
|
|
|
tupconv_map_for_subplan(mtstate, 0);
|
|
|
|
}
|
|
|
|
}
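The setup logic above can be summarized as: one capture state for the statement's own operation, a second UPDATE capture state only for INSERT ... ON CONFLICT DO UPDATE, and child-to-parent maps whenever either one exists. The sketch below models just that decision with booleans. All `Sketch*` and `SK_*` names are hypothetical, and the two `trig_wants_*` parameters are an assumption standing in for whether MakeTransitionCaptureState would return a non-NULL result for the given trigger descriptor.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { SK_CMD_INSERT, SK_CMD_UPDATE, SK_CMD_DELETE } SketchCmd;
typedef enum { SK_ONCONFLICT_NONE, SK_ONCONFLICT_NOTHING, SK_ONCONFLICT_UPDATE } SketchOnConflict;

typedef struct
{
	bool		has_primary;			/* models mt_transition_capture != NULL */
	bool		has_on_conflict;		/* models mt_oc_transition_capture != NULL */
	bool		need_child_parent_maps; /* models the ExecSetupChildParentMapForTcs call */
} SketchCaptureSetup;

/*
 * trig_wants_op:     assumed to mean "triggers need transition tuples for op".
 * trig_wants_update: assumed to mean "triggers need transition tuples for UPDATE".
 */
SketchCaptureSetup
sketch_setup_capture(SketchCmd op, SketchOnConflict oc,
					 bool trig_wants_op, bool trig_wants_update)
{
	SketchCaptureSetup result;

	/* Capture state for the directly targeted operation. */
	result.has_primary = trig_wants_op;

	/* Extra UPDATE capture state only for INSERT ... ON CONFLICT DO UPDATE. */
	result.has_on_conflict = (op == SK_CMD_INSERT &&
							  oc == SK_ONCONFLICT_UPDATE &&
							  trig_wants_update);

	/* Conversion maps are needed if either capture state exists. */
	result.need_child_parent_maps = result.has_primary || result.has_on_conflict;

	return result;
}
```

Under these assumptions, an upsert on a table that has only UPDATE transition tables still needs the conversion maps, even though the primary capture state is absent.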
|
2017-10-12 22:50:53 +02:00
|
|
|
|
2018-03-19 21:43:57 +01:00
|
|
|
/*
|
|
|
|
* ExecPrepareTupleRouting --- prepare for routing one tuple
|
|
|
|
*
|
|
|
|
* Determine the partition in which the tuple in slot is to be inserted,
|
|
|
|
* and modify mtstate and estate to prepare for it.
|
|
|
|
*
|
|
|
|
* Caller must revert the estate changes after executing the insertion!
|
|
|
|
* In mtstate, transition capture changes may also need to be reverted.
|
|
|
|
*
|
|
|
|
* Returns a slot holding the tuple of the partition rowtype.
|
|
|
|
*/
|
2018-04-03 10:28:16 +02:00
|
|
|
TupleTableSlot *
|
2018-03-19 21:43:57 +01:00
|
|
|
ExecPrepareTupleRouting(ModifyTableState *mtstate,
|
|
|
|
EState *estate,
|
|
|
|
PartitionTupleRouting *proute,
|
|
|
|
ResultRelInfo *targetRelInfo,
|
|
|
|
TupleTableSlot *slot)
|
|
|
|
{
|
2018-03-26 15:43:54 +02:00
|
|
|
ModifyTable *node;
|
2018-03-19 21:43:57 +01:00
|
|
|
int partidx;
|
|
|
|
ResultRelInfo *partrel;
|
|
|
|
HeapTuple tuple;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Determine the target partition. If ExecFindPartition does not find
|
|
|
|
* a partition after all, it doesn't return here; otherwise, the returned
|
|
|
|
* value is to be used as an index into the arrays for the ResultRelInfo
|
|
|
|
* and TupleConversionMap for the partition.
|
|
|
|
*/
|
|
|
|
partidx = ExecFindPartition(targetRelInfo,
|
|
|
|
proute->partition_dispatch_info,
|
|
|
|
slot,
|
|
|
|
estate);
|
|
|
|
Assert(partidx >= 0 && partidx < proute->num_partitions);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Get the ResultRelInfo corresponding to the selected partition; if not
|
|
|
|
* yet there, initialize it.
|
|
|
|
*/
|
|
|
|
partrel = proute->partitions[partidx];
|
|
|
|
if (partrel == NULL)
|
|
|
|
partrel = ExecInitPartitionInfo(mtstate, targetRelInfo,
|
|
|
|
proute, estate,
|
|
|
|
partidx);
|
|
|
|
|
|
|
|
/* We do not yet have a way to insert into a foreign partition */
|
|
|
|
if (partrel->ri_FdwRoutine)
|
|
|
|
ereport(ERROR,
|
|
|
|
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
|
|
|
|
errmsg("cannot route inserted tuples to a foreign table")));
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Make it look like we are inserting into the partition.
|
|
|
|
*/
|
|
|
|
estate->es_result_relation_info = partrel;
|
|
|
|
|
|
|
|
/* Get the heap tuple out of the given slot. */
|
|
|
|
tuple = ExecMaterializeSlot(slot);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* If we're capturing transition tuples, we might need to convert from the
|
|
|
|
* partition rowtype to parent rowtype.
|
|
|
|
*/
|
|
|
|
if (mtstate->mt_transition_capture != NULL)
|
|
|
|
{
|
|
|
|
if (partrel->ri_TrigDesc &&
|
|
|
|
partrel->ri_TrigDesc->trig_insert_before_row)
|
|
|
|
{
|
|
|
|
/*
|
|
|
|
* If there are any BEFORE triggers on the partition, we'll have
|
|
|
|
* to be ready to convert their result back to tuplestore format.
|
|
|
|
*/
|
|
|
|
mtstate->mt_transition_capture->tcs_original_insert_tuple = NULL;
|
|
|
|
mtstate->mt_transition_capture->tcs_map =
|
|
|
|
TupConvMapForLeaf(proute, targetRelInfo, partidx);
|
|
|
|
}
|
|
|
|
else
|
|
|
|
{
|
|
|
|
/*
|
|
|
|
* Otherwise, just remember the original unconverted tuple, to
|
|
|
|
* avoid a needless round trip conversion.
|
|
|
|
*/
|
|
|
|
mtstate->mt_transition_capture->tcs_original_insert_tuple = tuple;
|
|
|
|
mtstate->mt_transition_capture->tcs_map = NULL;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
if (mtstate->mt_oc_transition_capture != NULL)
|
|
|
|
{
|
|
|
|
mtstate->mt_oc_transition_capture->tcs_map =
|
|
|
|
TupConvMapForLeaf(proute, targetRelInfo, partidx);
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Convert the tuple, if necessary.
|
|
|
|
*/
|
|
|
|
ConvertPartitionTupleSlot(proute->parent_child_tupconv_maps[partidx],
|
|
|
|
tuple,
|
|
|
|
proute->partition_tuple_slot,
|
|
|
|
&slot);
|
|
|
|
|
2018-03-26 15:43:54 +02:00
|
|
|
/* Initialize information needed to handle ON CONFLICT DO UPDATE. */
|
|
|
|
Assert(mtstate != NULL);
|
|
|
|
node = (ModifyTable *) mtstate->ps.plan;
|
|
|
|
if (node->onConflictAction == ONCONFLICT_UPDATE)
|
|
|
|
{
|
|
|
|
Assert(mtstate->mt_existing != NULL);
|
|
|
|
ExecSetSlotDescriptor(mtstate->mt_existing,
|
|
|
|
RelationGetDescr(partrel->ri_RelationDesc));
|
|
|
|
Assert(mtstate->mt_conflproj != NULL);
|
|
|
|
ExecSetSlotDescriptor(mtstate->mt_conflproj,
|
|
|
|
partrel->ri_onConflict->oc_ProjTupdesc);
|
|
|
|
}
|
|
|
|
|
2018-03-19 21:43:57 +01:00
|
|
|
return slot;
|
|
|
|
}
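ExecPrepareTupleRouting looks up the per-partition ResultRelInfo by index and initializes it only when the slot is still NULL. The sketch below shows that lazy-initialization pattern in isolation. The `SketchRouting` and `SketchPartInfo` types and the `sketch_get_partition` helper are hypothetical, and it is an assumption of this sketch that the initializer stores its result back into the array so that later lookups reuse it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_NUM_PARTITIONS 4

/* Hypothetical stand-in for a per-partition ResultRelInfo. */
typedef struct
{
	int			partidx;
} SketchPartInfo;

/* Hypothetical stand-in for PartitionTupleRouting. */
typedef struct
{
	SketchPartInfo *partitions[SKETCH_NUM_PARTITIONS];
	int			num_initialized;	/* counts how many slots were built */
} SketchRouting;

/*
 * Return the info for the selected partition, building it on first use.
 * Assumption: the freshly built entry is cached in the array.
 */
SketchPartInfo *
sketch_get_partition(SketchRouting *proute, int partidx)
{
	SketchPartInfo *partrel;

	assert(partidx >= 0 && partidx < SKETCH_NUM_PARTITIONS);

	partrel = proute->partitions[partidx];
	if (partrel == NULL)
	{
		partrel = malloc(sizeof(SketchPartInfo));
		if (partrel == NULL)
			abort();
		partrel->partidx = partidx;
		proute->partitions[partidx] = partrel;
		proute->num_initialized++;
	}
	return partrel;
}
```

Routing two tuples to the same partition would build its entry once and return the same pointer both times, so partitions that never receive a tuple cost nothing.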
|
|
|
|
|
2018-01-19 21:33:06 +01:00
|
|
|
/*
|
|
|
|
* Initialize the child-to-root tuple conversion map array for UPDATE subplans.
|
|
|
|
*
|
|
|
|
* This map array is required to convert the tuple from the subplan result rel
|
|
|
|
* to the target table descriptor. This requirement arises for two independent
|
|
|
|
* scenarios:
|
|
|
|
* 1. For update-tuple-routing.
|
|
|
|
* 2. For capturing tuples in transition tables.
|
|
|
|
*/
|
2018-01-25 20:32:28 +01:00
|
|
|
static void
|
2018-01-19 21:33:06 +01:00
|
|
|
ExecSetupChildParentMapForSubplan(ModifyTableState *mtstate)
|
|
|
|
{
|
|
|
|
ResultRelInfo *targetRelInfo = getTargetResultRelInfo(mtstate);
|
|
|
|
ResultRelInfo *resultRelInfos = mtstate->resultRelInfo;
|
|
|
|
TupleDesc outdesc;
|
|
|
|
int numResultRelInfos = mtstate->mt_nplans;
|
|
|
|
int i;
|
2017-10-12 22:50:53 +02:00
|
|
|
|
2018-01-19 21:33:06 +01:00
|
|
|
/*
|
|
|
|
* First check if there is already a per-subplan array allocated. Even if
|
|
|
|
* there is already a per-leaf map array, we won't require a per-subplan
|
|
|
|
* one, since we will use the subplan offset array to convert the subplan
|
|
|
|
* index to per-leaf index.
|
|
|
|
*/
|
|
|
|
if (mtstate->mt_per_subplan_tupconv_maps ||
|
|
|
|
(mtstate->mt_partition_tuple_routing &&
|
|
|
|
mtstate->mt_partition_tuple_routing->child_parent_tupconv_maps))
|
|
|
|
return;
|
2017-10-12 22:50:53 +02:00
|
|
|
|
2018-01-19 21:33:06 +01:00
|
|
|
/*
|
|
|
|
* Build array of conversion maps from each child's TupleDesc to the one
|
|
|
|
* used in the target relation. The map pointers may be NULL when no
|
|
|
|
* conversion is necessary, which is hopefully a common case.
|
|
|
|
*/
|
2017-06-28 19:55:03 +02:00
|
|
|
|
2018-01-19 21:33:06 +01:00
|
|
|
/* Get tuple descriptor of the target rel. */
|
|
|
|
outdesc = RelationGetDescr(targetRelInfo->ri_RelationDesc);
|
|
|
|
|
|
|
|
mtstate->mt_per_subplan_tupconv_maps = (TupleConversionMap **)
|
|
|
|
palloc(sizeof(TupleConversionMap *) * numResultRelInfos);
|
|
|
|
|
|
|
|
for (i = 0; i < numResultRelInfos; ++i)
|
|
|
|
{
|
|
|
|
mtstate->mt_per_subplan_tupconv_maps[i] =
|
|
|
|
convert_tuples_by_name(RelationGetDescr(resultRelInfos[i].ri_RelationDesc),
|
|
|
|
outdesc,
|
|
|
|
gettext_noop("could not convert row type"));
|
|
|
|
}
|
|
|
|
}
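The loop above builds one conversion map per result relation and leaves the entry NULL when the child's row type already matches the target's. A minimal sketch of that convention follows. It uses hypothetical stand-in types (ToyDesc, ToyMap) and helpers (toy_convert_by_name, toy_build_maps) that are not part of PostgreSQL, and it matches columns by name only, ignoring types and dropped columns.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a TupleDesc: just an ordered list of column names. */
typedef struct ToyDesc
{
	int			natts;
	const char **names;
} ToyDesc;

/* Hypothetical stand-in for a TupleConversionMap:
 * attr_map[i] is the child column that feeds parent column i. */
typedef struct ToyMap
{
	int			natts;
	int		   *attr_map;
} ToyMap;

/* Return NULL when no conversion is needed, else a newly allocated map. */
static ToyMap *
toy_convert_by_name(const ToyDesc *child, const ToyDesc *parent)
{
	int			identical = (child->natts == parent->natts);
	/* +1 avoids a zero-sized allocation for an empty descriptor */
	int		   *attr_map = malloc(sizeof(int) * (size_t) (parent->natts + 1));
	ToyMap	   *map;

	if (attr_map == NULL)
		abort();

	for (int i = 0; i < parent->natts; i++)
	{
		attr_map[i] = -1;
		for (int j = 0; j < child->natts; j++)
		{
			if (strcmp(parent->names[i], child->names[j]) == 0)
			{
				attr_map[i] = j;
				break;
			}
		}
		if (attr_map[i] < 0)
			abort();			/* parent column missing in child */
		if (attr_map[i] != i)
			identical = 0;
	}

	if (identical)
	{
		free(attr_map);
		return NULL;			/* common case: same layout, no map */
	}

	map = malloc(sizeof(ToyMap));
	if (map == NULL)
		abort();
	map->natts = parent->natts;
	map->attr_map = attr_map;
	return map;
}

/* Build one map per child, indexed the same way as the children array. */
static ToyMap **
toy_build_maps(const ToyDesc *children, int nchildren, const ToyDesc *parent)
{
	ToyMap	  **maps = malloc(sizeof(ToyMap *) * (size_t) (nchildren + 1));

	if (maps == NULL)
		abort();
	for (int i = 0; i < nchildren; i++)
		maps[i] = toy_convert_by_name(&children[i], parent);
	return maps;
}
```

A caller of such an array has to test each entry for NULL before converting, which is why callers of the real map array treat a NULL map as "use the tuple as is".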
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Initialize the child-to-root tuple conversion map array required for
|
|
|
|
* capturing transition tuples.
|
|
|
|
*
|
|
|
|
* The map array can be indexed either by subplan index or by leaf-partition
|
|
|
|
* index. Transition tables need subplan-indexed access to the map, and
|
|
|
|
* where tuple routing is present, we also require leaf-indexed access.
|
|
|
|
*/
|
|
|
|
static void
|
|
|
|
ExecSetupChildParentMapForTcs(ModifyTableState *mtstate)
|
|
|
|
{
|
|
|
|
PartitionTupleRouting *proute = mtstate->mt_partition_tuple_routing;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* If partition tuple routing is set up, we will require partition-indexed
|
|
|
|
* access. In that case, create the map array indexed by partition; we
|
|
|
|
* will still be able to access the maps using a subplan index by
|
|
|
|
* converting the subplan index to a partition index using
|
|
|
|
* subplan_partition_offsets. If tuple routing is not set up, it means we
|
|
|
|
* don't require partition-indexed access. In that case, create just a
|
|
|
|
* subplan-indexed map.
|
|
|
|
*/
|
|
|
|
if (proute)
|
|
|
|
{
|
2017-06-28 19:55:03 +02:00
|
|
|
/*
|
2018-01-19 21:33:06 +01:00
|
|
|
* If a partition-indexed map array is to be created, the subplan map
|
|
|
|
* array has to be NULL. If the subplan map array is already created,
|
|
|
|
* we won't be able to access the map using a partition index.
|
2017-06-28 19:55:03 +02:00
|
|
|
*/
|
2018-01-19 21:33:06 +01:00
|
|
|
Assert(mtstate->mt_per_subplan_tupconv_maps == NULL);
|
|
|
|
|
|
|
|
ExecSetupChildParentMapForLeaf(proute);
|
|
|
|
}
|
|
|
|
else
|
|
|
|
ExecSetupChildParentMapForSubplan(mtstate);
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* For a given subplan index, get the tuple conversion map.
|
|
|
|
*/
|
|
|
|
static TupleConversionMap *
|
|
|
|
tupconv_map_for_subplan(ModifyTableState *mtstate, int whichplan)
|
|
|
|
{
|
|
|
|
/*
|
|
|
|
* If a partition-indexed tuple conversion map array is allocated, we must
|
|
|
|
* first translate the subplan index into an index into that array.
|
|
|
|
* Exactly *one* of the two arrays is allocated: when a partition-indexed
|
|
|
|
* array is required, a subplan-indexed array is unnecessary because a
|
|
|
|
* subplan index can be translated into a partition index, and a
|
|
|
|
* subplan-indexed array is created *only* if no partition-indexed array
|
|
|
|
* is required.
|
|
|
|
*/
|
|
|
|
if (mtstate->mt_per_subplan_tupconv_maps == NULL)
|
|
|
|
{
|
|
|
|
int leaf_index;
|
|
|
|
PartitionTupleRouting *proute = mtstate->mt_partition_tuple_routing;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* If the subplan-indexed array is NULL, tuple routing must have been set
|
|
|
|
* up so that the subplan index can be converted to a partition index.
|
|
|
|
*/
|
2018-01-24 22:34:51 +01:00
|
|
|
Assert(proute && proute->subplan_partition_offsets != NULL &&
|
|
|
|
whichplan < proute->num_subplan_partition_offsets);
|
2018-01-19 21:33:06 +01:00
|
|
|
|
|
|
|
leaf_index = proute->subplan_partition_offsets[whichplan];
|
|
|
|
|
|
|
|
return TupConvMapForLeaf(proute, getTargetResultRelInfo(mtstate),
|
|
|
|
leaf_index);
|
|
|
|
}
|
|
|
|
else
|
|
|
|
{
|
|
|
|
Assert(whichplan >= 0 && whichplan < mtstate->mt_nplans);
|
|
|
|
return mtstate->mt_per_subplan_tupconv_maps[whichplan];
|
2017-06-28 19:55:03 +02:00
|
|
|
}
|
|
|
|
}
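The lookup in tupconv_map_for_subplan relies on exactly one of the two map arrays being allocated. A minimal sketch of that dispatch follows. It uses hypothetical stand-in types (ToyRouting, ToyState) and a hypothetical helper (toy_map_for_subplan), with strings standing in for conversion maps. It assumes the leaf-indexed maps already exist and does not model any work that TupConvMapForLeaf itself may do.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for PartitionTupleRouting. */
typedef struct ToyRouting
{
	int			num_offsets;
	const int  *subplan_partition_offsets;	/* subplan index -> leaf index */
	const char **leaf_maps;		/* leaf-indexed maps (strings as placeholders) */
} ToyRouting;

/* Hypothetical stand-in for ModifyTableState. */
typedef struct ToyState
{
	int			nplans;
	const char **per_subplan_maps;	/* NULL when leaf-indexed maps are used */
	ToyRouting *routing;		/* NULL when there is no tuple routing */
} ToyState;

/* Fetch the map for a subplan, translating to a leaf index when needed. */
static const char *
toy_map_for_subplan(const ToyState *st, int whichplan)
{
	if (st->per_subplan_maps == NULL)
	{
		const ToyRouting *r = st->routing;

		/* No subplan-indexed array, so the offsets array must exist. */
		assert(r != NULL && r->subplan_partition_offsets != NULL &&
			   whichplan >= 0 && whichplan < r->num_offsets);
		return r->leaf_maps[r->subplan_partition_offsets[whichplan]];
	}

	assert(whichplan >= 0 && whichplan < st->nplans);
	return st->per_subplan_maps[whichplan];
}
```

Keeping a single array avoids maintaining two copies of the same maps; the cost is one extra indirection through the offsets array on the tuple-routing path.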
|
2009-10-10 03:43:50 +02:00
|
|
|
|
|
|
|
/* ----------------------------------------------------------------
|
|
|
|
* ExecModifyTable
|
|
|
|
*
|
|
|
|
* Perform table modifications as required, and return RETURNING results
|
|
|
|
* if needed.
|
|
|
|
* ----------------------------------------------------------------
|
|
|
|
*/
|
2017-07-17 09:33:49 +02:00
|
|
|
static TupleTableSlot *
|
|
|
|
ExecModifyTable(PlanState *pstate)
|
2009-10-10 03:43:50 +02:00
|
|
|
{
|
2017-07-17 09:33:49 +02:00
|
|
|
ModifyTableState *node = castNode(ModifyTableState, pstate);
|
2018-03-19 21:43:57 +01:00
|
|
|
PartitionTupleRouting *proute = node->mt_partition_tuple_routing;
|
2010-02-26 03:01:40 +01:00
|
|
|
EState *estate = node->ps.state;
|
|
|
|
CmdType operation = node->operation;
|
2011-02-26 00:56:23 +01:00
|
|
|
ResultRelInfo *saved_resultRelInfo;
|
|
|
|
ResultRelInfo *resultRelInfo;
|
2010-02-26 03:01:40 +01:00
|
|
|
PlanState *subplanstate;
|
2009-10-10 03:43:50 +02:00
|
|
|
JunkFilter *junkfilter;
|
|
|
|
TupleTableSlot *slot;
|
|
|
|
TupleTableSlot *planSlot;
|
Fix creation of resjunk tlist entries for inherited mixed UPDATE/DELETE.
rewriteTargetListUD's processing is dependent on the relkind of the query's
target table. That was fine at the time it was made to act that way, even
for queries on inheritance trees, because all tables in an inheritance tree
would necessarily be plain tables. However, the 9.5 feature addition
allowing some members of an inheritance tree to be foreign tables broke the
assumption that rewriteTargetListUD's output tlist could be applied to all
child tables with nothing more than column-number mapping. This led to
visible failures if foreign child tables had row-level triggers, and would
also break in cases where child tables belonged to FDWs that used methods
other than CTID for row identification.
To fix, delay running rewriteTargetListUD until after the planner has
expanded inheritance, so that it is applied separately to the (already
mapped) tlist for each child table. We can conveniently call it from
preprocess_targetlist. Refactor associated code slightly to avoid the
need to heap_open the target relation multiple times during
preprocess_targetlist. (The APIs remain a bit ugly, particularly around
the point of which steps scribble on parse->targetList and which don't.
But avoiding such scribbling would require a change in FDW callback APIs,
which is more pain than it's worth.)
Also fix ExecModifyTable to ensure that "tupleid" is reset to NULL when
we transition from rows providing a CTID to rows that don't. (That's
really an independent bug, but it manifests in much the same cases.)
Add a regression test checking one manifestation of this problem, which
was that row-level triggers on a foreign child table did not work right.
Back-patch to 9.5 where the problem was introduced.
Etsuro Fujita, reviewed by Ildus Kurbangaliev and Ashutosh Bapat
Discussion: https://postgr.es/m/20170514150525.0346ba72@postgrespro.ru
2017-11-27 23:53:56 +01:00
|
|
|
ItemPointer tupleid;
|
2009-10-10 03:43:50 +02:00
|
|
|
ItemPointerData tuple_ctid;
|
2014-03-23 07:16:34 +01:00
|
|
|
HeapTupleData oldtupdata;
|
|
|
|
HeapTuple oldtuple;
|
2009-10-10 03:43:50 +02:00
|
|
|
|
2017-07-26 02:37:17 +02:00
|
|
|
CHECK_FOR_INTERRUPTS();
|
|
|
|
|
2012-01-28 23:43:57 +01:00
|
|
|
/*
|
|
|
|
* This should NOT get called during EvalPlanQual; we should have passed a
|
|
|
|
* subplan tree to EvalPlanQual, instead. Use a runtime test, not just an
|
|
|
|
* Assert, because this condition is easy to miss in testing. (Note:
|
|
|
|
* although ModifyTable should not get executed within an EvalPlanQual
|
|
|
|
* operation, we do have to allow it to be initialized and shut down in
|
|
|
|
* case it is within a CTE subplan. Hence this test must be here, not in
|
|
|
|
* ExecInitModifyTable.)
|
|
|
|
*/
|
|
|
|
if (estate->es_epqTuple != NULL)
|
|
|
|
elog(ERROR, "ModifyTable should not be called during EvalPlanQual");
|
|
|
|
|
2011-02-26 00:56:23 +01:00
|
|
|
/*
|
|
|
|
* If we've already completed processing, don't try to do more. We need
|
|
|
|
* this test because ExecPostprocessPlan might call us an extra time, and
|
|
|
|
* our subplan's nodes aren't necessarily robust against being called
|
|
|
|
* extra times.
|
|
|
|
*/
|
|
|
|
if (node->mt_done)
|
|
|
|
return NULL;
|
|
|
|
|
2009-10-10 03:43:50 +02:00
|
|
|
/*
|
|
|
|
* On first call, fire BEFORE STATEMENT triggers before proceeding.
|
|
|
|
*/
|
|
|
|
if (node->fireBSTriggers)
|
|
|
|
{
|
|
|
|
fireBSTriggers(node);
|
|
|
|
node->fireBSTriggers = false;
|
|
|
|
}
|
|
|
|
|
2011-02-26 00:56:23 +01:00
|
|
|
/* Preload local variables */
|
|
|
|
resultRelInfo = node->resultRelInfo + node->mt_whichplan;
|
|
|
|
subplanstate = node->mt_plans[node->mt_whichplan];
|
|
|
|
junkfilter = resultRelInfo->ri_junkFilter;
|
|
|
|
|
2009-10-10 03:43:50 +02:00
|
|
|
/*
|
|
|
|
* es_result_relation_info must point to the currently active result
|
2014-05-06 18:12:18 +02:00
|
|
|
* relation while we are within this ModifyTable node. Even though
|
2011-02-26 00:56:23 +01:00
|
|
|
* ModifyTable nodes can't be nested statically, they can be nested
|
|
|
|
* dynamically (since our subplan could include a reference to a modifying
|
|
|
|
* CTE). So we have to save and restore the caller's value.
|
2009-10-10 03:43:50 +02:00
|
|
|
*/
|
2011-02-26 00:56:23 +01:00
|
|
|
saved_resultRelInfo = estate->es_result_relation_info;
|
2009-10-10 03:43:50 +02:00
|
|
|
|
2011-02-26 00:56:23 +01:00
|
|
|
estate->es_result_relation_info = resultRelInfo;
|
2009-10-10 03:43:50 +02:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Fetch rows from subplan(s), and execute the required table modification
|
|
|
|
* for each row.
|
|
|
|
*/
|
|
|
|
for (;;)
|
|
|
|
{
|
2010-08-18 23:52:24 +02:00
|
|
|
/*
|
2014-05-06 18:12:18 +02:00
|
|
|
* Reset the per-output-tuple exprcontext. This is needed because
|
2010-08-18 23:52:24 +02:00
|
|
|
* triggers expect to use that context as workspace. It's a bit ugly
|
|
|
|
* to do this below the top level of the plan, however. We might need
|
|
|
|
* to rethink this later.
|
|
|
|
*/
|
|
|
|
ResetPerTupleExprContext(estate);
|
|
|
|
|
2009-10-10 03:43:50 +02:00
|
|
|
planSlot = ExecProcNode(subplanstate);
|
|
|
|
|
|
|
|
if (TupIsNull(planSlot))
|
|
|
|
{
|
|
|
|
/* advance to next subplan if any */
|
|
|
|
node->mt_whichplan++;
|
MERGE SQL Command following SQL:2016
MERGE performs actions that modify rows in the target table
using a source table or query. MERGE provides a single SQL
statement that can conditionally INSERT/UPDATE/DELETE rows,
a task that would otherwise require multiple PL statements.
e.g.
MERGE INTO target AS t
USING source AS s
ON t.tid = s.sid
WHEN MATCHED AND t.balance > s.delta THEN
UPDATE SET balance = t.balance - s.delta
WHEN MATCHED THEN
DELETE
WHEN NOT MATCHED AND s.delta > 0 THEN
INSERT VALUES (s.sid, s.delta)
WHEN NOT MATCHED THEN
DO NOTHING;
MERGE works with regular and partitioned tables, including
column and row security enforcement, as well as support for
row, statement and transition triggers.
MERGE is optimized for OLTP and is parameterizable, though
also useful for large scale ETL/ELT. MERGE is not intended
to be used in preference to existing single SQL commands
for INSERT, UPDATE or DELETE since there is some overhead.
MERGE can be used statically from PL/pgSQL.
MERGE does not yet support inheritance, write rules,
RETURNING clauses, updatable views or foreign tables.
MERGE follows SQL Standard per the most recent SQL:2016.
Includes full tests and documentation, including full
isolation tests to demonstrate the concurrent behavior.
This version written from scratch in 2017 by Simon Riggs,
using docs and tests originally written in 2009. Later work
from Pavan Deolasee has been both complex and deep, leaving
the lead author credit now in his hands.
Extensive discussion of concurrency from Peter Geoghegan,
with thanks for the time and effort contributed.
Various issues reported via sqlsmith by Andreas Seltenreich
Authors: Pavan Deolasee, Simon Riggs
Reviewer: Peter Geoghegan, Amit Langote, Tomas Vondra, Simon Riggs
Discussion:
https://postgr.es/m/CANP8+jKitBSrB7oTgT9CY2i1ObfOt36z0XMraQc+Xrz8QB0nXA@mail.gmail.com
https://postgr.es/m/CAH2-WzkJdBuxj9PO=2QaO9-3h3xGbQPZ34kJH=HukRekwM-GZg@mail.gmail.com
2018-04-03 10:28:16 +02:00
|
|
|
|
2009-10-10 03:43:50 +02:00
|
|
|
if (node->mt_whichplan < node->mt_nplans)
|
|
|
|
{
|
2011-02-26 00:56:23 +01:00
|
|
|
resultRelInfo++;
|
2009-10-10 03:43:50 +02:00
|
|
|
subplanstate = node->mt_plans[node->mt_whichplan];
|
2011-02-26 00:56:23 +01:00
|
|
|
junkfilter = resultRelInfo->ri_junkFilter;
|
|
|
|
estate->es_result_relation_info = resultRelInfo;
|
2011-01-13 02:47:02 +01:00
|
|
|
EvalPlanQualSetPlan(&node->mt_epqstate, subplanstate->plan,
|
|
|
|
node->mt_arowmarks[node->mt_whichplan]);
|
Fix SQL-spec incompatibilities in new transition table feature.
The standard says that all changes of the same kind (insert, update, or
delete) caused in one table by a single SQL statement should be reported
in a single transition table; and by that, they mean to include foreign key
enforcement actions cascading from the statement's direct effects. It's
also reasonable to conclude that if the standard had wCTEs, they would say
that effects of wCTEs applying to the same table as each other or the outer
statement should be merged into one transition table. We weren't doing it
like that.
Hence, arrange to merge tuples from multiple update actions into a single
transition table as much as we can. There is a problem, which is that if
the firing of FK enforcement triggers and after-row triggers with
transition tables is interspersed, we might need to report more tuples
after some triggers have already seen the transition table. It seems like
a bad idea for the transition table to be mutable between trigger calls.
There's no good way around this without a major redesign of the FK logic,
so for now, resolve it by opening a new transition table each time this
happens.
Also, ensure that AFTER STATEMENT triggers fire just once per statement,
or once per transition table when we're forced to make more than one.
Previous versions of Postgres have allowed each FK enforcement query
to cause an additional firing of the AFTER STATEMENT triggers for the
referencing table, but that's certainly not per spec. (We're still
doing multiple firings of BEFORE STATEMENT triggers, though; is that
something worth changing?)
Also, forbid using transition tables with column-specific UPDATE triggers.
The spec requires such transition tables to show only the tuples for which
the UPDATE trigger would have fired, which means maintaining multiple
transition tables or else somehow filtering the contents at readout.
Maybe someday we'll bother to support that option, but it looks like a
lot of trouble for a marginal feature.
The transition tables are now managed by the AfterTriggers data structures,
rather than being directly the responsibility of ModifyTable nodes. This
removes a subtransaction-lifespan memory leak introduced by my previous
band-aid patch 3c4359521.
In passing, refactor the AfterTriggers data structures to reduce the
management overhead for them, by using arrays of structs rather than
several parallel arrays for per-query-level and per-subtransaction state.
I failed to resist the temptation to do some copy-editing on the SGML
docs about triggers, above and beyond merely documenting the effects
of this patch.
Back-patch to v10, because we don't want the semantics of transition
tables to change post-release.
Patch by me, with help and review from Thomas Munro.
Discussion: https://postgr.es/m/20170909064853.25630.12825@wrigleys.postgresql.org
2017-09-16 19:20:32 +02:00
|
|
|
/* Prepare to convert transition tuples from this child. */
|
2017-06-28 19:55:03 +02:00
|
|
|
if (node->mt_transition_capture != NULL)
|
|
|
|
{
|
|
|
|
node->mt_transition_capture->tcs_map =
|
2018-01-19 21:33:06 +01:00
|
|
|
tupconv_map_for_subplan(node, node->mt_whichplan);
|
2017-06-28 19:55:03 +02:00
|
|
|
}
|
2017-09-16 19:20:32 +02:00
|
|
|
if (node->mt_oc_transition_capture != NULL)
|
|
|
|
{
|
|
|
|
node->mt_oc_transition_capture->tcs_map =
|
2018-01-19 21:33:06 +01:00
|
|
|
tupconv_map_for_subplan(node, node->mt_whichplan);
|
2017-09-16 19:20:32 +02:00
|
|
|
}
|
2009-10-10 03:43:50 +02:00
|
|
|
continue;
|
|
|
|
}
|
|
|
|
else
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
2016-03-18 18:48:58 +01:00
|
|
|
/*
|
|
|
|
* If resultRelInfo->ri_usesFdwDirectModify is true, all we need to do
|
|
|
|
* here is compute the RETURNING expressions.
|
|
|
|
*/
|
|
|
|
if (resultRelInfo->ri_usesFdwDirectModify)
|
|
|
|
{
|
|
|
|
Assert(resultRelInfo->ri_projectReturning);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* A scan slot containing the data that was actually inserted,
|
|
|
|
* updated or deleted has already been made available to
|
|
|
|
* ExecProcessReturning by IterateDirectModify, so no need to
|
|
|
|
* provide it here.
|
|
|
|
*/
|
|
|
|
slot = ExecProcessReturning(resultRelInfo, NULL, planSlot);
|
|
|
|
|
|
|
|
estate->es_result_relation_info = saved_resultRelInfo;
|
|
|
|
return slot;
|
|
|
|
}
|
|
|
|
|
Re-implement EvalPlanQual processing to improve its performance and eliminate
a lot of strange behaviors that occurred in join cases. We now identify the
"current" row for every joined relation in UPDATE, DELETE, and SELECT FOR
UPDATE/SHARE queries. If an EvalPlanQual recheck is necessary, we jam the
appropriate row into each scan node in the rechecking plan, forcing it to emit
only that one row. The former behavior could rescan the whole of each joined
relation for each recheck, which was terrible for performance, and what's much
worse could result in duplicated output tuples.
Also, the original implementation of EvalPlanQual could not re-use the recheck
execution tree --- it had to go through a full executor init and shutdown for
every row to be tested. To avoid this overhead, I've associated a special
runtime Param with each LockRows or ModifyTable plan node, and arranged to
make every scan node below such a node depend on that Param. Thus, by
signaling a change in that Param, the EPQ machinery can just rescan the
already-built test plan.
This patch also adds a prohibition on set-returning functions in the
targetlist of SELECT FOR UPDATE/SHARE. This is needed to avoid the
duplicate-output-tuple problem. It seems fairly reasonable since the
other restrictions on SELECT FOR UPDATE are meant to ensure that there
is a unique correspondence between source tuples and result tuples,
which an output SRF destroys as much as anything else does.
2009-10-26 03:26:45 +01:00
|
|
|
EvalPlanQualSetSlot(&node->mt_epqstate, planSlot);
|
2009-10-10 03:43:50 +02:00
|
|
|
slot = planSlot;
|
|
|
|
|
2018-04-03 10:28:16 +02:00
|
|
|
if (operation == CMD_MERGE)
|
|
|
|
{
|
|
|
|
ExecMerge(node, estate, slot, junkfilter, resultRelInfo);
|
|
|
|
continue;
|
|
|
|
}
|
|
|
|
|
2017-11-27 23:53:56 +01:00
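The tupleid part of the fix described above can be illustrated with a small stand-alone model. This is only a sketch, not PostgreSQL code: DemoRow and count_stale_ids are hypothetical names invented here, and a plain int stands in for a CTID. It shows why a per-row pointer has to be reset at the top of the loop when some rows carry a row identifier and others do not.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a subplan output row; not a PostgreSQL type. */
typedef struct DemoRow
{
    int has_ctid;   /* nonzero if this row carries a row identifier */
    int ctid;       /* the identifier; meaningful only if has_ctid */
} DemoRow;

/*
 * Count the rows that would be processed with a leftover identifier
 * from an earlier row.  With reset_each_row set, the pointer is cleared
 * at the top of every iteration, which is the behavior the fix ensures.
 */
static int
count_stale_ids(const DemoRow *rows, size_t nrows, int reset_each_row)
{
    const int *tupleid = NULL;
    int stale = 0;

    for (size_t i = 0; i < nrows; i++)
    {
        if (reset_each_row)
            tupleid = NULL;

        if (rows[i].has_ctid)
            tupleid = &rows[i].ctid;

        /* A row without an identifier must not see an old pointer. */
        if (!rows[i].has_ctid && tupleid != NULL)
            stale++;
    }
    return stale;
}
```

Calling count_stale_ids on a mix of rows with and without identifiers returns zero only when the reset is enabled; without it, every identifier-less row that follows an identified one inherits a stale pointer.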
|
|
|
tupleid = NULL;
|
2014-03-23 07:16:34 +01:00
|
|
|
oldtuple = NULL;
|
2009-10-10 03:43:50 +02:00
|
|
|
if (junkfilter != NULL)
|
|
|
|
{
|
|
|
|
/*
|
2010-10-10 19:43:33 +02:00
|
|
|
* extract the 'ctid' or 'wholerow' junk attribute.
|
2009-10-10 03:43:50 +02:00
|
|
|
*/
|
|
|
|
if (operation == CMD_UPDATE || operation == CMD_DELETE)
|
|
|
|
{
|
2013-03-10 19:14:53 +01:00
|
|
|
char relkind;
|
2009-10-10 03:43:50 +02:00
|
|
|
Datum datum;
|
|
|
|
bool isNull;
|
|
|
|
|
2013-03-10 19:14:53 +01:00
|
|
|
relkind = resultRelInfo->ri_RelationDesc->rd_rel->relkind;
|
2013-07-16 19:55:44 +02:00
|
|
|
if (relkind == RELKIND_RELATION || relkind == RELKIND_MATVIEW)
|
2010-10-10 19:43:33 +02:00
|
|
|
{
|
|
|
|
datum = ExecGetJunkAttribute(slot,
|
|
|
|
junkfilter->jf_junkAttNo,
|
|
|
|
&isNull);
|
|
|
|
/* shouldn't ever get a null result... */
|
|
|
|
if (isNull)
|
|
|
|
elog(ERROR, "ctid is NULL");
|
|
|
|
|
|
|
|
tupleid = (ItemPointer) DatumGetPointer(datum);
|
Phase 2 of pgindent updates.
Change pg_bsd_indent to follow upstream rules for placement of comments
to the right of code, and remove pgindent hack that caused comments
following #endif to not obey the general rule.
Commit e3860ffa4dd0dad0dd9eea4be9cc1412373a8c89 wasn't actually using
the published version of pg_bsd_indent, but a hacked-up version that
tried to minimize the amount of movement of comments to the right of
code. The situation of interest is where such a comment has to be
moved to the right of its default placement at column 33 because there's
code there. BSD indent has always moved right in units of tab stops
in such cases --- but in the previous incarnation, indent was working
in 8-space tab stops, while now it knows we use 4-space tabs. So the
net result is that in about half the cases, such comments are placed
one tab stop left of before. This is better all around: it leaves
more room on the line for comment text, and it means that in such
cases the comment uniformly starts at the next 4-space tab stop after
the code, rather than sometimes one and sometimes two tabs after.
Also, ensure that comments following #endif are indented the same
as comments following other preprocessor commands such as #else.
That inconsistency turns out to have been self-inflicted damage
from a poorly-thought-through post-indent "fixup" in pgindent.
This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.
Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 21:18:54 +02:00
|
|
|
tuple_ctid = *tupleid; /* be sure we don't free ctid!! */
|
2010-10-10 19:43:33 +02:00
|
|
|
tupleid = &tuple_ctid;
|
|
|
|
}
|
2014-05-06 18:12:18 +02:00
|
|
|
|
2014-03-23 07:16:34 +01:00
|
|
|
/*
|
|
|
|
* Use the wholerow attribute, when available, to reconstruct
|
|
|
|
* the old relation tuple.
|
|
|
|
*
|
|
|
|
* Foreign table updates have a wholerow attribute when the
|
2017-08-03 18:47:00 +02:00
|
|
|
* relation has a row-level trigger. Note that the wholerow
|
2014-03-23 07:16:34 +01:00
|
|
|
* attribute does not carry system columns. Foreign table
|
|
|
|
* triggers miss seeing those, except that we know enough here
|
|
|
|
* to set t_tableOid. Quite separately from this, the FDW may
|
|
|
|
* fetch its own junk attrs to identify the row.
|
|
|
|
*
|
|
|
|
* Other relevant relkinds, currently limited to views, always
|
|
|
|
* have a wholerow attribute.
|
|
|
|
*/
|
|
|
|
else if (AttributeNumberIsValid(junkfilter->jf_junkAttNo))
|
2010-10-10 19:43:33 +02:00
|
|
|
{
|
|
|
|
datum = ExecGetJunkAttribute(slot,
|
|
|
|
junkfilter->jf_junkAttNo,
|
|
|
|
&isNull);
|
|
|
|
/* shouldn't ever get a null result... */
|
|
|
|
if (isNull)
|
|
|
|
elog(ERROR, "wholerow is NULL");
|
|
|
|
|
2014-03-23 07:16:34 +01:00
|
|
|
oldtupdata.t_data = DatumGetHeapTupleHeader(datum);
|
|
|
|
oldtupdata.t_len =
|
|
|
|
HeapTupleHeaderGetDatumLength(oldtupdata.t_data);
|
|
|
|
ItemPointerSetInvalid(&(oldtupdata.t_self));
|
|
|
|
/* Historically, view triggers see invalid t_tableOid. */
|
|
|
|
oldtupdata.t_tableOid =
|
|
|
|
(relkind == RELKIND_VIEW) ? InvalidOid :
|
|
|
|
RelationGetRelid(resultRelInfo->ri_RelationDesc);
|
|
|
|
|
|
|
|
oldtuple = &oldtupdata;
|
2010-10-10 19:43:33 +02:00
|
|
|
}
|
2014-03-23 07:16:34 +01:00
|
|
|
else
|
|
|
|
Assert(relkind == RELKIND_FOREIGN_TABLE);
|
2009-10-10 03:43:50 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* apply the junkfilter if needed.
|
|
|
|
*/
|
|
|
|
if (operation != CMD_DELETE)
|
|
|
|
slot = ExecFilterJunk(junkfilter, slot);
|
|
|
|
}
|
|
|
|
|
|
|
|
switch (operation)
|
|
|
|
{
|
|
|
|
case CMD_INSERT:
|
2018-03-19 21:43:57 +01:00
|
|
|
/* Prepare for tuple routing if needed. */
|
|
|
|
if (proute)
|
|
|
|
slot = ExecPrepareTupleRouting(node, estate, proute,
|
|
|
|
resultRelInfo, slot);
|
Add support for INSERT ... ON CONFLICT DO NOTHING/UPDATE.
The newly added ON CONFLICT clause allows specifying an alternative to
raising a unique or exclusion constraint violation error when inserting.
ON CONFLICT refers to constraints that can either be specified using an
inference clause (by specifying the columns of a unique constraint) or
by naming a unique or exclusion constraint. DO NOTHING avoids the
constraint violation, without touching the pre-existing row. DO UPDATE
SET ... [WHERE ...] updates the pre-existing tuple, and has access to
both the tuple proposed for insertion and the existing tuple; the
optional WHERE clause can be used to prevent an update from being
executed. The UPDATE SET and WHERE clauses have access to the tuple
proposed for insertion using the "magic" EXCLUDED alias, and to the
pre-existing tuple using the table name or its alias.
This feature is often referred to as upsert.
This is implemented using a new infrastructure called "speculative
insertion". It is an optimistic variant of regular insertion that first
does a pre-check for existing tuples and then attempts an insert. If a
violating tuple was inserted concurrently, the speculatively inserted
tuple is deleted and a new attempt is made. If the pre-check finds a
matching tuple, the alternative DO NOTHING or DO UPDATE action is taken.
If the insertion succeeds without detecting a conflict, the tuple is
deemed inserted.
To handle the possible ambiguity between the excluded alias and a table
named excluded, and for convenience with long relation names, INSERT
INTO now can alias its target table.
Bumps catversion as stored rules change.
Author: Peter Geoghegan, with significant contributions from Heikki
Linnakangas and Andres Freund. Testing infrastructure by Jeff Janes.
Reviewed-By: Heikki Linnakangas, Andres Freund, Robert Haas, Simon Riggs,
Dean Rasheed, Stephen Frost and many others.
2015-05-08 05:31:36 +02:00
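The speculative insertion protocol described in the commit message above can be modelled in a few lines. This is a toy sketch under stated assumptions, not the executor's implementation: DemoTable, demo_contains and demo_speculative_insert are hypothetical names, the "table" is an in-memory array of unique integer keys, and the concurrent session is simulated by a one-shot key that appears right after the pre-check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_CAPACITY 16

/* Hypothetical in-memory table with a unique constraint on keys. */
typedef struct DemoTable
{
    int    keys[DEMO_CAPACITY];
    size_t nkeys;
    int    concurrent_key;      /* key a simulated other session inserts */
    bool   concurrent_pending;  /* true until that insert has happened */
} DemoTable;

typedef enum { DEMO_INSERTED, DEMO_CONFLICT_ACTION } DemoOutcome;

static bool
demo_contains(const DemoTable *t, size_t limit, int key)
{
    for (size_t i = 0; i < limit; i++)
        if (t->keys[i] == key)
            return true;
    return false;
}

/* Pre-check, insert speculatively, verify, and retry on a late conflict. */
static DemoOutcome
demo_speculative_insert(DemoTable *t, int key, int *attempts)
{
    *attempts = 0;
    for (;;)
    {
        (*attempts)++;

        /* Pre-check: an existing match means DO NOTHING / DO UPDATE. */
        if (demo_contains(t, t->nkeys, key))
            return DEMO_CONFLICT_ACTION;

        /* Simulated race: another session inserts after our pre-check. */
        if (t->concurrent_pending)
        {
            assert(t->nkeys < DEMO_CAPACITY);
            t->keys[t->nkeys++] = t->concurrent_key;
            t->concurrent_pending = false;
        }

        /* Speculatively insert our tuple at the end of the table. */
        assert(t->nkeys < DEMO_CAPACITY);
        t->keys[t->nkeys++] = key;

        /* No other tuple with this key: the tuple is deemed inserted. */
        if (!demo_contains(t, t->nkeys - 1, key))
            return DEMO_INSERTED;

        /* Conflict appeared concurrently: delete our tuple and retry. */
        t->nkeys--;
    }
}
```

When the simulated concurrent key equals the key being inserted, the first attempt is rolled back and the second attempt takes the conflict path; otherwise the insert succeeds on the first attempt.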
|
|
|
slot = ExecInsert(node, slot, planSlot,
|
MERGE SQL Command following SQL:2016
MERGE performs actions that modify rows in the target table
using a source table or query. MERGE provides a single SQL
statement that can conditionally INSERT/UPDATE/DELETE rows,
a task that would otherwise require multiple PL statements.
e.g.
MERGE INTO target AS t
USING source AS s
ON t.tid = s.sid
WHEN MATCHED AND t.balance > s.delta THEN
UPDATE SET balance = t.balance - s.delta
WHEN MATCHED THEN
DELETE
WHEN NOT MATCHED AND s.delta > 0 THEN
INSERT VALUES (s.sid, s.delta)
WHEN NOT MATCHED THEN
DO NOTHING;
MERGE works with regular and partitioned tables, including
column and row security enforcement, as well as support for
row, statement and transition triggers.
MERGE is optimized for OLTP and is parameterizable, though
also useful for large scale ETL/ELT. MERGE is not intended
to be used in preference to existing single SQL commands
for INSERT, UPDATE or DELETE since there is some overhead.
MERGE can be used statically from PL/pgSQL.
MERGE does not yet support inheritance, write rules,
RETURNING clauses, updatable views or foreign tables.
MERGE follows SQL Standard per the most recent SQL:2016.
Includes full tests and documentation, including full
isolation tests to demonstrate the concurrent behavior.
This version written from scratch in 2017 by Simon Riggs,
using docs and tests originally written in 2009. Later work
from Pavan Deolasee has been both complex and deep, leaving
the lead author credit now in his hands.
Extensive discussion of concurrency from Peter Geoghegan,
with thanks for the time and effort contributed.
Various issues reported via sqlsmith by Andreas Seltenreich
Authors: Pavan Deolasee, Simon Riggs
Reviewer: Peter Geoghegan, Amit Langote, Tomas Vondra, Simon Riggs
Discussion:
https://postgr.es/m/CANP8+jKitBSrB7oTgT9CY2i1ObfOt36z0XMraQc+Xrz8QB0nXA@mail.gmail.com
https://postgr.es/m/CAH2-WzkJdBuxj9PO=2QaO9-3h3xGbQPZ34kJH=HukRekwM-GZg@mail.gmail.com
2018-04-03 10:28:16 +02:00
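The MERGE example in the commit message above can be read as a decision procedure over each joined row. The sketch below hard-codes that one example in C, assuming that WHEN clauses are tried in the order written and the first one whose condition holds is the one applied. DemoMergeAction and demo_merge_action are hypothetical names; this is not how ExecMerge is implemented.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical action codes for the four WHEN clauses of the example. */
typedef enum
{
    DEMO_MERGE_UPDATE,
    DEMO_MERGE_DELETE,
    DEMO_MERGE_INSERT,
    DEMO_MERGE_NOTHING
} DemoMergeAction;

/*
 * Pick the action for one source row.  "matched" says whether the join
 * condition t.tid = s.sid found a target row; balance is t.balance and
 * delta is s.delta.  Clauses are checked in the order they are written.
 */
static DemoMergeAction
demo_merge_action(bool matched, int balance, int delta)
{
    if (matched)
    {
        if (balance > delta)
            return DEMO_MERGE_UPDATE;   /* WHEN MATCHED AND t.balance > s.delta */
        return DEMO_MERGE_DELETE;       /* WHEN MATCHED */
    }

    if (delta > 0)
        return DEMO_MERGE_INSERT;       /* WHEN NOT MATCHED AND s.delta > 0 */
    return DEMO_MERGE_NOTHING;          /* WHEN NOT MATCHED */
}
```

For instance, a matched row with balance 100 and delta 30 selects the UPDATE branch, while an unmatched row with a non-positive delta selects DO NOTHING.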
|
|
|
estate, NULL, node->canSetTag);
|
2018-03-19 21:43:57 +01:00
|
|
|
/* Revert ExecPrepareTupleRouting's state change. */
|
|
|
|
if (proute)
|
|
|
|
estate->es_result_relation_info = resultRelInfo;
|
2009-10-10 03:43:50 +02:00
|
|
|
break;
|
|
|
|
case CMD_UPDATE:
|
2017-06-28 19:55:03 +02:00
|
|
|
slot = ExecUpdate(node, tupleid, oldtuple, slot, planSlot,
|
2018-04-03 10:28:16 +02:00
|
|
|
&node->mt_epqstate, estate,
|
|
|
|
NULL, NULL, NULL, node->canSetTag);
|
2009-10-10 03:43:50 +02:00
|
|
|
break;
|
|
|
|
case CMD_DELETE:
|
2017-06-28 19:55:03 +02:00
|
|
|
slot = ExecDelete(node, tupleid, oldtuple, planSlot,
|
Allow UPDATE to move rows between partitions.
When an UPDATE causes a row to no longer match the partition
constraint, try to move it to a different partition where it does
match the partition constraint. In essence, the UPDATE is split into
a DELETE from the old partition and an INSERT into the new one. This
can lead to surprising behavior in concurrency scenarios because
EvalPlanQual rechecks won't work as they normally did; the known
problems are documented. (There is a pending patch to improve the
situation further, but it needs more review.)
Amit Khandekar, reviewed and tested by Amit Langote, David Rowley,
Rajkumar Raghuwanshi, Dilip Kumar, Amul Sul, Thomas Munro, Álvaro
Herrera, Amit Kapila, and me. A few final revisions by me.
Discussion: http://postgr.es/m/CAJ3gD9do9o2ccQ7j7+tSgiE1REY65XRiMb=yJO3u3QhyP8EEPQ@mail.gmail.com
2018-01-19 21:33:06 +01:00
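The "DELETE from the old partition plus INSERT into the new one" idea from the commit message above can be sketched with a toy two-partition table. Everything here is assumed for illustration: DemoPartition, demo_route, demo_insert and demo_update_key are hypothetical names, and the range boundary of 100 is arbitrary. Concurrency and EvalPlanQual are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_NPARTS   2
#define DEMO_PART_CAP 8

/* Hypothetical partition holding only integer partition-key values. */
typedef struct DemoPartition
{
    int    keys[DEMO_PART_CAP];
    size_t nkeys;
} DemoPartition;

/* Assumed partition constraint: keys below 100 go to 0, the rest to 1. */
static int
demo_route(int key)
{
    return key < 100 ? 0 : 1;
}

static void
demo_insert(DemoPartition *parts, int key)
{
    DemoPartition *p = &parts[demo_route(key)];

    assert(p->nkeys < DEMO_PART_CAP);
    p->keys[p->nkeys++] = key;
}

/* Update a key; returns true if the row had to move between partitions. */
static bool
demo_update_key(DemoPartition *parts, int old_key, int new_key)
{
    DemoPartition *src = &parts[demo_route(old_key)];
    size_t i;

    for (i = 0; i < src->nkeys; i++)
        if (src->keys[i] == old_key)
            break;
    assert(i < src->nkeys);     /* the row to update must exist */

    if (demo_route(new_key) == demo_route(old_key))
    {
        src->keys[i] = new_key; /* still satisfies the constraint */
        return false;
    }

    /* DELETE from the old partition (fill the hole with the last row). */
    src->nkeys--;
    src->keys[i] = src->keys[src->nkeys];

    /* INSERT into the partition that accepts the new key. */
    demo_insert(parts, new_key);
    return true;
}
```

Updating a key from 10 to 15 stays in place, whereas updating 15 to 150 removes the row from the first partition and re-inserts it into the second.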
|
|
|
&node->mt_epqstate, estate,
|
2018-04-03 10:28:16 +02:00
|
|
|
NULL, true, NULL, NULL, node->canSetTag);
|
2009-10-10 03:43:50 +02:00
|
|
|
break;
|
|
|
|
default:
|
|
|
|
elog(ERROR, "unknown operation");
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* If we got a RETURNING result, return it to caller. We'll continue
|
|
|
|
* the work on next call.
|
|
|
|
*/
|
|
|
|
if (slot)
|
|
|
|
{
|
2011-02-26 00:56:23 +01:00
|
|
|
estate->es_result_relation_info = saved_resultRelInfo;
|
2009-10-10 03:43:50 +02:00
|
|
|
return slot;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2011-02-26 00:56:23 +01:00
|
|
|
/* Restore es_result_relation_info before exiting */
|
|
|
|
estate->es_result_relation_info = saved_resultRelInfo;
|
2009-10-10 03:43:50 +02:00
|
|
|
|
|
|
|
/*
|
|
|
|
* We're done, but fire AFTER STATEMENT triggers before exiting.
|
|
|
|
*/
|
|
|
|
fireASTriggers(node);
|
|
|
|
|
2011-02-26 00:56:23 +01:00
|
|
|
node->mt_done = true;
|
|
|
|
|
2009-10-10 03:43:50 +02:00
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* ----------------------------------------------------------------
|
|
|
|
* ExecInitModifyTable
|
|
|
|
* ----------------------------------------------------------------
|
|
|
|
*/
|
|
|
|
ModifyTableState *
|
|
|
|
ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
|
|
|
|
{
|
|
|
|
ModifyTableState *mtstate;
|
|
|
|
CmdType operation = node->operation;
|
|
|
|
int nplans = list_length(node->plans);
|
2011-02-26 00:56:23 +01:00
|
|
|
ResultRelInfo *saved_resultRelInfo;
|
2009-10-10 03:43:50 +02:00
|
|
|
ResultRelInfo *resultRelInfo;
|
|
|
|
Plan *subplan;
|
|
|
|
ListCell *l;
|
|
|
|
int i;
|
Implement table partitioning.
Table partitioning is like table inheritance and reuses much of the
existing infrastructure, but there are some important differences.
The parent is called a partitioned table and is always empty; it may
not have indexes or non-inherited constraints, since those make no
sense for a relation with no data of its own. The children are called
partitions and contain all of the actual data. Each partition has an
implicit partitioning constraint. Multiple inheritance is not
allowed, and partitioning and inheritance can't be mixed. Partitions
can't have extra columns and may not allow nulls unless the parent
does. Tuples inserted into the parent are automatically routed to the
correct partition, so tuple-routing ON INSERT triggers are not needed.
Tuple routing isn't yet supported for partitions which are foreign
tables, and it doesn't handle updates that cross partition boundaries.
Currently, tables can be range-partitioned or list-partitioned. List
partitioning is limited to a single column, but range partitioning can
involve multiple columns. A partitioning "column" can be an
expression.
Because table partitioning is less general than table inheritance, it
is hoped that it will be easier to reason about properties of
partitions, and therefore that this will serve as a better foundation
for a variety of possible optimizations, including query planner
optimizations. The tuple routing which this patch performs, based on
the implicit partitioning constraints, is an example of this, but it
seems likely that many other useful optimizations are also possible.
Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat,
Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova,
Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
|
|
|
Relation rel;
|
2018-01-19 21:33:06 +01:00
|
|
|
bool update_tuple_routing_needed = node->partColsUpdated;
|
2009-10-10 03:43:50 +02:00
|
|
|
|
|
|
|
/* check for unsupported flags */
|
|
|
|
Assert(!(eflags & (EXEC_FLAG_BACKWARD | EXEC_FLAG_MARK)));
|
|
|
|
|
|
|
|
/*
|
|
|
|
* create state structure
|
|
|
|
*/
|
|
|
|
mtstate = makeNode(ModifyTableState);
|
|
|
|
mtstate->ps.plan = (Plan *) node;
|
|
|
|
mtstate->ps.state = estate;
|
2017-07-17 09:33:49 +02:00
|
|
|
mtstate->ps.ExecProcNode = ExecModifyTable;
|
2009-10-10 03:43:50 +02:00
|
|
|
|
2011-02-26 00:56:23 +01:00
|
|
|
mtstate->operation = operation;
|
|
|
|
mtstate->canSetTag = node->canSetTag;
|
|
|
|
mtstate->mt_done = false;
|
|
|
|
|
2009-10-10 03:43:50 +02:00
|
|
|
mtstate->mt_plans = (PlanState **) palloc0(sizeof(PlanState *) * nplans);
|
2011-02-26 00:56:23 +01:00
|
|
|
mtstate->resultRelInfo = estate->es_result_relations + node->resultRelIndex;
|
2017-05-01 14:23:01 +02:00
|
|
|
|
|
|
|
/* If modifying a partitioned table, initialize the root table info */
|
|
|
|
if (node->rootResultRelIndex >= 0)
|
|
|
|
mtstate->rootResultRelInfo = estate->es_root_result_relations +
|
2017-05-17 22:31:56 +02:00
|
|
|
node->rootResultRelIndex;
|
2017-05-01 14:23:01 +02:00
|
|
|
|
2011-01-13 02:47:02 +01:00
|
|
|
mtstate->mt_arowmarks = (List **) palloc0(sizeof(List *) * nplans);
|
2009-10-10 03:43:50 +02:00
|
|
|
mtstate->mt_nplans = nplans;
|
2011-02-26 00:56:23 +01:00
|
|
|
|
2011-01-13 02:47:02 +01:00
|
|
|
/* set up epqstate with dummy subplan data for the moment */
|
|
|
|
EvalPlanQualInit(&mtstate->mt_epqstate, estate, NULL, NIL, node->epqParam);
|
2009-10-10 03:43:50 +02:00
|
|
|
mtstate->fireBSTriggers = true;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* call ExecInitNode on each of the plans to be executed and save the
|
2011-04-10 17:42:00 +02:00
|
|
|
* results into the array "mt_plans". This is also a convenient place to
|
|
|
|
* verify that the proposed target relations are valid and open their
|
2014-05-06 18:12:18 +02:00
|
|
|
* indexes for insertion of new index entries. Note we *must* set
|
2009-10-10 03:43:50 +02:00
|
|
|
* estate->es_result_relation_info correctly while we initialize each
|
|
|
|
* sub-plan; ExecContextForcesOids depends on that!
|
|
|
|
*/
|
2011-02-26 00:56:23 +01:00
|
|
|
saved_resultRelInfo = estate->es_result_relation_info;
|
|
|
|
|
|
|
|
resultRelInfo = mtstate->resultRelInfo;
|
2018-04-03 10:28:16 +02:00
|
|
|
|
|
|
|
/*
|
|
|
|
* mergeTargetRelation must be set if we're running MERGE and mustn't be
|
|
|
|
* set if we're not.
|
|
|
|
*/
|
|
|
|
Assert(operation != CMD_MERGE || node->mergeTargetRelation > 0);
|
|
|
|
Assert(operation == CMD_MERGE || node->mergeTargetRelation == 0);
|
|
|
|
|
|
|
|
resultRelInfo->ri_mergeTargetRTI = node->mergeTargetRelation;
|
|
|
|
|
2009-10-10 03:43:50 +02:00
|
|
|
i = 0;
|
|
|
|
foreach(l, node->plans)
|
|
|
|
{
|
|
|
|
subplan = (Plan *) lfirst(l);
|
2011-02-26 00:56:23 +01:00
|
|
|
|
2016-03-18 18:48:58 +01:00
|
|
|
/* Initialize the usesFdwDirectModify flag */
|
|
|
|
resultRelInfo->ri_usesFdwDirectModify = bms_is_member(i,
|
Phase 3 of pgindent updates.
Don't move parenthesized lines to the left, even if that means they
flow past the right margin.
By default, BSD indent lines up statement continuation lines that are
within parentheses so that they start just to the right of the preceding
left parenthesis. However, traditionally, if that resulted in the
continuation line extending to the right of the desired right margin,
then indent would push it left just far enough to not overrun the margin,
if it could do so without making the continuation line start to the left of
the current statement indent. That makes for a weird mix of indentations
unless one has been completely rigid about never violating the 80-column
limit.
This behavior has been pretty universally panned by Postgres developers.
Hence, disable it with indent's new -lpl switch, so that parenthesized
lines are always lined up with the preceding left paren.
This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.
Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 21:35:54 +02:00
|
|
|
node->fdwDirectModifyPlans);
|
2016-03-18 18:48:58 +01:00
|
|
|
|
2011-02-26 00:56:23 +01:00
|
|
|
/*
|
|
|
|
* Verify result relation is a valid target for the current operation
|
|
|
|
*/
|
2017-09-07 16:55:45 +02:00
|
|
|
CheckValidResultRel(resultRelInfo, operation);
|
2011-02-26 00:56:23 +01:00
|
|
|
|
|
|
|
/*
|
|
|
|
* If there are indices on the result relation, open them and save
|
|
|
|
* descriptors in the result relation info, so that we can add new
|
2014-05-06 18:12:18 +02:00
|
|
|
* index entries for the tuples we add/update. We need not do this
|
2012-06-10 21:20:04 +02:00
|
|
|
* for a DELETE, however, since deletion doesn't affect indexes. Also,
|
|
|
|
* inside an EvalPlanQual operation, the indexes might be open
|
2012-01-28 23:43:57 +01:00
|
|
|
* already, since we share the resultrel state with the original
|
|
|
|
* query.
|
2011-02-26 00:56:23 +01:00
|
|
|
*/
|
|
|
|
if (resultRelInfo->ri_RelationDesc->rd_rel->relhasindex &&
|
2012-01-28 23:43:57 +01:00
|
|
|
operation != CMD_DELETE &&
|
|
|
|
resultRelInfo->ri_IndexRelationDescs == NULL)
|
2018-03-19 22:09:43 +01:00
|
|
|
ExecOpenIndices(resultRelInfo,
|
|
|
|
node->onConflictAction != ONCONFLICT_NONE);
|
2011-02-26 00:56:23 +01:00
|
|
|
|
2018-01-19 21:33:06 +01:00
|
|
|
/*
|
|
|
|
* If this is an UPDATE and a BEFORE UPDATE trigger is present, the
|
|
|
|
* trigger itself might modify the partition-key values. So arrange
|
|
|
|
* for tuple routing.
|
|
|
|
*/
|
|
|
|
if (resultRelInfo->ri_TrigDesc &&
|
|
|
|
resultRelInfo->ri_TrigDesc->trig_update_before_row &&
|
|
|
|
operation == CMD_UPDATE)
|
|
|
|
update_tuple_routing_needed = true;
|
|
|
|
|
2011-02-26 00:56:23 +01:00
|
|
|
/* Now init the plan for this result rel */
|
|
|
|
estate->es_result_relation_info = resultRelInfo;
|
2009-10-10 03:43:50 +02:00
|
|
|
mtstate->mt_plans[i] = ExecInitNode(subplan, estate, eflags);
|
2011-02-26 00:56:23 +01:00
|
|
|
|
2013-03-10 19:14:53 +01:00
|
|
|
/* Also let FDWs init themselves for foreign-table result rels */
|
2016-03-18 18:48:58 +01:00
|
|
|
if (!resultRelInfo->ri_usesFdwDirectModify &&
|
|
|
|
resultRelInfo->ri_FdwRoutine != NULL &&
|
2013-03-10 19:14:53 +01:00
|
|
|
resultRelInfo->ri_FdwRoutine->BeginForeignModify != NULL)
|
|
|
|
{
|
|
|
|
List *fdw_private = (List *) list_nth(node->fdwPrivLists, i);
|
|
|
|
|
|
|
|
resultRelInfo->ri_FdwRoutine->BeginForeignModify(mtstate,
|
|
|
|
resultRelInfo,
|
|
|
|
fdw_private,
|
|
|
|
i,
|
|
|
|
eflags);
|
|
|
|
}
|
|
|
|
|
2011-02-26 00:56:23 +01:00
|
|
|
resultRelInfo++;
|
2009-10-10 03:43:50 +02:00
|
|
|
i++;
|
|
|
|
}
|
2011-02-26 00:56:23 +01:00
|
|
|
|
|
|
|
estate->es_result_relation_info = saved_resultRelInfo;
|
2009-10-10 03:43:50 +02:00
|
|
|
|
2018-01-19 21:33:06 +01:00
|
|
|
/* Get the target relation */
|
|
|
|
rel = (getTargetResultRelInfo(mtstate))->ri_RelationDesc;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* If it's not a partitioned table after all, UPDATE tuple routing should
|
|
|
|
* not be attempted.
|
|
|
|
*/
|
|
|
|
if (rel->rd_rel->relkind != RELKIND_PARTITIONED_TABLE)
|
|
|
|
update_tuple_routing_needed = false;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Build state for tuple routing if it's an INSERT or if it's an UPDATE of
|
|
|
|
* partition key.
|
|
|
|
*/
|
|
|
|
if (rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE &&
|
MERGE SQL Command following SQL:2016
MERGE performs actions that modify rows in the target table
using a source table or query. MERGE provides a single SQL
statement that can conditionally INSERT/UPDATE/DELETE rows
a task that would other require multiple PL statements.
e.g.
MERGE INTO target AS t
USING source AS s
ON t.tid = s.sid
WHEN MATCHED AND t.balance > s.delta THEN
UPDATE SET balance = t.balance - s.delta
WHEN MATCHED THEN
DELETE
WHEN NOT MATCHED AND s.delta > 0 THEN
INSERT VALUES (s.sid, s.delta)
WHEN NOT MATCHED THEN
DO NOTHING;
MERGE works with regular and partitioned tables, including
column and row security enforcement, as well as support for
row, statement and transition triggers.
MERGE is optimized for OLTP and is parameterizable, though
also useful for large scale ETL/ELT. MERGE is not intended
to be used in preference to existing single SQL commands
for INSERT, UPDATE or DELETE since there is some overhead.
MERGE can be used statically from PL/pgSQL.
MERGE does not yet support inheritance, write rules,
RETURNING clauses, updatable views or foreign tables.
MERGE follows SQL Standard per the most recent SQL:2016.
Includes full tests and documentation, including full
isolation tests to demonstrate the concurrent behavior.
This version written from scratch in 2017 by Simon Riggs,
using docs and tests originally written in 2009. Later work
from Pavan Deolasee has been both complex and deep, leaving
the lead author credit now in his hands.
Extensive discussion of concurrency from Peter Geoghegan,
with thanks for the time and effort contributed.
Various issues reported via sqlsmith by Andreas Seltenreich.
Authors: Pavan Deolasee, Simon Riggs
Reviewer: Peter Geoghegan, Amit Langote, Tomas Vondra, Simon Riggs
Discussion:
https://postgr.es/m/CANP8+jKitBSrB7oTgT9CY2i1ObfOt36z0XMraQc+Xrz8QB0nXA@mail.gmail.com
https://postgr.es/m/CAH2-WzkJdBuxj9PO=2QaO9-3h3xGbQPZ34kJH=HukRekwM-GZg@mail.gmail.com
2018-04-03 10:28:16 +02:00
|
|
|
(operation == CMD_INSERT || operation == CMD_MERGE ||
|
|
|
|
update_tuple_routing_needed))
|
Be lazier about partition tuple routing.
It's not necessary to fully initialize the executor data structures
for partitions to which no tuples are ever routed. Consider, for
example, an INSERT statement that inserts only one row: it only cares
about the partition to which that one row is routed. The new function
ExecInitPartitionInfo performs the initialization in question only
when a particular partition is about to receive a tuple. This includes
creating, validating, and saving a pointer to the ResultRelInfo,
setting up for speculative insertions, translating WCOs and
initializing the resulting expressions, translating returning lists
and building the appropriate projection information, and setting up a
tuple conversion map.
One thing that's not deferred is locking the child partitions; that
seems desirable but would need more thought. Still, testing shows
that this makes single-row inserts significantly faster on a table
with many partitions without harming the bulk-insert case.
Amit Langote, reviewed by Etsuro Fujita, with a few changes by me
Discussion: http://postgr.es/m/8975331d-d961-cbdd-f862-fdd3d97dc2d0@lab.ntt.co.jp
2018-02-22 16:55:54 +01:00
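The lazy initialization described in the "Be lazier about partition tuple routing" message can be sketched as a table of per-partition pointers that stay NULL until a tuple is first routed there. This is a minimal illustration only: `PartitionState`, `RoutingTable`, and every function below are hypothetical names, not the executor's real structures, and the real ExecInitPartitionInfo does far more work per partition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the executor state kept per partition. */
typedef struct PartitionState
{
    int  index;
    long tuples_routed;
} PartitionState;

/* Hypothetical routing table: one slot per partition, filled on demand. */
typedef struct RoutingTable
{
    int              num_partitions;
    int              num_initialized;   /* slots that were actually set up */
    PartitionState **partitions;        /* NULL until first use */
} RoutingTable;

RoutingTable *
routing_table_create(int num_partitions)
{
    RoutingTable *rt;
    int           i;

    assert(num_partitions > 0);
    rt = malloc(sizeof(RoutingTable));
    assert(rt != NULL);
    rt->num_partitions = num_partitions;
    rt->num_initialized = 0;
    rt->partitions = malloc((size_t) num_partitions * sizeof(PartitionState *));
    assert(rt->partitions != NULL);

    /* Cheap part only: no per-partition state is built here. */
    for (i = 0; i < num_partitions; i++)
        rt->partitions[i] = NULL;
    return rt;
}

/* Return the state for one partition, building it on first access. */
PartitionState *
routing_table_get(RoutingTable *rt, int index)
{
    assert(index >= 0 && index < rt->num_partitions);
    if (rt->partitions[index] == NULL)
    {
        PartitionState *ps = malloc(sizeof(PartitionState));

        assert(ps != NULL);
        ps->index = index;
        ps->tuples_routed = 0;
        rt->partitions[index] = ps;
        rt->num_initialized++;
    }
    return rt->partitions[index];
}

/* Route one tuple to the given partition. */
void
route_tuple(RoutingTable *rt, int index)
{
    routing_table_get(rt, index)->tuples_routed++;
}

void
routing_table_destroy(RoutingTable *rt)
{
    int i;

    for (i = 0; i < rt->num_partitions; i++)
        free(rt->partitions[i]);
    free(rt->partitions);
    free(rt);
}
```

With a table of 1000 partitions and a single-row insert, only one `PartitionState` is ever allocated, which is the effect the commit message attributes to deferring initialization.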
|
|
|
mtstate->mt_partition_tuple_routing =
|
|
|
|
ExecSetupPartitionTupleRouting(mtstate, rel);
|
Implement table partitioning.
Table partitioning is like table inheritance and reuses much of the
existing infrastructure, but there are some important differences.
The parent is called a partitioned table and is always empty; it may
not have indexes or non-inherited constraints, since those make no
sense for a relation with no data of its own. The children are called
partitions and contain all of the actual data. Each partition has an
implicit partitioning constraint. Multiple inheritance is not
allowed, and partitioning and inheritance can't be mixed. Partitions
can't have extra columns and may not allow nulls unless the parent
does. Tuples inserted into the parent are automatically routed to the
correct partition, so tuple-routing ON INSERT triggers are not needed.
Tuple routing isn't yet supported for partitions which are foreign
tables, and it doesn't handle updates that cross partition boundaries.
Currently, tables can be range-partitioned or list-partitioned. List
partitioning is limited to a single column, but range partitioning can
involve multiple columns. A partitioning "column" can be an
expression.
Because table partitioning is less general than table inheritance, it
is hoped that it will be easier to reason about properties of
partitions, and therefore that this will serve as a better foundation
for a variety of possible optimizations, including query planner
optimizations. The tuple routing which this patch does based on
the implicit partitioning constraints is an example of this, but it
seems likely that many other useful optimizations are also possible.
Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat,
Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova,
Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
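As a rough illustration of the tuple routing mentioned in the partitioning message, the sketch below picks a range partition by binary search over sorted, exclusive upper bounds. It assumes a single integer partitioning column and that the first partition has no lower bound; the function name and bound layout are hypothetical and do not reflect how the server stores partition bounds.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the index of the range partition that accepts "key", or -1 if no
 * partition does.  upper_bounds[] holds each partition's exclusive upper
 * bound in ascending order (an assumption of this sketch).
 */
int
route_to_range_partition(const int *upper_bounds, int nparts, int key)
{
    int lo = 0;
    int hi = nparts;

    assert(upper_bounds != NULL && nparts > 0);

    /* Find the first partition whose upper bound is greater than key. */
    while (lo < hi)
    {
        int mid = lo + (hi - lo) / 2;

        if (key < upper_bounds[mid])
            hi = mid;
        else
            lo = mid + 1;
    }
    return (lo < nparts) ? lo : -1;
}
```

A key equal to a bound falls into the next partition because the bounds are treated as exclusive, and a key at or above the last bound is rejected rather than routed.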
|
|
|
|
Fix SQL-spec incompatibilities in new transition table feature.
The standard says that all changes of the same kind (insert, update, or
delete) caused in one table by a single SQL statement should be reported
in a single transition table; and by that, they mean to include foreign key
enforcement actions cascading from the statement's direct effects. It's
also reasonable to conclude that if the standard had wCTEs, they would say
that effects of wCTEs applying to the same table as each other or the outer
statement should be merged into one transition table. We weren't doing it
like that.
Hence, arrange to merge tuples from multiple update actions into a single
transition table as much as we can. There is a problem, which is that if
the firing of FK enforcement triggers and after-row triggers with
transition tables is interspersed, we might need to report more tuples
after some triggers have already seen the transition table. It seems like
a bad idea for the transition table to be mutable between trigger calls.
There's no good way around this without a major redesign of the FK logic,
so for now, resolve it by opening a new transition table each time this
happens.
Also, ensure that AFTER STATEMENT triggers fire just once per statement,
or once per transition table when we're forced to make more than one.
Previous versions of Postgres have allowed each FK enforcement query
to cause an additional firing of the AFTER STATEMENT triggers for the
referencing table, but that's certainly not per spec. (We're still
doing multiple firings of BEFORE STATEMENT triggers, though; is that
something worth changing?)
Also, forbid using transition tables with column-specific UPDATE triggers.
The spec requires such transition tables to show only the tuples for which
the UPDATE trigger would have fired, which means maintaining multiple
transition tables or else somehow filtering the contents at readout.
Maybe someday we'll bother to support that option, but it looks like a
lot of trouble for a marginal feature.
The transition tables are now managed by the AfterTriggers data structures,
rather than being directly the responsibility of ModifyTable nodes. This
removes a subtransaction-lifespan memory leak introduced by my previous
band-aid patch 3c4359521.
In passing, refactor the AfterTriggers data structures to reduce the
management overhead for them, by using arrays of structs rather than
several parallel arrays for per-query-level and per-subtransaction state.
I failed to resist the temptation to do some copy-editing on the SGML
docs about triggers, above and beyond merely documenting the effects
of this patch.
Back-patch to v10, because we don't want the semantics of transition
tables to change post-release.
Patch by me, with help and review from Thomas Munro.
Discussion: https://postgr.es/m/20170909064853.25630.12825@wrigleys.postgresql.org
2017-09-16 19:20:32 +02:00
|
|
|
/*
|
|
|
|
* Build state for collecting transition tuples. This requires having a
|
|
|
|
* valid trigger query context, so skip it in explain-only mode.
|
|
|
|
*/
|
|
|
|
if (!(eflags & EXEC_FLAG_EXPLAIN_ONLY))
|
|
|
|
ExecSetupTransitionCaptureState(mtstate, estate);
|
2017-06-28 19:55:03 +02:00
|
|
|
|
2018-04-03 10:28:16 +02:00
|
|
|
/*
|
|
|
|
* If we are doing MERGE, then set up child-parent mapping. This will be
|
|
|
|
* required in case we end up doing a partition-key update, triggering a
|
|
|
|
* tuple routing.
|
|
|
|
*/
|
|
|
|
if (mtstate->operation == CMD_MERGE &&
|
|
|
|
mtstate->mt_partition_tuple_routing != NULL)
|
|
|
|
ExecSetupChildParentMapForLeaf(mtstate->mt_partition_tuple_routing);
|
|
|
|
|
Allow UPDATE to move rows between partitions.
When an UPDATE causes a row to no longer match the partition
constraint, try to move it to a different partition where it does
match the partition constraint. In essence, the UPDATE is split into
a DELETE from the old partition and an INSERT into the new one. This
can lead to surprising behavior in concurrency scenarios because
EvalPlanQual rechecks won't work as they normally did; the known
problems are documented. (There is a pending patch to improve the
situation further, but it needs more review.)
Amit Khandekar, reviewed and tested by Amit Langote, David Rowley,
Rajkumar Raghuwanshi, Dilip Kumar, Amul Sul, Thomas Munro, Álvaro
Herrera, Amit Kapila, and me. A few final revisions by me.
Discussion: http://postgr.es/m/CAJ3gD9do9o2ccQ7j7+tSgiE1REY65XRiMb=yJO3u3QhyP8EEPQ@mail.gmail.com
2018-01-19 21:33:06 +01:00
|
|
|
/*
|
|
|
|
* Construct mapping from each of the per-subplan partition attnos to the
|
|
|
|
* root attno. This is required when, during update row movement, the tuple
|
|
|
|
* descriptor of a source partition does not match the root partitioned
|
|
|
|
* table descriptor. In such a case we need to convert tuples to the root
|
|
|
|
* tuple descriptor, because the search for destination partition starts
|
|
|
|
* from the root. Skip this setup if it's not a partition key update.
|
|
|
|
*/
|
|
|
|
if (update_tuple_routing_needed)
|
|
|
|
ExecSetupChildParentMapForSubplan(mtstate);
|
|
|
|
|
2013-07-18 23:10:16 +02:00
|
|
|
/*
|
|
|
|
* Initialize any WITH CHECK OPTION constraints if needed.
|
|
|
|
*/
|
|
|
|
resultRelInfo = mtstate->resultRelInfo;
|
|
|
|
i = 0;
|
|
|
|
foreach(l, node->withCheckOptionLists)
|
|
|
|
{
|
|
|
|
List *wcoList = (List *) lfirst(l);
|
|
|
|
List *wcoExprs = NIL;
|
|
|
|
ListCell *ll;
|
|
|
|
|
|
|
|
foreach(ll, wcoList)
|
|
|
|
{
|
|
|
|
WithCheckOption *wco = (WithCheckOption *) lfirst(ll);
|
Faster expression evaluation and targetlist projection.
This replaces the old, recursive tree-walk based evaluation, with
non-recursive, opcode dispatch based, expression evaluation.
Projection is now implemented as part of expression evaluation.
This both leads to significant performance improvements, and makes
future just-in-time compilation of expressions easier.
The speed gains primarily come from:
- non-recursive implementation reduces stack usage / overhead
- simple sub-expressions are implemented with a single jump, without
function calls
- sharing some state between different sub-expressions
- reduced amount of indirect/hard to predict memory accesses by laying
out operation metadata sequentially; including the avoidance of
nearly all of the previously used linked lists
- more code has been moved to expression initialization, avoiding
constant re-checks at evaluation time
Future just-in-time compilation (JIT) has become easier, as
demonstrated by released patches intended to be merged in a later
release, for primarily two reasons: Firstly, due to a stricter split
between expression initialization and evaluation, less code has to be
handled by the JIT. Secondly, due to the non-recursive nature of the
generated "instructions", less performance-critical code-paths can
easily be shared between interpreted and compiled evaluation.
The new framework allows for significant future optimizations. E.g.:
- basic infrastructure to later reduce the per executor-startup
overhead of expression evaluation, by caching state in prepared
statements. That'd be helpful in OLTPish scenarios where
initialization overhead is measurable.
- optimizing the generated "code". A number of proposals for potential
work have already been made.
- optimizing the interpreter. Similarly a number of proposals have
been made here too.
The move of logic into the expression initialization step leads to some
backward-incompatible changes:
- Function permission checks are now done during expression
initialization, whereas previously they were done during
execution. In edge cases this can lead to errors being raised that
previously wouldn't have been, e.g. a NULL array being coerced to a
different array type previously didn't perform checks.
- The set of domain constraints to be checked is now evaluated once
during expression initialization; previously it was re-built
every time a domain check was evaluated. For normal queries this
doesn't change much, but e.g. for plpgsql functions, which cache
ExprStates, the old set could stick around longer. The behavior
around this might still change.
Author: Andres Freund, with significant changes by Tom Lane,
changes by Heikki Linnakangas
Reviewed-By: Tom Lane, Heikki Linnakangas
Discussion: https://postgr.es/m/20161206034955.bh33paeralxbtluv@alap3.anarazel.de
2017-03-14 23:45:36 +01:00
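The contrast between recursive tree-walking and the non-recursive, opcode-dispatch evaluation described in the "Faster expression evaluation" message can be shown with a toy interpreter over a flat array of steps. This is only a sketch under stated assumptions: the `Step` layout, the opcode names, and the small value stack are all hypothetical, and the real ExprState steps do not necessarily pass intermediate results this way.

```c
#include <assert.h>
#include <stddef.h>

#define EVAL_STACK_DEPTH 16

/* Hypothetical opcodes for a tiny integer expression language. */
typedef enum StepOp
{
    STEP_CONST,     /* push a constant */
    STEP_VAR,       /* push the variable with the given index */
    STEP_ADD,       /* pop two values, push their sum */
    STEP_MUL,       /* pop two values, push their product */
    STEP_DONE       /* return the single remaining value */
} StepOp;

typedef struct Step
{
    StepOp op;
    long   arg;     /* constant value or variable index; unused otherwise */
} Step;

/*
 * Evaluate a program laid out sequentially in memory.  There is one loop
 * and one switch, and no recursion regardless of expression depth.
 */
long
eval_steps(const Step *steps, const long *vars)
{
    long        stack[EVAL_STACK_DEPTH];
    int         sp = 0;
    const Step *s;

    for (s = steps;; s++)
    {
        switch (s->op)
        {
            case STEP_CONST:
                assert(sp < EVAL_STACK_DEPTH);
                stack[sp++] = s->arg;
                break;
            case STEP_VAR:
                assert(sp < EVAL_STACK_DEPTH);
                stack[sp++] = vars[s->arg];
                break;
            case STEP_ADD:
                assert(sp >= 2);
                sp--;
                stack[sp - 1] += stack[sp];
                break;
            case STEP_MUL:
                assert(sp >= 2);
                sp--;
                stack[sp - 1] *= stack[sp];
                break;
            case STEP_DONE:
                assert(sp == 1);
                return stack[0];
        }
    }
}
```

The expression `(x + 2) * y` would be compiled once, at initialization time, into the steps `VAR 0, CONST 2, ADD, VAR 1, MUL, DONE`, and then evaluated repeatedly without revisiting the expression tree.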
|
|
|
ExprState *wcoExpr = ExecInitQual((List *) wco->qual,
|
2013-07-18 23:10:16 +02:00
|
|
|
mtstate->mt_plans[i]);
|
2014-05-06 18:12:18 +02:00
|
|
|
|
2013-07-18 23:10:16 +02:00
|
|
|
wcoExprs = lappend(wcoExprs, wcoExpr);
|
|
|
|
}
|
|
|
|
|
|
|
|
resultRelInfo->ri_WithCheckOptions = wcoList;
|
|
|
|
resultRelInfo->ri_WithCheckOptionExprs = wcoExprs;
|
|
|
|
resultRelInfo++;
|
|
|
|
i++;
|
|
|
|
}
|
|
|
|
|
2009-10-10 03:43:50 +02:00
|
|
|
/*
|
|
|
|
* Initialize RETURNING projections if needed.
|
|
|
|
*/
|
|
|
|
if (node->returningLists)
|
|
|
|
{
|
|
|
|
TupleTableSlot *slot;
|
|
|
|
ExprContext *econtext;
|
|
|
|
|
|
|
|
/*
|
2010-02-26 03:01:40 +01:00
|
|
|
* Initialize result tuple slot and assign its rowtype using the first
|
2014-05-06 18:12:18 +02:00
|
|
|
* RETURNING list. We assume the rest will look the same.
|
2009-10-10 03:43:50 +02:00
|
|
|
*/
|
2017-12-29 21:26:29 +01:00
|
|
|
mtstate->ps.plan->targetlist = (List *) linitial(node->returningLists);
|
2009-10-10 03:43:50 +02:00
|
|
|
|
|
|
|
/* Set up a slot for the output of the RETURNING projection(s) */
|
2018-02-17 06:17:38 +01:00
|
|
|
ExecInitResultTupleSlotTL(estate, &mtstate->ps);
|
2009-10-10 03:43:50 +02:00
|
|
|
slot = mtstate->ps.ps_ResultTupleSlot;
|
|
|
|
|
|
|
|
/* Need an econtext too */
|
2017-03-14 23:45:36 +01:00
|
|
|
if (mtstate->ps.ps_ExprContext == NULL)
|
|
|
|
ExecAssignExprContext(estate, &mtstate->ps);
|
|
|
|
econtext = mtstate->ps.ps_ExprContext;
|
2009-10-10 03:43:50 +02:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Build a projection for each result rel.
|
|
|
|
*/
|
2011-02-26 00:56:23 +01:00
|
|
|
resultRelInfo = mtstate->resultRelInfo;
|
2009-10-10 03:43:50 +02:00
|
|
|
foreach(l, node->returningLists)
|
|
|
|
{
|
|
|
|
List *rlist = (List *) lfirst(l);
|
|
|
|
|
|
|
|
resultRelInfo->ri_projectReturning =
|
2017-03-14 23:45:36 +01:00
|
|
|
ExecBuildProjectionInfo(rlist, econtext, slot, &mtstate->ps,
|
Phase 3 of pgindent updates.
Don't move parenthesized lines to the left, even if that means they
flow past the right margin.
By default, BSD indent lines up statement continuation lines that are
within parentheses so that they start just to the right of the preceding
left parenthesis. However, traditionally, if that resulted in the
continuation line extending to the right of the desired right margin,
then indent would push it left just far enough to not overrun the margin,
if it could do so without making the continuation line start to the left of
the current statement indent. That makes for a weird mix of indentations
unless one has been completely rigid about never violating the 80-column
limit.
This behavior has been pretty universally panned by Postgres developers.
Hence, disable it with indent's new -lpl switch, so that parenthesized
lines are always lined up with the preceding left paren.
This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.
Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 21:35:54 +02:00
|
|
|
resultRelInfo->ri_RelationDesc->rd_att);
|
2009-10-10 03:43:50 +02:00
|
|
|
resultRelInfo++;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
else
|
|
|
|
{
|
|
|
|
/*
|
2010-02-26 03:01:40 +01:00
|
|
|
* We still must construct a dummy result tuple type, because InitPlan
|
|
|
|
* expects one (maybe should change that?).
|
2009-10-10 03:43:50 +02:00
|
|
|
*/
|
2017-12-29 21:26:29 +01:00
|
|
|
mtstate->ps.plan->targetlist = NIL;
|
2018-02-17 06:17:38 +01:00
|
|
|
ExecInitResultTupleSlotTL(estate, &mtstate->ps);
|
2009-10-10 03:43:50 +02:00
|
|
|
|
|
|
|
mtstate->ps.ps_ExprContext = NULL;
|
|
|
|
}
|
|
|
|
|
2018-03-26 15:43:54 +02:00
|
|
|
/* Set the list of arbiter indexes if needed for ON CONFLICT */
|
|
|
|
resultRelInfo = mtstate->resultRelInfo;
|
|
|
|
if (node->onConflictAction != ONCONFLICT_NONE)
|
|
|
|
resultRelInfo->ri_onConflictArbiterIndexes = node->arbiterIndexes;
|
|
|
|
|
Add support for INSERT ... ON CONFLICT DO NOTHING/UPDATE.
The newly added ON CONFLICT clause allows specifying an alternative to
raising a unique or exclusion constraint violation error when inserting.
ON CONFLICT refers to constraints that can either be specified using an
inference clause (by specifying the columns of a unique constraint) or
by naming a unique or exclusion constraint. DO NOTHING avoids the
constraint violation, without touching the pre-existing row. DO UPDATE
SET ... [WHERE ...] updates the pre-existing tuple, and has access to
both the tuple proposed for insertion and the existing tuple; the
optional WHERE clause can be used to prevent an update from being
executed. The UPDATE SET and WHERE clauses have access to the tuple
proposed for insertion using the "magic" EXCLUDED alias, and to the
pre-existing tuple using the table name or its alias.
This feature is often referred to as upsert.
This is implemented using a new infrastructure called "speculative
insertion". It is an optimistic variant of regular insertion that first
does a pre-check for existing tuples and then attempts an insert. If a
violating tuple was inserted concurrently, the speculatively inserted
tuple is deleted and a new attempt is made. If the pre-check finds a
matching tuple the alternative DO NOTHING or DO UPDATE action is taken.
If the insertion succeeds without detecting a conflict, the tuple is
deemed inserted.
To handle the possible ambiguity between the excluded alias and a table
named excluded, and for convenience with long relation names, INSERT
INTO now can alias its target table.
Bumps catversion as stored rules change.
Author: Peter Geoghegan, with significant contributions from Heikki
Linnakangas and Andres Freund. Testing infrastructure by Jeff Janes.
Reviewed-By: Heikki Linnakangas, Andres Freund, Robert Haas, Simon Riggs,
Dean Rasheed, Stephen Frost and many others.
2015-05-08 05:31:36 +02:00
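The DO UPDATE path from the ON CONFLICT message, where the action sees both the existing tuple and the tuple proposed for insertion, can be sketched with an in-memory table. This is a single-threaded toy: it models only the pre-check and the choice between insert and update, not the retry after a concurrent conflicting insert, and `Table`, `Row`, and `upsert_add_balance` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ROWS 8

typedef struct Row
{
    int  key;       /* stands in for a uniquely constrained column */
    long balance;
} Row;

typedef struct Table
{
    Row rows[MAX_ROWS];
    int nrows;
} Table;

typedef enum UpsertResult
{
    UPSERT_INSERTED,
    UPSERT_UPDATED
} UpsertResult;

/* Pre-check: look for an existing row with the same key. */
Row *
find_by_key(Table *t, int key)
{
    int i;

    for (i = 0; i < t->nrows; i++)
    {
        if (t->rows[i].key == key)
            return &t->rows[i];
    }
    return NULL;
}

/*
 * Roughly: INSERT ... ON CONFLICT (key)
 *          DO UPDATE SET balance = t.balance + EXCLUDED.balance
 * "excluded" plays the role of the tuple proposed for insertion.
 */
UpsertResult
upsert_add_balance(Table *t, Row excluded)
{
    Row *existing = find_by_key(t, excluded.key);

    if (existing != NULL)
    {
        /* Conflict: the update can read both tuples. */
        existing->balance += excluded.balance;
        return UPSERT_UPDATED;
    }

    /* No conflict found: perform the insertion. */
    assert(t->nrows < MAX_ROWS);
    t->rows[t->nrows++] = excluded;
    return UPSERT_INSERTED;
}
```

In the server the gap between the pre-check and the insert is where a concurrent session can slip in a conflicting row, which is why the message describes deleting the speculatively inserted tuple and retrying; the toy above has no such gap.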
|
|
|
/*
|
|
|
|
* If needed, initialize target list, projection and qual for ON CONFLICT
|
|
|
|
* DO UPDATE.
|
|
|
|
*/
|
|
|
|
if (node->onConflictAction == ONCONFLICT_UPDATE)
|
|
|
|
{
|
|
|
|
ExprContext *econtext;
|
2018-02-17 06:17:38 +01:00
|
|
|
TupleDesc relationDesc;
|
2015-05-08 05:31:36 +02:00
|
|
|
TupleDesc tupDesc;
|
|
|
|
|
|
|
|
/* insert may only have one plan, inheritance is not expanded */
|
|
|
|
Assert(nplans == 1);
|
|
|
|
|
|
|
|
/* already exists if created by RETURNING processing above */
|
|
|
|
if (mtstate->ps.ps_ExprContext == NULL)
|
|
|
|
ExecAssignExprContext(estate, &mtstate->ps);
|
|
|
|
|
|
|
|
econtext = mtstate->ps.ps_ExprContext;
|
2018-02-17 06:17:38 +01:00
|
|
|
relationDesc = resultRelInfo->ri_RelationDesc->rd_att;
|
2015-05-08 05:31:36 +02:00
|
|
|
|
2018-03-26 15:43:54 +02:00
|
|
|
/*
|
|
|
|
* Initialize slot for the existing tuple. If we'll be performing
|
|
|
|
* tuple routing, the tuple descriptor to use for this will be
|
|
|
|
* determined based on which relation the update is actually applied
|
|
|
|
* to, so we don't set its tuple descriptor here.
|
|
|
|
*/
|
2018-02-17 06:17:38 +01:00
|
|
|
mtstate->mt_existing =
|
2018-03-26 15:43:54 +02:00
|
|
|
ExecInitExtraTupleSlot(mtstate->ps.state,
|
|
|
|
mtstate->mt_partition_tuple_routing ?
|
|
|
|
NULL : relationDesc);
|
Add support for INSERT ... ON CONFLICT DO NOTHING/UPDATE.
The newly added ON CONFLICT clause allows to specify an alternative to
raising a unique or exclusion constraint violation error when inserting.
ON CONFLICT refers to constraints that can either be specified using a
inference clause (by specifying the columns of a unique constraint) or
by naming a unique or exclusion constraint. DO NOTHING avoids the
constraint violation, without touching the pre-existing row. DO UPDATE
SET ... [WHERE ...] updates the pre-existing tuple, and has access to
both the tuple proposed for insertion and the existing tuple; the
optional WHERE clause can be used to prevent an update from being
executed. The UPDATE SET and WHERE clauses have access to the tuple
proposed for insertion using the "magic" EXCLUDED alias, and to the
pre-existing tuple using the table name or its alias.
This feature is often referred to as upsert.
This is implemented using a new infrastructure called "speculative
insertion". It is an optimistic variant of regular insertion that first
does a pre-check for existing tuples and then attempts an insert. If a
violating tuple was inserted concurrently, the speculatively inserted
tuple is deleted and a new attempt is made. If the pre-check finds a
matching tuple, the alternative DO NOTHING or DO UPDATE action is taken.
If the insertion succeeds without detecting a conflict, the tuple is
deemed inserted.
To handle the possible ambiguity between the excluded alias and a table
named excluded, and for convenience with long relation names, INSERT
INTO now can alias its target table.
Bumps catversion as stored rules change.
Author: Peter Geoghegan, with significant contributions from Heikki
Linnakangas and Andres Freund. Testing infrastructure by Jeff Janes.
Reviewed-By: Heikki Linnakangas, Andres Freund, Robert Haas, Simon Riggs,
Dean Rasheed, Stephen Frost and many others.
2015-05-08 05:31:36 +02:00
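As a rough illustration of the pre-check and alternative-action flow described in the commit message above, here is a minimal single-threaded sketch in C. It is not PostgreSQL code: `Table`, `Row`, `find_conflict`, and `upsert` are hypothetical names, the "unique constraint" is a plain integer key, and the concurrent retry step of speculative insertion is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ROWS 8

/* Hypothetical toy table with a unique constraint on "key". */
typedef struct Row { int key; int hits; } Row;
typedef struct Table { Row rows[MAX_ROWS]; size_t nrows; } Table;

typedef enum ConflictAction { ON_CONFLICT_NOTHING, ON_CONFLICT_UPDATE } ConflictAction;

/* Pre-check: find an existing row with the same unique key, if any. */
static Row *
find_conflict(Table *t, int key)
{
    for (size_t i = 0; i < t->nrows; i++)
        if (t->rows[i].key == key)
            return &t->rows[i];
    return NULL;
}

/*
 * Insert the proposed ("excluded") row, or take the alternative action when
 * the pre-check finds a conflicting row.  Returns true if a row was inserted
 * or updated, false if nothing was changed.
 */
static bool
upsert(Table *t, Row excluded, ConflictAction action)
{
    Row *existing = find_conflict(t, excluded.key);

    if (existing == NULL)
    {
        /* No conflict: the tuple is deemed inserted. */
        assert(t->nrows < MAX_ROWS);
        t->rows[t->nrows++] = excluded;
        return true;
    }

    if (action == ON_CONFLICT_UPDATE)
    {
        /* Analogous to SET hits = target.hits + excluded.hits. */
        existing->hits += excluded.hits;
        return true;
    }

    /* DO NOTHING: leave the pre-existing row untouched. */
    return false;
}
```

In this sketch the `excluded` parameter plays the role of the EXCLUDED alias, and `existing` plays the role of the target table row; a real implementation would also have to handle a conflicting row appearing between the pre-check and the insert.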
|
|
|
|
2015-05-13 00:13:22 +02:00
|
|
|
/* carried forward solely for the benefit of explain */
|
2015-05-08 05:31:36 +02:00
|
|
|
mtstate->mt_excludedtlist = node->exclRelTlist;
|
|
|
|
|
2018-03-26 15:43:54 +02:00
|
|
|
/* create state for DO UPDATE SET operation */
|
|
|
|
resultRelInfo->ri_onConflict = makeNode(OnConflictSetState);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Create the tuple slot for the UPDATE SET projection.
|
|
|
|
*
|
|
|
|
* Just like mt_existing above, we leave it without a tuple descriptor
|
|
|
|
* in the case of partitioning tuple routing, so that it can be
|
|
|
|
* changed by ExecPrepareTupleRouting. In that case, we still save
|
|
|
|
* the tupdesc in the parent's state: it can be reused by partitions
|
|
|
|
* with an identical descriptor to the parent.
|
|
|
|
*/
|
2015-05-08 05:31:36 +02:00
|
|
|
tupDesc = ExecTypeFromTL((List *) node->onConflictSet,
|
2018-02-17 06:17:38 +01:00
|
|
|
relationDesc->tdhasoid);
|
|
|
|
mtstate->mt_conflproj =
|
2018-03-26 15:43:54 +02:00
|
|
|
ExecInitExtraTupleSlot(mtstate->ps.state,
|
|
|
|
mtstate->mt_partition_tuple_routing ?
|
|
|
|
NULL : tupDesc);
|
|
|
|
resultRelInfo->ri_onConflict->oc_ProjTupdesc = tupDesc;
|
2015-05-08 05:31:36 +02:00
|
|
|
|
Faster expression evaluation and targetlist projection.
This replaces the old, recursive tree-walk based evaluation, with
non-recursive, opcode dispatch based, expression evaluation.
Projection is now implemented as part of expression evaluation.
This both leads to significant performance improvements, and makes
future just-in-time compilation of expressions easier.
The speed gains primarily come from:
- non-recursive implementation reduces stack usage / overhead
- simple sub-expressions are implemented with a single jump, without
function calls
- sharing some state between different sub-expressions
- reduced amount of indirect/hard to predict memory accesses by laying
out operation metadata sequentially; including the avoidance of
nearly all of the previously used linked lists
- more code has been moved to expression initialization, avoiding
constant re-checks at evaluation time
Future just-in-time compilation (JIT) has become easier, as
demonstrated by released patches intended to be merged in a later
release, for primarily two reasons: Firstly, due to a stricter split
between expression initialization and evaluation, less code has to be
handled by the JIT. Secondly, due to the non-recursive nature of the
generated "instructions", less performance-critical code-paths can
easily be shared between interpreted and compiled evaluation.
The new framework allows for significant future optimizations. E.g.:
- basic infrastructure to later reduce the per executor-startup
overhead of expression evaluation, by caching state in prepared
statements. That'd be helpful in OLTPish scenarios where
initialization overhead is measurable.
- optimizing the generated "code". A number of proposals for potential
work have already been made.
- optimizing the interpreter. Similarly a number of proposals have
been made here too.
The move of logic into the expression initialization step leads to some
backward-incompatible changes:
- Function permission checks are now done during expression
initialization, whereas previously they were done during
execution. In edge cases this can lead to errors being raised that
previously wouldn't have been, e.g. a NULL array being coerced to a
different array type previously didn't perform checks.
- The set of domain constraints to be checked is now evaluated once
during expression initialization; previously it was re-built
every time a domain check was evaluated. For normal queries this
doesn't change much, but e.g. for plpgsql functions, which cache
ExprStates, the old set could stick around longer. The behavior
around this might still change.
Author: Andres Freund, with significant changes by Tom Lane,
changes by Heikki Linnakangas
Reviewed-By: Tom Lane, Heikki Linnakangas
Discussion: https://postgr.es/m/20161206034955.bh33paeralxbtluv@alap3.anarazel.de
2017-03-14 23:45:36 +01:00
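To make the idea of non-recursive, opcode-dispatch evaluation in the commit message above more concrete, the following is a toy sketch in C. It assumes a simple operand stack and invented opcodes (`STEP_CONST`, `STEP_VAR`, and so on); PostgreSQL's real expression steps are laid out differently, so treat this only as an illustration of a flat step array walked by a single loop.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_MAX 16

/* Hypothetical opcodes; these are not PostgreSQL's actual step types. */
typedef enum StepOp
{
    STEP_CONST,     /* push a constant */
    STEP_VAR,       /* push a column of the input row */
    STEP_ADD,       /* pop two operands, push their sum */
    STEP_MUL,       /* pop two operands, push their product */
    STEP_DONE       /* finish and return the top of the stack */
} StepOp;

typedef struct Step
{
    StepOp op;
    int    arg;     /* constant value, or index into the input row */
} Step;

/*
 * Evaluate a flat, sequentially laid out array of steps.  There is no
 * recursion: one loop dispatches on each step's opcode in turn.
 */
static int
eval_steps(const Step *steps, const int *row)
{
    int stack[STACK_MAX];
    int sp = 0;

    for (const Step *s = steps;; s++)
    {
        switch (s->op)
        {
            case STEP_CONST:
                assert(sp < STACK_MAX);
                stack[sp++] = s->arg;
                break;
            case STEP_VAR:
                assert(sp < STACK_MAX);
                stack[sp++] = row[s->arg];
                break;
            case STEP_ADD:
                assert(sp >= 2);
                sp--;
                stack[sp - 1] += stack[sp];
                break;
            case STEP_MUL:
                assert(sp >= 2);
                sp--;
                stack[sp - 1] *= stack[sp];
                break;
            case STEP_DONE:
                assert(sp == 1);
                return stack[0];
        }
    }
}
```

The step array is built once (the "initialization" phase) and can then be evaluated repeatedly against different rows, which mirrors the split between expression initialization and evaluation that the message describes.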
|
|
|
/* build UPDATE SET projection state */
|
2018-03-26 15:43:54 +02:00
|
|
|
resultRelInfo->ri_onConflict->oc_ProjInfo =
|
2017-03-14 23:45:36 +01:00
|
|
|
ExecBuildProjectionInfo(node->onConflictSet, econtext,
|
|
|
|
mtstate->mt_conflproj, &mtstate->ps,
|
2018-02-17 06:17:38 +01:00
|
|
|
relationDesc);
|
2015-05-08 05:31:36 +02:00
|
|
|
|
2018-03-26 15:43:54 +02:00
|
|
|
/* initialize state to evaluate the WHERE clause, if any */
|
2015-05-08 05:31:36 +02:00
|
|
|
if (node->onConflictWhere)
|
|
|
|
{
|
|
|
|
ExprState *qualexpr;
|
|
|
|
|
2017-03-14 23:45:36 +01:00
|
|
|
qualexpr = ExecInitQual((List *) node->onConflictWhere,
|
2015-05-19 01:55:10 +02:00
|
|
|
&mtstate->ps);
|
2018-03-26 15:43:54 +02:00
|
|
|
resultRelInfo->ri_onConflict->oc_WhereClause = qualexpr;
|
2015-05-08 05:31:36 +02:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
Re-implement EvalPlanQual processing to improve its performance and eliminate
a lot of strange behaviors that occurred in join cases. We now identify the
"current" row for every joined relation in UPDATE, DELETE, and SELECT FOR
UPDATE/SHARE queries. If an EvalPlanQual recheck is necessary, we jam the
appropriate row into each scan node in the rechecking plan, forcing it to emit
only that one row. The former behavior could rescan the whole of each joined
relation for each recheck, which was terrible for performance, and what's much
worse could result in duplicated output tuples.
Also, the original implementation of EvalPlanQual could not re-use the recheck
execution tree --- it had to go through a full executor init and shutdown for
every row to be tested. To avoid this overhead, I've associated a special
runtime Param with each LockRows or ModifyTable plan node, and arranged to
make every scan node below such a node depend on that Param. Thus, by
signaling a change in that Param, the EPQ machinery can just rescan the
already-built test plan.
This patch also adds a prohibition on set-returning functions in the
targetlist of SELECT FOR UPDATE/SHARE. This is needed to avoid the
duplicate-output-tuple problem. It seems fairly reasonable since the
other restrictions on SELECT FOR UPDATE are meant to ensure that there
is a unique correspondence between source tuples and result tuples,
which an output SRF destroys as much as anything else does.
2009-10-26 03:26:45 +01:00
|
|
|
/*
|
2010-02-26 03:01:40 +01:00
|
|
|
* If we have any secondary relations in an UPDATE or DELETE, they need to
|
|
|
|
* be treated like non-locked relations in SELECT FOR UPDATE, ie, the
|
2014-05-06 18:12:18 +02:00
|
|
|
* EvalPlanQual mechanism needs to be told about them. Locate the
|
2010-02-26 03:01:40 +01:00
|
|
|
* relevant ExecRowMarks.
|
2009-10-26 03:26:45 +01:00
|
|
|
*/
|
|
|
|
foreach(l, node->rowMarks)
|
|
|
|
{
|
Improve castNode notation by introducing list-extraction-specific variants.
This extends the castNode() notation introduced by commit 5bcab1114 to
provide, in one step, extraction of a list cell's pointer and coercion to
a concrete node type. For example, "lfirst_node(Foo, lc)" is the same
as "castNode(Foo, lfirst(lc))". Almost half of the uses of castNode
that have appeared so far include a list extraction call, so this is
pretty widely useful, and it saves a few more keystrokes compared to the
old way.
As with the previous patch, back-patch the addition of these macros to
pg_list.h, so that the notation will be available when back-patching.
Patch by me, after an idea of Andrew Gierth's.
Discussion: https://postgr.es/m/14197.1491841216@sss.pgh.pa.us
2017-04-10 19:51:29 +02:00
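The equivalence stated in the commit message above, `lfirst_node(Foo, lc)` being the same as `castNode(Foo, lfirst(lc))`, can be sketched with simplified stand-ins. The types and the `cast_node_impl` helper below are assumptions made for illustration only; the real definitions live in PostgreSQL's `nodes.h` and `pg_list.h` and differ in detail.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for PostgreSQL's node tags and list cells. */
typedef enum NodeTag { T_Foo = 1, T_Bar = 2 } NodeTag;

typedef struct Node { NodeTag type; } Node;

/* Every node type starts with a NodeTag, so it can be inspected as a Node. */
typedef struct Foo { NodeTag type; int value; } Foo;

typedef struct ListCell { void *ptr_value; } ListCell;

#define nodeTag(nodeptr)  (((const Node *) (nodeptr))->type)
#define lfirst(lc)        ((lc)->ptr_value)

/* Checked downcast: asserts that the tag matches before the cast is used. */
static inline void *
cast_node_impl(NodeTag expected, void *ptr)
{
    assert(ptr == NULL || nodeTag(ptr) == expected);
    return ptr;
}

#define castNode(_type_, nodeptr) \
    ((_type_ *) cast_node_impl(T_##_type_, (nodeptr)))

/* One-step list extraction plus checked cast. */
#define lfirst_node(_type_, lc)  castNode(_type_, lfirst(lc))

/* Reads the value from a list cell that is expected to hold a Foo. */
static int
foo_value_from_cell(ListCell *lc)
{
    Foo *foo = lfirst_node(Foo, lc);

    return foo->value;
}
```

The benefit over a bare C cast is that, in assert-enabled builds, a cell holding the wrong node type is caught at the point of extraction rather than producing a silently mistyped pointer.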
|
|
|
PlanRowMark *rc = lfirst_node(PlanRowMark, l);
|
2011-01-13 02:47:02 +01:00
|
|
|
ExecRowMark *erm;
|
2009-10-26 03:26:45 +01:00
|
|
|
|
|
|
|
/* ignore "parent" rowmarks; they are irrelevant at runtime */
|
|
|
|
if (rc->isParent)
|
|
|
|
continue;
|
|
|
|
|
2011-01-13 02:47:02 +01:00
|
|
|
/* find ExecRowMark (same for all subplans) */
|
Add support for doing late row locking in FDWs.
Previously, FDWs could only do "early row locking", that is lock a row as
soon as it's fetched, even though local restriction/join conditions might
discard the row later. This patch adds callbacks that allow FDWs to do
late locking in the same way that it's done for regular tables.
To make use of this feature, an FDW must support the "ctid" column as a
unique row identifier. Currently, since ctid has to be of type TID,
the feature is of limited use, though in principle it could be used by
postgres_fdw. We may eventually allow FDWs to specify another data type
for ctid, which would make it possible for more FDWs to use this feature.
This commit does not modify postgres_fdw to use late locking. We've
tested some prototype code for that, but it's not in committable shape,
and besides it's quite unclear whether it actually makes sense to do late
locking against a remote server. The extra round trips required are likely
to outweigh any benefit from improved concurrency.
Etsuro Fujita, reviewed by Ashutosh Bapat, and hacked up a lot by me
2015-05-12 20:10:10 +02:00
|
|
|
erm = ExecFindRowMark(estate, rc->rti, false);
|
2011-01-13 02:47:02 +01:00
|
|
|
|
|
|
|
/* build ExecAuxRowMark for each subplan */
|
|
|
|
for (i = 0; i < nplans; i++)
|
2009-10-26 03:26:45 +01:00
|
|
|
{
|
2011-01-13 02:47:02 +01:00
|
|
|
ExecAuxRowMark *aerm;
|
2009-10-26 03:26:45 +01:00
|
|
|
|
2011-01-13 02:47:02 +01:00
|
|
|
subplan = mtstate->mt_plans[i]->plan;
|
|
|
|
aerm = ExecBuildAuxRowMark(erm, subplan->targetlist);
|
|
|
|
mtstate->mt_arowmarks[i] = lappend(mtstate->mt_arowmarks[i], aerm);
|
|
|
|
}
|
2009-10-26 03:26:45 +01:00
|
|
|
}
|
|
|
|
|
MERGE SQL Command following SQL:2016
MERGE performs actions that modify rows in the target table
using a source table or query. MERGE provides a single SQL
statement that can conditionally INSERT/UPDATE/DELETE rows,
a task that would otherwise require multiple PL statements.
e.g.
MERGE INTO target AS t
USING source AS s
ON t.tid = s.sid
WHEN MATCHED AND t.balance > s.delta THEN
UPDATE SET balance = t.balance - s.delta
WHEN MATCHED THEN
DELETE
WHEN NOT MATCHED AND s.delta > 0 THEN
INSERT VALUES (s.sid, s.delta)
WHEN NOT MATCHED THEN
DO NOTHING;
MERGE works with regular and partitioned tables, including
column and row security enforcement, as well as support for
row, statement and transition triggers.
MERGE is optimized for OLTP and is parameterizable, though
also useful for large scale ETL/ELT. MERGE is not intended
to be used in preference to existing single SQL commands
for INSERT, UPDATE or DELETE since there is some overhead.
MERGE can be used statically from PL/pgSQL.
MERGE does not yet support inheritance, write rules,
RETURNING clauses, updatable views or foreign tables.
MERGE follows SQL Standard per the most recent SQL:2016.
Includes full tests and documentation, including full
isolation tests to demonstrate the concurrent behavior.
This version written from scratch in 2017 by Simon Riggs,
using docs and tests originally written in 2009. Later work
from Pavan Deolasee has been both complex and deep, leaving
the lead author credit now in his hands.
Extensive discussion of concurrency from Peter Geoghegan,
with thanks for the time and effort contributed.
Various issues reported via sqlsmith by Andreas Seltenreich
Authors: Pavan Deolasee, Simon Riggs
Reviewer: Peter Geoghegan, Amit Langote, Tomas Vondra, Simon Riggs
Discussion:
https://postgr.es/m/CANP8+jKitBSrB7oTgT9CY2i1ObfOt36z0XMraQc+Xrz8QB0nXA@mail.gmail.com
https://postgr.es/m/CAH2-WzkJdBuxj9PO=2QaO9-3h3xGbQPZ34kJH=HukRekwM-GZg@mail.gmail.com
2018-04-03 10:28:16 +02:00
|
|
|
resultRelInfo = mtstate->resultRelInfo;
|
|
|
|
|
|
|
|
if (node->mergeActionList)
|
|
|
|
{
|
|
|
|
ListCell *l;
|
|
|
|
ExprContext *econtext;
|
|
|
|
List *mergeMatchedActionStates = NIL;
|
|
|
|
List *mergeNotMatchedActionStates = NIL;
|
|
|
|
TupleDesc relationDesc = resultRelInfo->ri_RelationDesc->rd_att;
|
|
|
|
|
|
|
|
mtstate->mt_merge_subcommands = 0;
|
|
|
|
|
|
|
|
if (mtstate->ps.ps_ExprContext == NULL)
|
|
|
|
ExecAssignExprContext(estate, &mtstate->ps);
|
|
|
|
|
|
|
|
econtext = mtstate->ps.ps_ExprContext;
|
|
|
|
|
|
|
|
/* initialize slot for the existing tuple */
|
|
|
|
Assert(mtstate->mt_existing == NULL);
|
|
|
|
mtstate->mt_existing =
|
|
|
|
ExecInitExtraTupleSlot(mtstate->ps.state,
|
|
|
|
mtstate->mt_partition_tuple_routing ?
|
|
|
|
NULL : relationDesc);
|
|
|
|
|
|
|
|
/* initialize slot for merge actions */
|
|
|
|
Assert(mtstate->mt_mergeproj == NULL);
|
|
|
|
mtstate->mt_mergeproj =
|
|
|
|
ExecInitExtraTupleSlot(mtstate->ps.state,
|
|
|
|
mtstate->mt_partition_tuple_routing ?
|
|
|
|
NULL : relationDesc);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Create a MergeActionState for each action on the mergeActionList
|
|
|
|
* and add it to either a list of matched actions or not-matched
|
|
|
|
* actions.
|
|
|
|
*/
|
|
|
|
foreach(l, node->mergeActionList)
|
|
|
|
{
|
|
|
|
MergeAction *action = (MergeAction *) lfirst(l);
|
|
|
|
MergeActionState *action_state = makeNode(MergeActionState);
|
|
|
|
TupleDesc tupDesc;
|
|
|
|
|
|
|
|
action_state->matched = action->matched;
|
|
|
|
action_state->commandType = action->commandType;
|
|
|
|
action_state->whenqual = ExecInitQual((List *) action->qual,
|
|
|
|
&mtstate->ps);
|
|
|
|
|
|
|
|
/* create target slot for this action's projection */
|
|
|
|
tupDesc = ExecTypeFromTL((List *) action->targetList,
|
|
|
|
resultRelInfo->ri_RelationDesc->rd_rel->relhasoids);
|
|
|
|
action_state->tupDesc = tupDesc;
|
|
|
|
|
|
|
|
/* build action projection state */
|
|
|
|
action_state->proj =
|
|
|
|
ExecBuildProjectionInfo(action->targetList, econtext,
|
|
|
|
mtstate->mt_mergeproj, &mtstate->ps,
|
|
|
|
resultRelInfo->ri_RelationDesc->rd_att);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* We create two lists - one for WHEN MATCHED actions and one
|
|
|
|
* for WHEN NOT MATCHED actions - and stick the
|
|
|
|
* MergeActionState into the appropriate list.
|
|
|
|
*/
|
|
|
|
if (action_state->matched)
|
|
|
|
mergeMatchedActionStates =
|
|
|
|
lappend(mergeMatchedActionStates, action_state);
|
|
|
|
else
|
|
|
|
mergeNotMatchedActionStates =
|
|
|
|
lappend(mergeNotMatchedActionStates, action_state);
|
|
|
|
|
|
|
|
switch (action->commandType)
|
|
|
|
{
|
|
|
|
case CMD_INSERT:
|
|
|
|
ExecCheckPlanOutput(resultRelInfo->ri_RelationDesc,
|
|
|
|
action->targetList);
|
|
|
|
mtstate->mt_merge_subcommands |= MERGE_INSERT;
|
|
|
|
break;
|
|
|
|
case CMD_UPDATE:
|
|
|
|
ExecCheckPlanOutput(resultRelInfo->ri_RelationDesc,
|
|
|
|
action->targetList);
|
|
|
|
mtstate->mt_merge_subcommands |= MERGE_UPDATE;
|
|
|
|
break;
|
|
|
|
case CMD_DELETE:
|
|
|
|
mtstate->mt_merge_subcommands |= MERGE_DELETE;
|
|
|
|
break;
|
|
|
|
case CMD_NOTHING:
|
|
|
|
break;
|
|
|
|
default:
|
|
|
|
elog(ERROR, "unknown operation");
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
|
|
|
resultRelInfo->ri_mergeState->matchedActionStates =
|
|
|
|
mergeMatchedActionStates;
|
|
|
|
resultRelInfo->ri_mergeState->notMatchedActionStates =
|
|
|
|
mergeNotMatchedActionStates;
|
|
|
|
|
|
|
|
}
|
|
|
|
}
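The MERGE setup loop above sorts each action into a matched or not-matched group and records which subcommands occur in a bitmask. The following is a minimal sketch of that bookkeeping only, not the executor's code: `Action`, `ActionSummary`, `summarize_actions` and the `SUB_*` flags are hypothetical stand-ins for `MergeAction`, the two action-state lists and the `MERGE_*` flags, and the sketch counts actions where the real code builds lists.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real types live in PostgreSQL's headers. */
typedef enum { ACT_INSERT, ACT_UPDATE, ACT_DELETE, ACT_NOTHING } ActionKind;

#define SUB_INSERT 0x01u
#define SUB_UPDATE 0x02u
#define SUB_DELETE 0x04u

typedef struct
{
    int        matched;   /* nonzero: WHEN MATCHED; zero: WHEN NOT MATCHED */
    ActionKind kind;
} Action;

typedef struct
{
    size_t   n_matched;       /* actions in the matched group */
    size_t   n_not_matched;   /* actions in the not-matched group */
    unsigned subcommands;     /* OR of SUB_* flags seen so far */
} ActionSummary;

/* Returns 0 on success, -1 if an action has an unrecognized kind. */
int
summarize_actions(const Action *actions, size_t n, ActionSummary *out)
{
    out->n_matched = 0;
    out->n_not_matched = 0;
    out->subcommands = 0;

    for (size_t i = 0; i < n; i++)
    {
        /* Group membership depends only on the matched flag. */
        if (actions[i].matched)
            out->n_matched++;
        else
            out->n_not_matched++;

        /* DO NOTHING contributes no subcommand bit. */
        switch (actions[i].kind)
        {
            case ACT_INSERT:
                out->subcommands |= SUB_INSERT;
                break;
            case ACT_UPDATE:
                out->subcommands |= SUB_UPDATE;
                break;
            case ACT_DELETE:
                out->subcommands |= SUB_DELETE;
                break;
            case ACT_NOTHING:
                break;
            default:
                return -1;
        }
    }
    return 0;
}
```

Feeding it the four actions of the example statement in the MERGE commit message (UPDATE and DELETE when matched, INSERT and DO NOTHING when not matched) would give two actions per group and all three subcommand bits set.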
|
|
|
|
|
2011-01-13 02:47:02 +01:00
|
|
|
/* select first subplan */
|
|
|
|
mtstate->mt_whichplan = 0;
|
|
|
|
subplan = (Plan *) linitial(node->plans);
|
|
|
|
EvalPlanQualSetPlan(&mtstate->mt_epqstate, subplan,
|
|
|
|
mtstate->mt_arowmarks[0]);
|
|
|
|
|
2009-10-10 03:43:50 +02:00
|
|
|
/*
|
|
|
|
* Initialize the junk filter(s) if needed. INSERT queries need a filter
|
2010-02-26 03:01:40 +01:00
|
|
|
* if there are any junk attrs in the tlist. UPDATE and DELETE always
|
2017-08-03 18:47:00 +02:00
|
|
|
* need a filter, since there's always at least one junk attribute present
|
|
|
|
* --- no need to look first. Typically, this will be a 'ctid' or
|
|
|
|
* 'wholerow' attribute, but in the case of a foreign data wrapper it
|
|
|
|
* might be a set of junk attributes sufficient to identify the remote
|
2018-04-03 10:28:16 +02:00
|
|
|
* row. We follow this logic for MERGE, so it always has junk attributes.
|
2009-10-10 03:43:50 +02:00
|
|
|
*
|
|
|
|
* If there are multiple result relations, each one needs its own junk
|
2014-05-06 18:12:18 +02:00
|
|
|
* filter. Note multiple rels are only possible for UPDATE/DELETE, so we
|
2009-10-10 03:43:50 +02:00
|
|
|
* can't be fooled by some needing a filter and some not.
|
|
|
|
*
|
|
|
|
* This section of code is also a convenient place to verify that the
|
|
|
|
* output of an INSERT or UPDATE matches the target table(s).
|
|
|
|
*/
|
|
|
|
{
|
|
|
|
bool junk_filter_needed = false;
|
|
|
|
|
|
|
|
switch (operation)
|
|
|
|
{
|
|
|
|
case CMD_INSERT:
|
|
|
|
foreach(l, subplan->targetlist)
|
|
|
|
{
|
|
|
|
TargetEntry *tle = (TargetEntry *) lfirst(l);
|
|
|
|
|
|
|
|
if (tle->resjunk)
|
|
|
|
{
|
|
|
|
junk_filter_needed = true;
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
break;
|
|
|
|
case CMD_UPDATE:
|
|
|
|
case CMD_DELETE:
|
2018-04-03 10:28:16 +02:00
|
|
|
case CMD_MERGE:
|
2009-10-10 03:43:50 +02:00
|
|
|
junk_filter_needed = true;
|
|
|
|
break;
|
|
|
|
default:
|
|
|
|
elog(ERROR, "unknown operation");
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (junk_filter_needed)
|
|
|
|
{
|
2011-02-26 00:56:23 +01:00
|
|
|
resultRelInfo = mtstate->resultRelInfo;
|
2009-10-10 03:43:50 +02:00
|
|
|
for (i = 0; i < nplans; i++)
|
|
|
|
{
|
|
|
|
JunkFilter *j;
|
|
|
|
|
|
|
|
subplan = mtstate->mt_plans[i]->plan;
|
2018-04-03 10:28:16 +02:00
|
|
|
|
2009-10-10 03:43:50 +02:00
|
|
|
if (operation == CMD_INSERT || operation == CMD_UPDATE)
|
|
|
|
ExecCheckPlanOutput(resultRelInfo->ri_RelationDesc,
|
|
|
|
subplan->targetlist);
|
|
|
|
|
|
|
|
j = ExecInitJunkFilter(subplan->targetlist,
|
Phase 3 of pgindent updates.
Don't move parenthesized lines to the left, even if that means they
flow past the right margin.
By default, BSD indent lines up statement continuation lines that are
within parentheses so that they start just to the right of the preceding
left parenthesis. However, traditionally, if that resulted in the
continuation line extending to the right of the desired right margin,
then indent would push it left just far enough to not overrun the margin,
if it could do so without making the continuation line start to the left of
the current statement indent. That makes for a weird mix of indentations
unless one has been completely rigid about never violating the 80-column
limit.
This behavior has been pretty universally panned by Postgres developers.
Hence, disable it with indent's new -lpl switch, so that parenthesized
lines are always lined up with the preceding left paren.
This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.
Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 21:35:54 +02:00
|
|
|
resultRelInfo->ri_RelationDesc->rd_att->tdhasoid,
|
2018-02-17 06:17:38 +01:00
|
|
|
ExecInitExtraTupleSlot(estate, NULL));
|
2009-10-10 03:43:50 +02:00
|
|
|
|
2018-04-03 10:28:16 +02:00
|
|
|
if (operation == CMD_UPDATE ||
|
|
|
|
operation == CMD_DELETE ||
|
|
|
|
operation == CMD_MERGE)
|
2009-10-10 03:43:50 +02:00
|
|
|
{
|
2010-10-10 19:43:33 +02:00
|
|
|
/* For UPDATE/DELETE/MERGE, find the appropriate junk attr now */
|
2013-03-10 19:14:53 +01:00
|
|
|
char relkind;
|
|
|
|
|
|
|
|
relkind = resultRelInfo->ri_RelationDesc->rd_rel->relkind;
|
2013-07-16 19:55:44 +02:00
|
|
|
if (relkind == RELKIND_RELATION ||
|
Implement table partitioning.
Table partitioning is like table inheritance and reuses much of the
existing infrastructure, but there are some important differences.
The parent is called a partitioned table and is always empty; it may
not have indexes or non-inherited constraints, since those make no
sense for a relation with no data of its own. The children are called
partitions and contain all of the actual data. Each partition has an
implicit partitioning constraint. Multiple inheritance is not
allowed, and partitioning and inheritance can't be mixed. Partitions
can't have extra columns and may not allow nulls unless the parent
does. Tuples inserted into the parent are automatically routed to the
correct partition, so tuple-routing ON INSERT triggers are not needed.
Tuple routing isn't yet supported for partitions which are foreign
tables, and it doesn't handle updates that cross partition boundaries.
Currently, tables can be range-partitioned or list-partitioned. List
partitioning is limited to a single column, but range partitioning can
involve multiple columns. A partitioning "column" can be an
expression.
Because table partitioning is less general than table inheritance, it
is hoped that it will be easier to reason about properties of
partitions, and therefore that this will serve as a better foundation
for a variety of possible optimizations, including query planner
optimizations. The tuple routing which this patch does based on
the implicit partitioning constraints is an example of this, but it
seems likely that many other useful optimizations are also possible.
Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat,
Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova,
Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
|
|
|
relkind == RELKIND_MATVIEW ||
|
|
|
|
relkind == RELKIND_PARTITIONED_TABLE)
|
2010-10-10 19:43:33 +02:00
|
|
|
{
|
|
|
|
j->jf_junkAttNo = ExecFindJunkAttribute(j, "ctid");
|
|
|
|
if (!AttributeNumberIsValid(j->jf_junkAttNo))
|
|
|
|
elog(ERROR, "could not find junk ctid column");
|
2018-04-03 10:28:16 +02:00
|
|
|
|
|
|
|
if (operation == CMD_MERGE &&
|
|
|
|
relkind == RELKIND_PARTITIONED_TABLE)
|
|
|
|
{
|
|
|
|
j->jf_otherJunkAttNo = ExecFindJunkAttribute(j, "tableoid");
|
|
|
|
if (!AttributeNumberIsValid(j->jf_otherJunkAttNo))
|
|
|
|
elog(ERROR, "could not find junk tableoid column");
|
|
|
|
|
|
|
|
}
|
2010-10-10 19:43:33 +02:00
|
|
|
}
|
2013-03-10 19:14:53 +01:00
|
|
|
else if (relkind == RELKIND_FOREIGN_TABLE)
|
|
|
|
{
|
2014-03-23 07:16:34 +01:00
|
|
|
/*
|
2017-08-14 23:29:33 +02:00
|
|
|
* When there is a row-level trigger, there should be
|
|
|
|
* a wholerow attribute.
|
2014-03-23 07:16:34 +01:00
|
|
|
*/
|
|
|
|
j->jf_junkAttNo = ExecFindJunkAttribute(j, "wholerow");
|
2013-03-10 19:14:53 +01:00
|
|
|
}
|
2010-10-10 19:43:33 +02:00
|
|
|
else
|
|
|
|
{
|
|
|
|
j->jf_junkAttNo = ExecFindJunkAttribute(j, "wholerow");
|
|
|
|
if (!AttributeNumberIsValid(j->jf_junkAttNo))
|
|
|
|
elog(ERROR, "could not find junk wholerow column");
|
|
|
|
}
|
2009-10-10 03:43:50 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
resultRelInfo->ri_junkFilter = j;
|
|
|
|
resultRelInfo++;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
else
|
|
|
|
{
|
|
|
|
if (operation == CMD_INSERT)
|
2011-02-26 00:56:23 +01:00
|
|
|
ExecCheckPlanOutput(mtstate->resultRelInfo->ri_RelationDesc,
|
2009-10-10 03:43:50 +02:00
|
|
|
subplan->targetlist);
|
|
|
|
}
|
|
|
|
}
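The junk-filter block above makes two decisions: whether the target list contains any junk entry at all (the INSERT case), and where a named junk column such as "ctid" or "wholerow" sits. The following is a minimal sketch of both lookups, assuming a simplified target list. `Entry`, `has_junk` and `find_junk_attr` are hypothetical names, not the executor's API; the sketch uses 1-based attribute numbers with 0 meaning "not found".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified target-list entry. */
typedef struct
{
    const char *name;      /* column name */
    int         resjunk;   /* nonzero: helper column, not stored in the table */
} Entry;

/* Returns 1 if any entry is junk, so a filter would be needed; else 0. */
int
has_junk(const Entry *tlist, size_t n)
{
    for (size_t i = 0; i < n; i++)
    {
        if (tlist[i].resjunk)
            return 1;
    }
    return 0;
}

/*
 * Returns the 1-based position of the junk entry called "name",
 * or 0 if no junk entry has that name.  Non-junk entries never match.
 */
int
find_junk_attr(const Entry *tlist, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
    {
        if (tlist[i].resjunk && strcmp(tlist[i].name, name) == 0)
            return (int) (i + 1);
    }
    return 0;
}
```

A caller that requires the column, as the "ctid" branch does, would treat a 0 result as an error; a caller for which the column is optional, as in the foreign-table branch, would simply keep the 0.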
|
|
|
|
|
|
|
|
/*
|
2010-02-26 03:01:40 +01:00
|
|
|
* Set up a tuple table slot for use for trigger output tuples. In a plan
|
|
|
|
* containing multiple ModifyTable nodes, all can share one such slot, so
|
|
|
|
* we keep it in the estate.
|
2009-10-10 03:43:50 +02:00
|
|
|
*/
|
|
|
|
if (estate->es_trig_tuple_slot == NULL)
|
2018-02-17 06:17:38 +01:00
|
|
|
estate->es_trig_tuple_slot = ExecInitExtraTupleSlot(estate, NULL);
|
2009-10-10 03:43:50 +02:00
|
|
|
|
2011-02-26 00:56:23 +01:00
|
|
|
/*
|
|
|
|
* Lastly, if this is not the primary (canSetTag) ModifyTable node, add it
|
|
|
|
* to estate->es_auxmodifytables so that it will be run to completion by
|
|
|
|
* ExecPostprocessPlan. (It'd actually work fine to add the primary
|
2011-04-10 17:42:00 +02:00
|
|
|
* ModifyTable node too, but there's no need.) Note the use of lcons not
|
|
|
|
* lappend: we need later-initialized ModifyTable nodes to be shut down
|
|
|
|
* before earlier ones. This ensures that we don't throw away RETURNING
|
|
|
|
* rows that need to be seen by a later CTE subplan.
|
2011-02-26 00:56:23 +01:00
|
|
|
*/
|
|
|
|
if (!mtstate->canSetTag)
|
2011-02-26 05:53:34 +01:00
|
|
|
estate->es_auxmodifytables = lcons(mtstate,
|
|
|
|
estate->es_auxmodifytables);
|
2011-02-26 00:56:23 +01:00
|
|
|
|
2009-10-10 03:43:50 +02:00
|
|
|
return mtstate;
|
|
|
|
}
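The closing comment of the function above explains why non-primary ModifyTable nodes are added with lcons rather than lappend: prepending keeps the most recently initialized node at the head, so a front-to-back walk visits nodes in reverse initialization order. The following is a minimal sketch of that ordering property, using a hypothetical hand-rolled list (`Node`, `prepend`, `free_list`) and not PostgreSQL's List API.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical singly linked list; not PostgreSQL's List type. */
typedef struct Node
{
    int          id;     /* stands in for a ModifyTable node */
    struct Node *next;
} Node;

/* Push a new element onto the front of the list and return the new head. */
Node *
prepend(Node *head, int id)
{
    Node *n = malloc(sizeof *n);

    if (n == NULL)
        abort();          /* out of memory: nothing sensible to do in a sketch */
    n->id = id;
    n->next = head;
    return n;
}

/* Release every element of the list. */
void
free_list(Node *head)
{
    while (head != NULL)
    {
        Node *next = head->next;

        free(head);
        head = next;
    }
}
```

Prepending ids 1, 2 and 3 in that order leaves the list reading 3, 2, 1 from the head, which is the later-first order the comment asks for.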
|
|
|
|
|
|
|
|
/* ----------------------------------------------------------------
|
|
|
|
* ExecEndModifyTable
|
|
|
|
*
|
|
|
|
* Shuts down the plan.
|
|
|
|
*
|
|
|
|
* Returns nothing of interest.
|
|
|
|
* ----------------------------------------------------------------
|
|
|
|
*/
|
|
|
|
void
|
|
|
|
ExecEndModifyTable(ModifyTableState *node)
|
|
|
|
{
|
2010-02-26 03:01:40 +01:00
|
|
|
int i;
|
2009-10-10 03:43:50 +02:00
|
|
|
|
2013-03-10 19:14:53 +01:00
|
|
|
/*
|
|
|
|
* Allow any FDWs to shut down
|
|
|
|
*/
|
|
|
|
for (i = 0; i < node->mt_nplans; i++)
|
|
|
|
{
|
|
|
|
ResultRelInfo *resultRelInfo = node->resultRelInfo + i;
|
|
|
|
|
2016-03-18 18:48:58 +01:00
|
|
|
if (!resultRelInfo->ri_usesFdwDirectModify &&
|
|
|
|
resultRelInfo->ri_FdwRoutine != NULL &&
|
2013-03-10 19:14:53 +01:00
|
|
|
resultRelInfo->ri_FdwRoutine->EndForeignModify != NULL)
|
|
|
|
resultRelInfo->ri_FdwRoutine->EndForeignModify(node->ps.state,
|
|
|
|
resultRelInfo);
|
|
|
|
}
|
|
|
|
|
2018-01-04 21:48:15 +01:00
|
|
|
/* Close all the partitioned tables, leaf partitions, and their indices */
|
|
|
|
if (node->mt_partition_tuple_routing)
|
|
|
|
ExecCleanupTupleRouting(node->mt_partition_tuple_routing);
|
2017-01-04 19:05:29 +01:00
|
|
|
|
2009-10-10 03:43:50 +02:00
|
|
|
/*
|
|
|
|
* Free the exprcontext
|
|
|
|
*/
|
|
|
|
ExecFreeExprContext(&node->ps);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* clean out the tuple table
|
|
|
|
*/
|
|
|
|
ExecClearTuple(node->ps.ps_ResultTupleSlot);
|
|
|
|
|
2009-10-26 03:26:45 +01:00
|
|
|
/*
|
|
|
|
* Terminate EPQ execution if active
|
|
|
|
*/
|
|
|
|
EvalPlanQualEnd(&node->mt_epqstate);
|
|
|
|
|
2009-10-10 03:43:50 +02:00
|
|
|
/*
|
|
|
|
* shut down subplans
|
|
|
|
*/
|
2010-02-26 03:01:40 +01:00
|
|
|
for (i = 0; i < node->mt_nplans; i++)
|
2009-10-10 03:43:50 +02:00
|
|
|
ExecEndNode(node->mt_plans[i]);
|
|
|
|
}
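The FDW shutdown loop in ExecEndModifyTable calls an optional callback only after three checks: the relation does not use direct modification, it has a routine table, and that table supplies the callback. The following is a minimal sketch of that guard, with hypothetical types (`Routine`, `Rel`) and helpers (`shutdown_rel`, `count_call`) standing in for the FDW routine struct and ResultRelInfo.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical routine table with one optional callback. */
typedef struct
{
    void (*end_modify) (void *state);   /* may be NULL */
} Routine;

/* Hypothetical per-relation record. */
typedef struct
{
    int            uses_direct_modify;  /* nonzero: callback must be skipped */
    const Routine *routine;             /* NULL when there is no routine table */
    void          *state;               /* passed through to the callback */
} Rel;

/* Invoke the callback if all three guards pass; return 1 if it ran, else 0. */
int
shutdown_rel(const Rel *rel)
{
    if (!rel->uses_direct_modify &&
        rel->routine != NULL &&
        rel->routine->end_modify != NULL)
    {
        rel->routine->end_modify(rel->state);
        return 1;
    }
    return 0;
}

/* Example callback: counts invocations through an int pointed to by state. */
void
count_call(void *state)
{
    ++*(int *) state;
}
```

The checks short-circuit from left to right, so the function pointer is never read through a NULL routine table.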
|
|
|
|
|
|
|
|
void
|
2010-07-12 19:01:06 +02:00
|
|
|
ExecReScanModifyTable(ModifyTableState *node)
|
2009-10-10 03:43:50 +02:00
|
|
|
{
|
|
|
|
/*
|
2010-02-26 03:01:40 +01:00
|
|
|
* Currently, we don't need to support rescan on ModifyTable nodes. The
|
|
|
|
* semantics of that would be a bit debatable anyway.
|
2009-10-10 03:43:50 +02:00
|
|
|
*/
|
|
|
|
elog(ERROR, "ExecReScanModifyTable is not implemented");
|
|
|
|
}
|