/*-------------------------------------------------------------------------
 *
 * nodeSubqueryscan.c
 *	  Support routines for scanning subqueries (subselects in rangetable).
 *
 * This is just enough different from sublinks (nodeSubplan.c) to mean that
 * we need two sets of code. Ought to look at trying to unify the cases.
 *
 *
 * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
 *
 *
 * IDENTIFICATION
 *	  src/backend/executor/nodeSubqueryscan.c
 *
 *-------------------------------------------------------------------------
 */
/*
 * INTERFACE ROUTINES
 *		ExecSubqueryScan			scans a subquery.
 *		SubqueryNext				retrieves next tuple in sequential order.
 *		ExecInitSubqueryScan		creates and initializes a subqueryscan node.
 *		ExecEndSubqueryScan			releases any storage allocated.
 *		ExecReScanSubqueryScan		rescans the relation.
 *
 */
#include "postgres.h"

#include "executor/execdebug.h"
#include "executor/nodeSubqueryscan.h"

static TupleTableSlot *SubqueryNext(SubqueryScanState *node);

/* ----------------------------------------------------------------
 *						Scan Support
 * ----------------------------------------------------------------
 */
/* ----------------------------------------------------------------
 *		SubqueryNext
 *
 *		This is a workhorse for ExecSubqueryScan
 * ----------------------------------------------------------------
 */
static TupleTableSlot *
SubqueryNext(SubqueryScanState *node)
{
	TupleTableSlot *slot;

	/*
	 * Get the next tuple from the sub-query.
	 */
	slot = ExecProcNode(node->subplan);

	/*
	 * We just return the subplan's result slot, rather than expending extra
	 * cycles for ExecCopySlot(). (Our own ScanTupleSlot is used only for
	 * EvalPlanQual rechecks.)
	 *
	 * We do need to mark the slot contents read-only to prevent interference
	 * between different functions reading the same datum from the slot. It's
	 * a bit hokey to do this to the subplan's slot, but should be safe
	 * enough.
	 */
	if (!TupIsNull(slot))
		slot = ExecMakeSlotContentsReadOnly(slot);

	return slot;
}

/*
 * SubqueryRecheck -- access method routine to recheck a tuple in EvalPlanQual
 */
static bool
SubqueryRecheck(SubqueryScanState *node, TupleTableSlot *slot)
{
	/* nothing to check */
	return true;
}

/* ----------------------------------------------------------------
 *		ExecSubqueryScan(node)
 *
 *		Scans the subquery sequentially and returns the next qualifying
 *		tuple.
 *		We call the ExecScan() routine and pass it the appropriate
 *		access method functions.
 * ----------------------------------------------------------------
 */
TupleTableSlot *
ExecSubqueryScan(SubqueryScanState *node)
{
	return ExecScan(&node->ss,
					(ExecScanAccessMtd) SubqueryNext,
					(ExecScanRecheckMtd) SubqueryRecheck);
}

/* ----------------------------------------------------------------
 *		ExecInitSubqueryScan
 * ----------------------------------------------------------------
 */
SubqueryScanState *
ExecInitSubqueryScan(SubqueryScan *node, EState *estate, int eflags)
{
	SubqueryScanState *subquerystate;

	/* check for unsupported flags */
	Assert(!(eflags & EXEC_FLAG_MARK));

	/* SubqueryScan should not have any "normal" children */
	Assert(outerPlan(node) == NULL);
	Assert(innerPlan(node) == NULL);

	/*
	 * create state structure
	 */
	subquerystate = makeNode(SubqueryScanState);
	subquerystate->ss.ps.plan = (Plan *) node;
	subquerystate->ss.ps.state = estate;

	/*
	 * Miscellaneous initialization
	 *
	 * create expression context for node
	 */
	ExecAssignExprContext(estate, &subquerystate->ss.ps);

	/*
	 * initialize child expressions
	 */
	subquerystate->ss.ps.targetlist = (List *)
		ExecInitExpr((Expr *) node->scan.plan.targetlist,
					 (PlanState *) subquerystate);
	subquerystate->ss.ps.qual = (List *)
		ExecInitExpr((Expr *) node->scan.plan.qual,
					 (PlanState *) subquerystate);

	/*
	 * tuple table initialization
	 */
	ExecInitResultTupleSlot(estate, &subquerystate->ss.ps);
	ExecInitScanTupleSlot(estate, &subquerystate->ss);

	/*
	 * initialize subquery
	 */
	subquerystate->subplan = ExecInitNode(node->subplan, estate, eflags);

	subquerystate->ss.ps.ps_TupFromTlist = false;

	/*
	 * Initialize scan tuple type (needed by ExecAssignScanProjectionInfo)
	 */
	ExecAssignScanType(&subquerystate->ss,
					   ExecGetResultType(subquerystate->subplan));

	/*
	 * Initialize result tuple type and projection info.
	 */
	ExecAssignResultTypeFromTL(&subquerystate->ss.ps);
	ExecAssignScanProjectionInfo(&subquerystate->ss);

	return subquerystate;
}

/* ----------------------------------------------------------------
 *		ExecEndSubqueryScan
 *
 *		frees any storage allocated through C routines.
 * ----------------------------------------------------------------
 */
void
ExecEndSubqueryScan(SubqueryScanState *node)
{
	/*
	 * Free the exprcontext
	 */
	ExecFreeExprContext(&node->ss.ps);

	/*
	 * clean out the upper tuple table
	 */
	ExecClearTuple(node->ss.ps.ps_ResultTupleSlot);
	ExecClearTuple(node->ss.ss_ScanTupleSlot);

	/*
	 * close down subquery
	 */
	ExecEndNode(node->subplan);
}

/* ----------------------------------------------------------------
 *		ExecReScanSubqueryScan
 *
 *		Rescans the relation.
 * ----------------------------------------------------------------
 */
void
ExecReScanSubqueryScan(SubqueryScanState *node)
{
	ExecScanReScan(&node->ss);

	/*
	 * ExecReScan doesn't know about my subplan, so I have to do
	 * changed-parameter signaling myself. This is just as well, because the
	 * subplan has its own memory context in which its chgParam state lives.
	 */
	if (node->ss.ps.chgParam != NULL)
		UpdateChangedParamSet(node->subplan, node->ss.ps.chgParam);

	/*
	 * if chgParam of subnode is not null then plan will be re-scanned by
	 * first ExecProcNode.
	 */
	if (node->subplan->chgParam == NULL)
		ExecReScan(node->subplan);
}