postgresql/src/include/executor/execPartition.h

213 lines
8.8 KiB
C
Raw Normal View History

/*--------------------------------------------------------------------
* execPartition.h
* POSTGRES partitioning executor interface
*
* Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
* IDENTIFICATION
* src/include/executor/execPartition.h
*--------------------------------------------------------------------
*/
#ifndef EXECPARTITION_H
#define EXECPARTITION_H
#include "nodes/execnodes.h"
#include "nodes/parsenodes.h"
#include "nodes/plannodes.h"
Support partition pruning at execution time Existing partition pruning is only able to work at plan time, for query quals that appear in the parsed query. This is good but limiting, as there can be parameters that appear later that can be usefully used to further prune partitions. This commit adds support for pruning subnodes of Append which cannot possibly contain any matching tuples, during execution, by evaluating Params to determine the minimum set of subnodes that can possibly match. We support more than just simple Params in WHERE clauses. Support additionally includes: 1. Parameterized Nested Loop Joins: The parameter from the outer side of the join can be used to determine the minimum set of inner side partitions to scan. 2. Initplans: Once an initplan has been executed we can then determine which partitions match the value from the initplan. Partition pruning is performed in two ways. When Params external to the plan are found to match the partition key we attempt to prune away unneeded Append subplans during the initialization of the executor. This allows us to bypass the initialization of non-matching subplans meaning they won't appear in the EXPLAIN or EXPLAIN ANALYZE output. For parameters whose value is only known during the actual execution then the pruning of these subplans must wait. Subplans which are eliminated during this stage of pruning are still visible in the EXPLAIN output. In order to determine if pruning has actually taken place, the EXPLAIN ANALYZE must be viewed. If a certain Append subplan was never executed due to the elimination of the partition then the execution timing area will state "(never executed)". Whereas, if, for example in the case of parameterized nested loops, the number of loops stated in the EXPLAIN ANALYZE output for certain subplans may appear lower than others due to the subplan having been scanned fewer times. This is due to the list of matching subnodes having to be evaluated whenever a parameter which was found to match the partition key changes. This commit required some additional infrastructure that permits the building of a data structure which is able to perform the translation of the matching partition IDs, as returned by get_matching_partitions, into the list index of a subpaths list, as exist in node types such as Append, MergeAppend and ModifyTable. This allows us to translate a list of clauses into a Bitmapset of all the subpath indexes which must be included to satisfy the clause list. Author: David Rowley, based on an earlier effort by Beena Emerson Reviewers: Amit Langote, Robert Haas, Amul Sul, Rajkumar Raghuwanshi, Jesper Pedersen Discussion: https://postgr.es/m/CAOG9ApE16ac-_VVZVvv0gePSgkg_BwYEV1NBqZFqDR2bBE0X0A@mail.gmail.com
2018-04-07 22:54:31 +02:00
#include "partitioning/partprune.h"
/*-----------------------
* PartitionDispatch - information about one partitioned table in a partition
* hierarchy required to route a tuple to one of its partitions
*
* reldesc Relation descriptor of the table
* key Partition key information of the table
* keystate Execution state required for expressions in the partition key
* partdesc Partition descriptor of the table
* tupslot A standalone TupleTableSlot initialized with this table's tuple
* descriptor
* tupmap TupleConversionMap to convert from the parent's rowtype to
* this table's rowtype (when extracting the partition key of a
* tuple just before routing it through this table)
* indexes Array with partdesc->nparts members (for details on what
* individual members represent, see how they are set in
* get_partition_dispatch_recurse())
*-----------------------
*/
typedef struct PartitionDispatchData
{
Relation reldesc;
PartitionKey key;
List *keystate; /* list of ExprState */
PartitionDesc partdesc;
TupleTableSlot *tupslot;
TupleConversionMap *tupmap;
int *indexes;
} PartitionDispatchData;
typedef struct PartitionDispatchData *PartitionDispatch;
/*-----------------------
* PartitionTupleRouting - Encapsulates all information required to execute
* tuple-routing between partitions.
*
* partition_dispatch_info Array of PartitionDispatch objects with one
* entry for every partitioned table in the
* partition tree.
* num_dispatch number of partitioned tables in the partition
* tree (= length of partition_dispatch_info[])
* partition_oids Array of leaf partitions OIDs with one entry
* for every leaf partition in the partition tree,
* initialized in full by
* ExecSetupPartitionTupleRouting.
* partitions Array of ResultRelInfo* objects with one entry
* for every leaf partition in the partition tree,
* initialized lazily by ExecInitPartitionInfo.
* num_partitions Number of leaf partitions in the partition tree
* (= 'partitions_oid'/'partitions' array length)
* parent_child_tupconv_maps Array of TupleConversionMap objects with one
* entry for every leaf partition (required to
* convert tuple from the root table's rowtype to
* a leaf partition's rowtype after tuple routing
* is done)
* child_parent_tupconv_maps Array of TupleConversionMap objects with one
* entry for every leaf partition (required to
* convert an updated tuple from the leaf
* partition's rowtype to the root table's rowtype
* so that tuple routing can be done)
* child_parent_map_not_required Array of bool. True value means that a map is
* determined to be not required for the given
* partition. False means either we haven't yet
* checked if a map is required, or it was
* determined to be required.
* subplan_partition_offsets Integer array ordered by UPDATE subplans. Each
* element of this array has the index into the
* corresponding partition in partitions array.
* num_subplan_partition_offsets Length of 'subplan_partition_offsets' array
* partition_tuple_slot TupleTableSlot to be used to manipulate any
* given leaf partition's rowtype after that
* partition is chosen for insertion by
* tuple-routing.
*-----------------------
*/
typedef struct PartitionTupleRouting
{
PartitionDispatch *partition_dispatch_info;
int num_dispatch;
Oid *partition_oids;
ResultRelInfo **partitions;
int num_partitions;
TupleConversionMap **parent_child_tupconv_maps;
TupleConversionMap **child_parent_tupconv_maps;
bool *child_parent_map_not_required;
int *subplan_partition_offsets;
int num_subplan_partition_offsets;
TupleTableSlot *partition_tuple_slot;
TupleTableSlot *root_tuple_slot;
} PartitionTupleRouting;
Support partition pruning at execution time Existing partition pruning is only able to work at plan time, for query quals that appear in the parsed query. This is good but limiting, as there can be parameters that appear later that can be usefully used to further prune partitions. This commit adds support for pruning subnodes of Append which cannot possibly contain any matching tuples, during execution, by evaluating Params to determine the minimum set of subnodes that can possibly match. We support more than just simple Params in WHERE clauses. Support additionally includes: 1. Parameterized Nested Loop Joins: The parameter from the outer side of the join can be used to determine the minimum set of inner side partitions to scan. 2. Initplans: Once an initplan has been executed we can then determine which partitions match the value from the initplan. Partition pruning is performed in two ways. When Params external to the plan are found to match the partition key we attempt to prune away unneeded Append subplans during the initialization of the executor. This allows us to bypass the initialization of non-matching subplans meaning they won't appear in the EXPLAIN or EXPLAIN ANALYZE output. For parameters whose value is only known during the actual execution then the pruning of these subplans must wait. Subplans which are eliminated during this stage of pruning are still visible in the EXPLAIN output. In order to determine if pruning has actually taken place, the EXPLAIN ANALYZE must be viewed. If a certain Append subplan was never executed due to the elimination of the partition then the execution timing area will state "(never executed)". Whereas, if, for example in the case of parameterized nested loops, the number of loops stated in the EXPLAIN ANALYZE output for certain subplans may appear lower than others due to the subplan having been scanned fewer times. This is due to the list of matching subnodes having to be evaluated whenever a parameter which was found to match the partition key changes. This commit required some additional infrastructure that permits the building of a data structure which is able to perform the translation of the matching partition IDs, as returned by get_matching_partitions, into the list index of a subpaths list, as exist in node types such as Append, MergeAppend and ModifyTable. This allows us to translate a list of clauses into a Bitmapset of all the subpath indexes which must be included to satisfy the clause list. Author: David Rowley, based on an earlier effort by Beena Emerson Reviewers: Amit Langote, Robert Haas, Amul Sul, Rajkumar Raghuwanshi, Jesper Pedersen Discussion: https://postgr.es/m/CAOG9ApE16ac-_VVZVvv0gePSgkg_BwYEV1NBqZFqDR2bBE0X0A@mail.gmail.com
2018-04-07 22:54:31 +02:00
/*-----------------------
* PartitionPruningData - Encapsulates all information required to support
* elimination of partitions in node types which support arbitrary Lists of
* subplans. Information stored here allows the planner's partition pruning
* functions to be called and the return value of partition indexes translated
* into the subpath indexes of node types such as Append, thus allowing us to
* bypass certain subnodes when we have proofs that indicate that no tuple
* matching the 'pruning_steps' will be found within.
*
* subnode_map An array containing the subnode index which
* matches this partition index, or -1 if the
* subnode has been pruned already.
* subpart_map An array containing the offset into the
* 'partprunedata' array in PartitionPruning, or
* -1 if there is no such element in that array.
* present_parts A Bitmapset of the partition index that we have
* subnodes mapped for.
* context Contains the context details required to call
* the partition pruning code.
* pruning_steps Contains a list of PartitionPruneStep used to
* perform the actual pruning.
* extparams Contains paramids of external params found
* matching partition keys in 'pruning_steps'.
* allparams As 'extparams' but also including exec params.
*-----------------------
*/
typedef struct PartitionPruningData
{
int *subnode_map;
int *subpart_map;
Bitmapset *present_parts;
PartitionPruneContext context;
List *pruning_steps;
Bitmapset *extparams;
Bitmapset *allparams;
} PartitionPruningData;
/*-----------------------
* PartitionPruneState - State object required for executor nodes to perform
* partition pruning elimination of their subnodes. This encapsulates a
* flattened hierarchy of PartitionPruningData structs and also stores all
* paramids which were found to match the partition keys of each partition.
* This struct can be attached to node types which support arbitrary Lists of
* subnodes containing partitions to allow subnodes to be eliminated due to
* the clauses being unable to match to any tuple that the subnode could
* possibly produce.
*
* partprunedata Array of PartitionPruningData for the node's target
* partitioned relation. First element contains the
* details for the target partitioned table.
* num_partprunedata Number of items in 'partprunedata' array.
* prune_context A memory context which can be used to call the query
* planner's partition prune functions.
* extparams All PARAM_EXTERN paramids which were found to match a
* partition key in each of the contained
* PartitionPruningData structs.
* execparams As above but for PARAM_EXEC.
* allparams Union of 'extparams' and 'execparams', saved to avoid
* recalculation.
*-----------------------
*/
typedef struct PartitionPruneState
{
PartitionPruningData *partprunedata;
int num_partprunedata;
MemoryContext prune_context;
Bitmapset *extparams;
Bitmapset *execparams;
Bitmapset *allparams;
} PartitionPruneState;
extern PartitionTupleRouting *ExecSetupPartitionTupleRouting(ModifyTableState *mtstate,
Relation rel);
extern int ExecFindPartition(ResultRelInfo *resultRelInfo,
PartitionDispatch *pd,
TupleTableSlot *slot,
EState *estate);
extern ResultRelInfo *ExecInitPartitionInfo(ModifyTableState *mtstate,
ResultRelInfo *resultRelInfo,
PartitionTupleRouting *proute,
EState *estate, int partidx);
extern void ExecInitRoutingInfo(ModifyTableState *mtstate,
EState *estate,
PartitionTupleRouting *proute,
ResultRelInfo *partRelInfo,
int partidx);
extern void ExecSetupChildParentMapForLeaf(PartitionTupleRouting *proute);
extern TupleConversionMap *TupConvMapForLeaf(PartitionTupleRouting *proute,
ResultRelInfo *rootRelInfo, int leaf_index);
extern HeapTuple ConvertPartitionTupleSlot(TupleConversionMap *map,
HeapTuple tuple,
TupleTableSlot *new_slot,
TupleTableSlot **p_my_slot);
extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
PartitionTupleRouting *proute);
Support partition pruning at execution time Existing partition pruning is only able to work at plan time, for query quals that appear in the parsed query. This is good but limiting, as there can be parameters that appear later that can be usefully used to further prune partitions. This commit adds support for pruning subnodes of Append which cannot possibly contain any matching tuples, during execution, by evaluating Params to determine the minimum set of subnodes that can possibly match. We support more than just simple Params in WHERE clauses. Support additionally includes: 1. Parameterized Nested Loop Joins: The parameter from the outer side of the join can be used to determine the minimum set of inner side partitions to scan. 2. Initplans: Once an initplan has been executed we can then determine which partitions match the value from the initplan. Partition pruning is performed in two ways. When Params external to the plan are found to match the partition key we attempt to prune away unneeded Append subplans during the initialization of the executor. This allows us to bypass the initialization of non-matching subplans meaning they won't appear in the EXPLAIN or EXPLAIN ANALYZE output. For parameters whose value is only known during the actual execution then the pruning of these subplans must wait. Subplans which are eliminated during this stage of pruning are still visible in the EXPLAIN output. In order to determine if pruning has actually taken place, the EXPLAIN ANALYZE must be viewed. If a certain Append subplan was never executed due to the elimination of the partition then the execution timing area will state "(never executed)". Whereas, if, for example in the case of parameterized nested loops, the number of loops stated in the EXPLAIN ANALYZE output for certain subplans may appear lower than others due to the subplan having been scanned fewer times. This is due to the list of matching subnodes having to be evaluated whenever a parameter which was found to match the partition key changes. This commit required some additional infrastructure that permits the building of a data structure which is able to perform the translation of the matching partition IDs, as returned by get_matching_partitions, into the list index of a subpaths list, as exist in node types such as Append, MergeAppend and ModifyTable. This allows us to translate a list of clauses into a Bitmapset of all the subpath indexes which must be included to satisfy the clause list. Author: David Rowley, based on an earlier effort by Beena Emerson Reviewers: Amit Langote, Robert Haas, Amul Sul, Rajkumar Raghuwanshi, Jesper Pedersen Discussion: https://postgr.es/m/CAOG9ApE16ac-_VVZVvv0gePSgkg_BwYEV1NBqZFqDR2bBE0X0A@mail.gmail.com
2018-04-07 22:54:31 +02:00
extern PartitionPruneState *ExecSetupPartitionPruneState(PlanState *planstate,
List *partitionpruneinfo);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate);
extern Bitmapset *ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate,
int nsubnodes);
#endif /* EXECPARTITION_H */