postgresql/src/include/executor/nodeHashjoin.h

/*-------------------------------------------------------------------------
 *
 * nodeHashjoin.h
 *	  prototypes for nodeHashjoin.c
 *
 *
 * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
 *
 * src/include/executor/nodeHashjoin.h
 *
 *-------------------------------------------------------------------------
 */
#ifndef NODEHASHJOIN_H
#define NODEHASHJOIN_H

#include "access/parallel.h"
#include "nodes/execnodes.h"
#include "storage/buffile.h"

/* Executor node entry points: init, end, rescan, shutdown */
extern HashJoinState *ExecInitHashJoin(HashJoin *node, EState *estate, int eflags);
extern void ExecEndHashJoin(HashJoinState *node);
extern void ExecReScanHashJoin(HashJoinState *node);
extern void ExecShutdownHashJoin(HashJoinState *node);

/* Parallel query support: DSM size estimate, setup, rescan reset, worker attach */
extern void ExecHashJoinEstimate(HashJoinState *state, ParallelContext *pcxt);
extern void ExecHashJoinInitializeDSM(HashJoinState *state, ParallelContext *pcxt);
extern void ExecHashJoinReInitializeDSM(HashJoinState *state, ParallelContext *pcxt);
extern void ExecHashJoinInitializeWorker(HashJoinState *state,
							 ParallelWorkerContext *pwcxt);

/* Write a tuple, preceded by its hash value, to a batch temp file */
extern void ExecHashJoinSaveTuple(MinimalTuple tuple, uint32 hashvalue,
					  BufFile **fileptr);

#endif							/* NODEHASHJOIN_H */
Postgres95 1.01 Distribution - Virgin Sources 1996-07-09 08:22:35 +02:00			`/*-------------------------------------------------------------------------`
			`*`
Change my-function-name-- to my_function_name, and optimizer renames. 1999-02-14 00:22:53 +01:00			`* nodeHashjoin.h`
Revise hash join code so that we can increase the number of batches on-the-fly, and thereby avoid blowing out memory when the planner has underestimated the hash table size. Hash join will now obey the work_mem limit with some faithfulness. Per my recent proposal (hash aggregate part isn't done yet though). 2005-03-06 23:15:05 +01:00			`* prototypes for nodeHashjoin.c`
Postgres95 1.01 Distribution - Virgin Sources 1996-07-09 08:22:35 +02:00			`*`
			`*`
Update copyright for 2018 Backpatch-through: certain files through 9.3 2018-01-03 05:30:12 +01:00			`* Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group`
Add: * Portions Copyright (c) 1996-2000, PostgreSQL, Inc to all files copyright Regents of Berkeley. Man, that's a lot of files. 2000-01-26 06:58:53 +01:00			`* Portions Copyright (c) 1994, Regents of the University of California`
Postgres95 1.01 Distribution - Virgin Sources 1996-07-09 08:22:35 +02:00			`*`
Remove cvs keywords from all files. 2010-09-20 22:08:53 +02:00			`* src/include/executor/nodeHashjoin.h`
Postgres95 1.01 Distribution - Virgin Sources 1996-07-09 08:22:35 +02:00			`*`
			`*-------------------------------------------------------------------------`
			`*/`
Massive commit to run PGINDENT on all .c and .h files. 1997-09-07 07:04:48 +02:00			`#ifndef NODEHASHJOIN_H`
			`#define NODEHASHJOIN_H`
Postgres95 1.01 Distribution - Virgin Sources 1996-07-09 08:22:35 +02:00
Add includes to make header files self-contained 2017-12-26 16:21:27 +01:00			`#include "access/parallel.h"`
Phase 1 of read-only-plans project: cause executor state nodes to point to plan nodes, not vice-versa. All executor state nodes now inherit from struct PlanState. Copying of plan trees has been simplified by not storing a list of SubPlans in Plan nodes (eliminating duplicate links). The executor still needs such a list, but it can build it during ExecutorStart since it has to scan the plan tree anyway. No initdb forced since no stored-on-disk structures changed, but you will need a full recompile because of node-numbering changes. 2002-12-05 16:50:39 +01:00			`#include "nodes/execnodes.h"`
Revise hash join code so that we can increase the number of batches on-the-fly, and thereby avoid blowing out memory when the planner has underestimated the hash table size. Hash join will now obey the work_mem limit with some faithfulness. Per my recent proposal (hash aggregate part isn't done yet though). 2005-03-06 23:15:05 +01:00			`#include "storage/buffile.h"`
Postgres95 1.01 Distribution - Virgin Sources 1996-07-09 08:22:35 +02:00
Extend the ExecInitNode API so that plan nodes receive a set of flag bits indicating which optional capabilities can actually be exercised at runtime. This will allow Sort and Material nodes, and perhaps later other nodes, to avoid unnecessary overhead in common cases. This commit just adds the infrastructure and arranges to pass the correct flag values down to plan nodes; none of the actual optimizations are here yet. I'm committing this separately in case anyone wants to measure the added overhead. (It should be negligible.) Simon Riggs and Tom Lane 2006-02-28 05:10:28 +01:00			`extern HashJoinState *ExecInitHashJoin(HashJoin *node, EState *estate, int eflags);`
Phase 1 of read-only-plans project: cause executor state nodes to point to plan nodes, not vice-versa. All executor state nodes now inherit from struct PlanState. Copying of plan trees has been simplified by not storing a list of SubPlans in Plan nodes (eliminating duplicate links). The executor still needs such a list, but it can build it during ExecutorStart since it has to scan the plan tree anyway. No initdb forced since no stored-on-disk structures changed, but you will need a full recompile because of node-numbering changes. 2002-12-05 16:50:39 +01:00			`extern void ExecEndHashJoin(HashJoinState *node);`
Make NestLoop plan nodes pass outer-relation variables into their inner relation using the general PARAM_EXEC executor parameter mechanism, rather than the ad-hoc kluge of passing the outer tuple down through ExecReScan. The previous method was hard to understand and could never be extended to handle parameters coming from multiple join levels. This patch doesn't change the set of possible plans nor have any significant performance effect, but it's necessary infrastructure for future generalization of the concept of an inner indexscan plan. ExecReScan's second parameter is now unused, so it's removed. 2010-07-12 19:01:06 +02:00			`extern void ExecReScanHashJoin(HashJoinState *node);`
Add parallel-aware hash joins. Introduce parallel-aware hash joins that appear in EXPLAIN plans as Parallel Hash Join with Parallel Hash. While hash joins could already appear in parallel queries, they were previously always parallel-oblivious and had a partial subplan only on the outer side, meaning that the work of the inner subplan was duplicated in every worker. After this commit, the planner will consider using a partial subplan on the inner side too, using the Parallel Hash node to divide the work over the available CPU cores and combine its results in shared memory. If the join needs to be split into multiple batches in order to respect work_mem, then workers process different batches as much as possible and then work together on the remaining batches. The advantages of a parallel-aware hash join over a parallel-oblivious hash join used in a parallel query are that it: * avoids wasting memory on duplicated hash tables * avoids wasting disk space on duplicated batch files * divides the work of building the hash table over the CPUs One disadvantage is that there is some communication between the participating CPUs which might outweigh the benefits of parallelism in the case of small hash tables. This is avoided by the planner's existing reluctance to supply partial plans for small scans, but it may be necessary to estimate synchronization costs in future if that situation changes. Another is that outer batch 0 must be written to disk if multiple batches are required. A potential future advantage of parallel-aware hash joins is that right and full outer joins could be supported, since there is a single set of matched bits for each hashtable, but that is not yet implemented. A new GUC enable_parallel_hash is defined to control the feature, defaulting to on. 
Author: Thomas Munro Reviewed-By: Andres Freund, Robert Haas Tested-By: Rafia Sabih, Prabhat Sahu Discussion: https://postgr.es/m/CAEepm=2W=cOkiZxcg6qiFQP-dHUe09aqTrEMM7yJDrHMhDv_RA@mail.gmail.com https://postgr.es/m/CAEepm=37HKyJ4U6XOLi=JgfSHM3o6B-GaeO-6hkOmneTDkH+Uw@mail.gmail.com 2017-12-21 08:39:21 +01:00			`extern void ExecShutdownHashJoin(HashJoinState *node);`
			`extern void ExecHashJoinEstimate(HashJoinState *state, ParallelContext *pcxt);`
			`extern void ExecHashJoinInitializeDSM(HashJoinState *state, ParallelContext *pcxt);`
			`extern void ExecHashJoinReInitializeDSM(HashJoinState *state, ParallelContext *pcxt);`
			`extern void ExecHashJoinInitializeWorker(HashJoinState *state,`
			`ParallelWorkerContext *pwcxt);`
Phase 1 of read-only-plans project: cause executor state nodes to point to plan nodes, not vice-versa. All executor state nodes now inherit from struct PlanState. Copying of plan trees has been simplified by not storing a list of SubPlans in Plan nodes (eliminating duplicate links). The executor still needs such a list, but it can build it during ExecutorStart since it has to scan the plan tree anyway. No initdb forced since no stored-on-disk structures changed, but you will need a full recompile because of node-numbering changes. 2002-12-05 16:50:39 +01:00
Rework temp_tablespaces patch so that temp tablespaces are assigned separately for each temp file, rather than once per sort or hashjoin; this allows spreading the data of a large sort or join across multiple tablespaces. (I remain dubious that this will make any difference in practice, but certain people insisted.) Arrange to cache the results of parsing the GUC variable instead of recomputing from scratch on every demand, and push usage of the cache down to the bottommost fd.c level. 2007-06-07 21:19:57 +02:00			`extern void ExecHashJoinSaveTuple(MinimalTuple tuple, uint32 hashvalue,`
			`BufFile **fileptr);`
Another pgindent run. Fixes enum indenting, and improves #endif spacing. Also adds space for one-line comments. 2001-10-28 07:26:15 +01:00
Phase 2 of pgindent updates. Change pg_bsd_indent to follow upstream rules for placement of comments to the right of code, and remove pgindent hack that caused comments following #endif to not obey the general rule. Commit e3860ffa4dd0dad0dd9eea4be9cc1412373a8c89 wasn't actually using the published version of pg_bsd_indent, but a hacked-up version that tried to minimize the amount of movement of comments to the right of code. The situation of interest is where such a comment has to be moved to the right of its default placement at column 33 because there's code there. BSD indent has always moved right in units of tab stops in such cases --- but in the previous incarnation, indent was working in 8-space tab stops, while now it knows we use 4-space tabs. So the net result is that in about half the cases, such comments are placed one tab stop left of before. This is better all around: it leaves more room on the line for comment text, and it means that in such cases the comment uniformly starts at the next 4-space tab stop after the code, rather than sometimes one and sometimes two tabs after. Also, ensure that comments following #endif are indented the same as comments following other preprocessor commands such as #else. That inconsistency turns out to have been self-inflicted damage from a poorly-thought-through post-indent "fixup" in pgindent. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us 2017-06-21 21:18:54 +02:00			`#endif /* NODEHASHJOIN_H */`