1996-07-09 08:22:35 +02:00
/*-------------------------------------------------------------------------
 *
1999-02-14 00:22:53 +01:00
 * nodeHashjoin.h
2005-03-06 23:15:05 +01:00
 *	  prototypes for nodeHashjoin.c
1996-07-09 08:22:35 +02:00
 *
 *
2018-01-03 05:30:12 +01:00
 * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
2000-01-26 06:58:53 +01:00
 * Portions Copyright (c) 1994, Regents of the University of California
1996-07-09 08:22:35 +02:00
 *
2010-09-20 22:08:53 +02:00
 * src/include/executor/nodeHashjoin.h
1996-07-09 08:22:35 +02:00
 *
 *-------------------------------------------------------------------------
 */
1997-09-07 07:04:48 +02:00
#ifndef NODEHASHJOIN_H
#define NODEHASHJOIN_H
1996-07-09 08:22:35 +02:00

2017-12-26 16:21:27 +01:00
#include "access/parallel.h"
2002-12-05 16:50:39 +01:00
#include "nodes/execnodes.h"
2005-03-06 23:15:05 +01:00
#include "storage/buffile.h"
1996-07-09 08:22:35 +02:00

2006-02-28 05:10:28 +01:00
extern HashJoinState *ExecInitHashJoin(HashJoin *node, EState *estate, int eflags);
2002-12-05 16:50:39 +01:00
extern void ExecEndHashJoin(HashJoinState *node);
2010-07-12 19:01:06 +02:00
extern void ExecReScanHashJoin(HashJoinState *node);
Add parallel-aware hash joins.

Introduce parallel-aware hash joins that appear in EXPLAIN plans as Parallel
Hash Join with Parallel Hash. While hash joins could already appear in
parallel queries, they were previously always parallel-oblivious and had a
partial subplan only on the outer side, meaning that the work of the inner
subplan was duplicated in every worker.

After this commit, the planner will consider using a partial subplan on the
inner side too, using the Parallel Hash node to divide the work over the
available CPU cores and combine its results in shared memory. If the join
needs to be split into multiple batches in order to respect work_mem, then
workers process different batches as much as possible and then work together
on the remaining batches.

The advantages of a parallel-aware hash join over a parallel-oblivious hash
join used in a parallel query are that it:

 * avoids wasting memory on duplicated hash tables
 * avoids wasting disk space on duplicated batch files
 * divides the work of building the hash table over the CPUs

One disadvantage is that there is some communication between the participating
CPUs which might outweigh the benefits of parallelism in the case of small
hash tables. This is avoided by the planner's existing reluctance to supply
partial plans for small scans, but it may be necessary to estimate
synchronization costs in future if that situation changes. Another is that
outer batch 0 must be written to disk if multiple batches are required.

A potential future advantage of parallel-aware hash joins is that right and
full outer joins could be supported, since there is a single set of matched
bits for each hashtable, but that is not yet implemented.

A new GUC enable_parallel_hash is defined to control the feature, defaulting
to on.

Author: Thomas Munro
Reviewed-By: Andres Freund, Robert Haas
Tested-By: Rafia Sabih, Prabhat Sahu
Discussion:
    https://postgr.es/m/CAEepm=2W=cOkiZxcg6qiFQP-dHUe09aqTrEMM7yJDrHMhDv_RA@mail.gmail.com
    https://postgr.es/m/CAEepm=37HKyJ4U6XOLi=JgfSHM3o6B-GaeO-6hkOmneTDkH+Uw@mail.gmail.com
2017-12-21 08:39:21 +01:00
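The commit message above mentions splitting a join into multiple batches to respect work_mem. The following is a minimal, self-contained sketch of how one hash value could select both an in-memory bucket and a batch, assuming power-of-two bucket and batch counts. The names `ToySlot`, `toy_bucket_and_batch`, and `toy_demo` are hypothetical and are not declared in nodeHashjoin.h; the real executor code may differ in detail.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical result type for this sketch; not a PostgreSQL struct. */
typedef struct ToySlot
{
    uint32_t bucketno;      /* bucket within the in-memory hash table */
    uint32_t batchno;       /* batch the tuple belongs to */
} ToySlot;

/*
 * Split a 32-bit hash value into a bucket number (low bits) and a batch
 * number (the bits just above them). Assumes nbatch is a power of two and
 * log2_nbuckets is small enough that the shifts stay within 32 bits.
 */
static ToySlot
toy_bucket_and_batch(uint32_t hashvalue, uint32_t log2_nbuckets, uint32_t nbatch)
{
    ToySlot  slot;
    uint32_t nbuckets;

    assert(log2_nbuckets < 32);
    assert(nbatch > 0 && (nbatch & (nbatch - 1)) == 0);

    nbuckets = (uint32_t) 1 << log2_nbuckets;
    slot.bucketno = hashvalue & (nbuckets - 1);
    slot.batchno = (nbatch > 1)
        ? (hashvalue >> log2_nbuckets) & (nbatch - 1)
        : 0;
    return slot;
}

/* Print where a few sample hash values land with 16 buckets and 4 batches. */
static void
toy_demo(void)
{
    uint32_t hashes[] = {0x05, 0x15, 0x25, 0x35};
    size_t   i;

    for (i = 0; i < sizeof(hashes) / sizeof(hashes[0]); i++)
    {
        ToySlot slot = toy_bucket_and_batch(hashes[i], 4, 4);

        printf("hash 0x%02x -> bucket %u, batch %u\n",
               (unsigned) hashes[i], (unsigned) slot.bucketno,
               (unsigned) slot.batchno);
    }
}
```

In this sketch, tuples whose batch number is not the one currently being processed would be the ones spilled to a batch file, which is the role a function like ExecHashJoinSaveTuple (taking a tuple, its hash value, and a BufFile pointer) appears to play in this header.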
extern void ExecShutdownHashJoin(HashJoinState *node);
extern void ExecHashJoinEstimate(HashJoinState *state, ParallelContext *pcxt);
extern void ExecHashJoinInitializeDSM(HashJoinState *state, ParallelContext *pcxt);
extern void ExecHashJoinReInitializeDSM(HashJoinState *state, ParallelContext *pcxt);
extern void ExecHashJoinInitializeWorker(HashJoinState *state,
							 ParallelWorkerContext *pwcxt);
2002-12-05 16:50:39 +01:00

2007-06-07 21:19:57 +02:00
extern void ExecHashJoinSaveTuple(MinimalTuple tuple, uint32 hashvalue,
					  BufFile **fileptr);
2001-10-28 07:26:15 +01:00

Phase 2 of pgindent updates.

Change pg_bsd_indent to follow upstream rules for placement of comments
to the right of code, and remove pgindent hack that caused comments
following #endif to not obey the general rule.

Commit e3860ffa4dd0dad0dd9eea4be9cc1412373a8c89 wasn't actually using
the published version of pg_bsd_indent, but a hacked-up version that
tried to minimize the amount of movement of comments to the right of
code. The situation of interest is where such a comment has to be
moved to the right of its default placement at column 33 because there's
code there. BSD indent has always moved right in units of tab stops
in such cases --- but in the previous incarnation, indent was working
in 8-space tab stops, while now it knows we use 4-space tabs. So the
net result is that in about half the cases, such comments are placed
one tab stop left of before. This is better all around: it leaves
more room on the line for comment text, and it means that in such
cases the comment uniformly starts at the next 4-space tab stop after
the code, rather than sometimes one and sometimes two tabs after.

Also, ensure that comments following #endif are indented the same
as comments following other preprocessor commands such as #else.
That inconsistency turns out to have been self-inflicted damage
from a poorly-thought-through post-indent "fixup" in pgindent.

This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.

Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 21:18:54 +02:00
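The comment-placement rule described in the commit message above can be illustrated with a small calculation. This is only a sketch of the arithmetic, not pg_bsd_indent's actual logic: `toy_comment_column` is a hypothetical helper, columns are assumed to be 1-based, and the simplification is made that a comment may start in the column immediately after the last code character.

```c
#include <assert.h>

/* Default column for a comment to the right of code, per the message above. */
#define TOY_DEFAULT_COMMENT_COL 33

/*
 * Return the 1-based column where a trailing comment would start.
 * code_end_col is the column of the last code character on the line;
 * tab_width is the tab stop spacing (8 before the change, 4 after).
 * The comment sits at the default column unless code already occupies it,
 * in which case it moves right one tab stop at a time.
 */
static int
toy_comment_column(int code_end_col, int tab_width)
{
    int col = TOY_DEFAULT_COMMENT_COL;

    assert(tab_width > 0);

    while (col <= code_end_col)
        col += tab_width;
    return col;
}
```

Under these assumptions, code ending at column 34 pushes the comment to column 41 with 8-space tab stops but only to column 37 with 4-space tab stops, while code ending at column 38 yields column 41 in both cases. That matches the message's observation that roughly half of the affected comments move one 4-space tab stop to the left.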
#endif /* NODEHASHJOIN_H */