/*-------------------------------------------------------------------------
 *
 * logtape.h
 *	  Management of "logical tapes" within temporary files.
 *
 * See logtape.c for explanations.
 *
 * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
 *
 * src/include/utils/logtape.h
 *
 *-------------------------------------------------------------------------
 */

#ifndef LOGTAPE_H
#define LOGTAPE_H

#include "storage/sharedfileset.h"

/* LogicalTapeSet is an opaque type whose details are not known outside logtape.c. */

typedef struct LogicalTapeSet LogicalTapeSet;

/*
 * The approach tuplesort.c takes to parallel external sorts is that workers,
 * whose state is almost the same as independent serial sorts, are made to
 * produce a final materialized tape of sorted output in all cases. This is
 * frozen, just like any case requiring a final materialized tape. However,
 * there is one difference, which is that freezing will also export an
 * underlying shared fileset BufFile for sharing. Freezing produces TapeShare
 * metadata for the worker when this happens, which is passed along through
 * shared memory to the leader.
 *
 * The leader process can then pass an array of TapeShare metadata (one per
 * worker participant) to LogicalTapeSetCreate(), alongside a handle to a
 * shared fileset, which is sufficient to construct a new logical tapeset that
 * consists of each of the tapes materialized by workers.
 *
 * Note that while logtape.c does create an empty leader tape at the end of the
 * tapeset in the leader case, it can never be written to due to a restriction
 * in the shared buffile infrastructure.
 */
typedef struct TapeShare
{
	/*
	 * Currently, all the leader process needs is the location of the
	 * materialized tape's first block.
	 */
	long		firstblocknumber;
} TapeShare;
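
/*
 * Illustration only, not part of this header: a minimal, self-contained
 * sketch of the hand-off described above, in which each worker records where
 * its frozen tape starts and the leader gathers one entry per worker into an
 * array.  ToyTapeShare, toy_worker_freeze() and toy_leader_collect() are
 * hypothetical names invented for this sketch, and the block numbering is
 * made up; the real logic lives in logtape.c and tuplesort.c and passes the
 * metadata through shared memory rather than a local array.
 */

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in mirroring the single field of TapeShare. */
typedef struct ToyTapeShare
{
	long		firstblocknumber;
} ToyTapeShare;

/*
 * Hypothetical worker step: pretend the worker has frozen its final tape and
 * report the tape's first block.  The "worker * 100" rule is invented purely
 * so that each worker reports a distinct value.
 */
static void
toy_worker_freeze(int worker, ToyTapeShare *share)
{
	assert(share != NULL);
	share->firstblocknumber = (long) worker * 100;
}

/*
 * Hypothetical leader step: build an array with one entry per worker.  An
 * array of this shape is what the leader would pass as the "shared" argument
 * of LogicalTapeSetCreate().  Returns NULL on allocation failure; the caller
 * frees the result.
 */
static ToyTapeShare *
toy_leader_collect(int nworkers)
{
	ToyTapeShare *shares;
	int			i;

	assert(nworkers > 0);
	shares = calloc((size_t) nworkers, sizeof(ToyTapeShare));
	if (shares == NULL)
		return NULL;

	for (i = 0; i < nworkers; i++)
		toy_worker_freeze(i, &shares[i]);

	return shares;
}
```

/*
 * In this toy model, toy_leader_collect(3) yields three entries, one per
 * worker, indexed by worker number.
 */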

/*
 * prototypes for functions in logtape.c
 */

extern LogicalTapeSet *LogicalTapeSetCreate(int ntapes, TapeShare *shared,
											SharedFileSet *fileset, int worker);
extern void LogicalTapeSetClose(LogicalTapeSet *lts);
extern void LogicalTapeSetForgetFreeSpace(LogicalTapeSet *lts);
extern size_t LogicalTapeRead(LogicalTapeSet *lts, int tapenum,
							  void *ptr, size_t size);
extern void LogicalTapeWrite(LogicalTapeSet *lts, int tapenum,
							 void *ptr, size_t size);
extern void LogicalTapeRewindForRead(LogicalTapeSet *lts, int tapenum,
									 size_t buffer_size);
extern void LogicalTapeRewindForWrite(LogicalTapeSet *lts, int tapenum);
extern void LogicalTapeFreeze(LogicalTapeSet *lts, int tapenum,
							  TapeShare *share);
extern size_t LogicalTapeBackspace(LogicalTapeSet *lts, int tapenum,
								   size_t size);
extern void LogicalTapeSeek(LogicalTapeSet *lts, int tapenum,
							long blocknum, int offset);
extern void LogicalTapeTell(LogicalTapeSet *lts, int tapenum,
							long *blocknum, int *offset);
extern long LogicalTapeSetBlocks(LogicalTapeSet *lts);
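
/*
 * Illustration only, not part of this header: a toy in-memory tape that
 * mimics the write, rewind-for-read, read ordering suggested by the
 * prototypes above.  ToyTape and the toy_tape_* functions are hypothetical
 * names invented for this sketch; they do not call logtape.c and ignore
 * block management, buffering and temporary files entirely.  The short-read
 * return value is an assumption modelled on the size_t result of
 * LogicalTapeRead().
 */

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_TAPE_CAPACITY 256

/* Hypothetical single tape held entirely in memory. */
typedef struct ToyTape
{
	char		data[TOY_TAPE_CAPACITY];
	size_t		nbytes;			/* bytes written so far */
	size_t		pos;			/* current read position */
	int			writing;		/* nonzero while in write mode */
} ToyTape;

/* Start with an empty tape in write mode. */
static void
toy_tape_init(ToyTape *tape)
{
	memset(tape, 0, sizeof(*tape));
	tape->writing = 1;
}

/* Append bytes; only legal while the tape is in write mode. */
static void
toy_tape_write(ToyTape *tape, const void *ptr, size_t size)
{
	assert(tape->writing);
	assert(size <= TOY_TAPE_CAPACITY - tape->nbytes);
	memcpy(tape->data + tape->nbytes, ptr, size);
	tape->nbytes += size;
}

/* Switch to read mode and reposition at the start of the tape. */
static void
toy_tape_rewind_for_read(ToyTape *tape)
{
	tape->writing = 0;
	tape->pos = 0;
}

/*
 * Read up to "size" bytes; returns the number of bytes actually copied,
 * which is smaller than requested once the end of the tape is reached.
 */
static size_t
toy_tape_read(ToyTape *tape, void *ptr, size_t size)
{
	size_t		avail;

	assert(!tape->writing);
	avail = tape->nbytes - tape->pos;
	if (size > avail)
		size = avail;
	memcpy(ptr, tape->data + tape->pos, size);
	tape->pos += size;
	return size;
}
```

/*
 * A caller of the toy would initialize the tape, write all of its data,
 * rewind once, and then read until toy_tape_read() returns fewer bytes than
 * requested.
 */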

#endif							/* LOGTAPE_H */