postgresql/src/backend
David Rowley 56788d2156 Allocate consecutive blocks during parallel seqscans
Previously, a parallel sequential scan allocated blocks to parallel
workers one block at a time.  Since other workers were likely to request
a block before a given worker returned for its next block number, each
worker could end up with a non-sequential I/O pattern, which could cause
the operating system's readahead to perform poorly or not at all.

Here we change things so that we allocate consecutive "chunks" of blocks
to workers and have them work on those until they're done, at which time
we allocate another chunk for the worker.  The size of these chunks is
based on the size of the relation.

The initial patch was by Thomas Munro; it showed good improvements with
just a fixed chunk size of 64 blocks and a simple ramp-down near the end
of the scan.  The revisions that base the chunk size on the relation size
and adjust the ramp-down to powers of two were done by me, along with
quite extensive benchmarking to determine the optimal chunk sizes.
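
As a rough illustration of the scheme described above, the following
single-process sketch hands out consecutive chunks whose size is derived
from the relation size, rounded up to a power of two, and halved as the
scan nears its end.  It is not the code from this commit: every name and
all three tuning constants are hypothetical, and a real parallel scan
would need an atomic counter in shared memory in place of the plain
next_block field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tuning constants, chosen only for illustration. */
#define SKETCH_TARGET_NCHUNKS  2048	/* aim for roughly this many chunks */
#define SKETCH_MAX_CHUNK_SIZE  8192	/* cap on blocks per chunk */
#define SKETCH_RAMPDOWN_CHUNKS 64	/* start shrinking this close to the end */

typedef struct ChunkAllocator
{
	uint64_t	nblocks;		/* total blocks in the relation */
	uint64_t	next_block;		/* first block not yet handed out */
	uint32_t	chunk_size;		/* current chunk size, always a power of two */
} ChunkAllocator;

/* Smallest power of two >= v, capped at SKETCH_MAX_CHUNK_SIZE. */
static uint32_t
sketch_next_pow2(uint64_t v)
{
	uint32_t	p = 1;

	while (p < v && p < SKETCH_MAX_CHUNK_SIZE)
		p <<= 1;
	return p;
}

/* Pick the initial chunk size from the relation size. */
static void
sketch_init(ChunkAllocator *a, uint64_t nblocks)
{
	uint64_t	target = nblocks / SKETCH_TARGET_NCHUNKS;

	assert(a != NULL);
	if (target < 1)
		target = 1;
	a->nblocks = nblocks;
	a->next_block = 0;
	a->chunk_size = sketch_next_pow2(target);
}

/*
 * Hand out the next run of consecutive blocks.  Stores the first block
 * number in *start and returns the number of blocks in the chunk, or 0
 * once the whole relation has been allocated.
 */
static uint32_t
sketch_next_chunk(ChunkAllocator *a, uint64_t *start)
{
	uint64_t	remaining;
	uint32_t	n;

	assert(a != NULL && start != NULL);
	if (a->next_block >= a->nblocks)
		return 0;
	remaining = a->nblocks - a->next_block;

	/* Ramp down in powers of two when few chunks of this size remain. */
	while (a->chunk_size > 1 &&
		   remaining < (uint64_t) a->chunk_size * SKETCH_RAMPDOWN_CHUNKS)
		a->chunk_size >>= 1;

	n = (remaining < a->chunk_size) ? (uint32_t) remaining : a->chunk_size;
	*start = a->next_block;
	a->next_block += n;
	return n;
}
```

A worker in this sketch would call sketch_next_chunk() in a loop and read
blocks start .. start + n - 1 in order before asking for more, so each
worker issues sequential reads within a chunk.  The ramp-down keeps the
final chunks small so that no single worker is left holding a large chunk
while the others sit idle.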

For the most part, benchmarks have shown significant performance
improvements for large parallel sequential scans on Linux, FreeBSD and
Windows using SSDs.  It's less clear how this affects performance on
cloud providers: tests done so far have not obtained performance stable
enough to provide meaningful benchmark results.  It is possible that this
could cause some performance regressions on more obscure filesystems, so
we may later need to give users some way to get something closer to the
old behavior.  For now, let's leave that until we see that it's really
required.

Author: Thomas Munro, David Rowley
Reviewed-by: Ranier Vilela, Soumyadeep Chakraborty, Robert Haas
Reviewed-by: Amit Kapila, Kirk Jamison
Discussion: https://postgr.es/m/CA+hUKGJ_EErDv41YycXcbMbCBkztA34+z1ts9VQH+ACRuvpxig@mail.gmail.com
2020-07-26 21:02:45 +12:00
access Allocate consecutive blocks during parallel seqscans 2020-07-26 21:02:45 +12:00
bootstrap Be more careful about marking catalog columns NOT NULL by default. 2020-07-21 13:03:48 -04:00
catalog Rename configure.in to configure.ac 2020-07-24 10:42:08 +02:00
commands Improve performance of binary COPY FROM through better buffering. 2020-07-25 16:34:35 -04:00
executor Fix buffer usage stats for nodes above Gather Merge. 2020-07-25 10:20:39 +05:30
foreign Update copyrights for 2020 2020-01-01 12:21:45 -05:00
jit pgindent run prior to branching v13. 2020-06-07 16:57:08 -04:00
lib Move src/backend/utils/hash/hashfn.c to src/common 2020-02-27 09:25:41 +05:30
libpq code: replace most remaining uses of 'master'. 2020-07-08 13:24:35 -07:00
main Clean up includes of s_lock.h. 2020-06-18 19:41:05 -07:00
nodes Rename field "relkind" to "objtype" for CTAS and ALTER TABLE nodes 2020-07-11 13:32:28 +09:00
optimizer Use MinimalTuple for tuple queues. 2020-07-17 15:04:16 +12:00
parser Rename field "relkind" to "objtype" for CTAS and ALTER TABLE nodes 2020-07-11 13:32:28 +09:00
partitioning Fix two typos in a comment 2020-05-22 17:39:16 -04:00
po Translation updates 2020-05-18 12:49:30 +02:00
port Add huge_page_size setting for use on Linux. 2020-07-17 14:33:00 +12:00
postmaster Use RAND_poll() for seeding randomness after fork(). 2020-07-25 14:50:59 -07:00
regex Dial back -Wimplicit-fallthrough to level 3 2020-05-13 15:31:14 -04:00
replication WAL Log invalidations at command end with wal_level=logical. 2020-07-23 08:34:48 +05:30
rewrite Add missing invocations to object access hooks 2020-05-23 14:03:04 +09:00
snowball code: replace most remaining uses of 'master'. 2020-07-08 13:24:35 -07:00
statistics Run pgindent with new pg_bsd_indent version 2.1.1. 2020-05-16 11:54:51 -04:00
storage Fix error message. 2020-07-23 21:10:49 +12:00
tcop Rename field "relkind" to "objtype" for CTAS and ALTER TABLE nodes 2020-07-11 13:32:28 +09:00
tsearch Fix assorted bugs by changing TS_execute's callback API to ternary logic. 2020-07-24 15:26:51 -04:00
utils Tweak behavior of pg_stat_activity.leader_pid 2020-07-26 16:32:11 +09:00
.gitignore Add .gitignore entries for AIX-specific intermediate build artifacts. 2015-07-08 20:44:22 -04:00
Makefile Update copyrights for 2020 2020-01-01 12:21:45 -05:00
common.mk Remove PARTIAL_LINKING build mode. 2018-03-30 17:33:04 -07:00
nls.mk Add missing gettext triggers 2020-04-28 13:35:40 +02:00