From d37ab378b6e773c278c14b9554a1ea23b355aab9 Mon Sep 17 00:00:00 2001 From: Andres Freund Date: Mon, 14 Aug 2023 09:54:03 -0700 Subject: [PATCH] hio: Take number of prior relation extensions into account The new relation extension logic, introduced in 00d1e02be24, could lead to slowdowns in some scenarios. E.g., when loading narrow rows into a table using COPY, the caller of RelationGetBufferForTuple() will only request a small number of pages. Without concurrency, we just extended using pwritev() in that case. However, if there is *some* concurrency, we switched between extending by a small number of pages and a larger number of pages, depending on the number of waiters for the relation extension logic. However, some filesystems, XFS in particular, do not perform well when switching between extending files using fallocate() and pwritev(). To avoid that issue, remember the number of prior relation extensions in BulkInsertState and extend more aggressively if there were prior relation extensions. That not just avoids the aforementioned slowdown, but also leads to noticeable performance gains in other situations, primarily due to extending more aggressively when there is no concurrency. I should have done it this way from the get go. Reported-by: Masahiko Sawada Author: Andres Freund Discussion: https://postgr.es/m/CAD21AoDvDmUQeJtZrau1ovnT_smN940=Kp6mszNGK3bq9yRN6g@mail.gmail.com Backpatch: 16-, where the new relation extension code was added --- src/backend/access/heap/heapam.c | 1 + src/backend/access/heap/hio.c | 19 +++++++++++++++++++ src/include/access/hio.h | 13 ++++++++++--- 3 files changed, 30 insertions(+), 3 deletions(-) diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c index 7ed72abe59..6a66214a58 100644 --- a/src/backend/access/heap/heapam.c +++ b/src/backend/access/heap/heapam.c @@ -1776,6 +1776,7 @@ GetBulkInsertState(void) bistate->current_buf = InvalidBuffer; bistate->next_free = InvalidBlockNumber; bistate->last_free = InvalidBlockNumber; + bistate->already_extended_by = 0; return bistate; } diff --git a/src/backend/access/heap/hio.c b/src/backend/access/heap/hio.c index c275b08494..21f808fecb 100644 --- a/src/backend/access/heap/hio.c +++ b/src/backend/access/heap/hio.c @@ -283,6 +283,24 @@ RelationAddBlocks(Relation relation, BulkInsertState bistate, */ extend_by_pages += extend_by_pages * waitcount; + /* --- + * If we previously extended using the same bistate, it's very likely + * we'll extend some more. Try to extend by as many pages as + * before. This can be important for performance for several reasons, + * including: + * + * - It prevents mdzeroextend() switching between extending the + * relation in different ways, which is inefficient for some + * filesystems. + * + * - Contention is often intermittent. Even if we currently don't see + * other waiters (see above), extending by larger amounts can + * prevent future contention. + * --- + */ + if (bistate) + extend_by_pages = Max(extend_by_pages, bistate->already_extended_by); + /* * Can't extend by more than MAX_BUFFERS_TO_EXTEND_BY, we need to pin * them all concurrently. @@ -409,6 +427,7 @@ RelationAddBlocks(Relation relation, BulkInsertState bistate, /* maintain bistate->current_buf */ IncrBufferRefCount(buffer); bistate->current_buf = buffer; + bistate->already_extended_by += extend_by_pages; } return buffer; diff --git a/src/include/access/hio.h b/src/include/access/hio.h index 228433ee4a..9bc563b762 100644 --- a/src/include/access/hio.h +++ b/src/include/access/hio.h @@ -32,15 +32,22 @@ typedef struct BulkInsertStateData Buffer current_buf; /* current insertion target page */ /* - * State for bulk extensions. Further pages that were unused at the time - * of the extension. They might be in use by the time we use them though, - * so rechecks are needed. + * State for bulk extensions. + * + * last_free..next_free are further pages that were unused at the time of + * the last extension. They might be in use by the time we use them + * though, so rechecks are needed. * * XXX: Eventually these should probably live in RelationData instead, * alongside targetblock. + * + * already_extended_by is the number of pages that this bulk inserted + * extended by. If we already extended by a significant number of pages, + * we can be more aggressive about extending going forward. */ BlockNumber next_free; BlockNumber last_free; + uint32 already_extended_by; } BulkInsertStateData;