1996-07-09 08:22:35 +02:00
|
|
|
/*-------------------------------------------------------------------------
|
|
|
|
*
|
1999-02-14 00:22:53 +01:00
|
|
|
* aset.c
|
1996-07-09 08:22:35 +02:00
|
|
|
* Allocation set definitions.
|
|
|
|
*
|
2000-06-28 05:33:33 +02:00
|
|
|
* AllocSet is our standard implementation of the abstract MemoryContext
|
|
|
|
* type.
|
|
|
|
*
|
|
|
|
*
|
2024-01-04 02:49:05 +01:00
|
|
|
* Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
|
2000-01-26 06:58:53 +01:00
|
|
|
* Portions Copyright (c) 1994, Regents of the University of California
|
1996-07-09 08:22:35 +02:00
|
|
|
*
|
|
|
|
* IDENTIFICATION
|
2010-09-20 22:08:53 +02:00
|
|
|
* src/backend/utils/mmgr/aset.c
|
1996-07-09 08:22:35 +02:00
|
|
|
*
|
1999-02-06 17:50:34 +01:00
|
|
|
* NOTE:
|
|
|
|
* This is a new (Feb. 05, 1999) implementation of the allocation set
|
|
|
|
* routines. AllocSet...() does not use OrderedSet...() any more.
|
|
|
|
* Instead it manages allocations in a block pool by itself, combining
|
1999-08-24 22:11:19 +02:00
|
|
|
* many small allocations in a few bigger blocks. AllocSetFree() normally
|
|
|
|
* doesn't free() memory really. It just add's the free'd area to some
|
1999-02-06 17:50:34 +01:00
|
|
|
* list for later reuse by AllocSetAlloc(). All memory blocks are free()'d
|
1999-05-23 01:19:37 +02:00
|
|
|
* at once on AllocSetReset(), which happens when the memory context gets
|
1999-02-06 17:50:34 +01:00
|
|
|
* destroyed.
|
|
|
|
* Jan Wieck
|
1999-08-24 22:11:19 +02:00
|
|
|
*
|
|
|
|
* Performance improvement from Tom Lane, 8/99: for extremely large request
|
|
|
|
* sizes, we do want to be able to give the memory back to free() as soon
|
|
|
|
* as it is pfree()'d. Otherwise we risk tying up a lot of memory in
|
|
|
|
* freelist entries that might never be usable. This is specially needed
|
|
|
|
* when the caller is repeatedly repalloc()'ing a block bigger and bigger;
|
|
|
|
* the previous instances of the block were guaranteed to be wasted until
|
|
|
|
* AllocSetReset() under the old way.
|
2000-12-01 06:16:45 +01:00
|
|
|
*
|
2000-12-01 19:14:29 +01:00
|
|
|
* Further improvement 12/00: as the code stood, request sizes in the
|
|
|
|
* midrange between "small" and "large" were handled very inefficiently,
|
|
|
|
* because any sufficiently large free chunk would be used to satisfy a
|
|
|
|
* request, even if it was much larger than necessary. This led to more
|
|
|
|
* and more wasted space in allocated chunks over time. To fix, get rid
|
|
|
|
* of the midrange behavior: we now handle only "small" power-of-2-size
|
|
|
|
* chunks as chunks. Anything "large" is passed off to malloc(). Change
|
|
|
|
* the number of freelists to change the small/large boundary.
|
|
|
|
*
|
1996-07-09 08:22:35 +02:00
|
|
|
*-------------------------------------------------------------------------
|
|
|
|
*/
|
2000-06-28 05:33:33 +02:00
|
|
|
|
1996-11-06 11:32:10 +01:00
|
|
|
#include "postgres.h"
|
2000-06-28 05:33:33 +02:00
|
|
|
|
2019-12-28 23:21:17 +01:00
|
|
|
#include "port/pg_bitutils.h"
|
2013-06-27 02:00:08 +02:00
|
|
|
#include "utils/memdebug.h"
|
1999-07-16 05:14:30 +02:00
|
|
|
#include "utils/memutils.h"
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
#include "utils/memutils_internal.h"
|
2024-03-13 15:07:00 +01:00
|
|
|
#include "utils/memutils_memorychunk.h"
|
1996-07-09 08:22:35 +02:00
|
|
|
|
2000-12-01 19:14:29 +01:00
|
|
|
/*--------------------
|
|
|
|
* Chunk freelist k holds chunks of size 1 << (k + ALLOC_MINBITS),
|
|
|
|
* for k = 0 .. ALLOCSET_NUM_FREELISTS-1.
|
|
|
|
*
|
|
|
|
* Note that all chunks in the freelists have power-of-2 sizes. This
|
|
|
|
* improves recyclability: we may waste some space, but the wasted space
|
|
|
|
* should stay pretty constant as requests are made and released.
|
|
|
|
*
|
|
|
|
* A request too large for the last freelist is handled by allocating a
|
|
|
|
* dedicated block from malloc(). The block still has a block header and
|
|
|
|
* chunk header, but when the chunk is freed we'll return the whole block
|
|
|
|
* to malloc(), not put it on our freelists.
|
|
|
|
*
|
|
|
|
* CAUTION: ALLOC_MINBITS must be large enough so that
|
|
|
|
* 1<<ALLOC_MINBITS is at least MAXALIGN,
|
|
|
|
* or we may fail to align the smallest chunks adequately.
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
* 8-byte alignment is enough on all currently known machines. This 8-byte
|
|
|
|
* minimum also allows us to store a pointer to the next freelist item within
|
|
|
|
* the chunk of memory itself.
|
2000-12-01 19:14:29 +01:00
|
|
|
*
|
|
|
|
* With the current parameters, request sizes up to 8K are treated as chunks,
|
|
|
|
* larger requests go into dedicated blocks. Change ALLOCSET_NUM_FREELISTS
|
Fix bogus "out of memory" reports in tuplestore.c.
The tuplesort/tuplestore memory management logic assumed that the chunk
allocation overhead for its memtuples array could not increase when
increasing the array size. This is and always was true for tuplesort,
but we (I, I think) blindly copied that logic into tuplestore.c without
noticing that the assumption failed to hold for the much smaller array
elements used by tuplestore. Given rather small work_mem, this could
result in an improper complaint about "unexpected out-of-memory situation",
as reported by Brent DeSpain in bug #13530.
The easiest way to fix this is just to increase tuplestore's initial
array size so that the assumption holds. Rather than relying on magic
constants, though, let's export a #define from aset.c that represents
the safe allocation threshold, and make tuplestore's calculation depend
on that.
Do the same in tuplesort.c to keep the logic looking parallel, even though
tuplesort.c isn't actually at risk at present. This will keep us from
breaking it if we ever muck with the allocation parameters in aset.c.
Back-patch to all supported versions. The error message doesn't occur
pre-9.3, not so much because the problem can't happen as because the
pre-9.3 tuplestore code neglected to check for it. (The chance of
trouble is a great deal larger as of 9.3, though, due to changes in the
array-size-increasing strategy.) However, allowing LACKMEM() to become
true unexpectedly could still result in less-than-desirable behavior,
so let's patch it all the way back.
2015-08-05 00:18:46 +02:00
|
|
|
* to adjust the boundary point; and adjust ALLOCSET_SEPARATE_THRESHOLD in
|
|
|
|
* memutils.h to agree. (Note: in contexts with small maxBlockSize, we may
|
|
|
|
* set the allocChunkLimit to less than 8K, so as to avoid space wastage.)
|
2000-12-01 19:14:29 +01:00
|
|
|
*--------------------
|
2000-06-28 05:33:33 +02:00
|
|
|
*/
|
2000-12-01 19:14:29 +01:00
|
|
|
|
2004-05-26 21:44:15 +02:00
|
|
|
#define ALLOC_MINBITS 3 /* smallest chunk size is 8 bytes */
|
|
|
|
#define ALLOCSET_NUM_FREELISTS 11
|
2000-12-01 19:14:29 +01:00
|
|
|
#define ALLOC_CHUNK_LIMIT (1 << (ALLOCSET_NUM_FREELISTS-1+ALLOC_MINBITS))
|
|
|
|
/* Size of largest chunk that we use a fixed size for */
|
2011-05-02 18:08:08 +02:00
|
|
|
#define ALLOC_CHUNK_FRACTION 4
|
|
|
|
/* We allow chunks to be at most 1/4 of maxBlockSize (less overhead) */
|
2000-12-01 19:14:29 +01:00
|
|
|
|
|
|
|
/*--------------------
|
|
|
|
* The first block allocated for an allocset has size initBlockSize.
|
|
|
|
* Each time we have to allocate another block, we double the block size
|
|
|
|
* (if possible, and without exceeding maxBlockSize), so as to reduce
|
|
|
|
* the bookkeeping load on malloc().
|
|
|
|
*
|
|
|
|
* Blocks allocated to hold oversize chunks do not follow this rule, however;
|
|
|
|
* they are just however big they need to be to hold that single chunk.
|
Rethink MemoryContext creation to improve performance.
This patch makes a number of interrelated changes to reduce the overhead
involved in creating/deleting memory contexts. The key ideas are:
* Include the AllocSetContext header of an aset.c context in its first
malloc request, rather than allocating it separately in TopMemoryContext.
This means that we now always create an initial or "keeper" block in an
aset, even if it never receives any allocation requests.
* Create freelists in which we can save and recycle recently-destroyed
asets (this idea is due to Robert Haas).
* In the common case where the name of a context is a constant string,
just store a pointer to it in the context header, rather than copying
the string.
The first change eliminates a palloc/pfree cycle per context, and
also avoids bloat in TopMemoryContext, at the price that creating
a context now involves a malloc/free cycle even if the context never
receives any allocations. That would be a loser for some common
usage patterns, but recycling short-lived contexts via the freelist
eliminates that pain.
Avoiding copying constant strings not only saves strlen() and strcpy()
overhead, but is an essential part of the freelist optimization because
it makes the context header size constant. Currently we make no
attempt to use the freelist for contexts with non-constant names.
(Perhaps someday we'll need to think harder about that, but in current
usage, most contexts with custom names are long-lived anyway.)
The freelist management in this initial commit is pretty simplistic,
and we might want to refine it later --- but in common workloads that
will never matter because the freelists will never get full anyway.
To create a context with a non-constant name, one is now required to
call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME
option. AllocSetContextCreate becomes a wrapper macro, and it includes
a test that will complain about non-string-literal context name
parameters on gcc and similar compilers.
An unfortunate side effect of making AllocSetContextCreate a macro is
that one is now *required* to use the size parameter abstraction macros
(ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of
writing out individual size parameters no longer works unless you
switch to AllocSetContextCreateExtended.
Internally to the memory-context-related modules, the context creation
APIs are simplified, removing the rather baroque original design whereby
a context-type module called mcxt.c which then called back into the
context-type module. That saved a bit of code duplication, but not much,
and it prevented context-type modules from exercising control over the
allocation of context headers.
In passing, I converted the test-and-elog validation of aset size
parameters into Asserts to save a few more cycles. The original thought
was that callers might compute size parameters on the fly, but in practice
nobody does that, so it's useless to expend cycles on checking those
numbers in production builds.
Also, mark the memory context method-pointer structs "const",
just for cleanliness.
Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
2017-12-13 19:55:12 +01:00
|
|
|
*
|
|
|
|
* Also, if a minContextSize is specified, the first block has that size,
|
|
|
|
* and then initBlockSize is used for the next one.
|
2000-12-01 19:14:29 +01:00
|
|
|
*--------------------
|
|
|
|
*/
|
|
|
|
|
|
|
|
#define ALLOC_BLOCKHDRSZ MAXALIGN(sizeof(AllocBlockData))
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
#define ALLOC_CHUNKHDRSZ sizeof(MemoryChunk)
|
2013-06-27 02:00:08 +02:00
|
|
|
|
2000-12-01 19:14:29 +01:00
|
|
|
typedef struct AllocBlockData *AllocBlock; /* forward reference */
|
2000-06-28 05:33:33 +02:00
|
|
|
|
|
|
|
/*
|
|
|
|
* AllocPointer
|
|
|
|
* Aligned pointer which may be a member of an allocation set.
|
|
|
|
*/
|
|
|
|
typedef void *AllocPointer;
|
|
|
|
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
/*
|
|
|
|
* AllocFreeListLink
|
|
|
|
* When pfreeing memory, if we maintain a freelist for the given chunk's
|
|
|
|
* size then we use a AllocFreeListLink to point to the current item in
|
|
|
|
* the AllocSetContext's freelist and then set the given freelist element
|
|
|
|
* to point to the chunk being freed.
|
|
|
|
*/
|
|
|
|
typedef struct AllocFreeListLink
|
|
|
|
{
|
|
|
|
MemoryChunk *next;
|
|
|
|
} AllocFreeListLink;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Obtain a AllocFreeListLink for the given chunk. Allocation sizes are
|
|
|
|
* always at least sizeof(AllocFreeListLink), so we reuse the pointer's memory
|
|
|
|
* itself to store the freelist link.
|
|
|
|
*/
|
|
|
|
#define GetFreeListLink(chkptr) \
|
|
|
|
(AllocFreeListLink *) ((char *) (chkptr) + ALLOC_CHUNKHDRSZ)
|
|
|
|
|
Harden memory context allocators against bogus chunk pointers.
Before commit c6e0fe1f2, functions such as AllocSetFree could pretty
safely presume that they were given a valid chunk pointer for their
own type of context, because the indirect call through a memory
context object and method struct would be very unlikely to work
otherwise. But now, if pfree() is mistakenly invoked on a pointer
to garbage, we have three chances in eight of ending up at one of
these functions. That means we need to take extra measures to
verify that we are looking at what we're supposed to be looking at,
especially in debug builds.
Hence, add code to verify that the chunk's back-link to a block header
leads to a memory context object that satisfies the right sort of
IsA() check. This is still a bit weaker than what we did before,
but for the moment assume that an IsA() check is sufficient.
As a compromise between speed and safety, implement these checks
as Asserts when dealing with small chunks but plain test-and-elogs
when dealing with large (external) chunks. The latter case should
not be too performance-critical, but the former case probably is.
In slab.c, all chunks are small; but nonetheless use a plain test
in SlabRealloc, because that is certainly not performance-critical,
indeed we should be suspicious that it's being called in error.
In aset.c, additionally add some assertions that the "value" field
of the chunk header is within the small range allowed for freelist
indexes. Without that, we might find ourselves trying to wipe
most of memory when CLOBBER_FREED_MEMORY is enabled, or scribbling
on a "freelist header" that's far away from the context object.
Eventually, field experience might show us that it's smarter for
these tests to be active always, but for now we'll try to get
away with just having them as assertions.
While at it, also be more uniform about asserting that context
objects passed as parameters are of the type we expect. Some
places missed that altogether, and slab.c was for no very good
reason doing it differently from the other allocators.
Discussion: https://postgr.es/m/3578387.1665244345@sss.pgh.pa.us
2022-10-11 00:45:34 +02:00
|
|
|
/* Validate a freelist index retrieved from a chunk header */
|
|
|
|
#define FreeListIdxIsValid(fidx) \
|
|
|
|
((fidx) >= 0 && (fidx) < ALLOCSET_NUM_FREELISTS)
|
|
|
|
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
/* Determine the size of the chunk based on the freelist index */
|
|
|
|
#define GetChunkSizeFromFreeListIdx(fidx) \
|
|
|
|
((((Size) 1) << ALLOC_MINBITS) << (fidx))
|
|
|
|
|
2000-12-01 19:14:29 +01:00
|
|
|
/*
|
|
|
|
* AllocSetContext is our standard implementation of MemoryContext.
|
2005-09-01 20:15:42 +02:00
|
|
|
*
|
2011-05-24 23:56:52 +02:00
|
|
|
* Note: header.isReset means there is nothing for AllocSetReset to do.
|
|
|
|
* This is different from the aset being physically empty (empty blocks list)
|
Rethink MemoryContext creation to improve performance.
This patch makes a number of interrelated changes to reduce the overhead
involved in creating/deleting memory contexts. The key ideas are:
* Include the AllocSetContext header of an aset.c context in its first
malloc request, rather than allocating it separately in TopMemoryContext.
This means that we now always create an initial or "keeper" block in an
aset, even if it never receives any allocation requests.
* Create freelists in which we can save and recycle recently-destroyed
asets (this idea is due to Robert Haas).
* In the common case where the name of a context is a constant string,
just store a pointer to it in the context header, rather than copying
the string.
The first change eliminates a palloc/pfree cycle per context, and
also avoids bloat in TopMemoryContext, at the price that creating
a context now involves a malloc/free cycle even if the context never
receives any allocations. That would be a loser for some common
usage patterns, but recycling short-lived contexts via the freelist
eliminates that pain.
Avoiding copying constant strings not only saves strlen() and strcpy()
overhead, but is an essential part of the freelist optimization because
it makes the context header size constant. Currently we make no
attempt to use the freelist for contexts with non-constant names.
(Perhaps someday we'll need to think harder about that, but in current
usage, most contexts with custom names are long-lived anyway.)
The freelist management in this initial commit is pretty simplistic,
and we might want to refine it later --- but in common workloads that
will never matter because the freelists will never get full anyway.
To create a context with a non-constant name, one is now required to
call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME
option. AllocSetContextCreate becomes a wrapper macro, and it includes
a test that will complain about non-string-literal context name
parameters on gcc and similar compilers.
An unfortunate side effect of making AllocSetContextCreate a macro is
that one is now *required* to use the size parameter abstraction macros
(ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of
writing out individual size parameters no longer works unless you
switch to AllocSetContextCreateExtended.
Internally to the memory-context-related modules, the context creation
APIs are simplified, removing the rather baroque original design whereby
a context-type module called mcxt.c which then called back into the
context-type module. That saved a bit of code duplication, but not much,
and it prevented context-type modules from exercising control over the
allocation of context headers.
In passing, I converted the test-and-elog validation of aset size
parameters into Asserts to save a few more cycles. The original thought
was that callers might compute size parameters on the fly, but in practice
nobody does that, so it's useless to expend cycles on checking those
numbers in production builds.
Also, mark the memory context method-pointer structs "const",
just for cleanliness.
Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
2017-12-13 19:55:12 +01:00
|
|
|
* because we will still have a keeper block. It's also different from the set
|
2011-05-24 23:56:52 +02:00
|
|
|
* being logically empty, because we don't attempt to detect pfree'ing the
|
|
|
|
* last active chunk.
|
2000-12-01 19:14:29 +01:00
|
|
|
*/
|
|
|
|
typedef struct AllocSetContext
|
|
|
|
{
|
|
|
|
MemoryContextData header; /* Standard memory-context fields */
|
|
|
|
/* Info about storage allocated in this context: */
|
|
|
|
AllocBlock blocks; /* head of list of blocks in this set */
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
MemoryChunk *freelist[ALLOCSET_NUM_FREELISTS]; /* free chunk lists */
|
2000-12-01 19:14:29 +01:00
|
|
|
/* Allocation parameters for this context: */
|
2023-07-17 01:16:56 +02:00
|
|
|
uint32 initBlockSize; /* initial block size */
|
|
|
|
uint32 maxBlockSize; /* maximum block size */
|
|
|
|
uint32 nextBlockSize; /* next block size to allocate */
|
|
|
|
uint32 allocChunkLimit; /* effective chunk size limit */
|
Rethink MemoryContext creation to improve performance.
This patch makes a number of interrelated changes to reduce the overhead
involved in creating/deleting memory contexts. The key ideas are:
* Include the AllocSetContext header of an aset.c context in its first
malloc request, rather than allocating it separately in TopMemoryContext.
This means that we now always create an initial or "keeper" block in an
aset, even if it never receives any allocation requests.
* Create freelists in which we can save and recycle recently-destroyed
asets (this idea is due to Robert Haas).
* In the common case where the name of a context is a constant string,
just store a pointer to it in the context header, rather than copying
the string.
The first change eliminates a palloc/pfree cycle per context, and
also avoids bloat in TopMemoryContext, at the price that creating
a context now involves a malloc/free cycle even if the context never
receives any allocations. That would be a loser for some common
usage patterns, but recycling short-lived contexts via the freelist
eliminates that pain.
Avoiding copying constant strings not only saves strlen() and strcpy()
overhead, but is an essential part of the freelist optimization because
it makes the context header size constant. Currently we make no
attempt to use the freelist for contexts with non-constant names.
(Perhaps someday we'll need to think harder about that, but in current
usage, most contexts with custom names are long-lived anyway.)
The freelist management in this initial commit is pretty simplistic,
and we might want to refine it later --- but in common workloads that
will never matter because the freelists will never get full anyway.
To create a context with a non-constant name, one is now required to
call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME
option. AllocSetContextCreate becomes a wrapper macro, and it includes
a test that will complain about non-string-literal context name
parameters on gcc and similar compilers.
An unfortunate side effect of making AllocSetContextCreate a macro is
that one is now *required* to use the size parameter abstraction macros
(ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of
writing out individual size parameters no longer works unless you
switch to AllocSetContextCreateExtended.
Internally to the memory-context-related modules, the context creation
APIs are simplified, removing the rather baroque original design whereby
a context-type module called mcxt.c which then called back into the
context-type module. That saved a bit of code duplication, but not much,
and it prevented context-type modules from exercising control over the
allocation of context headers.
In passing, I converted the test-and-elog validation of aset size
parameters into Asserts to save a few more cycles. The original thought
was that callers might compute size parameters on the fly, but in practice
nobody does that, so it's useless to expend cycles on checking those
numbers in production builds.
Also, mark the memory context method-pointer structs "const",
just for cleanliness.
Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
2017-12-13 19:55:12 +01:00
|
|
|
/* freelist this context could be put in, or -1 if not a candidate: */
|
|
|
|
int freeListIndex; /* index in context_freelists[], or -1 */
|
2000-12-01 19:14:29 +01:00
|
|
|
} AllocSetContext;
|
|
|
|
|
|
|
|
typedef AllocSetContext *AllocSet;
|
|
|
|
|
2000-06-28 05:33:33 +02:00
|
|
|
/*
|
|
|
|
* AllocBlock
|
|
|
|
* An AllocBlock is the unit of memory that is obtained by aset.c
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
* from malloc(). It contains one or more MemoryChunks, which are
|
|
|
|
* the units requested by palloc() and freed by pfree(). MemoryChunks
|
2000-06-28 05:33:33 +02:00
|
|
|
* cannot be returned to malloc() individually, instead they are put
|
|
|
|
* on freelists by pfree() and re-used by the next palloc() that has
|
|
|
|
* a matching request size.
|
|
|
|
*
|
|
|
|
* AllocBlockData is the header data for a block --- the usable space
|
|
|
|
* within the block begins at the next alignment boundary.
|
|
|
|
*/
|
|
|
|
typedef struct AllocBlockData
|
|
|
|
{
|
|
|
|
AllocSet aset; /* aset that owns this block */
|
Use doubly-linked block lists in aset.c to reduce large-chunk overhead.
Large chunks (those too large for any palloc freelist) are managed as
separate blocks. Formerly, realloc'ing or pfree'ing such a chunk required
O(N) time in a context with N blocks, since we had to traipse down the
singly-linked block list to locate the block's predecessor before we could
fix the list links. This can result in O(N^2) runtime in situations where
large numbers of such chunks are manipulated within one context. Cases
like that were not foreseen in the original design of aset.c, and indeed
didn't arise until fairly recently. But such problems can now occur in
reorderbuffer.c and in hash joining, both of which make repeated large
requests without scaling up their request size as they do so, and which
will free their requests in not-necessarily-LIFO order.
To fix, change the block list from singly-linked to doubly-linked.
This adds another 4 or 8 bytes to ALLOC_BLOCKHDRSZ, but that doesn't
seem like unacceptable overhead, since aset.c's blocks are normally
8K or more, and never less than 1K in current practice.
In passing, get rid of some redundant AllocChunkGetPointer() calls in
AllocSetRealloc (the compiler might be smart enough to optimize these
away anyway, but no need to assume that) and improve AllocSetCheck's
checking of block header fields.
Back-patch to 9.4 where reorderbuffer.c appeared. We could take this
further back, but currently there's no evidence that it would be useful.
Discussion: https://postgr.es/m/CAMkU=1x1hvue1XYrZoWk_omG0Ja5nBvTdvgrOeVkkeqs71CV8g@mail.gmail.com
2017-03-08 18:21:12 +01:00
|
|
|
AllocBlock prev; /* prev block in aset's blocks list, if any */
|
|
|
|
AllocBlock next; /* next block in aset's blocks list, if any */
|
2000-06-28 05:33:33 +02:00
|
|
|
char *freeptr; /* start of free space in this block */
|
|
|
|
char *endptr; /* end of space in this block */
|
|
|
|
} AllocBlockData;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* AllocPointerIsValid
|
|
|
|
* True iff pointer is valid allocation pointer.
|
|
|
|
*/
|
|
|
|
#define AllocPointerIsValid(pointer) PointerIsValid(pointer)
|
|
|
|
|
|
|
|
/*
|
|
|
|
* AllocSetIsValid
|
|
|
|
* True iff set is valid allocation set.
|
|
|
|
*/
|
Harden memory context allocators against bogus chunk pointers.
Before commit c6e0fe1f2, functions such as AllocSetFree could pretty
safely presume that they were given a valid chunk pointer for their
own type of context, because the indirect call through a memory
context object and method struct would be very unlikely to work
otherwise. But now, if pfree() is mistakenly invoked on a pointer
to garbage, we have three chances in eight of ending up at one of
these functions. That means we need to take extra measures to
verify that we are looking at what we're supposed to be looking at,
especially in debug builds.
Hence, add code to verify that the chunk's back-link to a block header
leads to a memory context object that satisfies the right sort of
IsA() check. This is still a bit weaker than what we did before,
but for the moment assume that an IsA() check is sufficient.
As a compromise between speed and safety, implement these checks
as Asserts when dealing with small chunks but plain test-and-elogs
when dealing with large (external) chunks. The latter case should
not be too performance-critical, but the former case probably is.
In slab.c, all chunks are small; but nonetheless use a plain test
in SlabRealloc, because that is certainly not performance-critical,
indeed we should be suspicious that it's being called in error.
In aset.c, additionally add some assertions that the "value" field
of the chunk header is within the small range allowed for freelist
indexes. Without that, we might find ourselves trying to wipe
most of memory when CLOBBER_FREED_MEMORY is enabled, or scribbling
on a "freelist header" that's far away from the context object.
Eventually, field experience might show us that it's smarter for
these tests to be active always, but for now we'll try to get
away with just having them as assertions.
While at it, also be more uniform about asserting that context
objects passed as parameters are of the type we expect. Some
places missed that altogether, and slab.c was for no very good
reason doing it differently from the other allocators.
Discussion: https://postgr.es/m/3578387.1665244345@sss.pgh.pa.us
2022-10-11 00:45:34 +02:00
|
|
|
#define AllocSetIsValid(set) \
|
|
|
|
(PointerIsValid(set) && IsA(set, AllocSetContext))
|
|
|
|
|
|
|
|
/*
|
|
|
|
* AllocBlockIsValid
|
|
|
|
* True iff block is valid block of allocation set.
|
|
|
|
*/
|
|
|
|
#define AllocBlockIsValid(block) \
|
|
|
|
(PointerIsValid(block) && AllocSetIsValid((block)->aset))
|
1996-07-09 08:22:35 +02:00
|
|
|
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
/*
|
|
|
|
* We always store external chunks on a dedicated block. This makes fetching
|
|
|
|
* the block from an external chunk easy since it's always the first and only
|
|
|
|
* chunk on the block.
|
|
|
|
*/
|
|
|
|
#define ExternalChunkGetBlock(chunk) \
|
|
|
|
(AllocBlock) ((char *) chunk - ALLOC_BLOCKHDRSZ)
|
1996-07-09 08:22:35 +02:00
|
|
|
|
Rethink MemoryContext creation to improve performance.
This patch makes a number of interrelated changes to reduce the overhead
involved in creating/deleting memory contexts. The key ideas are:
* Include the AllocSetContext header of an aset.c context in its first
malloc request, rather than allocating it separately in TopMemoryContext.
This means that we now always create an initial or "keeper" block in an
aset, even if it never receives any allocation requests.
* Create freelists in which we can save and recycle recently-destroyed
asets (this idea is due to Robert Haas).
* In the common case where the name of a context is a constant string,
just store a pointer to it in the context header, rather than copying
the string.
The first change eliminates a palloc/pfree cycle per context, and
also avoids bloat in TopMemoryContext, at the price that creating
a context now involves a malloc/free cycle even if the context never
receives any allocations. That would be a loser for some common
usage patterns, but recycling short-lived contexts via the freelist
eliminates that pain.
Avoiding copying constant strings not only saves strlen() and strcpy()
overhead, but is an essential part of the freelist optimization because
it makes the context header size constant. Currently we make no
attempt to use the freelist for contexts with non-constant names.
(Perhaps someday we'll need to think harder about that, but in current
usage, most contexts with custom names are long-lived anyway.)
The freelist management in this initial commit is pretty simplistic,
and we might want to refine it later --- but in common workloads that
will never matter because the freelists will never get full anyway.
To create a context with a non-constant name, one is now required to
call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME
option. AllocSetContextCreate becomes a wrapper macro, and it includes
a test that will complain about non-string-literal context name
parameters on gcc and similar compilers.
An unfortunate side effect of making AllocSetContextCreate a macro is
that one is now *required* to use the size parameter abstraction macros
(ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of
writing out individual size parameters no longer works unless you
switch to AllocSetContextCreateExtended.
Internally to the memory-context-related modules, the context creation
APIs are simplified, removing the rather baroque original design whereby
a context-type module called mcxt.c which then called back into the
context-type module. That saved a bit of code duplication, but not much,
and it prevented context-type modules from exercising control over the
allocation of context headers.
In passing, I converted the test-and-elog validation of aset size
parameters into Asserts to save a few more cycles. The original thought
was that callers might compute size parameters on the fly, but in practice
nobody does that, so it's useless to expend cycles on checking those
numbers in production builds.
Also, mark the memory context method-pointer structs "const",
just for cleanliness.
Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
2017-12-13 19:55:12 +01:00
|
|
|
/*
|
|
|
|
* Rather than repeatedly creating and deleting memory contexts, we keep some
|
|
|
|
* freed contexts in freelists so that we can hand them out again with little
|
|
|
|
* work. Before putting a context in a freelist, we reset it so that it has
|
|
|
|
* only its initial malloc chunk and no others. To be a candidate for a
|
|
|
|
* freelist, a context must have the same minContextSize/initBlockSize as
|
|
|
|
* other contexts in the list; but its maxBlockSize is irrelevant since that
|
Allow memory contexts to have both fixed and variable ident strings.
Originally, we treated memory context names as potentially variable in
all cases, and therefore always copied them into the context header.
Commit 9fa6f00b1 rethought this a little bit and invented a distinction
between fixed and variable names, skipping the copy step for the former.
But we can make things both simpler and more useful by instead allowing
there to be two parts to a context's identification, a fixed "name" and
an optional, variable "ident". The name supplied in the context create
call is now required to be a compile-time-constant string in all cases,
as it is never copied but just pointed to. The "ident" string, if
wanted, is supplied later. This is needed because typically we want
the ident to be stored inside the context so that it's cleaned up
automatically on context deletion; that means it has to be copied into
the context before we can set the pointer.
The cost of this approach is basically just an additional pointer field
in struct MemoryContextData, which isn't much overhead, and is bought
back entirely in the AllocSet case by not needing a headerSize field
anymore, since we no longer have to cope with variable header length.
In addition, we can simplify the internal interfaces for memory context
creation still further, saving a few cycles there. And it's no longer
true that a custom identifier disqualifies a context from participating
in aset.c's freelist scheme, so possibly there's some win on that end.
All the places that were using non-compile-time-constant context names
are adjusted to put the variable info into the "ident" instead. This
allows more effective identification of those contexts in many cases;
for example, subsidary contexts of relcache entries are now identified
by both type (e.g. "index info") and relname, where before you got only
one or the other. Contexts associated with PL function cache entries
are now identified more fully and uniformly, too.
I also arranged for plancache contexts to use the query source string
as their identifier. This is basically free for CachedPlanSources, as
they contained a copy of that string already. We pay an extra pstrdup
to do it for CachedPlans. That could perhaps be avoided, but it would
make things more fragile (since the CachedPlanSource is sometimes
destroyed first). I suspect future improvements in error reporting will
require CachedPlans to have a copy of that string anyway, so it's not
clear that it's worth moving mountains to avoid it now.
This also changes the APIs for context statistics routines so that the
context-specific routines no longer assume that output goes straight
to stderr, nor do they know all details of the output format. This
is useful immediately to reduce code duplication, and it also allows
for external code to do something with stats output that's different
from printing to stderr.
The reason for pushing this now rather than waiting for v12 is that
it rethinks some of the API changes made by commit 9fa6f00b1. Seems
better for extension authors to endure just one round of API changes
not two.
Discussion: https://postgr.es/m/CAB=Je-FdtmFZ9y9REHD7VsSrnCkiBhsA4mdsLKSPauwXtQBeNA@mail.gmail.com
2018-03-27 22:46:47 +02:00
|
|
|
* doesn't affect the size of the initial chunk.
|
Rethink MemoryContext creation to improve performance.
This patch makes a number of interrelated changes to reduce the overhead
involved in creating/deleting memory contexts. The key ideas are:
* Include the AllocSetContext header of an aset.c context in its first
malloc request, rather than allocating it separately in TopMemoryContext.
This means that we now always create an initial or "keeper" block in an
aset, even if it never receives any allocation requests.
* Create freelists in which we can save and recycle recently-destroyed
asets (this idea is due to Robert Haas).
* In the common case where the name of a context is a constant string,
just store a pointer to it in the context header, rather than copying
the string.
The first change eliminates a palloc/pfree cycle per context, and
also avoids bloat in TopMemoryContext, at the price that creating
a context now involves a malloc/free cycle even if the context never
receives any allocations. That would be a loser for some common
usage patterns, but recycling short-lived contexts via the freelist
eliminates that pain.
Avoiding copying constant strings not only saves strlen() and strcpy()
overhead, but is an essential part of the freelist optimization because
it makes the context header size constant. Currently we make no
attempt to use the freelist for contexts with non-constant names.
(Perhaps someday we'll need to think harder about that, but in current
usage, most contexts with custom names are long-lived anyway.)
The freelist management in this initial commit is pretty simplistic,
and we might want to refine it later --- but in common workloads that
will never matter because the freelists will never get full anyway.
To create a context with a non-constant name, one is now required to
call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME
option. AllocSetContextCreate becomes a wrapper macro, and it includes
a test that will complain about non-string-literal context name
parameters on gcc and similar compilers.
An unfortunate side effect of making AllocSetContextCreate a macro is
that one is now *required* to use the size parameter abstraction macros
(ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of
writing out individual size parameters no longer works unless you
switch to AllocSetContextCreateExtended.
Internally to the memory-context-related modules, the context creation
APIs are simplified, removing the rather baroque original design whereby
a context-type module called mcxt.c which then called back into the
context-type module. That saved a bit of code duplication, but not much,
and it prevented context-type modules from exercising control over the
allocation of context headers.
In passing, I converted the test-and-elog validation of aset size
parameters into Asserts to save a few more cycles. The original thought
was that callers might compute size parameters on the fly, but in practice
nobody does that, so it's useless to expend cycles on checking those
numbers in production builds.
Also, mark the memory context method-pointer structs "const",
just for cleanliness.
Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
2017-12-13 19:55:12 +01:00
|
|
|
*
|
|
|
|
* We currently provide one freelist for ALLOCSET_DEFAULT_SIZES contexts
|
|
|
|
* and one for ALLOCSET_SMALL_SIZES contexts; the latter works for
|
|
|
|
* ALLOCSET_START_SMALL_SIZES too, since only the maxBlockSize differs.
|
|
|
|
*
|
|
|
|
* Ordinarily, we re-use freelist contexts in last-in-first-out order, in
|
|
|
|
* hopes of improving locality of reference. But if there get to be too
|
|
|
|
* many contexts in the list, we'd prefer to drop the most-recently-created
|
|
|
|
* contexts in hopes of keeping the process memory map compact.
|
|
|
|
* We approximate that by simply deleting all existing entries when the list
|
|
|
|
* overflows, on the assumption that queries that allocate a lot of contexts
|
|
|
|
* will probably free them in more or less reverse order of allocation.
|
|
|
|
*
|
|
|
|
* Contexts in a freelist are chained via their nextchild pointers.
|
|
|
|
*/
|
|
|
|
#define MAX_FREE_CONTEXTS 100 /* arbitrary limit on freelist length */
|
|
|
|
|
2023-07-17 01:16:56 +02:00
|
|
|
/* Obtain the keeper block for an allocation set */
|
|
|
|
#define KeeperBlock(set) \
|
|
|
|
((AllocBlock) (((char *) set) + MAXALIGN(sizeof(AllocSetContext))))
|
|
|
|
|
|
|
|
/* Check if the block is the keeper block of the given allocation set */
|
|
|
|
#define IsKeeperBlock(set, block) ((block) == (KeeperBlock(set)))
|
|
|
|
|
Rethink MemoryContext creation to improve performance.
This patch makes a number of interrelated changes to reduce the overhead
involved in creating/deleting memory contexts. The key ideas are:
* Include the AllocSetContext header of an aset.c context in its first
malloc request, rather than allocating it separately in TopMemoryContext.
This means that we now always create an initial or "keeper" block in an
aset, even if it never receives any allocation requests.
* Create freelists in which we can save and recycle recently-destroyed
asets (this idea is due to Robert Haas).
* In the common case where the name of a context is a constant string,
just store a pointer to it in the context header, rather than copying
the string.
The first change eliminates a palloc/pfree cycle per context, and
also avoids bloat in TopMemoryContext, at the price that creating
a context now involves a malloc/free cycle even if the context never
receives any allocations. That would be a loser for some common
usage patterns, but recycling short-lived contexts via the freelist
eliminates that pain.
Avoiding copying constant strings not only saves strlen() and strcpy()
overhead, but is an essential part of the freelist optimization because
it makes the context header size constant. Currently we make no
attempt to use the freelist for contexts with non-constant names.
(Perhaps someday we'll need to think harder about that, but in current
usage, most contexts with custom names are long-lived anyway.)
The freelist management in this initial commit is pretty simplistic,
and we might want to refine it later --- but in common workloads that
will never matter because the freelists will never get full anyway.
To create a context with a non-constant name, one is now required to
call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME
option. AllocSetContextCreate becomes a wrapper macro, and it includes
a test that will complain about non-string-literal context name
parameters on gcc and similar compilers.
An unfortunate side effect of making AllocSetContextCreate a macro is
that one is now *required* to use the size parameter abstraction macros
(ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of
writing out individual size parameters no longer works unless you
switch to AllocSetContextCreateExtended.
Internally to the memory-context-related modules, the context creation
APIs are simplified, removing the rather baroque original design whereby
a context-type module called mcxt.c which then called back into the
context-type module. That saved a bit of code duplication, but not much,
and it prevented context-type modules from exercising control over the
allocation of context headers.
In passing, I converted the test-and-elog validation of aset size
parameters into Asserts to save a few more cycles. The original thought
was that callers might compute size parameters on the fly, but in practice
nobody does that, so it's useless to expend cycles on checking those
numbers in production builds.
Also, mark the memory context method-pointer structs "const",
just for cleanliness.
Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
2017-12-13 19:55:12 +01:00
|
|
|
typedef struct AllocSetFreeList
|
|
|
|
{
|
|
|
|
int num_free; /* current list length */
|
|
|
|
AllocSetContext *first_free; /* list header */
|
|
|
|
} AllocSetFreeList;
|
|
|
|
|
|
|
|
/* context_freelists[0] is for default params, [1] for small params */
|
|
|
|
static AllocSetFreeList context_freelists[2] =
|
|
|
|
{
|
|
|
|
{
|
|
|
|
0, NULL
|
|
|
|
},
|
|
|
|
{
|
|
|
|
0, NULL
|
|
|
|
}
|
|
|
|
};
|
|
|
|
|
Here is the patch with memory leak checker. This checker allow detect
in-chunk leaks, overwrite-next-chunk leaks and overwrite block-freeptr leaks.
A in-chunk leak --- if something overwrite space after wanted (via palloc()
size, but it is still inside chunk. For example
x = palloc(12); /* create 16b chunk */
memset(x, '#', 13);
this leak is in the current source total invisible, because chunk is 16b and
leak is in the "align space".
For this feature I add data_size to StandardChunk, and all memory which go
from AllocSetAlloc() is marked as 0x7F.
The MemoryContextCheck() is compiled '#ifdef USE_ASSERT_CHECKING'.
I add this checking to 'tcop/postgres.c' and is active after each backend
query, but it is probably not sufficient, because some MemoryContext exist
only during memory processing --- will good if someone who known where
it is needful (Tom:-) add it for others contexts;
A problem in the current source is that we have still some malloc()
allocation that is not needful and this allocation is total invisible for
all context routines. For example Dllist in backend (pretty dirty it is in
catcache where values in Dllist are palloc-ed, but list is malloc-ed).
--- and BTW. this Dllist design stand in the way for query cache :-)
Tom, if you agree I start replace some mallocs.
BTW. --- Tom, have you idea for across transaction presistent allocation for
SQL functions? (like regex - now it is via malloc)
I almost forget. I add one if() to AllocSetAlloc(), for 'size' that are
greater than ALLOC_BIGCHUNK_LIMIT is not needful check AllocSetFreeIndex(),
because 'fidx' is always 'ALLOCSET_NUM_FREELISTS - 1'. It a little brisk up
allocation for very large chunks. Right?
Karel
2000-07-11 16:30:37 +02:00
|
|
|
|
1999-02-06 17:50:34 +01:00
|
|
|
/* ----------
|
|
|
|
* AllocSetFreeIndex -
|
|
|
|
*
|
|
|
|
* Depending on the size of an allocation compute which freechunk
|
2000-12-01 19:14:29 +01:00
|
|
|
* list of the alloc set it belongs to. Caller must have verified
|
|
|
|
* that size <= ALLOC_CHUNK_LIMIT.
|
1999-02-06 17:50:34 +01:00
|
|
|
* ----------
|
1996-07-09 08:22:35 +02:00
|
|
|
*/
|
1999-02-06 17:50:34 +01:00
|
|
|
static inline int
|
|
|
|
AllocSetFreeIndex(Size size)
|
|
|
|
{
|
2009-07-21 21:53:12 +02:00
|
|
|
int idx;
|
1996-07-09 08:22:35 +02:00
|
|
|
|
2009-07-21 21:53:12 +02:00
|
|
|
if (size > (1 << ALLOC_MINBITS))
|
1999-02-06 17:50:34 +01:00
|
|
|
{
|
2019-12-28 23:21:17 +01:00
|
|
|
/*----------
|
|
|
|
* At this point we must compute ceil(log2(size >> ALLOC_MINBITS)).
|
|
|
|
* This is the same as
|
|
|
|
* pg_leftmost_one_pos32((size - 1) >> ALLOC_MINBITS) + 1
|
|
|
|
* or equivalently
|
|
|
|
* pg_leftmost_one_pos32(size - 1) - ALLOC_MINBITS + 1
|
|
|
|
*
|
2023-02-08 06:05:58 +01:00
|
|
|
* However, for platforms without intrinsic support, we duplicate the
|
2019-12-28 23:21:17 +01:00
|
|
|
* logic here, allowing an additional optimization. It's reasonable
|
|
|
|
* to assume that ALLOC_CHUNK_LIMIT fits in 16 bits, so we can unroll
|
|
|
|
* the byte-at-a-time loop in pg_leftmost_one_pos32 and just handle
|
|
|
|
* the last two bytes.
|
|
|
|
*
|
|
|
|
* Yes, this function is enough of a hot-spot to make it worth this
|
|
|
|
* much trouble.
|
|
|
|
*----------
|
2009-07-21 21:53:12 +02:00
|
|
|
*/
|
2023-02-08 06:05:58 +01:00
|
|
|
#ifdef HAVE_BITSCAN_REVERSE
|
|
|
|
idx = pg_leftmost_one_pos32((uint32) size - 1) - ALLOC_MINBITS + 1;
|
2019-12-28 23:21:17 +01:00
|
|
|
#else
|
|
|
|
uint32 t,
|
|
|
|
tsize;
|
|
|
|
|
|
|
|
/* Statically assert that we only have a 16-bit input value. */
|
2022-12-08 14:30:01 +01:00
|
|
|
StaticAssertDecl(ALLOC_CHUNK_LIMIT < (1 << 16),
|
2019-12-28 23:21:17 +01:00
|
|
|
"ALLOC_CHUNK_LIMIT must be less than 64kB");
|
|
|
|
|
|
|
|
tsize = size - 1;
|
2009-07-21 21:53:12 +02:00
|
|
|
t = tsize >> 8;
|
2019-12-28 23:21:17 +01:00
|
|
|
idx = t ? pg_leftmost_one_pos[t] + 8 : pg_leftmost_one_pos[tsize];
|
|
|
|
idx -= ALLOC_MINBITS - 1;
|
|
|
|
#endif
|
2009-07-21 21:53:12 +02:00
|
|
|
|
2000-12-01 19:14:29 +01:00
|
|
|
Assert(idx < ALLOCSET_NUM_FREELISTS);
|
1999-02-06 17:50:34 +01:00
|
|
|
}
|
2009-07-21 21:53:12 +02:00
|
|
|
else
|
|
|
|
idx = 0;
|
1996-07-09 08:22:35 +02:00
|
|
|
|
1999-02-06 17:50:34 +01:00
|
|
|
return idx;
|
|
|
|
}
|
1999-05-25 18:15:34 +02:00
|
|
|
|
1996-07-09 08:22:35 +02:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Public routines
|
|
|
|
*/
|
|
|
|
|
2000-06-28 05:33:33 +02:00
|
|
|
|
1996-07-09 08:22:35 +02:00
|
|
|
/*
|
2018-10-12 20:26:56 +02:00
|
|
|
* AllocSetContextCreateInternal
|
2000-06-28 05:33:33 +02:00
|
|
|
* Create a new AllocSet context.
|
1996-07-09 08:22:35 +02:00
|
|
|
*
|
2000-06-28 05:33:33 +02:00
|
|
|
* parent: parent context, or NULL if top-level context
|
Allow memory contexts to have both fixed and variable ident strings.
Originally, we treated memory context names as potentially variable in
all cases, and therefore always copied them into the context header.
Commit 9fa6f00b1 rethought this a little bit and invented a distinction
between fixed and variable names, skipping the copy step for the former.
But we can make things both simpler and more useful by instead allowing
there to be two parts to a context's identification, a fixed "name" and
an optional, variable "ident". The name supplied in the context create
call is now required to be a compile-time-constant string in all cases,
as it is never copied but just pointed to. The "ident" string, if
wanted, is supplied later. This is needed because typically we want
the ident to be stored inside the context so that it's cleaned up
automatically on context deletion; that means it has to be copied into
the context before we can set the pointer.
The cost of this approach is basically just an additional pointer field
in struct MemoryContextData, which isn't much overhead, and is bought
back entirely in the AllocSet case by not needing a headerSize field
anymore, since we no longer have to cope with variable header length.
In addition, we can simplify the internal interfaces for memory context
creation still further, saving a few cycles there. And it's no longer
true that a custom identifier disqualifies a context from participating
in aset.c's freelist scheme, so possibly there's some win on that end.
All the places that were using non-compile-time-constant context names
are adjusted to put the variable info into the "ident" instead. This
allows more effective identification of those contexts in many cases;
for example, subsidary contexts of relcache entries are now identified
by both type (e.g. "index info") and relname, where before you got only
one or the other. Contexts associated with PL function cache entries
are now identified more fully and uniformly, too.
I also arranged for plancache contexts to use the query source string
as their identifier. This is basically free for CachedPlanSources, as
they contained a copy of that string already. We pay an extra pstrdup
to do it for CachedPlans. That could perhaps be avoided, but it would
make things more fragile (since the CachedPlanSource is sometimes
destroyed first). I suspect future improvements in error reporting will
require CachedPlans to have a copy of that string anyway, so it's not
clear that it's worth moving mountains to avoid it now.
This also changes the APIs for context statistics routines so that the
context-specific routines no longer assume that output goes straight
to stderr, nor do they know all details of the output format. This
is useful immediately to reduce code duplication, and it also allows
for external code to do something with stats output that's different
from printing to stderr.
The reason for pushing this now rather than waiting for v12 is that
it rethinks some of the API changes made by commit 9fa6f00b1. Seems
better for extension authors to endure just one round of API changes
not two.
Discussion: https://postgr.es/m/CAB=Je-FdtmFZ9y9REHD7VsSrnCkiBhsA4mdsLKSPauwXtQBeNA@mail.gmail.com
2018-03-27 22:46:47 +02:00
|
|
|
* name: name of context (must be statically allocated)
|
2000-06-28 05:33:33 +02:00
|
|
|
* minContextSize: minimum context size
|
|
|
|
* initBlockSize: initial allocation block size
|
|
|
|
* maxBlockSize: maximum allocation block size
|
Add macros to make AllocSetContextCreate() calls simpler and safer.
I found that half a dozen (nearly 5%) of our AllocSetContextCreate calls
had typos in the context-sizing parameters. While none of these led to
especially significant problems, they did create minor inefficiencies,
and it's now clear that expecting people to copy-and-paste those calls
accurately is not a great idea. Let's reduce the risk of future errors
by introducing single macros that encapsulate the common use-cases.
Three such macros are enough to cover all but two special-purpose contexts;
those two calls can be left as-is, I think.
While this patch doesn't in itself improve matters for third-party
extensions, it doesn't break anything for them either, and they can
gradually adopt the simplified notation over time.
In passing, change TopMemoryContext to use the default allocation
parameters. Formerly it could only be extended 8K at a time. That was
probably reasonable when this code was written; but nowadays we create
many more contexts than we did then, so that it's not unusual to have a
couple hundred K in TopMemoryContext, even without considering various
dubious code that sticks other things there. There seems no good reason
not to let it use growing blocks like most other contexts.
Back-patch to 9.6, mostly because that's still close enough to HEAD that
it's easy to do so, and keeping the branches in sync can be expected to
avoid some future back-patching pain. The bugs fixed by these changes
don't seem to be significant enough to justify fixing them further back.
Discussion: <21072.1472321324@sss.pgh.pa.us>
2016-08-27 23:50:38 +02:00
|
|
|
*
|
|
|
|
* Most callers should abstract the context size parameters using a macro
|
2018-10-12 20:26:56 +02:00
|
|
|
* such as ALLOCSET_DEFAULT_SIZES.
|
|
|
|
*
|
|
|
|
* Note: don't call this directly; go through the wrapper macro
|
|
|
|
* AllocSetContextCreate.
|
1996-07-09 08:22:35 +02:00
|
|
|
*/
|
2000-06-28 05:33:33 +02:00
|
|
|
MemoryContext
|
2018-10-12 20:26:56 +02:00
|
|
|
AllocSetContextCreateInternal(MemoryContext parent,
|
Rethink MemoryContext creation to improve performance.
This patch makes a number of interrelated changes to reduce the overhead
involved in creating/deleting memory contexts. The key ideas are:
* Include the AllocSetContext header of an aset.c context in its first
malloc request, rather than allocating it separately in TopMemoryContext.
This means that we now always create an initial or "keeper" block in an
aset, even if it never receives any allocation requests.
* Create freelists in which we can save and recycle recently-destroyed
asets (this idea is due to Robert Haas).
* In the common case where the name of a context is a constant string,
just store a pointer to it in the context header, rather than copying
the string.
The first change eliminates a palloc/pfree cycle per context, and
also avoids bloat in TopMemoryContext, at the price that creating
a context now involves a malloc/free cycle even if the context never
receives any allocations. That would be a loser for some common
usage patterns, but recycling short-lived contexts via the freelist
eliminates that pain.
Avoiding copying constant strings not only saves strlen() and strcpy()
overhead, but is an essential part of the freelist optimization because
it makes the context header size constant. Currently we make no
attempt to use the freelist for contexts with non-constant names.
(Perhaps someday we'll need to think harder about that, but in current
usage, most contexts with custom names are long-lived anyway.)
The freelist management in this initial commit is pretty simplistic,
and we might want to refine it later --- but in common workloads that
will never matter because the freelists will never get full anyway.
To create a context with a non-constant name, one is now required to
call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME
option. AllocSetContextCreate becomes a wrapper macro, and it includes
a test that will complain about non-string-literal context name
parameters on gcc and similar compilers.
An unfortunate side effect of making AllocSetContextCreate a macro is
that one is now *required* to use the size parameter abstraction macros
(ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of
writing out individual size parameters no longer works unless you
switch to AllocSetContextCreateExtended.
Internally to the memory-context-related modules, the context creation
APIs are simplified, removing the rather baroque original design whereby
a context-type module called mcxt.c which then called back into the
context-type module. That saved a bit of code duplication, but not much,
and it prevented context-type modules from exercising control over the
allocation of context headers.
In passing, I converted the test-and-elog validation of aset size
parameters into Asserts to save a few more cycles. The original thought
was that callers might compute size parameters on the fly, but in practice
nobody does that, so it's useless to expend cycles on checking those
numbers in production builds.
Also, mark the memory context method-pointer structs "const",
just for cleanliness.
Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
2017-12-13 19:55:12 +01:00
|
|
|
const char *name,
|
|
|
|
Size minContextSize,
|
|
|
|
Size initBlockSize,
|
|
|
|
Size maxBlockSize)
|
1996-07-09 08:22:35 +02:00
|
|
|
{
|
Rethink MemoryContext creation to improve performance.
This patch makes a number of interrelated changes to reduce the overhead
involved in creating/deleting memory contexts. The key ideas are:
* Include the AllocSetContext header of an aset.c context in its first
malloc request, rather than allocating it separately in TopMemoryContext.
This means that we now always create an initial or "keeper" block in an
aset, even if it never receives any allocation requests.
* Create freelists in which we can save and recycle recently-destroyed
asets (this idea is due to Robert Haas).
* In the common case where the name of a context is a constant string,
just store a pointer to it in the context header, rather than copying
the string.
The first change eliminates a palloc/pfree cycle per context, and
also avoids bloat in TopMemoryContext, at the price that creating
a context now involves a malloc/free cycle even if the context never
receives any allocations. That would be a loser for some common
usage patterns, but recycling short-lived contexts via the freelist
eliminates that pain.
Avoiding copying constant strings not only saves strlen() and strcpy()
overhead, but is an essential part of the freelist optimization because
it makes the context header size constant. Currently we make no
attempt to use the freelist for contexts with non-constant names.
(Perhaps someday we'll need to think harder about that, but in current
usage, most contexts with custom names are long-lived anyway.)
The freelist management in this initial commit is pretty simplistic,
and we might want to refine it later --- but in common workloads that
will never matter because the freelists will never get full anyway.
To create a context with a non-constant name, one is now required to
call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME
option. AllocSetContextCreate becomes a wrapper macro, and it includes
a test that will complain about non-string-literal context name
parameters on gcc and similar compilers.
An unfortunate side effect of making AllocSetContextCreate a macro is
that one is now *required* to use the size parameter abstraction macros
(ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of
writing out individual size parameters no longer works unless you
switch to AllocSetContextCreateExtended.
Internally to the memory-context-related modules, the context creation
APIs are simplified, removing the rather baroque original design whereby
a context-type module called mcxt.c which then called back into the
context-type module. That saved a bit of code duplication, but not much,
and it prevented context-type modules from exercising control over the
allocation of context headers.
In passing, I converted the test-and-elog validation of aset size
parameters into Asserts to save a few more cycles. The original thought
was that callers might compute size parameters on the fly, but in practice
nobody does that, so it's useless to expend cycles on checking those
numbers in production builds.
Also, mark the memory context method-pointer structs "const",
just for cleanliness.
Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
2017-12-13 19:55:12 +01:00
|
|
|
int freeListIndex;
|
|
|
|
Size firstBlockSize;
|
2015-02-22 08:17:52 +01:00
|
|
|
AllocSet set;
|
Rethink MemoryContext creation to improve performance.
This patch makes a number of interrelated changes to reduce the overhead
involved in creating/deleting memory contexts. The key ideas are:
* Include the AllocSetContext header of an aset.c context in its first
malloc request, rather than allocating it separately in TopMemoryContext.
This means that we now always create an initial or "keeper" block in an
aset, even if it never receives any allocation requests.
* Create freelists in which we can save and recycle recently-destroyed
asets (this idea is due to Robert Haas).
* In the common case where the name of a context is a constant string,
just store a pointer to it in the context header, rather than copying
the string.
The first change eliminates a palloc/pfree cycle per context, and
also avoids bloat in TopMemoryContext, at the price that creating
a context now involves a malloc/free cycle even if the context never
receives any allocations. That would be a loser for some common
usage patterns, but recycling short-lived contexts via the freelist
eliminates that pain.
Avoiding copying constant strings not only saves strlen() and strcpy()
overhead, but is an essential part of the freelist optimization because
it makes the context header size constant. Currently we make no
attempt to use the freelist for contexts with non-constant names.
(Perhaps someday we'll need to think harder about that, but in current
usage, most contexts with custom names are long-lived anyway.)
The freelist management in this initial commit is pretty simplistic,
and we might want to refine it later --- but in common workloads that
will never matter because the freelists will never get full anyway.
To create a context with a non-constant name, one is now required to
call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME
option. AllocSetContextCreate becomes a wrapper macro, and it includes
a test that will complain about non-string-literal context name
parameters on gcc and similar compilers.
An unfortunate side effect of making AllocSetContextCreate a macro is
that one is now *required* to use the size parameter abstraction macros
(ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of
writing out individual size parameters no longer works unless you
switch to AllocSetContextCreateExtended.
Internally to the memory-context-related modules, the context creation
APIs are simplified, removing the rather baroque original design whereby
a context-type module called mcxt.c which then called back into the
context-type module. That saved a bit of code duplication, but not much,
and it prevented context-type modules from exercising control over the
allocation of context headers.
In passing, I converted the test-and-elog validation of aset size
parameters into Asserts to save a few more cycles. The original thought
was that callers might compute size parameters on the fly, but in practice
nobody does that, so it's useless to expend cycles on checking those
numbers in production builds.
Also, mark the memory context method-pointer structs "const",
just for cleanliness.
Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
2017-12-13 19:55:12 +01:00
|
|
|
AllocBlock block;
|
2000-06-28 05:33:33 +02:00
|
|
|
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
/* ensure MemoryChunk's size is properly maxaligned */
|
2022-12-08 14:30:01 +01:00
|
|
|
StaticAssertDecl(ALLOC_CHUNKHDRSZ == MAXALIGN(ALLOC_CHUNKHDRSZ),
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
"sizeof(MemoryChunk) is not maxaligned");
|
|
|
|
/* check we have enough space to store the freelist link */
|
2022-12-08 14:30:01 +01:00
|
|
|
StaticAssertDecl(sizeof(AllocFreeListLink) <= (1 << ALLOC_MINBITS),
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
"sizeof(AllocFreeListLink) larger than minimum allocation size");
|
2017-02-28 08:32:22 +01:00
|
|
|
|
2016-08-29 15:29:26 +02:00
|
|
|
/*
|
Rethink MemoryContext creation to improve performance.
This patch makes a number of interrelated changes to reduce the overhead
involved in creating/deleting memory contexts. The key ideas are:
* Include the AllocSetContext header of an aset.c context in its first
malloc request, rather than allocating it separately in TopMemoryContext.
This means that we now always create an initial or "keeper" block in an
aset, even if it never receives any allocation requests.
* Create freelists in which we can save and recycle recently-destroyed
asets (this idea is due to Robert Haas).
* In the common case where the name of a context is a constant string,
just store a pointer to it in the context header, rather than copying
the string.
The first change eliminates a palloc/pfree cycle per context, and
also avoids bloat in TopMemoryContext, at the price that creating
a context now involves a malloc/free cycle even if the context never
receives any allocations. That would be a loser for some common
usage patterns, but recycling short-lived contexts via the freelist
eliminates that pain.
Avoiding copying constant strings not only saves strlen() and strcpy()
overhead, but is an essential part of the freelist optimization because
it makes the context header size constant. Currently we make no
attempt to use the freelist for contexts with non-constant names.
(Perhaps someday we'll need to think harder about that, but in current
usage, most contexts with custom names are long-lived anyway.)
The freelist management in this initial commit is pretty simplistic,
and we might want to refine it later --- but in common workloads that
will never matter because the freelists will never get full anyway.
To create a context with a non-constant name, one is now required to
call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME
option. AllocSetContextCreate becomes a wrapper macro, and it includes
a test that will complain about non-string-literal context name
parameters on gcc and similar compilers.
An unfortunate side effect of making AllocSetContextCreate a macro is
that one is now *required* to use the size parameter abstraction macros
(ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of
writing out individual size parameters no longer works unless you
switch to AllocSetContextCreateExtended.
Internally to the memory-context-related modules, the context creation
APIs are simplified, removing the rather baroque original design whereby
a context-type module called mcxt.c which then called back into the
context-type module. That saved a bit of code duplication, but not much,
and it prevented context-type modules from exercising control over the
allocation of context headers.
In passing, I converted the test-and-elog validation of aset size
parameters into Asserts to save a few more cycles. The original thought
was that callers might compute size parameters on the fly, but in practice
nobody does that, so it's useless to expend cycles on checking those
numbers in production builds.
Also, mark the memory context method-pointer structs "const",
just for cleanliness.
Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
2017-12-13 19:55:12 +01:00
|
|
|
* First, validate allocation parameters. Once these were regular runtime
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
* tests and elog's, but in practice Asserts seem sufficient because
|
|
|
|
* nobody varies their parameters at runtime. We somewhat arbitrarily
|
|
|
|
* enforce a minimum 1K block size. We restrict the maximum block size to
|
|
|
|
* MEMORYCHUNK_MAX_BLOCKOFFSET as MemoryChunks are limited to this in
|
|
|
|
* regards to addressing the offset between the chunk and the block that
|
|
|
|
* the chunk is stored on. We would be unable to store the offset between
|
|
|
|
* the chunk and block for any chunks that were beyond
|
|
|
|
* MEMORYCHUNK_MAX_BLOCKOFFSET bytes into the block if the block was to be
|
|
|
|
* larger than this.
|
2016-08-29 15:29:26 +02:00
|
|
|
*/
|
Rethink MemoryContext creation to improve performance.
This patch makes a number of interrelated changes to reduce the overhead
involved in creating/deleting memory contexts. The key ideas are:
* Include the AllocSetContext header of an aset.c context in its first
malloc request, rather than allocating it separately in TopMemoryContext.
This means that we now always create an initial or "keeper" block in an
aset, even if it never receives any allocation requests.
* Create freelists in which we can save and recycle recently-destroyed
asets (this idea is due to Robert Haas).
* In the common case where the name of a context is a constant string,
just store a pointer to it in the context header, rather than copying
the string.
The first change eliminates a palloc/pfree cycle per context, and
also avoids bloat in TopMemoryContext, at the price that creating
a context now involves a malloc/free cycle even if the context never
receives any allocations. That would be a loser for some common
usage patterns, but recycling short-lived contexts via the freelist
eliminates that pain.
Avoiding copying constant strings not only saves strlen() and strcpy()
overhead, but is an essential part of the freelist optimization because
it makes the context header size constant. Currently we make no
attempt to use the freelist for contexts with non-constant names.
(Perhaps someday we'll need to think harder about that, but in current
usage, most contexts with custom names are long-lived anyway.)
The freelist management in this initial commit is pretty simplistic,
and we might want to refine it later --- but in common workloads that
will never matter because the freelists will never get full anyway.
To create a context with a non-constant name, one is now required to
call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME
option. AllocSetContextCreate becomes a wrapper macro, and it includes
a test that will complain about non-string-literal context name
parameters on gcc and similar compilers.
An unfortunate side effect of making AllocSetContextCreate a macro is
that one is now *required* to use the size parameter abstraction macros
(ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of
writing out individual size parameters no longer works unless you
switch to AllocSetContextCreateExtended.
Internally to the memory-context-related modules, the context creation
APIs are simplified, removing the rather baroque original design whereby
a context-type module called mcxt.c which then called back into the
context-type module. That saved a bit of code duplication, but not much,
and it prevented context-type modules from exercising control over the
allocation of context headers.
In passing, I converted the test-and-elog validation of aset size
parameters into Asserts to save a few more cycles. The original thought
was that callers might compute size parameters on the fly, but in practice
nobody does that, so it's useless to expend cycles on checking those
numbers in production builds.
Also, mark the memory context method-pointer structs "const",
just for cleanliness.
Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
2017-12-13 19:55:12 +01:00
|
|
|
Assert(initBlockSize == MAXALIGN(initBlockSize) &&
|
|
|
|
initBlockSize >= 1024);
|
|
|
|
Assert(maxBlockSize == MAXALIGN(maxBlockSize) &&
|
|
|
|
maxBlockSize >= initBlockSize &&
|
|
|
|
AllocHugeSizeIsValid(maxBlockSize)); /* must be safe to double */
|
|
|
|
Assert(minContextSize == 0 ||
|
|
|
|
(minContextSize == MAXALIGN(minContextSize) &&
|
|
|
|
minContextSize >= 1024 &&
|
|
|
|
minContextSize <= maxBlockSize));
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
Assert(maxBlockSize <= MEMORYCHUNK_MAX_BLOCKOFFSET);
|
Rethink MemoryContext creation to improve performance.
This patch makes a number of interrelated changes to reduce the overhead
involved in creating/deleting memory contexts. The key ideas are:
* Include the AllocSetContext header of an aset.c context in its first
malloc request, rather than allocating it separately in TopMemoryContext.
This means that we now always create an initial or "keeper" block in an
aset, even if it never receives any allocation requests.
* Create freelists in which we can save and recycle recently-destroyed
asets (this idea is due to Robert Haas).
* In the common case where the name of a context is a constant string,
just store a pointer to it in the context header, rather than copying
the string.
The first change eliminates a palloc/pfree cycle per context, and
also avoids bloat in TopMemoryContext, at the price that creating
a context now involves a malloc/free cycle even if the context never
receives any allocations. That would be a loser for some common
usage patterns, but recycling short-lived contexts via the freelist
eliminates that pain.
Avoiding copying constant strings not only saves strlen() and strcpy()
overhead, but is an essential part of the freelist optimization because
it makes the context header size constant. Currently we make no
attempt to use the freelist for contexts with non-constant names.
(Perhaps someday we'll need to think harder about that, but in current
usage, most contexts with custom names are long-lived anyway.)
The freelist management in this initial commit is pretty simplistic,
and we might want to refine it later --- but in common workloads that
will never matter because the freelists will never get full anyway.
To create a context with a non-constant name, one is now required to
call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME
option. AllocSetContextCreate becomes a wrapper macro, and it includes
a test that will complain about non-string-literal context name
parameters on gcc and similar compilers.
An unfortunate side effect of making AllocSetContextCreate a macro is
that one is now *required* to use the size parameter abstraction macros
(ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of
writing out individual size parameters no longer works unless you
switch to AllocSetContextCreateExtended.
Internally to the memory-context-related modules, the context creation
APIs are simplified, removing the rather baroque original design whereby
a context-type module called mcxt.c which then called back into the
context-type module. That saved a bit of code duplication, but not much,
and it prevented context-type modules from exercising control over the
allocation of context headers.
In passing, I converted the test-and-elog validation of aset size
parameters into Asserts to save a few more cycles. The original thought
was that callers might compute size parameters on the fly, but in practice
nobody does that, so it's useless to expend cycles on checking those
numbers in production builds.
Also, mark the memory context method-pointer structs "const",
just for cleanliness.
Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
2017-12-13 19:55:12 +01:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Check whether the parameters match either available freelist. We do
|
|
|
|
* not need to demand a match of maxBlockSize.
|
|
|
|
*/
|
Allow memory contexts to have both fixed and variable ident strings.
Originally, we treated memory context names as potentially variable in
all cases, and therefore always copied them into the context header.
Commit 9fa6f00b1 rethought this a little bit and invented a distinction
between fixed and variable names, skipping the copy step for the former.
But we can make things both simpler and more useful by instead allowing
there to be two parts to a context's identification, a fixed "name" and
an optional, variable "ident". The name supplied in the context create
call is now required to be a compile-time-constant string in all cases,
as it is never copied but just pointed to. The "ident" string, if
wanted, is supplied later. This is needed because typically we want
the ident to be stored inside the context so that it's cleaned up
automatically on context deletion; that means it has to be copied into
the context before we can set the pointer.
The cost of this approach is basically just an additional pointer field
in struct MemoryContextData, which isn't much overhead, and is bought
back entirely in the AllocSet case by not needing a headerSize field
anymore, since we no longer have to cope with variable header length.
In addition, we can simplify the internal interfaces for memory context
creation still further, saving a few cycles there. And it's no longer
true that a custom identifier disqualifies a context from participating
in aset.c's freelist scheme, so possibly there's some win on that end.
All the places that were using non-compile-time-constant context names
are adjusted to put the variable info into the "ident" instead. This
allows more effective identification of those contexts in many cases;
for example, subsidary contexts of relcache entries are now identified
by both type (e.g. "index info") and relname, where before you got only
one or the other. Contexts associated with PL function cache entries
are now identified more fully and uniformly, too.
I also arranged for plancache contexts to use the query source string
as their identifier. This is basically free for CachedPlanSources, as
they contained a copy of that string already. We pay an extra pstrdup
to do it for CachedPlans. That could perhaps be avoided, but it would
make things more fragile (since the CachedPlanSource is sometimes
destroyed first). I suspect future improvements in error reporting will
require CachedPlans to have a copy of that string anyway, so it's not
clear that it's worth moving mountains to avoid it now.
This also changes the APIs for context statistics routines so that the
context-specific routines no longer assume that output goes straight
to stderr, nor do they know all details of the output format. This
is useful immediately to reduce code duplication, and it also allows
for external code to do something with stats output that's different
from printing to stderr.
The reason for pushing this now rather than waiting for v12 is that
it rethinks some of the API changes made by commit 9fa6f00b1. Seems
better for extension authors to endure just one round of API changes
not two.
Discussion: https://postgr.es/m/CAB=Je-FdtmFZ9y9REHD7VsSrnCkiBhsA4mdsLKSPauwXtQBeNA@mail.gmail.com
2018-03-27 22:46:47 +02:00
|
|
|
if (minContextSize == ALLOCSET_DEFAULT_MINSIZE &&
|
Rethink MemoryContext creation to improve performance.
This patch makes a number of interrelated changes to reduce the overhead
involved in creating/deleting memory contexts. The key ideas are:
* Include the AllocSetContext header of an aset.c context in its first
malloc request, rather than allocating it separately in TopMemoryContext.
This means that we now always create an initial or "keeper" block in an
aset, even if it never receives any allocation requests.
* Create freelists in which we can save and recycle recently-destroyed
asets (this idea is due to Robert Haas).
* In the common case where the name of a context is a constant string,
just store a pointer to it in the context header, rather than copying
the string.
The first change eliminates a palloc/pfree cycle per context, and
also avoids bloat in TopMemoryContext, at the price that creating
a context now involves a malloc/free cycle even if the context never
receives any allocations. That would be a loser for some common
usage patterns, but recycling short-lived contexts via the freelist
eliminates that pain.
Avoiding copying constant strings not only saves strlen() and strcpy()
overhead, but is an essential part of the freelist optimization because
it makes the context header size constant. Currently we make no
attempt to use the freelist for contexts with non-constant names.
(Perhaps someday we'll need to think harder about that, but in current
usage, most contexts with custom names are long-lived anyway.)
The freelist management in this initial commit is pretty simplistic,
and we might want to refine it later --- but in common workloads that
will never matter because the freelists will never get full anyway.
To create a context with a non-constant name, one is now required to
call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME
option. AllocSetContextCreate becomes a wrapper macro, and it includes
a test that will complain about non-string-literal context name
parameters on gcc and similar compilers.
An unfortunate side effect of making AllocSetContextCreate a macro is
that one is now *required* to use the size parameter abstraction macros
(ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of
writing out individual size parameters no longer works unless you
switch to AllocSetContextCreateExtended.
Internally to the memory-context-related modules, the context creation
APIs are simplified, removing the rather baroque original design whereby
a context-type module called mcxt.c which then called back into the
context-type module. That saved a bit of code duplication, but not much,
and it prevented context-type modules from exercising control over the
allocation of context headers.
In passing, I converted the test-and-elog validation of aset size
parameters into Asserts to save a few more cycles. The original thought
was that callers might compute size parameters on the fly, but in practice
nobody does that, so it's useless to expend cycles on checking those
numbers in production builds.
Also, mark the memory context method-pointer structs "const",
just for cleanliness.
Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
2017-12-13 19:55:12 +01:00
|
|
|
initBlockSize == ALLOCSET_DEFAULT_INITSIZE)
|
|
|
|
freeListIndex = 0;
|
Allow memory contexts to have both fixed and variable ident strings.
Originally, we treated memory context names as potentially variable in
all cases, and therefore always copied them into the context header.
Commit 9fa6f00b1 rethought this a little bit and invented a distinction
between fixed and variable names, skipping the copy step for the former.
But we can make things both simpler and more useful by instead allowing
there to be two parts to a context's identification, a fixed "name" and
an optional, variable "ident". The name supplied in the context create
call is now required to be a compile-time-constant string in all cases,
as it is never copied but just pointed to. The "ident" string, if
wanted, is supplied later. This is needed because typically we want
the ident to be stored inside the context so that it's cleaned up
automatically on context deletion; that means it has to be copied into
the context before we can set the pointer.
The cost of this approach is basically just an additional pointer field
in struct MemoryContextData, which isn't much overhead, and is bought
back entirely in the AllocSet case by not needing a headerSize field
anymore, since we no longer have to cope with variable header length.
In addition, we can simplify the internal interfaces for memory context
creation still further, saving a few cycles there. And it's no longer
true that a custom identifier disqualifies a context from participating
in aset.c's freelist scheme, so possibly there's some win on that end.
All the places that were using non-compile-time-constant context names
are adjusted to put the variable info into the "ident" instead. This
allows more effective identification of those contexts in many cases;
for example, subsidary contexts of relcache entries are now identified
by both type (e.g. "index info") and relname, where before you got only
one or the other. Contexts associated with PL function cache entries
are now identified more fully and uniformly, too.
I also arranged for plancache contexts to use the query source string
as their identifier. This is basically free for CachedPlanSources, as
they contained a copy of that string already. We pay an extra pstrdup
to do it for CachedPlans. That could perhaps be avoided, but it would
make things more fragile (since the CachedPlanSource is sometimes
destroyed first). I suspect future improvements in error reporting will
require CachedPlans to have a copy of that string anyway, so it's not
clear that it's worth moving mountains to avoid it now.
This also changes the APIs for context statistics routines so that the
context-specific routines no longer assume that output goes straight
to stderr, nor do they know all details of the output format. This
is useful immediately to reduce code duplication, and it also allows
for external code to do something with stats output that's different
from printing to stderr.
The reason for pushing this now rather than waiting for v12 is that
it rethinks some of the API changes made by commit 9fa6f00b1. Seems
better for extension authors to endure just one round of API changes
not two.
Discussion: https://postgr.es/m/CAB=Je-FdtmFZ9y9REHD7VsSrnCkiBhsA4mdsLKSPauwXtQBeNA@mail.gmail.com
2018-03-27 22:46:47 +02:00
|
|
|
else if (minContextSize == ALLOCSET_SMALL_MINSIZE &&
|
Rethink MemoryContext creation to improve performance.
This patch makes a number of interrelated changes to reduce the overhead
involved in creating/deleting memory contexts. The key ideas are:
* Include the AllocSetContext header of an aset.c context in its first
malloc request, rather than allocating it separately in TopMemoryContext.
This means that we now always create an initial or "keeper" block in an
aset, even if it never receives any allocation requests.
* Create freelists in which we can save and recycle recently-destroyed
asets (this idea is due to Robert Haas).
* In the common case where the name of a context is a constant string,
just store a pointer to it in the context header, rather than copying
the string.
The first change eliminates a palloc/pfree cycle per context, and
also avoids bloat in TopMemoryContext, at the price that creating
a context now involves a malloc/free cycle even if the context never
receives any allocations. That would be a loser for some common
usage patterns, but recycling short-lived contexts via the freelist
eliminates that pain.
Avoiding copying constant strings not only saves strlen() and strcpy()
overhead, but is an essential part of the freelist optimization because
it makes the context header size constant. Currently we make no
attempt to use the freelist for contexts with non-constant names.
(Perhaps someday we'll need to think harder about that, but in current
usage, most contexts with custom names are long-lived anyway.)
The freelist management in this initial commit is pretty simplistic,
and we might want to refine it later --- but in common workloads that
will never matter because the freelists will never get full anyway.
To create a context with a non-constant name, one is now required to
call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME
option. AllocSetContextCreate becomes a wrapper macro, and it includes
a test that will complain about non-string-literal context name
parameters on gcc and similar compilers.
An unfortunate side effect of making AllocSetContextCreate a macro is
that one is now *required* to use the size parameter abstraction macros
(ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of
writing out individual size parameters no longer works unless you
switch to AllocSetContextCreateExtended.
Internally to the memory-context-related modules, the context creation
APIs are simplified, removing the rather baroque original design whereby
a context-type module called mcxt.c which then called back into the
context-type module. That saved a bit of code duplication, but not much,
and it prevented context-type modules from exercising control over the
allocation of context headers.
In passing, I converted the test-and-elog validation of aset size
parameters into Asserts to save a few more cycles. The original thought
was that callers might compute size parameters on the fly, but in practice
nobody does that, so it's useless to expend cycles on checking those
numbers in production builds.
Also, mark the memory context method-pointer structs "const",
just for cleanliness.
Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
2017-12-13 19:55:12 +01:00
|
|
|
initBlockSize == ALLOCSET_SMALL_INITSIZE)
|
|
|
|
freeListIndex = 1;
|
|
|
|
else
|
|
|
|
freeListIndex = -1;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* If a suitable freelist entry exists, just recycle that context.
|
|
|
|
*/
|
|
|
|
if (freeListIndex >= 0)
|
|
|
|
{
|
|
|
|
AllocSetFreeList *freelist = &context_freelists[freeListIndex];
|
|
|
|
|
|
|
|
if (freelist->first_free != NULL)
|
|
|
|
{
|
|
|
|
/* Remove entry from freelist */
|
|
|
|
set = freelist->first_free;
|
|
|
|
freelist->first_free = (AllocSet) set->header.nextchild;
|
|
|
|
freelist->num_free--;
|
|
|
|
|
|
|
|
/* Update its maxBlockSize; everything else should be OK */
|
|
|
|
set->maxBlockSize = maxBlockSize;
|
|
|
|
|
|
|
|
/* Reinitialize its header, installing correct name and parent */
|
|
|
|
MemoryContextCreate((MemoryContext) set,
|
|
|
|
T_AllocSetContext,
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
MCTX_ASET_ID,
|
Rethink MemoryContext creation to improve performance.
This patch makes a number of interrelated changes to reduce the overhead
involved in creating/deleting memory contexts. The key ideas are:
* Include the AllocSetContext header of an aset.c context in its first
malloc request, rather than allocating it separately in TopMemoryContext.
This means that we now always create an initial or "keeper" block in an
aset, even if it never receives any allocation requests.
* Create freelists in which we can save and recycle recently-destroyed
asets (this idea is due to Robert Haas).
* In the common case where the name of a context is a constant string,
just store a pointer to it in the context header, rather than copying
the string.
The first change eliminates a palloc/pfree cycle per context, and
also avoids bloat in TopMemoryContext, at the price that creating
a context now involves a malloc/free cycle even if the context never
receives any allocations. That would be a loser for some common
usage patterns, but recycling short-lived contexts via the freelist
eliminates that pain.
Avoiding copying constant strings not only saves strlen() and strcpy()
overhead, but is an essential part of the freelist optimization because
it makes the context header size constant. Currently we make no
attempt to use the freelist for contexts with non-constant names.
(Perhaps someday we'll need to think harder about that, but in current
usage, most contexts with custom names are long-lived anyway.)
The freelist management in this initial commit is pretty simplistic,
and we might want to refine it later --- but in common workloads that
will never matter because the freelists will never get full anyway.
To create a context with a non-constant name, one is now required to
call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME
option. AllocSetContextCreate becomes a wrapper macro, and it includes
a test that will complain about non-string-literal context name
parameters on gcc and similar compilers.
An unfortunate side effect of making AllocSetContextCreate a macro is
that one is now *required* to use the size parameter abstraction macros
(ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of
writing out individual size parameters no longer works unless you
switch to AllocSetContextCreateExtended.
Internally to the memory-context-related modules, the context creation
APIs are simplified, removing the rather baroque original design whereby
a context-type module called mcxt.c which then called back into the
context-type module. That saved a bit of code duplication, but not much,
and it prevented context-type modules from exercising control over the
allocation of context headers.
In passing, I converted the test-and-elog validation of aset size
parameters into Asserts to save a few more cycles. The original thought
was that callers might compute size parameters on the fly, but in practice
nobody does that, so it's useless to expend cycles on checking those
numbers in production builds.
Also, mark the memory context method-pointer structs "const",
just for cleanliness.
Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
2017-12-13 19:55:12 +01:00
|
|
|
parent,
|
Allow memory contexts to have both fixed and variable ident strings.
Originally, we treated memory context names as potentially variable in
all cases, and therefore always copied them into the context header.
Commit 9fa6f00b1 rethought this a little bit and invented a distinction
between fixed and variable names, skipping the copy step for the former.
But we can make things both simpler and more useful by instead allowing
there to be two parts to a context's identification, a fixed "name" and
an optional, variable "ident". The name supplied in the context create
call is now required to be a compile-time-constant string in all cases,
as it is never copied but just pointed to. The "ident" string, if
wanted, is supplied later. This is needed because typically we want
the ident to be stored inside the context so that it's cleaned up
automatically on context deletion; that means it has to be copied into
the context before we can set the pointer.
The cost of this approach is basically just an additional pointer field
in struct MemoryContextData, which isn't much overhead, and is bought
back entirely in the AllocSet case by not needing a headerSize field
anymore, since we no longer have to cope with variable header length.
In addition, we can simplify the internal interfaces for memory context
creation still further, saving a few cycles there. And it's no longer
true that a custom identifier disqualifies a context from participating
in aset.c's freelist scheme, so possibly there's some win on that end.
All the places that were using non-compile-time-constant context names
are adjusted to put the variable info into the "ident" instead. This
allows more effective identification of those contexts in many cases;
for example, subsidary contexts of relcache entries are now identified
by both type (e.g. "index info") and relname, where before you got only
one or the other. Contexts associated with PL function cache entries
are now identified more fully and uniformly, too.
I also arranged for plancache contexts to use the query source string
as their identifier. This is basically free for CachedPlanSources, as
they contained a copy of that string already. We pay an extra pstrdup
to do it for CachedPlans. That could perhaps be avoided, but it would
make things more fragile (since the CachedPlanSource is sometimes
destroyed first). I suspect future improvements in error reporting will
require CachedPlans to have a copy of that string anyway, so it's not
clear that it's worth moving mountains to avoid it now.
This also changes the APIs for context statistics routines so that the
context-specific routines no longer assume that output goes straight
to stderr, nor do they know all details of the output format. This
is useful immediately to reduce code duplication, and it also allows
for external code to do something with stats output that's different
from printing to stderr.
The reason for pushing this now rather than waiting for v12 is that
it rethinks some of the API changes made by commit 9fa6f00b1. Seems
better for extension authors to endure just one round of API changes
not two.
Discussion: https://postgr.es/m/CAB=Je-FdtmFZ9y9REHD7VsSrnCkiBhsA4mdsLKSPauwXtQBeNA@mail.gmail.com
2018-03-27 22:46:47 +02:00
|
|
|
name);
|
Rethink MemoryContext creation to improve performance.
This patch makes a number of interrelated changes to reduce the overhead
involved in creating/deleting memory contexts. The key ideas are:
* Include the AllocSetContext header of an aset.c context in its first
malloc request, rather than allocating it separately in TopMemoryContext.
This means that we now always create an initial or "keeper" block in an
aset, even if it never receives any allocation requests.
* Create freelists in which we can save and recycle recently-destroyed
asets (this idea is due to Robert Haas).
* In the common case where the name of a context is a constant string,
just store a pointer to it in the context header, rather than copying
the string.
The first change eliminates a palloc/pfree cycle per context, and
also avoids bloat in TopMemoryContext, at the price that creating
a context now involves a malloc/free cycle even if the context never
receives any allocations. That would be a loser for some common
usage patterns, but recycling short-lived contexts via the freelist
eliminates that pain.
Avoiding copying constant strings not only saves strlen() and strcpy()
overhead, but is an essential part of the freelist optimization because
it makes the context header size constant. Currently we make no
attempt to use the freelist for contexts with non-constant names.
(Perhaps someday we'll need to think harder about that, but in current
usage, most contexts with custom names are long-lived anyway.)
The freelist management in this initial commit is pretty simplistic,
and we might want to refine it later --- but in common workloads that
will never matter because the freelists will never get full anyway.
To create a context with a non-constant name, one is now required to
call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME
option. AllocSetContextCreate becomes a wrapper macro, and it includes
a test that will complain about non-string-literal context name
parameters on gcc and similar compilers.
An unfortunate side effect of making AllocSetContextCreate a macro is
that one is now *required* to use the size parameter abstraction macros
(ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of
writing out individual size parameters no longer works unless you
switch to AllocSetContextCreateExtended.
Internally to the memory-context-related modules, the context creation
APIs are simplified, removing the rather baroque original design whereby
a context-type module called mcxt.c which then called back into the
context-type module. That saved a bit of code duplication, but not much,
and it prevented context-type modules from exercising control over the
allocation of context headers.
In passing, I converted the test-and-elog validation of aset size
parameters into Asserts to save a few more cycles. The original thought
was that callers might compute size parameters on the fly, but in practice
nobody does that, so it's useless to expend cycles on checking those
numbers in production builds.
Also, mark the memory context method-pointer structs "const",
just for cleanliness.
Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
2017-12-13 19:55:12 +01:00
|
|
|
|
2020-03-19 19:20:39 +01:00
|
|
|
((MemoryContext) set)->mem_allocated =
|
2023-07-17 01:16:56 +02:00
|
|
|
KeeperBlock(set)->endptr - ((char *) set);
|
2019-10-01 03:13:39 +02:00
|
|
|
|
Rethink MemoryContext creation to improve performance.
This patch makes a number of interrelated changes to reduce the overhead
involved in creating/deleting memory contexts. The key ideas are:
* Include the AllocSetContext header of an aset.c context in its first
malloc request, rather than allocating it separately in TopMemoryContext.
This means that we now always create an initial or "keeper" block in an
aset, even if it never receives any allocation requests.
* Create freelists in which we can save and recycle recently-destroyed
asets (this idea is due to Robert Haas).
* In the common case where the name of a context is a constant string,
just store a pointer to it in the context header, rather than copying
the string.
The first change eliminates a palloc/pfree cycle per context, and
also avoids bloat in TopMemoryContext, at the price that creating
a context now involves a malloc/free cycle even if the context never
receives any allocations. That would be a loser for some common
usage patterns, but recycling short-lived contexts via the freelist
eliminates that pain.
Avoiding copying constant strings not only saves strlen() and strcpy()
overhead, but is an essential part of the freelist optimization because
it makes the context header size constant. Currently we make no
attempt to use the freelist for contexts with non-constant names.
(Perhaps someday we'll need to think harder about that, but in current
usage, most contexts with custom names are long-lived anyway.)
The freelist management in this initial commit is pretty simplistic,
and we might want to refine it later --- but in common workloads that
will never matter because the freelists will never get full anyway.
To create a context with a non-constant name, one is now required to
call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME
option. AllocSetContextCreate becomes a wrapper macro, and it includes
a test that will complain about non-string-literal context name
parameters on gcc and similar compilers.
An unfortunate side effect of making AllocSetContextCreate a macro is
that one is now *required* to use the size parameter abstraction macros
(ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of
writing out individual size parameters no longer works unless you
switch to AllocSetContextCreateExtended.
Internally to the memory-context-related modules, the context creation
APIs are simplified, removing the rather baroque original design whereby
a context-type module called mcxt.c which then called back into the
context-type module. That saved a bit of code duplication, but not much,
and it prevented context-type modules from exercising control over the
allocation of context headers.
In passing, I converted the test-and-elog validation of aset size
parameters into Asserts to save a few more cycles. The original thought
was that callers might compute size parameters on the fly, but in practice
nobody does that, so it's useless to expend cycles on checking those
numbers in production builds.
Also, mark the memory context method-pointer structs "const",
just for cleanliness.
Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
2017-12-13 19:55:12 +01:00
|
|
|
return (MemoryContext) set;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
/* Determine size of initial block */
|
Allow memory contexts to have both fixed and variable ident strings.
Originally, we treated memory context names as potentially variable in
all cases, and therefore always copied them into the context header.
Commit 9fa6f00b1 rethought this a little bit and invented a distinction
between fixed and variable names, skipping the copy step for the former.
But we can make things both simpler and more useful by instead allowing
there to be two parts to a context's identification, a fixed "name" and
an optional, variable "ident". The name supplied in the context create
call is now required to be a compile-time-constant string in all cases,
as it is never copied but just pointed to. The "ident" string, if
wanted, is supplied later. This is needed because typically we want
the ident to be stored inside the context so that it's cleaned up
automatically on context deletion; that means it has to be copied into
the context before we can set the pointer.
The cost of this approach is basically just an additional pointer field
in struct MemoryContextData, which isn't much overhead, and is bought
back entirely in the AllocSet case by not needing a headerSize field
anymore, since we no longer have to cope with variable header length.
In addition, we can simplify the internal interfaces for memory context
creation still further, saving a few cycles there. And it's no longer
true that a custom identifier disqualifies a context from participating
in aset.c's freelist scheme, so possibly there's some win on that end.
All the places that were using non-compile-time-constant context names
are adjusted to put the variable info into the "ident" instead. This
allows more effective identification of those contexts in many cases;
for example, subsidary contexts of relcache entries are now identified
by both type (e.g. "index info") and relname, where before you got only
one or the other. Contexts associated with PL function cache entries
are now identified more fully and uniformly, too.
I also arranged for plancache contexts to use the query source string
as their identifier. This is basically free for CachedPlanSources, as
they contained a copy of that string already. We pay an extra pstrdup
to do it for CachedPlans. That could perhaps be avoided, but it would
make things more fragile (since the CachedPlanSource is sometimes
destroyed first). I suspect future improvements in error reporting will
require CachedPlans to have a copy of that string anyway, so it's not
clear that it's worth moving mountains to avoid it now.
This also changes the APIs for context statistics routines so that the
context-specific routines no longer assume that output goes straight
to stderr, nor do they know all details of the output format. This
is useful immediately to reduce code duplication, and it also allows
for external code to do something with stats output that's different
from printing to stderr.
The reason for pushing this now rather than waiting for v12 is that
it rethinks some of the API changes made by commit 9fa6f00b1. Seems
better for extension authors to endure just one round of API changes
not two.
Discussion: https://postgr.es/m/CAB=Je-FdtmFZ9y9REHD7VsSrnCkiBhsA4mdsLKSPauwXtQBeNA@mail.gmail.com
2018-03-27 22:46:47 +02:00
|
|
|
firstBlockSize = MAXALIGN(sizeof(AllocSetContext)) +
|
|
|
|
ALLOC_BLOCKHDRSZ + ALLOC_CHUNKHDRSZ;
|
Rethink MemoryContext creation to improve performance.
This patch makes a number of interrelated changes to reduce the overhead
involved in creating/deleting memory contexts. The key ideas are:
* Include the AllocSetContext header of an aset.c context in its first
malloc request, rather than allocating it separately in TopMemoryContext.
This means that we now always create an initial or "keeper" block in an
aset, even if it never receives any allocation requests.
* Create freelists in which we can save and recycle recently-destroyed
asets (this idea is due to Robert Haas).
* In the common case where the name of a context is a constant string,
just store a pointer to it in the context header, rather than copying
the string.
The first change eliminates a palloc/pfree cycle per context, and
also avoids bloat in TopMemoryContext, at the price that creating
a context now involves a malloc/free cycle even if the context never
receives any allocations. That would be a loser for some common
usage patterns, but recycling short-lived contexts via the freelist
eliminates that pain.
Avoiding copying constant strings not only saves strlen() and strcpy()
overhead, but is an essential part of the freelist optimization because
it makes the context header size constant. Currently we make no
attempt to use the freelist for contexts with non-constant names.
(Perhaps someday we'll need to think harder about that, but in current
usage, most contexts with custom names are long-lived anyway.)
The freelist management in this initial commit is pretty simplistic,
and we might want to refine it later --- but in common workloads that
will never matter because the freelists will never get full anyway.
To create a context with a non-constant name, one is now required to
call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME
option. AllocSetContextCreate becomes a wrapper macro, and it includes
a test that will complain about non-string-literal context name
parameters on gcc and similar compilers.
An unfortunate side effect of making AllocSetContextCreate a macro is
that one is now *required* to use the size parameter abstraction macros
(ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of
writing out individual size parameters no longer works unless you
switch to AllocSetContextCreateExtended.
Internally to the memory-context-related modules, the context creation
APIs are simplified, removing the rather baroque original design whereby
a context-type module called mcxt.c which then called back into the
context-type module. That saved a bit of code duplication, but not much,
and it prevented context-type modules from exercising control over the
allocation of context headers.
In passing, I converted the test-and-elog validation of aset size
parameters into Asserts to save a few more cycles. The original thought
was that callers might compute size parameters on the fly, but in practice
nobody does that, so it's useless to expend cycles on checking those
numbers in production builds.
Also, mark the memory context method-pointer structs "const",
just for cleanliness.
Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
2017-12-13 19:55:12 +01:00
|
|
|
if (minContextSize != 0)
|
|
|
|
firstBlockSize = Max(firstBlockSize, minContextSize);
|
|
|
|
else
|
|
|
|
firstBlockSize = Max(firstBlockSize, initBlockSize);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Allocate the initial block. Unlike other aset.c blocks, it starts with
|
|
|
|
* the context header and its block header follows that.
|
|
|
|
*/
|
|
|
|
set = (AllocSet) malloc(firstBlockSize);
|
|
|
|
if (set == NULL)
|
|
|
|
{
|
|
|
|
if (TopMemoryContext)
|
|
|
|
MemoryContextStats(TopMemoryContext);
|
|
|
|
ereport(ERROR,
|
|
|
|
(errcode(ERRCODE_OUT_OF_MEMORY),
|
|
|
|
errmsg("out of memory"),
|
|
|
|
errdetail("Failed while creating memory context \"%s\".",
|
|
|
|
name)));
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Avoid writing code that can fail between here and MemoryContextCreate;
|
|
|
|
* we'd leak the header/initial block if we ereport in this stretch.
|
|
|
|
*/
|
|
|
|
|
|
|
|
/* Fill in the initial block's block header */
|
2023-07-17 01:16:56 +02:00
|
|
|
block = KeeperBlock(set);
|
Rethink MemoryContext creation to improve performance.
This patch makes a number of interrelated changes to reduce the overhead
involved in creating/deleting memory contexts. The key ideas are:
* Include the AllocSetContext header of an aset.c context in its first
malloc request, rather than allocating it separately in TopMemoryContext.
This means that we now always create an initial or "keeper" block in an
aset, even if it never receives any allocation requests.
* Create freelists in which we can save and recycle recently-destroyed
asets (this idea is due to Robert Haas).
* In the common case where the name of a context is a constant string,
just store a pointer to it in the context header, rather than copying
the string.
The first change eliminates a palloc/pfree cycle per context, and
also avoids bloat in TopMemoryContext, at the price that creating
a context now involves a malloc/free cycle even if the context never
receives any allocations. That would be a loser for some common
usage patterns, but recycling short-lived contexts via the freelist
eliminates that pain.
Avoiding copying constant strings not only saves strlen() and strcpy()
overhead, but is an essential part of the freelist optimization because
it makes the context header size constant. Currently we make no
attempt to use the freelist for contexts with non-constant names.
(Perhaps someday we'll need to think harder about that, but in current
usage, most contexts with custom names are long-lived anyway.)
The freelist management in this initial commit is pretty simplistic,
and we might want to refine it later --- but in common workloads that
will never matter because the freelists will never get full anyway.
To create a context with a non-constant name, one is now required to
call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME
option. AllocSetContextCreate becomes a wrapper macro, and it includes
a test that will complain about non-string-literal context name
parameters on gcc and similar compilers.
An unfortunate side effect of making AllocSetContextCreate a macro is
that one is now *required* to use the size parameter abstraction macros
(ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of
writing out individual size parameters no longer works unless you
switch to AllocSetContextCreateExtended.
Internally to the memory-context-related modules, the context creation
APIs are simplified, removing the rather baroque original design whereby
a context-type module called mcxt.c which then called back into the
context-type module. That saved a bit of code duplication, but not much,
and it prevented context-type modules from exercising control over the
allocation of context headers.
In passing, I converted the test-and-elog validation of aset size
parameters into Asserts to save a few more cycles. The original thought
was that callers might compute size parameters on the fly, but in practice
nobody does that, so it's useless to expend cycles on checking those
numbers in production builds.
Also, mark the memory context method-pointer structs "const",
just for cleanliness.
Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
2017-12-13 19:55:12 +01:00
|
|
|
block->aset = set;
|
|
|
|
block->freeptr = ((char *) block) + ALLOC_BLOCKHDRSZ;
|
|
|
|
block->endptr = ((char *) set) + firstBlockSize;
|
|
|
|
block->prev = NULL;
|
|
|
|
block->next = NULL;
|
|
|
|
|
|
|
|
/* Mark unallocated space NOACCESS; leave the block header alone. */
|
|
|
|
VALGRIND_MAKE_MEM_NOACCESS(block->freeptr, block->endptr - block->freeptr);
|
|
|
|
|
|
|
|
/* Remember block as part of block list */
|
|
|
|
set->blocks = block;
|
|
|
|
|
|
|
|
/* Finish filling in aset-specific parts of the context header */
|
|
|
|
MemSetAligned(set->freelist, 0, sizeof(set->freelist));
|
|
|
|
|
2023-07-17 01:16:56 +02:00
|
|
|
set->initBlockSize = (uint32) initBlockSize;
|
|
|
|
set->maxBlockSize = (uint32) maxBlockSize;
|
|
|
|
set->nextBlockSize = (uint32) initBlockSize;
|
Rethink MemoryContext creation to improve performance.
This patch makes a number of interrelated changes to reduce the overhead
involved in creating/deleting memory contexts. The key ideas are:
* Include the AllocSetContext header of an aset.c context in its first
malloc request, rather than allocating it separately in TopMemoryContext.
This means that we now always create an initial or "keeper" block in an
aset, even if it never receives any allocation requests.
* Create freelists in which we can save and recycle recently-destroyed
asets (this idea is due to Robert Haas).
* In the common case where the name of a context is a constant string,
just store a pointer to it in the context header, rather than copying
the string.
The first change eliminates a palloc/pfree cycle per context, and
also avoids bloat in TopMemoryContext, at the price that creating
a context now involves a malloc/free cycle even if the context never
receives any allocations. That would be a loser for some common
usage patterns, but recycling short-lived contexts via the freelist
eliminates that pain.
Avoiding copying constant strings not only saves strlen() and strcpy()
overhead, but is an essential part of the freelist optimization because
it makes the context header size constant. Currently we make no
attempt to use the freelist for contexts with non-constant names.
(Perhaps someday we'll need to think harder about that, but in current
usage, most contexts with custom names are long-lived anyway.)
The freelist management in this initial commit is pretty simplistic,
and we might want to refine it later --- but in common workloads that
will never matter because the freelists will never get full anyway.
To create a context with a non-constant name, one is now required to
call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME
option. AllocSetContextCreate becomes a wrapper macro, and it includes
a test that will complain about non-string-literal context name
parameters on gcc and similar compilers.
An unfortunate side effect of making AllocSetContextCreate a macro is
that one is now *required* to use the size parameter abstraction macros
(ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of
writing out individual size parameters no longer works unless you
switch to AllocSetContextCreateExtended.
Internally to the memory-context-related modules, the context creation
APIs are simplified, removing the rather baroque original design whereby
a context-type module called mcxt.c which then called back into the
context-type module. That saved a bit of code duplication, but not much,
and it prevented context-type modules from exercising control over the
allocation of context headers.
In passing, I converted the test-and-elog validation of aset size
parameters into Asserts to save a few more cycles. The original thought
was that callers might compute size parameters on the fly, but in practice
nobody does that, so it's useless to expend cycles on checking those
numbers in production builds.
Also, mark the memory context method-pointer structs "const",
just for cleanliness.
Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
2017-12-13 19:55:12 +01:00
|
|
|
set->freeListIndex = freeListIndex;
|
1997-09-07 07:04:48 +02:00
|
|
|
|
2006-12-27 23:30:48 +01:00
|
|
|
/*
|
|
|
|
* Compute the allocation chunk size limit for this context. It can't be
|
|
|
|
* more than ALLOC_CHUNK_LIMIT because of the fixed number of freelists.
|
2011-05-02 18:08:08 +02:00
|
|
|
* If maxBlockSize is small then requests exceeding the maxBlockSize, or
|
|
|
|
* even a significant fraction of it, should be treated as large chunks
|
|
|
|
* too. For the typical case of maxBlockSize a power of 2, the chunk size
|
|
|
|
* limit will be at most 1/8th maxBlockSize, so that given a stream of
|
|
|
|
* requests that are all the maximum chunk size we will waste at most
|
|
|
|
* 1/8th of the allocated space.
|
|
|
|
*
|
Fix bogus "out of memory" reports in tuplestore.c.
The tuplesort/tuplestore memory management logic assumed that the chunk
allocation overhead for its memtuples array could not increase when
increasing the array size. This is and always was true for tuplesort,
but we (I, I think) blindly copied that logic into tuplestore.c without
noticing that the assumption failed to hold for the much smaller array
elements used by tuplestore. Given rather small work_mem, this could
result in an improper complaint about "unexpected out-of-memory situation",
as reported by Brent DeSpain in bug #13530.
The easiest way to fix this is just to increase tuplestore's initial
array size so that the assumption holds. Rather than relying on magic
constants, though, let's export a #define from aset.c that represents
the safe allocation threshold, and make tuplestore's calculation depend
on that.
Do the same in tuplesort.c to keep the logic looking parallel, even though
tuplesort.c isn't actually at risk at present. This will keep us from
breaking it if we ever muck with the allocation parameters in aset.c.
Back-patch to all supported versions. The error message doesn't occur
pre-9.3, not so much because the problem can't happen as because the
pre-9.3 tuplestore code neglected to check for it. (The chance of
trouble is a great deal larger as of 9.3, though, due to changes in the
array-size-increasing strategy.) However, allowing LACKMEM() to become
true unexpectedly could still result in less-than-desirable behavior,
so let's patch it all the way back.
2015-08-05 00:18:46 +02:00
|
|
|
* Also, allocChunkLimit must not exceed ALLOCSET_SEPARATE_THRESHOLD.
|
2006-12-27 23:30:48 +01:00
|
|
|
*/
|
Fix bogus "out of memory" reports in tuplestore.c.
The tuplesort/tuplestore memory management logic assumed that the chunk
allocation overhead for its memtuples array could not increase when
increasing the array size. This is and always was true for tuplesort,
but we (I, I think) blindly copied that logic into tuplestore.c without
noticing that the assumption failed to hold for the much smaller array
elements used by tuplestore. Given rather small work_mem, this could
result in an improper complaint about "unexpected out-of-memory situation",
as reported by Brent DeSpain in bug #13530.
The easiest way to fix this is just to increase tuplestore's initial
array size so that the assumption holds. Rather than relying on magic
constants, though, let's export a #define from aset.c that represents
the safe allocation threshold, and make tuplestore's calculation depend
on that.
Do the same in tuplesort.c to keep the logic looking parallel, even though
tuplesort.c isn't actually at risk at present. This will keep us from
breaking it if we ever muck with the allocation parameters in aset.c.
Back-patch to all supported versions. The error message doesn't occur
pre-9.3, not so much because the problem can't happen as because the
pre-9.3 tuplestore code neglected to check for it. (The chance of
trouble is a great deal larger as of 9.3, though, due to changes in the
array-size-increasing strategy.) However, allowing LACKMEM() to become
true unexpectedly could still result in less-than-desirable behavior,
so let's patch it all the way back.
2015-08-05 00:18:46 +02:00
|
|
|
StaticAssertStmt(ALLOC_CHUNK_LIMIT == ALLOCSET_SEPARATE_THRESHOLD,
|
|
|
|
"ALLOC_CHUNK_LIMIT != ALLOCSET_SEPARATE_THRESHOLD");
|
|
|
|
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
/*
|
|
|
|
* Determine the maximum size that a chunk can be before we allocate an
|
|
|
|
* entire AllocBlock dedicated for that chunk. We set the absolute limit
|
|
|
|
* of that size as ALLOC_CHUNK_LIMIT but we reduce it further so that we
|
|
|
|
* can fit about ALLOC_CHUNK_FRACTION chunks this size on a maximally
|
|
|
|
* sized block. (We opt to keep allocChunkLimit a power-of-2 value
|
|
|
|
* primarily for legacy reasons rather than calculating it so that exactly
|
|
|
|
* ALLOC_CHUNK_FRACTION chunks fit on a maximally sized block.)
|
|
|
|
*/
|
2015-02-22 08:17:52 +01:00
|
|
|
set->allocChunkLimit = ALLOC_CHUNK_LIMIT;
|
|
|
|
while ((Size) (set->allocChunkLimit + ALLOC_CHUNKHDRSZ) >
|
2011-05-02 18:08:08 +02:00
|
|
|
(Size) ((maxBlockSize - ALLOC_BLOCKHDRSZ) / ALLOC_CHUNK_FRACTION))
|
2015-02-22 08:17:52 +01:00
|
|
|
set->allocChunkLimit >>= 1;
|
2006-12-27 23:30:48 +01:00
|
|
|
|
Rethink MemoryContext creation to improve performance.
This patch makes a number of interrelated changes to reduce the overhead
involved in creating/deleting memory contexts. The key ideas are:
* Include the AllocSetContext header of an aset.c context in its first
malloc request, rather than allocating it separately in TopMemoryContext.
This means that we now always create an initial or "keeper" block in an
aset, even if it never receives any allocation requests.
* Create freelists in which we can save and recycle recently-destroyed
asets (this idea is due to Robert Haas).
* In the common case where the name of a context is a constant string,
just store a pointer to it in the context header, rather than copying
the string.
The first change eliminates a palloc/pfree cycle per context, and
also avoids bloat in TopMemoryContext, at the price that creating
a context now involves a malloc/free cycle even if the context never
receives any allocations. That would be a loser for some common
usage patterns, but recycling short-lived contexts via the freelist
eliminates that pain.
Avoiding copying constant strings not only saves strlen() and strcpy()
overhead, but is an essential part of the freelist optimization because
it makes the context header size constant. Currently we make no
attempt to use the freelist for contexts with non-constant names.
(Perhaps someday we'll need to think harder about that, but in current
usage, most contexts with custom names are long-lived anyway.)
The freelist management in this initial commit is pretty simplistic,
and we might want to refine it later --- but in common workloads that
will never matter because the freelists will never get full anyway.
To create a context with a non-constant name, one is now required to
call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME
option. AllocSetContextCreate becomes a wrapper macro, and it includes
a test that will complain about non-string-literal context name
parameters on gcc and similar compilers.
An unfortunate side effect of making AllocSetContextCreate a macro is
that one is now *required* to use the size parameter abstraction macros
(ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of
writing out individual size parameters no longer works unless you
switch to AllocSetContextCreateExtended.
Internally to the memory-context-related modules, the context creation
APIs are simplified, removing the rather baroque original design whereby
a context-type module called mcxt.c which then called back into the
context-type module. That saved a bit of code duplication, but not much,
and it prevented context-type modules from exercising control over the
allocation of context headers.
In passing, I converted the test-and-elog validation of aset size
parameters into Asserts to save a few more cycles. The original thought
was that callers might compute size parameters on the fly, but in practice
nobody does that, so it's useless to expend cycles on checking those
numbers in production builds.
Also, mark the memory context method-pointer structs "const",
just for cleanliness.
Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
2017-12-13 19:55:12 +01:00
|
|
|
/* Finally, do the type-independent part of context creation */
|
|
|
|
MemoryContextCreate((MemoryContext) set,
|
|
|
|
T_AllocSetContext,
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
MCTX_ASET_ID,
|
Rethink MemoryContext creation to improve performance.
This patch makes a number of interrelated changes to reduce the overhead
involved in creating/deleting memory contexts. The key ideas are:
* Include the AllocSetContext header of an aset.c context in its first
malloc request, rather than allocating it separately in TopMemoryContext.
This means that we now always create an initial or "keeper" block in an
aset, even if it never receives any allocation requests.
* Create freelists in which we can save and recycle recently-destroyed
asets (this idea is due to Robert Haas).
* In the common case where the name of a context is a constant string,
just store a pointer to it in the context header, rather than copying
the string.
The first change eliminates a palloc/pfree cycle per context, and
also avoids bloat in TopMemoryContext, at the price that creating
a context now involves a malloc/free cycle even if the context never
receives any allocations. That would be a loser for some common
usage patterns, but recycling short-lived contexts via the freelist
eliminates that pain.
Avoiding copying constant strings not only saves strlen() and strcpy()
overhead, but is an essential part of the freelist optimization because
it makes the context header size constant. Currently we make no
attempt to use the freelist for contexts with non-constant names.
(Perhaps someday we'll need to think harder about that, but in current
usage, most contexts with custom names are long-lived anyway.)
The freelist management in this initial commit is pretty simplistic,
and we might want to refine it later --- but in common workloads that
will never matter because the freelists will never get full anyway.
To create a context with a non-constant name, one is now required to
call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME
option. AllocSetContextCreate becomes a wrapper macro, and it includes
a test that will complain about non-string-literal context name
parameters on gcc and similar compilers.
An unfortunate side effect of making AllocSetContextCreate a macro is
that one is now *required* to use the size parameter abstraction macros
(ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of
writing out individual size parameters no longer works unless you
switch to AllocSetContextCreateExtended.
Internally to the memory-context-related modules, the context creation
APIs are simplified, removing the rather baroque original design whereby
a context-type module called mcxt.c which then called back into the
context-type module. That saved a bit of code duplication, but not much,
and it prevented context-type modules from exercising control over the
allocation of context headers.
In passing, I converted the test-and-elog validation of aset size
parameters into Asserts to save a few more cycles. The original thought
was that callers might compute size parameters on the fly, but in practice
nobody does that, so it's useless to expend cycles on checking those
numbers in production builds.
Also, mark the memory context method-pointer structs "const",
just for cleanliness.
Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
2017-12-13 19:55:12 +01:00
|
|
|
parent,
|
Allow memory contexts to have both fixed and variable ident strings.
Originally, we treated memory context names as potentially variable in
all cases, and therefore always copied them into the context header.
Commit 9fa6f00b1 rethought this a little bit and invented a distinction
between fixed and variable names, skipping the copy step for the former.
But we can make things both simpler and more useful by instead allowing
there to be two parts to a context's identification, a fixed "name" and
an optional, variable "ident". The name supplied in the context create
call is now required to be a compile-time-constant string in all cases,
as it is never copied but just pointed to. The "ident" string, if
wanted, is supplied later. This is needed because typically we want
the ident to be stored inside the context so that it's cleaned up
automatically on context deletion; that means it has to be copied into
the context before we can set the pointer.
The cost of this approach is basically just an additional pointer field
in struct MemoryContextData, which isn't much overhead, and is bought
back entirely in the AllocSet case by not needing a headerSize field
anymore, since we no longer have to cope with variable header length.
In addition, we can simplify the internal interfaces for memory context
creation still further, saving a few cycles there. And it's no longer
true that a custom identifier disqualifies a context from participating
in aset.c's freelist scheme, so possibly there's some win on that end.
All the places that were using non-compile-time-constant context names
are adjusted to put the variable info into the "ident" instead. This
allows more effective identification of those contexts in many cases;
for example, subsidary contexts of relcache entries are now identified
by both type (e.g. "index info") and relname, where before you got only
one or the other. Contexts associated with PL function cache entries
are now identified more fully and uniformly, too.
I also arranged for plancache contexts to use the query source string
as their identifier. This is basically free for CachedPlanSources, as
they contained a copy of that string already. We pay an extra pstrdup
to do it for CachedPlans. That could perhaps be avoided, but it would
make things more fragile (since the CachedPlanSource is sometimes
destroyed first). I suspect future improvements in error reporting will
require CachedPlans to have a copy of that string anyway, so it's not
clear that it's worth moving mountains to avoid it now.
This also changes the APIs for context statistics routines so that the
context-specific routines no longer assume that output goes straight
to stderr, nor do they know all details of the output format. This
is useful immediately to reduce code duplication, and it also allows
for external code to do something with stats output that's different
from printing to stderr.
The reason for pushing this now rather than waiting for v12 is that
it rethinks some of the API changes made by commit 9fa6f00b1. Seems
better for extension authors to endure just one round of API changes
not two.
Discussion: https://postgr.es/m/CAB=Je-FdtmFZ9y9REHD7VsSrnCkiBhsA4mdsLKSPauwXtQBeNA@mail.gmail.com
2018-03-27 22:46:47 +02:00
|
|
|
name);
|
2000-06-28 05:33:33 +02:00
|
|
|
|
2020-03-19 19:20:39 +01:00
|
|
|
((MemoryContext) set)->mem_allocated = firstBlockSize;
|
2019-10-01 03:13:39 +02:00
|
|
|
|
2015-02-22 08:17:52 +01:00
|
|
|
return (MemoryContext) set;
|
1996-07-09 08:22:35 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
1999-02-14 00:22:53 +01:00
|
|
|
* AllocSetReset
|
1999-08-24 22:11:19 +02:00
|
|
|
* Frees all memory which is allocated in the given set.
|
1996-07-09 08:22:35 +02:00
|
|
|
*
|
2000-06-28 05:33:33 +02:00
|
|
|
* Actually, this routine has some discretion about what to do.
|
2002-12-15 22:01:34 +01:00
|
|
|
* It should mark all allocated chunks freed, but it need not necessarily
|
|
|
|
* give back all the resources the set owns. Our actual implementation is
|
Rethink MemoryContext creation to improve performance.
This patch makes a number of interrelated changes to reduce the overhead
involved in creating/deleting memory contexts. The key ideas are:
* Include the AllocSetContext header of an aset.c context in its first
malloc request, rather than allocating it separately in TopMemoryContext.
This means that we now always create an initial or "keeper" block in an
aset, even if it never receives any allocation requests.
* Create freelists in which we can save and recycle recently-destroyed
asets (this idea is due to Robert Haas).
* In the common case where the name of a context is a constant string,
just store a pointer to it in the context header, rather than copying
the string.
The first change eliminates a palloc/pfree cycle per context, and
also avoids bloat in TopMemoryContext, at the price that creating
a context now involves a malloc/free cycle even if the context never
receives any allocations. That would be a loser for some common
usage patterns, but recycling short-lived contexts via the freelist
eliminates that pain.
Avoiding copying constant strings not only saves strlen() and strcpy()
overhead, but is an essential part of the freelist optimization because
it makes the context header size constant. Currently we make no
attempt to use the freelist for contexts with non-constant names.
(Perhaps someday we'll need to think harder about that, but in current
usage, most contexts with custom names are long-lived anyway.)
The freelist management in this initial commit is pretty simplistic,
and we might want to refine it later --- but in common workloads that
will never matter because the freelists will never get full anyway.
To create a context with a non-constant name, one is now required to
call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME
option. AllocSetContextCreate becomes a wrapper macro, and it includes
a test that will complain about non-string-literal context name
parameters on gcc and similar compilers.
An unfortunate side effect of making AllocSetContextCreate a macro is
that one is now *required* to use the size parameter abstraction macros
(ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of
writing out individual size parameters no longer works unless you
switch to AllocSetContextCreateExtended.
Internally to the memory-context-related modules, the context creation
APIs are simplified, removing the rather baroque original design whereby
a context-type module called mcxt.c which then called back into the
context-type module. That saved a bit of code duplication, but not much,
and it prevented context-type modules from exercising control over the
allocation of context headers.
In passing, I converted the test-and-elog validation of aset size
parameters into Asserts to save a few more cycles. The original thought
was that callers might compute size parameters on the fly, but in practice
nobody does that, so it's useless to expend cycles on checking those
numbers in production builds.
Also, mark the memory context method-pointer structs "const",
just for cleanliness.
Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
2017-12-13 19:55:12 +01:00
|
|
|
* that we give back all but the "keeper" block (which we must keep, since
|
|
|
|
* it shares a malloc chunk with the context header). In this way, we don't
|
|
|
|
* thrash malloc() when a context is repeatedly reset after small allocations,
|
|
|
|
* which is typical behavior for per-tuple contexts.
|
1996-07-09 08:22:35 +02:00
|
|
|
*/
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
void
|
2000-06-28 05:33:33 +02:00
|
|
|
AllocSetReset(MemoryContext context)
|
1996-07-09 08:22:35 +02:00
|
|
|
{
|
2000-06-28 05:33:33 +02:00
|
|
|
AllocSet set = (AllocSet) context;
|
2005-09-01 20:15:42 +02:00
|
|
|
AllocBlock block;
|
Harden memory context allocators against bogus chunk pointers.
Before commit c6e0fe1f2, functions such as AllocSetFree could pretty
safely presume that they were given a valid chunk pointer for their
own type of context, because the indirect call through a memory
context object and method struct would be very unlikely to work
otherwise. But now, if pfree() is mistakenly invoked on a pointer
to garbage, we have three chances in eight of ending up at one of
these functions. That means we need to take extra measures to
verify that we are looking at what we're supposed to be looking at,
especially in debug builds.
Hence, add code to verify that the chunk's back-link to a block header
leads to a memory context object that satisfies the right sort of
IsA() check. This is still a bit weaker than what we did before,
but for the moment assume that an IsA() check is sufficient.
As a compromise between speed and safety, implement these checks
as Asserts when dealing with small chunks but plain test-and-elogs
when dealing with large (external) chunks. The latter case should
not be too performance-critical, but the former case probably is.
In slab.c, all chunks are small; but nonetheless use a plain test
in SlabRealloc, because that is certainly not performance-critical,
indeed we should be suspicious that it's being called in error.
In aset.c, additionally add some assertions that the "value" field
of the chunk header is within the small range allowed for freelist
indexes. Without that, we might find ourselves trying to wipe
most of memory when CLOBBER_FREED_MEMORY is enabled, or scribbling
on a "freelist header" that's far away from the context object.
Eventually, field experience might show us that it's smarter for
these tests to be active always, but for now we'll try to get
away with just having them as assertions.
While at it, also be more uniform about asserting that context
objects passed as parameters are of the type we expect. Some
places missed that altogether, and slab.c was for no very good
reason doing it differently from the other allocators.
Discussion: https://postgr.es/m/3578387.1665244345@sss.pgh.pa.us
2022-10-11 00:45:34 +02:00
|
|
|
Size keepersize PG_USED_FOR_ASSERTS_ONLY;
|
1997-09-07 07:04:48 +02:00
|
|
|
|
2022-10-28 09:19:06 +02:00
|
|
|
Assert(AllocSetIsValid(set));
|
1997-09-07 07:04:48 +02:00
|
|
|
|
2000-12-01 06:16:45 +01:00
|
|
|
#ifdef MEMORY_CONTEXT_CHECKING
|
|
|
|
/* Check for corruption and leaks before freeing */
|
|
|
|
AllocSetCheck(context);
|
|
|
|
#endif
|
|
|
|
|
Harden memory context allocators against bogus chunk pointers.
Before commit c6e0fe1f2, functions such as AllocSetFree could pretty
safely presume that they were given a valid chunk pointer for their
own type of context, because the indirect call through a memory
context object and method struct would be very unlikely to work
otherwise. But now, if pfree() is mistakenly invoked on a pointer
to garbage, we have three chances in eight of ending up at one of
these functions. That means we need to take extra measures to
verify that we are looking at what we're supposed to be looking at,
especially in debug builds.
Hence, add code to verify that the chunk's back-link to a block header
leads to a memory context object that satisfies the right sort of
IsA() check. This is still a bit weaker than what we did before,
but for the moment assume that an IsA() check is sufficient.
As a compromise between speed and safety, implement these checks
as Asserts when dealing with small chunks but plain test-and-elogs
when dealing with large (external) chunks. The latter case should
not be too performance-critical, but the former case probably is.
In slab.c, all chunks are small; but nonetheless use a plain test
in SlabRealloc, because that is certainly not performance-critical,
indeed we should be suspicious that it's being called in error.
In aset.c, additionally add some assertions that the "value" field
of the chunk header is within the small range allowed for freelist
indexes. Without that, we might find ourselves trying to wipe
most of memory when CLOBBER_FREED_MEMORY is enabled, or scribbling
on a "freelist header" that's far away from the context object.
Eventually, field experience might show us that it's smarter for
these tests to be active always, but for now we'll try to get
away with just having them as assertions.
While at it, also be more uniform about asserting that context
objects passed as parameters are of the type we expect. Some
places missed that altogether, and slab.c was for no very good
reason doing it differently from the other allocators.
Discussion: https://postgr.es/m/3578387.1665244345@sss.pgh.pa.us
2022-10-11 00:45:34 +02:00
|
|
|
/* Remember keeper block size for Assert below */
|
2023-07-17 01:16:56 +02:00
|
|
|
keepersize = KeeperBlock(set)->endptr - ((char *) set);
|
Harden memory context allocators against bogus chunk pointers.
Before commit c6e0fe1f2, functions such as AllocSetFree could pretty
safely presume that they were given a valid chunk pointer for their
own type of context, because the indirect call through a memory
context object and method struct would be very unlikely to work
otherwise. But now, if pfree() is mistakenly invoked on a pointer
to garbage, we have three chances in eight of ending up at one of
these functions. That means we need to take extra measures to
verify that we are looking at what we're supposed to be looking at,
especially in debug builds.
Hence, add code to verify that the chunk's back-link to a block header
leads to a memory context object that satisfies the right sort of
IsA() check. This is still a bit weaker than what we did before,
but for the moment assume that an IsA() check is sufficient.
As a compromise between speed and safety, implement these checks
as Asserts when dealing with small chunks but plain test-and-elogs
when dealing with large (external) chunks. The latter case should
not be too performance-critical, but the former case probably is.
In slab.c, all chunks are small; but nonetheless use a plain test
in SlabRealloc, because that is certainly not performance-critical,
indeed we should be suspicious that it's being called in error.
In aset.c, additionally add some assertions that the "value" field
of the chunk header is within the small range allowed for freelist
indexes. Without that, we might find ourselves trying to wipe
most of memory when CLOBBER_FREED_MEMORY is enabled, or scribbling
on a "freelist header" that's far away from the context object.
Eventually, field experience might show us that it's smarter for
these tests to be active always, but for now we'll try to get
away with just having them as assertions.
While at it, also be more uniform about asserting that context
objects passed as parameters are of the type we expect. Some
places missed that altogether, and slab.c was for no very good
reason doing it differently from the other allocators.
Discussion: https://postgr.es/m/3578387.1665244345@sss.pgh.pa.us
2022-10-11 00:45:34 +02:00
|
|
|
|
2001-01-12 22:54:01 +01:00
|
|
|
/* Clear chunk freelists */
|
2005-05-14 22:29:13 +02:00
|
|
|
MemSetAligned(set->freelist, 0, sizeof(set->freelist));
|
|
|
|
|
2005-09-01 20:15:42 +02:00
|
|
|
block = set->blocks;
|
|
|
|
|
Rethink MemoryContext creation to improve performance.
This patch makes a number of interrelated changes to reduce the overhead
involved in creating/deleting memory contexts. The key ideas are:
* Include the AllocSetContext header of an aset.c context in its first
malloc request, rather than allocating it separately in TopMemoryContext.
This means that we now always create an initial or "keeper" block in an
aset, even if it never receives any allocation requests.
* Create freelists in which we can save and recycle recently-destroyed
asets (this idea is due to Robert Haas).
* In the common case where the name of a context is a constant string,
just store a pointer to it in the context header, rather than copying
the string.
The first change eliminates a palloc/pfree cycle per context, and
also avoids bloat in TopMemoryContext, at the price that creating
a context now involves a malloc/free cycle even if the context never
receives any allocations. That would be a loser for some common
usage patterns, but recycling short-lived contexts via the freelist
eliminates that pain.
Avoiding copying constant strings not only saves strlen() and strcpy()
overhead, but is an essential part of the freelist optimization because
it makes the context header size constant. Currently we make no
attempt to use the freelist for contexts with non-constant names.
(Perhaps someday we'll need to think harder about that, but in current
usage, most contexts with custom names are long-lived anyway.)
The freelist management in this initial commit is pretty simplistic,
and we might want to refine it later --- but in common workloads that
will never matter because the freelists will never get full anyway.
To create a context with a non-constant name, one is now required to
call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME
option. AllocSetContextCreate becomes a wrapper macro, and it includes
a test that will complain about non-string-literal context name
parameters on gcc and similar compilers.
An unfortunate side effect of making AllocSetContextCreate a macro is
that one is now *required* to use the size parameter abstraction macros
(ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of
writing out individual size parameters no longer works unless you
switch to AllocSetContextCreateExtended.
Internally to the memory-context-related modules, the context creation
APIs are simplified, removing the rather baroque original design whereby
a context-type module called mcxt.c which then called back into the
context-type module. That saved a bit of code duplication, but not much,
and it prevented context-type modules from exercising control over the
allocation of context headers.
In passing, I converted the test-and-elog validation of aset size
parameters into Asserts to save a few more cycles. The original thought
was that callers might compute size parameters on the fly, but in practice
nobody does that, so it's useless to expend cycles on checking those
numbers in production builds.
Also, mark the memory context method-pointer structs "const",
just for cleanliness.
Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
2017-12-13 19:55:12 +01:00
|
|
|
/* New blocks list will be just the keeper block */
|
2023-07-17 01:16:56 +02:00
|
|
|
set->blocks = KeeperBlock(set);
|
2001-01-12 22:54:01 +01:00
|
|
|
|
2005-09-01 20:15:42 +02:00
|
|
|
while (block != NULL)
|
1999-02-06 17:50:34 +01:00
|
|
|
{
|
2000-06-28 05:33:33 +02:00
|
|
|
AllocBlock next = block->next;
|
|
|
|
|
2023-07-17 01:16:56 +02:00
|
|
|
if (IsKeeperBlock(set, block))
|
2000-06-28 05:33:33 +02:00
|
|
|
{
|
|
|
|
/* Reset the block, but don't return it to malloc */
|
2000-07-12 04:37:39 +02:00
|
|
|
char *datastart = ((char *) block) + ALLOC_BLOCKHDRSZ;
|
|
|
|
|
2000-03-09 00:42:58 +01:00
|
|
|
#ifdef CLOBBER_FREED_MEMORY
|
2013-06-27 01:56:03 +02:00
|
|
|
wipe_mem(datastart, block->freeptr - datastart);
|
2013-06-27 02:00:08 +02:00
|
|
|
#else
|
|
|
|
/* wipe_mem() would have done this */
|
|
|
|
VALGRIND_MAKE_MEM_NOACCESS(datastart, block->freeptr - datastart);
|
2000-03-09 00:42:58 +01:00
|
|
|
#endif
|
2000-07-12 04:37:39 +02:00
|
|
|
block->freeptr = datastart;
|
Use doubly-linked block lists in aset.c to reduce large-chunk overhead.
Large chunks (those too large for any palloc freelist) are managed as
separate blocks. Formerly, realloc'ing or pfree'ing such a chunk required
O(N) time in a context with N blocks, since we had to traipse down the
singly-linked block list to locate the block's predecessor before we could
fix the list links. This can result in O(N^2) runtime in situations where
large numbers of such chunks are manipulated within one context. Cases
like that were not foreseen in the original design of aset.c, and indeed
didn't arise until fairly recently. But such problems can now occur in
reorderbuffer.c and in hash joining, both of which make repeated large
requests without scaling up their request size as they do so, and which
will free their requests in not-necessarily-LIFO order.
To fix, change the block list from singly-linked to doubly-linked.
This adds another 4 or 8 bytes to ALLOC_BLOCKHDRSZ, but that doesn't
seem like unacceptable overhead, since aset.c's blocks are normally
8K or more, and never less than 1K in current practice.
In passing, get rid of some redundant AllocChunkGetPointer() calls in
AllocSetRealloc (the compiler might be smart enough to optimize these
away anyway, but no need to assume that) and improve AllocSetCheck's
checking of block header fields.
Back-patch to 9.4 where reorderbuffer.c appeared. We could take this
further back, but currently there's no evidence that it would be useful.
Discussion: https://postgr.es/m/CAMkU=1x1hvue1XYrZoWk_omG0Ja5nBvTdvgrOeVkkeqs71CV8g@mail.gmail.com
2017-03-08 18:21:12 +01:00
|
|
|
block->prev = NULL;
|
2000-07-12 04:37:39 +02:00
|
|
|
block->next = NULL;
|
2000-06-28 05:33:33 +02:00
|
|
|
}
|
|
|
|
else
|
|
|
|
{
|
|
|
|
/* Normal case, release the block */
|
2020-03-19 19:20:39 +01:00
|
|
|
context->mem_allocated -= block->endptr - ((char *) block);
|
2019-10-01 03:13:39 +02:00
|
|
|
|
2000-06-28 05:33:33 +02:00
|
|
|
#ifdef CLOBBER_FREED_MEMORY
|
2013-06-27 01:56:03 +02:00
|
|
|
wipe_mem(block, block->freeptr - ((char *) block));
|
2000-06-28 05:33:33 +02:00
|
|
|
#endif
|
|
|
|
free(block);
|
|
|
|
}
|
1999-02-06 17:50:34 +01:00
|
|
|
block = next;
|
|
|
|
}
|
2005-09-01 20:15:42 +02:00
|
|
|
|
2020-03-19 19:20:39 +01:00
|
|
|
Assert(context->mem_allocated == keepersize);
|
2019-10-01 03:13:39 +02:00
|
|
|
|
2006-11-08 20:27:24 +01:00
|
|
|
/* Reset block size allocation sequence, too */
|
|
|
|
set->nextBlockSize = set->initBlockSize;
|
1996-07-09 08:22:35 +02:00
|
|
|
}
|
1997-09-07 07:04:48 +02:00
|
|
|
|
1996-07-09 08:22:35 +02:00
|
|
|
/*
|
2000-06-28 05:33:33 +02:00
|
|
|
* AllocSetDelete
|
|
|
|
* Frees all memory which is allocated in the given set,
|
|
|
|
* in preparation for deletion of the set.
|
1996-07-09 08:22:35 +02:00
|
|
|
*
|
2000-06-28 05:33:33 +02:00
|
|
|
* Unlike AllocSetReset, this *must* free all resources of the set.
|
1996-07-09 08:22:35 +02:00
|
|
|
*/
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
void
|
2000-06-28 05:33:33 +02:00
|
|
|
AllocSetDelete(MemoryContext context)
|
1996-07-09 08:22:35 +02:00
|
|
|
{
|
2000-06-28 05:33:33 +02:00
|
|
|
AllocSet set = (AllocSet) context;
|
|
|
|
AllocBlock block = set->blocks;
|
Harden memory context allocators against bogus chunk pointers.
Before commit c6e0fe1f2, functions such as AllocSetFree could pretty
safely presume that they were given a valid chunk pointer for their
own type of context, because the indirect call through a memory
context object and method struct would be very unlikely to work
otherwise. But now, if pfree() is mistakenly invoked on a pointer
to garbage, we have three chances in eight of ending up at one of
these functions. That means we need to take extra measures to
verify that we are looking at what we're supposed to be looking at,
especially in debug builds.
Hence, add code to verify that the chunk's back-link to a block header
leads to a memory context object that satisfies the right sort of
IsA() check. This is still a bit weaker than what we did before,
but for the moment assume that an IsA() check is sufficient.
As a compromise between speed and safety, implement these checks
as Asserts when dealing with small chunks but plain test-and-elogs
when dealing with large (external) chunks. The latter case should
not be too performance-critical, but the former case probably is.
In slab.c, all chunks are small; but nonetheless use a plain test
in SlabRealloc, because that is certainly not performance-critical,
indeed we should be suspicious that it's being called in error.
In aset.c, additionally add some assertions that the "value" field
of the chunk header is within the small range allowed for freelist
indexes. Without that, we might find ourselves trying to wipe
most of memory when CLOBBER_FREED_MEMORY is enabled, or scribbling
on a "freelist header" that's far away from the context object.
Eventually, field experience might show us that it's smarter for
these tests to be active always, but for now we'll try to get
away with just having them as assertions.
While at it, also be more uniform about asserting that context
objects passed as parameters are of the type we expect. Some
places missed that altogether, and slab.c was for no very good
reason doing it differently from the other allocators.
Discussion: https://postgr.es/m/3578387.1665244345@sss.pgh.pa.us
2022-10-11 00:45:34 +02:00
|
|
|
Size keepersize PG_USED_FOR_ASSERTS_ONLY;
|
2000-06-28 05:33:33 +02:00
|
|
|
|
2022-10-28 09:19:06 +02:00
|
|
|
Assert(AllocSetIsValid(set));
|
1997-09-07 07:04:48 +02:00
|
|
|
|
2000-12-01 06:16:45 +01:00
|
|
|
#ifdef MEMORY_CONTEXT_CHECKING
|
|
|
|
/* Check for corruption and leaks before freeing */
|
|
|
|
AllocSetCheck(context);
|
|
|
|
#endif
|
|
|
|
|
Harden memory context allocators against bogus chunk pointers.
Before commit c6e0fe1f2, functions such as AllocSetFree could pretty
safely presume that they were given a valid chunk pointer for their
own type of context, because the indirect call through a memory
context object and method struct would be very unlikely to work
otherwise. But now, if pfree() is mistakenly invoked on a pointer
to garbage, we have three chances in eight of ending up at one of
these functions. That means we need to take extra measures to
verify that we are looking at what we're supposed to be looking at,
especially in debug builds.
Hence, add code to verify that the chunk's back-link to a block header
leads to a memory context object that satisfies the right sort of
IsA() check. This is still a bit weaker than what we did before,
but for the moment assume that an IsA() check is sufficient.
As a compromise between speed and safety, implement these checks
as Asserts when dealing with small chunks but plain test-and-elogs
when dealing with large (external) chunks. The latter case should
not be too performance-critical, but the former case probably is.
In slab.c, all chunks are small; but nonetheless use a plain test
in SlabRealloc, because that is certainly not performance-critical,
indeed we should be suspicious that it's being called in error.
In aset.c, additionally add some assertions that the "value" field
of the chunk header is within the small range allowed for freelist
indexes. Without that, we might find ourselves trying to wipe
most of memory when CLOBBER_FREED_MEMORY is enabled, or scribbling
on a "freelist header" that's far away from the context object.
Eventually, field experience might show us that it's smarter for
these tests to be active always, but for now we'll try to get
away with just having them as assertions.
While at it, also be more uniform about asserting that context
objects passed as parameters are of the type we expect. Some
places missed that altogether, and slab.c was for no very good
reason doing it differently from the other allocators.
Discussion: https://postgr.es/m/3578387.1665244345@sss.pgh.pa.us
2022-10-11 00:45:34 +02:00
|
|
|
/* Remember keeper block size for Assert below */
|
2023-07-17 01:16:56 +02:00
|
|
|
keepersize = KeeperBlock(set)->endptr - ((char *) set);
|
Harden memory context allocators against bogus chunk pointers.
Before commit c6e0fe1f2, functions such as AllocSetFree could pretty
safely presume that they were given a valid chunk pointer for their
own type of context, because the indirect call through a memory
context object and method struct would be very unlikely to work
otherwise. But now, if pfree() is mistakenly invoked on a pointer
to garbage, we have three chances in eight of ending up at one of
these functions. That means we need to take extra measures to
verify that we are looking at what we're supposed to be looking at,
especially in debug builds.
Hence, add code to verify that the chunk's back-link to a block header
leads to a memory context object that satisfies the right sort of
IsA() check. This is still a bit weaker than what we did before,
but for the moment assume that an IsA() check is sufficient.
As a compromise between speed and safety, implement these checks
as Asserts when dealing with small chunks but plain test-and-elogs
when dealing with large (external) chunks. The latter case should
not be too performance-critical, but the former case probably is.
In slab.c, all chunks are small; but nonetheless use a plain test
in SlabRealloc, because that is certainly not performance-critical,
indeed we should be suspicious that it's being called in error.
In aset.c, additionally add some assertions that the "value" field
of the chunk header is within the small range allowed for freelist
indexes. Without that, we might find ourselves trying to wipe
most of memory when CLOBBER_FREED_MEMORY is enabled, or scribbling
on a "freelist header" that's far away from the context object.
Eventually, field experience might show us that it's smarter for
these tests to be active always, but for now we'll try to get
away with just having them as assertions.
While at it, also be more uniform about asserting that context
objects passed as parameters are of the type we expect. Some
places missed that altogether, and slab.c was for no very good
reason doing it differently from the other allocators.
Discussion: https://postgr.es/m/3578387.1665244345@sss.pgh.pa.us
2022-10-11 00:45:34 +02:00
|
|
|
|
Rethink MemoryContext creation to improve performance.
This patch makes a number of interrelated changes to reduce the overhead
involved in creating/deleting memory contexts. The key ideas are:
* Include the AllocSetContext header of an aset.c context in its first
malloc request, rather than allocating it separately in TopMemoryContext.
This means that we now always create an initial or "keeper" block in an
aset, even if it never receives any allocation requests.
* Create freelists in which we can save and recycle recently-destroyed
asets (this idea is due to Robert Haas).
* In the common case where the name of a context is a constant string,
just store a pointer to it in the context header, rather than copying
the string.
The first change eliminates a palloc/pfree cycle per context, and
also avoids bloat in TopMemoryContext, at the price that creating
a context now involves a malloc/free cycle even if the context never
receives any allocations. That would be a loser for some common
usage patterns, but recycling short-lived contexts via the freelist
eliminates that pain.
Avoiding copying constant strings not only saves strlen() and strcpy()
overhead, but is an essential part of the freelist optimization because
it makes the context header size constant. Currently we make no
attempt to use the freelist for contexts with non-constant names.
(Perhaps someday we'll need to think harder about that, but in current
usage, most contexts with custom names are long-lived anyway.)
The freelist management in this initial commit is pretty simplistic,
and we might want to refine it later --- but in common workloads that
will never matter because the freelists will never get full anyway.
To create a context with a non-constant name, one is now required to
call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME
option. AllocSetContextCreate becomes a wrapper macro, and it includes
a test that will complain about non-string-literal context name
parameters on gcc and similar compilers.
An unfortunate side effect of making AllocSetContextCreate a macro is
that one is now *required* to use the size parameter abstraction macros
(ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of
writing out individual size parameters no longer works unless you
switch to AllocSetContextCreateExtended.
Internally to the memory-context-related modules, the context creation
APIs are simplified, removing the rather baroque original design whereby
a context-type module called mcxt.c which then called back into the
context-type module. That saved a bit of code duplication, but not much,
and it prevented context-type modules from exercising control over the
allocation of context headers.
In passing, I converted the test-and-elog validation of aset size
parameters into Asserts to save a few more cycles. The original thought
was that callers might compute size parameters on the fly, but in practice
nobody does that, so it's useless to expend cycles on checking those
numbers in production builds.
Also, mark the memory context method-pointer structs "const",
just for cleanliness.
Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
2017-12-13 19:55:12 +01:00
|
|
|
/*
|
|
|
|
* If the context is a candidate for a freelist, put it into that freelist
|
|
|
|
* instead of destroying it.
|
|
|
|
*/
|
|
|
|
if (set->freeListIndex >= 0)
|
|
|
|
{
|
|
|
|
AllocSetFreeList *freelist = &context_freelists[set->freeListIndex];
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Reset the context, if it needs it, so that we aren't hanging on to
|
|
|
|
* more than the initial malloc chunk.
|
|
|
|
*/
|
|
|
|
if (!context->isReset)
|
|
|
|
MemoryContextResetOnly(context);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* If the freelist is full, just discard what's already in it. See
|
|
|
|
* comments with context_freelists[].
|
|
|
|
*/
|
|
|
|
if (freelist->num_free >= MAX_FREE_CONTEXTS)
|
|
|
|
{
|
|
|
|
while (freelist->first_free != NULL)
|
|
|
|
{
|
|
|
|
AllocSetContext *oldset = freelist->first_free;
|
|
|
|
|
|
|
|
freelist->first_free = (AllocSetContext *) oldset->header.nextchild;
|
|
|
|
freelist->num_free--;
|
|
|
|
|
|
|
|
/* All that remains is to free the header/initial block */
|
|
|
|
free(oldset);
|
|
|
|
}
|
|
|
|
Assert(freelist->num_free == 0);
|
|
|
|
}
|
|
|
|
|
|
|
|
/* Now add the just-deleted context to the freelist. */
|
|
|
|
set->header.nextchild = (MemoryContext) freelist->first_free;
|
|
|
|
freelist->first_free = set;
|
|
|
|
freelist->num_free++;
|
|
|
|
|
|
|
|
return;
|
|
|
|
}
|
2001-01-12 22:54:01 +01:00
|
|
|
|
Rethink MemoryContext creation to improve performance.
This patch makes a number of interrelated changes to reduce the overhead
involved in creating/deleting memory contexts. The key ideas are:
* Include the AllocSetContext header of an aset.c context in its first
malloc request, rather than allocating it separately in TopMemoryContext.
This means that we now always create an initial or "keeper" block in an
aset, even if it never receives any allocation requests.
* Create freelists in which we can save and recycle recently-destroyed
asets (this idea is due to Robert Haas).
* In the common case where the name of a context is a constant string,
just store a pointer to it in the context header, rather than copying
the string.
The first change eliminates a palloc/pfree cycle per context, and
also avoids bloat in TopMemoryContext, at the price that creating
a context now involves a malloc/free cycle even if the context never
receives any allocations. That would be a loser for some common
usage patterns, but recycling short-lived contexts via the freelist
eliminates that pain.
Avoiding copying constant strings not only saves strlen() and strcpy()
overhead, but is an essential part of the freelist optimization because
it makes the context header size constant. Currently we make no
attempt to use the freelist for contexts with non-constant names.
(Perhaps someday we'll need to think harder about that, but in current
usage, most contexts with custom names are long-lived anyway.)
The freelist management in this initial commit is pretty simplistic,
and we might want to refine it later --- but in common workloads that
will never matter because the freelists will never get full anyway.
To create a context with a non-constant name, one is now required to
call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME
option. AllocSetContextCreate becomes a wrapper macro, and it includes
a test that will complain about non-string-literal context name
parameters on gcc and similar compilers.
An unfortunate side effect of making AllocSetContextCreate a macro is
that one is now *required* to use the size parameter abstraction macros
(ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of
writing out individual size parameters no longer works unless you
switch to AllocSetContextCreateExtended.
Internally to the memory-context-related modules, the context creation
APIs are simplified, removing the rather baroque original design whereby
a context-type module called mcxt.c which then called back into the
context-type module. That saved a bit of code duplication, but not much,
and it prevented context-type modules from exercising control over the
allocation of context headers.
In passing, I converted the test-and-elog validation of aset size
parameters into Asserts to save a few more cycles. The original thought
was that callers might compute size parameters on the fly, but in practice
nobody does that, so it's useless to expend cycles on checking those
numbers in production builds.
Also, mark the memory context method-pointer structs "const",
just for cleanliness.
Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
2017-12-13 19:55:12 +01:00
|
|
|
/* Free all blocks, except the keeper which is part of context header */
|
2000-06-28 05:33:33 +02:00
|
|
|
while (block != NULL)
|
|
|
|
{
|
|
|
|
AllocBlock next = block->next;
|
|
|
|
|
2023-07-17 01:16:56 +02:00
|
|
|
if (!IsKeeperBlock(set, block))
|
2020-03-19 19:20:39 +01:00
|
|
|
context->mem_allocated -= block->endptr - ((char *) block);
|
2019-10-01 03:13:39 +02:00
|
|
|
|
2000-06-28 05:33:33 +02:00
|
|
|
#ifdef CLOBBER_FREED_MEMORY
|
2013-06-27 01:56:03 +02:00
|
|
|
wipe_mem(block, block->freeptr - ((char *) block));
|
2000-06-28 05:33:33 +02:00
|
|
|
#endif
|
Rethink MemoryContext creation to improve performance.
This patch makes a number of interrelated changes to reduce the overhead
involved in creating/deleting memory contexts. The key ideas are:
* Include the AllocSetContext header of an aset.c context in its first
malloc request, rather than allocating it separately in TopMemoryContext.
This means that we now always create an initial or "keeper" block in an
aset, even if it never receives any allocation requests.
* Create freelists in which we can save and recycle recently-destroyed
asets (this idea is due to Robert Haas).
* In the common case where the name of a context is a constant string,
just store a pointer to it in the context header, rather than copying
the string.
The first change eliminates a palloc/pfree cycle per context, and
also avoids bloat in TopMemoryContext, at the price that creating
a context now involves a malloc/free cycle even if the context never
receives any allocations. That would be a loser for some common
usage patterns, but recycling short-lived contexts via the freelist
eliminates that pain.
Avoiding copying constant strings not only saves strlen() and strcpy()
overhead, but is an essential part of the freelist optimization because
it makes the context header size constant. Currently we make no
attempt to use the freelist for contexts with non-constant names.
(Perhaps someday we'll need to think harder about that, but in current
usage, most contexts with custom names are long-lived anyway.)
The freelist management in this initial commit is pretty simplistic,
and we might want to refine it later --- but in common workloads that
will never matter because the freelists will never get full anyway.
To create a context with a non-constant name, one is now required to
call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME
option. AllocSetContextCreate becomes a wrapper macro, and it includes
a test that will complain about non-string-literal context name
parameters on gcc and similar compilers.
An unfortunate side effect of making AllocSetContextCreate a macro is
that one is now *required* to use the size parameter abstraction macros
(ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of
writing out individual size parameters no longer works unless you
switch to AllocSetContextCreateExtended.
Internally to the memory-context-related modules, the context creation
APIs are simplified, removing the rather baroque original design whereby
a context-type module called mcxt.c which then called back into the
context-type module. That saved a bit of code duplication, but not much,
and it prevented context-type modules from exercising control over the
allocation of context headers.
In passing, I converted the test-and-elog validation of aset size
parameters into Asserts to save a few more cycles. The original thought
was that callers might compute size parameters on the fly, but in practice
nobody does that, so it's useless to expend cycles on checking those
numbers in production builds.
Also, mark the memory context method-pointer structs "const",
just for cleanliness.
Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
2017-12-13 19:55:12 +01:00
|
|
|
|
2023-07-17 01:16:56 +02:00
|
|
|
if (!IsKeeperBlock(set, block))
|
Rethink MemoryContext creation to improve performance.
This patch makes a number of interrelated changes to reduce the overhead
involved in creating/deleting memory contexts. The key ideas are:
* Include the AllocSetContext header of an aset.c context in its first
malloc request, rather than allocating it separately in TopMemoryContext.
This means that we now always create an initial or "keeper" block in an
aset, even if it never receives any allocation requests.
* Create freelists in which we can save and recycle recently-destroyed
asets (this idea is due to Robert Haas).
* In the common case where the name of a context is a constant string,
just store a pointer to it in the context header, rather than copying
the string.
The first change eliminates a palloc/pfree cycle per context, and
also avoids bloat in TopMemoryContext, at the price that creating
a context now involves a malloc/free cycle even if the context never
receives any allocations. That would be a loser for some common
usage patterns, but recycling short-lived contexts via the freelist
eliminates that pain.
Avoiding copying constant strings not only saves strlen() and strcpy()
overhead, but is an essential part of the freelist optimization because
it makes the context header size constant. Currently we make no
attempt to use the freelist for contexts with non-constant names.
(Perhaps someday we'll need to think harder about that, but in current
usage, most contexts with custom names are long-lived anyway.)
The freelist management in this initial commit is pretty simplistic,
and we might want to refine it later --- but in common workloads that
will never matter because the freelists will never get full anyway.
To create a context with a non-constant name, one is now required to
call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME
option. AllocSetContextCreate becomes a wrapper macro, and it includes
a test that will complain about non-string-literal context name
parameters on gcc and similar compilers.
An unfortunate side effect of making AllocSetContextCreate a macro is
that one is now *required* to use the size parameter abstraction macros
(ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of
writing out individual size parameters no longer works unless you
switch to AllocSetContextCreateExtended.
Internally to the memory-context-related modules, the context creation
APIs are simplified, removing the rather baroque original design whereby
a context-type module called mcxt.c which then called back into the
context-type module. That saved a bit of code duplication, but not much,
and it prevented context-type modules from exercising control over the
allocation of context headers.
In passing, I converted the test-and-elog validation of aset size
parameters into Asserts to save a few more cycles. The original thought
was that callers might compute size parameters on the fly, but in practice
nobody does that, so it's useless to expend cycles on checking those
numbers in production builds.
Also, mark the memory context method-pointer structs "const",
just for cleanliness.
Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
2017-12-13 19:55:12 +01:00
|
|
|
free(block);
|
|
|
|
|
2000-06-28 05:33:33 +02:00
|
|
|
block = next;
|
|
|
|
}
|
Rethink MemoryContext creation to improve performance.
This patch makes a number of interrelated changes to reduce the overhead
involved in creating/deleting memory contexts. The key ideas are:
* Include the AllocSetContext header of an aset.c context in its first
malloc request, rather than allocating it separately in TopMemoryContext.
This means that we now always create an initial or "keeper" block in an
aset, even if it never receives any allocation requests.
* Create freelists in which we can save and recycle recently-destroyed
asets (this idea is due to Robert Haas).
* In the common case where the name of a context is a constant string,
just store a pointer to it in the context header, rather than copying
the string.
The first change eliminates a palloc/pfree cycle per context, and
also avoids bloat in TopMemoryContext, at the price that creating
a context now involves a malloc/free cycle even if the context never
receives any allocations. That would be a loser for some common
usage patterns, but recycling short-lived contexts via the freelist
eliminates that pain.
Avoiding copying constant strings not only saves strlen() and strcpy()
overhead, but is an essential part of the freelist optimization because
it makes the context header size constant. Currently we make no
attempt to use the freelist for contexts with non-constant names.
(Perhaps someday we'll need to think harder about that, but in current
usage, most contexts with custom names are long-lived anyway.)
The freelist management in this initial commit is pretty simplistic,
and we might want to refine it later --- but in common workloads that
will never matter because the freelists will never get full anyway.
To create a context with a non-constant name, one is now required to
call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME
option. AllocSetContextCreate becomes a wrapper macro, and it includes
a test that will complain about non-string-literal context name
parameters on gcc and similar compilers.
An unfortunate side effect of making AllocSetContextCreate a macro is
that one is now *required* to use the size parameter abstraction macros
(ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of
writing out individual size parameters no longer works unless you
switch to AllocSetContextCreateExtended.
Internally to the memory-context-related modules, the context creation
APIs are simplified, removing the rather baroque original design whereby
a context-type module called mcxt.c which then called back into the
context-type module. That saved a bit of code duplication, but not much,
and it prevented context-type modules from exercising control over the
allocation of context headers.
In passing, I converted the test-and-elog validation of aset size
parameters into Asserts to save a few more cycles. The original thought
was that callers might compute size parameters on the fly, but in practice
nobody does that, so it's useless to expend cycles on checking those
numbers in production builds.
Also, mark the memory context method-pointer structs "const",
just for cleanliness.
Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
2017-12-13 19:55:12 +01:00
|
|
|
|
2020-03-19 19:20:39 +01:00
|
|
|
Assert(context->mem_allocated == keepersize);
|
2019-10-01 03:13:39 +02:00
|
|
|
|
Rethink MemoryContext creation to improve performance.
This patch makes a number of interrelated changes to reduce the overhead
involved in creating/deleting memory contexts. The key ideas are:
* Include the AllocSetContext header of an aset.c context in its first
malloc request, rather than allocating it separately in TopMemoryContext.
This means that we now always create an initial or "keeper" block in an
aset, even if it never receives any allocation requests.
* Create freelists in which we can save and recycle recently-destroyed
asets (this idea is due to Robert Haas).
* In the common case where the name of a context is a constant string,
just store a pointer to it in the context header, rather than copying
the string.
The first change eliminates a palloc/pfree cycle per context, and
also avoids bloat in TopMemoryContext, at the price that creating
a context now involves a malloc/free cycle even if the context never
receives any allocations. That would be a loser for some common
usage patterns, but recycling short-lived contexts via the freelist
eliminates that pain.
Avoiding copying constant strings not only saves strlen() and strcpy()
overhead, but is an essential part of the freelist optimization because
it makes the context header size constant. Currently we make no
attempt to use the freelist for contexts with non-constant names.
(Perhaps someday we'll need to think harder about that, but in current
usage, most contexts with custom names are long-lived anyway.)
The freelist management in this initial commit is pretty simplistic,
and we might want to refine it later --- but in common workloads that
will never matter because the freelists will never get full anyway.
To create a context with a non-constant name, one is now required to
call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME
option. AllocSetContextCreate becomes a wrapper macro, and it includes
a test that will complain about non-string-literal context name
parameters on gcc and similar compilers.
An unfortunate side effect of making AllocSetContextCreate a macro is
that one is now *required* to use the size parameter abstraction macros
(ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of
writing out individual size parameters no longer works unless you
switch to AllocSetContextCreateExtended.
Internally to the memory-context-related modules, the context creation
APIs are simplified, removing the rather baroque original design whereby
a context-type module called mcxt.c which then called back into the
context-type module. That saved a bit of code duplication, but not much,
and it prevented context-type modules from exercising control over the
allocation of context headers.
In passing, I converted the test-and-elog validation of aset size
parameters into Asserts to save a few more cycles. The original thought
was that callers might compute size parameters on the fly, but in practice
nobody does that, so it's useless to expend cycles on checking those
numbers in production builds.
Also, mark the memory context method-pointer structs "const",
just for cleanliness.
Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
2017-12-13 19:55:12 +01:00
|
|
|
/* Finally, free the context header, including the keeper block */
|
|
|
|
free(set);
|
1996-07-09 08:22:35 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
2024-02-28 02:20:43 +01:00
|
|
|
* Helper for AllocSetAlloc() that allocates an entire block for the chunk.
|
2017-11-25 01:28:19 +01:00
|
|
|
*
|
2024-02-28 02:20:43 +01:00
|
|
|
* AllocSetAlloc()'s comment explains why this is separate.
|
1996-07-09 08:22:35 +02:00
|
|
|
*/
|
2024-02-28 02:20:43 +01:00
|
|
|
pg_noinline
|
|
|
|
static void *
|
|
|
|
AllocSetAllocLarge(MemoryContext context, Size size, int flags)
|
1996-07-09 08:22:35 +02:00
|
|
|
{
|
2000-06-28 05:33:33 +02:00
|
|
|
AllocSet set = (AllocSet) context;
|
1999-02-06 17:50:34 +01:00
|
|
|
AllocBlock block;
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
MemoryChunk *chunk;
|
1999-02-06 17:50:34 +01:00
|
|
|
Size chunk_size;
|
1999-05-23 01:19:37 +02:00
|
|
|
Size blksize;
|
1997-09-07 07:04:48 +02:00
|
|
|
|
2024-02-28 02:20:43 +01:00
|
|
|
/* validate 'size' is within the limits for the given 'flags' */
|
|
|
|
MemoryContextCheckSize(context, size, flags);
|
|
|
|
|
|
|
|
#ifdef MEMORY_CONTEXT_CHECKING
|
|
|
|
/* ensure there's always space for the sentinel byte */
|
|
|
|
chunk_size = MAXALIGN(size + 1);
|
|
|
|
#else
|
|
|
|
chunk_size = MAXALIGN(size);
|
|
|
|
#endif
|
|
|
|
|
|
|
|
blksize = chunk_size + ALLOC_BLOCKHDRSZ + ALLOC_CHUNKHDRSZ;
|
|
|
|
block = (AllocBlock) malloc(blksize);
|
|
|
|
if (block == NULL)
|
|
|
|
return MemoryContextAllocationFailure(context, size, flags);
|
|
|
|
|
|
|
|
context->mem_allocated += blksize;
|
|
|
|
|
|
|
|
block->aset = set;
|
|
|
|
block->freeptr = block->endptr = ((char *) block) + blksize;
|
|
|
|
|
|
|
|
chunk = (MemoryChunk *) (((char *) block) + ALLOC_BLOCKHDRSZ);
|
|
|
|
|
|
|
|
/* mark the MemoryChunk as externally managed */
|
|
|
|
MemoryChunkSetHdrMaskExternal(chunk, MCTX_ASET_ID);
|
|
|
|
|
|
|
|
#ifdef MEMORY_CONTEXT_CHECKING
|
|
|
|
chunk->requested_size = size;
|
|
|
|
/* set mark to catch clobber of "unused" space */
|
|
|
|
Assert(size < chunk_size);
|
|
|
|
set_sentinel(MemoryChunkGetPointer(chunk), size);
|
|
|
|
#endif
|
|
|
|
#ifdef RANDOMIZE_ALLOCATED_MEMORY
|
|
|
|
/* fill the allocated space with junk */
|
|
|
|
randomize_mem((char *) MemoryChunkGetPointer(chunk), size);
|
|
|
|
#endif
|
1997-09-07 07:04:48 +02:00
|
|
|
|
1999-02-06 17:50:34 +01:00
|
|
|
/*
|
2024-02-28 02:20:43 +01:00
|
|
|
* Stick the new block underneath the active allocation block, if any, so
|
|
|
|
* that we don't lose the use of the space remaining therein.
|
2000-12-01 19:14:29 +01:00
|
|
|
*/
|
2024-02-28 02:20:43 +01:00
|
|
|
if (set->blocks != NULL)
|
|
|
|
{
|
|
|
|
block->prev = set->blocks;
|
|
|
|
block->next = set->blocks->next;
|
|
|
|
if (block->next)
|
|
|
|
block->next->prev = block;
|
|
|
|
set->blocks->next = block;
|
|
|
|
}
|
|
|
|
else
|
2000-12-01 19:14:29 +01:00
|
|
|
{
|
2024-02-28 02:20:43 +01:00
|
|
|
block->prev = NULL;
|
|
|
|
block->next = NULL;
|
|
|
|
set->blocks = block;
|
|
|
|
}
|
2024-02-27 04:39:42 +01:00
|
|
|
|
2024-02-28 02:20:43 +01:00
|
|
|
/* Ensure any padding bytes are marked NOACCESS. */
|
|
|
|
VALGRIND_MAKE_MEM_NOACCESS((char *) MemoryChunkGetPointer(chunk) + size,
|
|
|
|
chunk_size - size);
|
Make more effort to put a sentinel at the end of allocated memory
Traditionally, in MEMORY_CONTEXT_CHECKING builds, we only ever marked a
sentinel byte just beyond the requested size if there happened to be
enough space on the chunk to do so. For Slab and Generation context
types, we only rounded the size of the chunk up to the next maxalign
boundary, so it was often not that likely that those would ever have space
for the sentinel given that the majority of allocation requests are going
to be for sizes which are maxaligned. For AllocSet, it was a little
different as smaller allocations are rounded up to the next power-of-2
value rather than the next maxalign boundary, so we're a bit more likely
to have space for the sentinel byte, especially when we get away from tiny
sized allocations such as 8 or 16 bytes.
Here we make more of an effort to allow space so that there is enough room
for the sentinel byte in more cases. This makes it more likely that we'll
detect when buggy code accidentally writes beyond the end of any of its
memory allocations.
Each of the 3 MemoryContext types has been changed as follows:
The Slab allocator will now always set a sentinel byte. Both the current
usages of this MemoryContext type happen to use chunk sizes which were on
the maxalign boundary, so these never used sentinel bytes previously.
For the Generation allocator, we now always ensure there's enough space in
the allocation for a sentinel byte.
For AllocSet, this commit makes an adjustment for allocation sizes which
are greater than allocChunkLimit. We now ensure there is always space for
a sentinel byte. We don't alter the sentinel behavior for request sizes
<= allocChunkLimit. Making way for the sentinel byte for power-of-2
request sizes would require doubling up to the next power of 2. Some
analysis done on the request sizes made during installcheck shows that a
fairly large portion of allocation requests are for power-of-2 sizes. The
amount of additional memory for the sentinel there seems prohibitive, so
we do nothing for those here.
Author: David Rowley
Discussion: https://postgr.es/m/3478405.1661824539@sss.pgh.pa.us
2022-09-07 05:46:57 +02:00
|
|
|
|
2024-02-28 02:20:43 +01:00
|
|
|
/* Disallow access to the chunk header. */
|
|
|
|
VALGRIND_MAKE_MEM_NOACCESS(chunk, ALLOC_CHUNKHDRSZ);
|
2019-10-01 03:13:39 +02:00
|
|
|
|
2024-02-28 02:20:43 +01:00
|
|
|
return MemoryChunkGetPointer(chunk);
|
|
|
|
}
|
2019-10-01 03:13:39 +02:00
|
|
|
|
2024-02-28 02:20:43 +01:00
|
|
|
/*
|
|
|
|
* Small helper for allocating a new chunk from a chunk, to avoid duplicating
|
|
|
|
* the code between AllocSetAlloc() and AllocSetAllocFromNewBlock().
|
|
|
|
*/
|
|
|
|
static inline void *
|
|
|
|
AllocSetAllocChunkFromBlock(MemoryContext context, AllocBlock block,
|
|
|
|
Size size, Size chunk_size, int fidx)
|
|
|
|
{
|
|
|
|
MemoryChunk *chunk;
|
2000-12-01 19:14:29 +01:00
|
|
|
|
2024-02-28 02:20:43 +01:00
|
|
|
chunk = (MemoryChunk *) (block->freeptr);
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
|
2024-02-28 02:20:43 +01:00
|
|
|
/* Prepare to initialize the chunk header. */
|
|
|
|
VALGRIND_MAKE_MEM_UNDEFINED(chunk, ALLOC_CHUNKHDRSZ);
|
|
|
|
|
|
|
|
block->freeptr += (chunk_size + ALLOC_CHUNKHDRSZ);
|
|
|
|
Assert(block->freeptr <= block->endptr);
|
|
|
|
|
|
|
|
/* store the free list index in the value field */
|
|
|
|
MemoryChunkSetHdrMask(chunk, block, fidx, MCTX_ASET_ID);
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
|
2000-12-01 19:14:29 +01:00
|
|
|
#ifdef MEMORY_CONTEXT_CHECKING
|
2024-02-28 02:20:43 +01:00
|
|
|
chunk->requested_size = size;
|
|
|
|
/* set mark to catch clobber of "unused" space */
|
|
|
|
if (size < chunk_size)
|
Make more effort to put a sentinel at the end of allocated memory
Traditionally, in MEMORY_CONTEXT_CHECKING builds, we only ever marked a
sentinel byte just beyond the requested size if there happened to be
enough space on the chunk to do so. For Slab and Generation context
types, we only rounded the size of the chunk up to the next maxalign
boundary, so it was often not that likely that those would ever have space
for the sentinel given that the majority of allocation requests are going
to be for sizes which are maxaligned. For AllocSet, it was a little
different as smaller allocations are rounded up to the next power-of-2
value rather than the next maxalign boundary, so we're a bit more likely
to have space for the sentinel byte, especially when we get away from tiny
sized allocations such as 8 or 16 bytes.
Here we make more of an effort to allow space so that there is enough room
for the sentinel byte in more cases. This makes it more likely that we'll
detect when buggy code accidentally writes beyond the end of any of its
memory allocations.
Each of the 3 MemoryContext types has been changed as follows:
The Slab allocator will now always set a sentinel byte. Both the current
usages of this MemoryContext type happen to use chunk sizes which were on
the maxalign boundary, so these never used sentinel bytes previously.
For the Generation allocator, we now always ensure there's enough space in
the allocation for a sentinel byte.
For AllocSet, this commit makes an adjustment for allocation sizes which
are greater than allocChunkLimit. We now ensure there is always space for
a sentinel byte. We don't alter the sentinel behavior for request sizes
<= allocChunkLimit. Making way for the sentinel byte for power-of-2
request sizes would require doubling up to the next power of 2. Some
analysis done on the request sizes made during installcheck shows that a
fairly large portion of allocation requests are for power-of-2 sizes. The
amount of additional memory for the sentinel there seems prohibitive, so
we do nothing for those here.
Author: David Rowley
Discussion: https://postgr.es/m/3478405.1661824539@sss.pgh.pa.us
2022-09-07 05:46:57 +02:00
|
|
|
set_sentinel(MemoryChunkGetPointer(chunk), size);
|
2000-12-01 19:14:29 +01:00
|
|
|
#endif
|
2008-04-12 00:54:23 +02:00
|
|
|
#ifdef RANDOMIZE_ALLOCATED_MEMORY
|
2024-02-28 02:20:43 +01:00
|
|
|
/* fill the allocated space with junk */
|
|
|
|
randomize_mem((char *) MemoryChunkGetPointer(chunk), size);
|
2008-04-12 00:54:23 +02:00
|
|
|
#endif
|
2000-12-01 19:14:29 +01:00
|
|
|
|
2024-02-28 02:20:43 +01:00
|
|
|
/* Ensure any padding bytes are marked NOACCESS. */
|
|
|
|
VALGRIND_MAKE_MEM_NOACCESS((char *) MemoryChunkGetPointer(chunk) + size,
|
|
|
|
chunk_size - size);
|
|
|
|
|
|
|
|
/* Disallow access to the chunk header. */
|
|
|
|
VALGRIND_MAKE_MEM_NOACCESS(chunk, ALLOC_CHUNKHDRSZ);
|
|
|
|
|
|
|
|
return MemoryChunkGetPointer(chunk);
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Helper for AllocSetAlloc() that allocates a new block and returns a chunk
|
|
|
|
* allocated from it.
|
|
|
|
*
|
|
|
|
* AllocSetAlloc()'s comment explains why this is separate.
|
|
|
|
*/
|
|
|
|
pg_noinline
|
|
|
|
static void *
|
|
|
|
AllocSetAllocFromNewBlock(MemoryContext context, Size size, int flags,
|
|
|
|
int fidx)
|
|
|
|
{
|
|
|
|
AllocSet set = (AllocSet) context;
|
|
|
|
AllocBlock block;
|
|
|
|
Size availspace;
|
|
|
|
Size blksize;
|
|
|
|
Size required_size;
|
|
|
|
Size chunk_size;
|
|
|
|
|
|
|
|
/* due to the keeper block set->blocks should always be valid */
|
|
|
|
Assert(set->blocks != NULL);
|
|
|
|
block = set->blocks;
|
|
|
|
availspace = block->endptr - block->freeptr;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* The existing active (top) block does not have enough room for the
|
|
|
|
* requested allocation, but it might still have a useful amount of space
|
|
|
|
* in it. Once we push it down in the block list, we'll never try to
|
|
|
|
* allocate more space from it. So, before we do that, carve up its free
|
|
|
|
* space into chunks that we can put on the set's freelists.
|
|
|
|
*
|
|
|
|
* Because we can only get here when there's less than ALLOC_CHUNK_LIMIT
|
|
|
|
* left in the block, this loop cannot iterate more than
|
|
|
|
* ALLOCSET_NUM_FREELISTS-1 times.
|
|
|
|
*/
|
|
|
|
while (availspace >= ((1 << ALLOC_MINBITS) + ALLOC_CHUNKHDRSZ))
|
|
|
|
{
|
|
|
|
AllocFreeListLink *link;
|
|
|
|
MemoryChunk *chunk;
|
|
|
|
Size availchunk = availspace - ALLOC_CHUNKHDRSZ;
|
|
|
|
int a_fidx = AllocSetFreeIndex(availchunk);
|
|
|
|
|
2000-12-01 19:14:29 +01:00
|
|
|
/*
|
2024-02-28 02:20:43 +01:00
|
|
|
* In most cases, we'll get back the index of the next larger freelist
|
|
|
|
* than the one we need to put this chunk on. The exception is when
|
|
|
|
* availchunk is exactly a power of 2.
|
2000-12-01 19:14:29 +01:00
|
|
|
*/
|
2024-02-28 02:20:43 +01:00
|
|
|
if (availchunk != GetChunkSizeFromFreeListIdx(a_fidx))
|
2000-12-01 19:14:29 +01:00
|
|
|
{
|
2024-02-28 02:20:43 +01:00
|
|
|
a_fidx--;
|
|
|
|
Assert(a_fidx >= 0);
|
|
|
|
availchunk = GetChunkSizeFromFreeListIdx(a_fidx);
|
2000-12-01 19:14:29 +01:00
|
|
|
}
|
2001-03-22 05:01:46 +01:00
|
|
|
|
2024-02-28 02:20:43 +01:00
|
|
|
chunk = (MemoryChunk *) (block->freeptr);
|
2017-11-25 01:28:19 +01:00
|
|
|
|
2024-02-28 02:20:43 +01:00
|
|
|
/* Prepare to initialize the chunk header. */
|
|
|
|
VALGRIND_MAKE_MEM_UNDEFINED(chunk, ALLOC_CHUNKHDRSZ);
|
|
|
|
block->freeptr += (availchunk + ALLOC_CHUNKHDRSZ);
|
|
|
|
availspace -= (availchunk + ALLOC_CHUNKHDRSZ);
|
2013-06-27 02:00:08 +02:00
|
|
|
|
2024-02-28 02:20:43 +01:00
|
|
|
/* store the freelist index in the value field */
|
|
|
|
MemoryChunkSetHdrMask(chunk, block, a_fidx, MCTX_ASET_ID);
|
|
|
|
#ifdef MEMORY_CONTEXT_CHECKING
|
|
|
|
chunk->requested_size = InvalidAllocSize; /* mark it free */
|
|
|
|
#endif
|
|
|
|
/* push this chunk onto the free list */
|
|
|
|
link = GetFreeListLink(chunk);
|
|
|
|
|
|
|
|
VALGRIND_MAKE_MEM_DEFINED(link, sizeof(AllocFreeListLink));
|
|
|
|
link->next = set->freelist[a_fidx];
|
|
|
|
VALGRIND_MAKE_MEM_NOACCESS(link, sizeof(AllocFreeListLink));
|
|
|
|
|
|
|
|
set->freelist[a_fidx] = chunk;
|
2000-12-01 19:14:29 +01:00
|
|
|
}
|
|
|
|
|
2024-02-28 02:20:43 +01:00
|
|
|
/*
|
|
|
|
* The first such block has size initBlockSize, and we double the space in
|
|
|
|
* each succeeding block, but not more than maxBlockSize.
|
|
|
|
*/
|
|
|
|
blksize = set->nextBlockSize;
|
|
|
|
set->nextBlockSize <<= 1;
|
|
|
|
if (set->nextBlockSize > set->maxBlockSize)
|
|
|
|
set->nextBlockSize = set->maxBlockSize;
|
|
|
|
|
|
|
|
/* Choose the actual chunk size to allocate */
|
|
|
|
chunk_size = GetChunkSizeFromFreeListIdx(fidx);
|
|
|
|
Assert(chunk_size >= size);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* If initBlockSize is less than ALLOC_CHUNK_LIMIT, we could need more
|
|
|
|
* space... but try to keep it a power of 2.
|
|
|
|
*/
|
|
|
|
required_size = chunk_size + ALLOC_BLOCKHDRSZ + ALLOC_CHUNKHDRSZ;
|
|
|
|
while (blksize < required_size)
|
|
|
|
blksize <<= 1;
|
|
|
|
|
|
|
|
/* Try to allocate it */
|
|
|
|
block = (AllocBlock) malloc(blksize);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* We could be asking for pretty big blocks here, so cope if malloc fails.
|
|
|
|
* But give up if there's less than 1 MB or so available...
|
|
|
|
*/
|
|
|
|
while (block == NULL && blksize > 1024 * 1024)
|
|
|
|
{
|
|
|
|
blksize >>= 1;
|
|
|
|
if (blksize < required_size)
|
|
|
|
break;
|
|
|
|
block = (AllocBlock) malloc(blksize);
|
|
|
|
}
|
|
|
|
|
|
|
|
if (block == NULL)
|
|
|
|
return MemoryContextAllocationFailure(context, size, flags);
|
|
|
|
|
|
|
|
context->mem_allocated += blksize;
|
|
|
|
|
|
|
|
block->aset = set;
|
|
|
|
block->freeptr = ((char *) block) + ALLOC_BLOCKHDRSZ;
|
|
|
|
block->endptr = ((char *) block) + blksize;
|
|
|
|
|
|
|
|
/* Mark unallocated space NOACCESS. */
|
|
|
|
VALGRIND_MAKE_MEM_NOACCESS(block->freeptr,
|
|
|
|
blksize - ALLOC_BLOCKHDRSZ);
|
|
|
|
|
|
|
|
block->prev = NULL;
|
|
|
|
block->next = set->blocks;
|
|
|
|
if (block->next)
|
|
|
|
block->next->prev = block;
|
|
|
|
set->blocks = block;
|
|
|
|
|
|
|
|
return AllocSetAllocChunkFromBlock(context, block, size, chunk_size, fidx);
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* AllocSetAlloc
|
2024-03-04 05:42:10 +01:00
|
|
|
* Returns a pointer to allocated memory of given size or raises an ERROR
|
|
|
|
* on allocation failure, or returns NULL when flags contains
|
|
|
|
* MCXT_ALLOC_NO_OOM.
|
2024-02-28 02:20:43 +01:00
|
|
|
*
|
|
|
|
* No request may exceed:
|
|
|
|
* MAXALIGN_DOWN(SIZE_MAX) - ALLOC_BLOCKHDRSZ - ALLOC_CHUNKHDRSZ
|
|
|
|
* All callers use a much-lower limit.
|
|
|
|
*
|
|
|
|
* Note: when using valgrind, it doesn't matter how the returned allocation
|
|
|
|
* is marked, as mcxt.c will set it to UNDEFINED. In some paths we will
|
|
|
|
* return space that is marked NOACCESS - AllocSetRealloc has to beware!
|
|
|
|
*
|
|
|
|
* This function should only contain the most common code paths. Everything
|
2024-03-04 05:42:10 +01:00
|
|
|
* else should be in pg_noinline helper functions, thus avoiding the overhead
|
|
|
|
* of creating a stack frame for the common cases. Allocating memory is often
|
|
|
|
* a bottleneck in many workloads, so avoiding stack frame setup is
|
|
|
|
* worthwhile. Helper functions should always directly return the newly
|
|
|
|
* allocated memory so that we can just return that address directly as a tail
|
|
|
|
* call.
|
2024-02-28 02:20:43 +01:00
|
|
|
*/
|
|
|
|
void *
|
|
|
|
AllocSetAlloc(MemoryContext context, Size size, int flags)
|
|
|
|
{
|
|
|
|
AllocSet set = (AllocSet) context;
|
|
|
|
AllocBlock block;
|
|
|
|
MemoryChunk *chunk;
|
|
|
|
int fidx;
|
|
|
|
Size chunk_size;
|
|
|
|
Size availspace;
|
|
|
|
|
|
|
|
Assert(AllocSetIsValid(set));
|
|
|
|
|
|
|
|
/* due to the keeper block set->blocks should never be NULL */
|
|
|
|
Assert(set->blocks != NULL);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* If requested size exceeds maximum for chunks we hand the the request
|
|
|
|
* off to AllocSetAllocLarge().
|
|
|
|
*/
|
|
|
|
if (size > set->allocChunkLimit)
|
|
|
|
return AllocSetAllocLarge(context, size, flags);
|
|
|
|
|
2000-12-01 19:14:29 +01:00
|
|
|
/*
|
|
|
|
* Request is small enough to be treated as a chunk. Look in the
|
|
|
|
* corresponding free list to see if there is a free chunk we could reuse.
|
2000-12-01 06:16:45 +01:00
|
|
|
* If one is found, remove it from the free list, make it again a member
|
|
|
|
* of the alloc set and return its data address.
|
Make more effort to put a sentinel at the end of allocated memory
Traditionally, in MEMORY_CONTEXT_CHECKING builds, we only ever marked a
sentinel byte just beyond the requested size if there happened to be
enough space on the chunk to do so. For Slab and Generation context
types, we only rounded the size of the chunk up to the next maxalign
boundary, so it was often not that likely that those would ever have space
for the sentinel given that the majority of allocation requests are going
to be for sizes which are maxaligned. For AllocSet, it was a little
different as smaller allocations are rounded up to the next power-of-2
value rather than the next maxalign boundary, so we're a bit more likely
to have space for the sentinel byte, especially when we get away from tiny
sized allocations such as 8 or 16 bytes.
Here we make more of an effort to allow space so that there is enough room
for the sentinel byte in more cases. This makes it more likely that we'll
detect when buggy code accidentally writes beyond the end of any of its
memory allocations.
Each of the 3 MemoryContext types has been changed as follows:
The Slab allocator will now always set a sentinel byte. Both the current
usages of this MemoryContext type happen to use chunk sizes which were on
the maxalign boundary, so these never used sentinel bytes previously.
For the Generation allocator, we now always ensure there's enough space in
the allocation for a sentinel byte.
For AllocSet, this commit makes an adjustment for allocation sizes which
are greater than allocChunkLimit. We now ensure there is always space for
a sentinel byte. We don't alter the sentinel behavior for request sizes
<= allocChunkLimit. Making way for the sentinel byte for power-of-2
request sizes would require doubling up to the next power of 2. Some
analysis done on the request sizes made during installcheck shows that a
fairly large portion of allocation requests are for power-of-2 sizes. The
amount of additional memory for the sentinel there seems prohibitive, so
we do nothing for those here.
Author: David Rowley
Discussion: https://postgr.es/m/3478405.1661824539@sss.pgh.pa.us
2022-09-07 05:46:57 +02:00
|
|
|
*
|
|
|
|
* Note that we don't attempt to ensure there's space for the sentinel
|
|
|
|
* byte here. We expect a large proportion of allocations to be for sizes
|
|
|
|
* which are already a power of 2. If we were to always make space for a
|
|
|
|
* sentinel byte in MEMORY_CONTEXT_CHECKING builds, then we'd end up
|
|
|
|
* doubling the memory requirements for such allocations.
|
2000-12-01 06:16:45 +01:00
|
|
|
*/
|
2007-04-30 02:12:08 +02:00
|
|
|
fidx = AllocSetFreeIndex(size);
|
|
|
|
chunk = set->freelist[fidx];
|
2000-12-01 06:16:45 +01:00
|
|
|
if (chunk != NULL)
|
|
|
|
{
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
AllocFreeListLink *link = GetFreeListLink(chunk);
|
2007-04-30 02:12:08 +02:00
|
|
|
|
2023-04-15 01:59:52 +02:00
|
|
|
/* Allow access to the chunk header. */
|
|
|
|
VALGRIND_MAKE_MEM_DEFINED(chunk, ALLOC_CHUNKHDRSZ);
|
|
|
|
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
Assert(fidx == MemoryChunkGetValue(chunk));
|
2000-12-01 06:16:45 +01:00
|
|
|
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
/* pop this chunk off the freelist */
|
|
|
|
VALGRIND_MAKE_MEM_DEFINED(link, sizeof(AllocFreeListLink));
|
|
|
|
set->freelist[fidx] = link->next;
|
|
|
|
VALGRIND_MAKE_MEM_NOACCESS(link, sizeof(AllocFreeListLink));
|
1997-09-07 07:04:48 +02:00
|
|
|
|
Here is the patch with memory leak checker. This checker allow detect
in-chunk leaks, overwrite-next-chunk leaks and overwrite block-freeptr leaks.
A in-chunk leak --- if something overwrite space after wanted (via palloc()
size, but it is still inside chunk. For example
x = palloc(12); /* create 16b chunk */
memset(x, '#', 13);
this leak is in the current source total invisible, because chunk is 16b and
leak is in the "align space".
For this feature I add data_size to StandardChunk, and all memory which go
from AllocSetAlloc() is marked as 0x7F.
The MemoryContextCheck() is compiled '#ifdef USE_ASSERT_CHECKING'.
I add this checking to 'tcop/postgres.c' and is active after each backend
query, but it is probably not sufficient, because some MemoryContext exist
only during memory processing --- will good if someone who known where
it is needful (Tom:-) add it for others contexts;
A problem in the current source is that we have still some malloc()
allocation that is not needful and this allocation is total invisible for
all context routines. For example Dllist in backend (pretty dirty it is in
catcache where values in Dllist are palloc-ed, but list is malloc-ed).
--- and BTW. this Dllist design stand in the way for query cache :-)
Tom, if you agree I start replace some mallocs.
BTW. --- Tom, have you idea for across transaction presistent allocation for
SQL functions? (like regex - now it is via malloc)
I almost forget. I add one if() to AllocSetAlloc(), for 'size' that are
greater than ALLOC_BIGCHUNK_LIMIT is not needful check AllocSetFreeIndex(),
because 'fidx' is always 'ALLOCSET_NUM_FREELISTS - 1'. It a little brisk up
allocation for very large chunks. Right?
Karel
2000-07-11 16:30:37 +02:00
|
|
|
#ifdef MEMORY_CONTEXT_CHECKING
|
2000-12-01 06:16:45 +01:00
|
|
|
chunk->requested_size = size;
|
|
|
|
/* set mark to catch clobber of "unused" space */
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
if (size < GetChunkSizeFromFreeListIdx(fidx))
|
|
|
|
set_sentinel(MemoryChunkGetPointer(chunk), size);
|
Here is the patch with memory leak checker. This checker allow detect
in-chunk leaks, overwrite-next-chunk leaks and overwrite block-freeptr leaks.
A in-chunk leak --- if something overwrite space after wanted (via palloc()
size, but it is still inside chunk. For example
x = palloc(12); /* create 16b chunk */
memset(x, '#', 13);
this leak is in the current source total invisible, because chunk is 16b and
leak is in the "align space".
For this feature I add data_size to StandardChunk, and all memory which go
from AllocSetAlloc() is marked as 0x7F.
The MemoryContextCheck() is compiled '#ifdef USE_ASSERT_CHECKING'.
I add this checking to 'tcop/postgres.c' and is active after each backend
query, but it is probably not sufficient, because some MemoryContext exist
only during memory processing --- will good if someone who known where
it is needful (Tom:-) add it for others contexts;
A problem in the current source is that we have still some malloc()
allocation that is not needful and this allocation is total invisible for
all context routines. For example Dllist in backend (pretty dirty it is in
catcache where values in Dllist are palloc-ed, but list is malloc-ed).
--- and BTW. this Dllist design stand in the way for query cache :-)
Tom, if you agree I start replace some mallocs.
BTW. --- Tom, have you idea for across transaction presistent allocation for
SQL functions? (like regex - now it is via malloc)
I almost forget. I add one if() to AllocSetAlloc(), for 'size' that are
greater than ALLOC_BIGCHUNK_LIMIT is not needful check AllocSetFreeIndex(),
because 'fidx' is always 'ALLOCSET_NUM_FREELISTS - 1'. It a little brisk up
allocation for very large chunks. Right?
Karel
2000-07-11 16:30:37 +02:00
|
|
|
#endif
|
2008-04-12 00:54:23 +02:00
|
|
|
#ifdef RANDOMIZE_ALLOCATED_MEMORY
|
|
|
|
/* fill the allocated space with junk */
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
randomize_mem((char *) MemoryChunkGetPointer(chunk), size);
|
2008-04-12 00:54:23 +02:00
|
|
|
#endif
|
2000-12-01 06:16:45 +01:00
|
|
|
|
2017-11-25 01:28:19 +01:00
|
|
|
/* Ensure any padding bytes are marked NOACCESS. */
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
VALGRIND_MAKE_MEM_NOACCESS((char *) MemoryChunkGetPointer(chunk) + size,
|
|
|
|
GetChunkSizeFromFreeListIdx(fidx) - size);
|
2017-11-25 01:28:19 +01:00
|
|
|
|
2023-04-15 01:59:52 +02:00
|
|
|
/* Disallow access to the chunk header. */
|
|
|
|
VALGRIND_MAKE_MEM_NOACCESS(chunk, ALLOC_CHUNKHDRSZ);
|
2017-11-25 01:28:19 +01:00
|
|
|
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
return MemoryChunkGetPointer(chunk);
|
2000-12-01 06:16:45 +01:00
|
|
|
}
|
1997-09-07 07:04:48 +02:00
|
|
|
|
1999-02-06 17:50:34 +01:00
|
|
|
/*
|
1999-05-23 01:19:37 +02:00
|
|
|
* Choose the actual chunk size to allocate.
|
1999-02-06 17:50:34 +01:00
|
|
|
*/
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
chunk_size = GetChunkSizeFromFreeListIdx(fidx);
|
1999-05-23 01:19:37 +02:00
|
|
|
Assert(chunk_size >= size);
|
|
|
|
|
2024-02-28 02:20:43 +01:00
|
|
|
block = set->blocks;
|
|
|
|
availspace = block->endptr - block->freeptr;
|
|
|
|
|
1999-05-23 01:19:37 +02:00
|
|
|
/*
|
2000-12-01 19:14:29 +01:00
|
|
|
* If there is enough room in the active allocation block, we will put the
|
|
|
|
* chunk into that block. Else must start a new one.
|
1999-05-23 01:19:37 +02:00
|
|
|
*/
|
2024-02-28 02:20:43 +01:00
|
|
|
if (unlikely(availspace < (chunk_size + ALLOC_CHUNKHDRSZ)))
|
|
|
|
return AllocSetAllocFromNewBlock(context, size, flags, fidx);
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
|
2024-02-28 02:20:43 +01:00
|
|
|
/* There's enough space on the current block, so allocate from that */
|
|
|
|
return AllocSetAllocChunkFromBlock(context, block, size, chunk_size, fidx);
|
1996-07-09 08:22:35 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
1999-02-14 00:22:53 +01:00
|
|
|
* AllocSetFree
|
1996-07-09 08:22:35 +02:00
|
|
|
* Frees allocated memory; memory is removed from the set.
|
|
|
|
*/
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
void
|
|
|
|
AllocSetFree(void *pointer)
|
1996-07-09 08:22:35 +02:00
|
|
|
{
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
AllocSet set;
|
|
|
|
MemoryChunk *chunk = PointerGetMemoryChunk(pointer);
|
1997-09-07 07:04:48 +02:00
|
|
|
|
2023-04-15 01:59:52 +02:00
|
|
|
/* Allow access to the chunk header. */
|
|
|
|
VALGRIND_MAKE_MEM_DEFINED(chunk, ALLOC_CHUNKHDRSZ);
|
2017-11-25 01:28:19 +01:00
|
|
|
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
if (MemoryChunkIsExternal(chunk))
|
|
|
|
{
|
Harden memory context allocators against bogus chunk pointers.
Before commit c6e0fe1f2, functions such as AllocSetFree could pretty
safely presume that they were given a valid chunk pointer for their
own type of context, because the indirect call through a memory
context object and method struct would be very unlikely to work
otherwise. But now, if pfree() is mistakenly invoked on a pointer
to garbage, we have three chances in eight of ending up at one of
these functions. That means we need to take extra measures to
verify that we are looking at what we're supposed to be looking at,
especially in debug builds.
Hence, add code to verify that the chunk's back-link to a block header
leads to a memory context object that satisfies the right sort of
IsA() check. This is still a bit weaker than what we did before,
but for the moment assume that an IsA() check is sufficient.
As a compromise between speed and safety, implement these checks
as Asserts when dealing with small chunks but plain test-and-elogs
when dealing with large (external) chunks. The latter case should
not be too performance-critical, but the former case probably is.
In slab.c, all chunks are small; but nonetheless use a plain test
in SlabRealloc, because that is certainly not performance-critical,
indeed we should be suspicious that it's being called in error.
In aset.c, additionally add some assertions that the "value" field
of the chunk header is within the small range allowed for freelist
indexes. Without that, we might find ourselves trying to wipe
most of memory when CLOBBER_FREED_MEMORY is enabled, or scribbling
on a "freelist header" that's far away from the context object.
Eventually, field experience might show us that it's smarter for
these tests to be active always, but for now we'll try to get
away with just having them as assertions.
While at it, also be more uniform about asserting that context
objects passed as parameters are of the type we expect. Some
places missed that altogether, and slab.c was for no very good
reason doing it differently from the other allocators.
Discussion: https://postgr.es/m/3578387.1665244345@sss.pgh.pa.us
2022-10-11 00:45:34 +02:00
|
|
|
/* Release single-chunk block. */
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
AllocBlock block = ExternalChunkGetBlock(chunk);
|
|
|
|
|
Harden memory context allocators against bogus chunk pointers.
Before commit c6e0fe1f2, functions such as AllocSetFree could pretty
safely presume that they were given a valid chunk pointer for their
own type of context, because the indirect call through a memory
context object and method struct would be very unlikely to work
otherwise. But now, if pfree() is mistakenly invoked on a pointer
to garbage, we have three chances in eight of ending up at one of
these functions. That means we need to take extra measures to
verify that we are looking at what we're supposed to be looking at,
especially in debug builds.
Hence, add code to verify that the chunk's back-link to a block header
leads to a memory context object that satisfies the right sort of
IsA() check. This is still a bit weaker than what we did before,
but for the moment assume that an IsA() check is sufficient.
As a compromise between speed and safety, implement these checks
as Asserts when dealing with small chunks but plain test-and-elogs
when dealing with large (external) chunks. The latter case should
not be too performance-critical, but the former case probably is.
In slab.c, all chunks are small; but nonetheless use a plain test
in SlabRealloc, because that is certainly not performance-critical,
indeed we should be suspicious that it's being called in error.
In aset.c, additionally add some assertions that the "value" field
of the chunk header is within the small range allowed for freelist
indexes. Without that, we might find ourselves trying to wipe
most of memory when CLOBBER_FREED_MEMORY is enabled, or scribbling
on a "freelist header" that's far away from the context object.
Eventually, field experience might show us that it's smarter for
these tests to be active always, but for now we'll try to get
away with just having them as assertions.
While at it, also be more uniform about asserting that context
objects passed as parameters are of the type we expect. Some
places missed that altogether, and slab.c was for no very good
reason doing it differently from the other allocators.
Discussion: https://postgr.es/m/3578387.1665244345@sss.pgh.pa.us
2022-10-11 00:45:34 +02:00
|
|
|
/*
|
|
|
|
* Try to verify that we have a sane block pointer: the block header
|
|
|
|
* should reference an aset and the freeptr should match the endptr.
|
|
|
|
*/
|
|
|
|
if (!AllocBlockIsValid(block) || block->freeptr != block->endptr)
|
|
|
|
elog(ERROR, "could not find block containing chunk %p", chunk);
|
|
|
|
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
set = block->aset;
|
|
|
|
|
2000-12-01 06:16:45 +01:00
|
|
|
#ifdef MEMORY_CONTEXT_CHECKING
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
{
|
|
|
|
/* Test for someone scribbling on unused space in chunk */
|
2023-01-05 00:56:17 +01:00
|
|
|
Assert(chunk->requested_size < (block->endptr - (char *) pointer));
|
Make more effort to put a sentinel at the end of allocated memory
Traditionally, in MEMORY_CONTEXT_CHECKING builds, we only ever marked a
sentinel byte just beyond the requested size if there happened to be
enough space on the chunk to do so. For Slab and Generation context
types, we only rounded the size of the chunk up to the next maxalign
boundary, so it was often not that likely that those would ever have space
for the sentinel given that the majority of allocation requests are going
to be for sizes which are maxaligned. For AllocSet, it was a little
different as smaller allocations are rounded up to the next power-of-2
value rather than the next maxalign boundary, so we're a bit more likely
to have space for the sentinel byte, especially when we get away from tiny
sized allocations such as 8 or 16 bytes.
Here we make more of an effort to allow space so that there is enough room
for the sentinel byte in more cases. This makes it more likely that we'll
detect when buggy code accidentally writes beyond the end of any of its
memory allocations.
Each of the 3 MemoryContext types has been changed as follows:
The Slab allocator will now always set a sentinel byte. Both the current
usages of this MemoryContext type happen to use chunk sizes which were on
the maxalign boundary, so these never used sentinel bytes previously.
For the Generation allocator, we now always ensure there's enough space in
the allocation for a sentinel byte.
For AllocSet, this commit makes an adjustment for allocation sizes which
are greater than allocChunkLimit. We now ensure there is always space for
a sentinel byte. We don't alter the sentinel behavior for request sizes
<= allocChunkLimit. Making way for the sentinel byte for power-of-2
request sizes would require doubling up to the next power of 2. Some
analysis done on the request sizes made during installcheck shows that a
fairly large portion of allocation requests are for power-of-2 sizes. The
amount of additional memory for the sentinel there seems prohibitive, so
we do nothing for those here.
Author: David Rowley
Discussion: https://postgr.es/m/3478405.1661824539@sss.pgh.pa.us
2022-09-07 05:46:57 +02:00
|
|
|
if (!sentinel_ok(pointer, chunk->requested_size))
|
|
|
|
elog(WARNING, "detected write past chunk end in %s %p",
|
|
|
|
set->header.name, chunk);
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
}
|
2000-12-01 06:16:45 +01:00
|
|
|
#endif
|
|
|
|
|
1999-08-24 22:11:19 +02:00
|
|
|
/* OK, remove block from aset's list and free it */
|
Use doubly-linked block lists in aset.c to reduce large-chunk overhead.
Large chunks (those too large for any palloc freelist) are managed as
separate blocks. Formerly, realloc'ing or pfree'ing such a chunk required
O(N) time in a context with N blocks, since we had to traipse down the
singly-linked block list to locate the block's predecessor before we could
fix the list links. This can result in O(N^2) runtime in situations where
large numbers of such chunks are manipulated within one context. Cases
like that were not foreseen in the original design of aset.c, and indeed
didn't arise until fairly recently. But such problems can now occur in
reorderbuffer.c and in hash joining, both of which make repeated large
requests without scaling up their request size as they do so, and which
will free their requests in not-necessarily-LIFO order.
To fix, change the block list from singly-linked to doubly-linked.
This adds another 4 or 8 bytes to ALLOC_BLOCKHDRSZ, but that doesn't
seem like unacceptable overhead, since aset.c's blocks are normally
8K or more, and never less than 1K in current practice.
In passing, get rid of some redundant AllocChunkGetPointer() calls in
AllocSetRealloc (the compiler might be smart enough to optimize these
away anyway, but no need to assume that) and improve AllocSetCheck's
checking of block header fields.
Back-patch to 9.4 where reorderbuffer.c appeared. We could take this
further back, but currently there's no evidence that it would be useful.
Discussion: https://postgr.es/m/CAMkU=1x1hvue1XYrZoWk_omG0Ja5nBvTdvgrOeVkkeqs71CV8g@mail.gmail.com
2017-03-08 18:21:12 +01:00
|
|
|
if (block->prev)
|
|
|
|
block->prev->next = block->next;
|
1999-08-24 22:11:19 +02:00
|
|
|
else
|
Use doubly-linked block lists in aset.c to reduce large-chunk overhead.
Large chunks (those too large for any palloc freelist) are managed as
separate blocks. Formerly, realloc'ing or pfree'ing such a chunk required
O(N) time in a context with N blocks, since we had to traipse down the
singly-linked block list to locate the block's predecessor before we could
fix the list links. This can result in O(N^2) runtime in situations where
large numbers of such chunks are manipulated within one context. Cases
like that were not foreseen in the original design of aset.c, and indeed
didn't arise until fairly recently. But such problems can now occur in
reorderbuffer.c and in hash joining, both of which make repeated large
requests without scaling up their request size as they do so, and which
will free their requests in not-necessarily-LIFO order.
To fix, change the block list from singly-linked to doubly-linked.
This adds another 4 or 8 bytes to ALLOC_BLOCKHDRSZ, but that doesn't
seem like unacceptable overhead, since aset.c's blocks are normally
8K or more, and never less than 1K in current practice.
In passing, get rid of some redundant AllocChunkGetPointer() calls in
AllocSetRealloc (the compiler might be smart enough to optimize these
away anyway, but no need to assume that) and improve AllocSetCheck's
checking of block header fields.
Back-patch to 9.4 where reorderbuffer.c appeared. We could take this
further back, but currently there's no evidence that it would be useful.
Discussion: https://postgr.es/m/CAMkU=1x1hvue1XYrZoWk_omG0Ja5nBvTdvgrOeVkkeqs71CV8g@mail.gmail.com
2017-03-08 18:21:12 +01:00
|
|
|
set->blocks = block->next;
|
|
|
|
if (block->next)
|
|
|
|
block->next->prev = block->prev;
|
2019-10-01 03:13:39 +02:00
|
|
|
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
set->header.mem_allocated -= block->endptr - ((char *) block);
|
2019-10-01 03:13:39 +02:00
|
|
|
|
2000-03-09 00:42:58 +01:00
|
|
|
#ifdef CLOBBER_FREED_MEMORY
|
2013-06-27 01:56:03 +02:00
|
|
|
wipe_mem(block, block->freeptr - ((char *) block));
|
2000-03-09 00:42:58 +01:00
|
|
|
#endif
|
1999-08-24 22:11:19 +02:00
|
|
|
free(block);
|
|
|
|
}
|
|
|
|
else
|
|
|
|
{
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
AllocBlock block = MemoryChunkGetBlock(chunk);
|
Harden memory context allocators against bogus chunk pointers.
Before commit c6e0fe1f2, functions such as AllocSetFree could pretty
safely presume that they were given a valid chunk pointer for their
own type of context, because the indirect call through a memory
context object and method struct would be very unlikely to work
otherwise. But now, if pfree() is mistakenly invoked on a pointer
to garbage, we have three chances in eight of ending up at one of
these functions. That means we need to take extra measures to
verify that we are looking at what we're supposed to be looking at,
especially in debug builds.
Hence, add code to verify that the chunk's back-link to a block header
leads to a memory context object that satisfies the right sort of
IsA() check. This is still a bit weaker than what we did before,
but for the moment assume that an IsA() check is sufficient.
As a compromise between speed and safety, implement these checks
as Asserts when dealing with small chunks but plain test-and-elogs
when dealing with large (external) chunks. The latter case should
not be too performance-critical, but the former case probably is.
In slab.c, all chunks are small; but nonetheless use a plain test
in SlabRealloc, because that is certainly not performance-critical,
indeed we should be suspicious that it's being called in error.
In aset.c, additionally add some assertions that the "value" field
of the chunk header is within the small range allowed for freelist
indexes. Without that, we might find ourselves trying to wipe
most of memory when CLOBBER_FREED_MEMORY is enabled, or scribbling
on a "freelist header" that's far away from the context object.
Eventually, field experience might show us that it's smarter for
these tests to be active always, but for now we'll try to get
away with just having them as assertions.
While at it, also be more uniform about asserting that context
objects passed as parameters are of the type we expect. Some
places missed that altogether, and slab.c was for no very good
reason doing it differently from the other allocators.
Discussion: https://postgr.es/m/3578387.1665244345@sss.pgh.pa.us
2022-10-11 00:45:34 +02:00
|
|
|
int fidx;
|
|
|
|
AllocFreeListLink *link;
|
1999-08-24 22:11:19 +02:00
|
|
|
|
Harden memory context allocators against bogus chunk pointers.
Before commit c6e0fe1f2, functions such as AllocSetFree could pretty
safely presume that they were given a valid chunk pointer for their
own type of context, because the indirect call through a memory
context object and method struct would be very unlikely to work
otherwise. But now, if pfree() is mistakenly invoked on a pointer
to garbage, we have three chances in eight of ending up at one of
these functions. That means we need to take extra measures to
verify that we are looking at what we're supposed to be looking at,
especially in debug builds.
Hence, add code to verify that the chunk's back-link to a block header
leads to a memory context object that satisfies the right sort of
IsA() check. This is still a bit weaker than what we did before,
but for the moment assume that an IsA() check is sufficient.
As a compromise between speed and safety, implement these checks
as Asserts when dealing with small chunks but plain test-and-elogs
when dealing with large (external) chunks. The latter case should
not be too performance-critical, but the former case probably is.
In slab.c, all chunks are small; but nonetheless use a plain test
in SlabRealloc, because that is certainly not performance-critical,
indeed we should be suspicious that it's being called in error.
In aset.c, additionally add some assertions that the "value" field
of the chunk header is within the small range allowed for freelist
indexes. Without that, we might find ourselves trying to wipe
most of memory when CLOBBER_FREED_MEMORY is enabled, or scribbling
on a "freelist header" that's far away from the context object.
Eventually, field experience might show us that it's smarter for
these tests to be active always, but for now we'll try to get
away with just having them as assertions.
While at it, also be more uniform about asserting that context
objects passed as parameters are of the type we expect. Some
places missed that altogether, and slab.c was for no very good
reason doing it differently from the other allocators.
Discussion: https://postgr.es/m/3578387.1665244345@sss.pgh.pa.us
2022-10-11 00:45:34 +02:00
|
|
|
/*
|
|
|
|
* In this path, for speed reasons we just Assert that the referenced
|
|
|
|
* block is good. We can also Assert that the value field is sane.
|
|
|
|
* Future field experience may show that these Asserts had better
|
|
|
|
* become regular runtime test-and-elog checks.
|
|
|
|
*/
|
2022-10-28 09:19:06 +02:00
|
|
|
Assert(AllocBlockIsValid(block));
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
set = block->aset;
|
|
|
|
|
Harden memory context allocators against bogus chunk pointers.
Before commit c6e0fe1f2, functions such as AllocSetFree could pretty
safely presume that they were given a valid chunk pointer for their
own type of context, because the indirect call through a memory
context object and method struct would be very unlikely to work
otherwise. But now, if pfree() is mistakenly invoked on a pointer
to garbage, we have three chances in eight of ending up at one of
these functions. That means we need to take extra measures to
verify that we are looking at what we're supposed to be looking at,
especially in debug builds.
Hence, add code to verify that the chunk's back-link to a block header
leads to a memory context object that satisfies the right sort of
IsA() check. This is still a bit weaker than what we did before,
but for the moment assume that an IsA() check is sufficient.
As a compromise between speed and safety, implement these checks
as Asserts when dealing with small chunks but plain test-and-elogs
when dealing with large (external) chunks. The latter case should
not be too performance-critical, but the former case probably is.
In slab.c, all chunks are small; but nonetheless use a plain test
in SlabRealloc, because that is certainly not performance-critical,
indeed we should be suspicious that it's being called in error.
In aset.c, additionally add some assertions that the "value" field
of the chunk header is within the small range allowed for freelist
indexes. Without that, we might find ourselves trying to wipe
most of memory when CLOBBER_FREED_MEMORY is enabled, or scribbling
on a "freelist header" that's far away from the context object.
Eventually, field experience might show us that it's smarter for
these tests to be active always, but for now we'll try to get
away with just having them as assertions.
While at it, also be more uniform about asserting that context
objects passed as parameters are of the type we expect. Some
places missed that altogether, and slab.c was for no very good
reason doing it differently from the other allocators.
Discussion: https://postgr.es/m/3578387.1665244345@sss.pgh.pa.us
2022-10-11 00:45:34 +02:00
|
|
|
fidx = MemoryChunkGetValue(chunk);
|
|
|
|
Assert(FreeListIdxIsValid(fidx));
|
|
|
|
link = GetFreeListLink(chunk);
|
|
|
|
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
#ifdef MEMORY_CONTEXT_CHECKING
|
|
|
|
/* Test for someone scribbling on unused space in chunk */
|
|
|
|
if (chunk->requested_size < GetChunkSizeFromFreeListIdx(fidx))
|
|
|
|
if (!sentinel_ok(pointer, chunk->requested_size))
|
|
|
|
elog(WARNING, "detected write past chunk end in %s %p",
|
|
|
|
set->header.name, chunk);
|
|
|
|
#endif
|
Here is the patch with memory leak checker. This checker allow detect
in-chunk leaks, overwrite-next-chunk leaks and overwrite block-freeptr leaks.
A in-chunk leak --- if something overwrite space after wanted (via palloc()
size, but it is still inside chunk. For example
x = palloc(12); /* create 16b chunk */
memset(x, '#', 13);
this leak is in the current source total invisible, because chunk is 16b and
leak is in the "align space".
For this feature I add data_size to StandardChunk, and all memory which go
from AllocSetAlloc() is marked as 0x7F.
The MemoryContextCheck() is compiled '#ifdef USE_ASSERT_CHECKING'.
I add this checking to 'tcop/postgres.c' and is active after each backend
query, but it is probably not sufficient, because some MemoryContext exist
only during memory processing --- will good if someone who known where
it is needful (Tom:-) add it for others contexts;
A problem in the current source is that we have still some malloc()
allocation that is not needful and this allocation is total invisible for
all context routines. For example Dllist in backend (pretty dirty it is in
catcache where values in Dllist are palloc-ed, but list is malloc-ed).
--- and BTW. this Dllist design stand in the way for query cache :-)
Tom, if you agree I start replace some mallocs.
BTW. --- Tom, have you idea for across transaction presistent allocation for
SQL functions? (like regex - now it is via malloc)
I almost forget. I add one if() to AllocSetAlloc(), for 'size' that are
greater than ALLOC_BIGCHUNK_LIMIT is not needful check AllocSetFreeIndex(),
because 'fidx' is always 'ALLOCSET_NUM_FREELISTS - 1'. It a little brisk up
allocation for very large chunks. Right?
Karel
2000-07-11 16:30:37 +02:00
|
|
|
|
2000-12-01 06:16:45 +01:00
|
|
|
#ifdef CLOBBER_FREED_MEMORY
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
wipe_mem(pointer, GetChunkSizeFromFreeListIdx(fidx));
|
2000-12-01 06:16:45 +01:00
|
|
|
#endif
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
/* push this chunk onto the top of the free list */
|
|
|
|
VALGRIND_MAKE_MEM_DEFINED(link, sizeof(AllocFreeListLink));
|
|
|
|
link->next = set->freelist[fidx];
|
|
|
|
VALGRIND_MAKE_MEM_NOACCESS(link, sizeof(AllocFreeListLink));
|
|
|
|
set->freelist[fidx] = chunk;
|
2000-12-01 06:16:45 +01:00
|
|
|
|
Here is the patch with memory leak checker. This checker allow detect
in-chunk leaks, overwrite-next-chunk leaks and overwrite block-freeptr leaks.
A in-chunk leak --- if something overwrite space after wanted (via palloc()
size, but it is still inside chunk. For example
x = palloc(12); /* create 16b chunk */
memset(x, '#', 13);
this leak is in the current source total invisible, because chunk is 16b and
leak is in the "align space".
For this feature I add data_size to StandardChunk, and all memory which go
from AllocSetAlloc() is marked as 0x7F.
The MemoryContextCheck() is compiled '#ifdef USE_ASSERT_CHECKING'.
I add this checking to 'tcop/postgres.c' and is active after each backend
query, but it is probably not sufficient, because some MemoryContext exist
only during memory processing --- will good if someone who known where
it is needful (Tom:-) add it for others contexts;
A problem in the current source is that we have still some malloc()
allocation that is not needful and this allocation is total invisible for
all context routines. For example Dllist in backend (pretty dirty it is in
catcache where values in Dllist are palloc-ed, but list is malloc-ed).
--- and BTW. this Dllist design stand in the way for query cache :-)
Tom, if you agree I start replace some mallocs.
BTW. --- Tom, have you idea for across transaction presistent allocation for
SQL functions? (like regex - now it is via malloc)
I almost forget. I add one if() to AllocSetAlloc(), for 'size' that are
greater than ALLOC_BIGCHUNK_LIMIT is not needful check AllocSetFreeIndex(),
because 'fidx' is always 'ALLOCSET_NUM_FREELISTS - 1'. It a little brisk up
allocation for very large chunks. Right?
Karel
2000-07-11 16:30:37 +02:00
|
|
|
#ifdef MEMORY_CONTEXT_CHECKING
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Reset requested_size to InvalidAllocSize in chunks that are on free
|
|
|
|
* list.
|
|
|
|
*/
|
|
|
|
chunk->requested_size = InvalidAllocSize;
|
Here is the patch with memory leak checker. This checker allow detect
in-chunk leaks, overwrite-next-chunk leaks and overwrite block-freeptr leaks.
A in-chunk leak --- if something overwrite space after wanted (via palloc()
size, but it is still inside chunk. For example
x = palloc(12); /* create 16b chunk */
memset(x, '#', 13);
this leak is in the current source total invisible, because chunk is 16b and
leak is in the "align space".
For this feature I add data_size to StandardChunk, and all memory which go
from AllocSetAlloc() is marked as 0x7F.
The MemoryContextCheck() is compiled '#ifdef USE_ASSERT_CHECKING'.
I add this checking to 'tcop/postgres.c' and is active after each backend
query, but it is probably not sufficient, because some MemoryContext exist
only during memory processing --- will good if someone who known where
it is needful (Tom:-) add it for others contexts;
A problem in the current source is that we have still some malloc()
allocation that is not needful and this allocation is total invisible for
all context routines. For example Dllist in backend (pretty dirty it is in
catcache where values in Dllist are palloc-ed, but list is malloc-ed).
--- and BTW. this Dllist design stand in the way for query cache :-)
Tom, if you agree I start replace some mallocs.
BTW. --- Tom, have you idea for across transaction presistent allocation for
SQL functions? (like regex - now it is via malloc)
I almost forget. I add one if() to AllocSetAlloc(), for 'size' that are
greater than ALLOC_BIGCHUNK_LIMIT is not needful check AllocSetFreeIndex(),
because 'fidx' is always 'ALLOCSET_NUM_FREELISTS - 1'. It a little brisk up
allocation for very large chunks. Right?
Karel
2000-07-11 16:30:37 +02:00
|
|
|
#endif
|
1999-08-24 22:11:19 +02:00
|
|
|
}
|
1996-07-09 08:22:35 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
1999-02-14 00:22:53 +01:00
|
|
|
* AllocSetRealloc
|
2015-01-29 16:23:38 +01:00
|
|
|
* Returns new pointer to allocated memory of given size or NULL if
|
|
|
|
* request could not be completed; this memory is added to the set.
|
|
|
|
* Memory associated with given pointer is copied into the new memory,
|
|
|
|
* and the old memory is freed.
|
2013-06-27 02:00:08 +02:00
|
|
|
*
|
|
|
|
* Without MEMORY_CONTEXT_CHECKING, we don't know the old request size. This
|
|
|
|
* makes our Valgrind client requests less-precise, hazarding false negatives.
|
|
|
|
* (In principle, we could use VALGRIND_GET_VBITS() to rediscover the old
|
|
|
|
* request size.)
|
1996-07-09 08:22:35 +02:00
|
|
|
*/
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
void *
|
2024-02-27 04:39:42 +01:00
|
|
|
AllocSetRealloc(void *pointer, Size size, int flags)
|
1996-07-09 08:22:35 +02:00
|
|
|
{
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
AllocBlock block;
|
|
|
|
AllocSet set;
|
|
|
|
MemoryChunk *chunk = PointerGetMemoryChunk(pointer);
|
Fix erroneous Valgrind markings in AllocSetRealloc.
If asked to decrease the size of a large (>8K) palloc chunk,
AllocSetRealloc could improperly change the Valgrind state of memory
beyond the new end of the chunk: it would mark data UNDEFINED as far
as the old end of the chunk after having done the realloc(3) call,
thus tromping on the state of memory that no longer belongs to it.
One would normally expect that memory to now be marked NOACCESS,
so that this mislabeling might prevent detection of later errors.
If realloc() had chosen to move the chunk someplace else (unlikely,
but well within its rights) we could also mismark perfectly-valid
DEFINED data as UNDEFINED, causing false-positive valgrind reports
later. Also, any malloc bookkeeping data placed within this area
might now be wrongly marked, causing additional problems.
Fix by replacing relevant uses of "oldsize" with "Min(size, oldsize)".
It's sufficient to mark as far as "size" when that's smaller, because
whatever remains in the new chunk size will be marked NOACCESS below,
and we expect realloc() to have taken care of marking the memory
beyond the new official end of the chunk.
While we're here, also rename the function's "oldsize" variable
to "oldchksize" to more clearly explain what it actually holds,
namely the distance to the end of the chunk (that is, requested size
plus trailing padding). This is more consistent with the use of
"size" and "chksize" to hold the new requested size and chunk size.
Add a new variable "oldsize" in the one stanza where we're actually
talking about the old requested size.
Oversight in commit c477f3e44. Back-patch to all supported branches,
as that was, just in case anybody wants to do valgrind testing on back
branches.
Karina Litskevich
Discussion: https://postgr.es/m/CACiT8iaAET-fmzjjZLjaJC4zwSJmrFyL7LAdHwaYyjjQOQ4hcg@mail.gmail.com
2023-02-22 00:47:46 +01:00
|
|
|
Size oldchksize;
|
Harden memory context allocators against bogus chunk pointers.
Before commit c6e0fe1f2, functions such as AllocSetFree could pretty
safely presume that they were given a valid chunk pointer for their
own type of context, because the indirect call through a memory
context object and method struct would be very unlikely to work
otherwise. But now, if pfree() is mistakenly invoked on a pointer
to garbage, we have three chances in eight of ending up at one of
these functions. That means we need to take extra measures to
verify that we are looking at what we're supposed to be looking at,
especially in debug builds.
Hence, add code to verify that the chunk's back-link to a block header
leads to a memory context object that satisfies the right sort of
IsA() check. This is still a bit weaker than what we did before,
but for the moment assume that an IsA() check is sufficient.
As a compromise between speed and safety, implement these checks
as Asserts when dealing with small chunks but plain test-and-elogs
when dealing with large (external) chunks. The latter case should
not be too performance-critical, but the former case probably is.
In slab.c, all chunks are small; but nonetheless use a plain test
in SlabRealloc, because that is certainly not performance-critical,
indeed we should be suspicious that it's being called in error.
In aset.c, additionally add some assertions that the "value" field
of the chunk header is within the small range allowed for freelist
indexes. Without that, we might find ourselves trying to wipe
most of memory when CLOBBER_FREED_MEMORY is enabled, or scribbling
on a "freelist header" that's far away from the context object.
Eventually, field experience might show us that it's smarter for
these tests to be active always, but for now we'll try to get
away with just having them as assertions.
While at it, also be more uniform about asserting that context
objects passed as parameters are of the type we expect. Some
places missed that altogether, and slab.c was for no very good
reason doing it differently from the other allocators.
Discussion: https://postgr.es/m/3578387.1665244345@sss.pgh.pa.us
2022-10-11 00:45:34 +02:00
|
|
|
int fidx;
|
2017-11-25 01:28:19 +01:00
|
|
|
|
2023-04-15 01:59:52 +02:00
|
|
|
/* Allow access to the chunk header. */
|
|
|
|
VALGRIND_MAKE_MEM_DEFINED(chunk, ALLOC_CHUNKHDRSZ);
|
2017-11-25 01:28:19 +01:00
|
|
|
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
if (MemoryChunkIsExternal(chunk))
|
1999-08-24 22:11:19 +02:00
|
|
|
{
|
|
|
|
/*
|
Use doubly-linked block lists in aset.c to reduce large-chunk overhead.
Large chunks (those too large for any palloc freelist) are managed as
separate blocks. Formerly, realloc'ing or pfree'ing such a chunk required
O(N) time in a context with N blocks, since we had to traipse down the
singly-linked block list to locate the block's predecessor before we could
fix the list links. This can result in O(N^2) runtime in situations where
large numbers of such chunks are manipulated within one context. Cases
like that were not foreseen in the original design of aset.c, and indeed
didn't arise until fairly recently. But such problems can now occur in
reorderbuffer.c and in hash joining, both of which make repeated large
requests without scaling up their request size as they do so, and which
will free their requests in not-necessarily-LIFO order.
To fix, change the block list from singly-linked to doubly-linked.
This adds another 4 or 8 bytes to ALLOC_BLOCKHDRSZ, but that doesn't
seem like unacceptable overhead, since aset.c's blocks are normally
8K or more, and never less than 1K in current practice.
In passing, get rid of some redundant AllocChunkGetPointer() calls in
AllocSetRealloc (the compiler might be smart enough to optimize these
away anyway, but no need to assume that) and improve AllocSetCheck's
checking of block header fields.
Back-patch to 9.4 where reorderbuffer.c appeared. We could take this
further back, but currently there's no evidence that it would be useful.
Discussion: https://postgr.es/m/CAMkU=1x1hvue1XYrZoWk_omG0Ja5nBvTdvgrOeVkkeqs71CV8g@mail.gmail.com
2017-03-08 18:21:12 +01:00
|
|
|
* The chunk must have been allocated as a single-chunk block. Use
|
2019-10-03 19:56:26 +02:00
|
|
|
* realloc() to make the containing block bigger, or smaller, with
|
|
|
|
* minimum space wastage.
|
1999-08-24 22:11:19 +02:00
|
|
|
*/
|
2000-12-01 06:16:45 +01:00
|
|
|
Size chksize;
|
1999-08-24 22:11:19 +02:00
|
|
|
Size blksize;
|
2019-10-01 03:13:39 +02:00
|
|
|
Size oldblksize;
|
Here is the patch with memory leak checker. This checker allow detect
in-chunk leaks, overwrite-next-chunk leaks and overwrite block-freeptr leaks.
A in-chunk leak --- if something overwrite space after wanted (via palloc()
size, but it is still inside chunk. For example
x = palloc(12); /* create 16b chunk */
memset(x, '#', 13);
this leak is in the current source total invisible, because chunk is 16b and
leak is in the "align space".
For this feature I add data_size to StandardChunk, and all memory which go
from AllocSetAlloc() is marked as 0x7F.
The MemoryContextCheck() is compiled '#ifdef USE_ASSERT_CHECKING'.
I add this checking to 'tcop/postgres.c' and is active after each backend
query, but it is probably not sufficient, because some MemoryContext exist
only during memory processing --- will good if someone who known where
it is needful (Tom:-) add it for others contexts;
A problem in the current source is that we have still some malloc()
allocation that is not needful and this allocation is total invisible for
all context routines. For example Dllist in backend (pretty dirty it is in
catcache where values in Dllist are palloc-ed, but list is malloc-ed).
--- and BTW. this Dllist design stand in the way for query cache :-)
Tom, if you agree I start replace some mallocs.
BTW. --- Tom, have you idea for across transaction presistent allocation for
SQL functions? (like regex - now it is via malloc)
I almost forget. I add one if() to AllocSetAlloc(), for 'size' that are
greater than ALLOC_BIGCHUNK_LIMIT is not needful check AllocSetFreeIndex(),
because 'fidx' is always 'ALLOCSET_NUM_FREELISTS - 1'. It a little brisk up
allocation for very large chunks. Right?
Karel
2000-07-11 16:30:37 +02:00
|
|
|
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
block = ExternalChunkGetBlock(chunk);
|
Harden memory context allocators against bogus chunk pointers.
Before commit c6e0fe1f2, functions such as AllocSetFree could pretty
safely presume that they were given a valid chunk pointer for their
own type of context, because the indirect call through a memory
context object and method struct would be very unlikely to work
otherwise. But now, if pfree() is mistakenly invoked on a pointer
to garbage, we have three chances in eight of ending up at one of
these functions. That means we need to take extra measures to
verify that we are looking at what we're supposed to be looking at,
especially in debug builds.
Hence, add code to verify that the chunk's back-link to a block header
leads to a memory context object that satisfies the right sort of
IsA() check. This is still a bit weaker than what we did before,
but for the moment assume that an IsA() check is sufficient.
As a compromise between speed and safety, implement these checks
as Asserts when dealing with small chunks but plain test-and-elogs
when dealing with large (external) chunks. The latter case should
not be too performance-critical, but the former case probably is.
In slab.c, all chunks are small; but nonetheless use a plain test
in SlabRealloc, because that is certainly not performance-critical,
indeed we should be suspicious that it's being called in error.
In aset.c, additionally add some assertions that the "value" field
of the chunk header is within the small range allowed for freelist
indexes. Without that, we might find ourselves trying to wipe
most of memory when CLOBBER_FREED_MEMORY is enabled, or scribbling
on a "freelist header" that's far away from the context object.
Eventually, field experience might show us that it's smarter for
these tests to be active always, but for now we'll try to get
away with just having them as assertions.
While at it, also be more uniform about asserting that context
objects passed as parameters are of the type we expect. Some
places missed that altogether, and slab.c was for no very good
reason doing it differently from the other allocators.
Discussion: https://postgr.es/m/3578387.1665244345@sss.pgh.pa.us
2022-10-11 00:45:34 +02:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Try to verify that we have a sane block pointer: the block header
|
|
|
|
* should reference an aset and the freeptr should match the endptr.
|
|
|
|
*/
|
|
|
|
if (!AllocBlockIsValid(block) || block->freeptr != block->endptr)
|
|
|
|
elog(ERROR, "could not find block containing chunk %p", chunk);
|
|
|
|
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
set = block->aset;
|
|
|
|
|
2024-02-27 04:39:42 +01:00
|
|
|
/* only check size in paths where the limits could be hit */
|
|
|
|
MemoryContextCheckSize((MemoryContext) set, size, flags);
|
|
|
|
|
Fix erroneous Valgrind markings in AllocSetRealloc.
If asked to decrease the size of a large (>8K) palloc chunk,
AllocSetRealloc could improperly change the Valgrind state of memory
beyond the new end of the chunk: it would mark data UNDEFINED as far
as the old end of the chunk after having done the realloc(3) call,
thus tromping on the state of memory that no longer belongs to it.
One would normally expect that memory to now be marked NOACCESS,
so that this mislabeling might prevent detection of later errors.
If realloc() had chosen to move the chunk someplace else (unlikely,
but well within its rights) we could also mismark perfectly-valid
DEFINED data as UNDEFINED, causing false-positive valgrind reports
later. Also, any malloc bookkeeping data placed within this area
might now be wrongly marked, causing additional problems.
Fix by replacing relevant uses of "oldsize" with "Min(size, oldsize)".
It's sufficient to mark as far as "size" when that's smaller, because
whatever remains in the new chunk size will be marked NOACCESS below,
and we expect realloc() to have taken care of marking the memory
beyond the new official end of the chunk.
While we're here, also rename the function's "oldsize" variable
to "oldchksize" to more clearly explain what it actually holds,
namely the distance to the end of the chunk (that is, requested size
plus trailing padding). This is more consistent with the use of
"size" and "chksize" to hold the new requested size and chunk size.
Add a new variable "oldsize" in the one stanza where we're actually
talking about the old requested size.
Oversight in commit c477f3e44. Back-patch to all supported branches,
as that was, just in case anybody wants to do valgrind testing on back
branches.
Karina Litskevich
Discussion: https://postgr.es/m/CACiT8iaAET-fmzjjZLjaJC4zwSJmrFyL7LAdHwaYyjjQOQ4hcg@mail.gmail.com
2023-02-22 00:47:46 +01:00
|
|
|
oldchksize = block->endptr - (char *) pointer;
|
Harden memory context allocators against bogus chunk pointers.
Before commit c6e0fe1f2, functions such as AllocSetFree could pretty
safely presume that they were given a valid chunk pointer for their
own type of context, because the indirect call through a memory
context object and method struct would be very unlikely to work
otherwise. But now, if pfree() is mistakenly invoked on a pointer
to garbage, we have three chances in eight of ending up at one of
these functions. That means we need to take extra measures to
verify that we are looking at what we're supposed to be looking at,
especially in debug builds.
Hence, add code to verify that the chunk's back-link to a block header
leads to a memory context object that satisfies the right sort of
IsA() check. This is still a bit weaker than what we did before,
but for the moment assume that an IsA() check is sufficient.
As a compromise between speed and safety, implement these checks
as Asserts when dealing with small chunks but plain test-and-elogs
when dealing with large (external) chunks. The latter case should
not be too performance-critical, but the former case probably is.
In slab.c, all chunks are small; but nonetheless use a plain test
in SlabRealloc, because that is certainly not performance-critical,
indeed we should be suspicious that it's being called in error.
In aset.c, additionally add some assertions that the "value" field
of the chunk header is within the small range allowed for freelist
indexes. Without that, we might find ourselves trying to wipe
most of memory when CLOBBER_FREED_MEMORY is enabled, or scribbling
on a "freelist header" that's far away from the context object.
Eventually, field experience might show us that it's smarter for
these tests to be active always, but for now we'll try to get
away with just having them as assertions.
While at it, also be more uniform about asserting that context
objects passed as parameters are of the type we expect. Some
places missed that altogether, and slab.c was for no very good
reason doing it differently from the other allocators.
Discussion: https://postgr.es/m/3578387.1665244345@sss.pgh.pa.us
2022-10-11 00:45:34 +02:00
|
|
|
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
#ifdef MEMORY_CONTEXT_CHECKING
|
|
|
|
/* Test for someone scribbling on unused space in chunk */
|
Fix erroneous Valgrind markings in AllocSetRealloc.
If asked to decrease the size of a large (>8K) palloc chunk,
AllocSetRealloc could improperly change the Valgrind state of memory
beyond the new end of the chunk: it would mark data UNDEFINED as far
as the old end of the chunk after having done the realloc(3) call,
thus tromping on the state of memory that no longer belongs to it.
One would normally expect that memory to now be marked NOACCESS,
so that this mislabeling might prevent detection of later errors.
If realloc() had chosen to move the chunk someplace else (unlikely,
but well within its rights) we could also mismark perfectly-valid
DEFINED data as UNDEFINED, causing false-positive valgrind reports
later. Also, any malloc bookkeeping data placed within this area
might now be wrongly marked, causing additional problems.
Fix by replacing relevant uses of "oldsize" with "Min(size, oldsize)".
It's sufficient to mark as far as "size" when that's smaller, because
whatever remains in the new chunk size will be marked NOACCESS below,
and we expect realloc() to have taken care of marking the memory
beyond the new official end of the chunk.
While we're here, also rename the function's "oldsize" variable
to "oldchksize" to more clearly explain what it actually holds,
namely the distance to the end of the chunk (that is, requested size
plus trailing padding). This is more consistent with the use of
"size" and "chksize" to hold the new requested size and chunk size.
Add a new variable "oldsize" in the one stanza where we're actually
talking about the old requested size.
Oversight in commit c477f3e44. Back-patch to all supported branches,
as that was, just in case anybody wants to do valgrind testing on back
branches.
Karina Litskevich
Discussion: https://postgr.es/m/CACiT8iaAET-fmzjjZLjaJC4zwSJmrFyL7LAdHwaYyjjQOQ4hcg@mail.gmail.com
2023-02-22 00:47:46 +01:00
|
|
|
Assert(chunk->requested_size < oldchksize);
|
Make more effort to put a sentinel at the end of allocated memory
Traditionally, in MEMORY_CONTEXT_CHECKING builds, we only ever marked a
sentinel byte just beyond the requested size if there happened to be
enough space on the chunk to do so. For Slab and Generation context
types, we only rounded the size of the chunk up to the next maxalign
boundary, so it was often not that likely that those would ever have space
for the sentinel given that the majority of allocation requests are going
to be for sizes which are maxaligned. For AllocSet, it was a little
different as smaller allocations are rounded up to the next power-of-2
value rather than the next maxalign boundary, so we're a bit more likely
to have space for the sentinel byte, especially when we get away from tiny
sized allocations such as 8 or 16 bytes.
Here we make more of an effort to allow space so that there is enough room
for the sentinel byte in more cases. This makes it more likely that we'll
detect when buggy code accidentally writes beyond the end of any of its
memory allocations.
Each of the 3 MemoryContext types has been changed as follows:
The Slab allocator will now always set a sentinel byte. Both the current
usages of this MemoryContext type happen to use chunk sizes which were on
the maxalign boundary, so these never used sentinel bytes previously.
For the Generation allocator, we now always ensure there's enough space in
the allocation for a sentinel byte.
For AllocSet, this commit makes an adjustment for allocation sizes which
are greater than allocChunkLimit. We now ensure there is always space for
a sentinel byte. We don't alter the sentinel behavior for request sizes
<= allocChunkLimit. Making way for the sentinel byte for power-of-2
request sizes would require doubling up to the next power of 2. Some
analysis done on the request sizes made during installcheck shows that a
fairly large portion of allocation requests are for power-of-2 sizes. The
amount of additional memory for the sentinel there seems prohibitive, so
we do nothing for those here.
Author: David Rowley
Discussion: https://postgr.es/m/3478405.1661824539@sss.pgh.pa.us
2022-09-07 05:46:57 +02:00
|
|
|
if (!sentinel_ok(pointer, chunk->requested_size))
|
|
|
|
elog(WARNING, "detected write past chunk end in %s %p",
|
|
|
|
set->header.name, chunk);
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
#endif
|
|
|
|
|
Make more effort to put a sentinel at the end of allocated memory
Traditionally, in MEMORY_CONTEXT_CHECKING builds, we only ever marked a
sentinel byte just beyond the requested size if there happened to be
enough space on the chunk to do so. For Slab and Generation context
types, we only rounded the size of the chunk up to the next maxalign
boundary, so it was often not that likely that those would ever have space
for the sentinel given that the majority of allocation requests are going
to be for sizes which are maxaligned. For AllocSet, it was a little
different as smaller allocations are rounded up to the next power-of-2
value rather than the next maxalign boundary, so we're a bit more likely
to have space for the sentinel byte, especially when we get away from tiny
sized allocations such as 8 or 16 bytes.
Here we make more of an effort to allow space so that there is enough room
for the sentinel byte in more cases. This makes it more likely that we'll
detect when buggy code accidentally writes beyond the end of any of its
memory allocations.
Each of the 3 MemoryContext types has been changed as follows:
The Slab allocator will now always set a sentinel byte. Both the current
usages of this MemoryContext type happen to use chunk sizes which were on
the maxalign boundary, so these never used sentinel bytes previously.
For the Generation allocator, we now always ensure there's enough space in
the allocation for a sentinel byte.
For AllocSet, this commit makes an adjustment for allocation sizes which
are greater than allocChunkLimit. We now ensure there is always space for
a sentinel byte. We don't alter the sentinel behavior for request sizes
<= allocChunkLimit. Making way for the sentinel byte for power-of-2
request sizes would require doubling up to the next power of 2. Some
analysis done on the request sizes made during installcheck shows that a
fairly large portion of allocation requests are for power-of-2 sizes. The
amount of additional memory for the sentinel there seems prohibitive, so
we do nothing for those here.
Author: David Rowley
Discussion: https://postgr.es/m/3478405.1661824539@sss.pgh.pa.us
2022-09-07 05:46:57 +02:00
|
|
|
#ifdef MEMORY_CONTEXT_CHECKING
|
|
|
|
/* ensure there's always space for the sentinel byte */
|
|
|
|
chksize = MAXALIGN(size + 1);
|
|
|
|
#else
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
chksize = MAXALIGN(size);
|
Make more effort to put a sentinel at the end of allocated memory
Traditionally, in MEMORY_CONTEXT_CHECKING builds, we only ever marked a
sentinel byte just beyond the requested size if there happened to be
enough space on the chunk to do so. For Slab and Generation context
types, we only rounded the size of the chunk up to the next maxalign
boundary, so it was often not that likely that those would ever have space
for the sentinel given that the majority of allocation requests are going
to be for sizes which are maxaligned. For AllocSet, it was a little
different as smaller allocations are rounded up to the next power-of-2
value rather than the next maxalign boundary, so we're a bit more likely
to have space for the sentinel byte, especially when we get away from tiny
sized allocations such as 8 or 16 bytes.
Here we make more of an effort to allow space so that there is enough room
for the sentinel byte in more cases. This makes it more likely that we'll
detect when buggy code accidentally writes beyond the end of any of its
memory allocations.
Each of the 3 MemoryContext types has been changed as follows:
The Slab allocator will now always set a sentinel byte. Both the current
usages of this MemoryContext type happen to use chunk sizes which were on
the maxalign boundary, so these never used sentinel bytes previously.
For the Generation allocator, we now always ensure there's enough space in
the allocation for a sentinel byte.
For AllocSet, this commit makes an adjustment for allocation sizes which
are greater than allocChunkLimit. We now ensure there is always space for
a sentinel byte. We don't alter the sentinel behavior for request sizes
<= allocChunkLimit. Making way for the sentinel byte for power-of-2
request sizes would require doubling up to the next power of 2. Some
analysis done on the request sizes made during installcheck shows that a
fairly large portion of allocation requests are for power-of-2 sizes. The
amount of additional memory for the sentinel there seems prohibitive, so
we do nothing for those here.
Author: David Rowley
Discussion: https://postgr.es/m/3478405.1661824539@sss.pgh.pa.us
2022-09-07 05:46:57 +02:00
|
|
|
#endif
|
2019-10-03 19:56:26 +02:00
|
|
|
|
1999-08-24 22:11:19 +02:00
|
|
|
/* Do the realloc */
|
2000-12-01 06:16:45 +01:00
|
|
|
blksize = chksize + ALLOC_BLOCKHDRSZ + ALLOC_CHUNKHDRSZ;
|
2019-10-01 03:13:39 +02:00
|
|
|
oldblksize = block->endptr - ((char *) block);
|
|
|
|
|
1999-08-24 22:11:19 +02:00
|
|
|
block = (AllocBlock) realloc(block, blksize);
|
|
|
|
if (block == NULL)
|
2017-11-25 01:28:19 +01:00
|
|
|
{
|
2023-04-15 01:59:52 +02:00
|
|
|
/* Disallow access to the chunk header. */
|
|
|
|
VALGRIND_MAKE_MEM_NOACCESS(chunk, ALLOC_CHUNKHDRSZ);
|
2024-02-27 04:39:42 +01:00
|
|
|
return MemoryContextAllocationFailure(&set->header, size, flags);
|
2017-11-25 01:28:19 +01:00
|
|
|
}
|
2019-10-01 03:13:39 +02:00
|
|
|
|
2019-10-04 16:10:56 +02:00
|
|
|
/* updated separately, not to underflow when (oldblksize > blksize) */
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
set->header.mem_allocated -= oldblksize;
|
|
|
|
set->header.mem_allocated += blksize;
|
2019-10-01 03:13:39 +02:00
|
|
|
|
1999-08-24 22:11:19 +02:00
|
|
|
block->freeptr = block->endptr = ((char *) block) + blksize;
|
1997-09-07 07:04:48 +02:00
|
|
|
|
1999-08-24 22:11:19 +02:00
|
|
|
/* Update pointers since block has likely been moved */
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
chunk = (MemoryChunk *) (((char *) block) + ALLOC_BLOCKHDRSZ);
|
|
|
|
pointer = MemoryChunkGetPointer(chunk);
|
Use doubly-linked block lists in aset.c to reduce large-chunk overhead.
Large chunks (those too large for any palloc freelist) are managed as
separate blocks. Formerly, realloc'ing or pfree'ing such a chunk required
O(N) time in a context with N blocks, since we had to traipse down the
singly-linked block list to locate the block's predecessor before we could
fix the list links. This can result in O(N^2) runtime in situations where
large numbers of such chunks are manipulated within one context. Cases
like that were not foreseen in the original design of aset.c, and indeed
didn't arise until fairly recently. But such problems can now occur in
reorderbuffer.c and in hash joining, both of which make repeated large
requests without scaling up their request size as they do so, and which
will free their requests in not-necessarily-LIFO order.
To fix, change the block list from singly-linked to doubly-linked.
This adds another 4 or 8 bytes to ALLOC_BLOCKHDRSZ, but that doesn't
seem like unacceptable overhead, since aset.c's blocks are normally
8K or more, and never less than 1K in current practice.
In passing, get rid of some redundant AllocChunkGetPointer() calls in
AllocSetRealloc (the compiler might be smart enough to optimize these
away anyway, but no need to assume that) and improve AllocSetCheck's
checking of block header fields.
Back-patch to 9.4 where reorderbuffer.c appeared. We could take this
further back, but currently there's no evidence that it would be useful.
Discussion: https://postgr.es/m/CAMkU=1x1hvue1XYrZoWk_omG0Ja5nBvTdvgrOeVkkeqs71CV8g@mail.gmail.com
2017-03-08 18:21:12 +01:00
|
|
|
if (block->prev)
|
|
|
|
block->prev->next = block;
|
1999-08-24 22:11:19 +02:00
|
|
|
else
|
Use doubly-linked block lists in aset.c to reduce large-chunk overhead.
Large chunks (those too large for any palloc freelist) are managed as
separate blocks. Formerly, realloc'ing or pfree'ing such a chunk required
O(N) time in a context with N blocks, since we had to traipse down the
singly-linked block list to locate the block's predecessor before we could
fix the list links. This can result in O(N^2) runtime in situations where
large numbers of such chunks are manipulated within one context. Cases
like that were not foreseen in the original design of aset.c, and indeed
didn't arise until fairly recently. But such problems can now occur in
reorderbuffer.c and in hash joining, both of which make repeated large
requests without scaling up their request size as they do so, and which
will free their requests in not-necessarily-LIFO order.
To fix, change the block list from singly-linked to doubly-linked.
This adds another 4 or 8 bytes to ALLOC_BLOCKHDRSZ, but that doesn't
seem like unacceptable overhead, since aset.c's blocks are normally
8K or more, and never less than 1K in current practice.
In passing, get rid of some redundant AllocChunkGetPointer() calls in
AllocSetRealloc (the compiler might be smart enough to optimize these
away anyway, but no need to assume that) and improve AllocSetCheck's
checking of block header fields.
Back-patch to 9.4 where reorderbuffer.c appeared. We could take this
further back, but currently there's no evidence that it would be useful.
Discussion: https://postgr.es/m/CAMkU=1x1hvue1XYrZoWk_omG0Ja5nBvTdvgrOeVkkeqs71CV8g@mail.gmail.com
2017-03-08 18:21:12 +01:00
|
|
|
set->blocks = block;
|
|
|
|
if (block->next)
|
|
|
|
block->next->prev = block;
|
2001-03-22 05:01:46 +01:00
|
|
|
|
Here is the patch with memory leak checker. This checker allow detect
in-chunk leaks, overwrite-next-chunk leaks and overwrite block-freeptr leaks.
A in-chunk leak --- if something overwrite space after wanted (via palloc()
size, but it is still inside chunk. For example
x = palloc(12); /* create 16b chunk */
memset(x, '#', 13);
this leak is in the current source total invisible, because chunk is 16b and
leak is in the "align space".
For this feature I add data_size to StandardChunk, and all memory which go
from AllocSetAlloc() is marked as 0x7F.
The MemoryContextCheck() is compiled '#ifdef USE_ASSERT_CHECKING'.
I add this checking to 'tcop/postgres.c' and is active after each backend
query, but it is probably not sufficient, because some MemoryContext exist
only during memory processing --- will good if someone who known where
it is needful (Tom:-) add it for others contexts;
A problem in the current source is that we have still some malloc()
allocation that is not needful and this allocation is total invisible for
all context routines. For example Dllist in backend (pretty dirty it is in
catcache where values in Dllist are palloc-ed, but list is malloc-ed).
--- and BTW. this Dllist design stand in the way for query cache :-)
Tom, if you agree I start replace some mallocs.
BTW. --- Tom, have you idea for across transaction presistent allocation for
SQL functions? (like regex - now it is via malloc)
I almost forget. I add one if() to AllocSetAlloc(), for 'size' that are
greater than ALLOC_BIGCHUNK_LIMIT is not needful check AllocSetFreeIndex(),
because 'fidx' is always 'ALLOCSET_NUM_FREELISTS - 1'. It a little brisk up
allocation for very large chunks. Right?
Karel
2000-07-11 16:30:37 +02:00
|
|
|
#ifdef MEMORY_CONTEXT_CHECKING
|
2008-04-12 00:54:23 +02:00
|
|
|
#ifdef RANDOMIZE_ALLOCATED_MEMORY
|
Fix erroneous Valgrind markings in AllocSetRealloc.
If asked to decrease the size of a large (>8K) palloc chunk,
AllocSetRealloc could improperly change the Valgrind state of memory
beyond the new end of the chunk: it would mark data UNDEFINED as far
as the old end of the chunk after having done the realloc(3) call,
thus tromping on the state of memory that no longer belongs to it.
One would normally expect that memory to now be marked NOACCESS,
so that this mislabeling might prevent detection of later errors.
If realloc() had chosen to move the chunk someplace else (unlikely,
but well within its rights) we could also mismark perfectly-valid
DEFINED data as UNDEFINED, causing false-positive valgrind reports
later. Also, any malloc bookkeeping data placed within this area
might now be wrongly marked, causing additional problems.
Fix by replacing relevant uses of "oldsize" with "Min(size, oldsize)".
It's sufficient to mark as far as "size" when that's smaller, because
whatever remains in the new chunk size will be marked NOACCESS below,
and we expect realloc() to have taken care of marking the memory
beyond the new official end of the chunk.
While we're here, also rename the function's "oldsize" variable
to "oldchksize" to more clearly explain what it actually holds,
namely the distance to the end of the chunk (that is, requested size
plus trailing padding). This is more consistent with the use of
"size" and "chksize" to hold the new requested size and chunk size.
Add a new variable "oldsize" in the one stanza where we're actually
talking about the old requested size.
Oversight in commit c477f3e44. Back-patch to all supported branches,
as that was, just in case anybody wants to do valgrind testing on back
branches.
Karina Litskevich
Discussion: https://postgr.es/m/CACiT8iaAET-fmzjjZLjaJC4zwSJmrFyL7LAdHwaYyjjQOQ4hcg@mail.gmail.com
2023-02-22 00:47:46 +01:00
|
|
|
|
|
|
|
/*
|
|
|
|
* We can only randomize the extra space if we know the prior request.
|
|
|
|
* When using Valgrind, randomize_mem() also marks memory UNDEFINED.
|
|
|
|
*/
|
2019-10-03 19:56:26 +02:00
|
|
|
if (size > chunk->requested_size)
|
|
|
|
randomize_mem((char *) pointer + chunk->requested_size,
|
|
|
|
size - chunk->requested_size);
|
Fix erroneous Valgrind markings in AllocSetRealloc.
If asked to decrease the size of a large (>8K) palloc chunk,
AllocSetRealloc could improperly change the Valgrind state of memory
beyond the new end of the chunk: it would mark data UNDEFINED as far
as the old end of the chunk after having done the realloc(3) call,
thus tromping on the state of memory that no longer belongs to it.
One would normally expect that memory to now be marked NOACCESS,
so that this mislabeling might prevent detection of later errors.
If realloc() had chosen to move the chunk someplace else (unlikely,
but well within its rights) we could also mismark perfectly-valid
DEFINED data as UNDEFINED, causing false-positive valgrind reports
later. Also, any malloc bookkeeping data placed within this area
might now be wrongly marked, causing additional problems.
Fix by replacing relevant uses of "oldsize" with "Min(size, oldsize)".
It's sufficient to mark as far as "size" when that's smaller, because
whatever remains in the new chunk size will be marked NOACCESS below,
and we expect realloc() to have taken care of marking the memory
beyond the new official end of the chunk.
While we're here, also rename the function's "oldsize" variable
to "oldchksize" to more clearly explain what it actually holds,
namely the distance to the end of the chunk (that is, requested size
plus trailing padding). This is more consistent with the use of
"size" and "chksize" to hold the new requested size and chunk size.
Add a new variable "oldsize" in the one stanza where we're actually
talking about the old requested size.
Oversight in commit c477f3e44. Back-patch to all supported branches,
as that was, just in case anybody wants to do valgrind testing on back
branches.
Karina Litskevich
Discussion: https://postgr.es/m/CACiT8iaAET-fmzjjZLjaJC4zwSJmrFyL7LAdHwaYyjjQOQ4hcg@mail.gmail.com
2023-02-22 00:47:46 +01:00
|
|
|
#else
|
2008-04-12 00:54:23 +02:00
|
|
|
|
2013-06-27 02:00:08 +02:00
|
|
|
/*
|
Fix erroneous Valgrind markings in AllocSetRealloc.
If asked to decrease the size of a large (>8K) palloc chunk,
AllocSetRealloc could improperly change the Valgrind state of memory
beyond the new end of the chunk: it would mark data UNDEFINED as far
as the old end of the chunk after having done the realloc(3) call,
thus tromping on the state of memory that no longer belongs to it.
One would normally expect that memory to now be marked NOACCESS,
so that this mislabeling might prevent detection of later errors.
If realloc() had chosen to move the chunk someplace else (unlikely,
but well within its rights) we could also mismark perfectly-valid
DEFINED data as UNDEFINED, causing false-positive valgrind reports
later. Also, any malloc bookkeeping data placed within this area
might now be wrongly marked, causing additional problems.
Fix by replacing relevant uses of "oldsize" with "Min(size, oldsize)".
It's sufficient to mark as far as "size" when that's smaller, because
whatever remains in the new chunk size will be marked NOACCESS below,
and we expect realloc() to have taken care of marking the memory
beyond the new official end of the chunk.
While we're here, also rename the function's "oldsize" variable
to "oldchksize" to more clearly explain what it actually holds,
namely the distance to the end of the chunk (that is, requested size
plus trailing padding). This is more consistent with the use of
"size" and "chksize" to hold the new requested size and chunk size.
Add a new variable "oldsize" in the one stanza where we're actually
talking about the old requested size.
Oversight in commit c477f3e44. Back-patch to all supported branches,
as that was, just in case anybody wants to do valgrind testing on back
branches.
Karina Litskevich
Discussion: https://postgr.es/m/CACiT8iaAET-fmzjjZLjaJC4zwSJmrFyL7LAdHwaYyjjQOQ4hcg@mail.gmail.com
2023-02-22 00:47:46 +01:00
|
|
|
* If this is an increase, realloc() will have marked any
|
|
|
|
* newly-allocated part (from oldchksize to chksize) UNDEFINED, but we
|
|
|
|
* also need to adjust trailing bytes from the old allocation (from
|
|
|
|
* chunk->requested_size to oldchksize) as they are marked NOACCESS.
|
|
|
|
* Make sure not to mark too many bytes in case chunk->requested_size
|
|
|
|
* < size < oldchksize.
|
2013-06-27 02:00:08 +02:00
|
|
|
*/
|
2019-10-03 19:56:26 +02:00
|
|
|
#ifdef USE_VALGRIND
|
Fix erroneous Valgrind markings in AllocSetRealloc.
If asked to decrease the size of a large (>8K) palloc chunk,
AllocSetRealloc could improperly change the Valgrind state of memory
beyond the new end of the chunk: it would mark data UNDEFINED as far
as the old end of the chunk after having done the realloc(3) call,
thus tromping on the state of memory that no longer belongs to it.
One would normally expect that memory to now be marked NOACCESS,
so that this mislabeling might prevent detection of later errors.
If realloc() had chosen to move the chunk someplace else (unlikely,
but well within its rights) we could also mismark perfectly-valid
DEFINED data as UNDEFINED, causing false-positive valgrind reports
later. Also, any malloc bookkeeping data placed within this area
might now be wrongly marked, causing additional problems.
Fix by replacing relevant uses of "oldsize" with "Min(size, oldsize)".
It's sufficient to mark as far as "size" when that's smaller, because
whatever remains in the new chunk size will be marked NOACCESS below,
and we expect realloc() to have taken care of marking the memory
beyond the new official end of the chunk.
While we're here, also rename the function's "oldsize" variable
to "oldchksize" to more clearly explain what it actually holds,
namely the distance to the end of the chunk (that is, requested size
plus trailing padding). This is more consistent with the use of
"size" and "chksize" to hold the new requested size and chunk size.
Add a new variable "oldsize" in the one stanza where we're actually
talking about the old requested size.
Oversight in commit c477f3e44. Back-patch to all supported branches,
as that was, just in case anybody wants to do valgrind testing on back
branches.
Karina Litskevich
Discussion: https://postgr.es/m/CACiT8iaAET-fmzjjZLjaJC4zwSJmrFyL7LAdHwaYyjjQOQ4hcg@mail.gmail.com
2023-02-22 00:47:46 +01:00
|
|
|
if (Min(size, oldchksize) > chunk->requested_size)
|
2019-10-03 19:56:26 +02:00
|
|
|
VALGRIND_MAKE_MEM_UNDEFINED((char *) pointer + chunk->requested_size,
|
Fix erroneous Valgrind markings in AllocSetRealloc.
If asked to decrease the size of a large (>8K) palloc chunk,
AllocSetRealloc could improperly change the Valgrind state of memory
beyond the new end of the chunk: it would mark data UNDEFINED as far
as the old end of the chunk after having done the realloc(3) call,
thus tromping on the state of memory that no longer belongs to it.
One would normally expect that memory to now be marked NOACCESS,
so that this mislabeling might prevent detection of later errors.
If realloc() had chosen to move the chunk someplace else (unlikely,
but well within its rights) we could also mismark perfectly-valid
DEFINED data as UNDEFINED, causing false-positive valgrind reports
later. Also, any malloc bookkeeping data placed within this area
might now be wrongly marked, causing additional problems.
Fix by replacing relevant uses of "oldsize" with "Min(size, oldsize)".
It's sufficient to mark as far as "size" when that's smaller, because
whatever remains in the new chunk size will be marked NOACCESS below,
and we expect realloc() to have taken care of marking the memory
beyond the new official end of the chunk.
While we're here, also rename the function's "oldsize" variable
to "oldchksize" to more clearly explain what it actually holds,
namely the distance to the end of the chunk (that is, requested size
plus trailing padding). This is more consistent with the use of
"size" and "chksize" to hold the new requested size and chunk size.
Add a new variable "oldsize" in the one stanza where we're actually
talking about the old requested size.
Oversight in commit c477f3e44. Back-patch to all supported branches,
as that was, just in case anybody wants to do valgrind testing on back
branches.
Karina Litskevich
Discussion: https://postgr.es/m/CACiT8iaAET-fmzjjZLjaJC4zwSJmrFyL7LAdHwaYyjjQOQ4hcg@mail.gmail.com
2023-02-22 00:47:46 +01:00
|
|
|
Min(size, oldchksize) - chunk->requested_size);
|
|
|
|
#endif
|
2019-10-03 19:56:26 +02:00
|
|
|
#endif
|
2013-06-27 02:00:08 +02:00
|
|
|
|
2000-12-01 06:16:45 +01:00
|
|
|
chunk->requested_size = size;
|
|
|
|
/* set mark to catch clobber of "unused" space */
|
Make more effort to put a sentinel at the end of allocated memory
Traditionally, in MEMORY_CONTEXT_CHECKING builds, we only ever marked a
sentinel byte just beyond the requested size if there happened to be
enough space on the chunk to do so. For Slab and Generation context
types, we only rounded the size of the chunk up to the next maxalign
boundary, so it was often not that likely that those would ever have space
for the sentinel given that the majority of allocation requests are going
to be for sizes which are maxaligned. For AllocSet, it was a little
different as smaller allocations are rounded up to the next power-of-2
value rather than the next maxalign boundary, so we're a bit more likely
to have space for the sentinel byte, especially when we get away from tiny
sized allocations such as 8 or 16 bytes.
Here we make more of an effort to allow space so that there is enough room
for the sentinel byte in more cases. This makes it more likely that we'll
detect when buggy code accidentally writes beyond the end of any of its
memory allocations.
Each of the 3 MemoryContext types has been changed as follows:
The Slab allocator will now always set a sentinel byte. Both the current
usages of this MemoryContext type happen to use chunk sizes which were on
the maxalign boundary, so these never used sentinel bytes previously.
For the Generation allocator, we now always ensure there's enough space in
the allocation for a sentinel byte.
For AllocSet, this commit makes an adjustment for allocation sizes which
are greater than allocChunkLimit. We now ensure there is always space for
a sentinel byte. We don't alter the sentinel behavior for request sizes
<= allocChunkLimit. Making way for the sentinel byte for power-of-2
request sizes would require doubling up to the next power of 2. Some
analysis done on the request sizes made during installcheck shows that a
fairly large portion of allocation requests are for power-of-2 sizes. The
amount of additional memory for the sentinel there seems prohibitive, so
we do nothing for those here.
Author: David Rowley
Discussion: https://postgr.es/m/3478405.1661824539@sss.pgh.pa.us
2022-09-07 05:46:57 +02:00
|
|
|
Assert(size < chksize);
|
|
|
|
set_sentinel(pointer, size);
|
2013-06-27 02:00:08 +02:00
|
|
|
#else /* !MEMORY_CONTEXT_CHECKING */
|
|
|
|
|
|
|
|
/*
|
Fix erroneous Valgrind markings in AllocSetRealloc.
If asked to decrease the size of a large (>8K) palloc chunk,
AllocSetRealloc could improperly change the Valgrind state of memory
beyond the new end of the chunk: it would mark data UNDEFINED as far
as the old end of the chunk after having done the realloc(3) call,
thus tromping on the state of memory that no longer belongs to it.
One would normally expect that memory to now be marked NOACCESS,
so that this mislabeling might prevent detection of later errors.
If realloc() had chosen to move the chunk someplace else (unlikely,
but well within its rights) we could also mismark perfectly-valid
DEFINED data as UNDEFINED, causing false-positive valgrind reports
later. Also, any malloc bookkeeping data placed within this area
might now be wrongly marked, causing additional problems.
Fix by replacing relevant uses of "oldsize" with "Min(size, oldsize)".
It's sufficient to mark as far as "size" when that's smaller, because
whatever remains in the new chunk size will be marked NOACCESS below,
and we expect realloc() to have taken care of marking the memory
beyond the new official end of the chunk.
While we're here, also rename the function's "oldsize" variable
to "oldchksize" to more clearly explain what it actually holds,
namely the distance to the end of the chunk (that is, requested size
plus trailing padding). This is more consistent with the use of
"size" and "chksize" to hold the new requested size and chunk size.
Add a new variable "oldsize" in the one stanza where we're actually
talking about the old requested size.
Oversight in commit c477f3e44. Back-patch to all supported branches,
as that was, just in case anybody wants to do valgrind testing on back
branches.
Karina Litskevich
Discussion: https://postgr.es/m/CACiT8iaAET-fmzjjZLjaJC4zwSJmrFyL7LAdHwaYyjjQOQ4hcg@mail.gmail.com
2023-02-22 00:47:46 +01:00
|
|
|
* We may need to adjust marking of bytes from the old allocation as
|
|
|
|
* some of them may be marked NOACCESS. We don't know how much of the
|
|
|
|
* old chunk size was the requested size; it could have been as small
|
|
|
|
* as one byte. We have to be conservative and just mark the entire
|
|
|
|
* old portion DEFINED. Make sure not to mark memory beyond the new
|
|
|
|
* allocation in case it's smaller than the old one.
|
2013-06-27 02:00:08 +02:00
|
|
|
*/
|
Fix erroneous Valgrind markings in AllocSetRealloc.
If asked to decrease the size of a large (>8K) palloc chunk,
AllocSetRealloc could improperly change the Valgrind state of memory
beyond the new end of the chunk: it would mark data UNDEFINED as far
as the old end of the chunk after having done the realloc(3) call,
thus tromping on the state of memory that no longer belongs to it.
One would normally expect that memory to now be marked NOACCESS,
so that this mislabeling might prevent detection of later errors.
If realloc() had chosen to move the chunk someplace else (unlikely,
but well within its rights) we could also mismark perfectly-valid
DEFINED data as UNDEFINED, causing false-positive valgrind reports
later. Also, any malloc bookkeeping data placed within this area
might now be wrongly marked, causing additional problems.
Fix by replacing relevant uses of "oldsize" with "Min(size, oldsize)".
It's sufficient to mark as far as "size" when that's smaller, because
whatever remains in the new chunk size will be marked NOACCESS below,
and we expect realloc() to have taken care of marking the memory
beyond the new official end of the chunk.
While we're here, also rename the function's "oldsize" variable
to "oldchksize" to more clearly explain what it actually holds,
namely the distance to the end of the chunk (that is, requested size
plus trailing padding). This is more consistent with the use of
"size" and "chksize" to hold the new requested size and chunk size.
Add a new variable "oldsize" in the one stanza where we're actually
talking about the old requested size.
Oversight in commit c477f3e44. Back-patch to all supported branches,
as that was, just in case anybody wants to do valgrind testing on back
branches.
Karina Litskevich
Discussion: https://postgr.es/m/CACiT8iaAET-fmzjjZLjaJC4zwSJmrFyL7LAdHwaYyjjQOQ4hcg@mail.gmail.com
2023-02-22 00:47:46 +01:00
|
|
|
VALGRIND_MAKE_MEM_DEFINED(pointer, Min(size, oldchksize));
|
Here is the patch with memory leak checker. This checker allow detect
in-chunk leaks, overwrite-next-chunk leaks and overwrite block-freeptr leaks.
A in-chunk leak --- if something overwrite space after wanted (via palloc()
size, but it is still inside chunk. For example
x = palloc(12); /* create 16b chunk */
memset(x, '#', 13);
this leak is in the current source total invisible, because chunk is 16b and
leak is in the "align space".
For this feature I add data_size to StandardChunk, and all memory which go
from AllocSetAlloc() is marked as 0x7F.
The MemoryContextCheck() is compiled '#ifdef USE_ASSERT_CHECKING'.
I add this checking to 'tcop/postgres.c' and is active after each backend
query, but it is probably not sufficient, because some MemoryContext exist
only during memory processing --- will good if someone who known where
it is needful (Tom:-) add it for others contexts;
A problem in the current source is that we have still some malloc()
allocation that is not needful and this allocation is total invisible for
all context routines. For example Dllist in backend (pretty dirty it is in
catcache where values in Dllist are palloc-ed, but list is malloc-ed).
--- and BTW. this Dllist design stand in the way for query cache :-)
Tom, if you agree I start replace some mallocs.
BTW. --- Tom, have you idea for across transaction presistent allocation for
SQL functions? (like regex - now it is via malloc)
I almost forget. I add one if() to AllocSetAlloc(), for 'size' that are
greater than ALLOC_BIGCHUNK_LIMIT is not needful check AllocSetFreeIndex(),
because 'fidx' is always 'ALLOCSET_NUM_FREELISTS - 1'. It a little brisk up
allocation for very large chunks. Right?
Karel
2000-07-11 16:30:37 +02:00
|
|
|
#endif
|
2000-12-01 06:16:45 +01:00
|
|
|
|
2017-11-25 01:28:19 +01:00
|
|
|
/* Ensure any padding bytes are marked NOACCESS. */
|
2013-06-27 02:00:08 +02:00
|
|
|
VALGRIND_MAKE_MEM_NOACCESS((char *) pointer + size, chksize - size);
|
Use doubly-linked block lists in aset.c to reduce large-chunk overhead.
Large chunks (those too large for any palloc freelist) are managed as
separate blocks. Formerly, realloc'ing or pfree'ing such a chunk required
O(N) time in a context with N blocks, since we had to traipse down the
singly-linked block list to locate the block's predecessor before we could
fix the list links. This can result in O(N^2) runtime in situations where
large numbers of such chunks are manipulated within one context. Cases
like that were not foreseen in the original design of aset.c, and indeed
didn't arise until fairly recently. But such problems can now occur in
reorderbuffer.c and in hash joining, both of which make repeated large
requests without scaling up their request size as they do so, and which
will free their requests in not-necessarily-LIFO order.
To fix, change the block list from singly-linked to doubly-linked.
This adds another 4 or 8 bytes to ALLOC_BLOCKHDRSZ, but that doesn't
seem like unacceptable overhead, since aset.c's blocks are normally
8K or more, and never less than 1K in current practice.
In passing, get rid of some redundant AllocChunkGetPointer() calls in
AllocSetRealloc (the compiler might be smart enough to optimize these
away anyway, but no need to assume that) and improve AllocSetCheck's
checking of block header fields.
Back-patch to 9.4 where reorderbuffer.c appeared. We could take this
further back, but currently there's no evidence that it would be useful.
Discussion: https://postgr.es/m/CAMkU=1x1hvue1XYrZoWk_omG0Ja5nBvTdvgrOeVkkeqs71CV8g@mail.gmail.com
2017-03-08 18:21:12 +01:00
|
|
|
|
2023-04-15 01:59:52 +02:00
|
|
|
/* Disallow access to the chunk header . */
|
|
|
|
VALGRIND_MAKE_MEM_NOACCESS(chunk, ALLOC_CHUNKHDRSZ);
|
2017-11-25 01:28:19 +01:00
|
|
|
|
Use doubly-linked block lists in aset.c to reduce large-chunk overhead.
Large chunks (those too large for any palloc freelist) are managed as
separate blocks. Formerly, realloc'ing or pfree'ing such a chunk required
O(N) time in a context with N blocks, since we had to traipse down the
singly-linked block list to locate the block's predecessor before we could
fix the list links. This can result in O(N^2) runtime in situations where
large numbers of such chunks are manipulated within one context. Cases
like that were not foreseen in the original design of aset.c, and indeed
didn't arise until fairly recently. But such problems can now occur in
reorderbuffer.c and in hash joining, both of which make repeated large
requests without scaling up their request size as they do so, and which
will free their requests in not-necessarily-LIFO order.
To fix, change the block list from singly-linked to doubly-linked.
This adds another 4 or 8 bytes to ALLOC_BLOCKHDRSZ, but that doesn't
seem like unacceptable overhead, since aset.c's blocks are normally
8K or more, and never less than 1K in current practice.
In passing, get rid of some redundant AllocChunkGetPointer() calls in
AllocSetRealloc (the compiler might be smart enough to optimize these
away anyway, but no need to assume that) and improve AllocSetCheck's
checking of block header fields.
Back-patch to 9.4 where reorderbuffer.c appeared. We could take this
further back, but currently there's no evidence that it would be useful.
Discussion: https://postgr.es/m/CAMkU=1x1hvue1XYrZoWk_omG0Ja5nBvTdvgrOeVkkeqs71CV8g@mail.gmail.com
2017-03-08 18:21:12 +01:00
|
|
|
return pointer;
|
1999-08-24 22:11:19 +02:00
|
|
|
}
|
2019-10-03 19:56:26 +02:00
|
|
|
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
block = MemoryChunkGetBlock(chunk);
|
Harden memory context allocators against bogus chunk pointers.
Before commit c6e0fe1f2, functions such as AllocSetFree could pretty
safely presume that they were given a valid chunk pointer for their
own type of context, because the indirect call through a memory
context object and method struct would be very unlikely to work
otherwise. But now, if pfree() is mistakenly invoked on a pointer
to garbage, we have three chances in eight of ending up at one of
these functions. That means we need to take extra measures to
verify that we are looking at what we're supposed to be looking at,
especially in debug builds.
Hence, add code to verify that the chunk's back-link to a block header
leads to a memory context object that satisfies the right sort of
IsA() check. This is still a bit weaker than what we did before,
but for the moment assume that an IsA() check is sufficient.
As a compromise between speed and safety, implement these checks
as Asserts when dealing with small chunks but plain test-and-elogs
when dealing with large (external) chunks. The latter case should
not be too performance-critical, but the former case probably is.
In slab.c, all chunks are small; but nonetheless use a plain test
in SlabRealloc, because that is certainly not performance-critical,
indeed we should be suspicious that it's being called in error.
In aset.c, additionally add some assertions that the "value" field
of the chunk header is within the small range allowed for freelist
indexes. Without that, we might find ourselves trying to wipe
most of memory when CLOBBER_FREED_MEMORY is enabled, or scribbling
on a "freelist header" that's far away from the context object.
Eventually, field experience might show us that it's smarter for
these tests to be active always, but for now we'll try to get
away with just having them as assertions.
While at it, also be more uniform about asserting that context
objects passed as parameters are of the type we expect. Some
places missed that altogether, and slab.c was for no very good
reason doing it differently from the other allocators.
Discussion: https://postgr.es/m/3578387.1665244345@sss.pgh.pa.us
2022-10-11 00:45:34 +02:00
|
|
|
|
|
|
|
/*
|
|
|
|
* In this path, for speed reasons we just Assert that the referenced
|
|
|
|
* block is good. We can also Assert that the value field is sane. Future
|
|
|
|
* field experience may show that these Asserts had better become regular
|
|
|
|
* runtime test-and-elog checks.
|
|
|
|
*/
|
2022-10-28 09:19:06 +02:00
|
|
|
Assert(AllocBlockIsValid(block));
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
set = block->aset;
|
|
|
|
|
Harden memory context allocators against bogus chunk pointers.
Before commit c6e0fe1f2, functions such as AllocSetFree could pretty
safely presume that they were given a valid chunk pointer for their
own type of context, because the indirect call through a memory
context object and method struct would be very unlikely to work
otherwise. But now, if pfree() is mistakenly invoked on a pointer
to garbage, we have three chances in eight of ending up at one of
these functions. That means we need to take extra measures to
verify that we are looking at what we're supposed to be looking at,
especially in debug builds.
Hence, add code to verify that the chunk's back-link to a block header
leads to a memory context object that satisfies the right sort of
IsA() check. This is still a bit weaker than what we did before,
but for the moment assume that an IsA() check is sufficient.
As a compromise between speed and safety, implement these checks
as Asserts when dealing with small chunks but plain test-and-elogs
when dealing with large (external) chunks. The latter case should
not be too performance-critical, but the former case probably is.
In slab.c, all chunks are small; but nonetheless use a plain test
in SlabRealloc, because that is certainly not performance-critical,
indeed we should be suspicious that it's being called in error.
In aset.c, additionally add some assertions that the "value" field
of the chunk header is within the small range allowed for freelist
indexes. Without that, we might find ourselves trying to wipe
most of memory when CLOBBER_FREED_MEMORY is enabled, or scribbling
on a "freelist header" that's far away from the context object.
Eventually, field experience might show us that it's smarter for
these tests to be active always, but for now we'll try to get
away with just having them as assertions.
While at it, also be more uniform about asserting that context
objects passed as parameters are of the type we expect. Some
places missed that altogether, and slab.c was for no very good
reason doing it differently from the other allocators.
Discussion: https://postgr.es/m/3578387.1665244345@sss.pgh.pa.us
2022-10-11 00:45:34 +02:00
|
|
|
fidx = MemoryChunkGetValue(chunk);
|
|
|
|
Assert(FreeListIdxIsValid(fidx));
|
Fix erroneous Valgrind markings in AllocSetRealloc.
If asked to decrease the size of a large (>8K) palloc chunk,
AllocSetRealloc could improperly change the Valgrind state of memory
beyond the new end of the chunk: it would mark data UNDEFINED as far
as the old end of the chunk after having done the realloc(3) call,
thus tromping on the state of memory that no longer belongs to it.
One would normally expect that memory to now be marked NOACCESS,
so that this mislabeling might prevent detection of later errors.
If realloc() had chosen to move the chunk someplace else (unlikely,
but well within its rights) we could also mismark perfectly-valid
DEFINED data as UNDEFINED, causing false-positive valgrind reports
later. Also, any malloc bookkeeping data placed within this area
might now be wrongly marked, causing additional problems.
Fix by replacing relevant uses of "oldsize" with "Min(size, oldsize)".
It's sufficient to mark as far as "size" when that's smaller, because
whatever remains in the new chunk size will be marked NOACCESS below,
and we expect realloc() to have taken care of marking the memory
beyond the new official end of the chunk.
While we're here, also rename the function's "oldsize" variable
to "oldchksize" to more clearly explain what it actually holds,
namely the distance to the end of the chunk (that is, requested size
plus trailing padding). This is more consistent with the use of
"size" and "chksize" to hold the new requested size and chunk size.
Add a new variable "oldsize" in the one stanza where we're actually
talking about the old requested size.
Oversight in commit c477f3e44. Back-patch to all supported branches,
as that was, just in case anybody wants to do valgrind testing on back
branches.
Karina Litskevich
Discussion: https://postgr.es/m/CACiT8iaAET-fmzjjZLjaJC4zwSJmrFyL7LAdHwaYyjjQOQ4hcg@mail.gmail.com
2023-02-22 00:47:46 +01:00
|
|
|
oldchksize = GetChunkSizeFromFreeListIdx(fidx);
|
Harden memory context allocators against bogus chunk pointers.
Before commit c6e0fe1f2, functions such as AllocSetFree could pretty
safely presume that they were given a valid chunk pointer for their
own type of context, because the indirect call through a memory
context object and method struct would be very unlikely to work
otherwise. But now, if pfree() is mistakenly invoked on a pointer
to garbage, we have three chances in eight of ending up at one of
these functions. That means we need to take extra measures to
verify that we are looking at what we're supposed to be looking at,
especially in debug builds.
Hence, add code to verify that the chunk's back-link to a block header
leads to a memory context object that satisfies the right sort of
IsA() check. This is still a bit weaker than what we did before,
but for the moment assume that an IsA() check is sufficient.
As a compromise between speed and safety, implement these checks
as Asserts when dealing with small chunks but plain test-and-elogs
when dealing with large (external) chunks. The latter case should
not be too performance-critical, but the former case probably is.
In slab.c, all chunks are small; but nonetheless use a plain test
in SlabRealloc, because that is certainly not performance-critical,
indeed we should be suspicious that it's being called in error.
In aset.c, additionally add some assertions that the "value" field
of the chunk header is within the small range allowed for freelist
indexes. Without that, we might find ourselves trying to wipe
most of memory when CLOBBER_FREED_MEMORY is enabled, or scribbling
on a "freelist header" that's far away from the context object.
Eventually, field experience might show us that it's smarter for
these tests to be active always, but for now we'll try to get
away with just having them as assertions.
While at it, also be more uniform about asserting that context
objects passed as parameters are of the type we expect. Some
places missed that altogether, and slab.c was for no very good
reason doing it differently from the other allocators.
Discussion: https://postgr.es/m/3578387.1665244345@sss.pgh.pa.us
2022-10-11 00:45:34 +02:00
|
|
|
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
#ifdef MEMORY_CONTEXT_CHECKING
|
|
|
|
/* Test for someone scribbling on unused space in chunk */
|
Fix erroneous Valgrind markings in AllocSetRealloc.
If asked to decrease the size of a large (>8K) palloc chunk,
AllocSetRealloc could improperly change the Valgrind state of memory
beyond the new end of the chunk: it would mark data UNDEFINED as far
as the old end of the chunk after having done the realloc(3) call,
thus tromping on the state of memory that no longer belongs to it.
One would normally expect that memory to now be marked NOACCESS,
so that this mislabeling might prevent detection of later errors.
If realloc() had chosen to move the chunk someplace else (unlikely,
but well within its rights) we could also mismark perfectly-valid
DEFINED data as UNDEFINED, causing false-positive valgrind reports
later. Also, any malloc bookkeeping data placed within this area
might now be wrongly marked, causing additional problems.
Fix by replacing relevant uses of "oldsize" with "Min(size, oldsize)".
It's sufficient to mark as far as "size" when that's smaller, because
whatever remains in the new chunk size will be marked NOACCESS below,
and we expect realloc() to have taken care of marking the memory
beyond the new official end of the chunk.
While we're here, also rename the function's "oldsize" variable
to "oldchksize" to more clearly explain what it actually holds,
namely the distance to the end of the chunk (that is, requested size
plus trailing padding). This is more consistent with the use of
"size" and "chksize" to hold the new requested size and chunk size.
Add a new variable "oldsize" in the one stanza where we're actually
talking about the old requested size.
Oversight in commit c477f3e44. Back-patch to all supported branches,
as that was, just in case anybody wants to do valgrind testing on back
branches.
Karina Litskevich
Discussion: https://postgr.es/m/CACiT8iaAET-fmzjjZLjaJC4zwSJmrFyL7LAdHwaYyjjQOQ4hcg@mail.gmail.com
2023-02-22 00:47:46 +01:00
|
|
|
if (chunk->requested_size < oldchksize)
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
if (!sentinel_ok(pointer, chunk->requested_size))
|
|
|
|
elog(WARNING, "detected write past chunk end in %s %p",
|
|
|
|
set->header.name, chunk);
|
|
|
|
#endif
|
|
|
|
|
2019-10-03 19:56:26 +02:00
|
|
|
/*
|
|
|
|
* Chunk sizes are aligned to power of 2 in AllocSetAlloc(). Maybe the
|
|
|
|
* allocated area already is >= the new size. (In particular, we will
|
|
|
|
* fall out here if the requested size is a decrease.)
|
|
|
|
*/
|
Fix erroneous Valgrind markings in AllocSetRealloc.
If asked to decrease the size of a large (>8K) palloc chunk,
AllocSetRealloc could improperly change the Valgrind state of memory
beyond the new end of the chunk: it would mark data UNDEFINED as far
as the old end of the chunk after having done the realloc(3) call,
thus tromping on the state of memory that no longer belongs to it.
One would normally expect that memory to now be marked NOACCESS,
so that this mislabeling might prevent detection of later errors.
If realloc() had chosen to move the chunk someplace else (unlikely,
but well within its rights) we could also mismark perfectly-valid
DEFINED data as UNDEFINED, causing false-positive valgrind reports
later. Also, any malloc bookkeeping data placed within this area
might now be wrongly marked, causing additional problems.
Fix by replacing relevant uses of "oldsize" with "Min(size, oldsize)".
It's sufficient to mark as far as "size" when that's smaller, because
whatever remains in the new chunk size will be marked NOACCESS below,
and we expect realloc() to have taken care of marking the memory
beyond the new official end of the chunk.
While we're here, also rename the function's "oldsize" variable
to "oldchksize" to more clearly explain what it actually holds,
namely the distance to the end of the chunk (that is, requested size
plus trailing padding). This is more consistent with the use of
"size" and "chksize" to hold the new requested size and chunk size.
Add a new variable "oldsize" in the one stanza where we're actually
talking about the old requested size.
Oversight in commit c477f3e44. Back-patch to all supported branches,
as that was, just in case anybody wants to do valgrind testing on back
branches.
Karina Litskevich
Discussion: https://postgr.es/m/CACiT8iaAET-fmzjjZLjaJC4zwSJmrFyL7LAdHwaYyjjQOQ4hcg@mail.gmail.com
2023-02-22 00:47:46 +01:00
|
|
|
if (oldchksize >= size)
|
2019-10-03 19:56:26 +02:00
|
|
|
{
|
|
|
|
#ifdef MEMORY_CONTEXT_CHECKING
|
|
|
|
Size oldrequest = chunk->requested_size;
|
|
|
|
|
|
|
|
#ifdef RANDOMIZE_ALLOCATED_MEMORY
|
|
|
|
/* We can only fill the extra space if we know the prior request */
|
|
|
|
if (size > oldrequest)
|
|
|
|
randomize_mem((char *) pointer + oldrequest,
|
|
|
|
size - oldrequest);
|
|
|
|
#endif
|
|
|
|
|
|
|
|
chunk->requested_size = size;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* If this is an increase, mark any newly-available part UNDEFINED.
|
|
|
|
* Otherwise, mark the obsolete part NOACCESS.
|
|
|
|
*/
|
|
|
|
if (size > oldrequest)
|
|
|
|
VALGRIND_MAKE_MEM_UNDEFINED((char *) pointer + oldrequest,
|
|
|
|
size - oldrequest);
|
|
|
|
else
|
|
|
|
VALGRIND_MAKE_MEM_NOACCESS((char *) pointer + size,
|
Fix erroneous Valgrind markings in AllocSetRealloc.
If asked to decrease the size of a large (>8K) palloc chunk,
AllocSetRealloc could improperly change the Valgrind state of memory
beyond the new end of the chunk: it would mark data UNDEFINED as far
as the old end of the chunk after having done the realloc(3) call,
thus tromping on the state of memory that no longer belongs to it.
One would normally expect that memory to now be marked NOACCESS,
so that this mislabeling might prevent detection of later errors.
If realloc() had chosen to move the chunk someplace else (unlikely,
but well within its rights) we could also mismark perfectly-valid
DEFINED data as UNDEFINED, causing false-positive valgrind reports
later. Also, any malloc bookkeeping data placed within this area
might now be wrongly marked, causing additional problems.
Fix by replacing relevant uses of "oldsize" with "Min(size, oldsize)".
It's sufficient to mark as far as "size" when that's smaller, because
whatever remains in the new chunk size will be marked NOACCESS below,
and we expect realloc() to have taken care of marking the memory
beyond the new official end of the chunk.
While we're here, also rename the function's "oldsize" variable
to "oldchksize" to more clearly explain what it actually holds,
namely the distance to the end of the chunk (that is, requested size
plus trailing padding). This is more consistent with the use of
"size" and "chksize" to hold the new requested size and chunk size.
Add a new variable "oldsize" in the one stanza where we're actually
talking about the old requested size.
Oversight in commit c477f3e44. Back-patch to all supported branches,
as that was, just in case anybody wants to do valgrind testing on back
branches.
Karina Litskevich
Discussion: https://postgr.es/m/CACiT8iaAET-fmzjjZLjaJC4zwSJmrFyL7LAdHwaYyjjQOQ4hcg@mail.gmail.com
2023-02-22 00:47:46 +01:00
|
|
|
oldchksize - size);
|
2019-10-03 19:56:26 +02:00
|
|
|
|
|
|
|
/* set mark to catch clobber of "unused" space */
|
Fix erroneous Valgrind markings in AllocSetRealloc.
If asked to decrease the size of a large (>8K) palloc chunk,
AllocSetRealloc could improperly change the Valgrind state of memory
beyond the new end of the chunk: it would mark data UNDEFINED as far
as the old end of the chunk after having done the realloc(3) call,
thus tromping on the state of memory that no longer belongs to it.
One would normally expect that memory to now be marked NOACCESS,
so that this mislabeling might prevent detection of later errors.
If realloc() had chosen to move the chunk someplace else (unlikely,
but well within its rights) we could also mismark perfectly-valid
DEFINED data as UNDEFINED, causing false-positive valgrind reports
later. Also, any malloc bookkeeping data placed within this area
might now be wrongly marked, causing additional problems.
Fix by replacing relevant uses of "oldsize" with "Min(size, oldsize)".
It's sufficient to mark as far as "size" when that's smaller, because
whatever remains in the new chunk size will be marked NOACCESS below,
and we expect realloc() to have taken care of marking the memory
beyond the new official end of the chunk.
While we're here, also rename the function's "oldsize" variable
to "oldchksize" to more clearly explain what it actually holds,
namely the distance to the end of the chunk (that is, requested size
plus trailing padding). This is more consistent with the use of
"size" and "chksize" to hold the new requested size and chunk size.
Add a new variable "oldsize" in the one stanza where we're actually
talking about the old requested size.
Oversight in commit c477f3e44. Back-patch to all supported branches,
as that was, just in case anybody wants to do valgrind testing on back
branches.
Karina Litskevich
Discussion: https://postgr.es/m/CACiT8iaAET-fmzjjZLjaJC4zwSJmrFyL7LAdHwaYyjjQOQ4hcg@mail.gmail.com
2023-02-22 00:47:46 +01:00
|
|
|
if (size < oldchksize)
|
2019-10-03 19:56:26 +02:00
|
|
|
set_sentinel(pointer, size);
|
|
|
|
#else /* !MEMORY_CONTEXT_CHECKING */
|
|
|
|
|
|
|
|
/*
|
|
|
|
* We don't have the information to determine whether we're growing
|
|
|
|
* the old request or shrinking it, so we conservatively mark the
|
|
|
|
* entire new allocation DEFINED.
|
|
|
|
*/
|
Fix erroneous Valgrind markings in AllocSetRealloc.
If asked to decrease the size of a large (>8K) palloc chunk,
AllocSetRealloc could improperly change the Valgrind state of memory
beyond the new end of the chunk: it would mark data UNDEFINED as far
as the old end of the chunk after having done the realloc(3) call,
thus tromping on the state of memory that no longer belongs to it.
One would normally expect that memory to now be marked NOACCESS,
so that this mislabeling might prevent detection of later errors.
If realloc() had chosen to move the chunk someplace else (unlikely,
but well within its rights) we could also mismark perfectly-valid
DEFINED data as UNDEFINED, causing false-positive valgrind reports
later. Also, any malloc bookkeeping data placed within this area
might now be wrongly marked, causing additional problems.
Fix by replacing relevant uses of "oldsize" with "Min(size, oldsize)".
It's sufficient to mark as far as "size" when that's smaller, because
whatever remains in the new chunk size will be marked NOACCESS below,
and we expect realloc() to have taken care of marking the memory
beyond the new official end of the chunk.
While we're here, also rename the function's "oldsize" variable
to "oldchksize" to more clearly explain what it actually holds,
namely the distance to the end of the chunk (that is, requested size
plus trailing padding). This is more consistent with the use of
"size" and "chksize" to hold the new requested size and chunk size.
Add a new variable "oldsize" in the one stanza where we're actually
talking about the old requested size.
Oversight in commit c477f3e44. Back-patch to all supported branches,
as that was, just in case anybody wants to do valgrind testing on back
branches.
Karina Litskevich
Discussion: https://postgr.es/m/CACiT8iaAET-fmzjjZLjaJC4zwSJmrFyL7LAdHwaYyjjQOQ4hcg@mail.gmail.com
2023-02-22 00:47:46 +01:00
|
|
|
VALGRIND_MAKE_MEM_NOACCESS(pointer, oldchksize);
|
2019-10-03 19:56:26 +02:00
|
|
|
VALGRIND_MAKE_MEM_DEFINED(pointer, size);
|
|
|
|
#endif
|
|
|
|
|
2023-04-15 01:59:52 +02:00
|
|
|
/* Disallow access to the chunk header. */
|
|
|
|
VALGRIND_MAKE_MEM_NOACCESS(chunk, ALLOC_CHUNKHDRSZ);
|
2019-10-03 19:56:26 +02:00
|
|
|
|
|
|
|
return pointer;
|
|
|
|
}
|
1999-08-24 22:11:19 +02:00
|
|
|
else
|
|
|
|
{
|
2001-01-23 02:01:36 +01:00
|
|
|
/*
|
2019-10-03 19:56:26 +02:00
|
|
|
* Enlarge-a-small-chunk case. We just do this by brute force, ie,
|
|
|
|
* allocate a new chunk and copy the data. Since we know the existing
|
|
|
|
* data isn't huge, this won't involve any great memcpy expense, so
|
|
|
|
* it's not worth being smarter. (At one time we tried to avoid
|
|
|
|
* memcpy when it was possible to enlarge the chunk in-place, but that
|
|
|
|
* turns out to misbehave unpleasantly for repeated cycles of
|
2007-08-12 22:39:14 +02:00
|
|
|
* palloc/repalloc/pfree: the eventually freed chunks go into the
|
|
|
|
* wrong freelist for the next initial palloc request, and so we leak
|
|
|
|
* memory indefinitely. See pgsql-hackers archives for 2007-08-11.)
|
2001-01-23 02:01:36 +01:00
|
|
|
*/
|
|
|
|
AllocPointer newPointer;
|
Fix erroneous Valgrind markings in AllocSetRealloc.
If asked to decrease the size of a large (>8K) palloc chunk,
AllocSetRealloc could improperly change the Valgrind state of memory
beyond the new end of the chunk: it would mark data UNDEFINED as far
as the old end of the chunk after having done the realloc(3) call,
thus tromping on the state of memory that no longer belongs to it.
One would normally expect that memory to now be marked NOACCESS,
so that this mislabeling might prevent detection of later errors.
If realloc() had chosen to move the chunk someplace else (unlikely,
but well within its rights) we could also mismark perfectly-valid
DEFINED data as UNDEFINED, causing false-positive valgrind reports
later. Also, any malloc bookkeeping data placed within this area
might now be wrongly marked, causing additional problems.
Fix by replacing relevant uses of "oldsize" with "Min(size, oldsize)".
It's sufficient to mark as far as "size" when that's smaller, because
whatever remains in the new chunk size will be marked NOACCESS below,
and we expect realloc() to have taken care of marking the memory
beyond the new official end of the chunk.
While we're here, also rename the function's "oldsize" variable
to "oldchksize" to more clearly explain what it actually holds,
namely the distance to the end of the chunk (that is, requested size
plus trailing padding). This is more consistent with the use of
"size" and "chksize" to hold the new requested size and chunk size.
Add a new variable "oldsize" in the one stanza where we're actually
talking about the old requested size.
Oversight in commit c477f3e44. Back-patch to all supported branches,
as that was, just in case anybody wants to do valgrind testing on back
branches.
Karina Litskevich
Discussion: https://postgr.es/m/CACiT8iaAET-fmzjjZLjaJC4zwSJmrFyL7LAdHwaYyjjQOQ4hcg@mail.gmail.com
2023-02-22 00:47:46 +01:00
|
|
|
Size oldsize;
|
2001-01-23 02:01:36 +01:00
|
|
|
|
2024-02-27 04:39:42 +01:00
|
|
|
/* allocate new chunk (this also checks size is valid) */
|
|
|
|
newPointer = AllocSetAlloc((MemoryContext) set, size, flags);
|
1999-08-24 22:11:19 +02:00
|
|
|
|
2015-01-29 16:23:38 +01:00
|
|
|
/* leave immediately if request was not completed */
|
|
|
|
if (newPointer == NULL)
|
2017-11-25 01:28:19 +01:00
|
|
|
{
|
2023-04-15 01:59:52 +02:00
|
|
|
/* Disallow access to the chunk header. */
|
|
|
|
VALGRIND_MAKE_MEM_NOACCESS(chunk, ALLOC_CHUNKHDRSZ);
|
2024-02-27 04:39:42 +01:00
|
|
|
return MemoryContextAllocationFailure((MemoryContext) set, size, flags);
|
2017-11-25 01:28:19 +01:00
|
|
|
}
|
2015-01-29 16:23:38 +01:00
|
|
|
|
2013-06-27 02:00:08 +02:00
|
|
|
/*
|
2017-11-25 01:28:19 +01:00
|
|
|
* AllocSetAlloc() may have returned a region that is still NOACCESS.
|
|
|
|
* Change it to UNDEFINED for the moment; memcpy() will then transfer
|
|
|
|
* definedness from the old allocation to the new. If we know the old
|
|
|
|
* allocation, copy just that much. Otherwise, make the entire old
|
|
|
|
* chunk defined to avoid errors as we copy the currently-NOACCESS
|
|
|
|
* trailing bytes.
|
2013-06-27 02:00:08 +02:00
|
|
|
*/
|
|
|
|
VALGRIND_MAKE_MEM_UNDEFINED(newPointer, size);
|
|
|
|
#ifdef MEMORY_CONTEXT_CHECKING
|
|
|
|
oldsize = chunk->requested_size;
|
|
|
|
#else
|
Fix erroneous Valgrind markings in AllocSetRealloc.
If asked to decrease the size of a large (>8K) palloc chunk,
AllocSetRealloc could improperly change the Valgrind state of memory
beyond the new end of the chunk: it would mark data UNDEFINED as far
as the old end of the chunk after having done the realloc(3) call,
thus tromping on the state of memory that no longer belongs to it.
One would normally expect that memory to now be marked NOACCESS,
so that this mislabeling might prevent detection of later errors.
If realloc() had chosen to move the chunk someplace else (unlikely,
but well within its rights) we could also mismark perfectly-valid
DEFINED data as UNDEFINED, causing false-positive valgrind reports
later. Also, any malloc bookkeeping data placed within this area
might now be wrongly marked, causing additional problems.
Fix by replacing relevant uses of "oldsize" with "Min(size, oldsize)".
It's sufficient to mark as far as "size" when that's smaller, because
whatever remains in the new chunk size will be marked NOACCESS below,
and we expect realloc() to have taken care of marking the memory
beyond the new official end of the chunk.
While we're here, also rename the function's "oldsize" variable
to "oldchksize" to more clearly explain what it actually holds,
namely the distance to the end of the chunk (that is, requested size
plus trailing padding). This is more consistent with the use of
"size" and "chksize" to hold the new requested size and chunk size.
Add a new variable "oldsize" in the one stanza where we're actually
talking about the old requested size.
Oversight in commit c477f3e44. Back-patch to all supported branches,
as that was, just in case anybody wants to do valgrind testing on back
branches.
Karina Litskevich
Discussion: https://postgr.es/m/CACiT8iaAET-fmzjjZLjaJC4zwSJmrFyL7LAdHwaYyjjQOQ4hcg@mail.gmail.com
2023-02-22 00:47:46 +01:00
|
|
|
oldsize = oldchksize;
|
2013-06-27 02:00:08 +02:00
|
|
|
VALGRIND_MAKE_MEM_DEFINED(pointer, oldsize);
|
|
|
|
#endif
|
|
|
|
|
1999-08-24 22:11:19 +02:00
|
|
|
/* transfer existing data (certain to fit) */
|
|
|
|
memcpy(newPointer, pointer, oldsize);
|
1997-09-07 07:04:48 +02:00
|
|
|
|
1999-08-24 22:11:19 +02:00
|
|
|
/* free old chunk */
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
AllocSetFree(pointer);
|
1999-08-24 22:11:19 +02:00
|
|
|
|
|
|
|
return newPointer;
|
|
|
|
}
|
1996-07-09 08:22:35 +02:00
|
|
|
}
|
|
|
|
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
/*
|
|
|
|
* AllocSetGetChunkContext
|
|
|
|
* Return the MemoryContext that 'pointer' belongs to.
|
|
|
|
*/
|
|
|
|
MemoryContext
|
|
|
|
AllocSetGetChunkContext(void *pointer)
|
|
|
|
{
|
|
|
|
MemoryChunk *chunk = PointerGetMemoryChunk(pointer);
|
|
|
|
AllocBlock block;
|
|
|
|
AllocSet set;
|
|
|
|
|
2023-04-15 01:59:52 +02:00
|
|
|
/* Allow access to the chunk header. */
|
|
|
|
VALGRIND_MAKE_MEM_DEFINED(chunk, ALLOC_CHUNKHDRSZ);
|
|
|
|
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
if (MemoryChunkIsExternal(chunk))
|
|
|
|
block = ExternalChunkGetBlock(chunk);
|
|
|
|
else
|
|
|
|
block = (AllocBlock) MemoryChunkGetBlock(chunk);
|
|
|
|
|
2023-04-15 01:59:52 +02:00
|
|
|
/* Disallow access to the chunk header. */
|
|
|
|
VALGRIND_MAKE_MEM_NOACCESS(chunk, ALLOC_CHUNKHDRSZ);
|
|
|
|
|
2022-10-28 09:19:06 +02:00
|
|
|
Assert(AllocBlockIsValid(block));
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
set = block->aset;
|
|
|
|
|
|
|
|
return &set->header;
|
|
|
|
}
|
|
|
|
|
2002-08-12 02:36:12 +02:00
|
|
|
/*
|
|
|
|
* AllocSetGetChunkSpace
|
|
|
|
* Given a currently-allocated chunk, determine the total space
|
|
|
|
* it occupies (including all memory-allocation overhead).
|
|
|
|
*/
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
Size
|
|
|
|
AllocSetGetChunkSpace(void *pointer)
|
2002-08-12 02:36:12 +02:00
|
|
|
{
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
MemoryChunk *chunk = PointerGetMemoryChunk(pointer);
|
Harden memory context allocators against bogus chunk pointers.
Before commit c6e0fe1f2, functions such as AllocSetFree could pretty
safely presume that they were given a valid chunk pointer for their
own type of context, because the indirect call through a memory
context object and method struct would be very unlikely to work
otherwise. But now, if pfree() is mistakenly invoked on a pointer
to garbage, we have three chances in eight of ending up at one of
these functions. That means we need to take extra measures to
verify that we are looking at what we're supposed to be looking at,
especially in debug builds.
Hence, add code to verify that the chunk's back-link to a block header
leads to a memory context object that satisfies the right sort of
IsA() check. This is still a bit weaker than what we did before,
but for the moment assume that an IsA() check is sufficient.
As a compromise between speed and safety, implement these checks
as Asserts when dealing with small chunks but plain test-and-elogs
when dealing with large (external) chunks. The latter case should
not be too performance-critical, but the former case probably is.
In slab.c, all chunks are small; but nonetheless use a plain test
in SlabRealloc, because that is certainly not performance-critical,
indeed we should be suspicious that it's being called in error.
In aset.c, additionally add some assertions that the "value" field
of the chunk header is within the small range allowed for freelist
indexes. Without that, we might find ourselves trying to wipe
most of memory when CLOBBER_FREED_MEMORY is enabled, or scribbling
on a "freelist header" that's far away from the context object.
Eventually, field experience might show us that it's smarter for
these tests to be active always, but for now we'll try to get
away with just having them as assertions.
While at it, also be more uniform about asserting that context
objects passed as parameters are of the type we expect. Some
places missed that altogether, and slab.c was for no very good
reason doing it differently from the other allocators.
Discussion: https://postgr.es/m/3578387.1665244345@sss.pgh.pa.us
2022-10-11 00:45:34 +02:00
|
|
|
int fidx;
|
2002-08-12 02:36:12 +02:00
|
|
|
|
2023-04-15 01:59:52 +02:00
|
|
|
/* Allow access to the chunk header. */
|
|
|
|
VALGRIND_MAKE_MEM_DEFINED(chunk, ALLOC_CHUNKHDRSZ);
|
|
|
|
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
if (MemoryChunkIsExternal(chunk))
|
|
|
|
{
|
|
|
|
AllocBlock block = ExternalChunkGetBlock(chunk);
|
|
|
|
|
2023-04-15 01:59:52 +02:00
|
|
|
/* Disallow access to the chunk header. */
|
|
|
|
VALGRIND_MAKE_MEM_NOACCESS(chunk, ALLOC_CHUNKHDRSZ);
|
|
|
|
|
2022-10-28 09:19:06 +02:00
|
|
|
Assert(AllocBlockIsValid(block));
|
2023-04-15 01:59:52 +02:00
|
|
|
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
return block->endptr - (char *) chunk;
|
|
|
|
}
|
|
|
|
|
Harden memory context allocators against bogus chunk pointers.
Before commit c6e0fe1f2, functions such as AllocSetFree could pretty
safely presume that they were given a valid chunk pointer for their
own type of context, because the indirect call through a memory
context object and method struct would be very unlikely to work
otherwise. But now, if pfree() is mistakenly invoked on a pointer
to garbage, we have three chances in eight of ending up at one of
these functions. That means we need to take extra measures to
verify that we are looking at what we're supposed to be looking at,
especially in debug builds.
Hence, add code to verify that the chunk's back-link to a block header
leads to a memory context object that satisfies the right sort of
IsA() check. This is still a bit weaker than what we did before,
but for the moment assume that an IsA() check is sufficient.
As a compromise between speed and safety, implement these checks
as Asserts when dealing with small chunks but plain test-and-elogs
when dealing with large (external) chunks. The latter case should
not be too performance-critical, but the former case probably is.
In slab.c, all chunks are small; but nonetheless use a plain test
in SlabRealloc, because that is certainly not performance-critical,
indeed we should be suspicious that it's being called in error.
In aset.c, additionally add some assertions that the "value" field
of the chunk header is within the small range allowed for freelist
indexes. Without that, we might find ourselves trying to wipe
most of memory when CLOBBER_FREED_MEMORY is enabled, or scribbling
on a "freelist header" that's far away from the context object.
Eventually, field experience might show us that it's smarter for
these tests to be active always, but for now we'll try to get
away with just having them as assertions.
While at it, also be more uniform about asserting that context
objects passed as parameters are of the type we expect. Some
places missed that altogether, and slab.c was for no very good
reason doing it differently from the other allocators.
Discussion: https://postgr.es/m/3578387.1665244345@sss.pgh.pa.us
2022-10-11 00:45:34 +02:00
|
|
|
fidx = MemoryChunkGetValue(chunk);
|
|
|
|
Assert(FreeListIdxIsValid(fidx));
|
2023-04-15 01:59:52 +02:00
|
|
|
|
|
|
|
/* Disallow access to the chunk header. */
|
|
|
|
VALGRIND_MAKE_MEM_NOACCESS(chunk, ALLOC_CHUNKHDRSZ);
|
|
|
|
|
Harden memory context allocators against bogus chunk pointers.
Before commit c6e0fe1f2, functions such as AllocSetFree could pretty
safely presume that they were given a valid chunk pointer for their
own type of context, because the indirect call through a memory
context object and method struct would be very unlikely to work
otherwise. But now, if pfree() is mistakenly invoked on a pointer
to garbage, we have three chances in eight of ending up at one of
these functions. That means we need to take extra measures to
verify that we are looking at what we're supposed to be looking at,
especially in debug builds.
Hence, add code to verify that the chunk's back-link to a block header
leads to a memory context object that satisfies the right sort of
IsA() check. This is still a bit weaker than what we did before,
but for the moment assume that an IsA() check is sufficient.
As a compromise between speed and safety, implement these checks
as Asserts when dealing with small chunks but plain test-and-elogs
when dealing with large (external) chunks. The latter case should
not be too performance-critical, but the former case probably is.
In slab.c, all chunks are small; but nonetheless use a plain test
in SlabRealloc, because that is certainly not performance-critical,
indeed we should be suspicious that it's being called in error.
In aset.c, additionally add some assertions that the "value" field
of the chunk header is within the small range allowed for freelist
indexes. Without that, we might find ourselves trying to wipe
most of memory when CLOBBER_FREED_MEMORY is enabled, or scribbling
on a "freelist header" that's far away from the context object.
Eventually, field experience might show us that it's smarter for
these tests to be active always, but for now we'll try to get
away with just having them as assertions.
While at it, also be more uniform about asserting that context
objects passed as parameters are of the type we expect. Some
places missed that altogether, and slab.c was for no very good
reason doing it differently from the other allocators.
Discussion: https://postgr.es/m/3578387.1665244345@sss.pgh.pa.us
2022-10-11 00:45:34 +02:00
|
|
|
return GetChunkSizeFromFreeListIdx(fidx) + ALLOC_CHUNKHDRSZ;
|
2002-08-12 02:36:12 +02:00
|
|
|
}
|
|
|
|
|
2004-09-16 22:17:49 +02:00
|
|
|
/*
|
|
|
|
* AllocSetIsEmpty
|
|
|
|
* Is an allocset empty of any allocated space?
|
|
|
|
*/
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
bool
|
2004-09-16 22:17:49 +02:00
|
|
|
AllocSetIsEmpty(MemoryContext context)
|
|
|
|
{
|
2022-10-28 09:19:06 +02:00
|
|
|
Assert(AllocSetIsValid(context));
|
Harden memory context allocators against bogus chunk pointers.
Before commit c6e0fe1f2, functions such as AllocSetFree could pretty
safely presume that they were given a valid chunk pointer for their
own type of context, because the indirect call through a memory
context object and method struct would be very unlikely to work
otherwise. But now, if pfree() is mistakenly invoked on a pointer
to garbage, we have three chances in eight of ending up at one of
these functions. That means we need to take extra measures to
verify that we are looking at what we're supposed to be looking at,
especially in debug builds.
Hence, add code to verify that the chunk's back-link to a block header
leads to a memory context object that satisfies the right sort of
IsA() check. This is still a bit weaker than what we did before,
but for the moment assume that an IsA() check is sufficient.
As a compromise between speed and safety, implement these checks
as Asserts when dealing with small chunks but plain test-and-elogs
when dealing with large (external) chunks. The latter case should
not be too performance-critical, but the former case probably is.
In slab.c, all chunks are small; but nonetheless use a plain test
in SlabRealloc, because that is certainly not performance-critical,
indeed we should be suspicious that it's being called in error.
In aset.c, additionally add some assertions that the "value" field
of the chunk header is within the small range allowed for freelist
indexes. Without that, we might find ourselves trying to wipe
most of memory when CLOBBER_FREED_MEMORY is enabled, or scribbling
on a "freelist header" that's far away from the context object.
Eventually, field experience might show us that it's smarter for
these tests to be active always, but for now we'll try to get
away with just having them as assertions.
While at it, also be more uniform about asserting that context
objects passed as parameters are of the type we expect. Some
places missed that altogether, and slab.c was for no very good
reason doing it differently from the other allocators.
Discussion: https://postgr.es/m/3578387.1665244345@sss.pgh.pa.us
2022-10-11 00:45:34 +02:00
|
|
|
|
2004-09-16 22:17:49 +02:00
|
|
|
/*
|
2005-09-01 20:15:42 +02:00
|
|
|
* For now, we say "empty" only if the context is new or just reset. We
|
|
|
|
* could examine the freelists to determine if all space has been freed,
|
|
|
|
* but it's not really worth the trouble for present uses of this
|
|
|
|
* functionality.
|
2004-09-16 22:17:49 +02:00
|
|
|
*/
|
2011-05-21 20:47:19 +02:00
|
|
|
if (context->isReset)
|
2004-09-16 22:17:49 +02:00
|
|
|
return true;
|
|
|
|
return false;
|
|
|
|
}
|
|
|
|
|
2000-05-21 04:23:30 +02:00
|
|
|
/*
|
|
|
|
* AllocSetStats
|
2015-08-25 19:09:48 +02:00
|
|
|
* Compute stats about memory consumption of an allocset.
|
|
|
|
*
|
Allow memory contexts to have both fixed and variable ident strings.
Originally, we treated memory context names as potentially variable in
all cases, and therefore always copied them into the context header.
Commit 9fa6f00b1 rethought this a little bit and invented a distinction
between fixed and variable names, skipping the copy step for the former.
But we can make things both simpler and more useful by instead allowing
there to be two parts to a context's identification, a fixed "name" and
an optional, variable "ident". The name supplied in the context create
call is now required to be a compile-time-constant string in all cases,
as it is never copied but just pointed to. The "ident" string, if
wanted, is supplied later. This is needed because typically we want
the ident to be stored inside the context so that it's cleaned up
automatically on context deletion; that means it has to be copied into
the context before we can set the pointer.
The cost of this approach is basically just an additional pointer field
in struct MemoryContextData, which isn't much overhead, and is bought
back entirely in the AllocSet case by not needing a headerSize field
anymore, since we no longer have to cope with variable header length.
In addition, we can simplify the internal interfaces for memory context
creation still further, saving a few cycles there. And it's no longer
true that a custom identifier disqualifies a context from participating
in aset.c's freelist scheme, so possibly there's some win on that end.
All the places that were using non-compile-time-constant context names
are adjusted to put the variable info into the "ident" instead. This
allows more effective identification of those contexts in many cases;
for example, subsidary contexts of relcache entries are now identified
by both type (e.g. "index info") and relname, where before you got only
one or the other. Contexts associated with PL function cache entries
are now identified more fully and uniformly, too.
I also arranged for plancache contexts to use the query source string
as their identifier. This is basically free for CachedPlanSources, as
they contained a copy of that string already. We pay an extra pstrdup
to do it for CachedPlans. That could perhaps be avoided, but it would
make things more fragile (since the CachedPlanSource is sometimes
destroyed first). I suspect future improvements in error reporting will
require CachedPlans to have a copy of that string anyway, so it's not
clear that it's worth moving mountains to avoid it now.
This also changes the APIs for context statistics routines so that the
context-specific routines no longer assume that output goes straight
to stderr, nor do they know all details of the output format. This
is useful immediately to reduce code duplication, and it also allows
for external code to do something with stats output that's different
from printing to stderr.
The reason for pushing this now rather than waiting for v12 is that
it rethinks some of the API changes made by commit 9fa6f00b1. Seems
better for extension authors to endure just one round of API changes
not two.
Discussion: https://postgr.es/m/CAB=Je-FdtmFZ9y9REHD7VsSrnCkiBhsA4mdsLKSPauwXtQBeNA@mail.gmail.com
2018-03-27 22:46:47 +02:00
|
|
|
* printfunc: if not NULL, pass a human-readable stats string to this.
|
|
|
|
* passthru: pass this pointer through to printfunc.
|
|
|
|
* totals: if not NULL, add stats about this context into *totals.
|
Add function to log the memory contexts of specified backend process.
Commit 3e98c0bafb added pg_backend_memory_contexts view to display
the memory contexts of the backend process. However its target process
is limited to the backend that is accessing to the view. So this is
not so convenient when investigating the local memory bloat of other
backend process. To improve this situation, this commit adds
pg_log_backend_memory_contexts() function that requests to log
the memory contexts of the specified backend process.
This information can be also collected by calling
MemoryContextStats(TopMemoryContext) via a debugger. But
this technique cannot be used in some environments because no debugger
is available there. So, pg_log_backend_memory_contexts() allows us to
see the memory contexts of specified backend more easily.
Only superusers are allowed to request to log the memory contexts
because allowing any users to issue this request at an unbounded rate
would cause lots of log messages and which can lead to denial of service.
On receipt of the request, at the next CHECK_FOR_INTERRUPTS(),
the target backend logs its memory contexts at LOG_SERVER_ONLY level,
so that these memory contexts will appear in the server log but not
be sent to the client. It logs one message per memory context.
Because if it buffers all memory contexts into StringInfo to log them
as one message, which may require the buffer to be enlarged very much
and lead to OOM error since there can be a large number of memory
contexts in a backend.
When a backend process is consuming huge memory, logging all its
memory contexts might overrun available disk space. To prevent this,
now this patch limits the number of child contexts to log per parent
to 100. As with MemoryContextStats(), it supposes that practical cases
where the log gets long will typically be huge numbers of siblings
under the same parent context; while the additional debugging value
from seeing details about individual siblings beyond 100 will not be large.
There was another proposed patch to add the function to return
the memory contexts of specified backend as the result sets,
instead of logging them, in the discussion. However that patch is
not included in this commit because it had several issues to address.
Thanks to Tatsuhito Kasahara, Andres Freund, Tom Lane, Tomas Vondra,
Michael Paquier, Kyotaro Horiguchi and Zhihong Yu for the discussion.
Bump catalog version.
Author: Atsushi Torikoshi
Reviewed-by: Kyotaro Horiguchi, Zhihong Yu, Fujii Masao
Discussion: https://postgr.es/m/0271f440ac77f2a4180e0e56ebd944d1@oss.nttdata.com
2021-04-06 06:44:15 +02:00
|
|
|
* print_to_stderr: print stats to stderr if true, elog otherwise.
|
2000-05-21 04:23:30 +02:00
|
|
|
*/
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
void
|
Allow memory contexts to have both fixed and variable ident strings.
Originally, we treated memory context names as potentially variable in
all cases, and therefore always copied them into the context header.
Commit 9fa6f00b1 rethought this a little bit and invented a distinction
between fixed and variable names, skipping the copy step for the former.
But we can make things both simpler and more useful by instead allowing
there to be two parts to a context's identification, a fixed "name" and
an optional, variable "ident". The name supplied in the context create
call is now required to be a compile-time-constant string in all cases,
as it is never copied but just pointed to. The "ident" string, if
wanted, is supplied later. This is needed because typically we want
the ident to be stored inside the context so that it's cleaned up
automatically on context deletion; that means it has to be copied into
the context before we can set the pointer.
The cost of this approach is basically just an additional pointer field
in struct MemoryContextData, which isn't much overhead, and is bought
back entirely in the AllocSet case by not needing a headerSize field
anymore, since we no longer have to cope with variable header length.
In addition, we can simplify the internal interfaces for memory context
creation still further, saving a few cycles there. And it's no longer
true that a custom identifier disqualifies a context from participating
in aset.c's freelist scheme, so possibly there's some win on that end.
All the places that were using non-compile-time-constant context names
are adjusted to put the variable info into the "ident" instead. This
allows more effective identification of those contexts in many cases;
for example, subsidary contexts of relcache entries are now identified
by both type (e.g. "index info") and relname, where before you got only
one or the other. Contexts associated with PL function cache entries
are now identified more fully and uniformly, too.
I also arranged for plancache contexts to use the query source string
as their identifier. This is basically free for CachedPlanSources, as
they contained a copy of that string already. We pay an extra pstrdup
to do it for CachedPlans. That could perhaps be avoided, but it would
make things more fragile (since the CachedPlanSource is sometimes
destroyed first). I suspect future improvements in error reporting will
require CachedPlans to have a copy of that string anyway, so it's not
clear that it's worth moving mountains to avoid it now.
This also changes the APIs for context statistics routines so that the
context-specific routines no longer assume that output goes straight
to stderr, nor do they know all details of the output format. This
is useful immediately to reduce code duplication, and it also allows
for external code to do something with stats output that's different
from printing to stderr.
The reason for pushing this now rather than waiting for v12 is that
it rethinks some of the API changes made by commit 9fa6f00b1. Seems
better for extension authors to endure just one round of API changes
not two.
Discussion: https://postgr.es/m/CAB=Je-FdtmFZ9y9REHD7VsSrnCkiBhsA4mdsLKSPauwXtQBeNA@mail.gmail.com
2018-03-27 22:46:47 +02:00
|
|
|
AllocSetStats(MemoryContext context,
|
|
|
|
MemoryStatsPrintFunc printfunc, void *passthru,
|
Add function to log the memory contexts of specified backend process.
Commit 3e98c0bafb added pg_backend_memory_contexts view to display
the memory contexts of the backend process. However its target process
is limited to the backend that is accessing to the view. So this is
not so convenient when investigating the local memory bloat of other
backend process. To improve this situation, this commit adds
pg_log_backend_memory_contexts() function that requests to log
the memory contexts of the specified backend process.
This information can be also collected by calling
MemoryContextStats(TopMemoryContext) via a debugger. But
this technique cannot be used in some environments because no debugger
is available there. So, pg_log_backend_memory_contexts() allows us to
see the memory contexts of specified backend more easily.
Only superusers are allowed to request to log the memory contexts
because allowing any users to issue this request at an unbounded rate
would cause lots of log messages and which can lead to denial of service.
On receipt of the request, at the next CHECK_FOR_INTERRUPTS(),
the target backend logs its memory contexts at LOG_SERVER_ONLY level,
so that these memory contexts will appear in the server log but not
be sent to the client. It logs one message per memory context.
Because if it buffers all memory contexts into StringInfo to log them
as one message, which may require the buffer to be enlarged very much
and lead to OOM error since there can be a large number of memory
contexts in a backend.
When a backend process is consuming huge memory, logging all its
memory contexts might overrun available disk space. To prevent this,
now this patch limits the number of child contexts to log per parent
to 100. As with MemoryContextStats(), it supposes that practical cases
where the log gets long will typically be huge numbers of siblings
under the same parent context; while the additional debugging value
from seeing details about individual siblings beyond 100 will not be large.
There was another proposed patch to add the function to return
the memory contexts of specified backend as the result sets,
instead of logging them, in the discussion. However that patch is
not included in this commit because it had several issues to address.
Thanks to Tatsuhito Kasahara, Andres Freund, Tom Lane, Tomas Vondra,
Michael Paquier, Kyotaro Horiguchi and Zhihong Yu for the discussion.
Bump catalog version.
Author: Atsushi Torikoshi
Reviewed-by: Kyotaro Horiguchi, Zhihong Yu, Fujii Masao
Discussion: https://postgr.es/m/0271f440ac77f2a4180e0e56ebd944d1@oss.nttdata.com
2021-04-06 06:44:15 +02:00
|
|
|
MemoryContextCounters *totals, bool print_to_stderr)
|
2000-05-21 04:23:30 +02:00
|
|
|
{
|
2000-06-28 05:33:33 +02:00
|
|
|
AllocSet set = (AllocSet) context;
|
2014-01-23 23:18:23 +01:00
|
|
|
Size nblocks = 0;
|
2015-08-25 19:09:48 +02:00
|
|
|
Size freechunks = 0;
|
Rethink MemoryContext creation to improve performance.
This patch makes a number of interrelated changes to reduce the overhead
involved in creating/deleting memory contexts. The key ideas are:
* Include the AllocSetContext header of an aset.c context in its first
malloc request, rather than allocating it separately in TopMemoryContext.
This means that we now always create an initial or "keeper" block in an
aset, even if it never receives any allocation requests.
* Create freelists in which we can save and recycle recently-destroyed
asets (this idea is due to Robert Haas).
* In the common case where the name of a context is a constant string,
just store a pointer to it in the context header, rather than copying
the string.
The first change eliminates a palloc/pfree cycle per context, and
also avoids bloat in TopMemoryContext, at the price that creating
a context now involves a malloc/free cycle even if the context never
receives any allocations. That would be a loser for some common
usage patterns, but recycling short-lived contexts via the freelist
eliminates that pain.
Avoiding copying constant strings not only saves strlen() and strcpy()
overhead, but is an essential part of the freelist optimization because
it makes the context header size constant. Currently we make no
attempt to use the freelist for contexts with non-constant names.
(Perhaps someday we'll need to think harder about that, but in current
usage, most contexts with custom names are long-lived anyway.)
The freelist management in this initial commit is pretty simplistic,
and we might want to refine it later --- but in common workloads that
will never matter because the freelists will never get full anyway.
To create a context with a non-constant name, one is now required to
call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME
option. AllocSetContextCreate becomes a wrapper macro, and it includes
a test that will complain about non-string-literal context name
parameters on gcc and similar compilers.
An unfortunate side effect of making AllocSetContextCreate a macro is
that one is now *required* to use the size parameter abstraction macros
(ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of
writing out individual size parameters no longer works unless you
switch to AllocSetContextCreateExtended.
Internally to the memory-context-related modules, the context creation
APIs are simplified, removing the rather baroque original design whereby
a context-type module called mcxt.c which then called back into the
context-type module. That saved a bit of code duplication, but not much,
and it prevented context-type modules from exercising control over the
allocation of context headers.
In passing, I converted the test-and-elog validation of aset size
parameters into Asserts to save a few more cycles. The original thought
was that callers might compute size parameters on the fly, but in practice
nobody does that, so it's useless to expend cycles on checking those
numbers in production builds.
Also, mark the memory context method-pointer structs "const",
just for cleanliness.
Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
2017-12-13 19:55:12 +01:00
|
|
|
Size totalspace;
|
2014-01-23 23:18:23 +01:00
|
|
|
Size freespace = 0;
|
2000-05-21 04:23:30 +02:00
|
|
|
AllocBlock block;
|
|
|
|
int fidx;
|
|
|
|
|
2022-10-28 09:19:06 +02:00
|
|
|
Assert(AllocSetIsValid(set));
|
Harden memory context allocators against bogus chunk pointers.
Before commit c6e0fe1f2, functions such as AllocSetFree could pretty
safely presume that they were given a valid chunk pointer for their
own type of context, because the indirect call through a memory
context object and method struct would be very unlikely to work
otherwise. But now, if pfree() is mistakenly invoked on a pointer
to garbage, we have three chances in eight of ending up at one of
these functions. That means we need to take extra measures to
verify that we are looking at what we're supposed to be looking at,
especially in debug builds.
Hence, add code to verify that the chunk's back-link to a block header
leads to a memory context object that satisfies the right sort of
IsA() check. This is still a bit weaker than what we did before,
but for the moment assume that an IsA() check is sufficient.
As a compromise between speed and safety, implement these checks
as Asserts when dealing with small chunks but plain test-and-elogs
when dealing with large (external) chunks. The latter case should
not be too performance-critical, but the former case probably is.
In slab.c, all chunks are small; but nonetheless use a plain test
in SlabRealloc, because that is certainly not performance-critical,
indeed we should be suspicious that it's being called in error.
In aset.c, additionally add some assertions that the "value" field
of the chunk header is within the small range allowed for freelist
indexes. Without that, we might find ourselves trying to wipe
most of memory when CLOBBER_FREED_MEMORY is enabled, or scribbling
on a "freelist header" that's far away from the context object.
Eventually, field experience might show us that it's smarter for
these tests to be active always, but for now we'll try to get
away with just having them as assertions.
While at it, also be more uniform about asserting that context
objects passed as parameters are of the type we expect. Some
places missed that altogether, and slab.c was for no very good
reason doing it differently from the other allocators.
Discussion: https://postgr.es/m/3578387.1665244345@sss.pgh.pa.us
2022-10-11 00:45:34 +02:00
|
|
|
|
Rethink MemoryContext creation to improve performance.
This patch makes a number of interrelated changes to reduce the overhead
involved in creating/deleting memory contexts. The key ideas are:
* Include the AllocSetContext header of an aset.c context in its first
malloc request, rather than allocating it separately in TopMemoryContext.
This means that we now always create an initial or "keeper" block in an
aset, even if it never receives any allocation requests.
* Create freelists in which we can save and recycle recently-destroyed
asets (this idea is due to Robert Haas).
* In the common case where the name of a context is a constant string,
just store a pointer to it in the context header, rather than copying
the string.
The first change eliminates a palloc/pfree cycle per context, and
also avoids bloat in TopMemoryContext, at the price that creating
a context now involves a malloc/free cycle even if the context never
receives any allocations. That would be a loser for some common
usage patterns, but recycling short-lived contexts via the freelist
eliminates that pain.
Avoiding copying constant strings not only saves strlen() and strcpy()
overhead, but is an essential part of the freelist optimization because
it makes the context header size constant. Currently we make no
attempt to use the freelist for contexts with non-constant names.
(Perhaps someday we'll need to think harder about that, but in current
usage, most contexts with custom names are long-lived anyway.)
The freelist management in this initial commit is pretty simplistic,
and we might want to refine it later --- but in common workloads that
will never matter because the freelists will never get full anyway.
To create a context with a non-constant name, one is now required to
call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME
option. AllocSetContextCreate becomes a wrapper macro, and it includes
a test that will complain about non-string-literal context name
parameters on gcc and similar compilers.
An unfortunate side effect of making AllocSetContextCreate a macro is
that one is now *required* to use the size parameter abstraction macros
(ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of
writing out individual size parameters no longer works unless you
switch to AllocSetContextCreateExtended.
Internally to the memory-context-related modules, the context creation
APIs are simplified, removing the rather baroque original design whereby
a context-type module called mcxt.c which then called back into the
context-type module. That saved a bit of code duplication, but not much,
and it prevented context-type modules from exercising control over the
allocation of context headers.
In passing, I converted the test-and-elog validation of aset size
parameters into Asserts to save a few more cycles. The original thought
was that callers might compute size parameters on the fly, but in practice
nobody does that, so it's useless to expend cycles on checking those
numbers in production builds.
Also, mark the memory context method-pointer structs "const",
just for cleanliness.
Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
2017-12-13 19:55:12 +01:00
|
|
|
/* Include context header in totalspace */
|
Allow memory contexts to have both fixed and variable ident strings.
Originally, we treated memory context names as potentially variable in
all cases, and therefore always copied them into the context header.
Commit 9fa6f00b1 rethought this a little bit and invented a distinction
between fixed and variable names, skipping the copy step for the former.
But we can make things both simpler and more useful by instead allowing
there to be two parts to a context's identification, a fixed "name" and
an optional, variable "ident". The name supplied in the context create
call is now required to be a compile-time-constant string in all cases,
as it is never copied but just pointed to. The "ident" string, if
wanted, is supplied later. This is needed because typically we want
the ident to be stored inside the context so that it's cleaned up
automatically on context deletion; that means it has to be copied into
the context before we can set the pointer.
The cost of this approach is basically just an additional pointer field
in struct MemoryContextData, which isn't much overhead, and is bought
back entirely in the AllocSet case by not needing a headerSize field
anymore, since we no longer have to cope with variable header length.
In addition, we can simplify the internal interfaces for memory context
creation still further, saving a few cycles there. And it's no longer
true that a custom identifier disqualifies a context from participating
in aset.c's freelist scheme, so possibly there's some win on that end.
All the places that were using non-compile-time-constant context names
are adjusted to put the variable info into the "ident" instead. This
allows more effective identification of those contexts in many cases;
for example, subsidary contexts of relcache entries are now identified
by both type (e.g. "index info") and relname, where before you got only
one or the other. Contexts associated with PL function cache entries
are now identified more fully and uniformly, too.
I also arranged for plancache contexts to use the query source string
as their identifier. This is basically free for CachedPlanSources, as
they contained a copy of that string already. We pay an extra pstrdup
to do it for CachedPlans. That could perhaps be avoided, but it would
make things more fragile (since the CachedPlanSource is sometimes
destroyed first). I suspect future improvements in error reporting will
require CachedPlans to have a copy of that string anyway, so it's not
clear that it's worth moving mountains to avoid it now.
This also changes the APIs for context statistics routines so that the
context-specific routines no longer assume that output goes straight
to stderr, nor do they know all details of the output format. This
is useful immediately to reduce code duplication, and it also allows
for external code to do something with stats output that's different
from printing to stderr.
The reason for pushing this now rather than waiting for v12 is that
it rethinks some of the API changes made by commit 9fa6f00b1. Seems
better for extension authors to endure just one round of API changes
not two.
Discussion: https://postgr.es/m/CAB=Je-FdtmFZ9y9REHD7VsSrnCkiBhsA4mdsLKSPauwXtQBeNA@mail.gmail.com
2018-03-27 22:46:47 +02:00
|
|
|
totalspace = MAXALIGN(sizeof(AllocSetContext));
|
Rethink MemoryContext creation to improve performance.
This patch makes a number of interrelated changes to reduce the overhead
involved in creating/deleting memory contexts. The key ideas are:
* Include the AllocSetContext header of an aset.c context in its first
malloc request, rather than allocating it separately in TopMemoryContext.
This means that we now always create an initial or "keeper" block in an
aset, even if it never receives any allocation requests.
* Create freelists in which we can save and recycle recently-destroyed
asets (this idea is due to Robert Haas).
* In the common case where the name of a context is a constant string,
just store a pointer to it in the context header, rather than copying
the string.
The first change eliminates a palloc/pfree cycle per context, and
also avoids bloat in TopMemoryContext, at the price that creating
a context now involves a malloc/free cycle even if the context never
receives any allocations. That would be a loser for some common
usage patterns, but recycling short-lived contexts via the freelist
eliminates that pain.
Avoiding copying constant strings not only saves strlen() and strcpy()
overhead, but is an essential part of the freelist optimization because
it makes the context header size constant. Currently we make no
attempt to use the freelist for contexts with non-constant names.
(Perhaps someday we'll need to think harder about that, but in current
usage, most contexts with custom names are long-lived anyway.)
The freelist management in this initial commit is pretty simplistic,
and we might want to refine it later --- but in common workloads that
will never matter because the freelists will never get full anyway.
To create a context with a non-constant name, one is now required to
call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME
option. AllocSetContextCreate becomes a wrapper macro, and it includes
a test that will complain about non-string-literal context name
parameters on gcc and similar compilers.
An unfortunate side effect of making AllocSetContextCreate a macro is
that one is now *required* to use the size parameter abstraction macros
(ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of
writing out individual size parameters no longer works unless you
switch to AllocSetContextCreateExtended.
Internally to the memory-context-related modules, the context creation
APIs are simplified, removing the rather baroque original design whereby
a context-type module called mcxt.c which then called back into the
context-type module. That saved a bit of code duplication, but not much,
and it prevented context-type modules from exercising control over the
allocation of context headers.
In passing, I converted the test-and-elog validation of aset size
parameters into Asserts to save a few more cycles. The original thought
was that callers might compute size parameters on the fly, but in practice
nobody does that, so it's useless to expend cycles on checking those
numbers in production builds.
Also, mark the memory context method-pointer structs "const",
just for cleanliness.
Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
2017-12-13 19:55:12 +01:00
|
|
|
|
2000-05-21 04:23:30 +02:00
|
|
|
for (block = set->blocks; block != NULL; block = block->next)
|
|
|
|
{
|
|
|
|
nblocks++;
|
|
|
|
totalspace += block->endptr - ((char *) block);
|
|
|
|
freespace += block->endptr - block->freeptr;
|
|
|
|
}
|
|
|
|
for (fidx = 0; fidx < ALLOCSET_NUM_FREELISTS; fidx++)
|
|
|
|
{
|
Harden memory context allocators against bogus chunk pointers.
Before commit c6e0fe1f2, functions such as AllocSetFree could pretty
safely presume that they were given a valid chunk pointer for their
own type of context, because the indirect call through a memory
context object and method struct would be very unlikely to work
otherwise. But now, if pfree() is mistakenly invoked on a pointer
to garbage, we have three chances in eight of ending up at one of
these functions. That means we need to take extra measures to
verify that we are looking at what we're supposed to be looking at,
especially in debug builds.
Hence, add code to verify that the chunk's back-link to a block header
leads to a memory context object that satisfies the right sort of
IsA() check. This is still a bit weaker than what we did before,
but for the moment assume that an IsA() check is sufficient.
As a compromise between speed and safety, implement these checks
as Asserts when dealing with small chunks but plain test-and-elogs
when dealing with large (external) chunks. The latter case should
not be too performance-critical, but the former case probably is.
In slab.c, all chunks are small; but nonetheless use a plain test
in SlabRealloc, because that is certainly not performance-critical,
indeed we should be suspicious that it's being called in error.
In aset.c, additionally add some assertions that the "value" field
of the chunk header is within the small range allowed for freelist
indexes. Without that, we might find ourselves trying to wipe
most of memory when CLOBBER_FREED_MEMORY is enabled, or scribbling
on a "freelist header" that's far away from the context object.
Eventually, field experience might show us that it's smarter for
these tests to be active always, but for now we'll try to get
away with just having them as assertions.
While at it, also be more uniform about asserting that context
objects passed as parameters are of the type we expect. Some
places missed that altogether, and slab.c was for no very good
reason doing it differently from the other allocators.
Discussion: https://postgr.es/m/3578387.1665244345@sss.pgh.pa.us
2022-10-11 00:45:34 +02:00
|
|
|
Size chksz = GetChunkSizeFromFreeListIdx(fidx);
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
MemoryChunk *chunk = set->freelist[fidx];
|
2015-08-25 19:09:48 +02:00
|
|
|
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
while (chunk != NULL)
|
2000-05-21 04:23:30 +02:00
|
|
|
{
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
AllocFreeListLink *link = GetFreeListLink(chunk);
|
|
|
|
|
2023-04-15 01:59:52 +02:00
|
|
|
/* Allow access to the chunk header. */
|
|
|
|
VALGRIND_MAKE_MEM_DEFINED(chunk, ALLOC_CHUNKHDRSZ);
|
Harden memory context allocators against bogus chunk pointers.
Before commit c6e0fe1f2, functions such as AllocSetFree could pretty
safely presume that they were given a valid chunk pointer for their
own type of context, because the indirect call through a memory
context object and method struct would be very unlikely to work
otherwise. But now, if pfree() is mistakenly invoked on a pointer
to garbage, we have three chances in eight of ending up at one of
these functions. That means we need to take extra measures to
verify that we are looking at what we're supposed to be looking at,
especially in debug builds.
Hence, add code to verify that the chunk's back-link to a block header
leads to a memory context object that satisfies the right sort of
IsA() check. This is still a bit weaker than what we did before,
but for the moment assume that an IsA() check is sufficient.
As a compromise between speed and safety, implement these checks
as Asserts when dealing with small chunks but plain test-and-elogs
when dealing with large (external) chunks. The latter case should
not be too performance-critical, but the former case probably is.
In slab.c, all chunks are small; but nonetheless use a plain test
in SlabRealloc, because that is certainly not performance-critical,
indeed we should be suspicious that it's being called in error.
In aset.c, additionally add some assertions that the "value" field
of the chunk header is within the small range allowed for freelist
indexes. Without that, we might find ourselves trying to wipe
most of memory when CLOBBER_FREED_MEMORY is enabled, or scribbling
on a "freelist header" that's far away from the context object.
Eventually, field experience might show us that it's smarter for
these tests to be active always, but for now we'll try to get
away with just having them as assertions.
While at it, also be more uniform about asserting that context
objects passed as parameters are of the type we expect. Some
places missed that altogether, and slab.c was for no very good
reason doing it differently from the other allocators.
Discussion: https://postgr.es/m/3578387.1665244345@sss.pgh.pa.us
2022-10-11 00:45:34 +02:00
|
|
|
Assert(MemoryChunkGetValue(chunk) == fidx);
|
2023-04-15 01:59:52 +02:00
|
|
|
VALGRIND_MAKE_MEM_NOACCESS(chunk, ALLOC_CHUNKHDRSZ);
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
|
2015-08-25 19:09:48 +02:00
|
|
|
freechunks++;
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
freespace += chksz + ALLOC_CHUNKHDRSZ;
|
|
|
|
|
|
|
|
VALGRIND_MAKE_MEM_DEFINED(link, sizeof(AllocFreeListLink));
|
|
|
|
chunk = link->next;
|
|
|
|
VALGRIND_MAKE_MEM_NOACCESS(link, sizeof(AllocFreeListLink));
|
2000-05-21 04:23:30 +02:00
|
|
|
}
|
|
|
|
}
|
2007-08-07 08:25:14 +02:00
|
|
|
|
Allow memory contexts to have both fixed and variable ident strings.
Originally, we treated memory context names as potentially variable in
all cases, and therefore always copied them into the context header.
Commit 9fa6f00b1 rethought this a little bit and invented a distinction
between fixed and variable names, skipping the copy step for the former.
But we can make things both simpler and more useful by instead allowing
there to be two parts to a context's identification, a fixed "name" and
an optional, variable "ident". The name supplied in the context create
call is now required to be a compile-time-constant string in all cases,
as it is never copied but just pointed to. The "ident" string, if
wanted, is supplied later. This is needed because typically we want
the ident to be stored inside the context so that it's cleaned up
automatically on context deletion; that means it has to be copied into
the context before we can set the pointer.
The cost of this approach is basically just an additional pointer field
in struct MemoryContextData, which isn't much overhead, and is bought
back entirely in the AllocSet case by not needing a headerSize field
anymore, since we no longer have to cope with variable header length.
In addition, we can simplify the internal interfaces for memory context
creation still further, saving a few cycles there. And it's no longer
true that a custom identifier disqualifies a context from participating
in aset.c's freelist scheme, so possibly there's some win on that end.
All the places that were using non-compile-time-constant context names
are adjusted to put the variable info into the "ident" instead. This
allows more effective identification of those contexts in many cases;
for example, subsidary contexts of relcache entries are now identified
by both type (e.g. "index info") and relname, where before you got only
one or the other. Contexts associated with PL function cache entries
are now identified more fully and uniformly, too.
I also arranged for plancache contexts to use the query source string
as their identifier. This is basically free for CachedPlanSources, as
they contained a copy of that string already. We pay an extra pstrdup
to do it for CachedPlans. That could perhaps be avoided, but it would
make things more fragile (since the CachedPlanSource is sometimes
destroyed first). I suspect future improvements in error reporting will
require CachedPlans to have a copy of that string anyway, so it's not
clear that it's worth moving mountains to avoid it now.
This also changes the APIs for context statistics routines so that the
context-specific routines no longer assume that output goes straight
to stderr, nor do they know all details of the output format. This
is useful immediately to reduce code duplication, and it also allows
for external code to do something with stats output that's different
from printing to stderr.
The reason for pushing this now rather than waiting for v12 is that
it rethinks some of the API changes made by commit 9fa6f00b1. Seems
better for extension authors to endure just one round of API changes
not two.
Discussion: https://postgr.es/m/CAB=Je-FdtmFZ9y9REHD7VsSrnCkiBhsA4mdsLKSPauwXtQBeNA@mail.gmail.com
2018-03-27 22:46:47 +02:00
|
|
|
if (printfunc)
|
2015-08-25 19:09:48 +02:00
|
|
|
{
|
Allow memory contexts to have both fixed and variable ident strings.
Originally, we treated memory context names as potentially variable in
all cases, and therefore always copied them into the context header.
Commit 9fa6f00b1 rethought this a little bit and invented a distinction
between fixed and variable names, skipping the copy step for the former.
But we can make things both simpler and more useful by instead allowing
there to be two parts to a context's identification, a fixed "name" and
an optional, variable "ident". The name supplied in the context create
call is now required to be a compile-time-constant string in all cases,
as it is never copied but just pointed to. The "ident" string, if
wanted, is supplied later. This is needed because typically we want
the ident to be stored inside the context so that it's cleaned up
automatically on context deletion; that means it has to be copied into
the context before we can set the pointer.
The cost of this approach is basically just an additional pointer field
in struct MemoryContextData, which isn't much overhead, and is bought
back entirely in the AllocSet case by not needing a headerSize field
anymore, since we no longer have to cope with variable header length.
In addition, we can simplify the internal interfaces for memory context
creation still further, saving a few cycles there. And it's no longer
true that a custom identifier disqualifies a context from participating
in aset.c's freelist scheme, so possibly there's some win on that end.
All the places that were using non-compile-time-constant context names
are adjusted to put the variable info into the "ident" instead. This
allows more effective identification of those contexts in many cases;
for example, subsidary contexts of relcache entries are now identified
by both type (e.g. "index info") and relname, where before you got only
one or the other. Contexts associated with PL function cache entries
are now identified more fully and uniformly, too.
I also arranged for plancache contexts to use the query source string
as their identifier. This is basically free for CachedPlanSources, as
they contained a copy of that string already. We pay an extra pstrdup
to do it for CachedPlans. That could perhaps be avoided, but it would
make things more fragile (since the CachedPlanSource is sometimes
destroyed first). I suspect future improvements in error reporting will
require CachedPlans to have a copy of that string anyway, so it's not
clear that it's worth moving mountains to avoid it now.
This also changes the APIs for context statistics routines so that the
context-specific routines no longer assume that output goes straight
to stderr, nor do they know all details of the output format. This
is useful immediately to reduce code duplication, and it also allows
for external code to do something with stats output that's different
from printing to stderr.
The reason for pushing this now rather than waiting for v12 is that
it rethinks some of the API changes made by commit 9fa6f00b1. Seems
better for extension authors to endure just one round of API changes
not two.
Discussion: https://postgr.es/m/CAB=Je-FdtmFZ9y9REHD7VsSrnCkiBhsA4mdsLKSPauwXtQBeNA@mail.gmail.com
2018-03-27 22:46:47 +02:00
|
|
|
char stats_string[200];
|
|
|
|
|
|
|
|
snprintf(stats_string, sizeof(stats_string),
|
2021-12-22 08:34:10 +01:00
|
|
|
"%zu total in %zu blocks; %zu free (%zu chunks); %zu used",
|
Allow memory contexts to have both fixed and variable ident strings.
Originally, we treated memory context names as potentially variable in
all cases, and therefore always copied them into the context header.
Commit 9fa6f00b1 rethought this a little bit and invented a distinction
between fixed and variable names, skipping the copy step for the former.
But we can make things both simpler and more useful by instead allowing
there to be two parts to a context's identification, a fixed "name" and
an optional, variable "ident". The name supplied in the context create
call is now required to be a compile-time-constant string in all cases,
as it is never copied but just pointed to. The "ident" string, if
wanted, is supplied later. This is needed because typically we want
the ident to be stored inside the context so that it's cleaned up
automatically on context deletion; that means it has to be copied into
the context before we can set the pointer.
The cost of this approach is basically just an additional pointer field
in struct MemoryContextData, which isn't much overhead, and is bought
back entirely in the AllocSet case by not needing a headerSize field
anymore, since we no longer have to cope with variable header length.
In addition, we can simplify the internal interfaces for memory context
creation still further, saving a few cycles there. And it's no longer
true that a custom identifier disqualifies a context from participating
in aset.c's freelist scheme, so possibly there's some win on that end.
All the places that were using non-compile-time-constant context names
are adjusted to put the variable info into the "ident" instead. This
allows more effective identification of those contexts in many cases;
for example, subsidary contexts of relcache entries are now identified
by both type (e.g. "index info") and relname, where before you got only
one or the other. Contexts associated with PL function cache entries
are now identified more fully and uniformly, too.
I also arranged for plancache contexts to use the query source string
as their identifier. This is basically free for CachedPlanSources, as
they contained a copy of that string already. We pay an extra pstrdup
to do it for CachedPlans. That could perhaps be avoided, but it would
make things more fragile (since the CachedPlanSource is sometimes
destroyed first). I suspect future improvements in error reporting will
require CachedPlans to have a copy of that string anyway, so it's not
clear that it's worth moving mountains to avoid it now.
This also changes the APIs for context statistics routines so that the
context-specific routines no longer assume that output goes straight
to stderr, nor do they know all details of the output format. This
is useful immediately to reduce code duplication, and it also allows
for external code to do something with stats output that's different
from printing to stderr.
The reason for pushing this now rather than waiting for v12 is that
it rethinks some of the API changes made by commit 9fa6f00b1. Seems
better for extension authors to endure just one round of API changes
not two.
Discussion: https://postgr.es/m/CAB=Je-FdtmFZ9y9REHD7VsSrnCkiBhsA4mdsLKSPauwXtQBeNA@mail.gmail.com
2018-03-27 22:46:47 +02:00
|
|
|
totalspace, nblocks, freespace, freechunks,
|
|
|
|
totalspace - freespace);
|
Add function to log the memory contexts of specified backend process.
Commit 3e98c0bafb added pg_backend_memory_contexts view to display
the memory contexts of the backend process. However its target process
is limited to the backend that is accessing to the view. So this is
not so convenient when investigating the local memory bloat of other
backend process. To improve this situation, this commit adds
pg_log_backend_memory_contexts() function that requests to log
the memory contexts of the specified backend process.
This information can be also collected by calling
MemoryContextStats(TopMemoryContext) via a debugger. But
this technique cannot be used in some environments because no debugger
is available there. So, pg_log_backend_memory_contexts() allows us to
see the memory contexts of specified backend more easily.
Only superusers are allowed to request to log the memory contexts
because allowing any users to issue this request at an unbounded rate
would cause lots of log messages and which can lead to denial of service.
On receipt of the request, at the next CHECK_FOR_INTERRUPTS(),
the target backend logs its memory contexts at LOG_SERVER_ONLY level,
so that these memory contexts will appear in the server log but not
be sent to the client. It logs one message per memory context.
Because if it buffers all memory contexts into StringInfo to log them
as one message, which may require the buffer to be enlarged very much
and lead to OOM error since there can be a large number of memory
contexts in a backend.
When a backend process is consuming huge memory, logging all its
memory contexts might overrun available disk space. To prevent this,
now this patch limits the number of child contexts to log per parent
to 100. As with MemoryContextStats(), it supposes that practical cases
where the log gets long will typically be huge numbers of siblings
under the same parent context; while the additional debugging value
from seeing details about individual siblings beyond 100 will not be large.
There was another proposed patch to add the function to return
the memory contexts of specified backend as the result sets,
instead of logging them, in the discussion. However that patch is
not included in this commit because it had several issues to address.
Thanks to Tatsuhito Kasahara, Andres Freund, Tom Lane, Tomas Vondra,
Michael Paquier, Kyotaro Horiguchi and Zhihong Yu for the discussion.
Bump catalog version.
Author: Atsushi Torikoshi
Reviewed-by: Kyotaro Horiguchi, Zhihong Yu, Fujii Masao
Discussion: https://postgr.es/m/0271f440ac77f2a4180e0e56ebd944d1@oss.nttdata.com
2021-04-06 06:44:15 +02:00
|
|
|
printfunc(context, passthru, stats_string, print_to_stderr);
|
2015-08-25 19:09:48 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
if (totals)
|
|
|
|
{
|
|
|
|
totals->nblocks += nblocks;
|
|
|
|
totals->freechunks += freechunks;
|
|
|
|
totals->totalspace += totalspace;
|
|
|
|
totals->freespace += freespace;
|
|
|
|
}
|
2000-05-21 04:23:30 +02:00
|
|
|
}
|
Here is the patch with memory leak checker. This checker allow detect
in-chunk leaks, overwrite-next-chunk leaks and overwrite block-freeptr leaks.
A in-chunk leak --- if something overwrite space after wanted (via palloc()
size, but it is still inside chunk. For example
x = palloc(12); /* create 16b chunk */
memset(x, '#', 13);
this leak is in the current source total invisible, because chunk is 16b and
leak is in the "align space".
For this feature I add data_size to StandardChunk, and all memory which go
from AllocSetAlloc() is marked as 0x7F.
The MemoryContextCheck() is compiled '#ifdef USE_ASSERT_CHECKING'.
I add this checking to 'tcop/postgres.c' and is active after each backend
query, but it is probably not sufficient, because some MemoryContext exist
only during memory processing --- will good if someone who known where
it is needful (Tom:-) add it for others contexts;
A problem in the current source is that we have still some malloc()
allocation that is not needful and this allocation is total invisible for
all context routines. For example Dllist in backend (pretty dirty it is in
catcache where values in Dllist are palloc-ed, but list is malloc-ed).
--- and BTW. this Dllist design stand in the way for query cache :-)
Tom, if you agree I start replace some mallocs.
BTW. --- Tom, have you idea for across transaction presistent allocation for
SQL functions? (like regex - now it is via malloc)
I almost forget. I add one if() to AllocSetAlloc(), for 'size' that are
greater than ALLOC_BIGCHUNK_LIMIT is not needful check AllocSetFreeIndex(),
because 'fidx' is always 'ALLOCSET_NUM_FREELISTS - 1'. It a little brisk up
allocation for very large chunks. Right?
Karel
2000-07-11 16:30:37 +02:00
|
|
|
|
|
|
|
|
2000-12-01 06:16:45 +01:00
|
|
|
#ifdef MEMORY_CONTEXT_CHECKING
|
|
|
|
|
Here is the patch with memory leak checker. This checker allow detect
in-chunk leaks, overwrite-next-chunk leaks and overwrite block-freeptr leaks.
A in-chunk leak --- if something overwrite space after wanted (via palloc()
size, but it is still inside chunk. For example
x = palloc(12); /* create 16b chunk */
memset(x, '#', 13);
this leak is in the current source total invisible, because chunk is 16b and
leak is in the "align space".
For this feature I add data_size to StandardChunk, and all memory which go
from AllocSetAlloc() is marked as 0x7F.
The MemoryContextCheck() is compiled '#ifdef USE_ASSERT_CHECKING'.
I add this checking to 'tcop/postgres.c' and is active after each backend
query, but it is probably not sufficient, because some MemoryContext exist
only during memory processing --- will good if someone who known where
it is needful (Tom:-) add it for others contexts;
A problem in the current source is that we have still some malloc()
allocation that is not needful and this allocation is total invisible for
all context routines. For example Dllist in backend (pretty dirty it is in
catcache where values in Dllist are palloc-ed, but list is malloc-ed).
--- and BTW. this Dllist design stand in the way for query cache :-)
Tom, if you agree I start replace some mallocs.
BTW. --- Tom, have you idea for across transaction presistent allocation for
SQL functions? (like regex - now it is via malloc)
I almost forget. I add one if() to AllocSetAlloc(), for 'size' that are
greater than ALLOC_BIGCHUNK_LIMIT is not needful check AllocSetFreeIndex(),
because 'fidx' is always 'ALLOCSET_NUM_FREELISTS - 1'. It a little brisk up
allocation for very large chunks. Right?
Karel
2000-07-11 16:30:37 +02:00
|
|
|
/*
|
|
|
|
* AllocSetCheck
|
2000-12-01 06:16:45 +01:00
|
|
|
* Walk through chunks and check consistency of memory.
|
2000-12-06 00:40:36 +01:00
|
|
|
*
|
2002-03-06 07:10:59 +01:00
|
|
|
* NOTE: report errors as WARNING, *not* ERROR or FATAL. Otherwise you'll
|
2000-12-06 00:40:36 +01:00
|
|
|
* find yourself in an infinite loop when trouble occurs, because this
|
|
|
|
* routine will be entered again when elog cleanup tries to release memory!
|
Here is the patch with memory leak checker. This checker allow detect
in-chunk leaks, overwrite-next-chunk leaks and overwrite block-freeptr leaks.
A in-chunk leak --- if something overwrite space after wanted (via palloc()
size, but it is still inside chunk. For example
x = palloc(12); /* create 16b chunk */
memset(x, '#', 13);
this leak is in the current source total invisible, because chunk is 16b and
leak is in the "align space".
For this feature I add data_size to StandardChunk, and all memory which go
from AllocSetAlloc() is marked as 0x7F.
The MemoryContextCheck() is compiled '#ifdef USE_ASSERT_CHECKING'.
I add this checking to 'tcop/postgres.c' and is active after each backend
query, but it is probably not sufficient, because some MemoryContext exist
only during memory processing --- will good if someone who known where
it is needful (Tom:-) add it for others contexts;
A problem in the current source is that we have still some malloc()
allocation that is not needful and this allocation is total invisible for
all context routines. For example Dllist in backend (pretty dirty it is in
catcache where values in Dllist are palloc-ed, but list is malloc-ed).
--- and BTW. this Dllist design stand in the way for query cache :-)
Tom, if you agree I start replace some mallocs.
BTW. --- Tom, have you idea for across transaction presistent allocation for
SQL functions? (like regex - now it is via malloc)
I almost forget. I add one if() to AllocSetAlloc(), for 'size' that are
greater than ALLOC_BIGCHUNK_LIMIT is not needful check AllocSetFreeIndex(),
because 'fidx' is always 'ALLOCSET_NUM_FREELISTS - 1'. It a little brisk up
allocation for very large chunks. Right?
Karel
2000-07-11 16:30:37 +02:00
|
|
|
*/
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
void
|
Here is the patch with memory leak checker. This checker allow detect
in-chunk leaks, overwrite-next-chunk leaks and overwrite block-freeptr leaks.
A in-chunk leak --- if something overwrite space after wanted (via palloc()
size, but it is still inside chunk. For example
x = palloc(12); /* create 16b chunk */
memset(x, '#', 13);
this leak is in the current source total invisible, because chunk is 16b and
leak is in the "align space".
For this feature I add data_size to StandardChunk, and all memory which go
from AllocSetAlloc() is marked as 0x7F.
The MemoryContextCheck() is compiled '#ifdef USE_ASSERT_CHECKING'.
I add this checking to 'tcop/postgres.c' and is active after each backend
query, but it is probably not sufficient, because some MemoryContext exist
only during memory processing --- will good if someone who known where
it is needful (Tom:-) add it for others contexts;
A problem in the current source is that we have still some malloc()
allocation that is not needful and this allocation is total invisible for
all context routines. For example Dllist in backend (pretty dirty it is in
catcache where values in Dllist are palloc-ed, but list is malloc-ed).
--- and BTW. this Dllist design stand in the way for query cache :-)
Tom, if you agree I start replace some mallocs.
BTW. --- Tom, have you idea for across transaction presistent allocation for
SQL functions? (like regex - now it is via malloc)
I almost forget. I add one if() to AllocSetAlloc(), for 'size' that are
greater than ALLOC_BIGCHUNK_LIMIT is not needful check AllocSetFreeIndex(),
because 'fidx' is always 'ALLOCSET_NUM_FREELISTS - 1'. It a little brisk up
allocation for very large chunks. Right?
Karel
2000-07-11 16:30:37 +02:00
|
|
|
AllocSetCheck(MemoryContext context)
|
|
|
|
{
|
|
|
|
AllocSet set = (AllocSet) context;
|
Rethink MemoryContext creation to improve performance.
This patch makes a number of interrelated changes to reduce the overhead
involved in creating/deleting memory contexts. The key ideas are:
* Include the AllocSetContext header of an aset.c context in its first
malloc request, rather than allocating it separately in TopMemoryContext.
This means that we now always create an initial or "keeper" block in an
aset, even if it never receives any allocation requests.
* Create freelists in which we can save and recycle recently-destroyed
asets (this idea is due to Robert Haas).
* In the common case where the name of a context is a constant string,
just store a pointer to it in the context header, rather than copying
the string.
The first change eliminates a palloc/pfree cycle per context, and
also avoids bloat in TopMemoryContext, at the price that creating
a context now involves a malloc/free cycle even if the context never
receives any allocations. That would be a loser for some common
usage patterns, but recycling short-lived contexts via the freelist
eliminates that pain.
Avoiding copying constant strings not only saves strlen() and strcpy()
overhead, but is an essential part of the freelist optimization because
it makes the context header size constant. Currently we make no
attempt to use the freelist for contexts with non-constant names.
(Perhaps someday we'll need to think harder about that, but in current
usage, most contexts with custom names are long-lived anyway.)
The freelist management in this initial commit is pretty simplistic,
and we might want to refine it later --- but in common workloads that
will never matter because the freelists will never get full anyway.
To create a context with a non-constant name, one is now required to
call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME
option. AllocSetContextCreate becomes a wrapper macro, and it includes
a test that will complain about non-string-literal context name
parameters on gcc and similar compilers.
An unfortunate side effect of making AllocSetContextCreate a macro is
that one is now *required* to use the size parameter abstraction macros
(ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of
writing out individual size parameters no longer works unless you
switch to AllocSetContextCreateExtended.
Internally to the memory-context-related modules, the context creation
APIs are simplified, removing the rather baroque original design whereby
a context-type module called mcxt.c which then called back into the
context-type module. That saved a bit of code duplication, but not much,
and it prevented context-type modules from exercising control over the
allocation of context headers.
In passing, I converted the test-and-elog validation of aset size
parameters into Asserts to save a few more cycles. The original thought
was that callers might compute size parameters on the fly, but in practice
nobody does that, so it's useless to expend cycles on checking those
numbers in production builds.
Also, mark the memory context method-pointer structs "const",
just for cleanliness.
Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
2017-12-13 19:55:12 +01:00
|
|
|
const char *name = set->header.name;
|
Use doubly-linked block lists in aset.c to reduce large-chunk overhead.
Large chunks (those too large for any palloc freelist) are managed as
separate blocks. Formerly, realloc'ing or pfree'ing such a chunk required
O(N) time in a context with N blocks, since we had to traipse down the
singly-linked block list to locate the block's predecessor before we could
fix the list links. This can result in O(N^2) runtime in situations where
large numbers of such chunks are manipulated within one context. Cases
like that were not foreseen in the original design of aset.c, and indeed
didn't arise until fairly recently. But such problems can now occur in
reorderbuffer.c and in hash joining, both of which make repeated large
requests without scaling up their request size as they do so, and which
will free their requests in not-necessarily-LIFO order.
To fix, change the block list from singly-linked to doubly-linked.
This adds another 4 or 8 bytes to ALLOC_BLOCKHDRSZ, but that doesn't
seem like unacceptable overhead, since aset.c's blocks are normally
8K or more, and never less than 1K in current practice.
In passing, get rid of some redundant AllocChunkGetPointer() calls in
AllocSetRealloc (the compiler might be smart enough to optimize these
away anyway, but no need to assume that) and improve AllocSetCheck's
checking of block header fields.
Back-patch to 9.4 where reorderbuffer.c appeared. We could take this
further back, but currently there's no evidence that it would be useful.
Discussion: https://postgr.es/m/CAMkU=1x1hvue1XYrZoWk_omG0Ja5nBvTdvgrOeVkkeqs71CV8g@mail.gmail.com
2017-03-08 18:21:12 +01:00
|
|
|
AllocBlock prevblock;
|
2000-12-01 06:16:45 +01:00
|
|
|
AllocBlock block;
|
2019-10-04 16:10:56 +02:00
|
|
|
Size total_allocated = 0;
|
Here is the patch with memory leak checker. This checker allow detect
in-chunk leaks, overwrite-next-chunk leaks and overwrite block-freeptr leaks.
A in-chunk leak --- if something overwrite space after wanted (via palloc()
size, but it is still inside chunk. For example
x = palloc(12); /* create 16b chunk */
memset(x, '#', 13);
this leak is in the current source total invisible, because chunk is 16b and
leak is in the "align space".
For this feature I add data_size to StandardChunk, and all memory which go
from AllocSetAlloc() is marked as 0x7F.
The MemoryContextCheck() is compiled '#ifdef USE_ASSERT_CHECKING'.
I add this checking to 'tcop/postgres.c' and is active after each backend
query, but it is probably not sufficient, because some MemoryContext exist
only during memory processing --- will good if someone who known where
it is needful (Tom:-) add it for others contexts;
A problem in the current source is that we have still some malloc()
allocation that is not needful and this allocation is total invisible for
all context routines. For example Dllist in backend (pretty dirty it is in
catcache where values in Dllist are palloc-ed, but list is malloc-ed).
--- and BTW. this Dllist design stand in the way for query cache :-)
Tom, if you agree I start replace some mallocs.
BTW. --- Tom, have you idea for across transaction presistent allocation for
SQL functions? (like regex - now it is via malloc)
I almost forget. I add one if() to AllocSetAlloc(), for 'size' that are
greater than ALLOC_BIGCHUNK_LIMIT is not needful check AllocSetFreeIndex(),
because 'fidx' is always 'ALLOCSET_NUM_FREELISTS - 1'. It a little brisk up
allocation for very large chunks. Right?
Karel
2000-07-11 16:30:37 +02:00
|
|
|
|
Use doubly-linked block lists in aset.c to reduce large-chunk overhead.
Large chunks (those too large for any palloc freelist) are managed as
separate blocks. Formerly, realloc'ing or pfree'ing such a chunk required
O(N) time in a context with N blocks, since we had to traipse down the
singly-linked block list to locate the block's predecessor before we could
fix the list links. This can result in O(N^2) runtime in situations where
large numbers of such chunks are manipulated within one context. Cases
like that were not foreseen in the original design of aset.c, and indeed
didn't arise until fairly recently. But such problems can now occur in
reorderbuffer.c and in hash joining, both of which make repeated large
requests without scaling up their request size as they do so, and which
will free their requests in not-necessarily-LIFO order.
To fix, change the block list from singly-linked to doubly-linked.
This adds another 4 or 8 bytes to ALLOC_BLOCKHDRSZ, but that doesn't
seem like unacceptable overhead, since aset.c's blocks are normally
8K or more, and never less than 1K in current practice.
In passing, get rid of some redundant AllocChunkGetPointer() calls in
AllocSetRealloc (the compiler might be smart enough to optimize these
away anyway, but no need to assume that) and improve AllocSetCheck's
checking of block header fields.
Back-patch to 9.4 where reorderbuffer.c appeared. We could take this
further back, but currently there's no evidence that it would be useful.
Discussion: https://postgr.es/m/CAMkU=1x1hvue1XYrZoWk_omG0Ja5nBvTdvgrOeVkkeqs71CV8g@mail.gmail.com
2017-03-08 18:21:12 +01:00
|
|
|
for (prevblock = NULL, block = set->blocks;
|
|
|
|
block != NULL;
|
|
|
|
prevblock = block, block = block->next)
|
Here is the patch with memory leak checker. This checker allow detect
in-chunk leaks, overwrite-next-chunk leaks and overwrite block-freeptr leaks.
A in-chunk leak --- if something overwrite space after wanted (via palloc()
size, but it is still inside chunk. For example
x = palloc(12); /* create 16b chunk */
memset(x, '#', 13);
this leak is in the current source total invisible, because chunk is 16b and
leak is in the "align space".
For this feature I add data_size to StandardChunk, and all memory which go
from AllocSetAlloc() is marked as 0x7F.
The MemoryContextCheck() is compiled '#ifdef USE_ASSERT_CHECKING'.
I add this checking to 'tcop/postgres.c' and is active after each backend
query, but it is probably not sufficient, because some MemoryContext exist
only during memory processing --- will good if someone who known where
it is needful (Tom:-) add it for others contexts;
A problem in the current source is that we have still some malloc()
allocation that is not needful and this allocation is total invisible for
all context routines. For example Dllist in backend (pretty dirty it is in
catcache where values in Dllist are palloc-ed, but list is malloc-ed).
--- and BTW. this Dllist design stand in the way for query cache :-)
Tom, if you agree I start replace some mallocs.
BTW. --- Tom, have you idea for across transaction presistent allocation for
SQL functions? (like regex - now it is via malloc)
I almost forget. I add one if() to AllocSetAlloc(), for 'size' that are
greater than ALLOC_BIGCHUNK_LIMIT is not needful check AllocSetFreeIndex(),
because 'fidx' is always 'ALLOCSET_NUM_FREELISTS - 1'. It a little brisk up
allocation for very large chunks. Right?
Karel
2000-07-11 16:30:37 +02:00
|
|
|
{
|
|
|
|
char *bpoz = ((char *) block) + ALLOC_BLOCKHDRSZ;
|
|
|
|
long blk_used = block->freeptr - bpoz;
|
|
|
|
long blk_data = 0;
|
|
|
|
long nchunks = 0;
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
bool has_external_chunk = false;
|
Here is the patch with memory leak checker. This checker allow detect
in-chunk leaks, overwrite-next-chunk leaks and overwrite block-freeptr leaks.
A in-chunk leak --- if something overwrite space after wanted (via palloc()
size, but it is still inside chunk. For example
x = palloc(12); /* create 16b chunk */
memset(x, '#', 13);
this leak is in the current source total invisible, because chunk is 16b and
leak is in the "align space".
For this feature I add data_size to StandardChunk, and all memory which go
from AllocSetAlloc() is marked as 0x7F.
The MemoryContextCheck() is compiled '#ifdef USE_ASSERT_CHECKING'.
I add this checking to 'tcop/postgres.c' and is active after each backend
query, but it is probably not sufficient, because some MemoryContext exist
only during memory processing --- will good if someone who known where
it is needful (Tom:-) add it for others contexts;
A problem in the current source is that we have still some malloc()
allocation that is not needful and this allocation is total invisible for
all context routines. For example Dllist in backend (pretty dirty it is in
catcache where values in Dllist are palloc-ed, but list is malloc-ed).
--- and BTW. this Dllist design stand in the way for query cache :-)
Tom, if you agree I start replace some mallocs.
BTW. --- Tom, have you idea for across transaction presistent allocation for
SQL functions? (like regex - now it is via malloc)
I almost forget. I add one if() to AllocSetAlloc(), for 'size' that are
greater than ALLOC_BIGCHUNK_LIMIT is not needful check AllocSetFreeIndex(),
because 'fidx' is always 'ALLOCSET_NUM_FREELISTS - 1'. It a little brisk up
allocation for very large chunks. Right?
Karel
2000-07-11 16:30:37 +02:00
|
|
|
|
2023-07-17 01:16:56 +02:00
|
|
|
if (IsKeeperBlock(set, block))
|
2019-10-01 03:13:39 +02:00
|
|
|
total_allocated += block->endptr - ((char *) set);
|
|
|
|
else
|
|
|
|
total_allocated += block->endptr - ((char *) block);
|
|
|
|
|
Here is the patch with memory leak checker. This checker allow detect
in-chunk leaks, overwrite-next-chunk leaks and overwrite block-freeptr leaks.
A in-chunk leak --- if something overwrite space after wanted (via palloc()
size, but it is still inside chunk. For example
x = palloc(12); /* create 16b chunk */
memset(x, '#', 13);
this leak is in the current source total invisible, because chunk is 16b and
leak is in the "align space".
For this feature I add data_size to StandardChunk, and all memory which go
from AllocSetAlloc() is marked as 0x7F.
The MemoryContextCheck() is compiled '#ifdef USE_ASSERT_CHECKING'.
I add this checking to 'tcop/postgres.c' and is active after each backend
query, but it is probably not sufficient, because some MemoryContext exist
only during memory processing --- will good if someone who known where
it is needful (Tom:-) add it for others contexts;
A problem in the current source is that we have still some malloc()
allocation that is not needful and this allocation is total invisible for
all context routines. For example Dllist in backend (pretty dirty it is in
catcache where values in Dllist are palloc-ed, but list is malloc-ed).
--- and BTW. this Dllist design stand in the way for query cache :-)
Tom, if you agree I start replace some mallocs.
BTW. --- Tom, have you idea for across transaction presistent allocation for
SQL functions? (like regex - now it is via malloc)
I almost forget. I add one if() to AllocSetAlloc(), for 'size' that are
greater than ALLOC_BIGCHUNK_LIMIT is not needful check AllocSetFreeIndex(),
because 'fidx' is always 'ALLOCSET_NUM_FREELISTS - 1'. It a little brisk up
allocation for very large chunks. Right?
Karel
2000-07-11 16:30:37 +02:00
|
|
|
/*
|
|
|
|
* Empty block - empty can be keeper-block only
|
|
|
|
*/
|
|
|
|
if (!blk_used)
|
|
|
|
{
|
2023-07-17 01:16:56 +02:00
|
|
|
if (!IsKeeperBlock(set, block))
|
2003-07-25 22:18:01 +02:00
|
|
|
elog(WARNING, "problem in alloc set %s: empty block %p",
|
2000-12-01 06:16:45 +01:00
|
|
|
name, block);
|
Here is the patch with memory leak checker. This checker allow detect
in-chunk leaks, overwrite-next-chunk leaks and overwrite block-freeptr leaks.
A in-chunk leak --- if something overwrite space after wanted (via palloc()
size, but it is still inside chunk. For example
x = palloc(12); /* create 16b chunk */
memset(x, '#', 13);
this leak is in the current source total invisible, because chunk is 16b and
leak is in the "align space".
For this feature I add data_size to StandardChunk, and all memory which go
from AllocSetAlloc() is marked as 0x7F.
The MemoryContextCheck() is compiled '#ifdef USE_ASSERT_CHECKING'.
I add this checking to 'tcop/postgres.c' and is active after each backend
query, but it is probably not sufficient, because some MemoryContext exist
only during memory processing --- will good if someone who known where
it is needful (Tom:-) add it for others contexts;
A problem in the current source is that we have still some malloc()
allocation that is not needful and this allocation is total invisible for
all context routines. For example Dllist in backend (pretty dirty it is in
catcache where values in Dllist are palloc-ed, but list is malloc-ed).
--- and BTW. this Dllist design stand in the way for query cache :-)
Tom, if you agree I start replace some mallocs.
BTW. --- Tom, have you idea for across transaction presistent allocation for
SQL functions? (like regex - now it is via malloc)
I almost forget. I add one if() to AllocSetAlloc(), for 'size' that are
greater than ALLOC_BIGCHUNK_LIMIT is not needful check AllocSetFreeIndex(),
because 'fidx' is always 'ALLOCSET_NUM_FREELISTS - 1'. It a little brisk up
allocation for very large chunks. Right?
Karel
2000-07-11 16:30:37 +02:00
|
|
|
}
|
2001-03-22 05:01:46 +01:00
|
|
|
|
Use doubly-linked block lists in aset.c to reduce large-chunk overhead.
Large chunks (those too large for any palloc freelist) are managed as
separate blocks. Formerly, realloc'ing or pfree'ing such a chunk required
O(N) time in a context with N blocks, since we had to traipse down the
singly-linked block list to locate the block's predecessor before we could
fix the list links. This can result in O(N^2) runtime in situations where
large numbers of such chunks are manipulated within one context. Cases
like that were not foreseen in the original design of aset.c, and indeed
didn't arise until fairly recently. But such problems can now occur in
reorderbuffer.c and in hash joining, both of which make repeated large
requests without scaling up their request size as they do so, and which
will free their requests in not-necessarily-LIFO order.
To fix, change the block list from singly-linked to doubly-linked.
This adds another 4 or 8 bytes to ALLOC_BLOCKHDRSZ, but that doesn't
seem like unacceptable overhead, since aset.c's blocks are normally
8K or more, and never less than 1K in current practice.
In passing, get rid of some redundant AllocChunkGetPointer() calls in
AllocSetRealloc (the compiler might be smart enough to optimize these
away anyway, but no need to assume that) and improve AllocSetCheck's
checking of block header fields.
Back-patch to 9.4 where reorderbuffer.c appeared. We could take this
further back, but currently there's no evidence that it would be useful.
Discussion: https://postgr.es/m/CAMkU=1x1hvue1XYrZoWk_omG0Ja5nBvTdvgrOeVkkeqs71CV8g@mail.gmail.com
2017-03-08 18:21:12 +01:00
|
|
|
/*
|
|
|
|
* Check block header fields
|
|
|
|
*/
|
|
|
|
if (block->aset != set ||
|
|
|
|
block->prev != prevblock ||
|
|
|
|
block->freeptr < bpoz ||
|
|
|
|
block->freeptr > block->endptr)
|
|
|
|
elog(WARNING, "problem in alloc set %s: corrupt header in block %p",
|
|
|
|
name, block);
|
|
|
|
|
Here is the patch with memory leak checker. This checker allow detect
in-chunk leaks, overwrite-next-chunk leaks and overwrite block-freeptr leaks.
A in-chunk leak --- if something overwrite space after wanted (via palloc()
size, but it is still inside chunk. For example
x = palloc(12); /* create 16b chunk */
memset(x, '#', 13);
this leak is in the current source total invisible, because chunk is 16b and
leak is in the "align space".
For this feature I add data_size to StandardChunk, and all memory which go
from AllocSetAlloc() is marked as 0x7F.
The MemoryContextCheck() is compiled '#ifdef USE_ASSERT_CHECKING'.
I add this checking to 'tcop/postgres.c' and is active after each backend
query, but it is probably not sufficient, because some MemoryContext exist
only during memory processing --- will good if someone who known where
it is needful (Tom:-) add it for others contexts;
A problem in the current source is that we have still some malloc()
allocation that is not needful and this allocation is total invisible for
all context routines. For example Dllist in backend (pretty dirty it is in
catcache where values in Dllist are palloc-ed, but list is malloc-ed).
--- and BTW. this Dllist design stand in the way for query cache :-)
Tom, if you agree I start replace some mallocs.
BTW. --- Tom, have you idea for across transaction presistent allocation for
SQL functions? (like regex - now it is via malloc)
I almost forget. I add one if() to AllocSetAlloc(), for 'size' that are
greater than ALLOC_BIGCHUNK_LIMIT is not needful check AllocSetFreeIndex(),
because 'fidx' is always 'ALLOCSET_NUM_FREELISTS - 1'. It a little brisk up
allocation for very large chunks. Right?
Karel
2000-07-11 16:30:37 +02:00
|
|
|
/*
|
|
|
|
* Chunk walker
|
|
|
|
*/
|
2000-12-01 06:16:45 +01:00
|
|
|
while (bpoz < block->freeptr)
|
|
|
|
{
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
MemoryChunk *chunk = (MemoryChunk *) bpoz;
|
2000-12-01 06:16:45 +01:00
|
|
|
Size chsize,
|
|
|
|
dsize;
|
2001-03-22 05:01:46 +01:00
|
|
|
|
2023-04-15 01:59:52 +02:00
|
|
|
/* Allow access to the chunk header. */
|
|
|
|
VALGRIND_MAKE_MEM_DEFINED(chunk, ALLOC_CHUNKHDRSZ);
|
2017-11-25 01:28:19 +01:00
|
|
|
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
if (MemoryChunkIsExternal(chunk))
|
|
|
|
{
|
|
|
|
chsize = block->endptr - (char *) MemoryChunkGetPointer(chunk); /* aligned chunk size */
|
|
|
|
has_external_chunk = true;
|
|
|
|
|
|
|
|
/* make sure this chunk consumes the entire block */
|
|
|
|
if (chsize + ALLOC_CHUNKHDRSZ != blk_used)
|
|
|
|
elog(WARNING, "problem in alloc set %s: bad single-chunk %p in block %p",
|
|
|
|
name, chunk, block);
|
|
|
|
}
|
|
|
|
else
|
|
|
|
{
|
Harden memory context allocators against bogus chunk pointers.
Before commit c6e0fe1f2, functions such as AllocSetFree could pretty
safely presume that they were given a valid chunk pointer for their
own type of context, because the indirect call through a memory
context object and method struct would be very unlikely to work
otherwise. But now, if pfree() is mistakenly invoked on a pointer
to garbage, we have three chances in eight of ending up at one of
these functions. That means we need to take extra measures to
verify that we are looking at what we're supposed to be looking at,
especially in debug builds.
Hence, add code to verify that the chunk's back-link to a block header
leads to a memory context object that satisfies the right sort of
IsA() check. This is still a bit weaker than what we did before,
but for the moment assume that an IsA() check is sufficient.
As a compromise between speed and safety, implement these checks
as Asserts when dealing with small chunks but plain test-and-elogs
when dealing with large (external) chunks. The latter case should
not be too performance-critical, but the former case probably is.
In slab.c, all chunks are small; but nonetheless use a plain test
in SlabRealloc, because that is certainly not performance-critical,
indeed we should be suspicious that it's being called in error.
In aset.c, additionally add some assertions that the "value" field
of the chunk header is within the small range allowed for freelist
indexes. Without that, we might find ourselves trying to wipe
most of memory when CLOBBER_FREED_MEMORY is enabled, or scribbling
on a "freelist header" that's far away from the context object.
Eventually, field experience might show us that it's smarter for
these tests to be active always, but for now we'll try to get
away with just having them as assertions.
While at it, also be more uniform about asserting that context
objects passed as parameters are of the type we expect. Some
places missed that altogether, and slab.c was for no very good
reason doing it differently from the other allocators.
Discussion: https://postgr.es/m/3578387.1665244345@sss.pgh.pa.us
2022-10-11 00:45:34 +02:00
|
|
|
int fidx = MemoryChunkGetValue(chunk);
|
|
|
|
|
|
|
|
if (!FreeListIdxIsValid(fidx))
|
|
|
|
elog(WARNING, "problem in alloc set %s: bad chunk size for chunk %p in block %p",
|
|
|
|
name, chunk, block);
|
|
|
|
|
|
|
|
chsize = GetChunkSizeFromFreeListIdx(fidx); /* aligned chunk size */
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Check the stored block offset correctly references this
|
|
|
|
* block.
|
|
|
|
*/
|
|
|
|
if (block != MemoryChunkGetBlock(chunk))
|
|
|
|
elog(WARNING, "problem in alloc set %s: bad block offset for chunk %p in block %p",
|
|
|
|
name, chunk, block);
|
|
|
|
}
|
2000-12-01 06:16:45 +01:00
|
|
|
dsize = chunk->requested_size; /* real data */
|
2001-03-22 05:01:46 +01:00
|
|
|
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
/* an allocated chunk's requested size must be <= the chsize */
|
|
|
|
if (dsize != InvalidAllocSize && dsize > chsize)
|
2003-07-25 22:18:01 +02:00
|
|
|
elog(WARNING, "problem in alloc set %s: req size > alloc size for chunk %p in block %p",
|
2000-12-01 06:16:45 +01:00
|
|
|
name, chunk, block);
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
|
|
|
|
/* chsize must not be smaller than the first freelist's size */
|
Here is the patch with memory leak checker. This checker allow detect
in-chunk leaks, overwrite-next-chunk leaks and overwrite block-freeptr leaks.
A in-chunk leak --- if something overwrite space after wanted (via palloc()
size, but it is still inside chunk. For example
x = palloc(12); /* create 16b chunk */
memset(x, '#', 13);
this leak is in the current source total invisible, because chunk is 16b and
leak is in the "align space".
For this feature I add data_size to StandardChunk, and all memory which go
from AllocSetAlloc() is marked as 0x7F.
The MemoryContextCheck() is compiled '#ifdef USE_ASSERT_CHECKING'.
I add this checking to 'tcop/postgres.c' and is active after each backend
query, but it is probably not sufficient, because some MemoryContext exist
only during memory processing --- will good if someone who known where
it is needful (Tom:-) add it for others contexts;
A problem in the current source is that we have still some malloc()
allocation that is not needful and this allocation is total invisible for
all context routines. For example Dllist in backend (pretty dirty it is in
catcache where values in Dllist are palloc-ed, but list is malloc-ed).
--- and BTW. this Dllist design stand in the way for query cache :-)
Tom, if you agree I start replace some mallocs.
BTW. --- Tom, have you idea for across transaction presistent allocation for
SQL functions? (like regex - now it is via malloc)
I almost forget. I add one if() to AllocSetAlloc(), for 'size' that are
greater than ALLOC_BIGCHUNK_LIMIT is not needful check AllocSetFreeIndex(),
because 'fidx' is always 'ALLOCSET_NUM_FREELISTS - 1'. It a little brisk up
allocation for very large chunks. Right?
Karel
2000-07-11 16:30:37 +02:00
|
|
|
if (chsize < (1 << ALLOC_MINBITS))
|
2014-01-23 23:18:23 +01:00
|
|
|
elog(WARNING, "problem in alloc set %s: bad size %zu for chunk %p in block %p",
|
|
|
|
name, chsize, chunk, block);
|
2001-03-22 05:01:46 +01:00
|
|
|
|
2000-12-01 06:16:45 +01:00
|
|
|
/*
|
2017-11-25 01:28:19 +01:00
|
|
|
* Check for overwrite of padding space in an allocated chunk.
|
2000-12-01 06:16:45 +01:00
|
|
|
*/
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
if (dsize != InvalidAllocSize && dsize < chsize &&
|
2013-06-27 01:56:03 +02:00
|
|
|
!sentinel_ok(chunk, ALLOC_CHUNKHDRSZ + dsize))
|
2003-07-25 22:18:01 +02:00
|
|
|
elog(WARNING, "problem in alloc set %s: detected write past chunk end in block %p, chunk %p",
|
2000-12-01 06:16:45 +01:00
|
|
|
name, block, chunk);
|
2001-03-22 05:01:46 +01:00
|
|
|
|
2023-04-15 01:59:52 +02:00
|
|
|
/* if chunk is allocated, disallow access to the chunk header */
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
if (dsize != InvalidAllocSize)
|
2023-04-15 01:59:52 +02:00
|
|
|
VALGRIND_MAKE_MEM_NOACCESS(chunk, ALLOC_CHUNKHDRSZ);
|
2017-11-25 01:28:19 +01:00
|
|
|
|
Here is the patch with memory leak checker. This checker allow detect
in-chunk leaks, overwrite-next-chunk leaks and overwrite block-freeptr leaks.
A in-chunk leak --- if something overwrite space after wanted (via palloc()
size, but it is still inside chunk. For example
x = palloc(12); /* create 16b chunk */
memset(x, '#', 13);
this leak is in the current source total invisible, because chunk is 16b and
leak is in the "align space".
For this feature I add data_size to StandardChunk, and all memory which go
from AllocSetAlloc() is marked as 0x7F.
The MemoryContextCheck() is compiled '#ifdef USE_ASSERT_CHECKING'.
I add this checking to 'tcop/postgres.c' and is active after each backend
query, but it is probably not sufficient, because some MemoryContext exist
only during memory processing --- will good if someone who known where
it is needful (Tom:-) add it for others contexts;
A problem in the current source is that we have still some malloc()
allocation that is not needful and this allocation is total invisible for
all context routines. For example Dllist in backend (pretty dirty it is in
catcache where values in Dllist are palloc-ed, but list is malloc-ed).
--- and BTW. this Dllist design stand in the way for query cache :-)
Tom, if you agree I start replace some mallocs.
BTW. --- Tom, have you idea for across transaction presistent allocation for
SQL functions? (like regex - now it is via malloc)
I almost forget. I add one if() to AllocSetAlloc(), for 'size' that are
greater than ALLOC_BIGCHUNK_LIMIT is not needful check AllocSetFreeIndex(),
because 'fidx' is always 'ALLOCSET_NUM_FREELISTS - 1'. It a little brisk up
allocation for very large chunks. Right?
Karel
2000-07-11 16:30:37 +02:00
|
|
|
blk_data += chsize;
|
|
|
|
nchunks++;
|
2001-03-22 05:01:46 +01:00
|
|
|
|
2000-12-01 06:16:45 +01:00
|
|
|
bpoz += ALLOC_CHUNKHDRSZ + chsize;
|
|
|
|
}
|
Here is the patch with memory leak checker. This checker allow detect
in-chunk leaks, overwrite-next-chunk leaks and overwrite block-freeptr leaks.
A in-chunk leak --- if something overwrite space after wanted (via palloc()
size, but it is still inside chunk. For example
x = palloc(12); /* create 16b chunk */
memset(x, '#', 13);
this leak is in the current source total invisible, because chunk is 16b and
leak is in the "align space".
For this feature I add data_size to StandardChunk, and all memory which go
from AllocSetAlloc() is marked as 0x7F.
The MemoryContextCheck() is compiled '#ifdef USE_ASSERT_CHECKING'.
I add this checking to 'tcop/postgres.c' and is active after each backend
query, but it is probably not sufficient, because some MemoryContext exist
only during memory processing --- will good if someone who known where
it is needful (Tom:-) add it for others contexts;
A problem in the current source is that we have still some malloc()
allocation that is not needful and this allocation is total invisible for
all context routines. For example Dllist in backend (pretty dirty it is in
catcache where values in Dllist are palloc-ed, but list is malloc-ed).
--- and BTW. this Dllist design stand in the way for query cache :-)
Tom, if you agree I start replace some mallocs.
BTW. --- Tom, have you idea for across transaction presistent allocation for
SQL functions? (like regex - now it is via malloc)
I almost forget. I add one if() to AllocSetAlloc(), for 'size' that are
greater than ALLOC_BIGCHUNK_LIMIT is not needful check AllocSetFreeIndex(),
because 'fidx' is always 'ALLOCSET_NUM_FREELISTS - 1'. It a little brisk up
allocation for very large chunks. Right?
Karel
2000-07-11 16:30:37 +02:00
|
|
|
|
|
|
|
if ((blk_data + (nchunks * ALLOC_CHUNKHDRSZ)) != blk_used)
|
2003-07-25 22:18:01 +02:00
|
|
|
elog(WARNING, "problem in alloc set %s: found inconsistent memory block %p",
|
2000-12-01 06:16:45 +01:00
|
|
|
name, block);
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
|
|
|
|
if (has_external_chunk && nchunks > 1)
|
|
|
|
elog(WARNING, "problem in alloc set %s: external chunk on non-dedicated block %p",
|
|
|
|
name, block);
|
Here is the patch with memory leak checker. This checker allow detect
in-chunk leaks, overwrite-next-chunk leaks and overwrite block-freeptr leaks.
A in-chunk leak --- if something overwrite space after wanted (via palloc()
size, but it is still inside chunk. For example
x = palloc(12); /* create 16b chunk */
memset(x, '#', 13);
this leak is in the current source total invisible, because chunk is 16b and
leak is in the "align space".
For this feature I add data_size to StandardChunk, and all memory which go
from AllocSetAlloc() is marked as 0x7F.
The MemoryContextCheck() is compiled '#ifdef USE_ASSERT_CHECKING'.
I add this checking to 'tcop/postgres.c' and is active after each backend
query, but it is probably not sufficient, because some MemoryContext exist
only during memory processing --- will good if someone who known where
it is needful (Tom:-) add it for others contexts;
A problem in the current source is that we have still some malloc()
allocation that is not needful and this allocation is total invisible for
all context routines. For example Dllist in backend (pretty dirty it is in
catcache where values in Dllist are palloc-ed, but list is malloc-ed).
--- and BTW. this Dllist design stand in the way for query cache :-)
Tom, if you agree I start replace some mallocs.
BTW. --- Tom, have you idea for across transaction presistent allocation for
SQL functions? (like regex - now it is via malloc)
I almost forget. I add one if() to AllocSetAlloc(), for 'size' that are
greater than ALLOC_BIGCHUNK_LIMIT is not needful check AllocSetFreeIndex(),
because 'fidx' is always 'ALLOCSET_NUM_FREELISTS - 1'. It a little brisk up
allocation for very large chunks. Right?
Karel
2000-07-11 16:30:37 +02:00
|
|
|
}
|
2019-10-01 03:13:39 +02:00
|
|
|
|
2020-03-19 19:20:39 +01:00
|
|
|
Assert(total_allocated == context->mem_allocated);
|
Here is the patch with memory leak checker. This checker allow detect
in-chunk leaks, overwrite-next-chunk leaks and overwrite block-freeptr leaks.
A in-chunk leak --- if something overwrite space after wanted (via palloc()
size, but it is still inside chunk. For example
x = palloc(12); /* create 16b chunk */
memset(x, '#', 13);
this leak is in the current source total invisible, because chunk is 16b and
leak is in the "align space".
For this feature I add data_size to StandardChunk, and all memory which go
from AllocSetAlloc() is marked as 0x7F.
The MemoryContextCheck() is compiled '#ifdef USE_ASSERT_CHECKING'.
I add this checking to 'tcop/postgres.c' and is active after each backend
query, but it is probably not sufficient, because some MemoryContext exist
only during memory processing --- will good if someone who known where
it is needful (Tom:-) add it for others contexts;
A problem in the current source is that we have still some malloc()
allocation that is not needful and this allocation is total invisible for
all context routines. For example Dllist in backend (pretty dirty it is in
catcache where values in Dllist are palloc-ed, but list is malloc-ed).
--- and BTW. this Dllist design stand in the way for query cache :-)
Tom, if you agree I start replace some mallocs.
BTW. --- Tom, have you idea for across transaction presistent allocation for
SQL functions? (like regex - now it is via malloc)
I almost forget. I add one if() to AllocSetAlloc(), for 'size' that are
greater than ALLOC_BIGCHUNK_LIMIT is not needful check AllocSetFreeIndex(),
because 'fidx' is always 'ALLOCSET_NUM_FREELISTS - 1'. It a little brisk up
allocation for very large chunks. Right?
Karel
2000-07-11 16:30:37 +02:00
|
|
|
}
|
2001-10-28 07:26:15 +01:00
|
|
|
|
2000-12-01 06:16:45 +01:00
|
|
|
#endif /* MEMORY_CONTEXT_CHECKING */
|