Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
/*-------------------------------------------------------------------------
|
|
|
|
*
|
|
|
|
* slab.c
|
|
|
|
* SLAB allocator definitions.
|
|
|
|
*
|
|
|
|
* SLAB is a MemoryContext implementation designed for cases where large
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
* numbers of equally-sized objects can be allocated and freed efficiently
|
|
|
|
* with minimal memory wastage and fragmentation.
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
*
|
|
|
|
*
|
2023-01-02 21:00:37 +01:00
|
|
|
* Portions Copyright (c) 2017-2023, PostgreSQL Global Development Group
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
*
|
|
|
|
* IDENTIFICATION
|
|
|
|
* src/backend/utils/mmgr/slab.c
|
|
|
|
*
|
|
|
|
*
|
|
|
|
* NOTE:
|
|
|
|
* The constant allocation size allows significant simplification and various
|
|
|
|
* optimizations over more general purpose allocators. The blocks are carved
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
* into chunks of exactly the right size, wasting only the space required to
|
|
|
|
* MAXALIGN the allocated chunks.
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
*
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
* Slab can also help reduce memory fragmentation in cases where longer-lived
|
|
|
|
* chunks remain stored on blocks while most of the other chunks have already
|
|
|
|
* been pfree'd. We give priority to putting new allocations into the
|
|
|
|
* "fullest" block. This help avoid having too many sparsely used blocks
|
|
|
|
* around and allows blocks to more easily become completely unused which
|
|
|
|
* allows them to be eventually free'd.
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
*
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
* We identify the "fullest" block to put new allocations on by using a block
|
|
|
|
* from the lowest populated element of the context's "blocklist" array.
|
|
|
|
* This is an array of dlists containing blocks which we partition by the
|
|
|
|
* number of free chunks which block has. Blocks with fewer free chunks are
|
|
|
|
* stored in a lower indexed dlist array slot. Full blocks go on the 0th
|
|
|
|
* element of the blocklist array. So that we don't have to have too many
|
|
|
|
* elements in the array, each dlist in the array is responsible for a range
|
|
|
|
* of free chunks. When a chunk is palloc'd or pfree'd we may need to move
|
|
|
|
* the block onto another dlist if the number of free chunks crosses the
|
|
|
|
* range boundary that the current list is responsible for. Having just a
|
|
|
|
* few blocklist elements reduces the number of times we must move the block
|
|
|
|
* onto another dlist element.
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
*
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
* We keep track of free chunks within each block by using a block-level free
|
|
|
|
* list. We consult this list when we allocate a new chunk in the block.
|
|
|
|
* The free list is a linked list, the head of which is pointed to with
|
|
|
|
* SlabBlock's freehead field. Each subsequent list item is stored in the
|
|
|
|
* free chunk's memory. We ensure chunks are large enough to store this
|
|
|
|
* address.
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
*
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
* When we allocate a new block, technically all chunks are free, however, to
|
|
|
|
* avoid having to write out the entire block to set the linked list for the
|
|
|
|
* free chunks for every chunk in the block, we instead store a pointer to
|
|
|
|
* the next "unused" chunk on the block and keep track of how many of these
|
|
|
|
* unused chunks there are. When a new block is malloc'd, all chunks are
|
|
|
|
* unused. The unused pointer starts with the first chunk on the block and
|
|
|
|
* as chunks are allocated, the unused pointer is incremented. As chunks are
|
|
|
|
* pfree'd, the unused pointer never goes backwards. The unused pointer can
|
|
|
|
* be thought of as a high watermark for the maximum number of chunks in the
|
|
|
|
* block which have been in use concurrently. When a chunk is pfree'd the
|
|
|
|
* chunk is put onto the head of the free list and the unused pointer is not
|
|
|
|
* changed. We only consume more unused chunks if we run out of free chunks
|
|
|
|
* on the free list. This method effectively gives priority to using
|
|
|
|
* previously used chunks over previously unused chunks, which should perform
|
|
|
|
* better due to CPU caching effects.
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
*
|
|
|
|
*-------------------------------------------------------------------------
|
|
|
|
*/
|
|
|
|
|
|
|
|
#include "postgres.h"
|
|
|
|
|
2019-11-12 04:00:16 +01:00
|
|
|
#include "lib/ilist.h"
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
#include "utils/memdebug.h"
|
|
|
|
#include "utils/memutils.h"
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
#include "utils/memutils_memorychunk.h"
|
|
|
|
#include "utils/memutils_internal.h"
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
|
Use MAXALIGN() in calculations using sizeof(SlabBlock)
c6e0fe1f2 added a new pointer field to SlabBlock to make it 4 bytes larger
on 32-bit machines. Prior to that commit, the size of that struct was a
multiple of 8, which meant that MAXALIGN(sizeof(SlabBlock)) was the same
as sizeof(SlabBlock), however, after c6e0fe1f2, due to the addition of the
new pointer field to store a pointer to the owning context, that was no
longer true on builds with sizeof(void *) == 4.
This problem was highlighted by an Assert failure which was checking that
the pointer given to pfree() was MAXALIGNED. Various 32-bit ARM buildfarm
animals were failing. These have MAXIMUM_ALIGNOF of 8. The only 32-bit
testing I'd managed to do on c6e0fe1f2 had been on x86, which has a
MAXIMUM_ALIGNOF of 4, therefore did not exhibit this issue.
Here we define Slab_BLOCKHDRSZ and copy what is being done in aset.c and
generation.c for doing calculations based on the size of the context's
block type. This means that SlabAlloc() will now always return a
MAXALIGNed pointer.
This also fixes an incorrect sentinel_ok() check in SlabCheck() which was
incorrectly checking the wrong sentinel byte. This must have previously
not caused any issues due to the fullChunkSize never being large enough to
store the sentinel byte.
Diagnosed-by: Tomas Vondra, Tom Lane
Author: Tomas Vondra, David Rowley
Discussion: https://postgr.es/m/CAA4eK1%2B1JyW5TiL%3DyV-3Uq1CrfnTyn0Xrk5uArt31Z%3D8rgPhXQ%40mail.gmail.com
2022-08-30 04:36:04 +02:00
|
|
|
#define Slab_BLOCKHDRSZ MAXALIGN(sizeof(SlabBlock))
|
|
|
|
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
#ifdef MEMORY_CONTEXT_CHECKING
|
|
|
|
/*
|
|
|
|
* Size of the memory required to store the SlabContext.
|
|
|
|
* MEMORY_CONTEXT_CHECKING builds need some extra memory for the isChunkFree
|
|
|
|
* array.
|
|
|
|
*/
|
|
|
|
#define Slab_CONTEXT_HDRSZ(chunksPerBlock) \
|
|
|
|
(sizeof(SlabContext) + ((chunksPerBlock) * sizeof(bool)))
|
|
|
|
#else
|
|
|
|
#define Slab_CONTEXT_HDRSZ(chunksPerBlock) sizeof(SlabContext)
|
|
|
|
#endif
|
|
|
|
|
|
|
|
/*
|
|
|
|
* The number of partitions to divide the blocklist into based their number of
|
|
|
|
* free chunks. There must be at least 2.
|
|
|
|
*/
|
|
|
|
#define SLAB_BLOCKLIST_COUNT 3
|
|
|
|
|
|
|
|
/* The maximum number of completely empty blocks to keep around for reuse. */
|
|
|
|
#define SLAB_MAXIMUM_EMPTY_BLOCKS 10
|
|
|
|
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
/*
|
|
|
|
* SlabContext is a specialized implementation of MemoryContext.
|
|
|
|
*/
|
|
|
|
typedef struct SlabContext
|
|
|
|
{
|
|
|
|
MemoryContextData header; /* Standard memory-context fields */
|
|
|
|
/* Allocation parameters for this context: */
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
Size chunkSize; /* the requested (non-aligned) chunk size */
|
|
|
|
Size fullChunkSize; /* chunk size with chunk header and alignment */
|
|
|
|
Size blockSize; /* the size to make each block of chunks */
|
|
|
|
int32 chunksPerBlock; /* number of chunks that fit in 1 block */
|
|
|
|
int32 curBlocklistIndex; /* index into the blocklist[] element
|
|
|
|
* containing the fullest, blocks */
|
2020-01-17 14:06:28 +01:00
|
|
|
#ifdef MEMORY_CONTEXT_CHECKING
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
bool *isChunkFree; /* array to mark free chunks in a block during
|
|
|
|
* SlabCheck */
|
2020-01-17 14:06:28 +01:00
|
|
|
#endif
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
|
|
|
|
int32 blocklist_shift; /* number of bits to shift the nfree count
|
|
|
|
* by to get the index into blocklist[] */
|
|
|
|
dclist_head emptyblocks; /* empty blocks to use up first instead of
|
|
|
|
* mallocing new blocks */
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Blocks with free space, grouped by the number of free chunks they
|
|
|
|
* contain. Completely full blocks are stored in the 0th element.
|
|
|
|
* Completely empty blocks are stored in emptyblocks or free'd if we have
|
|
|
|
* enough empty blocks already.
|
|
|
|
*/
|
|
|
|
dlist_head blocklist[SLAB_BLOCKLIST_COUNT];
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
} SlabContext;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* SlabBlock
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
* Structure of a single slab block.
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
*
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
* slab: pointer back to the owning MemoryContext
|
|
|
|
* nfree: number of chunks on the block which are unallocated
|
|
|
|
* nunused: number of chunks on the block unallocated and not on the block's
|
|
|
|
* freelist.
|
|
|
|
* freehead: linked-list header storing a pointer to the first free chunk on
|
|
|
|
* the block. Subsequent pointers are stored in the chunk's memory. NULL
|
|
|
|
* indicates the end of the list.
|
|
|
|
* unused: pointer to the next chunk which has yet to be used.
|
|
|
|
* node: doubly-linked list node for the context's blocklist
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
*/
|
|
|
|
typedef struct SlabBlock
|
|
|
|
{
|
2017-02-28 08:32:22 +01:00
|
|
|
SlabContext *slab; /* owning context */
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
int32 nfree; /* number of chunks on free + unused chunks */
|
|
|
|
int32 nunused; /* number of unused chunks */
|
|
|
|
MemoryChunk *freehead; /* pointer to the first free chunk */
|
|
|
|
MemoryChunk *unused; /* pointer to the next unused chunk */
|
|
|
|
dlist_node node; /* doubly-linked list for blocklist[] */
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
} SlabBlock;
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
|
|
|
|
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
#define Slab_CHUNKHDRSZ sizeof(MemoryChunk)
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
#define SlabChunkGetPointer(chk) \
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
((void *) (((char *) (chk)) + sizeof(MemoryChunk)))
|
|
|
|
|
|
|
|
/*
|
|
|
|
* SlabBlockGetChunk
|
|
|
|
* Obtain a pointer to the nth (0-based) chunk in the block
|
|
|
|
*/
|
|
|
|
#define SlabBlockGetChunk(slab, block, n) \
|
Use MAXALIGN() in calculations using sizeof(SlabBlock)
c6e0fe1f2 added a new pointer field to SlabBlock to make it 4 bytes larger
on 32-bit machines. Prior to that commit, the size of that struct was a
multiple of 8, which meant that MAXALIGN(sizeof(SlabBlock)) was the same
as sizeof(SlabBlock), however, after c6e0fe1f2, due to the addition of the
new pointer field to store a pointer to the owning context, that was no
longer true on builds with sizeof(void *) == 4.
This problem was highlighted by an Assert failure which was checking that
the pointer given to pfree() was MAXALIGNED. Various 32-bit ARM buildfarm
animals were failing. These have MAXIMUM_ALIGNOF of 8. The only 32-bit
testing I'd managed to do on c6e0fe1f2 had been on x86, which has a
MAXIMUM_ALIGNOF of 4, therefore did not exhibit this issue.
Here we define Slab_BLOCKHDRSZ and copy what is being done in aset.c and
generation.c for doing calculations based on the size of the context's
block type. This means that SlabAlloc() will now always return a
MAXALIGNed pointer.
This also fixes an incorrect sentinel_ok() check in SlabCheck() which was
incorrectly checking the wrong sentinel byte. This must have previously
not caused any issues due to the fullChunkSize never being large enough to
store the sentinel byte.
Diagnosed-by: Tomas Vondra, Tom Lane
Author: Tomas Vondra, David Rowley
Discussion: https://postgr.es/m/CAA4eK1%2B1JyW5TiL%3DyV-3Uq1CrfnTyn0Xrk5uArt31Z%3D8rgPhXQ%40mail.gmail.com
2022-08-30 04:36:04 +02:00
|
|
|
((MemoryChunk *) ((char *) (block) + Slab_BLOCKHDRSZ \
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
+ ((n) * (slab)->fullChunkSize)))
|
|
|
|
|
|
|
|
#if defined(MEMORY_CONTEXT_CHECKING) || defined(USE_ASSERT_CHECKING)
|
|
|
|
|
|
|
|
/*
|
|
|
|
* SlabChunkIndex
|
|
|
|
* Get the 0-based index of how many chunks into the block the given
|
|
|
|
* chunk is.
|
|
|
|
*/
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
#define SlabChunkIndex(slab, block, chunk) \
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
(((char *) (chunk) - (char *) SlabBlockGetChunk(slab, block, 0)) / \
|
|
|
|
(slab)->fullChunkSize)
|
|
|
|
|
|
|
|
/*
|
|
|
|
* SlabChunkMod
|
|
|
|
* A MemoryChunk should always be at an address which is a multiple of
|
|
|
|
* fullChunkSize starting from the 0th chunk position. This will return
|
|
|
|
* non-zero if it's not.
|
|
|
|
*/
|
|
|
|
#define SlabChunkMod(slab, block, chunk) \
|
|
|
|
(((char *) (chunk) - (char *) SlabBlockGetChunk(slab, block, 0)) % \
|
|
|
|
(slab)->fullChunkSize)
|
|
|
|
|
|
|
|
#endif
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
|
Harden memory context allocators against bogus chunk pointers.
Before commit c6e0fe1f2, functions such as AllocSetFree could pretty
safely presume that they were given a valid chunk pointer for their
own type of context, because the indirect call through a memory
context object and method struct would be very unlikely to work
otherwise. But now, if pfree() is mistakenly invoked on a pointer
to garbage, we have three chances in eight of ending up at one of
these functions. That means we need to take extra measures to
verify that we are looking at what we're supposed to be looking at,
especially in debug builds.
Hence, add code to verify that the chunk's back-link to a block header
leads to a memory context object that satisfies the right sort of
IsA() check. This is still a bit weaker than what we did before,
but for the moment assume that an IsA() check is sufficient.
As a compromise between speed and safety, implement these checks
as Asserts when dealing with small chunks but plain test-and-elogs
when dealing with large (external) chunks. The latter case should
not be too performance-critical, but the former case probably is.
In slab.c, all chunks are small; but nonetheless use a plain test
in SlabRealloc, because that is certainly not performance-critical,
indeed we should be suspicious that it's being called in error.
In aset.c, additionally add some assertions that the "value" field
of the chunk header is within the small range allowed for freelist
indexes. Without that, we might find ourselves trying to wipe
most of memory when CLOBBER_FREED_MEMORY is enabled, or scribbling
on a "freelist header" that's far away from the context object.
Eventually, field experience might show us that it's smarter for
these tests to be active always, but for now we'll try to get
away with just having them as assertions.
While at it, also be more uniform about asserting that context
objects passed as parameters are of the type we expect. Some
places missed that altogether, and slab.c was for no very good
reason doing it differently from the other allocators.
Discussion: https://postgr.es/m/3578387.1665244345@sss.pgh.pa.us
2022-10-11 00:45:34 +02:00
|
|
|
/*
|
|
|
|
* SlabIsValid
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
* True iff set is a valid slab allocation set.
|
Harden memory context allocators against bogus chunk pointers.
Before commit c6e0fe1f2, functions such as AllocSetFree could pretty
safely presume that they were given a valid chunk pointer for their
own type of context, because the indirect call through a memory
context object and method struct would be very unlikely to work
otherwise. But now, if pfree() is mistakenly invoked on a pointer
to garbage, we have three chances in eight of ending up at one of
these functions. That means we need to take extra measures to
verify that we are looking at what we're supposed to be looking at,
especially in debug builds.
Hence, add code to verify that the chunk's back-link to a block header
leads to a memory context object that satisfies the right sort of
IsA() check. This is still a bit weaker than what we did before,
but for the moment assume that an IsA() check is sufficient.
As a compromise between speed and safety, implement these checks
as Asserts when dealing with small chunks but plain test-and-elogs
when dealing with large (external) chunks. The latter case should
not be too performance-critical, but the former case probably is.
In slab.c, all chunks are small; but nonetheless use a plain test
in SlabRealloc, because that is certainly not performance-critical,
indeed we should be suspicious that it's being called in error.
In aset.c, additionally add some assertions that the "value" field
of the chunk header is within the small range allowed for freelist
indexes. Without that, we might find ourselves trying to wipe
most of memory when CLOBBER_FREED_MEMORY is enabled, or scribbling
on a "freelist header" that's far away from the context object.
Eventually, field experience might show us that it's smarter for
these tests to be active always, but for now we'll try to get
away with just having them as assertions.
While at it, also be more uniform about asserting that context
objects passed as parameters are of the type we expect. Some
places missed that altogether, and slab.c was for no very good
reason doing it differently from the other allocators.
Discussion: https://postgr.es/m/3578387.1665244345@sss.pgh.pa.us
2022-10-11 00:45:34 +02:00
|
|
|
*/
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
#define SlabIsValid(set) (PointerIsValid(set) && IsA(set, SlabContext))
|
Harden memory context allocators against bogus chunk pointers.
Before commit c6e0fe1f2, functions such as AllocSetFree could pretty
safely presume that they were given a valid chunk pointer for their
own type of context, because the indirect call through a memory
context object and method struct would be very unlikely to work
otherwise. But now, if pfree() is mistakenly invoked on a pointer
to garbage, we have three chances in eight of ending up at one of
these functions. That means we need to take extra measures to
verify that we are looking at what we're supposed to be looking at,
especially in debug builds.
Hence, add code to verify that the chunk's back-link to a block header
leads to a memory context object that satisfies the right sort of
IsA() check. This is still a bit weaker than what we did before,
but for the moment assume that an IsA() check is sufficient.
As a compromise between speed and safety, implement these checks
as Asserts when dealing with small chunks but plain test-and-elogs
when dealing with large (external) chunks. The latter case should
not be too performance-critical, but the former case probably is.
In slab.c, all chunks are small; but nonetheless use a plain test
in SlabRealloc, because that is certainly not performance-critical,
indeed we should be suspicious that it's being called in error.
In aset.c, additionally add some assertions that the "value" field
of the chunk header is within the small range allowed for freelist
indexes. Without that, we might find ourselves trying to wipe
most of memory when CLOBBER_FREED_MEMORY is enabled, or scribbling
on a "freelist header" that's far away from the context object.
Eventually, field experience might show us that it's smarter for
these tests to be active always, but for now we'll try to get
away with just having them as assertions.
While at it, also be more uniform about asserting that context
objects passed as parameters are of the type we expect. Some
places missed that altogether, and slab.c was for no very good
reason doing it differently from the other allocators.
Discussion: https://postgr.es/m/3578387.1665244345@sss.pgh.pa.us
2022-10-11 00:45:34 +02:00
|
|
|
|
|
|
|
/*
|
|
|
|
* SlabBlockIsValid
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
* True iff block is a valid block of slab allocation set.
|
Harden memory context allocators against bogus chunk pointers.
Before commit c6e0fe1f2, functions such as AllocSetFree could pretty
safely presume that they were given a valid chunk pointer for their
own type of context, because the indirect call through a memory
context object and method struct would be very unlikely to work
otherwise. But now, if pfree() is mistakenly invoked on a pointer
to garbage, we have three chances in eight of ending up at one of
these functions. That means we need to take extra measures to
verify that we are looking at what we're supposed to be looking at,
especially in debug builds.
Hence, add code to verify that the chunk's back-link to a block header
leads to a memory context object that satisfies the right sort of
IsA() check. This is still a bit weaker than what we did before,
but for the moment assume that an IsA() check is sufficient.
As a compromise between speed and safety, implement these checks
as Asserts when dealing with small chunks but plain test-and-elogs
when dealing with large (external) chunks. The latter case should
not be too performance-critical, but the former case probably is.
In slab.c, all chunks are small; but nonetheless use a plain test
in SlabRealloc, because that is certainly not performance-critical,
indeed we should be suspicious that it's being called in error.
In aset.c, additionally add some assertions that the "value" field
of the chunk header is within the small range allowed for freelist
indexes. Without that, we might find ourselves trying to wipe
most of memory when CLOBBER_FREED_MEMORY is enabled, or scribbling
on a "freelist header" that's far away from the context object.
Eventually, field experience might show us that it's smarter for
these tests to be active always, but for now we'll try to get
away with just having them as assertions.
While at it, also be more uniform about asserting that context
objects passed as parameters are of the type we expect. Some
places missed that altogether, and slab.c was for no very good
reason doing it differently from the other allocators.
Discussion: https://postgr.es/m/3578387.1665244345@sss.pgh.pa.us
2022-10-11 00:45:34 +02:00
|
|
|
*/
|
|
|
|
#define SlabBlockIsValid(block) \
|
|
|
|
(PointerIsValid(block) && SlabIsValid((block)->slab))
|
|
|
|
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
/*
|
|
|
|
* SlabBlocklistIndex
|
|
|
|
* Determine the blocklist index that a block should be in for the given
|
|
|
|
* number of free chunks.
|
|
|
|
*/
|
|
|
|
static inline int32
|
|
|
|
SlabBlocklistIndex(SlabContext *slab, int nfree)
|
|
|
|
{
|
|
|
|
int32 index;
|
|
|
|
int32 blocklist_shift = slab->blocklist_shift;
|
|
|
|
|
|
|
|
Assert(nfree >= 0 && nfree <= slab->chunksPerBlock);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Determine the blocklist index based on the number of free chunks. We
|
|
|
|
* must ensure that 0 free chunks is dedicated to index 0. Everything
|
|
|
|
* else must be >= 1 and < SLAB_BLOCKLIST_COUNT.
|
|
|
|
*
|
|
|
|
* To make this as efficient as possible, we exploit some two's complement
|
|
|
|
* arithmetic where we reverse the sign before bit shifting. This results
|
|
|
|
* in an nfree of 0 using index 0 and anything non-zero staying non-zero.
|
|
|
|
* This is exploiting 0 and -0 being the same in two's complement. When
|
|
|
|
* we're done, we just need to flip the sign back over again for a
|
|
|
|
* positive index.
|
|
|
|
*/
|
|
|
|
index = -((-nfree) >> blocklist_shift);
|
|
|
|
|
|
|
|
if (nfree == 0)
|
|
|
|
Assert(index == 0);
|
|
|
|
else
|
|
|
|
Assert(index >= 1 && index < SLAB_BLOCKLIST_COUNT);
|
|
|
|
|
|
|
|
return index;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* SlabFindNextBlockListIndex
|
|
|
|
* Search blocklist for blocks which have free chunks and return the
|
|
|
|
* index of the blocklist found containing at least 1 block with free
|
|
|
|
* chunks. If no block can be found we return 0.
|
|
|
|
*
|
|
|
|
* Note: We give priority to fuller blocks so that these are filled before
|
|
|
|
* emptier blocks. This is done to increase the chances that mostly-empty
|
|
|
|
* blocks will eventually become completely empty so they can be free'd.
|
|
|
|
*/
|
|
|
|
static int32
|
|
|
|
SlabFindNextBlockListIndex(SlabContext *slab)
|
|
|
|
{
|
|
|
|
/* start at 1 as blocklist[0] is for full blocks. */
|
|
|
|
for (int i = 1; i < SLAB_BLOCKLIST_COUNT; i++)
|
|
|
|
{
|
|
|
|
/* return the first found non-empty index */
|
|
|
|
if (!dlist_is_empty(&slab->blocklist[i]))
|
|
|
|
return i;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* no blocks with free space */
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* SlabGetNextFreeChunk
|
|
|
|
* Return the next free chunk in block and update the block to account
|
|
|
|
* for the returned chunk now being used.
|
|
|
|
*/
|
|
|
|
static inline MemoryChunk *
|
|
|
|
SlabGetNextFreeChunk(SlabContext *slab, SlabBlock *block)
|
|
|
|
{
|
|
|
|
MemoryChunk *chunk;
|
|
|
|
|
|
|
|
Assert(block->nfree > 0);
|
|
|
|
|
|
|
|
if (block->freehead != NULL)
|
|
|
|
{
|
|
|
|
chunk = block->freehead;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Pop the chunk from the linked list of free chunks. The pointer to
|
|
|
|
* the next free chunk is stored in the chunk itself.
|
|
|
|
*/
|
|
|
|
VALGRIND_MAKE_MEM_DEFINED(SlabChunkGetPointer(chunk), sizeof(MemoryChunk *));
|
|
|
|
block->freehead = *(MemoryChunk **) SlabChunkGetPointer(chunk);
|
|
|
|
|
|
|
|
/* check nothing stomped on the free chunk's memory */
|
|
|
|
Assert(block->freehead == NULL ||
|
|
|
|
(block->freehead >= SlabBlockGetChunk(slab, block, 0) &&
|
|
|
|
block->freehead <= SlabBlockGetChunk(slab, block, slab->chunksPerBlock - 1) &&
|
|
|
|
SlabChunkMod(slab, block, block->freehead) == 0));
|
|
|
|
}
|
|
|
|
else
|
|
|
|
{
|
|
|
|
Assert(block->nunused > 0);
|
|
|
|
|
|
|
|
chunk = block->unused;
|
|
|
|
block->unused = (MemoryChunk *) (((char *) block->unused) + slab->fullChunkSize);
|
|
|
|
block->nunused--;
|
|
|
|
}
|
|
|
|
|
|
|
|
block->nfree--;
|
|
|
|
|
|
|
|
return chunk;
|
|
|
|
}
|
Harden memory context allocators against bogus chunk pointers.
Before commit c6e0fe1f2, functions such as AllocSetFree could pretty
safely presume that they were given a valid chunk pointer for their
own type of context, because the indirect call through a memory
context object and method struct would be very unlikely to work
otherwise. But now, if pfree() is mistakenly invoked on a pointer
to garbage, we have three chances in eight of ending up at one of
these functions. That means we need to take extra measures to
verify that we are looking at what we're supposed to be looking at,
especially in debug builds.
Hence, add code to verify that the chunk's back-link to a block header
leads to a memory context object that satisfies the right sort of
IsA() check. This is still a bit weaker than what we did before,
but for the moment assume that an IsA() check is sufficient.
As a compromise between speed and safety, implement these checks
as Asserts when dealing with small chunks but plain test-and-elogs
when dealing with large (external) chunks. The latter case should
not be too performance-critical, but the former case probably is.
In slab.c, all chunks are small; but nonetheless use a plain test
in SlabRealloc, because that is certainly not performance-critical,
indeed we should be suspicious that it's being called in error.
In aset.c, additionally add some assertions that the "value" field
of the chunk header is within the small range allowed for freelist
indexes. Without that, we might find ourselves trying to wipe
most of memory when CLOBBER_FREED_MEMORY is enabled, or scribbling
on a "freelist header" that's far away from the context object.
Eventually, field experience might show us that it's smarter for
these tests to be active always, but for now we'll try to get
away with just having them as assertions.
While at it, also be more uniform about asserting that context
objects passed as parameters are of the type we expect. Some
places missed that altogether, and slab.c was for no very good
reason doing it differently from the other allocators.
Discussion: https://postgr.es/m/3578387.1665244345@sss.pgh.pa.us
2022-10-11 00:45:34 +02:00
|
|
|
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
/*
|
|
|
|
* SlabContextCreate
|
|
|
|
* Create a new Slab context.
|
|
|
|
*
|
|
|
|
* parent: parent context, or NULL if top-level context
|
Allow memory contexts to have both fixed and variable ident strings.
Originally, we treated memory context names as potentially variable in
all cases, and therefore always copied them into the context header.
Commit 9fa6f00b1 rethought this a little bit and invented a distinction
between fixed and variable names, skipping the copy step for the former.
But we can make things both simpler and more useful by instead allowing
there to be two parts to a context's identification, a fixed "name" and
an optional, variable "ident". The name supplied in the context create
call is now required to be a compile-time-constant string in all cases,
as it is never copied but just pointed to. The "ident" string, if
wanted, is supplied later. This is needed because typically we want
the ident to be stored inside the context so that it's cleaned up
automatically on context deletion; that means it has to be copied into
the context before we can set the pointer.
The cost of this approach is basically just an additional pointer field
in struct MemoryContextData, which isn't much overhead, and is bought
back entirely in the AllocSet case by not needing a headerSize field
anymore, since we no longer have to cope with variable header length.
In addition, we can simplify the internal interfaces for memory context
creation still further, saving a few cycles there. And it's no longer
true that a custom identifier disqualifies a context from participating
in aset.c's freelist scheme, so possibly there's some win on that end.
All the places that were using non-compile-time-constant context names
are adjusted to put the variable info into the "ident" instead. This
allows more effective identification of those contexts in many cases;
for example, subsidary contexts of relcache entries are now identified
by both type (e.g. "index info") and relname, where before you got only
one or the other. Contexts associated with PL function cache entries
are now identified more fully and uniformly, too.
I also arranged for plancache contexts to use the query source string
as their identifier. This is basically free for CachedPlanSources, as
they contained a copy of that string already. We pay an extra pstrdup
to do it for CachedPlans. That could perhaps be avoided, but it would
make things more fragile (since the CachedPlanSource is sometimes
destroyed first). I suspect future improvements in error reporting will
require CachedPlans to have a copy of that string anyway, so it's not
clear that it's worth moving mountains to avoid it now.
This also changes the APIs for context statistics routines so that the
context-specific routines no longer assume that output goes straight
to stderr, nor do they know all details of the output format. This
is useful immediately to reduce code duplication, and it also allows
for external code to do something with stats output that's different
from printing to stderr.
The reason for pushing this now rather than waiting for v12 is that
it rethinks some of the API changes made by commit 9fa6f00b1. Seems
better for extension authors to endure just one round of API changes
not two.
Discussion: https://postgr.es/m/CAB=Je-FdtmFZ9y9REHD7VsSrnCkiBhsA4mdsLKSPauwXtQBeNA@mail.gmail.com
2018-03-27 22:46:47 +02:00
|
|
|
* name: name of context (must be statically allocated)
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
* blockSize: allocation block size
|
|
|
|
* chunkSize: allocation chunk size
|
|
|
|
*
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
* The MAXALIGN(chunkSize) may not exceed MEMORYCHUNK_MAX_VALUE
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
*/
|
|
|
|
MemoryContext
|
|
|
|
SlabContextCreate(MemoryContext parent,
|
|
|
|
const char *name,
|
|
|
|
Size blockSize,
|
|
|
|
Size chunkSize)
|
|
|
|
{
|
|
|
|
int chunksPerBlock;
|
|
|
|
Size fullChunkSize;
|
|
|
|
SlabContext *slab;
|
Rethink MemoryContext creation to improve performance.
This patch makes a number of interrelated changes to reduce the overhead
involved in creating/deleting memory contexts. The key ideas are:
* Include the AllocSetContext header of an aset.c context in its first
malloc request, rather than allocating it separately in TopMemoryContext.
This means that we now always create an initial or "keeper" block in an
aset, even if it never receives any allocation requests.
* Create freelists in which we can save and recycle recently-destroyed
asets (this idea is due to Robert Haas).
* In the common case where the name of a context is a constant string,
just store a pointer to it in the context header, rather than copying
the string.
The first change eliminates a palloc/pfree cycle per context, and
also avoids bloat in TopMemoryContext, at the price that creating
a context now involves a malloc/free cycle even if the context never
receives any allocations. That would be a loser for some common
usage patterns, but recycling short-lived contexts via the freelist
eliminates that pain.
Avoiding copying constant strings not only saves strlen() and strcpy()
overhead, but is an essential part of the freelist optimization because
it makes the context header size constant. Currently we make no
attempt to use the freelist for contexts with non-constant names.
(Perhaps someday we'll need to think harder about that, but in current
usage, most contexts with custom names are long-lived anyway.)
The freelist management in this initial commit is pretty simplistic,
and we might want to refine it later --- but in common workloads that
will never matter because the freelists will never get full anyway.
To create a context with a non-constant name, one is now required to
call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME
option. AllocSetContextCreate becomes a wrapper macro, and it includes
a test that will complain about non-string-literal context name
parameters on gcc and similar compilers.
An unfortunate side effect of making AllocSetContextCreate a macro is
that one is now *required* to use the size parameter abstraction macros
(ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of
writing out individual size parameters no longer works unless you
switch to AllocSetContextCreateExtended.
Internally to the memory-context-related modules, the context creation
APIs are simplified, removing the rather baroque original design whereby
a context-type module called mcxt.c which then called back into the
context-type module. That saved a bit of code duplication, but not much,
and it prevented context-type modules from exercising control over the
allocation of context headers.
In passing, I converted the test-and-elog validation of aset size
parameters into Asserts to save a few more cycles. The original thought
was that callers might compute size parameters on the fly, but in practice
nobody does that, so it's useless to expend cycles on checking those
numbers in production builds.
Also, mark the memory context method-pointer structs "const",
just for cleanliness.
Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
2017-12-13 19:55:12 +01:00
|
|
|
int i;
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
/* ensure MemoryChunk's size is properly maxaligned */
|
2022-12-08 14:30:01 +01:00
|
|
|
StaticAssertDecl(Slab_CHUNKHDRSZ == MAXALIGN(Slab_CHUNKHDRSZ),
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
"sizeof(MemoryChunk) is not maxaligned");
|
|
|
|
Assert(MAXALIGN(chunkSize) <= MEMORYCHUNK_MAX_VALUE);
|
2017-02-28 08:32:22 +01:00
|
|
|
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
/*
|
|
|
|
* Ensure there's enough space to store the pointer to the next free chunk
|
|
|
|
* in the memory of the (otherwise) unused allocation.
|
|
|
|
*/
|
|
|
|
if (chunkSize < sizeof(MemoryChunk *))
|
|
|
|
chunkSize = sizeof(MemoryChunk *);
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
/* length of the maxaligned chunk including the chunk header */
|
Make more effort to put a sentinel at the end of allocated memory
Traditionally, in MEMORY_CONTEXT_CHECKING builds, we only ever marked a
sentinel byte just beyond the requested size if there happened to be
enough space on the chunk to do so. For Slab and Generation context
types, we only rounded the size of the chunk up to the next maxalign
boundary, so it was often not that likely that those would ever have space
for the sentinel given that the majority of allocation requests are going
to be for sizes which are maxaligned. For AllocSet, it was a little
different as smaller allocations are rounded up to the next power-of-2
value rather than the next maxalign boundary, so we're a bit more likely
to have space for the sentinel byte, especially when we get away from tiny
sized allocations such as 8 or 16 bytes.
Here we make more of an effort to allow space so that there is enough room
for the sentinel byte in more cases. This makes it more likely that we'll
detect when buggy code accidentally writes beyond the end of any of its
memory allocations.
Each of the 3 MemoryContext types has been changed as follows:
The Slab allocator will now always set a sentinel byte. Both the current
usages of this MemoryContext type happen to use chunk sizes which were on
the maxalign boundary, so these never used sentinel bytes previously.
For the Generation allocator, we now always ensure there's enough space in
the allocation for a sentinel byte.
For AllocSet, this commit makes an adjustment for allocation sizes which
are greater than allocChunkLimit. We now ensure there is always space for
a sentinel byte. We don't alter the sentinel behavior for request sizes
<= allocChunkLimit. Making way for the sentinel byte for power-of-2
request sizes would require doubling up to the next power of 2. Some
analysis done on the request sizes made during installcheck shows that a
fairly large portion of allocation requests are for power-of-2 sizes. The
amount of additional memory for the sentinel there seems prohibitive, so
we do nothing for those here.
Author: David Rowley
Discussion: https://postgr.es/m/3478405.1661824539@sss.pgh.pa.us
2022-09-07 05:46:57 +02:00
|
|
|
#ifdef MEMORY_CONTEXT_CHECKING
|
|
|
|
/* ensure there's always space for the sentinel byte */
|
|
|
|
fullChunkSize = Slab_CHUNKHDRSZ + MAXALIGN(chunkSize + 1);
|
|
|
|
#else
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
fullChunkSize = Slab_CHUNKHDRSZ + MAXALIGN(chunkSize);
|
Make more effort to put a sentinel at the end of allocated memory
Traditionally, in MEMORY_CONTEXT_CHECKING builds, we only ever marked a
sentinel byte just beyond the requested size if there happened to be
enough space on the chunk to do so. For Slab and Generation context
types, we only rounded the size of the chunk up to the next maxalign
boundary, so it was often not that likely that those would ever have space
for the sentinel given that the majority of allocation requests are going
to be for sizes which are maxaligned. For AllocSet, it was a little
different as smaller allocations are rounded up to the next power-of-2
value rather than the next maxalign boundary, so we're a bit more likely
to have space for the sentinel byte, especially when we get away from tiny
sized allocations such as 8 or 16 bytes.
Here we make more of an effort to allow space so that there is enough room
for the sentinel byte in more cases. This makes it more likely that we'll
detect when buggy code accidentally writes beyond the end of any of its
memory allocations.
Each of the 3 MemoryContext types has been changed as follows:
The Slab allocator will now always set a sentinel byte. Both the current
usages of this MemoryContext type happen to use chunk sizes which were on
the maxalign boundary, so these never used sentinel bytes previously.
For the Generation allocator, we now always ensure there's enough space in
the allocation for a sentinel byte.
For AllocSet, this commit makes an adjustment for allocation sizes which
are greater than allocChunkLimit. We now ensure there is always space for
a sentinel byte. We don't alter the sentinel behavior for request sizes
<= allocChunkLimit. Making way for the sentinel byte for power-of-2
request sizes would require doubling up to the next power of 2. Some
analysis done on the request sizes made during installcheck shows that a
fairly large portion of allocation requests are for power-of-2 sizes. The
amount of additional memory for the sentinel there seems prohibitive, so
we do nothing for those here.
Author: David Rowley
Discussion: https://postgr.es/m/3478405.1661824539@sss.pgh.pa.us
2022-09-07 05:46:57 +02:00
|
|
|
#endif
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
/* compute the number of chunks that will fit on each block */
|
Use MAXALIGN() in calculations using sizeof(SlabBlock)
c6e0fe1f2 added a new pointer field to SlabBlock to make it 4 bytes larger
on 32-bit machines. Prior to that commit, the size of that struct was a
multiple of 8, which meant that MAXALIGN(sizeof(SlabBlock)) was the same
as sizeof(SlabBlock), however, after c6e0fe1f2, due to the addition of the
new pointer field to store a pointer to the owning context, that was no
longer true on builds with sizeof(void *) == 4.
This problem was highlighted by an Assert failure which was checking that
the pointer given to pfree() was MAXALIGNED. Various 32-bit ARM buildfarm
animals were failing. These have MAXIMUM_ALIGNOF of 8. The only 32-bit
testing I'd managed to do on c6e0fe1f2 had been on x86, which has a
MAXIMUM_ALIGNOF of 4, therefore did not exhibit this issue.
Here we define Slab_BLOCKHDRSZ and copy what is being done in aset.c and
generation.c for doing calculations based on the size of the context's
block type. This means that SlabAlloc() will now always return a
MAXALIGNed pointer.
This also fixes an incorrect sentinel_ok() check in SlabCheck() which was
incorrectly checking the wrong sentinel byte. This must have previously
not caused any issues due to the fullChunkSize never being large enough to
store the sentinel byte.
Diagnosed-by: Tomas Vondra, Tom Lane
Author: Tomas Vondra, David Rowley
Discussion: https://postgr.es/m/CAA4eK1%2B1JyW5TiL%3DyV-3Uq1CrfnTyn0Xrk5uArt31Z%3D8rgPhXQ%40mail.gmail.com
2022-08-30 04:36:04 +02:00
|
|
|
chunksPerBlock = (blockSize - Slab_BLOCKHDRSZ) / fullChunkSize;
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
/* Make sure the block can store at least one chunk. */
|
|
|
|
if (chunksPerBlock == 0)
|
|
|
|
elog(ERROR, "block size %zu for slab is too small for %zu-byte chunks",
|
|
|
|
blockSize, chunkSize);
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
|
|
|
|
|
2020-01-17 14:06:28 +01:00
|
|
|
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
slab = (SlabContext *) malloc(Slab_CONTEXT_HDRSZ(chunksPerBlock));
|
Rethink MemoryContext creation to improve performance.
This patch makes a number of interrelated changes to reduce the overhead
involved in creating/deleting memory contexts. The key ideas are:
* Include the AllocSetContext header of an aset.c context in its first
malloc request, rather than allocating it separately in TopMemoryContext.
This means that we now always create an initial or "keeper" block in an
aset, even if it never receives any allocation requests.
* Create freelists in which we can save and recycle recently-destroyed
asets (this idea is due to Robert Haas).
* In the common case where the name of a context is a constant string,
just store a pointer to it in the context header, rather than copying
the string.
The first change eliminates a palloc/pfree cycle per context, and
also avoids bloat in TopMemoryContext, at the price that creating
a context now involves a malloc/free cycle even if the context never
receives any allocations. That would be a loser for some common
usage patterns, but recycling short-lived contexts via the freelist
eliminates that pain.
Avoiding copying constant strings not only saves strlen() and strcpy()
overhead, but is an essential part of the freelist optimization because
it makes the context header size constant. Currently we make no
attempt to use the freelist for contexts with non-constant names.
(Perhaps someday we'll need to think harder about that, but in current
usage, most contexts with custom names are long-lived anyway.)
The freelist management in this initial commit is pretty simplistic,
and we might want to refine it later --- but in common workloads that
will never matter because the freelists will never get full anyway.
To create a context with a non-constant name, one is now required to
call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME
option. AllocSetContextCreate becomes a wrapper macro, and it includes
a test that will complain about non-string-literal context name
parameters on gcc and similar compilers.
An unfortunate side effect of making AllocSetContextCreate a macro is
that one is now *required* to use the size parameter abstraction macros
(ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of
writing out individual size parameters no longer works unless you
switch to AllocSetContextCreateExtended.
Internally to the memory-context-related modules, the context creation
APIs are simplified, removing the rather baroque original design whereby
a context-type module called mcxt.c which then called back into the
context-type module. That saved a bit of code duplication, but not much,
and it prevented context-type modules from exercising control over the
allocation of context headers.
In passing, I converted the test-and-elog validation of aset size
parameters into Asserts to save a few more cycles. The original thought
was that callers might compute size parameters on the fly, but in practice
nobody does that, so it's useless to expend cycles on checking those
numbers in production builds.
Also, mark the memory context method-pointer structs "const",
just for cleanliness.
Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
2017-12-13 19:55:12 +01:00
|
|
|
if (slab == NULL)
|
|
|
|
{
|
|
|
|
MemoryContextStats(TopMemoryContext);
|
|
|
|
ereport(ERROR,
|
|
|
|
(errcode(ERRCODE_OUT_OF_MEMORY),
|
|
|
|
errmsg("out of memory"),
|
|
|
|
errdetail("Failed while creating memory context \"%s\".",
|
|
|
|
name)));
|
|
|
|
}
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
|
Rethink MemoryContext creation to improve performance.
This patch makes a number of interrelated changes to reduce the overhead
involved in creating/deleting memory contexts. The key ideas are:
* Include the AllocSetContext header of an aset.c context in its first
malloc request, rather than allocating it separately in TopMemoryContext.
This means that we now always create an initial or "keeper" block in an
aset, even if it never receives any allocation requests.
* Create freelists in which we can save and recycle recently-destroyed
asets (this idea is due to Robert Haas).
* In the common case where the name of a context is a constant string,
just store a pointer to it in the context header, rather than copying
the string.
The first change eliminates a palloc/pfree cycle per context, and
also avoids bloat in TopMemoryContext, at the price that creating
a context now involves a malloc/free cycle even if the context never
receives any allocations. That would be a loser for some common
usage patterns, but recycling short-lived contexts via the freelist
eliminates that pain.
Avoiding copying constant strings not only saves strlen() and strcpy()
overhead, but is an essential part of the freelist optimization because
it makes the context header size constant. Currently we make no
attempt to use the freelist for contexts with non-constant names.
(Perhaps someday we'll need to think harder about that, but in current
usage, most contexts with custom names are long-lived anyway.)
The freelist management in this initial commit is pretty simplistic,
and we might want to refine it later --- but in common workloads that
will never matter because the freelists will never get full anyway.
To create a context with a non-constant name, one is now required to
call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME
option. AllocSetContextCreate becomes a wrapper macro, and it includes
a test that will complain about non-string-literal context name
parameters on gcc and similar compilers.
An unfortunate side effect of making AllocSetContextCreate a macro is
that one is now *required* to use the size parameter abstraction macros
(ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of
writing out individual size parameters no longer works unless you
switch to AllocSetContextCreateExtended.
Internally to the memory-context-related modules, the context creation
APIs are simplified, removing the rather baroque original design whereby
a context-type module called mcxt.c which then called back into the
context-type module. That saved a bit of code duplication, but not much,
and it prevented context-type modules from exercising control over the
allocation of context headers.
In passing, I converted the test-and-elog validation of aset size
parameters into Asserts to save a few more cycles. The original thought
was that callers might compute size parameters on the fly, but in practice
nobody does that, so it's useless to expend cycles on checking those
numbers in production builds.
Also, mark the memory context method-pointer structs "const",
just for cleanliness.
Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
2017-12-13 19:55:12 +01:00
|
|
|
/*
|
|
|
|
* Avoid writing code that can fail between here and MemoryContextCreate;
|
|
|
|
* we'd leak the header if we ereport in this stretch.
|
|
|
|
*/
|
|
|
|
|
|
|
|
/* Fill in SlabContext-specific header fields */
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
slab->chunkSize = chunkSize;
|
|
|
|
slab->fullChunkSize = fullChunkSize;
|
Rethink MemoryContext creation to improve performance.
This patch makes a number of interrelated changes to reduce the overhead
involved in creating/deleting memory contexts. The key ideas are:
* Include the AllocSetContext header of an aset.c context in its first
malloc request, rather than allocating it separately in TopMemoryContext.
This means that we now always create an initial or "keeper" block in an
aset, even if it never receives any allocation requests.
* Create freelists in which we can save and recycle recently-destroyed
asets (this idea is due to Robert Haas).
* In the common case where the name of a context is a constant string,
just store a pointer to it in the context header, rather than copying
the string.
The first change eliminates a palloc/pfree cycle per context, and
also avoids bloat in TopMemoryContext, at the price that creating
a context now involves a malloc/free cycle even if the context never
receives any allocations. That would be a loser for some common
usage patterns, but recycling short-lived contexts via the freelist
eliminates that pain.
Avoiding copying constant strings not only saves strlen() and strcpy()
overhead, but is an essential part of the freelist optimization because
it makes the context header size constant. Currently we make no
attempt to use the freelist for contexts with non-constant names.
(Perhaps someday we'll need to think harder about that, but in current
usage, most contexts with custom names are long-lived anyway.)
The freelist management in this initial commit is pretty simplistic,
and we might want to refine it later --- but in common workloads that
will never matter because the freelists will never get full anyway.
To create a context with a non-constant name, one is now required to
call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME
option. AllocSetContextCreate becomes a wrapper macro, and it includes
a test that will complain about non-string-literal context name
parameters on gcc and similar compilers.
An unfortunate side effect of making AllocSetContextCreate a macro is
that one is now *required* to use the size parameter abstraction macros
(ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of
writing out individual size parameters no longer works unless you
switch to AllocSetContextCreateExtended.
Internally to the memory-context-related modules, the context creation
APIs are simplified, removing the rather baroque original design whereby
a context-type module called mcxt.c which then called back into the
context-type module. That saved a bit of code duplication, but not much,
and it prevented context-type modules from exercising control over the
allocation of context headers.
In passing, I converted the test-and-elog validation of aset size
parameters into Asserts to save a few more cycles. The original thought
was that callers might compute size parameters on the fly, but in practice
nobody does that, so it's useless to expend cycles on checking those
numbers in production builds.
Also, mark the memory context method-pointer structs "const",
just for cleanliness.
Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
2017-12-13 19:55:12 +01:00
|
|
|
slab->blockSize = blockSize;
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
slab->chunksPerBlock = chunksPerBlock;
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
slab->curBlocklistIndex = 0;
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
/*
|
|
|
|
* Compute a shift that guarantees that shifting chunksPerBlock with it is
|
|
|
|
* < SLAB_BLOCKLIST_COUNT - 1. The reason that we subtract 1 from
|
|
|
|
* SLAB_BLOCKLIST_COUNT in this calculation is that we reserve the 0th
|
|
|
|
* blocklist element for blocks which have no free chunks.
|
|
|
|
*
|
|
|
|
* We calculate the number of bits to shift by rather than a divisor to
|
|
|
|
* divide by as performing division each time we need to find the
|
|
|
|
* blocklist index would be much slower.
|
|
|
|
*/
|
|
|
|
slab->blocklist_shift = 0;
|
|
|
|
while ((slab->chunksPerBlock >> slab->blocklist_shift) >= (SLAB_BLOCKLIST_COUNT - 1))
|
|
|
|
slab->blocklist_shift++;
|
|
|
|
|
|
|
|
/* initialize the list to store empty blocks to be reused */
|
|
|
|
dclist_init(&slab->emptyblocks);
|
|
|
|
|
|
|
|
/* initialize each blocklist slot */
|
|
|
|
for (i = 0; i < SLAB_BLOCKLIST_COUNT; i++)
|
|
|
|
dlist_init(&slab->blocklist[i]);
|
Rethink MemoryContext creation to improve performance.
This patch makes a number of interrelated changes to reduce the overhead
involved in creating/deleting memory contexts. The key ideas are:
* Include the AllocSetContext header of an aset.c context in its first
malloc request, rather than allocating it separately in TopMemoryContext.
This means that we now always create an initial or "keeper" block in an
aset, even if it never receives any allocation requests.
* Create freelists in which we can save and recycle recently-destroyed
asets (this idea is due to Robert Haas).
* In the common case where the name of a context is a constant string,
just store a pointer to it in the context header, rather than copying
the string.
The first change eliminates a palloc/pfree cycle per context, and
also avoids bloat in TopMemoryContext, at the price that creating
a context now involves a malloc/free cycle even if the context never
receives any allocations. That would be a loser for some common
usage patterns, but recycling short-lived contexts via the freelist
eliminates that pain.
Avoiding copying constant strings not only saves strlen() and strcpy()
overhead, but is an essential part of the freelist optimization because
it makes the context header size constant. Currently we make no
attempt to use the freelist for contexts with non-constant names.
(Perhaps someday we'll need to think harder about that, but in current
usage, most contexts with custom names are long-lived anyway.)
The freelist management in this initial commit is pretty simplistic,
and we might want to refine it later --- but in common workloads that
will never matter because the freelists will never get full anyway.
To create a context with a non-constant name, one is now required to
call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME
option. AllocSetContextCreate becomes a wrapper macro, and it includes
a test that will complain about non-string-literal context name
parameters on gcc and similar compilers.
An unfortunate side effect of making AllocSetContextCreate a macro is
that one is now *required* to use the size parameter abstraction macros
(ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of
writing out individual size parameters no longer works unless you
switch to AllocSetContextCreateExtended.
Internally to the memory-context-related modules, the context creation
APIs are simplified, removing the rather baroque original design whereby
a context-type module called mcxt.c which then called back into the
context-type module. That saved a bit of code duplication, but not much,
and it prevented context-type modules from exercising control over the
allocation of context headers.
In passing, I converted the test-and-elog validation of aset size
parameters into Asserts to save a few more cycles. The original thought
was that callers might compute size parameters on the fly, but in practice
nobody does that, so it's useless to expend cycles on checking those
numbers in production builds.
Also, mark the memory context method-pointer structs "const",
just for cleanliness.
Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
2017-12-13 19:55:12 +01:00
|
|
|
|
2020-01-17 14:06:28 +01:00
|
|
|
#ifdef MEMORY_CONTEXT_CHECKING
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
/* set the isChunkFree pointer right after the end of the context */
|
|
|
|
slab->isChunkFree = (bool *) ((char *) slab + sizeof(SlabContext));
|
2020-01-17 14:06:28 +01:00
|
|
|
#endif
|
|
|
|
|
Rethink MemoryContext creation to improve performance.
This patch makes a number of interrelated changes to reduce the overhead
involved in creating/deleting memory contexts. The key ideas are:
* Include the AllocSetContext header of an aset.c context in its first
malloc request, rather than allocating it separately in TopMemoryContext.
This means that we now always create an initial or "keeper" block in an
aset, even if it never receives any allocation requests.
* Create freelists in which we can save and recycle recently-destroyed
asets (this idea is due to Robert Haas).
* In the common case where the name of a context is a constant string,
just store a pointer to it in the context header, rather than copying
the string.
The first change eliminates a palloc/pfree cycle per context, and
also avoids bloat in TopMemoryContext, at the price that creating
a context now involves a malloc/free cycle even if the context never
receives any allocations. That would be a loser for some common
usage patterns, but recycling short-lived contexts via the freelist
eliminates that pain.
Avoiding copying constant strings not only saves strlen() and strcpy()
overhead, but is an essential part of the freelist optimization because
it makes the context header size constant. Currently we make no
attempt to use the freelist for contexts with non-constant names.
(Perhaps someday we'll need to think harder about that, but in current
usage, most contexts with custom names are long-lived anyway.)
The freelist management in this initial commit is pretty simplistic,
and we might want to refine it later --- but in common workloads that
will never matter because the freelists will never get full anyway.
To create a context with a non-constant name, one is now required to
call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME
option. AllocSetContextCreate becomes a wrapper macro, and it includes
a test that will complain about non-string-literal context name
parameters on gcc and similar compilers.
An unfortunate side effect of making AllocSetContextCreate a macro is
that one is now *required* to use the size parameter abstraction macros
(ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of
writing out individual size parameters no longer works unless you
switch to AllocSetContextCreateExtended.
Internally to the memory-context-related modules, the context creation
APIs are simplified, removing the rather baroque original design whereby
a context-type module called mcxt.c which then called back into the
context-type module. That saved a bit of code duplication, but not much,
and it prevented context-type modules from exercising control over the
allocation of context headers.
In passing, I converted the test-and-elog validation of aset size
parameters into Asserts to save a few more cycles. The original thought
was that callers might compute size parameters on the fly, but in practice
nobody does that, so it's useless to expend cycles on checking those
numbers in production builds.
Also, mark the memory context method-pointer structs "const",
just for cleanliness.
Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
2017-12-13 19:55:12 +01:00
|
|
|
/* Finally, do the type-independent part of context creation */
|
|
|
|
MemoryContextCreate((MemoryContext) slab,
|
|
|
|
T_SlabContext,
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
MCTX_SLAB_ID,
|
Rethink MemoryContext creation to improve performance.
This patch makes a number of interrelated changes to reduce the overhead
involved in creating/deleting memory contexts. The key ideas are:
* Include the AllocSetContext header of an aset.c context in its first
malloc request, rather than allocating it separately in TopMemoryContext.
This means that we now always create an initial or "keeper" block in an
aset, even if it never receives any allocation requests.
* Create freelists in which we can save and recycle recently-destroyed
asets (this idea is due to Robert Haas).
* In the common case where the name of a context is a constant string,
just store a pointer to it in the context header, rather than copying
the string.
The first change eliminates a palloc/pfree cycle per context, and
also avoids bloat in TopMemoryContext, at the price that creating
a context now involves a malloc/free cycle even if the context never
receives any allocations. That would be a loser for some common
usage patterns, but recycling short-lived contexts via the freelist
eliminates that pain.
Avoiding copying constant strings not only saves strlen() and strcpy()
overhead, but is an essential part of the freelist optimization because
it makes the context header size constant. Currently we make no
attempt to use the freelist for contexts with non-constant names.
(Perhaps someday we'll need to think harder about that, but in current
usage, most contexts with custom names are long-lived anyway.)
The freelist management in this initial commit is pretty simplistic,
and we might want to refine it later --- but in common workloads that
will never matter because the freelists will never get full anyway.
To create a context with a non-constant name, one is now required to
call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME
option. AllocSetContextCreate becomes a wrapper macro, and it includes
a test that will complain about non-string-literal context name
parameters on gcc and similar compilers.
An unfortunate side effect of making AllocSetContextCreate a macro is
that one is now *required* to use the size parameter abstraction macros
(ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of
writing out individual size parameters no longer works unless you
switch to AllocSetContextCreateExtended.
Internally to the memory-context-related modules, the context creation
APIs are simplified, removing the rather baroque original design whereby
a context-type module called mcxt.c which then called back into the
context-type module. That saved a bit of code duplication, but not much,
and it prevented context-type modules from exercising control over the
allocation of context headers.
In passing, I converted the test-and-elog validation of aset size
parameters into Asserts to save a few more cycles. The original thought
was that callers might compute size parameters on the fly, but in practice
nobody does that, so it's useless to expend cycles on checking those
numbers in production builds.
Also, mark the memory context method-pointer structs "const",
just for cleanliness.
Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
2017-12-13 19:55:12 +01:00
|
|
|
parent,
|
Allow memory contexts to have both fixed and variable ident strings.
Originally, we treated memory context names as potentially variable in
all cases, and therefore always copied them into the context header.
Commit 9fa6f00b1 rethought this a little bit and invented a distinction
between fixed and variable names, skipping the copy step for the former.
But we can make things both simpler and more useful by instead allowing
there to be two parts to a context's identification, a fixed "name" and
an optional, variable "ident". The name supplied in the context create
call is now required to be a compile-time-constant string in all cases,
as it is never copied but just pointed to. The "ident" string, if
wanted, is supplied later. This is needed because typically we want
the ident to be stored inside the context so that it's cleaned up
automatically on context deletion; that means it has to be copied into
the context before we can set the pointer.
The cost of this approach is basically just an additional pointer field
in struct MemoryContextData, which isn't much overhead, and is bought
back entirely in the AllocSet case by not needing a headerSize field
anymore, since we no longer have to cope with variable header length.
In addition, we can simplify the internal interfaces for memory context
creation still further, saving a few cycles there. And it's no longer
true that a custom identifier disqualifies a context from participating
in aset.c's freelist scheme, so possibly there's some win on that end.
All the places that were using non-compile-time-constant context names
are adjusted to put the variable info into the "ident" instead. This
allows more effective identification of those contexts in many cases;
for example, subsidary contexts of relcache entries are now identified
by both type (e.g. "index info") and relname, where before you got only
one or the other. Contexts associated with PL function cache entries
are now identified more fully and uniformly, too.
I also arranged for plancache contexts to use the query source string
as their identifier. This is basically free for CachedPlanSources, as
they contained a copy of that string already. We pay an extra pstrdup
to do it for CachedPlans. That could perhaps be avoided, but it would
make things more fragile (since the CachedPlanSource is sometimes
destroyed first). I suspect future improvements in error reporting will
require CachedPlans to have a copy of that string anyway, so it's not
clear that it's worth moving mountains to avoid it now.
This also changes the APIs for context statistics routines so that the
context-specific routines no longer assume that output goes straight
to stderr, nor do they know all details of the output format. This
is useful immediately to reduce code duplication, and it also allows
for external code to do something with stats output that's different
from printing to stderr.
The reason for pushing this now rather than waiting for v12 is that
it rethinks some of the API changes made by commit 9fa6f00b1. Seems
better for extension authors to endure just one round of API changes
not two.
Discussion: https://postgr.es/m/CAB=Je-FdtmFZ9y9REHD7VsSrnCkiBhsA4mdsLKSPauwXtQBeNA@mail.gmail.com
2018-03-27 22:46:47 +02:00
|
|
|
name);
|
Rethink MemoryContext creation to improve performance.
This patch makes a number of interrelated changes to reduce the overhead
involved in creating/deleting memory contexts. The key ideas are:
* Include the AllocSetContext header of an aset.c context in its first
malloc request, rather than allocating it separately in TopMemoryContext.
This means that we now always create an initial or "keeper" block in an
aset, even if it never receives any allocation requests.
* Create freelists in which we can save and recycle recently-destroyed
asets (this idea is due to Robert Haas).
* In the common case where the name of a context is a constant string,
just store a pointer to it in the context header, rather than copying
the string.
The first change eliminates a palloc/pfree cycle per context, and
also avoids bloat in TopMemoryContext, at the price that creating
a context now involves a malloc/free cycle even if the context never
receives any allocations. That would be a loser for some common
usage patterns, but recycling short-lived contexts via the freelist
eliminates that pain.
Avoiding copying constant strings not only saves strlen() and strcpy()
overhead, but is an essential part of the freelist optimization because
it makes the context header size constant. Currently we make no
attempt to use the freelist for contexts with non-constant names.
(Perhaps someday we'll need to think harder about that, but in current
usage, most contexts with custom names are long-lived anyway.)
The freelist management in this initial commit is pretty simplistic,
and we might want to refine it later --- but in common workloads that
will never matter because the freelists will never get full anyway.
To create a context with a non-constant name, one is now required to
call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME
option. AllocSetContextCreate becomes a wrapper macro, and it includes
a test that will complain about non-string-literal context name
parameters on gcc and similar compilers.
An unfortunate side effect of making AllocSetContextCreate a macro is
that one is now *required* to use the size parameter abstraction macros
(ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of
writing out individual size parameters no longer works unless you
switch to AllocSetContextCreateExtended.
Internally to the memory-context-related modules, the context creation
APIs are simplified, removing the rather baroque original design whereby
a context-type module called mcxt.c which then called back into the
context-type module. That saved a bit of code duplication, but not much,
and it prevented context-type modules from exercising control over the
allocation of context headers.
In passing, I converted the test-and-elog validation of aset size
parameters into Asserts to save a few more cycles. The original thought
was that callers might compute size parameters on the fly, but in practice
nobody does that, so it's useless to expend cycles on checking those
numbers in production builds.
Also, mark the memory context method-pointer structs "const",
just for cleanliness.
Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
2017-12-13 19:55:12 +01:00
|
|
|
|
|
|
|
return (MemoryContext) slab;
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* SlabReset
|
|
|
|
* Frees all memory which is allocated in the given set.
|
|
|
|
*
|
|
|
|
* The code simply frees all the blocks in the context - we don't keep any
|
|
|
|
* keeper blocks or anything like that.
|
|
|
|
*/
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
void
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
SlabReset(MemoryContext context)
|
|
|
|
{
|
Harden memory context allocators against bogus chunk pointers.
Before commit c6e0fe1f2, functions such as AllocSetFree could pretty
safely presume that they were given a valid chunk pointer for their
own type of context, because the indirect call through a memory
context object and method struct would be very unlikely to work
otherwise. But now, if pfree() is mistakenly invoked on a pointer
to garbage, we have three chances in eight of ending up at one of
these functions. That means we need to take extra measures to
verify that we are looking at what we're supposed to be looking at,
especially in debug builds.
Hence, add code to verify that the chunk's back-link to a block header
leads to a memory context object that satisfies the right sort of
IsA() check. This is still a bit weaker than what we did before,
but for the moment assume that an IsA() check is sufficient.
As a compromise between speed and safety, implement these checks
as Asserts when dealing with small chunks but plain test-and-elogs
when dealing with large (external) chunks. The latter case should
not be too performance-critical, but the former case probably is.
In slab.c, all chunks are small; but nonetheless use a plain test
in SlabRealloc, because that is certainly not performance-critical,
indeed we should be suspicious that it's being called in error.
In aset.c, additionally add some assertions that the "value" field
of the chunk header is within the small range allowed for freelist
indexes. Without that, we might find ourselves trying to wipe
most of memory when CLOBBER_FREED_MEMORY is enabled, or scribbling
on a "freelist header" that's far away from the context object.
Eventually, field experience might show us that it's smarter for
these tests to be active always, but for now we'll try to get
away with just having them as assertions.
While at it, also be more uniform about asserting that context
objects passed as parameters are of the type we expect. Some
places missed that altogether, and slab.c was for no very good
reason doing it differently from the other allocators.
Discussion: https://postgr.es/m/3578387.1665244345@sss.pgh.pa.us
2022-10-11 00:45:34 +02:00
|
|
|
SlabContext *slab = (SlabContext *) context;
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
dlist_mutable_iter miter;
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
int i;
|
|
|
|
|
2022-10-28 09:19:06 +02:00
|
|
|
Assert(SlabIsValid(slab));
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
|
|
|
|
#ifdef MEMORY_CONTEXT_CHECKING
|
|
|
|
/* Check for corruption and leaks before freeing */
|
|
|
|
SlabCheck(context);
|
|
|
|
#endif
|
|
|
|
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
/* release any retained empty blocks */
|
|
|
|
dclist_foreach_modify(miter, &slab->emptyblocks)
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
{
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
SlabBlock *block = dlist_container(SlabBlock, node, miter.cur);
|
|
|
|
|
|
|
|
dclist_delete_from(&slab->emptyblocks, miter.cur);
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
#ifdef CLOBBER_FREED_MEMORY
|
|
|
|
wipe_mem(block, slab->blockSize);
|
|
|
|
#endif
|
|
|
|
free(block);
|
|
|
|
context->mem_allocated -= slab->blockSize;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* walk over blocklist and free the blocks */
|
|
|
|
for (i = 0; i < SLAB_BLOCKLIST_COUNT; i++)
|
|
|
|
{
|
|
|
|
dlist_foreach_modify(miter, &slab->blocklist[i])
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
{
|
|
|
|
SlabBlock *block = dlist_container(SlabBlock, node, miter.cur);
|
|
|
|
|
|
|
|
dlist_delete(miter.cur);
|
|
|
|
|
|
|
|
#ifdef CLOBBER_FREED_MEMORY
|
|
|
|
wipe_mem(block, slab->blockSize);
|
|
|
|
#endif
|
|
|
|
free(block);
|
2020-03-19 19:20:39 +01:00
|
|
|
context->mem_allocated -= slab->blockSize;
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
slab->curBlocklistIndex = 0;
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
|
2020-03-19 19:20:39 +01:00
|
|
|
Assert(context->mem_allocated == 0);
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* SlabDelete
|
Rethink MemoryContext creation to improve performance.
This patch makes a number of interrelated changes to reduce the overhead
involved in creating/deleting memory contexts. The key ideas are:
* Include the AllocSetContext header of an aset.c context in its first
malloc request, rather than allocating it separately in TopMemoryContext.
This means that we now always create an initial or "keeper" block in an
aset, even if it never receives any allocation requests.
* Create freelists in which we can save and recycle recently-destroyed
asets (this idea is due to Robert Haas).
* In the common case where the name of a context is a constant string,
just store a pointer to it in the context header, rather than copying
the string.
The first change eliminates a palloc/pfree cycle per context, and
also avoids bloat in TopMemoryContext, at the price that creating
a context now involves a malloc/free cycle even if the context never
receives any allocations. That would be a loser for some common
usage patterns, but recycling short-lived contexts via the freelist
eliminates that pain.
Avoiding copying constant strings not only saves strlen() and strcpy()
overhead, but is an essential part of the freelist optimization because
it makes the context header size constant. Currently we make no
attempt to use the freelist for contexts with non-constant names.
(Perhaps someday we'll need to think harder about that, but in current
usage, most contexts with custom names are long-lived anyway.)
The freelist management in this initial commit is pretty simplistic,
and we might want to refine it later --- but in common workloads that
will never matter because the freelists will never get full anyway.
To create a context with a non-constant name, one is now required to
call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME
option. AllocSetContextCreate becomes a wrapper macro, and it includes
a test that will complain about non-string-literal context name
parameters on gcc and similar compilers.
An unfortunate side effect of making AllocSetContextCreate a macro is
that one is now *required* to use the size parameter abstraction macros
(ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of
writing out individual size parameters no longer works unless you
switch to AllocSetContextCreateExtended.
Internally to the memory-context-related modules, the context creation
APIs are simplified, removing the rather baroque original design whereby
a context-type module called mcxt.c which then called back into the
context-type module. That saved a bit of code duplication, but not much,
and it prevented context-type modules from exercising control over the
allocation of context headers.
In passing, I converted the test-and-elog validation of aset size
parameters into Asserts to save a few more cycles. The original thought
was that callers might compute size parameters on the fly, but in practice
nobody does that, so it's useless to expend cycles on checking those
numbers in production builds.
Also, mark the memory context method-pointer structs "const",
just for cleanliness.
Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
2017-12-13 19:55:12 +01:00
|
|
|
* Free all memory which is allocated in the given context.
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
*/
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
void
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
SlabDelete(MemoryContext context)
|
|
|
|
{
|
Rethink MemoryContext creation to improve performance.
This patch makes a number of interrelated changes to reduce the overhead
involved in creating/deleting memory contexts. The key ideas are:
* Include the AllocSetContext header of an aset.c context in its first
malloc request, rather than allocating it separately in TopMemoryContext.
This means that we now always create an initial or "keeper" block in an
aset, even if it never receives any allocation requests.
* Create freelists in which we can save and recycle recently-destroyed
asets (this idea is due to Robert Haas).
* In the common case where the name of a context is a constant string,
just store a pointer to it in the context header, rather than copying
the string.
The first change eliminates a palloc/pfree cycle per context, and
also avoids bloat in TopMemoryContext, at the price that creating
a context now involves a malloc/free cycle even if the context never
receives any allocations. That would be a loser for some common
usage patterns, but recycling short-lived contexts via the freelist
eliminates that pain.
Avoiding copying constant strings not only saves strlen() and strcpy()
overhead, but is an essential part of the freelist optimization because
it makes the context header size constant. Currently we make no
attempt to use the freelist for contexts with non-constant names.
(Perhaps someday we'll need to think harder about that, but in current
usage, most contexts with custom names are long-lived anyway.)
The freelist management in this initial commit is pretty simplistic,
and we might want to refine it later --- but in common workloads that
will never matter because the freelists will never get full anyway.
To create a context with a non-constant name, one is now required to
call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME
option. AllocSetContextCreate becomes a wrapper macro, and it includes
a test that will complain about non-string-literal context name
parameters on gcc and similar compilers.
An unfortunate side effect of making AllocSetContextCreate a macro is
that one is now *required* to use the size parameter abstraction macros
(ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of
writing out individual size parameters no longer works unless you
switch to AllocSetContextCreateExtended.
Internally to the memory-context-related modules, the context creation
APIs are simplified, removing the rather baroque original design whereby
a context-type module called mcxt.c which then called back into the
context-type module. That saved a bit of code duplication, but not much,
and it prevented context-type modules from exercising control over the
allocation of context headers.
In passing, I converted the test-and-elog validation of aset size
parameters into Asserts to save a few more cycles. The original thought
was that callers might compute size parameters on the fly, but in practice
nobody does that, so it's useless to expend cycles on checking those
numbers in production builds.
Also, mark the memory context method-pointer structs "const",
just for cleanliness.
Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
2017-12-13 19:55:12 +01:00
|
|
|
/* Reset to release all the SlabBlocks */
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
SlabReset(context);
|
Rethink MemoryContext creation to improve performance.
This patch makes a number of interrelated changes to reduce the overhead
involved in creating/deleting memory contexts. The key ideas are:
* Include the AllocSetContext header of an aset.c context in its first
malloc request, rather than allocating it separately in TopMemoryContext.
This means that we now always create an initial or "keeper" block in an
aset, even if it never receives any allocation requests.
* Create freelists in which we can save and recycle recently-destroyed
asets (this idea is due to Robert Haas).
* In the common case where the name of a context is a constant string,
just store a pointer to it in the context header, rather than copying
the string.
The first change eliminates a palloc/pfree cycle per context, and
also avoids bloat in TopMemoryContext, at the price that creating
a context now involves a malloc/free cycle even if the context never
receives any allocations. That would be a loser for some common
usage patterns, but recycling short-lived contexts via the freelist
eliminates that pain.
Avoiding copying constant strings not only saves strlen() and strcpy()
overhead, but is an essential part of the freelist optimization because
it makes the context header size constant. Currently we make no
attempt to use the freelist for contexts with non-constant names.
(Perhaps someday we'll need to think harder about that, but in current
usage, most contexts with custom names are long-lived anyway.)
The freelist management in this initial commit is pretty simplistic,
and we might want to refine it later --- but in common workloads that
will never matter because the freelists will never get full anyway.
To create a context with a non-constant name, one is now required to
call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME
option. AllocSetContextCreate becomes a wrapper macro, and it includes
a test that will complain about non-string-literal context name
parameters on gcc and similar compilers.
An unfortunate side effect of making AllocSetContextCreate a macro is
that one is now *required* to use the size parameter abstraction macros
(ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of
writing out individual size parameters no longer works unless you
switch to AllocSetContextCreateExtended.
Internally to the memory-context-related modules, the context creation
APIs are simplified, removing the rather baroque original design whereby
a context-type module called mcxt.c which then called back into the
context-type module. That saved a bit of code duplication, but not much,
and it prevented context-type modules from exercising control over the
allocation of context headers.
In passing, I converted the test-and-elog validation of aset size
parameters into Asserts to save a few more cycles. The original thought
was that callers might compute size parameters on the fly, but in practice
nobody does that, so it's useless to expend cycles on checking those
numbers in production builds.
Also, mark the memory context method-pointer structs "const",
just for cleanliness.
Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
2017-12-13 19:55:12 +01:00
|
|
|
/* And free the context header */
|
|
|
|
free(context);
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* SlabAlloc
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
* Returns a pointer to allocated memory of given size or NULL if
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
* request could not be completed; memory is added to the slab.
|
|
|
|
*/
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
void *
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
SlabAlloc(MemoryContext context, Size size)
|
|
|
|
{
|
Harden memory context allocators against bogus chunk pointers.
Before commit c6e0fe1f2, functions such as AllocSetFree could pretty
safely presume that they were given a valid chunk pointer for their
own type of context, because the indirect call through a memory
context object and method struct would be very unlikely to work
otherwise. But now, if pfree() is mistakenly invoked on a pointer
to garbage, we have three chances in eight of ending up at one of
these functions. That means we need to take extra measures to
verify that we are looking at what we're supposed to be looking at,
especially in debug builds.
Hence, add code to verify that the chunk's back-link to a block header
leads to a memory context object that satisfies the right sort of
IsA() check. This is still a bit weaker than what we did before,
but for the moment assume that an IsA() check is sufficient.
As a compromise between speed and safety, implement these checks
as Asserts when dealing with small chunks but plain test-and-elogs
when dealing with large (external) chunks. The latter case should
not be too performance-critical, but the former case probably is.
In slab.c, all chunks are small; but nonetheless use a plain test
in SlabRealloc, because that is certainly not performance-critical,
indeed we should be suspicious that it's being called in error.
In aset.c, additionally add some assertions that the "value" field
of the chunk header is within the small range allowed for freelist
indexes. Without that, we might find ourselves trying to wipe
most of memory when CLOBBER_FREED_MEMORY is enabled, or scribbling
on a "freelist header" that's far away from the context object.
Eventually, field experience might show us that it's smarter for
these tests to be active always, but for now we'll try to get
away with just having them as assertions.
While at it, also be more uniform about asserting that context
objects passed as parameters are of the type we expect. Some
places missed that altogether, and slab.c was for no very good
reason doing it differently from the other allocators.
Discussion: https://postgr.es/m/3578387.1665244345@sss.pgh.pa.us
2022-10-11 00:45:34 +02:00
|
|
|
SlabContext *slab = (SlabContext *) context;
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
SlabBlock *block;
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
MemoryChunk *chunk;
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
|
2022-10-28 09:19:06 +02:00
|
|
|
Assert(SlabIsValid(slab));
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
/* sanity check that this is pointing to a valid blocklist */
|
|
|
|
Assert(slab->curBlocklistIndex >= 0);
|
|
|
|
Assert(slab->curBlocklistIndex <= SlabBlocklistIndex(slab, slab->chunksPerBlock));
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
|
|
|
|
/* make sure we only allow correct request size */
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
if (unlikely(size != slab->chunkSize))
|
2017-02-28 08:32:22 +01:00
|
|
|
elog(ERROR, "unexpected alloc chunk size %zu (expected %zu)",
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
size, slab->chunkSize);
|
|
|
|
|
|
|
|
/*
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
* Handle the case when there are no partially filled blocks available.
|
|
|
|
* SlabFree() will have updated the curBlocklistIndex setting it to zero
|
|
|
|
* to indicate that it has freed the final block. Also later in
|
|
|
|
* SlabAlloc() we will set the curBlocklistIndex to zero if we end up
|
|
|
|
* filling the final block.
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
*/
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
if (unlikely(slab->curBlocklistIndex == 0))
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
{
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
dlist_head *blocklist;
|
|
|
|
int blocklist_idx;
|
|
|
|
|
|
|
|
/* to save allocating a new one, first check the empty blocks list */
|
|
|
|
if (dclist_count(&slab->emptyblocks) > 0)
|
|
|
|
{
|
|
|
|
dlist_node *node = dclist_pop_head_node(&slab->emptyblocks);
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
block = dlist_container(SlabBlock, node, node);
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
/*
|
|
|
|
* SlabFree() should have left this block in a valid state with
|
|
|
|
* all chunks free. Ensure that's the case.
|
|
|
|
*/
|
|
|
|
Assert(block->nfree == slab->chunksPerBlock);
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
/* fetch the next chunk from this block */
|
|
|
|
chunk = SlabGetNextFreeChunk(slab, block);
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
}
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
else
|
|
|
|
{
|
|
|
|
block = (SlabBlock *) malloc(slab->blockSize);
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
if (unlikely(block == NULL))
|
|
|
|
return NULL;
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
block->slab = slab;
|
|
|
|
context->mem_allocated += slab->blockSize;
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
/* use the first chunk in the new block */
|
|
|
|
chunk = SlabBlockGetChunk(slab, block, 0);
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
block->nfree = slab->chunksPerBlock - 1;
|
|
|
|
block->unused = SlabBlockGetChunk(slab, block, 1);
|
|
|
|
block->freehead = NULL;
|
|
|
|
block->nunused = slab->chunksPerBlock - 1;
|
|
|
|
}
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
/* find the blocklist element for storing blocks with 1 used chunk */
|
|
|
|
blocklist_idx = SlabBlocklistIndex(slab, block->nfree);
|
|
|
|
blocklist = &slab->blocklist[blocklist_idx];
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
/* this better be empty. We just added a block thinking it was */
|
|
|
|
Assert(dlist_is_empty(blocklist));
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
dlist_push_head(blocklist, &block->node);
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
slab->curBlocklistIndex = blocklist_idx;
|
|
|
|
}
|
|
|
|
else
|
|
|
|
{
|
|
|
|
dlist_head *blocklist = &slab->blocklist[slab->curBlocklistIndex];
|
|
|
|
int new_blocklist_idx;
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
Assert(!dlist_is_empty(blocklist));
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
/* grab the block from the blocklist */
|
|
|
|
block = dlist_head_element(SlabBlock, node, blocklist);
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
/* make sure we actually got a valid block, with matching nfree */
|
|
|
|
Assert(block != NULL);
|
|
|
|
Assert(slab->curBlocklistIndex == SlabBlocklistIndex(slab, block->nfree));
|
|
|
|
Assert(block->nfree > 0);
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
/* fetch the next chunk from this block */
|
|
|
|
chunk = SlabGetNextFreeChunk(slab, block);
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
/* get the new blocklist index based on the new free chunk count */
|
|
|
|
new_blocklist_idx = SlabBlocklistIndex(slab, block->nfree);
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
/*
|
|
|
|
* Handle the case where the blocklist index changes. This also deals
|
|
|
|
* with blocks becoming full as only full blocks go at index 0.
|
|
|
|
*/
|
|
|
|
if (unlikely(slab->curBlocklistIndex != new_blocklist_idx))
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
{
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
dlist_delete_from(blocklist, &block->node);
|
|
|
|
dlist_push_head(&slab->blocklist[new_blocklist_idx], &block->node);
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
if (dlist_is_empty(blocklist))
|
|
|
|
slab->curBlocklistIndex = SlabFindNextBlockListIndex(slab);
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
/*
|
|
|
|
* Check that the chunk pointer is actually somewhere on the block and is
|
|
|
|
* aligned as expected.
|
|
|
|
*/
|
|
|
|
Assert(chunk >= SlabBlockGetChunk(slab, block, 0));
|
|
|
|
Assert(chunk <= SlabBlockGetChunk(slab, block, slab->chunksPerBlock - 1));
|
|
|
|
Assert(SlabChunkMod(slab, block, chunk) == 0);
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
|
|
|
|
/* Prepare to initialize the chunk header. */
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
VALGRIND_MAKE_MEM_UNDEFINED(chunk, Slab_CHUNKHDRSZ);
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
MemoryChunkSetHdrMask(chunk, block, MAXALIGN(slab->chunkSize),
|
|
|
|
MCTX_SLAB_ID);
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
#ifdef MEMORY_CONTEXT_CHECKING
|
|
|
|
/* slab mark to catch clobber of "unused" space */
|
Make more effort to put a sentinel at the end of allocated memory
Traditionally, in MEMORY_CONTEXT_CHECKING builds, we only ever marked a
sentinel byte just beyond the requested size if there happened to be
enough space on the chunk to do so. For Slab and Generation context
types, we only rounded the size of the chunk up to the next maxalign
boundary, so it was often not that likely that those would ever have space
for the sentinel given that the majority of allocation requests are going
to be for sizes which are maxaligned. For AllocSet, it was a little
different as smaller allocations are rounded up to the next power-of-2
value rather than the next maxalign boundary, so we're a bit more likely
to have space for the sentinel byte, especially when we get away from tiny
sized allocations such as 8 or 16 bytes.
Here we make more of an effort to allow space so that there is enough room
for the sentinel byte in more cases. This makes it more likely that we'll
detect when buggy code accidentally writes beyond the end of any of its
memory allocations.
Each of the 3 MemoryContext types has been changed as follows:
The Slab allocator will now always set a sentinel byte. Both the current
usages of this MemoryContext type happen to use chunk sizes which were on
the maxalign boundary, so these never used sentinel bytes previously.
For the Generation allocator, we now always ensure there's enough space in
the allocation for a sentinel byte.
For AllocSet, this commit makes an adjustment for allocation sizes which
are greater than allocChunkLimit. We now ensure there is always space for
a sentinel byte. We don't alter the sentinel behavior for request sizes
<= allocChunkLimit. Making way for the sentinel byte for power-of-2
request sizes would require doubling up to the next power of 2. Some
analysis done on the request sizes made during installcheck shows that a
fairly large portion of allocation requests are for power-of-2 sizes. The
amount of additional memory for the sentinel there seems prohibitive, so
we do nothing for those here.
Author: David Rowley
Discussion: https://postgr.es/m/3478405.1661824539@sss.pgh.pa.us
2022-09-07 05:46:57 +02:00
|
|
|
Assert(slab->chunkSize < (slab->fullChunkSize - Slab_CHUNKHDRSZ));
|
|
|
|
set_sentinel(MemoryChunkGetPointer(chunk), size);
|
|
|
|
VALGRIND_MAKE_MEM_NOACCESS(((char *) chunk) +
|
|
|
|
Slab_CHUNKHDRSZ + slab->chunkSize,
|
|
|
|
slab->fullChunkSize -
|
|
|
|
(slab->chunkSize + Slab_CHUNKHDRSZ));
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
#endif
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
#ifdef RANDOMIZE_ALLOCATED_MEMORY
|
|
|
|
/* fill the allocated space with junk */
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
randomize_mem((char *) MemoryChunkGetPointer(chunk), size);
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
#endif
|
|
|
|
|
2023-04-15 01:59:52 +02:00
|
|
|
/* Disallow access to the chunk header. */
|
|
|
|
VALGRIND_MAKE_MEM_NOACCESS(chunk, Slab_CHUNKHDRSZ);
|
|
|
|
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
return MemoryChunkGetPointer(chunk);
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* SlabFree
|
|
|
|
* Frees allocated memory; memory is removed from the slab.
|
|
|
|
*/
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
void
|
|
|
|
SlabFree(void *pointer)
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
{
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
MemoryChunk *chunk = PointerGetMemoryChunk(pointer);
|
2023-04-15 01:59:52 +02:00
|
|
|
SlabBlock *block;
|
Harden memory context allocators against bogus chunk pointers.
Before commit c6e0fe1f2, functions such as AllocSetFree could pretty
safely presume that they were given a valid chunk pointer for their
own type of context, because the indirect call through a memory
context object and method struct would be very unlikely to work
otherwise. But now, if pfree() is mistakenly invoked on a pointer
to garbage, we have three chances in eight of ending up at one of
these functions. That means we need to take extra measures to
verify that we are looking at what we're supposed to be looking at,
especially in debug builds.
Hence, add code to verify that the chunk's back-link to a block header
leads to a memory context object that satisfies the right sort of
IsA() check. This is still a bit weaker than what we did before,
but for the moment assume that an IsA() check is sufficient.
As a compromise between speed and safety, implement these checks
as Asserts when dealing with small chunks but plain test-and-elogs
when dealing with large (external) chunks. The latter case should
not be too performance-critical, but the former case probably is.
In slab.c, all chunks are small; but nonetheless use a plain test
in SlabRealloc, because that is certainly not performance-critical,
indeed we should be suspicious that it's being called in error.
In aset.c, additionally add some assertions that the "value" field
of the chunk header is within the small range allowed for freelist
indexes. Without that, we might find ourselves trying to wipe
most of memory when CLOBBER_FREED_MEMORY is enabled, or scribbling
on a "freelist header" that's far away from the context object.
Eventually, field experience might show us that it's smarter for
these tests to be active always, but for now we'll try to get
away with just having them as assertions.
While at it, also be more uniform about asserting that context
objects passed as parameters are of the type we expect. Some
places missed that altogether, and slab.c was for no very good
reason doing it differently from the other allocators.
Discussion: https://postgr.es/m/3578387.1665244345@sss.pgh.pa.us
2022-10-11 00:45:34 +02:00
|
|
|
SlabContext *slab;
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
int curBlocklistIdx;
|
|
|
|
int newBlocklistIdx;
|
Harden memory context allocators against bogus chunk pointers.
Before commit c6e0fe1f2, functions such as AllocSetFree could pretty
safely presume that they were given a valid chunk pointer for their
own type of context, because the indirect call through a memory
context object and method struct would be very unlikely to work
otherwise. But now, if pfree() is mistakenly invoked on a pointer
to garbage, we have three chances in eight of ending up at one of
these functions. That means we need to take extra measures to
verify that we are looking at what we're supposed to be looking at,
especially in debug builds.
Hence, add code to verify that the chunk's back-link to a block header
leads to a memory context object that satisfies the right sort of
IsA() check. This is still a bit weaker than what we did before,
but for the moment assume that an IsA() check is sufficient.
As a compromise between speed and safety, implement these checks
as Asserts when dealing with small chunks but plain test-and-elogs
when dealing with large (external) chunks. The latter case should
not be too performance-critical, but the former case probably is.
In slab.c, all chunks are small; but nonetheless use a plain test
in SlabRealloc, because that is certainly not performance-critical,
indeed we should be suspicious that it's being called in error.
In aset.c, additionally add some assertions that the "value" field
of the chunk header is within the small range allowed for freelist
indexes. Without that, we might find ourselves trying to wipe
most of memory when CLOBBER_FREED_MEMORY is enabled, or scribbling
on a "freelist header" that's far away from the context object.
Eventually, field experience might show us that it's smarter for
these tests to be active always, but for now we'll try to get
away with just having them as assertions.
While at it, also be more uniform about asserting that context
objects passed as parameters are of the type we expect. Some
places missed that altogether, and slab.c was for no very good
reason doing it differently from the other allocators.
Discussion: https://postgr.es/m/3578387.1665244345@sss.pgh.pa.us
2022-10-11 00:45:34 +02:00
|
|
|
|
2023-04-15 01:59:52 +02:00
|
|
|
/* Allow access to the chunk header. */
|
|
|
|
VALGRIND_MAKE_MEM_DEFINED(chunk, Slab_CHUNKHDRSZ);
|
|
|
|
|
|
|
|
block = MemoryChunkGetBlock(chunk);
|
|
|
|
|
Harden memory context allocators against bogus chunk pointers.
Before commit c6e0fe1f2, functions such as AllocSetFree could pretty
safely presume that they were given a valid chunk pointer for their
own type of context, because the indirect call through a memory
context object and method struct would be very unlikely to work
otherwise. But now, if pfree() is mistakenly invoked on a pointer
to garbage, we have three chances in eight of ending up at one of
these functions. That means we need to take extra measures to
verify that we are looking at what we're supposed to be looking at,
especially in debug builds.
Hence, add code to verify that the chunk's back-link to a block header
leads to a memory context object that satisfies the right sort of
IsA() check. This is still a bit weaker than what we did before,
but for the moment assume that an IsA() check is sufficient.
As a compromise between speed and safety, implement these checks
as Asserts when dealing with small chunks but plain test-and-elogs
when dealing with large (external) chunks. The latter case should
not be too performance-critical, but the former case probably is.
In slab.c, all chunks are small; but nonetheless use a plain test
in SlabRealloc, because that is certainly not performance-critical,
indeed we should be suspicious that it's being called in error.
In aset.c, additionally add some assertions that the "value" field
of the chunk header is within the small range allowed for freelist
indexes. Without that, we might find ourselves trying to wipe
most of memory when CLOBBER_FREED_MEMORY is enabled, or scribbling
on a "freelist header" that's far away from the context object.
Eventually, field experience might show us that it's smarter for
these tests to be active always, but for now we'll try to get
away with just having them as assertions.
While at it, also be more uniform about asserting that context
objects passed as parameters are of the type we expect. Some
places missed that altogether, and slab.c was for no very good
reason doing it differently from the other allocators.
Discussion: https://postgr.es/m/3578387.1665244345@sss.pgh.pa.us
2022-10-11 00:45:34 +02:00
|
|
|
/*
|
|
|
|
* For speed reasons we just Assert that the referenced block is good.
|
|
|
|
* Future field experience may show that this Assert had better become a
|
|
|
|
* regular runtime test-and-elog check.
|
|
|
|
*/
|
2022-10-28 09:19:06 +02:00
|
|
|
Assert(SlabBlockIsValid(block));
|
Harden memory context allocators against bogus chunk pointers.
Before commit c6e0fe1f2, functions such as AllocSetFree could pretty
safely presume that they were given a valid chunk pointer for their
own type of context, because the indirect call through a memory
context object and method struct would be very unlikely to work
otherwise. But now, if pfree() is mistakenly invoked on a pointer
to garbage, we have three chances in eight of ending up at one of
these functions. That means we need to take extra measures to
verify that we are looking at what we're supposed to be looking at,
especially in debug builds.
Hence, add code to verify that the chunk's back-link to a block header
leads to a memory context object that satisfies the right sort of
IsA() check. This is still a bit weaker than what we did before,
but for the moment assume that an IsA() check is sufficient.
As a compromise between speed and safety, implement these checks
as Asserts when dealing with small chunks but plain test-and-elogs
when dealing with large (external) chunks. The latter case should
not be too performance-critical, but the former case probably is.
In slab.c, all chunks are small; but nonetheless use a plain test
in SlabRealloc, because that is certainly not performance-critical,
indeed we should be suspicious that it's being called in error.
In aset.c, additionally add some assertions that the "value" field
of the chunk header is within the small range allowed for freelist
indexes. Without that, we might find ourselves trying to wipe
most of memory when CLOBBER_FREED_MEMORY is enabled, or scribbling
on a "freelist header" that's far away from the context object.
Eventually, field experience might show us that it's smarter for
these tests to be active always, but for now we'll try to get
away with just having them as assertions.
While at it, also be more uniform about asserting that context
objects passed as parameters are of the type we expect. Some
places missed that altogether, and slab.c was for no very good
reason doing it differently from the other allocators.
Discussion: https://postgr.es/m/3578387.1665244345@sss.pgh.pa.us
2022-10-11 00:45:34 +02:00
|
|
|
slab = block->slab;
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
|
|
|
|
#ifdef MEMORY_CONTEXT_CHECKING
|
|
|
|
/* Test for someone scribbling on unused space in chunk */
|
Make more effort to put a sentinel at the end of allocated memory
Traditionally, in MEMORY_CONTEXT_CHECKING builds, we only ever marked a
sentinel byte just beyond the requested size if there happened to be
enough space on the chunk to do so. For Slab and Generation context
types, we only rounded the size of the chunk up to the next maxalign
boundary, so it was often not that likely that those would ever have space
for the sentinel given that the majority of allocation requests are going
to be for sizes which are maxaligned. For AllocSet, it was a little
different as smaller allocations are rounded up to the next power-of-2
value rather than the next maxalign boundary, so we're a bit more likely
to have space for the sentinel byte, especially when we get away from tiny
sized allocations such as 8 or 16 bytes.
Here we make more of an effort to allow space so that there is enough room
for the sentinel byte in more cases. This makes it more likely that we'll
detect when buggy code accidentally writes beyond the end of any of its
memory allocations.
Each of the 3 MemoryContext types has been changed as follows:
The Slab allocator will now always set a sentinel byte. Both the current
usages of this MemoryContext type happen to use chunk sizes which were on
the maxalign boundary, so these never used sentinel bytes previously.
For the Generation allocator, we now always ensure there's enough space in
the allocation for a sentinel byte.
For AllocSet, this commit makes an adjustment for allocation sizes which
are greater than allocChunkLimit. We now ensure there is always space for
a sentinel byte. We don't alter the sentinel behavior for request sizes
<= allocChunkLimit. Making way for the sentinel byte for power-of-2
request sizes would require doubling up to the next power of 2. Some
analysis done on the request sizes made during installcheck shows that a
fairly large portion of allocation requests are for power-of-2 sizes. The
amount of additional memory for the sentinel there seems prohibitive, so
we do nothing for those here.
Author: David Rowley
Discussion: https://postgr.es/m/3478405.1661824539@sss.pgh.pa.us
2022-09-07 05:46:57 +02:00
|
|
|
Assert(slab->chunkSize < (slab->fullChunkSize - Slab_CHUNKHDRSZ));
|
|
|
|
if (!sentinel_ok(pointer, slab->chunkSize))
|
|
|
|
elog(WARNING, "detected write past chunk end in %s %p",
|
|
|
|
slab->header.name, chunk);
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
#endif
|
|
|
|
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
/* push this chunk onto the head of the block's free list */
|
|
|
|
*(MemoryChunk **) pointer = block->freehead;
|
|
|
|
block->freehead = chunk;
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
|
|
|
|
block->nfree++;
|
|
|
|
|
|
|
|
Assert(block->nfree > 0);
|
|
|
|
Assert(block->nfree <= slab->chunksPerBlock);
|
|
|
|
|
|
|
|
#ifdef CLOBBER_FREED_MEMORY
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
/* don't wipe the free list MemoryChunk pointer stored in the chunk */
|
|
|
|
wipe_mem((char *) pointer + sizeof(MemoryChunk *),
|
|
|
|
slab->chunkSize - sizeof(MemoryChunk *));
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
#endif
|
|
|
|
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
curBlocklistIdx = SlabBlocklistIndex(slab, block->nfree - 1);
|
|
|
|
newBlocklistIdx = SlabBlocklistIndex(slab, block->nfree);
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
|
|
|
|
/*
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
* Check if the block needs to be moved to another element on the
|
|
|
|
* blocklist based on it now having 1 more free chunk.
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
*/
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
if (unlikely(curBlocklistIdx != newBlocklistIdx))
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
{
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
/* do the move */
|
|
|
|
dlist_delete_from(&slab->blocklist[curBlocklistIdx], &block->node);
|
|
|
|
dlist_push_head(&slab->blocklist[newBlocklistIdx], &block->node);
|
|
|
|
|
|
|
|
/*
|
2022-12-21 21:57:49 +01:00
|
|
|
* The blocklist[curBlocklistIdx] may now be empty or we may now be
|
|
|
|
* able to use a lower-element blocklist. We'll need to redetermine
|
|
|
|
* what the slab->curBlocklistIndex is if the current blocklist was
|
|
|
|
* changed or if a lower element one was changed. We must ensure we
|
|
|
|
* use the list with the fullest block(s).
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
*/
|
2022-12-21 21:57:49 +01:00
|
|
|
if (slab->curBlocklistIndex >= curBlocklistIdx)
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
{
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
slab->curBlocklistIndex = SlabFindNextBlockListIndex(slab);
|
2022-12-21 21:57:49 +01:00
|
|
|
|
|
|
|
/*
|
|
|
|
* We know there must be a block with at least 1 unused chunk as
|
|
|
|
* we just pfree'd one. Ensure curBlocklistIndex reflects this.
|
|
|
|
*/
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
Assert(slab->curBlocklistIndex > 0);
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
/* Handle when a block becomes completely empty */
|
|
|
|
if (unlikely(block->nfree == slab->chunksPerBlock))
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
{
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
/* remove the block */
|
|
|
|
dlist_delete_from(&slab->blocklist[newBlocklistIdx], &block->node);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* To avoid thrashing malloc/free, we keep a list of empty blocks that
|
|
|
|
* we can reuse again instead of having to malloc a new one.
|
|
|
|
*/
|
|
|
|
if (dclist_count(&slab->emptyblocks) < SLAB_MAXIMUM_EMPTY_BLOCKS)
|
|
|
|
dclist_push_head(&slab->emptyblocks, &block->node);
|
|
|
|
else
|
|
|
|
{
|
|
|
|
/*
|
|
|
|
* When we have enough empty blocks stored already, we actually
|
|
|
|
* free the block.
|
|
|
|
*/
|
|
|
|
#ifdef CLOBBER_FREED_MEMORY
|
|
|
|
wipe_mem(block, slab->blockSize);
|
|
|
|
#endif
|
|
|
|
free(block);
|
|
|
|
slab->header.mem_allocated -= slab->blockSize;
|
|
|
|
}
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
/*
|
|
|
|
* Check if we need to reset the blocklist index. This is required
|
|
|
|
* when the blocklist this block is on has become completely empty.
|
|
|
|
*/
|
|
|
|
if (slab->curBlocklistIndex == newBlocklistIdx &&
|
|
|
|
dlist_is_empty(&slab->blocklist[newBlocklistIdx]))
|
|
|
|
slab->curBlocklistIndex = SlabFindNextBlockListIndex(slab);
|
|
|
|
}
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* SlabRealloc
|
2017-03-08 22:19:37 +01:00
|
|
|
* Change the allocated size of a chunk.
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
*
|
2017-03-08 22:19:37 +01:00
|
|
|
* As Slab is designed for allocating equally-sized chunks of memory, it can't
|
|
|
|
* do an actual chunk size change. We try to be gentle and allow calls with
|
|
|
|
* exactly the same size, as in that case we can simply return the same
|
|
|
|
* chunk. When the size differs, we throw an error.
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
*
|
2017-03-08 22:19:37 +01:00
|
|
|
* We could also allow requests with size < chunkSize. That however seems
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
* rather pointless - Slab is meant for chunks of constant size, and moreover
|
|
|
|
* realloc is usually used to enlarge the chunk.
|
|
|
|
*/
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
void *
|
|
|
|
SlabRealloc(void *pointer, Size size)
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
{
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
MemoryChunk *chunk = PointerGetMemoryChunk(pointer);
|
2023-04-15 01:59:52 +02:00
|
|
|
SlabBlock *block;
|
Harden memory context allocators against bogus chunk pointers.
Before commit c6e0fe1f2, functions such as AllocSetFree could pretty
safely presume that they were given a valid chunk pointer for their
own type of context, because the indirect call through a memory
context object and method struct would be very unlikely to work
otherwise. But now, if pfree() is mistakenly invoked on a pointer
to garbage, we have three chances in eight of ending up at one of
these functions. That means we need to take extra measures to
verify that we are looking at what we're supposed to be looking at,
especially in debug builds.
Hence, add code to verify that the chunk's back-link to a block header
leads to a memory context object that satisfies the right sort of
IsA() check. This is still a bit weaker than what we did before,
but for the moment assume that an IsA() check is sufficient.
As a compromise between speed and safety, implement these checks
as Asserts when dealing with small chunks but plain test-and-elogs
when dealing with large (external) chunks. The latter case should
not be too performance-critical, but the former case probably is.
In slab.c, all chunks are small; but nonetheless use a plain test
in SlabRealloc, because that is certainly not performance-critical,
indeed we should be suspicious that it's being called in error.
In aset.c, additionally add some assertions that the "value" field
of the chunk header is within the small range allowed for freelist
indexes. Without that, we might find ourselves trying to wipe
most of memory when CLOBBER_FREED_MEMORY is enabled, or scribbling
on a "freelist header" that's far away from the context object.
Eventually, field experience might show us that it's smarter for
these tests to be active always, but for now we'll try to get
away with just having them as assertions.
While at it, also be more uniform about asserting that context
objects passed as parameters are of the type we expect. Some
places missed that altogether, and slab.c was for no very good
reason doing it differently from the other allocators.
Discussion: https://postgr.es/m/3578387.1665244345@sss.pgh.pa.us
2022-10-11 00:45:34 +02:00
|
|
|
SlabContext *slab;
|
|
|
|
|
2023-04-15 01:59:52 +02:00
|
|
|
/* Allow access to the chunk header. */
|
|
|
|
VALGRIND_MAKE_MEM_DEFINED(chunk, Slab_CHUNKHDRSZ);
|
|
|
|
|
|
|
|
block = MemoryChunkGetBlock(chunk);
|
|
|
|
|
|
|
|
/* Disallow access to the chunk header. */
|
|
|
|
VALGRIND_MAKE_MEM_NOACCESS(chunk, Slab_CHUNKHDRSZ);
|
|
|
|
|
Harden memory context allocators against bogus chunk pointers.
Before commit c6e0fe1f2, functions such as AllocSetFree could pretty
safely presume that they were given a valid chunk pointer for their
own type of context, because the indirect call through a memory
context object and method struct would be very unlikely to work
otherwise. But now, if pfree() is mistakenly invoked on a pointer
to garbage, we have three chances in eight of ending up at one of
these functions. That means we need to take extra measures to
verify that we are looking at what we're supposed to be looking at,
especially in debug builds.
Hence, add code to verify that the chunk's back-link to a block header
leads to a memory context object that satisfies the right sort of
IsA() check. This is still a bit weaker than what we did before,
but for the moment assume that an IsA() check is sufficient.
As a compromise between speed and safety, implement these checks
as Asserts when dealing with small chunks but plain test-and-elogs
when dealing with large (external) chunks. The latter case should
not be too performance-critical, but the former case probably is.
In slab.c, all chunks are small; but nonetheless use a plain test
in SlabRealloc, because that is certainly not performance-critical,
indeed we should be suspicious that it's being called in error.
In aset.c, additionally add some assertions that the "value" field
of the chunk header is within the small range allowed for freelist
indexes. Without that, we might find ourselves trying to wipe
most of memory when CLOBBER_FREED_MEMORY is enabled, or scribbling
on a "freelist header" that's far away from the context object.
Eventually, field experience might show us that it's smarter for
these tests to be active always, but for now we'll try to get
away with just having them as assertions.
While at it, also be more uniform about asserting that context
objects passed as parameters are of the type we expect. Some
places missed that altogether, and slab.c was for no very good
reason doing it differently from the other allocators.
Discussion: https://postgr.es/m/3578387.1665244345@sss.pgh.pa.us
2022-10-11 00:45:34 +02:00
|
|
|
/*
|
|
|
|
* Try to verify that we have a sane block pointer: the block header
|
|
|
|
* should reference a slab context. (We use a test-and-elog, not just
|
|
|
|
* Assert, because it seems highly likely that we're here in error in the
|
|
|
|
* first place.)
|
|
|
|
*/
|
|
|
|
if (!SlabBlockIsValid(block))
|
|
|
|
elog(ERROR, "could not find block containing chunk %p", chunk);
|
|
|
|
slab = block->slab;
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
|
|
|
|
/* can't do actual realloc with slab, but let's try to be gentle */
|
|
|
|
if (size == slab->chunkSize)
|
|
|
|
return pointer;
|
|
|
|
|
|
|
|
elog(ERROR, "slab allocator does not support realloc()");
|
2017-03-08 22:19:37 +01:00
|
|
|
return NULL; /* keep compiler quiet */
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
}
|
|
|
|
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
/*
|
|
|
|
* SlabGetChunkContext
|
|
|
|
* Return the MemoryContext that 'pointer' belongs to.
|
|
|
|
*/
|
|
|
|
MemoryContext
|
|
|
|
SlabGetChunkContext(void *pointer)
|
|
|
|
{
|
|
|
|
MemoryChunk *chunk = PointerGetMemoryChunk(pointer);
|
2023-04-15 01:59:52 +02:00
|
|
|
SlabBlock *block;
|
|
|
|
|
|
|
|
/* Allow access to the chunk header. */
|
|
|
|
VALGRIND_MAKE_MEM_DEFINED(chunk, Slab_CHUNKHDRSZ);
|
|
|
|
|
|
|
|
block = MemoryChunkGetBlock(chunk);
|
|
|
|
|
|
|
|
/* Disallow access to the chunk header. */
|
|
|
|
VALGRIND_MAKE_MEM_NOACCESS(chunk, Slab_CHUNKHDRSZ);
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
|
2022-10-28 09:19:06 +02:00
|
|
|
Assert(SlabBlockIsValid(block));
|
2023-04-15 01:59:52 +02:00
|
|
|
|
Harden memory context allocators against bogus chunk pointers.
Before commit c6e0fe1f2, functions such as AllocSetFree could pretty
safely presume that they were given a valid chunk pointer for their
own type of context, because the indirect call through a memory
context object and method struct would be very unlikely to work
otherwise. But now, if pfree() is mistakenly invoked on a pointer
to garbage, we have three chances in eight of ending up at one of
these functions. That means we need to take extra measures to
verify that we are looking at what we're supposed to be looking at,
especially in debug builds.
Hence, add code to verify that the chunk's back-link to a block header
leads to a memory context object that satisfies the right sort of
IsA() check. This is still a bit weaker than what we did before,
but for the moment assume that an IsA() check is sufficient.
As a compromise between speed and safety, implement these checks
as Asserts when dealing with small chunks but plain test-and-elogs
when dealing with large (external) chunks. The latter case should
not be too performance-critical, but the former case probably is.
In slab.c, all chunks are small; but nonetheless use a plain test
in SlabRealloc, because that is certainly not performance-critical,
indeed we should be suspicious that it's being called in error.
In aset.c, additionally add some assertions that the "value" field
of the chunk header is within the small range allowed for freelist
indexes. Without that, we might find ourselves trying to wipe
most of memory when CLOBBER_FREED_MEMORY is enabled, or scribbling
on a "freelist header" that's far away from the context object.
Eventually, field experience might show us that it's smarter for
these tests to be active always, but for now we'll try to get
away with just having them as assertions.
While at it, also be more uniform about asserting that context
objects passed as parameters are of the type we expect. Some
places missed that altogether, and slab.c was for no very good
reason doing it differently from the other allocators.
Discussion: https://postgr.es/m/3578387.1665244345@sss.pgh.pa.us
2022-10-11 00:45:34 +02:00
|
|
|
return &block->slab->header;
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
}
|
|
|
|
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
/*
|
|
|
|
* SlabGetChunkSpace
|
|
|
|
* Given a currently-allocated chunk, determine the total space
|
|
|
|
* it occupies (including all memory-allocation overhead).
|
|
|
|
*/
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
Size
|
|
|
|
SlabGetChunkSpace(void *pointer)
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
{
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
MemoryChunk *chunk = PointerGetMemoryChunk(pointer);
|
2023-04-15 01:59:52 +02:00
|
|
|
SlabBlock *block;
|
Harden memory context allocators against bogus chunk pointers.
Before commit c6e0fe1f2, functions such as AllocSetFree could pretty
safely presume that they were given a valid chunk pointer for their
own type of context, because the indirect call through a memory
context object and method struct would be very unlikely to work
otherwise. But now, if pfree() is mistakenly invoked on a pointer
to garbage, we have three chances in eight of ending up at one of
these functions. That means we need to take extra measures to
verify that we are looking at what we're supposed to be looking at,
especially in debug builds.
Hence, add code to verify that the chunk's back-link to a block header
leads to a memory context object that satisfies the right sort of
IsA() check. This is still a bit weaker than what we did before,
but for the moment assume that an IsA() check is sufficient.
As a compromise between speed and safety, implement these checks
as Asserts when dealing with small chunks but plain test-and-elogs
when dealing with large (external) chunks. The latter case should
not be too performance-critical, but the former case probably is.
In slab.c, all chunks are small; but nonetheless use a plain test
in SlabRealloc, because that is certainly not performance-critical,
indeed we should be suspicious that it's being called in error.
In aset.c, additionally add some assertions that the "value" field
of the chunk header is within the small range allowed for freelist
indexes. Without that, we might find ourselves trying to wipe
most of memory when CLOBBER_FREED_MEMORY is enabled, or scribbling
on a "freelist header" that's far away from the context object.
Eventually, field experience might show us that it's smarter for
these tests to be active always, but for now we'll try to get
away with just having them as assertions.
While at it, also be more uniform about asserting that context
objects passed as parameters are of the type we expect. Some
places missed that altogether, and slab.c was for no very good
reason doing it differently from the other allocators.
Discussion: https://postgr.es/m/3578387.1665244345@sss.pgh.pa.us
2022-10-11 00:45:34 +02:00
|
|
|
SlabContext *slab;
|
2017-02-28 08:32:22 +01:00
|
|
|
|
2023-04-15 01:59:52 +02:00
|
|
|
/* Allow access to the chunk header. */
|
|
|
|
VALGRIND_MAKE_MEM_DEFINED(chunk, Slab_CHUNKHDRSZ);
|
|
|
|
|
|
|
|
block = MemoryChunkGetBlock(chunk);
|
|
|
|
|
|
|
|
/* Disallow access to the chunk header. */
|
|
|
|
VALGRIND_MAKE_MEM_NOACCESS(chunk, Slab_CHUNKHDRSZ);
|
|
|
|
|
2022-10-28 09:19:06 +02:00
|
|
|
Assert(SlabBlockIsValid(block));
|
Harden memory context allocators against bogus chunk pointers.
Before commit c6e0fe1f2, functions such as AllocSetFree could pretty
safely presume that they were given a valid chunk pointer for their
own type of context, because the indirect call through a memory
context object and method struct would be very unlikely to work
otherwise. But now, if pfree() is mistakenly invoked on a pointer
to garbage, we have three chances in eight of ending up at one of
these functions. That means we need to take extra measures to
verify that we are looking at what we're supposed to be looking at,
especially in debug builds.
Hence, add code to verify that the chunk's back-link to a block header
leads to a memory context object that satisfies the right sort of
IsA() check. This is still a bit weaker than what we did before,
but for the moment assume that an IsA() check is sufficient.
As a compromise between speed and safety, implement these checks
as Asserts when dealing with small chunks but plain test-and-elogs
when dealing with large (external) chunks. The latter case should
not be too performance-critical, but the former case probably is.
In slab.c, all chunks are small; but nonetheless use a plain test
in SlabRealloc, because that is certainly not performance-critical,
indeed we should be suspicious that it's being called in error.
In aset.c, additionally add some assertions that the "value" field
of the chunk header is within the small range allowed for freelist
indexes. Without that, we might find ourselves trying to wipe
most of memory when CLOBBER_FREED_MEMORY is enabled, or scribbling
on a "freelist header" that's far away from the context object.
Eventually, field experience might show us that it's smarter for
these tests to be active always, but for now we'll try to get
away with just having them as assertions.
While at it, also be more uniform about asserting that context
objects passed as parameters are of the type we expect. Some
places missed that altogether, and slab.c was for no very good
reason doing it differently from the other allocators.
Discussion: https://postgr.es/m/3578387.1665244345@sss.pgh.pa.us
2022-10-11 00:45:34 +02:00
|
|
|
slab = block->slab;
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
|
2017-02-28 08:32:22 +01:00
|
|
|
return slab->fullChunkSize;
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* SlabIsEmpty
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
* Is the slab empty of any allocated space?
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
*/
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
bool
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
SlabIsEmpty(MemoryContext context)
|
|
|
|
{
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
Assert(SlabIsValid((SlabContext *) context));
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
return (context->mem_allocated == 0);
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* SlabStats
|
Rethink MemoryContext creation to improve performance.
This patch makes a number of interrelated changes to reduce the overhead
involved in creating/deleting memory contexts. The key ideas are:
* Include the AllocSetContext header of an aset.c context in its first
malloc request, rather than allocating it separately in TopMemoryContext.
This means that we now always create an initial or "keeper" block in an
aset, even if it never receives any allocation requests.
* Create freelists in which we can save and recycle recently-destroyed
asets (this idea is due to Robert Haas).
* In the common case where the name of a context is a constant string,
just store a pointer to it in the context header, rather than copying
the string.
The first change eliminates a palloc/pfree cycle per context, and
also avoids bloat in TopMemoryContext, at the price that creating
a context now involves a malloc/free cycle even if the context never
receives any allocations. That would be a loser for some common
usage patterns, but recycling short-lived contexts via the freelist
eliminates that pain.
Avoiding copying constant strings not only saves strlen() and strcpy()
overhead, but is an essential part of the freelist optimization because
it makes the context header size constant. Currently we make no
attempt to use the freelist for contexts with non-constant names.
(Perhaps someday we'll need to think harder about that, but in current
usage, most contexts with custom names are long-lived anyway.)
The freelist management in this initial commit is pretty simplistic,
and we might want to refine it later --- but in common workloads that
will never matter because the freelists will never get full anyway.
To create a context with a non-constant name, one is now required to
call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME
option. AllocSetContextCreate becomes a wrapper macro, and it includes
a test that will complain about non-string-literal context name
parameters on gcc and similar compilers.
An unfortunate side effect of making AllocSetContextCreate a macro is
that one is now *required* to use the size parameter abstraction macros
(ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of
writing out individual size parameters no longer works unless you
switch to AllocSetContextCreateExtended.
Internally to the memory-context-related modules, the context creation
APIs are simplified, removing the rather baroque original design whereby
a context-type module called mcxt.c which then called back into the
context-type module. That saved a bit of code duplication, but not much,
and it prevented context-type modules from exercising control over the
allocation of context headers.
In passing, I converted the test-and-elog validation of aset size
parameters into Asserts to save a few more cycles. The original thought
was that callers might compute size parameters on the fly, but in practice
nobody does that, so it's useless to expend cycles on checking those
numbers in production builds.
Also, mark the memory context method-pointer structs "const",
just for cleanliness.
Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
2017-12-13 19:55:12 +01:00
|
|
|
* Compute stats about memory consumption of a Slab context.
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
*
|
Allow memory contexts to have both fixed and variable ident strings.
Originally, we treated memory context names as potentially variable in
all cases, and therefore always copied them into the context header.
Commit 9fa6f00b1 rethought this a little bit and invented a distinction
between fixed and variable names, skipping the copy step for the former.
But we can make things both simpler and more useful by instead allowing
there to be two parts to a context's identification, a fixed "name" and
an optional, variable "ident". The name supplied in the context create
call is now required to be a compile-time-constant string in all cases,
as it is never copied but just pointed to. The "ident" string, if
wanted, is supplied later. This is needed because typically we want
the ident to be stored inside the context so that it's cleaned up
automatically on context deletion; that means it has to be copied into
the context before we can set the pointer.
The cost of this approach is basically just an additional pointer field
in struct MemoryContextData, which isn't much overhead, and is bought
back entirely in the AllocSet case by not needing a headerSize field
anymore, since we no longer have to cope with variable header length.
In addition, we can simplify the internal interfaces for memory context
creation still further, saving a few cycles there. And it's no longer
true that a custom identifier disqualifies a context from participating
in aset.c's freelist scheme, so possibly there's some win on that end.
All the places that were using non-compile-time-constant context names
are adjusted to put the variable info into the "ident" instead. This
allows more effective identification of those contexts in many cases;
for example, subsidary contexts of relcache entries are now identified
by both type (e.g. "index info") and relname, where before you got only
one or the other. Contexts associated with PL function cache entries
are now identified more fully and uniformly, too.
I also arranged for plancache contexts to use the query source string
as their identifier. This is basically free for CachedPlanSources, as
they contained a copy of that string already. We pay an extra pstrdup
to do it for CachedPlans. That could perhaps be avoided, but it would
make things more fragile (since the CachedPlanSource is sometimes
destroyed first). I suspect future improvements in error reporting will
require CachedPlans to have a copy of that string anyway, so it's not
clear that it's worth moving mountains to avoid it now.
This also changes the APIs for context statistics routines so that the
context-specific routines no longer assume that output goes straight
to stderr, nor do they know all details of the output format. This
is useful immediately to reduce code duplication, and it also allows
for external code to do something with stats output that's different
from printing to stderr.
The reason for pushing this now rather than waiting for v12 is that
it rethinks some of the API changes made by commit 9fa6f00b1. Seems
better for extension authors to endure just one round of API changes
not two.
Discussion: https://postgr.es/m/CAB=Je-FdtmFZ9y9REHD7VsSrnCkiBhsA4mdsLKSPauwXtQBeNA@mail.gmail.com
2018-03-27 22:46:47 +02:00
|
|
|
* printfunc: if not NULL, pass a human-readable stats string to this.
|
|
|
|
* passthru: pass this pointer through to printfunc.
|
|
|
|
* totals: if not NULL, add stats about this context into *totals.
|
Add function to log the memory contexts of specified backend process.
Commit 3e98c0bafb added pg_backend_memory_contexts view to display
the memory contexts of the backend process. However its target process
is limited to the backend that is accessing to the view. So this is
not so convenient when investigating the local memory bloat of other
backend process. To improve this situation, this commit adds
pg_log_backend_memory_contexts() function that requests to log
the memory contexts of the specified backend process.
This information can be also collected by calling
MemoryContextStats(TopMemoryContext) via a debugger. But
this technique cannot be used in some environments because no debugger
is available there. So, pg_log_backend_memory_contexts() allows us to
see the memory contexts of specified backend more easily.
Only superusers are allowed to request to log the memory contexts
because allowing any users to issue this request at an unbounded rate
would cause lots of log messages and which can lead to denial of service.
On receipt of the request, at the next CHECK_FOR_INTERRUPTS(),
the target backend logs its memory contexts at LOG_SERVER_ONLY level,
so that these memory contexts will appear in the server log but not
be sent to the client. It logs one message per memory context.
Because if it buffers all memory contexts into StringInfo to log them
as one message, which may require the buffer to be enlarged very much
and lead to OOM error since there can be a large number of memory
contexts in a backend.
When a backend process is consuming huge memory, logging all its
memory contexts might overrun available disk space. To prevent this,
now this patch limits the number of child contexts to log per parent
to 100. As with MemoryContextStats(), it supposes that practical cases
where the log gets long will typically be huge numbers of siblings
under the same parent context; while the additional debugging value
from seeing details about individual siblings beyond 100 will not be large.
There was another proposed patch to add the function to return
the memory contexts of specified backend as the result sets,
instead of logging them, in the discussion. However that patch is
not included in this commit because it had several issues to address.
Thanks to Tatsuhito Kasahara, Andres Freund, Tom Lane, Tomas Vondra,
Michael Paquier, Kyotaro Horiguchi and Zhihong Yu for the discussion.
Bump catalog version.
Author: Atsushi Torikoshi
Reviewed-by: Kyotaro Horiguchi, Zhihong Yu, Fujii Masao
Discussion: https://postgr.es/m/0271f440ac77f2a4180e0e56ebd944d1@oss.nttdata.com
2021-04-06 06:44:15 +02:00
|
|
|
* print_to_stderr: print stats to stderr if true, elog otherwise.
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
*/
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
void
|
Allow memory contexts to have both fixed and variable ident strings.
Originally, we treated memory context names as potentially variable in
all cases, and therefore always copied them into the context header.
Commit 9fa6f00b1 rethought this a little bit and invented a distinction
between fixed and variable names, skipping the copy step for the former.
But we can make things both simpler and more useful by instead allowing
there to be two parts to a context's identification, a fixed "name" and
an optional, variable "ident". The name supplied in the context create
call is now required to be a compile-time-constant string in all cases,
as it is never copied but just pointed to. The "ident" string, if
wanted, is supplied later. This is needed because typically we want
the ident to be stored inside the context so that it's cleaned up
automatically on context deletion; that means it has to be copied into
the context before we can set the pointer.
The cost of this approach is basically just an additional pointer field
in struct MemoryContextData, which isn't much overhead, and is bought
back entirely in the AllocSet case by not needing a headerSize field
anymore, since we no longer have to cope with variable header length.
In addition, we can simplify the internal interfaces for memory context
creation still further, saving a few cycles there. And it's no longer
true that a custom identifier disqualifies a context from participating
in aset.c's freelist scheme, so possibly there's some win on that end.
All the places that were using non-compile-time-constant context names
are adjusted to put the variable info into the "ident" instead. This
allows more effective identification of those contexts in many cases;
for example, subsidary contexts of relcache entries are now identified
by both type (e.g. "index info") and relname, where before you got only
one or the other. Contexts associated with PL function cache entries
are now identified more fully and uniformly, too.
I also arranged for plancache contexts to use the query source string
as their identifier. This is basically free for CachedPlanSources, as
they contained a copy of that string already. We pay an extra pstrdup
to do it for CachedPlans. That could perhaps be avoided, but it would
make things more fragile (since the CachedPlanSource is sometimes
destroyed first). I suspect future improvements in error reporting will
require CachedPlans to have a copy of that string anyway, so it's not
clear that it's worth moving mountains to avoid it now.
This also changes the APIs for context statistics routines so that the
context-specific routines no longer assume that output goes straight
to stderr, nor do they know all details of the output format. This
is useful immediately to reduce code duplication, and it also allows
for external code to do something with stats output that's different
from printing to stderr.
The reason for pushing this now rather than waiting for v12 is that
it rethinks some of the API changes made by commit 9fa6f00b1. Seems
better for extension authors to endure just one round of API changes
not two.
Discussion: https://postgr.es/m/CAB=Je-FdtmFZ9y9REHD7VsSrnCkiBhsA4mdsLKSPauwXtQBeNA@mail.gmail.com
2018-03-27 22:46:47 +02:00
|
|
|
SlabStats(MemoryContext context,
|
|
|
|
MemoryStatsPrintFunc printfunc, void *passthru,
|
Add function to log the memory contexts of specified backend process.
Commit 3e98c0bafb added pg_backend_memory_contexts view to display
the memory contexts of the backend process. However its target process
is limited to the backend that is accessing to the view. So this is
not so convenient when investigating the local memory bloat of other
backend process. To improve this situation, this commit adds
pg_log_backend_memory_contexts() function that requests to log
the memory contexts of the specified backend process.
This information can be also collected by calling
MemoryContextStats(TopMemoryContext) via a debugger. But
this technique cannot be used in some environments because no debugger
is available there. So, pg_log_backend_memory_contexts() allows us to
see the memory contexts of specified backend more easily.
Only superusers are allowed to request to log the memory contexts
because allowing any users to issue this request at an unbounded rate
would cause lots of log messages and which can lead to denial of service.
On receipt of the request, at the next CHECK_FOR_INTERRUPTS(),
the target backend logs its memory contexts at LOG_SERVER_ONLY level,
so that these memory contexts will appear in the server log but not
be sent to the client. It logs one message per memory context.
Because if it buffers all memory contexts into StringInfo to log them
as one message, which may require the buffer to be enlarged very much
and lead to OOM error since there can be a large number of memory
contexts in a backend.
When a backend process is consuming huge memory, logging all its
memory contexts might overrun available disk space. To prevent this,
now this patch limits the number of child contexts to log per parent
to 100. As with MemoryContextStats(), it supposes that practical cases
where the log gets long will typically be huge numbers of siblings
under the same parent context; while the additional debugging value
from seeing details about individual siblings beyond 100 will not be large.
There was another proposed patch to add the function to return
the memory contexts of specified backend as the result sets,
instead of logging them, in the discussion. However that patch is
not included in this commit because it had several issues to address.
Thanks to Tatsuhito Kasahara, Andres Freund, Tom Lane, Tomas Vondra,
Michael Paquier, Kyotaro Horiguchi and Zhihong Yu for the discussion.
Bump catalog version.
Author: Atsushi Torikoshi
Reviewed-by: Kyotaro Horiguchi, Zhihong Yu, Fujii Masao
Discussion: https://postgr.es/m/0271f440ac77f2a4180e0e56ebd944d1@oss.nttdata.com
2021-04-06 06:44:15 +02:00
|
|
|
MemoryContextCounters *totals,
|
|
|
|
bool print_to_stderr)
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
{
|
Harden memory context allocators against bogus chunk pointers.
Before commit c6e0fe1f2, functions such as AllocSetFree could pretty
safely presume that they were given a valid chunk pointer for their
own type of context, because the indirect call through a memory
context object and method struct would be very unlikely to work
otherwise. But now, if pfree() is mistakenly invoked on a pointer
to garbage, we have three chances in eight of ending up at one of
these functions. That means we need to take extra measures to
verify that we are looking at what we're supposed to be looking at,
especially in debug builds.
Hence, add code to verify that the chunk's back-link to a block header
leads to a memory context object that satisfies the right sort of
IsA() check. This is still a bit weaker than what we did before,
but for the moment assume that an IsA() check is sufficient.
As a compromise between speed and safety, implement these checks
as Asserts when dealing with small chunks but plain test-and-elogs
when dealing with large (external) chunks. The latter case should
not be too performance-critical, but the former case probably is.
In slab.c, all chunks are small; but nonetheless use a plain test
in SlabRealloc, because that is certainly not performance-critical,
indeed we should be suspicious that it's being called in error.
In aset.c, additionally add some assertions that the "value" field
of the chunk header is within the small range allowed for freelist
indexes. Without that, we might find ourselves trying to wipe
most of memory when CLOBBER_FREED_MEMORY is enabled, or scribbling
on a "freelist header" that's far away from the context object.
Eventually, field experience might show us that it's smarter for
these tests to be active always, but for now we'll try to get
away with just having them as assertions.
While at it, also be more uniform about asserting that context
objects passed as parameters are of the type we expect. Some
places missed that altogether, and slab.c was for no very good
reason doing it differently from the other allocators.
Discussion: https://postgr.es/m/3578387.1665244345@sss.pgh.pa.us
2022-10-11 00:45:34 +02:00
|
|
|
SlabContext *slab = (SlabContext *) context;
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
Size nblocks = 0;
|
|
|
|
Size freechunks = 0;
|
Rethink MemoryContext creation to improve performance.
This patch makes a number of interrelated changes to reduce the overhead
involved in creating/deleting memory contexts. The key ideas are:
* Include the AllocSetContext header of an aset.c context in its first
malloc request, rather than allocating it separately in TopMemoryContext.
This means that we now always create an initial or "keeper" block in an
aset, even if it never receives any allocation requests.
* Create freelists in which we can save and recycle recently-destroyed
asets (this idea is due to Robert Haas).
* In the common case where the name of a context is a constant string,
just store a pointer to it in the context header, rather than copying
the string.
The first change eliminates a palloc/pfree cycle per context, and
also avoids bloat in TopMemoryContext, at the price that creating
a context now involves a malloc/free cycle even if the context never
receives any allocations. That would be a loser for some common
usage patterns, but recycling short-lived contexts via the freelist
eliminates that pain.
Avoiding copying constant strings not only saves strlen() and strcpy()
overhead, but is an essential part of the freelist optimization because
it makes the context header size constant. Currently we make no
attempt to use the freelist for contexts with non-constant names.
(Perhaps someday we'll need to think harder about that, but in current
usage, most contexts with custom names are long-lived anyway.)
The freelist management in this initial commit is pretty simplistic,
and we might want to refine it later --- but in common workloads that
will never matter because the freelists will never get full anyway.
To create a context with a non-constant name, one is now required to
call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME
option. AllocSetContextCreate becomes a wrapper macro, and it includes
a test that will complain about non-string-literal context name
parameters on gcc and similar compilers.
An unfortunate side effect of making AllocSetContextCreate a macro is
that one is now *required* to use the size parameter abstraction macros
(ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of
writing out individual size parameters no longer works unless you
switch to AllocSetContextCreateExtended.
Internally to the memory-context-related modules, the context creation
APIs are simplified, removing the rather baroque original design whereby
a context-type module called mcxt.c which then called back into the
context-type module. That saved a bit of code duplication, but not much,
and it prevented context-type modules from exercising control over the
allocation of context headers.
In passing, I converted the test-and-elog validation of aset size
parameters into Asserts to save a few more cycles. The original thought
was that callers might compute size parameters on the fly, but in practice
nobody does that, so it's useless to expend cycles on checking those
numbers in production builds.
Also, mark the memory context method-pointer structs "const",
just for cleanliness.
Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
2017-12-13 19:55:12 +01:00
|
|
|
Size totalspace;
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
Size freespace = 0;
|
|
|
|
int i;
|
|
|
|
|
2022-10-28 09:19:06 +02:00
|
|
|
Assert(SlabIsValid(slab));
|
Harden memory context allocators against bogus chunk pointers.
Before commit c6e0fe1f2, functions such as AllocSetFree could pretty
safely presume that they were given a valid chunk pointer for their
own type of context, because the indirect call through a memory
context object and method struct would be very unlikely to work
otherwise. But now, if pfree() is mistakenly invoked on a pointer
to garbage, we have three chances in eight of ending up at one of
these functions. That means we need to take extra measures to
verify that we are looking at what we're supposed to be looking at,
especially in debug builds.
Hence, add code to verify that the chunk's back-link to a block header
leads to a memory context object that satisfies the right sort of
IsA() check. This is still a bit weaker than what we did before,
but for the moment assume that an IsA() check is sufficient.
As a compromise between speed and safety, implement these checks
as Asserts when dealing with small chunks but plain test-and-elogs
when dealing with large (external) chunks. The latter case should
not be too performance-critical, but the former case probably is.
In slab.c, all chunks are small; but nonetheless use a plain test
in SlabRealloc, because that is certainly not performance-critical,
indeed we should be suspicious that it's being called in error.
In aset.c, additionally add some assertions that the "value" field
of the chunk header is within the small range allowed for freelist
indexes. Without that, we might find ourselves trying to wipe
most of memory when CLOBBER_FREED_MEMORY is enabled, or scribbling
on a "freelist header" that's far away from the context object.
Eventually, field experience might show us that it's smarter for
these tests to be active always, but for now we'll try to get
away with just having them as assertions.
While at it, also be more uniform about asserting that context
objects passed as parameters are of the type we expect. Some
places missed that altogether, and slab.c was for no very good
reason doing it differently from the other allocators.
Discussion: https://postgr.es/m/3578387.1665244345@sss.pgh.pa.us
2022-10-11 00:45:34 +02:00
|
|
|
|
Rethink MemoryContext creation to improve performance.
This patch makes a number of interrelated changes to reduce the overhead
involved in creating/deleting memory contexts. The key ideas are:
* Include the AllocSetContext header of an aset.c context in its first
malloc request, rather than allocating it separately in TopMemoryContext.
This means that we now always create an initial or "keeper" block in an
aset, even if it never receives any allocation requests.
* Create freelists in which we can save and recycle recently-destroyed
asets (this idea is due to Robert Haas).
* In the common case where the name of a context is a constant string,
just store a pointer to it in the context header, rather than copying
the string.
The first change eliminates a palloc/pfree cycle per context, and
also avoids bloat in TopMemoryContext, at the price that creating
a context now involves a malloc/free cycle even if the context never
receives any allocations. That would be a loser for some common
usage patterns, but recycling short-lived contexts via the freelist
eliminates that pain.
Avoiding copying constant strings not only saves strlen() and strcpy()
overhead, but is an essential part of the freelist optimization because
it makes the context header size constant. Currently we make no
attempt to use the freelist for contexts with non-constant names.
(Perhaps someday we'll need to think harder about that, but in current
usage, most contexts with custom names are long-lived anyway.)
The freelist management in this initial commit is pretty simplistic,
and we might want to refine it later --- but in common workloads that
will never matter because the freelists will never get full anyway.
To create a context with a non-constant name, one is now required to
call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME
option. AllocSetContextCreate becomes a wrapper macro, and it includes
a test that will complain about non-string-literal context name
parameters on gcc and similar compilers.
An unfortunate side effect of making AllocSetContextCreate a macro is
that one is now *required* to use the size parameter abstraction macros
(ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of
writing out individual size parameters no longer works unless you
switch to AllocSetContextCreateExtended.
Internally to the memory-context-related modules, the context creation
APIs are simplified, removing the rather baroque original design whereby
a context-type module called mcxt.c which then called back into the
context-type module. That saved a bit of code duplication, but not much,
and it prevented context-type modules from exercising control over the
allocation of context headers.
In passing, I converted the test-and-elog validation of aset size
parameters into Asserts to save a few more cycles. The original thought
was that callers might compute size parameters on the fly, but in practice
nobody does that, so it's useless to expend cycles on checking those
numbers in production builds.
Also, mark the memory context method-pointer structs "const",
just for cleanliness.
Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
2017-12-13 19:55:12 +01:00
|
|
|
/* Include context header in totalspace */
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
totalspace = Slab_CONTEXT_HDRSZ(slab->chunksPerBlock);
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
/* Add the space consumed by blocks in the emptyblocks list */
|
|
|
|
totalspace += dclist_count(&slab->emptyblocks) * slab->blockSize;
|
|
|
|
|
|
|
|
for (i = 0; i < SLAB_BLOCKLIST_COUNT; i++)
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
{
|
|
|
|
dlist_iter iter;
|
|
|
|
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
dlist_foreach(iter, &slab->blocklist[i])
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
{
|
|
|
|
SlabBlock *block = dlist_container(SlabBlock, node, iter.cur);
|
|
|
|
|
|
|
|
nblocks++;
|
|
|
|
totalspace += slab->blockSize;
|
|
|
|
freespace += slab->fullChunkSize * block->nfree;
|
|
|
|
freechunks += block->nfree;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
Allow memory contexts to have both fixed and variable ident strings.
Originally, we treated memory context names as potentially variable in
all cases, and therefore always copied them into the context header.
Commit 9fa6f00b1 rethought this a little bit and invented a distinction
between fixed and variable names, skipping the copy step for the former.
But we can make things both simpler and more useful by instead allowing
there to be two parts to a context's identification, a fixed "name" and
an optional, variable "ident". The name supplied in the context create
call is now required to be a compile-time-constant string in all cases,
as it is never copied but just pointed to. The "ident" string, if
wanted, is supplied later. This is needed because typically we want
the ident to be stored inside the context so that it's cleaned up
automatically on context deletion; that means it has to be copied into
the context before we can set the pointer.
The cost of this approach is basically just an additional pointer field
in struct MemoryContextData, which isn't much overhead, and is bought
back entirely in the AllocSet case by not needing a headerSize field
anymore, since we no longer have to cope with variable header length.
In addition, we can simplify the internal interfaces for memory context
creation still further, saving a few cycles there. And it's no longer
true that a custom identifier disqualifies a context from participating
in aset.c's freelist scheme, so possibly there's some win on that end.
All the places that were using non-compile-time-constant context names
are adjusted to put the variable info into the "ident" instead. This
allows more effective identification of those contexts in many cases;
for example, subsidary contexts of relcache entries are now identified
by both type (e.g. "index info") and relname, where before you got only
one or the other. Contexts associated with PL function cache entries
are now identified more fully and uniformly, too.
I also arranged for plancache contexts to use the query source string
as their identifier. This is basically free for CachedPlanSources, as
they contained a copy of that string already. We pay an extra pstrdup
to do it for CachedPlans. That could perhaps be avoided, but it would
make things more fragile (since the CachedPlanSource is sometimes
destroyed first). I suspect future improvements in error reporting will
require CachedPlans to have a copy of that string anyway, so it's not
clear that it's worth moving mountains to avoid it now.
This also changes the APIs for context statistics routines so that the
context-specific routines no longer assume that output goes straight
to stderr, nor do they know all details of the output format. This
is useful immediately to reduce code duplication, and it also allows
for external code to do something with stats output that's different
from printing to stderr.
The reason for pushing this now rather than waiting for v12 is that
it rethinks some of the API changes made by commit 9fa6f00b1. Seems
better for extension authors to endure just one round of API changes
not two.
Discussion: https://postgr.es/m/CAB=Je-FdtmFZ9y9REHD7VsSrnCkiBhsA4mdsLKSPauwXtQBeNA@mail.gmail.com
2018-03-27 22:46:47 +02:00
|
|
|
if (printfunc)
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
{
|
Allow memory contexts to have both fixed and variable ident strings.
Originally, we treated memory context names as potentially variable in
all cases, and therefore always copied them into the context header.
Commit 9fa6f00b1 rethought this a little bit and invented a distinction
between fixed and variable names, skipping the copy step for the former.
But we can make things both simpler and more useful by instead allowing
there to be two parts to a context's identification, a fixed "name" and
an optional, variable "ident". The name supplied in the context create
call is now required to be a compile-time-constant string in all cases,
as it is never copied but just pointed to. The "ident" string, if
wanted, is supplied later. This is needed because typically we want
the ident to be stored inside the context so that it's cleaned up
automatically on context deletion; that means it has to be copied into
the context before we can set the pointer.
The cost of this approach is basically just an additional pointer field
in struct MemoryContextData, which isn't much overhead, and is bought
back entirely in the AllocSet case by not needing a headerSize field
anymore, since we no longer have to cope with variable header length.
In addition, we can simplify the internal interfaces for memory context
creation still further, saving a few cycles there. And it's no longer
true that a custom identifier disqualifies a context from participating
in aset.c's freelist scheme, so possibly there's some win on that end.
All the places that were using non-compile-time-constant context names
are adjusted to put the variable info into the "ident" instead. This
allows more effective identification of those contexts in many cases;
for example, subsidary contexts of relcache entries are now identified
by both type (e.g. "index info") and relname, where before you got only
one or the other. Contexts associated with PL function cache entries
are now identified more fully and uniformly, too.
I also arranged for plancache contexts to use the query source string
as their identifier. This is basically free for CachedPlanSources, as
they contained a copy of that string already. We pay an extra pstrdup
to do it for CachedPlans. That could perhaps be avoided, but it would
make things more fragile (since the CachedPlanSource is sometimes
destroyed first). I suspect future improvements in error reporting will
require CachedPlans to have a copy of that string anyway, so it's not
clear that it's worth moving mountains to avoid it now.
This also changes the APIs for context statistics routines so that the
context-specific routines no longer assume that output goes straight
to stderr, nor do they know all details of the output format. This
is useful immediately to reduce code duplication, and it also allows
for external code to do something with stats output that's different
from printing to stderr.
The reason for pushing this now rather than waiting for v12 is that
it rethinks some of the API changes made by commit 9fa6f00b1. Seems
better for extension authors to endure just one round of API changes
not two.
Discussion: https://postgr.es/m/CAB=Je-FdtmFZ9y9REHD7VsSrnCkiBhsA4mdsLKSPauwXtQBeNA@mail.gmail.com
2018-03-27 22:46:47 +02:00
|
|
|
char stats_string[200];
|
|
|
|
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
/* XXX should we include free chunks on empty blocks? */
|
Allow memory contexts to have both fixed and variable ident strings.
Originally, we treated memory context names as potentially variable in
all cases, and therefore always copied them into the context header.
Commit 9fa6f00b1 rethought this a little bit and invented a distinction
between fixed and variable names, skipping the copy step for the former.
But we can make things both simpler and more useful by instead allowing
there to be two parts to a context's identification, a fixed "name" and
an optional, variable "ident". The name supplied in the context create
call is now required to be a compile-time-constant string in all cases,
as it is never copied but just pointed to. The "ident" string, if
wanted, is supplied later. This is needed because typically we want
the ident to be stored inside the context so that it's cleaned up
automatically on context deletion; that means it has to be copied into
the context before we can set the pointer.
The cost of this approach is basically just an additional pointer field
in struct MemoryContextData, which isn't much overhead, and is bought
back entirely in the AllocSet case by not needing a headerSize field
anymore, since we no longer have to cope with variable header length.
In addition, we can simplify the internal interfaces for memory context
creation still further, saving a few cycles there. And it's no longer
true that a custom identifier disqualifies a context from participating
in aset.c's freelist scheme, so possibly there's some win on that end.
All the places that were using non-compile-time-constant context names
are adjusted to put the variable info into the "ident" instead. This
allows more effective identification of those contexts in many cases;
for example, subsidary contexts of relcache entries are now identified
by both type (e.g. "index info") and relname, where before you got only
one or the other. Contexts associated with PL function cache entries
are now identified more fully and uniformly, too.
I also arranged for plancache contexts to use the query source string
as their identifier. This is basically free for CachedPlanSources, as
they contained a copy of that string already. We pay an extra pstrdup
to do it for CachedPlans. That could perhaps be avoided, but it would
make things more fragile (since the CachedPlanSource is sometimes
destroyed first). I suspect future improvements in error reporting will
require CachedPlans to have a copy of that string anyway, so it's not
clear that it's worth moving mountains to avoid it now.
This also changes the APIs for context statistics routines so that the
context-specific routines no longer assume that output goes straight
to stderr, nor do they know all details of the output format. This
is useful immediately to reduce code duplication, and it also allows
for external code to do something with stats output that's different
from printing to stderr.
The reason for pushing this now rather than waiting for v12 is that
it rethinks some of the API changes made by commit 9fa6f00b1. Seems
better for extension authors to endure just one round of API changes
not two.
Discussion: https://postgr.es/m/CAB=Je-FdtmFZ9y9REHD7VsSrnCkiBhsA4mdsLKSPauwXtQBeNA@mail.gmail.com
2018-03-27 22:46:47 +02:00
|
|
|
snprintf(stats_string, sizeof(stats_string),
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
"%zu total in %zu blocks; %u empty blocks; %zu free (%zu chunks); %zu used",
|
|
|
|
totalspace, nblocks, dclist_count(&slab->emptyblocks),
|
|
|
|
freespace, freechunks, totalspace - freespace);
|
Add function to log the memory contexts of specified backend process.
Commit 3e98c0bafb added pg_backend_memory_contexts view to display
the memory contexts of the backend process. However its target process
is limited to the backend that is accessing to the view. So this is
not so convenient when investigating the local memory bloat of other
backend process. To improve this situation, this commit adds
pg_log_backend_memory_contexts() function that requests to log
the memory contexts of the specified backend process.
This information can be also collected by calling
MemoryContextStats(TopMemoryContext) via a debugger. But
this technique cannot be used in some environments because no debugger
is available there. So, pg_log_backend_memory_contexts() allows us to
see the memory contexts of specified backend more easily.
Only superusers are allowed to request to log the memory contexts
because allowing any users to issue this request at an unbounded rate
would cause lots of log messages and which can lead to denial of service.
On receipt of the request, at the next CHECK_FOR_INTERRUPTS(),
the target backend logs its memory contexts at LOG_SERVER_ONLY level,
so that these memory contexts will appear in the server log but not
be sent to the client. It logs one message per memory context.
Because if it buffers all memory contexts into StringInfo to log them
as one message, which may require the buffer to be enlarged very much
and lead to OOM error since there can be a large number of memory
contexts in a backend.
When a backend process is consuming huge memory, logging all its
memory contexts might overrun available disk space. To prevent this,
now this patch limits the number of child contexts to log per parent
to 100. As with MemoryContextStats(), it supposes that practical cases
where the log gets long will typically be huge numbers of siblings
under the same parent context; while the additional debugging value
from seeing details about individual siblings beyond 100 will not be large.
There was another proposed patch to add the function to return
the memory contexts of specified backend as the result sets,
instead of logging them, in the discussion. However that patch is
not included in this commit because it had several issues to address.
Thanks to Tatsuhito Kasahara, Andres Freund, Tom Lane, Tomas Vondra,
Michael Paquier, Kyotaro Horiguchi and Zhihong Yu for the discussion.
Bump catalog version.
Author: Atsushi Torikoshi
Reviewed-by: Kyotaro Horiguchi, Zhihong Yu, Fujii Masao
Discussion: https://postgr.es/m/0271f440ac77f2a4180e0e56ebd944d1@oss.nttdata.com
2021-04-06 06:44:15 +02:00
|
|
|
printfunc(context, passthru, stats_string, print_to_stderr);
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
}
|
|
|
|
|
|
|
|
if (totals)
|
|
|
|
{
|
|
|
|
totals->nblocks += nblocks;
|
|
|
|
totals->freechunks += freechunks;
|
|
|
|
totals->totalspace += totalspace;
|
|
|
|
totals->freespace += freespace;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
#ifdef MEMORY_CONTEXT_CHECKING
|
|
|
|
|
|
|
|
/*
|
|
|
|
* SlabCheck
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
* Walk through all blocks looking for inconsistencies.
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
*
|
|
|
|
* NOTE: report errors as WARNING, *not* ERROR or FATAL. Otherwise you'll
|
|
|
|
* find yourself in an infinite loop when trouble occurs, because this
|
|
|
|
* routine will be entered again when elog cleanup tries to release memory!
|
|
|
|
*/
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
void
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
SlabCheck(MemoryContext context)
|
|
|
|
{
|
Harden memory context allocators against bogus chunk pointers.
Before commit c6e0fe1f2, functions such as AllocSetFree could pretty
safely presume that they were given a valid chunk pointer for their
own type of context, because the indirect call through a memory
context object and method struct would be very unlikely to work
otherwise. But now, if pfree() is mistakenly invoked on a pointer
to garbage, we have three chances in eight of ending up at one of
these functions. That means we need to take extra measures to
verify that we are looking at what we're supposed to be looking at,
especially in debug builds.
Hence, add code to verify that the chunk's back-link to a block header
leads to a memory context object that satisfies the right sort of
IsA() check. This is still a bit weaker than what we did before,
but for the moment assume that an IsA() check is sufficient.
As a compromise between speed and safety, implement these checks
as Asserts when dealing with small chunks but plain test-and-elogs
when dealing with large (external) chunks. The latter case should
not be too performance-critical, but the former case probably is.
In slab.c, all chunks are small; but nonetheless use a plain test
in SlabRealloc, because that is certainly not performance-critical,
indeed we should be suspicious that it's being called in error.
In aset.c, additionally add some assertions that the "value" field
of the chunk header is within the small range allowed for freelist
indexes. Without that, we might find ourselves trying to wipe
most of memory when CLOBBER_FREED_MEMORY is enabled, or scribbling
on a "freelist header" that's far away from the context object.
Eventually, field experience might show us that it's smarter for
these tests to be active always, but for now we'll try to get
away with just having them as assertions.
While at it, also be more uniform about asserting that context
objects passed as parameters are of the type we expect. Some
places missed that altogether, and slab.c was for no very good
reason doing it differently from the other allocators.
Discussion: https://postgr.es/m/3578387.1665244345@sss.pgh.pa.us
2022-10-11 00:45:34 +02:00
|
|
|
SlabContext *slab = (SlabContext *) context;
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
int i;
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
int nblocks = 0;
|
Rethink MemoryContext creation to improve performance.
This patch makes a number of interrelated changes to reduce the overhead
involved in creating/deleting memory contexts. The key ideas are:
* Include the AllocSetContext header of an aset.c context in its first
malloc request, rather than allocating it separately in TopMemoryContext.
This means that we now always create an initial or "keeper" block in an
aset, even if it never receives any allocation requests.
* Create freelists in which we can save and recycle recently-destroyed
asets (this idea is due to Robert Haas).
* In the common case where the name of a context is a constant string,
just store a pointer to it in the context header, rather than copying
the string.
The first change eliminates a palloc/pfree cycle per context, and
also avoids bloat in TopMemoryContext, at the price that creating
a context now involves a malloc/free cycle even if the context never
receives any allocations. That would be a loser for some common
usage patterns, but recycling short-lived contexts via the freelist
eliminates that pain.
Avoiding copying constant strings not only saves strlen() and strcpy()
overhead, but is an essential part of the freelist optimization because
it makes the context header size constant. Currently we make no
attempt to use the freelist for contexts with non-constant names.
(Perhaps someday we'll need to think harder about that, but in current
usage, most contexts with custom names are long-lived anyway.)
The freelist management in this initial commit is pretty simplistic,
and we might want to refine it later --- but in common workloads that
will never matter because the freelists will never get full anyway.
To create a context with a non-constant name, one is now required to
call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME
option. AllocSetContextCreate becomes a wrapper macro, and it includes
a test that will complain about non-string-literal context name
parameters on gcc and similar compilers.
An unfortunate side effect of making AllocSetContextCreate a macro is
that one is now *required* to use the size parameter abstraction macros
(ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of
writing out individual size parameters no longer works unless you
switch to AllocSetContextCreateExtended.
Internally to the memory-context-related modules, the context creation
APIs are simplified, removing the rather baroque original design whereby
a context-type module called mcxt.c which then called back into the
context-type module. That saved a bit of code duplication, but not much,
and it prevented context-type modules from exercising control over the
allocation of context headers.
In passing, I converted the test-and-elog validation of aset size
parameters into Asserts to save a few more cycles. The original thought
was that callers might compute size parameters on the fly, but in practice
nobody does that, so it's useless to expend cycles on checking those
numbers in production builds.
Also, mark the memory context method-pointer structs "const",
just for cleanliness.
Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
2017-12-13 19:55:12 +01:00
|
|
|
const char *name = slab->header.name;
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
dlist_iter iter;
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
|
2022-10-28 09:19:06 +02:00
|
|
|
Assert(SlabIsValid(slab));
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
Assert(slab->chunksPerBlock > 0);
|
|
|
|
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
/*
|
|
|
|
* Have a look at the empty blocks. These should have all their chunks
|
|
|
|
* marked as free. Ensure that's the case.
|
|
|
|
*/
|
|
|
|
dclist_foreach(iter, &slab->emptyblocks)
|
|
|
|
{
|
|
|
|
SlabBlock *block = dlist_container(SlabBlock, node, iter.cur);
|
|
|
|
|
|
|
|
if (block->nfree != slab->chunksPerBlock)
|
|
|
|
elog(WARNING, "problem in slab %s: empty block %p should have %d free chunks but has %d chunks free",
|
|
|
|
name, block, slab->chunksPerBlock, block->nfree);
|
|
|
|
}
|
|
|
|
|
|
|
|
/* walk the non-empty block lists */
|
|
|
|
for (i = 0; i < SLAB_BLOCKLIST_COUNT; i++)
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
{
|
|
|
|
int j,
|
|
|
|
nfree;
|
|
|
|
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
/* walk all blocks on this blocklist */
|
|
|
|
dlist_foreach(iter, &slab->blocklist[i])
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
{
|
|
|
|
SlabBlock *block = dlist_container(SlabBlock, node, iter.cur);
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
MemoryChunk *cur_chunk;
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Make sure the number of free chunks (in the block header)
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
* matches the position in the blocklist.
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
*/
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
if (SlabBlocklistIndex(slab, block->nfree) != i)
|
|
|
|
elog(WARNING, "problem in slab %s: block %p is on blocklist %d but should be on blocklist %d",
|
|
|
|
name, block, i, SlabBlocklistIndex(slab, block->nfree));
|
|
|
|
|
|
|
|
/* make sure the block is not empty */
|
|
|
|
if (block->nfree >= slab->chunksPerBlock)
|
|
|
|
elog(WARNING, "problem in slab %s: empty block %p incorrectly stored on blocklist element %d",
|
|
|
|
name, block, i);
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
/* make sure the slab pointer correctly points to this context */
|
|
|
|
if (block->slab != slab)
|
|
|
|
elog(WARNING, "problem in slab %s: bogus slab link in block %p",
|
|
|
|
name, block);
|
|
|
|
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
/* reset the array of free chunks for this block */
|
|
|
|
memset(slab->isChunkFree, 0, (slab->chunksPerBlock * sizeof(bool)));
|
|
|
|
nfree = 0;
|
|
|
|
|
|
|
|
/* walk through the block's free list chunks */
|
|
|
|
cur_chunk = block->freehead;
|
|
|
|
while (cur_chunk != NULL)
|
|
|
|
{
|
|
|
|
int chunkidx = SlabChunkIndex(slab, block, cur_chunk);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Ensure the free list link points to something on the block
|
|
|
|
* at an address aligned according to the full chunk size.
|
|
|
|
*/
|
|
|
|
if (cur_chunk < SlabBlockGetChunk(slab, block, 0) ||
|
|
|
|
cur_chunk > SlabBlockGetChunk(slab, block, slab->chunksPerBlock - 1) ||
|
|
|
|
SlabChunkMod(slab, block, cur_chunk) != 0)
|
|
|
|
elog(WARNING, "problem in slab %s: bogus free list link %p in block %p",
|
|
|
|
name, cur_chunk, block);
|
|
|
|
|
|
|
|
/* count the chunk and mark it free on the free chunk array */
|
|
|
|
nfree++;
|
|
|
|
slab->isChunkFree[chunkidx] = true;
|
|
|
|
|
|
|
|
/* read pointer of the next free chunk */
|
|
|
|
VALGRIND_MAKE_MEM_DEFINED(MemoryChunkGetPointer(cur_chunk), sizeof(MemoryChunk *));
|
|
|
|
cur_chunk = *(MemoryChunk **) SlabChunkGetPointer(cur_chunk);
|
|
|
|
}
|
|
|
|
|
|
|
|
/* check that the unused pointer matches what nunused claims */
|
|
|
|
if (SlabBlockGetChunk(slab, block, slab->chunksPerBlock - block->nunused) !=
|
|
|
|
block->unused)
|
|
|
|
elog(WARNING, "problem in slab %s: mismatch detected between nunused chunks and unused pointer in block %p",
|
|
|
|
name, block);
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
|
|
|
|
/*
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
* count the remaining free chunks that have yet to make it onto
|
|
|
|
* the block's free list.
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
*/
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
cur_chunk = block->unused;
|
|
|
|
for (j = 0; j < block->nunused; j++)
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
{
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
int chunkidx = SlabChunkIndex(slab, block, cur_chunk);
|
|
|
|
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
/* count the chunk as free and mark it as so in the array */
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
nfree++;
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
if (chunkidx < slab->chunksPerBlock)
|
|
|
|
slab->isChunkFree[chunkidx] = true;
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
/* move forward 1 chunk */
|
|
|
|
cur_chunk = (MemoryChunk *) (((char *) cur_chunk) + slab->fullChunkSize);
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
}
|
|
|
|
|
|
|
|
for (j = 0; j < slab->chunksPerBlock; j++)
|
|
|
|
{
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
if (!slab->isChunkFree[j])
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
{
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
MemoryChunk *chunk = SlabBlockGetChunk(slab, block, j);
|
2023-04-15 01:59:52 +02:00
|
|
|
SlabBlock *chunkblock;
|
|
|
|
|
|
|
|
/* Allow access to the chunk header. */
|
|
|
|
VALGRIND_MAKE_MEM_DEFINED(chunk, Slab_CHUNKHDRSZ);
|
|
|
|
|
|
|
|
chunkblock = (SlabBlock *) MemoryChunkGetBlock(chunk);
|
|
|
|
|
|
|
|
/* Disallow access to the chunk header. */
|
|
|
|
VALGRIND_MAKE_MEM_NOACCESS(chunk, Slab_CHUNKHDRSZ);
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
|
|
|
|
/*
|
|
|
|
* check the chunk's blockoffset correctly points back to
|
|
|
|
* the block
|
|
|
|
*/
|
|
|
|
if (chunkblock != block)
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
elog(WARNING, "problem in slab %s: bogus block link in block %p, chunk %p",
|
|
|
|
name, block, chunk);
|
|
|
|
|
Make more effort to put a sentinel at the end of allocated memory
Traditionally, in MEMORY_CONTEXT_CHECKING builds, we only ever marked a
sentinel byte just beyond the requested size if there happened to be
enough space on the chunk to do so. For Slab and Generation context
types, we only rounded the size of the chunk up to the next maxalign
boundary, so it was often not that likely that those would ever have space
for the sentinel given that the majority of allocation requests are going
to be for sizes which are maxaligned. For AllocSet, it was a little
different as smaller allocations are rounded up to the next power-of-2
value rather than the next maxalign boundary, so we're a bit more likely
to have space for the sentinel byte, especially when we get away from tiny
sized allocations such as 8 or 16 bytes.
Here we make more of an effort to allow space so that there is enough room
for the sentinel byte in more cases. This makes it more likely that we'll
detect when buggy code accidentally writes beyond the end of any of its
memory allocations.
Each of the 3 MemoryContext types has been changed as follows:
The Slab allocator will now always set a sentinel byte. Both the current
usages of this MemoryContext type happen to use chunk sizes which were on
the maxalign boundary, so these never used sentinel bytes previously.
For the Generation allocator, we now always ensure there's enough space in
the allocation for a sentinel byte.
For AllocSet, this commit makes an adjustment for allocation sizes which
are greater than allocChunkLimit. We now ensure there is always space for
a sentinel byte. We don't alter the sentinel behavior for request sizes
<= allocChunkLimit. Making way for the sentinel byte for power-of-2
request sizes would require doubling up to the next power of 2. Some
analysis done on the request sizes made during installcheck shows that a
fairly large portion of allocation requests are for power-of-2 sizes. The
amount of additional memory for the sentinel there seems prohibitive, so
we do nothing for those here.
Author: David Rowley
Discussion: https://postgr.es/m/3478405.1661824539@sss.pgh.pa.us
2022-09-07 05:46:57 +02:00
|
|
|
/* check the sentinel byte is intact */
|
|
|
|
Assert(slab->chunkSize < (slab->fullChunkSize - Slab_CHUNKHDRSZ));
|
|
|
|
if (!sentinel_ok(chunk, Slab_CHUNKHDRSZ + slab->chunkSize))
|
|
|
|
elog(WARNING, "problem in slab %s: detected write past chunk end in block %p, chunk %p",
|
|
|
|
name, block, chunk);
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Make sure we got the expected number of free chunks (as tracked
|
|
|
|
* in the block header).
|
|
|
|
*/
|
|
|
|
if (nfree != block->nfree)
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
elog(WARNING, "problem in slab %s: nfree in block %p is %d but %d chunk were found as free",
|
|
|
|
name, block, block->nfree, nfree);
|
|
|
|
|
|
|
|
nblocks++;
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
}
|
|
|
|
}
|
2019-10-01 03:13:39 +02:00
|
|
|
|
Improve the performance of the slab memory allocator
Slab has traditionally been fairly slow when compared with the AllocSet or
Generation memory allocators. Part of this slowness came from having to
write out an entire block when we allocate a new block in order to
populate the free list indexes within the block's memory. Additional
slowness came from having to move a block onto another dlist each time we
palloc or pfree a chunk from it.
Here we optimize both of those cases and do a little bit extra to improve
the performance of the slab allocator.
Here, instead of writing out the free list indexes when allocating a new
block, we introduce the concept of "unused" chunks. When a block is first
allocated all chunks are unused. These chunks only make it onto the
free list when they are pfree'd. When allocating new chunks on an
existing block, we have the choice of consuming a chunk from the free list
or an unused chunk. When both exist, we opt to use one from the free
list, as these have been used already and the memory of them is more
likely to be cached by the CPU.
Here we also reduce the number of block lists from there being one for
every possible value of free chunks on a block to just having a small
fixed number of block lists. We keep the 0th block list for completely
full blocks and anything else stores blocks for some range of free chunks
with fuller blocks appearing on lower block list array elements. This
reduces how often we must move a block to another list when we allocate or
free chunks, but still allows us to prefer to put new chunks on fuller
blocks and perhaps allow blocks with fewer chunks to be free'd later
once all their remaining chunks have been pfree'd.
Additionally, we now store a list of "emptyblocks", which are blocks that
no longer contain any allocated chunks. We now keep up to 10 of these
around to avoid having to thrash malloc/free when allocation patterns
continually cause blocks to become free of any allocated chunks only to
allocate more chunks again. Now only once we have 10 of these, we free
the block. This does raise the high water mark for the total memory that
a slab context can consume. It does not seem entirely unreasonable that
we might one day want to make this a property of SlabContext rather than a
compile-time constant. Let's wait and see if there is any evidence to
support that this is required before doing it.
Author: Andres Freund, David Rowley
Tested-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
2022-12-20 09:48:51 +01:00
|
|
|
/* the stored empty blocks are tracked in mem_allocated too */
|
|
|
|
nblocks += dclist_count(&slab->emptyblocks);
|
|
|
|
|
|
|
|
Assert(nblocks * slab->blockSize == context->mem_allocated);
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
}
|
|
|
|
|
|
|
|
#endif /* MEMORY_CONTEXT_CHECKING */
|