1996-07-09 08:22:35 +02:00
|
|
|
/*-------------------------------------------------------------------------
|
|
|
|
*
|
1999-02-14 00:22:53 +01:00
|
|
|
* catcache.c
|
1996-07-09 08:22:35 +02:00
|
|
|
* System catalog cache for tuples matching a key.
|
|
|
|
*
|
2023-01-02 21:00:37 +01:00
|
|
|
* Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
|
2000-01-26 06:58:53 +01:00
|
|
|
* Portions Copyright (c) 1994, Regents of the University of California
|
1996-07-09 08:22:35 +02:00
|
|
|
*
|
|
|
|
*
|
|
|
|
* IDENTIFICATION
|
2010-09-20 22:08:53 +02:00
|
|
|
* src/backend/utils/cache/catcache.c
|
1996-07-09 08:22:35 +02:00
|
|
|
*
|
|
|
|
*-------------------------------------------------------------------------
|
|
|
|
*/
|
|
|
|
#include "postgres.h"
|
2000-02-21 04:36:59 +01:00
|
|
|
|
2019-12-27 00:09:00 +01:00
|
|
|
#include "access/genam.h"
|
2019-07-08 17:58:05 +02:00
|
|
|
#include "access/heaptoast.h"
|
2008-06-19 02:46:06 +02:00
|
|
|
#include "access/relscan.h"
|
2008-05-12 02:00:54 +02:00
|
|
|
#include "access/sysattr.h"
|
2019-01-21 19:18:20 +01:00
|
|
|
#include "access/table.h"
|
2013-07-15 19:31:36 +02:00
|
|
|
#include "access/xact.h"
|
Make collation-aware system catalog columns use "C" collation.
Up to now we allowed text columns in system catalogs to use collation
"default", but that isn't really safe because it might mean something
different in template0 than it means in a database cloned from template0.
In particular, this could mean that cloned pg_statistic entries for such
columns weren't entirely valid, possibly leading to bogus planner
estimates, though (probably) not any outright failures.
In the wake of commit 5e0928005, a better solution is available: if we
label such columns with "C" collation, then their pg_statistic entries
will also use that collation and hence will be valid independently of
the database collation.
This also provides a cleaner solution for indexes on such columns than
the hack added by commit 0b28ea79c: the indexes will naturally inherit
"C" collation and don't have to be forced to use text_pattern_ops.
Also, with the planned improvement of type "name" to be collation-aware,
this policy will apply cleanly to both text and name columns.
Because of the pg_statistic angle, we should also apply this policy
to the tables in information_schema. This patch does that by adjusting
information_schema's textual domain types to specify "C" collation.
That has the user-visible effect that order-sensitive comparisons to
textual information_schema view columns will now use "C" collation
by default. The SQL standard says that the collation of those view
columns is implementation-defined, so I think this is legal per spec.
At some point this might allow for translation of such comparisons
into indexable conditions on the underlying "name" columns, although
additional work will be needed before that can happen.
Discussion: https://postgr.es/m/19346.1544895309@sss.pgh.pa.us
2018-12-18 18:48:15 +01:00
|
|
|
#include "catalog/pg_collation.h"
|
1999-11-22 18:56:41 +01:00
|
|
|
#include "catalog/pg_operator.h"
|
1999-07-16 05:14:30 +02:00
|
|
|
#include "catalog/pg_type.h"
|
2020-02-27 04:55:41 +01:00
|
|
|
#include "common/hashfn.h"
|
1999-07-16 07:00:38 +02:00
|
|
|
#include "miscadmin.h"
|
2022-02-20 07:22:08 +01:00
|
|
|
#include "port/pg_bitutils.h"
|
2002-02-19 21:11:20 +01:00
|
|
|
#ifdef CATCACHE_STATS
#include "storage/ipc.h"		/* for on_proc_exit */
#endif
|
2011-03-22 18:00:24 +01:00
|
|
|
#include "storage/lmgr.h"
|
1999-07-16 07:00:38 +02:00
|
|
|
#include "utils/builtins.h"
|
Make ResourceOwners more easily extensible.
Instead of having a separate array/hash for each resource kind, use a
single array and hash to hold all kinds of resources. This makes it
possible to introduce new resource "kinds" without having to modify
the ResourceOwnerData struct. In particular, this makes it possible
for extensions to register custom resource kinds.
The old approach was to have a small array of resources of each kind,
and if it fills up, switch to a hash table. The new approach also uses
an array and a hash, but now the array and the hash are used at the
same time. The array is used to hold the recently added resources, and
when it fills up, they are moved to the hash. This keeps the access to
recent entries fast, even when there are a lot of long-held resources.
All the resource-specific ResourceOwnerEnlarge*(),
ResourceOwnerRemember*(), and ResourceOwnerForget*() functions have
been replaced with three generic functions that take resource kind as
argument. For convenience, we still define resource-specific wrapper
macros around the generic functions with the old names, but they are
now defined in the source files that use those resource kinds.
The release callback no longer needs to call ResourceOwnerForget on
the resource being released. ResourceOwnerRelease unregisters the
resource from the owner before calling the callback. That needed some
changes in bufmgr.c and some other files, where releasing the
resources previously always called ResourceOwnerForget.
Each resource kind specifies a release priority, and
ResourceOwnerReleaseAll releases the resources in priority order. To
make that possible, we have to restrict what you can do between
phases. After calling ResourceOwnerRelease(), you are no longer
allowed to remember any more resources in it or to forget any
previously remembered resources by calling ResourceOwnerForget. There
was one case where that was done previously. At subtransaction commit,
AtEOSubXact_Inval() would handle the invalidation messages and call
RelationFlushRelation(), which temporarily increased the reference
count on the relation being flushed. We now switch to the parent
subtransaction's resource owner before calling AtEOSubXact_Inval(), so
that there is a valid ResourceOwner to temporarily hold that relcache
reference.
Other end-of-xact routines make similar calls to AtEOXact_Inval()
between release phases, but I didn't see any regression test failures
from those, so I'm not sure if they could reach a codepath that needs
remembering extra resources.
There were two exceptions to how the resource leak WARNINGs on commit
were printed previously: llvmjit silently released the context without
printing the warning, and a leaked buffer io triggered a PANIC. Now
everything prints a WARNING, including those cases.
Add tests in src/test/modules/test_resowner.
Reviewed-by: Aleksander Alekseev, Michael Paquier, Julien Rouhaud
Reviewed-by: Kyotaro Horiguchi, Hayato Kuroda, Álvaro Herrera, Zhihong Yu
Reviewed-by: Peter Eisentraut, Andres Freund
Discussion: https://www.postgresql.org/message-id/cbfabeb0-cd3c-e951-a572-19b365ed314d%40iki.fi
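The array-plus-hash scheme described in the commit message above can be illustrated with a toy sketch. Everything here (`ToyOwner`, `toy_remember`, `toy_forget`, the fixed sizes) is hypothetical and is not the resowner.c API; it only assumes what the message states: new resources go into a small array, and when the array fills up its contents are moved into a hash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_ARRAY_SIZE 4		/* holds the most recently remembered resources */
#define TOY_HASH_SIZE  16		/* power of 2; never resized in this sketch */

/* Hypothetical owner: one array and one hash shared by all resource kinds. */
typedef struct ToyOwner
{
	void	   *arr[TOY_ARRAY_SIZE];
	int			narr;
	void	   *hash[TOY_HASH_SIZE];	/* NULL marks an empty slot */
	int			nhash;
} ToyOwner;

/* Toy hash of the pointer value, masked to a slot number. */
static uint32_t
toy_slot(const void *res)
{
	return (uint32_t) ((uintptr_t) res >> 3) & (TOY_HASH_SIZE - 1);
}

/* Move every array entry into the hash, using linear probing. */
static void
toy_spill(ToyOwner *o)
{
	for (int i = 0; i < o->narr; i++)
	{
		uint32_t	idx = toy_slot(o->arr[i]);

		while (o->hash[idx] != NULL)
			idx = (idx + 1) & (TOY_HASH_SIZE - 1);
		o->hash[idx] = o->arr[i];
		o->nhash++;
	}
	o->narr = 0;
}

/* Remember a non-NULL resource; returns 0 if the fixed-size hash is full. */
static int
toy_remember(ToyOwner *o, void *res)
{
	assert(res != NULL);
	if (o->narr == TOY_ARRAY_SIZE)
	{
		if (o->nhash + TOY_ARRAY_SIZE > TOY_HASH_SIZE)
			return 0;
		toy_spill(o);
	}
	o->arr[o->narr++] = res;
	return 1;
}

/* Forget a resource: check the array (newest first), then the hash. */
static int
toy_forget(ToyOwner *o, void *res)
{
	uint32_t	idx;

	for (int i = o->narr - 1; i >= 0; i--)
	{
		if (o->arr[i] == res)
		{
			o->arr[i] = o->arr[o->narr - 1];
			o->narr--;
			return 1;
		}
	}

	/* Probe every slot, so clearing a slot never hides a later entry. */
	idx = toy_slot(res);
	for (int n = 0; n < TOY_HASH_SIZE; n++)
	{
		if (o->hash[idx] == res)
		{
			o->hash[idx] = NULL;
			o->nhash--;
			return 1;
		}
		idx = (idx + 1) & (TOY_HASH_SIZE - 1);
	}
	return 0;
}
```

Recently remembered resources are found by a short array scan, which is the property the commit message highlights; only older, long-held resources pay for a hash probe.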
2023-11-08 12:30:50 +01:00
|
|
|
#include "utils/catcache.h"
|
Improve sys/catcache performance.
The following are the individual improvements:
1) Avoid FunctionCallInfo-based function calls, replacing them with
more efficient functions that have a native C argument interface.
2) Don't extract columns from a cache entry's tuple whenever matching
entries - instead store them as a Datum array. This also makes it
possible to stop building dummy tuples for negative & list
entries, and to drop a hack for dealing with cstring vs. text weirdness.
3) Reorder members of catcache.h struct, so important entries are more
likely to be on one cacheline.
4) Allowing the compiler to specialize the critical SearchCatCache for a
specific number of attributes makes it possible to unroll loops and
avoid other nkeys-dependent initialization.
5) Only initializing the ScanKey when necessary, i.e. on catcache misses,
greatly reduces unnecessary CPU cache misses.
6) Split off the cache-miss case from the hash lookup, reducing stack
allocations etc. in the common case.
7) CatCTup and their corresponding heaptuple are allocated in one
piece.
This results in making cache lookups themselves roughly three times as
fast - full-system benchmarks obviously improve less than that.
I've also evaluated further techniques:
- replace open coded hash with simplehash - the list walk right now
shows up in profiles. Unfortunately it's not easy to do so safely as
an entry's memory location can change at various times, which
doesn't work well with the refcounting and cache invalidation.
- Cacheline-aligning CatCTup entries - helps some with performance,
but the win isn't big and the code for it is ugly, because the
tuples have to be freed as well.
- add more proper functions, rather than macros for
SearchSysCacheCopyN etc., but right now they don't show up in
profiles.
The reason the macro wrappers in syscache.c/h have to be changed,
rather than just catcache, is that doing otherwise would require
exposing the SysCache array to the outside. That might be a good idea
anyway, but it's for another day.
Author: Andres Freund
Reviewed-By: Robert Haas
Discussion: https://postgr.es/m/20170914061207.zxotvyopetm7lrrp@alap3.anarazel.de
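Item 4 of the commit message above (letting the compiler specialize a lookup for a fixed number of keys) can be sketched as follows. This is a toy illustration under stated assumptions: `Datum` is a stand-in typedef, and `toy_hash_internal`, `toy_hash1` and `toy_hash2` are hypothetical names, not functions from catcache.c. The idea is that an inline worker takes `nkeys` as a parameter, and thin wrappers pass a compile-time constant so the loop can be unrolled per wrapper.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for PostgreSQL's Datum; assumed for this sketch only. */
typedef uintptr_t Datum;

/*
 * Generic worker.  Because it is inline and each wrapper below passes a
 * constant nkeys, the compiler is free to unroll the loop for that count.
 * The hash itself is a toy: rotate left by one bit, then XOR in the key.
 */
static inline uint32_t
toy_hash_internal(int nkeys, Datum v1, Datum v2, Datum v3, Datum v4)
{
	Datum		keys[4] = {v1, v2, v3, v4};
	uint32_t	h = 0;

	assert(nkeys >= 1 && nkeys <= 4);
	for (int i = 0; i < nkeys; i++)
	{
		h = (h << 1) | (h >> 31);
		h ^= (uint32_t) keys[i];
	}
	return h;
}

/* One thin wrapper per key count; unused arguments are passed as 0. */
static inline uint32_t
toy_hash1(Datum v1)
{
	return toy_hash_internal(1, v1, 0, 0, 0);
}

static inline uint32_t
toy_hash2(Datum v1, Datum v2)
{
	return toy_hash_internal(2, v1, v2, 0, 0);
}
```

Callers that know the key count statically use the matching wrapper, so the generic worker is written only once while each call site can get code tailored to its count.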
2017-10-13 22:16:50 +02:00
|
|
|
#include "utils/datum.h"
|
2006-07-11 18:35:33 +02:00
|
|
|
#include "utils/fmgroids.h"
|
2010-02-07 21:48:13 +01:00
|
|
|
#include "utils/inval.h"
|
2005-05-06 19:24:55 +02:00
|
|
|
#include "utils/memutils.h"
|
2008-06-19 02:46:06 +02:00
|
|
|
#include "utils/rel.h"
|
2023-11-08 12:30:50 +01:00
|
|
|
#include "utils/resowner.h"
|
1999-11-01 03:29:27 +01:00
|
|
|
#include "utils/syscache.h"
|
1996-07-09 08:22:35 +02:00
|
|
|
|
2000-11-16 23:30:52 +01:00
|
|
|
|
2005-04-14 22:03:27 +02:00
|
|
|
/* #define CACHEDEBUG */ /* turns DEBUG elogs on */
|
2001-06-18 05:35:07 +02:00
|
|
|
|
2002-03-06 21:49:46 +01:00
|
|
|
/*
 * Given a hash value and the size of the hash table, find the bucket
 * in which the hash value belongs. Since the hash table must contain
 * a power-of-2 number of elements, this is a simple bitmask.
 */
#define HASH_INDEX(h, sz) ((Index) ((h) & ((sz) - 1)))
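As a minimal sketch of why the bitmask works: for a power-of-2 table size, `h & (sz - 1)` keeps exactly the low bits of `h`, which is the same result as `h % sz`. The `Index` typedef below is a stand-in, and `is_power_of_two` and `bucket_for` are hypothetical helpers added only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for PostgreSQL's Index type; assumed for this sketch only. */
typedef unsigned int Index;

/* Same shape as the macro above; only valid when sz is a power of 2. */
#define HASH_INDEX(h, sz) ((Index) ((h) & ((sz) - 1)))

/* Hypothetical helper: nonzero when sz is a nonzero power of 2. */
static int
is_power_of_two(uint32_t sz)
{
	return sz != 0 && (sz & (sz - 1)) == 0;
}

/* Hypothetical helper: bucket lookup that checks the precondition first. */
static Index
bucket_for(uint32_t hash, uint32_t nbuckets)
{
	assert(is_power_of_two(nbuckets));
	return HASH_INDEX(hash, nbuckets);
}
```

With a table size that is not a power of 2 (say 96), the mask would skip some buckets entirely, which is why the comment insists on the power-of-2 requirement.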
|
|
|
|
|
1997-08-19 23:40:56 +02:00
|
|
|
|
2001-02-22 19:39:20 +01:00
|
|
|
/*
|
1996-07-09 08:22:35 +02:00
|
|
|
* variables, macros and other stuff
|
|
|
|
*/
|
|
|
|
|
|
|
|
#ifdef CACHEDEBUG
|
2019-02-18 12:32:34 +01:00
|
|
|
#define CACHE_elog(...) elog(__VA_ARGS__)
|
1996-07-09 08:22:35 +02:00
|
|
|
#else
|
2019-02-18 12:32:34 +01:00
|
|
|
#define CACHE_elog(...)
|
1996-07-09 08:22:35 +02:00
|
|
|
#endif
|
|
|
|
|
2001-06-18 05:35:07 +02:00
|
|
|
/* Cache management header --- pointer is NULL until created */
static CatCacheHeader *CacheHdr = NULL;
|
1996-07-09 08:22:35 +02:00
|
|
|
|
2017-10-13 22:16:50 +02:00
|
|
|
static inline HeapTuple SearchCatCacheInternal(CatCache *cache,
											   int nkeys,
											   Datum v1, Datum v2,
											   Datum v3, Datum v4);
|
|
|
|
|
|
|
|
static pg_noinline HeapTuple SearchCatCacheMiss(CatCache *cache,
												int nkeys,
												uint32 hashValue,
												Index hashIndex,
												Datum v1, Datum v2,
												Datum v3, Datum v4);
|
2001-06-18 05:35:07 +02:00
|
|
|
|
2002-04-06 08:59:25 +02:00
|
|
|
static uint32 CatalogCacheComputeHashValue(CatCache *cache, int nkeys,
|
2017-10-13 22:16:50 +02:00
|
|
|
Datum v1, Datum v2, Datum v3, Datum v4);
|
|
|
|
static uint32 CatalogCacheComputeTupleHashValue(CatCache *cache, int nkeys,
|
2001-06-18 05:35:07 +02:00
|
|
|
HeapTuple tuple);
|
2017-10-13 22:16:50 +02:00
|
|
|
static inline bool CatalogCacheCompareTuple(const CatCache *cache, int nkeys,
											const Datum *cachekeys,
											const Datum *searchkeys);
|
2002-09-04 22:31:48 +02:00
|
|
|
|
2002-02-19 21:11:20 +01:00
|
|
|
#ifdef CATCACHE_STATS
|
2006-06-15 04:08:09 +02:00
|
|
|
static void CatCachePrintStats(int code, Datum arg);
|
2002-02-19 21:11:20 +01:00
|
|
|
#endif
|
2002-03-03 18:47:56 +01:00
|
|
|
static void CatCacheRemoveCTup(CatCache *cache, CatCTup *ct);
|
2002-04-06 08:59:25 +02:00
|
|
|
static void CatCacheRemoveCList(CatCache *cache, CatCList *cl);
|
2002-03-03 18:47:56 +01:00
|
|
|
static void CatalogCacheInitializeCache(CatCache *cache);
|
2002-04-06 08:59:25 +02:00
|
|
|
static CatCTup *CatalogCacheCreateEntry(CatCache *cache, HeapTuple ntp,
|
2017-10-13 22:16:50 +02:00
|
|
|
Datum *arguments,
|
2002-04-06 08:59:25 +02:00
|
|
|
uint32 hashValue, Index hashIndex,
|
|
|
|
bool negative);
|
2017-10-13 22:16:50 +02:00
|
|
|
|
2023-11-08 12:30:50 +01:00
|
|
|
static void ReleaseCatCacheWithOwner(HeapTuple tuple, ResourceOwner resowner);
static void ReleaseCatCacheListWithOwner(CatCList *list, ResourceOwner resowner);
|
2017-10-13 22:16:50 +02:00
|
|
|
static void CatCacheFreeKeys(TupleDesc tupdesc, int nkeys, int *attnos,
							 Datum *keys);
static void CatCacheCopyKeys(TupleDesc tupdesc, int nkeys, int *attnos,
							 Datum *srckeys, Datum *dstkeys);
|
2001-06-18 05:35:07 +02:00
|
|
|
|
|
|
|
|
2001-02-22 19:39:20 +01:00
|
|
|
/*
|
1996-07-09 08:22:35 +02:00
|
|
|
* internal support functions
|
|
|
|
*/
|
2000-02-21 04:36:59 +01:00
|
|
|
|
Make ResourceOwners more easily extensible.
Instead of having a separate array/hash for each resource kind, use a
single array and hash to hold all kinds of resources. This makes it
possible to introduce new resource "kinds" without having to modify
the ResourceOwnerData struct. In particular, this makes it possible
for extensions to register custom resource kinds.
The old approach was to have a small array of resources of each kind,
and if it fills up, switch to a hash table. The new approach also uses
an array and a hash, but now the array and the hash are used at the
same time. The array is used to hold the recently added resources, and
when it fills up, they are moved to the hash. This keeps the access to
recent entries fast, even when there are a lot of long-held resources.
All the resource-specific ResourceOwnerEnlarge*(),
ResourceOwnerRemember*(), and ResourceOwnerForget*() functions have
been replaced with three generic functions that take resource kind as
argument. For convenience, we still define resource-specific wrapper
macros around the generic functions with the old names, but they are
now defined in the source files that use those resource kinds.
The release callback no longer needs to call ResourceOwnerForget on
the resource being released. ResourceOwnerRelease unregisters the
resource from the owner before calling the callback. That needed some
changes in bufmgr.c and some other files, where releasing the
resources previously always called ResourceOwnerForget.
Each resource kind specifies a release priority, and
ResourceOwnerReleaseAll releases the resources in priority order. To
make that possible, we have to restrict what you can do between
phases. After calling ResourceOwnerRelease(), you are no longer
allowed to remember any more resources in it or to forget any
previously remembered resources by calling ResourceOwnerForget. There
was one case where that was done previously. At subtransaction commit,
AtEOSubXact_Inval() would handle the invalidation messages and call
RelationFlushRelation(), which temporarily increased the reference
count on the relation being flushed. We now switch to the parent
subtransaction's resource owner before calling AtEOSubXact_Inval(), so
that there is a valid ResourceOwner to temporarily hold that relcache
reference.
Other end-of-xact routines make similar calls to AtEOXact_Inval()
between release phases, but I didn't see any regression test failures
from those, so I'm not sure if they could reach a codepath that needs
remembering extra resources.
There were two exceptions to how the resource leak WARNINGs on commit
were printed previously: llvmjit silently released the context without
printing the warning, and a leaked buffer io triggered a PANIC. Now
everything prints a WARNING, including those cases.
Add tests in src/test/modules/test_resowner.
Reviewed-by: Aleksander Alekseev, Michael Paquier, Julien Rouhaud
Reviewed-by: Kyotaro Horiguchi, Hayato Kuroda, Álvaro Herrera, Zhihong Yu
Reviewed-by: Peter Eisentraut, Andres Freund
Discussion: https://www.postgresql.org/message-id/cbfabeb0-cd3c-e951-a572-19b365ed314d%40iki.fi
2023-11-08 12:30:50 +01:00
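The commit message above describes keeping recently added resources in a small array and moving them to a hash table when the array fills up. The following is a minimal sketch of that idea only, under stated assumptions: `DemoOwner`, `demo_remember`, `demo_forget` and the sizes are hypothetical, the hash table has a fixed capacity and is never resized, and none of this is the actual resowner.c implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_ARRAY_SIZE 4					/* hypothetical size of the "recent" array */
#define DEMO_HASH_BITS	6
#define DEMO_HASH_SIZE	(1 << DEMO_HASH_BITS)	/* fixed capacity for this sketch */
#define DEMO_EMPTY		((uintptr_t) 0)		/* never-used hash slot */
#define DEMO_DELETED	UINTPTR_MAX			/* tombstone left behind by forget */

typedef struct DemoOwner
{
	uintptr_t	recent[DEMO_ARRAY_SIZE];	/* most recently remembered values */
	int			nrecent;
	uintptr_t	hash[DEMO_HASH_SIZE];		/* older values, linear probing */
	int			nhash;
} DemoOwner;

/* Map a value to its first probe position (multiplicative hashing). */
static size_t
demo_slot(uintptr_t value)
{
	return (size_t) (((uint64_t) value * 0x9E3779B97F4A7C15ull) >> (64 - DEMO_HASH_BITS));
}

static void
demo_hash_insert(DemoOwner *owner, uintptr_t value)
{
	size_t		i = demo_slot(value);

	assert(owner->nhash < DEMO_HASH_SIZE);	/* the sketch never grows the table */
	while (owner->hash[i] != DEMO_EMPTY && owner->hash[i] != DEMO_DELETED)
		i = (i + 1) & (DEMO_HASH_SIZE - 1);
	owner->hash[i] = value;
	owner->nhash++;
}

/* Remember a value; when the array is full, spill it into the hash first. */
static void
demo_remember(DemoOwner *owner, uintptr_t value)
{
	assert(value != DEMO_EMPTY && value != DEMO_DELETED);
	if (owner->nrecent == DEMO_ARRAY_SIZE)
	{
		for (int i = 0; i < owner->nrecent; i++)
			demo_hash_insert(owner, owner->recent[i]);
		owner->nrecent = 0;
	}
	owner->recent[owner->nrecent++] = value;
}

/* Forget a value; returns false if it was not remembered. */
static bool
demo_forget(DemoOwner *owner, uintptr_t value)
{
	size_t		slot;

	/* Recently added entries are the likeliest match, so scan backwards. */
	for (int i = owner->nrecent - 1; i >= 0; i--)
	{
		if (owner->recent[i] == value)
		{
			owner->recent[i] = owner->recent[owner->nrecent - 1];
			owner->nrecent--;
			return true;
		}
	}

	/* Otherwise probe the hash; an empty slot ends the probe sequence. */
	slot = demo_slot(value);
	for (int probes = 0; probes < DEMO_HASH_SIZE; probes++)
	{
		if (owner->hash[slot] == DEMO_EMPTY)
			return false;
		if (owner->hash[slot] == value)
		{
			owner->hash[slot] = DEMO_DELETED;
			owner->nhash--;
			return true;
		}
		slot = (slot + 1) & (DEMO_HASH_SIZE - 1);
	}
	return false;
}
```

With a zero-initialized `DemoOwner`, remembering six values leaves the two newest in the array and the four older ones in the hash, so forgetting a recently remembered value never touches the hash at all. That is the property the commit message calls keeping "the access to recent entries fast".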
/* ResourceOwner callbacks to hold catcache references */

static void ResOwnerReleaseCatCache(Datum res);
static char *ResOwnerPrintCatCache(Datum res);
static void ResOwnerReleaseCatCacheList(Datum res);
static char *ResOwnerPrintCatCacheList(Datum res);

static const ResourceOwnerDesc catcache_resowner_desc =
{
	/* catcache references */
	.name = "catcache reference",
	.release_phase = RESOURCE_RELEASE_AFTER_LOCKS,
	.release_priority = RELEASE_PRIO_CATCACHE_REFS,
	.ReleaseResource = ResOwnerReleaseCatCache,
	.DebugPrint = ResOwnerPrintCatCache
};

static const ResourceOwnerDesc catlistref_resowner_desc =
{
	/* catcache-list pins */
	.name = "catcache list reference",
	.release_phase = RESOURCE_RELEASE_AFTER_LOCKS,
	.release_priority = RELEASE_PRIO_CATCACHE_LIST_REFS,
	.ReleaseResource = ResOwnerReleaseCatCacheList,
	.DebugPrint = ResOwnerPrintCatCacheList
};

/* Convenience wrappers over ResourceOwnerRemember/Forget */
static inline void
ResourceOwnerRememberCatCacheRef(ResourceOwner owner, HeapTuple tuple)
{
	ResourceOwnerRemember(owner, PointerGetDatum(tuple), &catcache_resowner_desc);
}
static inline void
ResourceOwnerForgetCatCacheRef(ResourceOwner owner, HeapTuple tuple)
{
	ResourceOwnerForget(owner, PointerGetDatum(tuple), &catcache_resowner_desc);
}
static inline void
ResourceOwnerRememberCatCacheListRef(ResourceOwner owner, CatCList *list)
{
	ResourceOwnerRemember(owner, PointerGetDatum(list), &catlistref_resowner_desc);
}
static inline void
ResourceOwnerForgetCatCacheListRef(ResourceOwner owner, CatCList *list)
{
	ResourceOwnerForget(owner, PointerGetDatum(list), &catlistref_resowner_desc);
}

2003-06-23 00:04:55 +02:00
/*
2017-10-13 22:16:50 +02:00
 * Hash and equality functions for system types that are used as cache key
 * fields.  In some cases, we just call the regular SQL-callable functions for
 * the appropriate data type, but that tends to be a little slow, and the
 * speed of these functions is performance-critical.  Therefore, for data
 * types that frequently occur as catcache keys, we hard-code the logic here.
 * Avoiding the overhead of DirectFunctionCallN(...) is a substantial win, and
 * in certain cases (like int4) we can adopt a faster hash algorithm as well.
2003-06-23 00:04:55 +02:00
|
|
|
*/
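The comment above argues for hard-coding hash and equality logic for common key types instead of going through SQL-callable functions. A minimal standalone sketch of that idea follows. The names prefixed `demo_` are hypothetical; `demo_murmurhash32` is the MurmurHash3 32-bit finalizer, which is assumed here to correspond to what `murmurhash32()` computes, and the power-of-two bucket mask in `demo_bucket` is likewise an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* MurmurHash3 32-bit finalizer: a cheap, invertible mixing step. */
static inline uint32_t
demo_murmurhash32(uint32_t data)
{
	uint32_t	h = data;

	h ^= h >> 16;
	h *= 0x85ebca6b;
	h ^= h >> 13;
	h *= 0xc2b2ae35;
	h ^= h >> 16;
	return h;
}

/* Plain C comparison: no function-call-info setup, no indirection. */
static bool
demo_int4eq(int32_t a, int32_t b)
{
	return a == b;
}

static uint32_t
demo_int4hash(int32_t key)
{
	return demo_murmurhash32((uint32_t) key);
}

/* A 16-bit key is sign-extended to 32 bits before hashing. */
static uint32_t
demo_int2hash(int16_t key)
{
	return demo_murmurhash32((uint32_t) (int32_t) key);
}

/* Map a hash value to a bucket; nbuckets must be a power of two. */
static uint32_t
demo_bucket(uint32_t hash, uint32_t nbuckets)
{
	return hash & (nbuckets - 1);
}
```

Because the finalizer is invertible, two distinct 32-bit keys never produce the same hash value, so collisions can only arise when the hash is reduced to a bucket index.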
|
2017-10-13 22:16:50 +02:00
static bool
chareqfast(Datum a, Datum b)
{
	return DatumGetChar(a) == DatumGetChar(b);
}

static uint32
charhashfast(Datum datum)
{
	return murmurhash32((int32) DatumGetChar(datum));
}

static bool
nameeqfast(Datum a, Datum b)
{
	char	   *ca = NameStr(*DatumGetName(a));
	char	   *cb = NameStr(*DatumGetName(b));

	return strncmp(ca, cb, NAMEDATALEN) == 0;
}

static uint32
namehashfast(Datum datum)
{
	char	   *key = NameStr(*DatumGetName(datum));

	return hash_any((unsigned char *) key, strlen(key));
}

static bool
int2eqfast(Datum a, Datum b)
{
	return DatumGetInt16(a) == DatumGetInt16(b);
}

static uint32
int2hashfast(Datum datum)
{
	return murmurhash32((int32) DatumGetInt16(datum));
}

static bool
int4eqfast(Datum a, Datum b)
{
	return DatumGetInt32(a) == DatumGetInt32(b);
}

static uint32
int4hashfast(Datum datum)
{
	return murmurhash32((int32) DatumGetInt32(datum));
}

static bool
texteqfast(Datum a, Datum b)
{
2019-03-22 12:09:32 +01:00
	/*
	 * The use of DEFAULT_COLLATION_OID is fairly arbitrary here.  We just
	 * want to take the fast "deterministic" path in texteq().
	 */
	return DatumGetBool(DirectFunctionCall2Coll(texteq, DEFAULT_COLLATION_OID, a, b));
2017-10-13 22:16:50 +02:00
}

static uint32
texthashfast(Datum datum)
{
2019-03-22 12:09:32 +01:00
	/* analogously here as in texteqfast() */
	return DatumGetInt32(DirectFunctionCall1Coll(hashtext, DEFAULT_COLLATION_OID, datum));
2017-10-13 22:16:50 +02:00
}

static bool
oidvectoreqfast(Datum a, Datum b)
{
	return DatumGetBool(DirectFunctionCall2(oidvectoreq, a, b));
}

static uint32
oidvectorhashfast(Datum datum)
{
	return DatumGetInt32(DirectFunctionCall1(hashoidvector, datum));
}

/* Lookup support functions for a type. */
2003-06-23 00:04:55 +02:00
static void
2017-10-13 22:16:50 +02:00
GetCCHashEqFuncs(Oid keytype, CCHashFN *hashfunc, RegProcedure *eqfunc, CCFastEqualFN *fasteqfunc)
2000-02-21 04:36:59 +01:00
{
	switch (keytype)
	{
2001-06-18 05:35:07 +02:00
		case BOOLOID:
2017-10-13 22:16:50 +02:00
			*hashfunc = charhashfast;
			*fasteqfunc = chareqfast;
2003-06-23 00:04:55 +02:00
			*eqfunc = F_BOOLEQ;
			break;
2001-06-18 05:35:07 +02:00
		case CHAROID:
2017-10-13 22:16:50 +02:00
			*hashfunc = charhashfast;
			*fasteqfunc = chareqfast;
2003-06-23 00:04:55 +02:00
			*eqfunc = F_CHAREQ;
			break;
2000-02-21 04:36:59 +01:00
		case NAMEOID:
2017-10-13 22:16:50 +02:00
			*hashfunc = namehashfast;
			*fasteqfunc = nameeqfast;
2003-06-23 00:04:55 +02:00
			*eqfunc = F_NAMEEQ;
			break;
2000-02-21 04:36:59 +01:00
		case INT2OID:
2017-10-13 22:16:50 +02:00
			*hashfunc = int2hashfast;
			*fasteqfunc = int2eqfast;
2003-06-23 00:04:55 +02:00
			*eqfunc = F_INT2EQ;
			break;
2000-02-21 04:36:59 +01:00
		case INT4OID:
2017-10-13 22:16:50 +02:00
			*hashfunc = int4hashfast;
			*fasteqfunc = int4eqfast;
2003-06-23 00:04:55 +02:00
			*eqfunc = F_INT4EQ;
			break;
2000-02-21 04:36:59 +01:00
		case TEXTOID:
2017-10-13 22:16:50 +02:00
|
|
|
*hashfunc = texthashfast;
|
|
|
|
*fasteqfunc = texteqfast;
|
2003-06-23 00:04:55 +02:00
|
|
|
*eqfunc = F_TEXTEQ;
|
|
|
|
break;
|
2000-02-21 04:36:59 +01:00
|
|
|
case OIDOID:
|
2002-04-25 04:56:56 +02:00
|
|
|
case REGPROCOID:
|
|
|
|
case REGPROCEDUREOID:
|
|
|
|
case REGOPEROID:
|
|
|
|
case REGOPERATOROID:
|
|
|
|
case REGCLASSOID:
|
|
|
|
case REGTYPEOID:
|
2022-07-17 23:43:28 +02:00
|
|
|
case REGCOLLATIONOID:
|
2007-08-21 03:11:32 +02:00
|
|
|
case REGCONFIGOID:
|
|
|
|
case REGDICTIONARYOID:
|
2015-05-09 19:06:49 +02:00
|
|
|
case REGROLEOID:
|
2015-05-09 19:36:52 +02:00
|
|
|
case REGNAMESPACEOID:
|
2017-10-13 22:16:50 +02:00
|
|
|
*hashfunc = int4hashfast;
|
|
|
|
*fasteqfunc = int4eqfast;
|
2003-06-23 00:04:55 +02:00
|
|
|
*eqfunc = F_OIDEQ;
|
|
|
|
break;
|
2000-02-21 04:36:59 +01:00
|
|
|
case OIDVECTOROID:
|
2017-10-13 22:16:50 +02:00
|
|
|
*hashfunc = oidvectorhashfast;
|
|
|
|
*fasteqfunc = oidvectoreqfast;
|
2003-06-23 00:04:55 +02:00
|
|
|
*eqfunc = F_OIDVECTOREQ;
|
|
|
|
break;
|
2000-02-21 04:36:59 +01:00
|
|
|
default:
|
2003-07-25 22:18:01 +02:00
|
|
|
elog(FATAL, "type %u not supported as catcache key", keytype);
|
2005-09-25 00:54:44 +02:00
|
|
|
*hashfunc = NULL; /* keep compiler quiet */
|
2009-06-11 16:49:15 +02:00
|
|
|
|
2005-09-25 00:54:44 +02:00
|
|
|
*eqfunc = InvalidOid;
|
2003-06-23 00:04:55 +02:00
|
|
|
break;
|
2000-02-21 04:36:59 +01:00
|
|
|
}
|
|
|
|
}
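The switch above picks a hash function and an equality function per key type, with all OID-like types sharing the int4 pair. A minimal standalone sketch of that dispatch pattern follows. Every name in it (`KeyType`, `KeyValue`, `select_key_funcs`, and the hash and equality helpers) is hypothetical and is not part of PostgreSQL; the hash functions are simple stand-ins, and an unsupported type is reported through a `false` return rather than `elog(FATAL, ...)`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical key types; the real code switches on type OIDs. */
typedef enum { KEY_INT4, KEY_TEXT } KeyType;

/* Hypothetical stand-in for Datum. */
typedef union { int32_t i; const char *s; } KeyValue;

typedef uint32_t (*KeyHashFn) (KeyValue);
typedef bool (*KeyEqFn) (KeyValue, KeyValue);

/* Illustrative integer mixer, not PostgreSQL's int4hashfast. */
static uint32_t
hash_int4(KeyValue v)
{
    uint32_t h = (uint32_t) v.i;

    h ^= h >> 16;
    h *= 0x85ebca6bU;
    h ^= h >> 13;
    return h;
}

static bool
eq_int4(KeyValue a, KeyValue b)
{
    return a.i == b.i;
}

/* FNV-1a over a NUL-terminated string, standing in for texthashfast. */
static uint32_t
hash_text(KeyValue v)
{
    uint32_t h = 2166136261u;

    for (const unsigned char *p = (const unsigned char *) v.s; *p; p++)
    {
        h ^= *p;
        h *= 16777619u;
    }
    return h;
}

static bool
eq_text(KeyValue a, KeyValue b)
{
    return strcmp(a.s, b.s) == 0;
}

/* Fill in the function pair for a key type; false if unsupported. */
static bool
select_key_funcs(KeyType type, KeyHashFn *hashfn, KeyEqFn *eqfn)
{
    assert(hashfn != NULL && eqfn != NULL);

    switch (type)
    {
        case KEY_INT4:
            *hashfn = hash_int4;
            *eqfn = eq_int4;
            return true;
        case KEY_TEXT:
            *hashfn = hash_text;
            *eqfn = eq_text;
            return true;
    }
    *hashfn = NULL;             /* unsupported key type */
    *eqfn = NULL;
    return false;
}
```

Resolving the function pointers once, at cache setup, keeps the per-lookup path free of type switches; that is the design the listing above appears to follow.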
|
|
|
|
|
2001-02-22 19:39:20 +01:00
|
|
|
/*
|
2002-03-03 18:47:56 +01:00
|
|
|
* CatalogCacheComputeHashValue
|
2000-11-10 01:33:12 +01:00
|
|
|
*
|
2002-03-03 18:47:56 +01:00
|
|
|
* Compute the hash value associated with a given set of lookup keys
|
1996-07-09 08:22:35 +02:00
|
|
|
*/
|
2002-03-03 18:47:56 +01:00
|
|
|
static uint32
|
2017-10-13 22:16:50 +02:00
|
|
|
CatalogCacheComputeHashValue(CatCache *cache, int nkeys,
|
|
|
|
Datum v1, Datum v2, Datum v3, Datum v4)
|
1996-07-09 08:22:35 +02:00
|
|
|
{
|
2002-03-03 18:47:56 +01:00
|
|
|
uint32 hashValue = 0;
|
2007-04-21 06:49:20 +02:00
|
|
|
uint32 oneHash;
|
2017-10-13 22:16:50 +02:00
|
|
|
CCHashFN *cc_hashfunc = cache->cc_hashfunc;
|
2000-11-10 01:33:12 +01:00
|
|
|
|
2019-02-18 12:32:34 +01:00
|
|
|
CACHE_elog(DEBUG2, "CatalogCacheComputeHashValue %s %d %p",
|
|
|
|
cache->cc_relname, nkeys, cache);
|
1997-09-07 07:04:48 +02:00
|
|
|
|
2002-04-06 08:59:25 +02:00
|
|
|
switch (nkeys)
|
1996-07-09 08:22:35 +02:00
|
|
|
{
|
|
|
|
case 4:
|
2017-10-13 22:16:50 +02:00
|
|
|
oneHash = (cc_hashfunc[3]) (v4);
|
2022-02-20 07:22:08 +01:00
|
|
|
hashValue ^= pg_rotate_left32(oneHash, 24);
|
1996-07-09 08:22:35 +02:00
|
|
|
/* FALLTHROUGH */
|
|
|
|
case 3:
|
2017-10-13 22:16:50 +02:00
|
|
|
oneHash = (cc_hashfunc[2]) (v3);
|
2022-02-20 07:22:08 +01:00
|
|
|
hashValue ^= pg_rotate_left32(oneHash, 16);
|
1996-07-09 08:22:35 +02:00
|
|
|
/* FALLTHROUGH */
|
|
|
|
case 2:
|
2017-10-13 22:16:50 +02:00
|
|
|
oneHash = (cc_hashfunc[1]) (v2);
|
2022-02-20 07:22:08 +01:00
|
|
|
hashValue ^= pg_rotate_left32(oneHash, 8);
|
1996-07-09 08:22:35 +02:00
|
|
|
/* FALLTHROUGH */
|
|
|
|
case 1:
|
2017-10-13 22:16:50 +02:00
|
|
|
oneHash = (cc_hashfunc[0]) (v1);
|
2007-04-21 06:49:20 +02:00
|
|
|
hashValue ^= oneHash;
|
1996-07-09 08:22:35 +02:00
|
|
|
break;
|
|
|
|
default:
|
2003-07-25 22:18:01 +02:00
|
|
|
elog(FATAL, "wrong number of hash keys: %d", nkeys);
|
1996-07-09 08:22:35 +02:00
|
|
|
break;
|
|
|
|
}
|
2002-03-03 18:47:56 +01:00
|
|
|
|
|
|
|
return hashValue;
|
1996-07-09 08:22:35 +02:00
|
|
|
}
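CatalogCacheComputeHashValue combines the per-key hashes by rotating the hash of key 2 left by 8 bits, key 3 by 16, and key 4 by 24, then XORing everything together, so the same values in a different key order produce a different result. A minimal standalone sketch of that combination step follows. The names `rotl32` and `combine_key_hashes` are hypothetical; the real function calls `pg_rotate_left32` and hashes each Datum through `cc_hashfunc` first, whereas this sketch takes already-computed hashes.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit word left by n bits; n must be in 1..31. */
static inline uint32_t
rotl32(uint32_t word, int n)
{
    return (word << n) | (word >> (32 - n));
}

/*
 * Combine up to four per-key hashes into one value, using the same
 * fall-through structure as the listing above.
 */
static uint32_t
combine_key_hashes(const uint32_t *hashes, int nkeys)
{
    uint32_t value = 0;

    assert(nkeys >= 1 && nkeys <= 4);

    switch (nkeys)
    {
        case 4:
            value ^= rotl32(hashes[3], 24);
            /* FALLTHROUGH */
        case 3:
            value ^= rotl32(hashes[2], 16);
            /* FALLTHROUGH */
        case 2:
            value ^= rotl32(hashes[1], 8);
            /* FALLTHROUGH */
        case 1:
            value ^= hashes[0];
            break;
    }
    return value;
}
```

For example, per-key hashes `{1, 2}` combine to `0x201`, while `{2, 1}` combine to `0x102`; a plain XOR without rotation would give `3` in both cases.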
|
|
|
|
|
2001-02-22 19:39:20 +01:00
|
|
|
/*
|
2002-03-03 18:47:56 +01:00
|
|
|
* CatalogCacheComputeTupleHashValue
|
|
|
|
*
|
|
|
|
* Compute the hash value associated with a given tuple to be cached
|
1996-07-09 08:22:35 +02:00
|
|
|
*/
|
2002-03-03 18:47:56 +01:00
|
|
|
static uint32
|
2017-10-13 22:16:50 +02:00
|
|
|
CatalogCacheComputeTupleHashValue(CatCache *cache, int nkeys, HeapTuple tuple)
|
1996-07-09 08:22:35 +02:00
|
|
|
{
|
2017-10-13 22:16:50 +02:00
|
|
|
Datum v1 = 0,
|
|
|
|
v2 = 0,
|
|
|
|
v3 = 0,
|
|
|
|
v4 = 0;
|
2000-01-31 05:35:57 +01:00
|
|
|
bool isNull = false;
|
2017-10-13 22:16:50 +02:00
|
|
|
int *cc_keyno = cache->cc_keyno;
|
|
|
|
TupleDesc cc_tupdesc = cache->cc_tupdesc;
|
2000-01-31 05:35:57 +01:00
|
|
|
|
2000-11-10 01:33:12 +01:00
|
|
|
/* Now extract key fields from tuple, insert into scankey */
|
2017-10-13 22:16:50 +02:00
|
|
|
switch (nkeys)
|
1996-07-09 08:22:35 +02:00
|
|
|
{
|
|
|
|
case 4:
|
Remove WITH OIDS support, change oid catalog column visibility.
Previously tables declared WITH OIDS, including a significant fraction
of the catalog tables, stored the oid column not as a normal column,
but as part of the tuple header.
This special column was not shown by default, which was somewhat odd,
as it's often (consider e.g. pg_class.oid) one of the more important
parts of a row. Neither pg_dump nor COPY included the contents of the
oid column by default.
The fact that the oid column was not an ordinary column necessitated a
significant amount of special case code to support oid columns. That
already was painful for the existing, but upcoming work aiming to make
table storage pluggable, would have required expanding and duplicating
that "specialness" significantly.
WITH OIDS has been deprecated since 2005 (commit ff02d0a05280e0).
Remove it.
Removing includes:
- CREATE TABLE and ALTER TABLE syntax for declaring the table to be
WITH OIDS has been removed (WITH (oids[ = true]) will error out)
- pg_dump does not support dumping tables declared WITH OIDS and will
issue a warning when dumping one (and ignore the oid column).
- restoring a pg_dump archive with pg_restore will warn when
restoring a table with oid contents (and ignore the oid column)
- COPY will refuse to load binary dump that includes oids.
- pg_upgrade will error out when encountering tables declared WITH
OIDS, they have to be altered to remove the oid column first.
- Functionality to access the oid of the last inserted row (like
plpgsql's RESULT_OID, spi's SPI_lastoid, ...) has been removed.
The syntax for declaring a table WITHOUT OIDS (or WITH (oids = false)
for CREATE TABLE) is still supported. While that requires a bit of
support code, it seems unnecessary to break applications / dumps that
do not use oids, and are explicit about not using them.
The biggest user of WITH OID columns was postgres' catalog. This
commit changes all 'magic' oid columns to be columns that are normally
declared and stored. To reduce unnecessary query breakage all the
newly added columns are still named 'oid', even if a table's column
naming scheme would indicate 'reloid' or such. This obviously
requires adapting a lot of code, mostly replacing oid access via
HeapTupleGetOid() with access to the underlying Form_pg_*->oid column.
The bootstrap process now assigns oids for all oid columns in
genbki.pl that do not have an explicit value (starting at the largest
oid previously used), only oids assigned later by oids will be above
FirstBootstrapObjectId. As the oid column now is a normal column the
special bootstrap syntax for oids has been removed.
Oids are not automatically assigned during insertion anymore, all
backend code explicitly assigns oids with GetNewOidWithIndex(). For
the rare case that insertions into the catalog via SQL are called for
the new pg_nextoid() function can be used (which only works on catalog
tables).
The fact that oid columns on system tables are now normal columns
means that they will be included in the set of columns expanded
by * (i.e. SELECT * FROM pg_class will now include the table's oid,
previously it did not). It'd not technically be hard to hide oid
column by default, but that'd mean confusing behavior would either
have to be carried forward forever, or it'd cause breakage down the
line.
While it's not unlikely that further adjustments are needed, the
scope/invasiveness of the patch makes it worthwhile to merge this
now. It's painful to maintain externally, too complicated to commit
after the code freeze, and a dependency of a number of other
patches.
Catversion bump, for obvious reasons.
Author: Andres Freund, with contributions by John Naylor
Discussion: https://postgr.es/m/20180930034810.ywp2c7awz7opzcfr@alap3.anarazel.de
2018-11-21 00:36:57 +01:00
			v4 = fastgetattr(tuple,
							 cc_keyno[3],
							 cc_tupdesc,
							 &isNull);
1996-07-09 08:22:35 +02:00
			Assert(!isNull);
			/* FALLTHROUGH */
		case 3:
2018-11-21 00:36:57 +01:00
			v3 = fastgetattr(tuple,
							 cc_keyno[2],
							 cc_tupdesc,
							 &isNull);
1996-07-09 08:22:35 +02:00
			Assert(!isNull);
			/* FALLTHROUGH */
		case 2:
2018-11-21 00:36:57 +01:00
			v2 = fastgetattr(tuple,
							 cc_keyno[1],
							 cc_tupdesc,
							 &isNull);
1996-07-09 08:22:35 +02:00
			Assert(!isNull);
			/* FALLTHROUGH */
		case 1:
2018-11-21 00:36:57 +01:00
			v1 = fastgetattr(tuple,
							 cc_keyno[0],
							 cc_tupdesc,
							 &isNull);
1996-07-09 08:22:35 +02:00
			Assert(!isNull);
			break;
		default:
Improve sys/catcache performance.
The following are the individual improvements:
1) Avoidance of FunctionCallInfo based function calls, replaced by
more efficient functions with a native C argument interface.
2) Don't extract columns from a cache entry's tuple whenever matching
entries - instead store them as a Datum array. This also gets rid of
having to build dummy tuples for negative & list entries, and of a
hack for dealing with cstring vs. text weirdness.
3) Reorder members of catcache.h struct, so important entries are more
likely to be on one cacheline.
4) Allowing the compiler to specialize the critical SearchCatCache for
a specific number of attributes lets it unroll loops and avoid other
nkeys-dependent initialization.
5) Only initializing the ScanKey when necessary, i.e. on catcache
misses, greatly reduces unnecessary CPU cache misses.
6) Split off the cache-miss case from the hash lookup, reducing stack
allocations etc. in the common case.
7) CatCTup and their corresponding heaptuple are allocated in one
piece.
This results in making cache lookups themselves roughly three times as
fast - full-system benchmarks obviously improve less than that.
I've also evaluated further techniques:
- replace open coded hash with simplehash - the list walk right now
shows up in profiles. Unfortunately it's not easy to do so safely as
an entry's memory location can change at various times, which
doesn't work well with the refcounting and cache invalidation.
- Cacheline-aligning CatCTup entries - helps some with performance,
but the win isn't big and the code for it is ugly, because the
tuples have to be freed as well.
- add more proper functions, rather than macros for
SearchSysCacheCopyN etc., but right now they don't show up in
profiles.
The reason the macro wrappers in syscache.c/h have to be changed,
rather than just catcache, is that doing otherwise would require
exposing the SysCache array to the outside. That might be a good idea
anyway, but it's for another day.
Author: Andres Freund
Reviewed-By: Robert Haas
Discussion: https://postgr.es/m/20170914061207.zxotvyopetm7lrrp@alap3.anarazel.de
2017-10-13 22:16:50 +02:00
			elog(FATAL, "wrong number of hash keys: %d", nkeys);
1996-07-09 08:22:35 +02:00
			break;
	}
1997-09-07 07:04:48 +02:00

2017-10-13 22:16:50 +02:00
	return CatalogCacheComputeHashValue(cache, nkeys, v1, v2, v3, v4);
}
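
The fallthrough switch above can be modelled outside the backend. The following is a minimal sketch, not PostgreSQL code: `DemoTuple` and `extract_keys` are hypothetical names, a plain array stands in for the heap tuple, and an error return replaces `elog(FATAL, ...)`. It only illustrates how entering the switch at `case nkeys` fills exactly the first `nkeys` slots.

```c
#include <stdint.h>

#define MAX_KEYS 4

/*
 * Hypothetical stand-in for a tuple: attribute values indexed by attribute
 * number (1-based, like the entries of cc_keyno).  Not a PostgreSQL type.
 */
typedef struct DemoTuple
{
	const uintptr_t *attrs;		/* attrs[attno - 1] holds the value */
} DemoTuple;

/*
 * Copy the first nkeys key columns of tup into out[0..3], leaving unused
 * slots zero.  Returns 0 on success, -1 if nkeys is out of range.
 */
static int
extract_keys(const DemoTuple *tup, const int *keyno, int nkeys,
			 uintptr_t out[MAX_KEYS])
{
	out[0] = out[1] = out[2] = out[3] = 0;

	switch (nkeys)
	{
		case 4:
			out[3] = tup->attrs[keyno[3] - 1];
			/* FALLTHROUGH */
		case 3:
			out[2] = tup->attrs[keyno[2] - 1];
			/* FALLTHROUGH */
		case 2:
			out[1] = tup->attrs[keyno[1] - 1];
			/* FALLTHROUGH */
		case 1:
			out[0] = tup->attrs[keyno[0] - 1];
			break;
		default:
			return -1;			/* the code above raises elog(FATAL) here */
	}
	return 0;
}
```

Because the unused slots are always zeroed, a caller can pass all four values on to a hash function regardless of `nkeys`, which is the shape of the `CatalogCacheComputeHashValue(cache, nkeys, v1, v2, v3, v4)` call above.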

/*
 *		CatalogCacheCompareTuple
 *
 * Compare a tuple to the passed arguments.
 */
static inline bool
CatalogCacheCompareTuple(const CatCache *cache, int nkeys,
						 const Datum *cachekeys,
						 const Datum *searchkeys)
{
	const CCFastEqualFN *cc_fastequal = cache->cc_fastequal;
	int			i;

	for (i = 0; i < nkeys; i++)
	{
		if (!(cc_fastequal[i]) (cachekeys[i], searchkeys[i]))
			return false;
	}
	return true;
1996-07-09 08:22:35 +02:00
}
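
The comparison loop above relies on one equality function per key column. Below is a small self-contained sketch of that pattern, under the assumption that keys are either plain integers or C strings; `DemoDatum`, `DemoEqualFN` and the `demo_*` functions are illustrative names and do not exist in PostgreSQL.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-ins for Datum and CCFastEqualFN (assumptions). */
typedef uintptr_t DemoDatum;
typedef bool (*DemoEqualFN) (DemoDatum a, DemoDatum b);

/* Equality for pass-by-value keys such as integers or oids. */
static bool
demo_int_eq(DemoDatum a, DemoDatum b)
{
	return a == b;
}

/* Equality for keys that are pointers to NUL-terminated strings. */
static bool
demo_cstring_eq(DemoDatum a, DemoDatum b)
{
	return strcmp((const char *) a, (const char *) b) == 0;
}

/* Return true only if every one of the nkeys columns compares equal. */
static bool
demo_compare_keys(const DemoEqualFN *eqfns, int nkeys,
				  const DemoDatum *cachekeys, const DemoDatum *searchkeys)
{
	for (int i = 0; i < nkeys; i++)
	{
		if (!eqfns[i](cachekeys[i], searchkeys[i]))
			return false;		/* stop at the first mismatch */
	}
	return true;
}
```

Keeping the function pointers in an array indexed by key position means the loop body stays identical for every cache; only the array contents differ per catalog.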

2002-03-03 18:47:56 +01:00

#ifdef CATCACHE_STATS

static void
2006-06-15 04:08:09 +02:00
CatCachePrintStats(int code, Datum arg)
2002-03-03 18:47:56 +01:00
{
2012-10-16 22:36:30 +02:00
	slist_iter	iter;
2002-03-03 18:47:56 +01:00
	long		cc_searches = 0;
	long		cc_hits = 0;
	long		cc_neg_hits = 0;
	long		cc_newloads = 0;
	long		cc_invals = 0;
2002-04-06 08:59:25 +02:00
	long		cc_lsearches = 0;
	long		cc_lhits = 0;
2002-03-03 18:47:56 +01:00

2012-10-16 22:36:30 +02:00
	slist_foreach(iter, &CacheHdr->ch_caches)
2002-03-03 18:47:56 +01:00
	{
2012-10-16 22:36:30 +02:00
		CatCache   *cache = slist_container(CatCache, cc_next, iter.cur);

2002-03-03 18:47:56 +01:00
		if (cache->cc_ntup == 0 && cache->cc_searches == 0)
			continue;			/* don't print unused caches */
2006-06-15 04:08:09 +02:00
		elog(DEBUG2, "catcache %s/%u: %d tup, %ld srch, %ld+%ld=%ld hits, %ld+%ld=%ld loads, %ld invals, %ld lsrch, %ld lhits",
2002-03-03 18:47:56 +01:00
			 cache->cc_relname,
2005-04-14 22:03:27 +02:00
			 cache->cc_indexoid,
2002-03-03 18:47:56 +01:00
			 cache->cc_ntup,
			 cache->cc_searches,
			 cache->cc_hits,
			 cache->cc_neg_hits,
			 cache->cc_hits + cache->cc_neg_hits,
			 cache->cc_newloads,
			 cache->cc_searches - cache->cc_hits - cache->cc_neg_hits - cache->cc_newloads,
			 cache->cc_searches - cache->cc_hits - cache->cc_neg_hits,
			 cache->cc_invals,
2002-04-06 08:59:25 +02:00
			 cache->cc_lsearches,
			 cache->cc_lhits);
2002-03-03 18:47:56 +01:00
		cc_searches += cache->cc_searches;
		cc_hits += cache->cc_hits;
		cc_neg_hits += cache->cc_neg_hits;
		cc_newloads += cache->cc_newloads;
		cc_invals += cache->cc_invals;
2002-04-06 08:59:25 +02:00
		cc_lsearches += cache->cc_lsearches;
		cc_lhits += cache->cc_lhits;
2002-03-03 18:47:56 +01:00
	}
2006-06-15 04:08:09 +02:00
	elog(DEBUG2, "catcache totals: %d tup, %ld srch, %ld+%ld=%ld hits, %ld+%ld=%ld loads, %ld invals, %ld lsrch, %ld lhits",
2002-03-03 18:47:56 +01:00
		 CacheHdr->ch_ntup,
		 cc_searches,
		 cc_hits,
		 cc_neg_hits,
		 cc_hits + cc_neg_hits,
		 cc_newloads,
		 cc_searches - cc_hits - cc_neg_hits - cc_newloads,
		 cc_searches - cc_hits - cc_neg_hits,
		 cc_invals,
2002-04-06 08:59:25 +02:00
		 cc_lsearches,
		 cc_lhits);
2002-03-03 18:47:56 +01:00
}
#endif							/* CATCACHE_STATS */
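
The two `%ld+%ld=%ld` groups in the messages above are derived from four raw counters. The sketch below spells out that arithmetic with hypothetical names (`DemoCacheStats`, `demo_*`). Describing the remainder as "loads that produced no new positive entry" is an interpretation of the format string, not something the code above states.

```c
/* Hypothetical subset of the per-cache counters kept under CATCACHE_STATS. */
typedef struct DemoCacheStats
{
	long		searches;		/* all lookups */
	long		hits;			/* lookups answered by a positive entry */
	long		neg_hits;		/* lookups answered by a negative entry */
	long		newloads;		/* lookups that loaded a new entry */
} DemoCacheStats;

/* First "a+b=c" group: positive hits + negative hits. */
static long
demo_total_hits(const DemoCacheStats *s)
{
	return s->hits + s->neg_hits;
}

/* Sum of the second group: every search that was not a cache hit. */
static long
demo_total_loads(const DemoCacheStats *s)
{
	return s->searches - s->hits - s->neg_hits;
}

/* Second term of the second group: misses not counted in newloads. */
static long
demo_other_loads(const DemoCacheStats *s)
{
	return demo_total_loads(s) - s->newloads;
}
```

For example, with 100 searches, 70 hits, 10 negative hits and 15 new loads, the message would read `70+10=80 hits, 15+5=20 loads`.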


2001-02-22 19:39:20 +01:00
/*
1996-07-09 08:22:35 +02:00
 *		CatCacheRemoveCTup
2002-03-03 18:47:56 +01:00
 *
 * Unlink and delete the given cache entry
2002-04-06 08:59:25 +02:00
 *
 * NB: if it is a member of a CatCList, the CatCList is deleted too.
2005-08-14 00:18:07 +02:00
 * Both the cache entry and the list had better have zero refcount.
1996-07-09 08:22:35 +02:00
 */
1997-08-19 23:40:56 +02:00
static void
2000-11-16 23:30:52 +01:00
CatCacheRemoveCTup(CatCache *cache, CatCTup *ct)
1996-07-09 08:22:35 +02:00
{
2000-11-16 23:30:52 +01:00
	Assert(ct->refcount == 0);
2001-06-18 05:35:07 +02:00
	Assert(ct->my_cache == cache);
2000-01-31 05:35:57 +01:00

2002-04-06 08:59:25 +02:00
	if (ct->c_list)
2006-01-07 22:16:10 +01:00
	{
		/*
		 * The cleanest way to handle this is to call CatCacheRemoveCList,
		 * which will recurse back to me, and the recursive call will do the
		 * work.  Set the "dead" flag to make sure it does recurse.
		 */
		ct->dead = true;
2002-04-06 08:59:25 +02:00
		CatCacheRemoveCList(cache, ct->c_list);
2006-01-07 22:16:10 +01:00
		return;					/* nothing left to do */
	}
2002-04-06 08:59:25 +02:00

2012-10-19 01:30:43 +02:00
	/* delink from linked list */
2012-10-19 01:04:20 +02:00
	dlist_delete(&ct->cache_elem);
2000-01-31 05:35:57 +01:00

2017-10-13 22:16:50 +02:00
	/*
	 * Free keys when we're dealing with a negative entry, normal entries just
	 * point into tuple, allocated together with the CatCTup.
	 */
	if (ct->negative)
		CatCacheFreeKeys(cache->cc_tupdesc, cache->cc_nkeys,
						 cache->cc_keyno, ct->keys);

2000-01-31 05:35:57 +01:00
	pfree(ct);

1996-07-09 08:22:35 +02:00
	--cache->cc_ntup;
2001-06-18 05:35:07 +02:00
	--CacheHdr->ch_ntup;
1996-07-09 08:22:35 +02:00
}

2002-04-06 08:59:25 +02:00
/*
 *		CatCacheRemoveCList
 *
 * Unlink and delete the given cache list entry
2006-01-07 22:16:10 +01:00
 *
 * NB: any dead member entries that become unreferenced are deleted too.
2002-04-06 08:59:25 +02:00
 */
static void
CatCacheRemoveCList(CatCache *cache, CatCList *cl)
{
	int			i;

	Assert(cl->refcount == 0);
	Assert(cl->my_cache == cache);

	/* delink from member tuples */
	for (i = cl->n_members; --i >= 0;)
	{
		CatCTup    *ct = cl->members[i];

		Assert(ct->c_list == cl);
		ct->c_list = NULL;
2006-01-07 22:16:10 +01:00
		/* if the member is dead and now has no references, remove it */
		if (
#ifndef CATCACHE_FORCE_RELEASE
			ct->dead &&
#endif
			ct->refcount == 0)
			CatCacheRemoveCTup(cache, ct);
2002-04-06 08:59:25 +02:00
	}

	/* delink from linked list */
2012-10-19 01:04:20 +02:00
	dlist_delete(&cl->cache_elem);
2002-04-06 08:59:25 +02:00

2017-10-13 22:16:50 +02:00
	/* free associated column data */
	CatCacheFreeKeys(cache->cc_tupdesc, cl->nkeys,
					 cache->cc_keyno, cl->keys);

2002-04-06 08:59:25 +02:00
	pfree(cl);
}
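
CatCacheRemoveCTup and CatCacheRemoveCList call each other, and the "dead" flag is what makes that recursion finish the job. The following is a stripped-down model of the hand-off, assuming simplified structs, `malloc`-style allocation, and no hash-bucket lists or key freeing. All `Demo*`/`demo_*` names are hypothetical, and the two counters exist only so the effect can be observed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct DemoList DemoList;

typedef struct DemoEntry
{
	int			refcount;
	bool		dead;
	DemoList   *list;			/* owning list, or NULL */
} DemoEntry;

struct DemoList
{
	int			refcount;
	int			n_members;
	DemoEntry  *members[4];
};

static int	demo_freed_entries = 0;
static int	demo_freed_lists = 0;

static void demo_remove_list(DemoList *cl);

/* Free an entry; if it belongs to a list, the list goes away first. */
static void
demo_remove_entry(DemoEntry *ct)
{
	assert(ct->refcount == 0);

	if (ct->list)
	{
		/* Mark dead so that demo_remove_list() recurses back to free us. */
		ct->dead = true;
		demo_remove_list(ct->list);
		return;
	}

	free(ct);
	demo_freed_entries++;
}

/* Free a list, detaching all members and freeing the dead, unpinned ones. */
static void
demo_remove_list(DemoList *cl)
{
	assert(cl->refcount == 0);

	for (int i = cl->n_members; --i >= 0;)
	{
		DemoEntry  *ct = cl->members[i];

		ct->list = NULL;		/* detach first, so the recursion terminates */
		if (ct->dead && ct->refcount == 0)
			demo_remove_entry(ct);
	}

	free(cl);
	demo_freed_lists++;
}
```

Removing one member of a two-member list in this model frees that member and the list, while the other member survives with its `list` pointer cleared, which matches the "NB" notes in the two header comments above.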


2001-02-22 19:39:20 +01:00
/*
2017-05-13 00:17:29 +02:00
 *	CatCacheInvalidate
2002-03-03 18:47:56 +01:00
 *
Forget about targeting catalog cache invalidations by tuple TID.
The TID isn't stable enough: we might queue an sinval event before a VACUUM
FULL, and then process it afterwards, when the target tuple no longer has
the same TID. So we must invalidate entries on the basis of hash value
only. The old coding can be shown to result in various bizarre,
hard-to-reproduce errors in the presence of concurrent VACUUM FULLs on
system catalogs, and could easily result in permanent catalog corruption,
up to and including complete loss of tables.
This commit is just a minimal fix that removes the unsafe comparison.
We should remove transmission of the tuple TID from sinval messages
altogether, and then arrange to suppress the extra message in the common
case of a heap_update that doesn't change the key hashvalue. But that's
going to be much more invasive, and will only produce a probably-marginal
performance gain, so it doesn't seem like material for a back-patch.
Back-patch to 9.0. Before that, VACUUM FULL refused to do any tuple moving
if it found any INSERT_IN_PROGRESS or DELETE_IN_PROGRESS tuples (and
CLUSTER would give up altogether), so there was no risk of moving a tuple
that might be the subject of an unsent sinval message.
2011-08-16 21:26:22 +02:00
 *	Invalidate entries in the specified cache, given a hash value.
1996-07-09 08:22:35 +02:00
 *
2011-08-16 21:26:22 +02:00
|
|
|
* We delete cache entries that match the hash value, whether positive
|
|
|
|
* or negative. We don't care whether the invalidation is the result
|
|
|
|
* of a tuple insertion or a deletion.
|
|
|
|
*
|
|
|
|
* We used to try to match positive cache entries by TID, but that is
|
|
|
|
* unsafe after a VACUUM FULL on a system catalog: an inval event could
|
|
|
|
* be queued before VACUUM FULL, and then processed afterwards, when the
|
|
|
|
* target tuple that has to be invalidated has a different TID than it
|
|
|
|
* did when the event was created. So now we just compare hash values and
|
|
|
|
* accept the small risk of unnecessary invalidations due to false matches.
|
2002-03-03 18:47:56 +01:00
|
|
|
*
|
|
|
|
* This routine is only quasi-public: it should only be used by inval.c.
|
1996-07-09 08:22:35 +02:00
|
|
|
*/
|
|
|
|
void
|
2017-05-13 00:17:29 +02:00
|
|
|
CatCacheInvalidate(CatCache *cache, uint32 hashValue)
|
1996-07-09 08:22:35 +02:00
|
|
|
{
|
2017-05-13 00:17:29 +02:00
|
|
|
Index hashIndex;
|
|
|
|
dlist_mutable_iter iter;
|
1997-09-07 07:04:48 +02:00
|
|
|
|
2019-02-18 12:32:34 +01:00
|
|
|
CACHE_elog(DEBUG2, "CatCacheInvalidate: called");
|
1997-09-07 07:04:48 +02:00
|
|
|
|
2001-02-22 19:39:20 +01:00
|
|
|
/*
|
2017-05-13 00:17:29 +02:00
|
|
|
* We don't bother to check whether the cache has finished initialization
|
|
|
|
* yet; if not, there will be no entries in it so no problem.
|
1997-09-07 07:04:48 +02:00
|
|
|
*/
|
2001-03-22 05:01:46 +01:00
|
|
|
|
2017-05-13 00:17:29 +02:00
|
|
|
/*
|
|
|
|
* Invalidate *all* CatCLists in this cache; it's too hard to tell which
|
|
|
|
* searches might still be correct, so just zap 'em all.
|
|
|
|
*/
|
|
|
|
dlist_foreach_modify(iter, &cache->cc_lists)
|
|
|
|
{
|
|
|
|
CatCList *cl = dlist_container(CatCList, cache_elem, iter.cur);
|
2001-06-18 05:35:07 +02:00
|
|
|
|
2017-05-13 00:17:29 +02:00
|
|
|
if (cl->refcount > 0)
|
|
|
|
cl->dead = true;
|
|
|
|
else
|
|
|
|
CatCacheRemoveCList(cache, cl);
|
|
|
|
}
|
2002-04-06 08:59:25 +02:00
|
|
|
|
2017-05-13 00:17:29 +02:00
|
|
|
/*
|
|
|
|
* inspect the proper hash bucket for tuple matches
|
|
|
|
*/
|
|
|
|
hashIndex = HASH_INDEX(hashValue, cache->cc_nbuckets);
|
|
|
|
dlist_foreach_modify(iter, &cache->cc_bucket[hashIndex])
|
|
|
|
{
|
|
|
|
CatCTup *ct = dlist_container(CatCTup, cache_elem, iter.cur);
|
2002-04-06 08:59:25 +02:00
|
|
|
|
2017-05-13 00:17:29 +02:00
|
|
|
if (hashValue == ct->hash_value)
|
1997-09-07 07:04:48 +02:00
|
|
|
{
|
2017-05-13 00:17:29 +02:00
|
|
|
if (ct->refcount > 0 ||
|
|
|
|
(ct->c_list && ct->c_list->refcount > 0))
|
2000-11-16 23:30:52 +01:00
|
|
|
{
|
2017-05-13 00:17:29 +02:00
|
|
|
ct->dead = true;
|
|
|
|
/* list, if any, was marked dead above */
|
|
|
|
Assert(ct->c_list == NULL || ct->c_list->dead);
|
|
|
|
}
|
|
|
|
else
|
|
|
|
CatCacheRemoveCTup(cache, ct);
|
2019-02-18 12:32:34 +01:00
|
|
|
CACHE_elog(DEBUG2, "CatCacheInvalidate: invalidated");
|
2002-03-03 18:47:56 +01:00
|
|
|
#ifdef CATCACHE_STATS
|
2017-05-13 00:17:29 +02:00
|
|
|
cache->cc_invals++;
|
2002-03-03 18:47:56 +01:00
|
|
|
#endif
|
2017-05-13 00:17:29 +02:00
|
|
|
/* could be multiple matches, so keep looking! */
|
1997-09-07 07:04:48 +02:00
|
|
|
}
|
1996-07-09 08:22:35 +02:00
|
|
|
}
|
|
|
|
}
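The invalidation rule used by CatCacheInvalidate (match on hash value only, mark pinned entries dead, remove unpinned ones, keep scanning for further matches) can be sketched in isolation. This is a minimal illustration, not the PostgreSQL code: `ToyEntry` and `toy_invalidate` are hypothetical names, and a flat array stands in for the dlist hash bucket.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for CatCTup; not the PostgreSQL struct. */
typedef struct ToyEntry
{
	uint32_t	hash_value;		/* hash of the entry's lookup keys */
	int			refcount;		/* number of active users */
	bool		dead;			/* invalidated while still referenced */
	bool		present;		/* false once the slot has been removed */
} ToyEntry;

/*
 * Invalidate every entry whose hash value matches.  Returns the number of
 * entries touched (either marked dead or removed).
 */
static int
toy_invalidate(ToyEntry *entries, size_t n, uint32_t hash_value)
{
	int			touched = 0;

	for (size_t i = 0; i < n; i++)
	{
		ToyEntry   *e = &entries[i];

		if (!e->present || e->hash_value != hash_value)
			continue;

		if (e->refcount > 0)
			e->dead = true;		/* still pinned: defer the removal */
		else
			e->present = false; /* unpinned: remove immediately */

		touched++;
		/* keep scanning: several entries may share one hash value */
	}
	return touched;
}
```

Because only hash values are compared, an unrelated entry with a colliding hash is also invalidated; that costs a reload but never yields a stale entry.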
|
|
|
|
|
|
|
|
/* ----------------------------------------------------------------
|
|
|
|
* public functions
|
|
|
|
* ----------------------------------------------------------------
|
|
|
|
*/
|
2000-11-16 23:30:52 +01:00
|
|
|
|
|
|
|
|
2002-03-03 18:47:56 +01:00
|
|
|
/*
|
|
|
|
* Standard routine for creating cache context if it doesn't exist yet
|
|
|
|
*
|
|
|
|
* There are a lot of places (probably far more than necessary) that check
|
|
|
|
* whether CacheMemoryContext exists yet and want to create it if not.
|
|
|
|
* We centralize knowledge of exactly how to create it here.
|
|
|
|
*/
|
|
|
|
void
|
|
|
|
CreateCacheMemoryContext(void)
|
|
|
|
{
|
|
|
|
/*
|
|
|
|
* Purely for paranoia, check that context doesn't exist; caller probably
|
|
|
|
* did so already.
|
|
|
|
*/
|
|
|
|
if (!CacheMemoryContext)
|
|
|
|
CacheMemoryContext = AllocSetContextCreate(TopMemoryContext,
|
|
|
|
"CacheMemoryContext",
|
Add macros to make AllocSetContextCreate() calls simpler and safer.
I found that half a dozen (nearly 5%) of our AllocSetContextCreate calls
had typos in the context-sizing parameters. While none of these led to
especially significant problems, they did create minor inefficiencies,
and it's now clear that expecting people to copy-and-paste those calls
accurately is not a great idea. Let's reduce the risk of future errors
by introducing single macros that encapsulate the common use-cases.
Three such macros are enough to cover all but two special-purpose contexts;
those two calls can be left as-is, I think.
While this patch doesn't in itself improve matters for third-party
extensions, it doesn't break anything for them either, and they can
gradually adopt the simplified notation over time.
In passing, change TopMemoryContext to use the default allocation
parameters. Formerly it could only be extended 8K at a time. That was
probably reasonable when this code was written; but nowadays we create
many more contexts than we did then, so that it's not unusual to have a
couple hundred K in TopMemoryContext, even without considering various
dubious code that sticks other things there. There seems no good reason
not to let it use growing blocks like most other contexts.
Back-patch to 9.6, mostly because that's still close enough to HEAD that
it's easy to do so, and keeping the branches in sync can be expected to
avoid some future back-patching pain. The bugs fixed by these changes
don't seem to be significant enough to justify fixing them further back.
Discussion: <21072.1472321324@sss.pgh.pa.us>
2016-08-27 23:50:38 +02:00
|
|
|
ALLOCSET_DEFAULT_SIZES);
|
2002-03-03 18:47:56 +01:00
|
|
|
}
|
|
|
|
|
|
|
|
|
2001-02-22 19:39:20 +01:00
|
|
|
/*
|
2001-06-18 05:35:07 +02:00
|
|
|
* ResetCatalogCache
|
2000-11-16 23:30:52 +01:00
|
|
|
*
|
2001-06-18 05:35:07 +02:00
|
|
|
* Reset one catalog cache to empty.
|
2001-02-22 19:39:20 +01:00
|
|
|
*
|
2001-06-18 05:35:07 +02:00
|
|
|
* This is not very efficient if the target cache is nearly empty.
|
|
|
|
* However, it shouldn't need to be efficient; we don't invoke it often.
|
1996-07-09 08:22:35 +02:00
|
|
|
*/
|
2001-06-18 05:35:07 +02:00
|
|
|
static void
|
|
|
|
ResetCatalogCache(CatCache *cache)
|
1996-07-09 08:22:35 +02:00
|
|
|
{
|
2012-10-16 22:36:30 +02:00
|
|
|
dlist_mutable_iter iter;
|
2001-06-18 05:35:07 +02:00
|
|
|
int i;
|
1997-09-07 07:04:48 +02:00
|
|
|
|
2002-04-06 08:59:25 +02:00
|
|
|
/* Remove each list in this cache, or at least mark it dead */
|
2012-10-16 22:36:30 +02:00
|
|
|
dlist_foreach_modify(iter, &cache->cc_lists)
|
2002-04-06 08:59:25 +02:00
|
|
|
{
|
2012-10-16 22:36:30 +02:00
|
|
|
CatCList *cl = dlist_container(CatCList, cache_elem, iter.cur);
|
2002-04-06 08:59:25 +02:00
|
|
|
|
|
|
|
if (cl->refcount > 0)
|
|
|
|
cl->dead = true;
|
|
|
|
else
|
|
|
|
CatCacheRemoveCList(cache, cl);
|
|
|
|
}
|
|
|
|
|
2001-06-18 05:35:07 +02:00
|
|
|
/* Remove each tuple in this cache, or at least mark it dead */
|
2002-03-06 21:49:46 +01:00
|
|
|
for (i = 0; i < cache->cc_nbuckets; i++)
|
1996-07-09 08:22:35 +02:00
|
|
|
{
|
2012-10-16 22:36:30 +02:00
|
|
|
dlist_head *bucket = &cache->cc_bucket[i];
|
1997-09-07 07:04:48 +02:00
|
|
|
|
2012-10-16 22:36:30 +02:00
|
|
|
dlist_foreach_modify(iter, bucket)
|
|
|
|
{
|
|
|
|
CatCTup *ct = dlist_container(CatCTup, cache_elem, iter.cur);
|
2000-08-06 06:17:47 +02:00
|
|
|
|
2005-08-14 00:18:07 +02:00
|
|
|
if (ct->refcount > 0 ||
|
|
|
|
(ct->c_list && ct->c_list->refcount > 0))
|
|
|
|
{
|
2000-11-16 23:30:52 +01:00
|
|
|
ct->dead = true;
|
2005-08-14 00:18:07 +02:00
|
|
|
/* list, if any, was marked dead above */
|
|
|
|
Assert(ct->c_list == NULL || ct->c_list->dead);
|
|
|
|
}
|
2000-11-16 23:30:52 +01:00
|
|
|
else
|
|
|
|
CatCacheRemoveCTup(cache, ct);
|
2002-03-03 18:47:56 +01:00
|
|
|
#ifdef CATCACHE_STATS
|
|
|
|
cache->cc_invals++;
|
|
|
|
#endif
|
2000-08-06 06:17:47 +02:00
|
|
|
}
|
1996-07-09 08:22:35 +02:00
|
|
|
}
|
2001-06-18 05:35:07 +02:00
|
|
|
}
|
1997-09-07 07:04:48 +02:00
|
|
|
|
2001-06-18 05:35:07 +02:00
|
|
|
/*
|
|
|
|
* ResetCatalogCaches
|
|
|
|
*
|
|
|
|
* Reset all caches when a shared cache inval event forces it
|
|
|
|
*/
|
|
|
|
void
|
|
|
|
ResetCatalogCaches(void)
|
|
|
|
{
|
2012-10-16 22:36:30 +02:00
|
|
|
slist_iter iter;
|
2001-06-18 05:35:07 +02:00
|
|
|
|
2019-02-18 12:32:34 +01:00
|
|
|
CACHE_elog(DEBUG2, "ResetCatalogCaches called");
|
2001-06-18 05:35:07 +02:00
|
|
|
|
2012-10-16 22:36:30 +02:00
|
|
|
slist_foreach(iter, &CacheHdr->ch_caches)
|
|
|
|
{
|
|
|
|
CatCache *cache = slist_container(CatCache, cc_next, iter.cur);
|
|
|
|
|
2001-06-18 05:35:07 +02:00
|
|
|
ResetCatalogCache(cache);
|
2012-10-16 22:36:30 +02:00
|
|
|
}
|
2001-06-18 05:35:07 +02:00
|
|
|
|
2019-02-18 12:32:34 +01:00
|
|
|
CACHE_elog(DEBUG2, "end of ResetCatalogCaches call");
|
1996-07-09 08:22:35 +02:00
|
|
|
}
|
|
|
|
|
2010-02-07 21:48:13 +01:00
|
|
|
/*
|
|
|
|
* CatalogCacheFlushCatalog
|
|
|
|
*
|
|
|
|
* Flush all catcache entries that came from the specified system catalog.
|
|
|
|
* This is needed after VACUUM FULL/CLUSTER on the catalog, since the
|
|
|
|
* tuples very likely now have different TIDs than before. (At one point
|
|
|
|
* we also tried to force re-execution of CatalogCacheInitializeCache for
|
|
|
|
* the cache(s) on that catalog. This is a bad idea since it leads to all
|
|
|
|
* kinds of trouble if a cache flush occurs while loading cache entries.
|
|
|
|
* We now avoid the need to do it by copying cc_tupdesc out of the relcache,
|
|
|
|
* rather than relying on the relcache to keep a tupdesc for us. Of course
|
|
|
|
* this assumes the tupdesc of a cachable system table will not change...)
|
|
|
|
*/
|
|
|
|
void
|
|
|
|
CatalogCacheFlushCatalog(Oid catId)
|
|
|
|
{
|
2012-10-16 22:36:30 +02:00
|
|
|
slist_iter iter;
|
2010-02-07 21:48:13 +01:00
|
|
|
|
2019-02-18 12:32:34 +01:00
|
|
|
CACHE_elog(DEBUG2, "CatalogCacheFlushCatalog called for %u", catId);
|
2010-02-07 21:48:13 +01:00
|
|
|
|
2012-10-19 01:30:43 +02:00
|
|
|
slist_foreach(iter, &CacheHdr->ch_caches)
|
2010-02-07 21:48:13 +01:00
|
|
|
{
|
2012-10-16 22:36:30 +02:00
|
|
|
CatCache *cache = slist_container(CatCache, cc_next, iter.cur);
|
|
|
|
|
2010-02-07 21:48:13 +01:00
|
|
|
/* Does this cache store tuples of the target catalog? */
|
2011-08-16 21:26:22 +02:00
|
|
|
if (cache->cc_reloid == catId)
|
2010-02-07 21:48:13 +01:00
|
|
|
{
|
|
|
|
/* Yes, so flush all its contents */
|
|
|
|
ResetCatalogCache(cache);
|
|
|
|
|
|
|
|
/* Tell inval.c to call syscache callbacks for this cache */
|
2011-08-17 01:27:46 +02:00
|
|
|
CallSyscacheCallbacks(cache->id, 0);
|
2010-02-07 21:48:13 +01:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2019-02-18 12:32:34 +01:00
|
|
|
CACHE_elog(DEBUG2, "end of CatalogCacheFlushCatalog call");
|
2010-02-07 21:48:13 +01:00
|
|
|
}
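The per-catalog flush above walks every cache, resets those built over the target catalog, and then notifies callbacks. A minimal sketch of that pattern follows, assuming a hand-rolled singly linked list; `ToyCache`, `ToyFlushCallback`, and `toy_flush_catalog` are hypothetical names, and the single callback parameter only loosely mirrors CallSyscacheCallbacks.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int ToyOid;	/* hypothetical stand-in for Oid */

/* Hypothetical, simplified stand-in for CatCache. */
typedef struct ToyCache
{
	int			id;				/* cache identifier */
	ToyOid		reloid;			/* catalog this cache is built over */
	int			ntup;			/* number of cached tuples */
	struct ToyCache *next;		/* next cache in the global list */
} ToyCache;

typedef void (*ToyFlushCallback) (int cache_id);

/*
 * Empty every cache built over catalog cat_id and notify the callback, if
 * any.  Returns the number of caches flushed.
 */
static int
toy_flush_catalog(ToyCache *caches, ToyOid cat_id, ToyFlushCallback cb)
{
	int			flushed = 0;

	for (ToyCache *c = caches; c != NULL; c = c->next)
	{
		if (c->reloid != cat_id)
			continue;			/* cache belongs to another catalog */

		c->ntup = 0;			/* stands in for ResetCatalogCache() */
		if (cb != NULL)
			cb(c->id);			/* let dependent code drop derived state */
		flushed++;
	}
	return flushed;
}
```

Several caches can sit over the same catalog (for example, one per index), so the loop does not stop at the first match.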
|
|
|
|
|
2001-02-22 19:39:20 +01:00
|
|
|
/*
|
2000-11-16 23:30:52 +01:00
|
|
|
* InitCatCache
|
1996-07-09 08:22:35 +02:00
|
|
|
*
|
|
|
|
* This allocates and initializes a cache for a system catalog relation.
|
|
|
|
* Actually, the cache is only partially initialized to avoid opening the
|
|
|
|
* relation. The relation will be opened and the rest of the cache
|
|
|
|
* structure initialized on the first access.
|
|
|
|
*/
|
|
|
|
#ifdef CACHEDEBUG
|
2003-05-27 19:49:47 +02:00
|
|
|
#define InitCatCache_DEBUG2 \
|
1998-06-15 20:40:05 +02:00
|
|
|
do { \
|
2005-04-14 22:03:27 +02:00
|
|
|
elog(DEBUG2, "InitCatCache: rel=%u ind=%u id=%d nkeys=%d size=%d", \
|
|
|
|
cp->cc_reloid, cp->cc_indexoid, cp->id, \
|
|
|
|
cp->cc_nkeys, cp->cc_nbuckets); \
|
1998-06-15 20:40:05 +02:00
|
|
|
} while(0)
|
1996-07-09 08:22:35 +02:00
|
|
|
#else
|
2003-05-27 19:49:47 +02:00
|
|
|
#define InitCatCache_DEBUG2
|
1996-07-09 08:22:35 +02:00
|
|
|
#endif
|
|
|
|
|
2001-06-18 05:35:07 +02:00
|
|
|
CatCache *
|
2000-11-16 23:30:52 +01:00
|
|
|
InitCatCache(int id,
|
2005-04-14 22:03:27 +02:00
|
|
|
Oid reloid,
|
|
|
|
Oid indexoid,
|
1996-07-09 08:22:35 +02:00
|
|
|
int nkeys,
|
2006-06-15 04:08:09 +02:00
|
|
|
const int *key,
|
|
|
|
int nbuckets)
|
1996-07-09 08:22:35 +02:00
|
|
|
{
|
|
|
|
CatCache *cp;
|
|
|
|
MemoryContext oldcxt;
|
2000-11-10 01:33:12 +01:00
|
|
|
int i;
|
1997-09-07 07:04:48 +02:00
|
|
|
|
2006-06-15 04:08:09 +02:00
|
|
|
/*
|
2013-09-05 18:47:56 +02:00
|
|
|
* nbuckets is the initial number of hash buckets to use in this catcache.
|
|
|
|
* It will be enlarged later if it becomes too full.
|
2006-06-15 04:08:09 +02:00
|
|
|
*
|
|
|
|
* nbuckets must be a power of two. We check this via Assert rather than
|
|
|
|
* a full runtime check because the values will be coming from constant
|
|
|
|
* tables.
|
|
|
|
*
|
|
|
|
* If you're confused by the power-of-two check, see comments in
|
|
|
|
* bitmapset.c for an explanation.
|
|
|
|
*/
|
|
|
|
Assert(nbuckets > 0 && (nbuckets & -nbuckets) == nbuckets);
|
|
|
|
|
2001-02-22 19:39:20 +01:00
|
|
|
/*
|
1996-07-09 08:22:35 +02:00
|
|
|
* first switch to the cache context so our allocations do not vanish at
|
1997-08-25 01:08:01 +02:00
|
|
|
* the end of a transaction
|
1997-09-07 07:04:48 +02:00
|
|
|
*/
|
2000-06-28 05:33:33 +02:00
|
|
|
if (!CacheMemoryContext)
|
|
|
|
CreateCacheMemoryContext();
|
1997-09-07 07:04:48 +02:00
|
|
|
|
2000-06-28 05:33:33 +02:00
|
|
|
oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
|
1997-09-07 07:04:48 +02:00
|
|
|
|
2001-02-22 19:39:20 +01:00
|
|
|
/*
|
2006-06-15 04:08:09 +02:00
|
|
|
* if first time through, initialize the cache group header
|
1996-07-09 08:22:35 +02:00
|
|
|
*/
|
2001-06-18 05:35:07 +02:00
|
|
|
if (CacheHdr == NULL)
|
|
|
|
{
|
|
|
|
CacheHdr = (CatCacheHeader *) palloc(sizeof(CatCacheHeader));
|
2012-10-16 22:36:30 +02:00
|
|
|
slist_init(&CacheHdr->ch_caches);
|
2001-06-18 05:35:07 +02:00
|
|
|
CacheHdr->ch_ntup = 0;
|
2002-02-19 21:11:20 +01:00
|
|
|
#ifdef CATCACHE_STATS
|
2006-06-15 04:08:09 +02:00
|
|
|
/* set up to dump stats at backend exit */
|
2002-02-19 21:11:20 +01:00
|
|
|
on_proc_exit(CatCachePrintStats, 0);
|
|
|
|
#endif
|
2001-06-18 05:35:07 +02:00
|
|
|
}
|
1997-09-07 07:04:48 +02:00
|
|
|
|
2001-02-22 19:39:20 +01:00
|
|
|
/*
|
Improve sys/catcache performance.
The following are the individual improvements:
1) Avoidance of FunctionCallInfo based function calls, replaced by
more efficient functions with a native C argument interface.
2) Don't extract columns from a cache entry's tuple whenever matching
entries - instead store them as a Datum array. This also removes the
need to build dummy tuples for negative & list entries, and a hack
for dealing with cstring vs. text weirdness.
3) Reorder members of the catcache.h struct, so important entries are more
likely to be on one cacheline.
4) Allowing the compiler to specialize the critical SearchCatCache for a
specific number of attributes lets it unroll loops and avoid
other nkeys-dependent initialization.
5) Only initializing the ScanKey when necessary, i.e. on catcache misses,
greatly reduces unnecessary CPU cache misses.
6) Split off the cache-miss case from the hash lookup, reducing stack
allocations etc in the common case.
7) CatCTup and their corresponding heaptuple are allocated in one
piece.
This results in making cache lookups themselves roughly three times as
fast - full-system benchmarks obviously improve less than that.
I've also evaluated further techniques:
- replace open coded hash with simplehash - the list walk right now
shows up in profiles. Unfortunately it's not easy to do so safely as
an entry's memory location can change at various times, which
doesn't work well with the refcounting and cache invalidation.
- Cacheline-aligning CatCTup entries - helps some with performance,
but the win isn't big and the code for it is ugly, because the
tuples have to be freed as well.
- add more proper functions, rather than macros for
SearchSysCacheCopyN etc., but right now they don't show up in
profiles.
The reason the macro wrapper for syscache.c/h have to be changed,
rather than just catcache, is that doing otherwise would require
exposing the SysCache array to the outside. That might be a good idea
anyway, but it's for another day.
Author: Andres Freund
Reviewed-By: Robert Haas
Discussion: https://postgr.es/m/20170914061207.zxotvyopetm7lrrp@alap3.anarazel.de
2017-10-13 22:16:50 +02:00
|
|
|
* Allocate a new cache structure, aligning to a cacheline boundary
|
2012-10-19 01:30:43 +02:00
|
|
|
*
|
|
|
|
* Note: we rely on zeroing to initialize all the dlist headers correctly
|
1996-07-09 08:22:35 +02:00
|
|
|
*/
|
Add palloc_aligned() to allow aligned memory allocations
This introduces palloc_aligned() and MemoryContextAllocAligned() which
allow callers to obtain memory which is allocated to the given size and
also aligned to the specified alignment boundary. The alignment
boundaries may be any power-of-2 value. Currently, the alignment is
capped at 2^26, however, we don't expect values anything like that large.
The primary expected use case is to align allocations to perhaps CPU
cache line size or to maybe I/O page size. Certain use cases can benefit
from having aligned memory by either having better performance or more
predictable performance.
The alignment is achieved by requesting 'alignto' additional bytes from
the underlying allocator function and then aligning the address that is
returned to the requested alignment. This obviously does waste some
memory, so alignments should be kept as small as what is required.
It's also important to note that these alignment bytes eat into the
maximum allocation size. So something like:
palloc_aligned(MaxAllocSize, 64, 0);
will not work as we cannot request MaxAllocSize + 64 bytes.
Additionally, because we simply request the requested size plus the
alignment padding from the given MemoryContext, this will not work if that
context is a Slab allocator: slab can only provide chunks of the size
specified when the slab context was created. Slab will generate an error
to indicate that the requested size is not supported.
The alignment that is requested in palloc_aligned() is stored along with
the allocated memory. This allows the alignment to remain intact through
repalloc() calls.
Author: Andres Freund, David Rowley
Reviewed-by: Maxim Orlov, Andres Freund, John Naylor
Discussion: https://postgr.es/m/CAApHDvpxLPUMV1mhxs6g7GNwCP6Cs6hfnYQL5ffJQTuFAuxt8A%40mail.gmail.com
2022-12-22 01:32:05 +01:00
|
|
|
cp = (CatCache *) palloc_aligned(sizeof(CatCache), PG_CACHE_LINE_SIZE,
|
|
|
|
MCXT_ALLOC_ZERO);
|
2013-09-05 18:47:56 +02:00
|
|
|
cp->cc_bucket = palloc0(nbuckets * sizeof(dlist_head));
|
1997-09-07 07:04:48 +02:00
|
|
|
|
2001-02-22 19:39:20 +01:00
|
|
|
/*
|
1996-07-09 08:22:35 +02:00
|
|
|
* initialize the cache's relation information for the relation
|
2000-11-24 05:16:12 +01:00
|
|
|
* corresponding to this cache, and initialize some of the new cache's
|
|
|
|
* other internal fields. But don't open the relation yet.
|
1997-09-07 07:04:48 +02:00
|
|
|
*/
|
2001-06-18 05:35:07 +02:00
|
|
|
cp->id = id;
|
2005-04-14 22:03:27 +02:00
|
|
|
cp->cc_relname = "(not known yet)";
|
|
|
|
cp->cc_reloid = reloid;
|
|
|
|
cp->cc_indexoid = indexoid;
|
2001-06-19 21:42:16 +02:00
|
|
|
cp->cc_relisshared = false; /* temporary */
|
1996-07-09 08:22:35 +02:00
|
|
|
cp->cc_tupdesc = (TupleDesc) NULL;
|
2001-06-18 05:35:07 +02:00
|
|
|
cp->cc_ntup = 0;
|
2006-06-15 04:08:09 +02:00
|
|
|
cp->cc_nbuckets = nbuckets;
|
1996-07-09 08:22:35 +02:00
|
|
|
cp->cc_nkeys = nkeys;
|
|
|
|
for (i = 0; i < nkeys; ++i)
|
2023-07-27 03:55:16 +02:00
|
|
|
{
|
|
|
|
Assert(AttributeNumberIsValid(key[i]));
|
2017-10-13 22:16:50 +02:00
|
|
|
cp->cc_keyno[i] = key[i];
|
2023-07-27 03:55:16 +02:00
|
|
|
}
|
1997-09-07 07:04:48 +02:00
|
|
|
|
2001-02-22 19:39:20 +01:00
|
|
|
/*
|
2001-06-18 05:35:07 +02:00
|
|
|
* new cache is initialized as far as we can go for now. print some
|
|
|
|
* debugging information, if appropriate.
|
1996-07-09 08:22:35 +02:00
|
|
|
*/
|
2003-05-27 19:49:47 +02:00
|
|
|
InitCatCache_DEBUG2;
|
1997-09-07 07:04:48 +02:00
|
|
|
|
2001-06-18 05:35:07 +02:00
|
|
|
/*
|
|
|
|
* add completed cache to top of group header's list
|
|
|
|
*/
|
2012-10-16 22:36:30 +02:00
|
|
|
slist_push_head(&CacheHdr->ch_caches, &cp->cc_next);
|
2001-06-18 05:35:07 +02:00
|
|
|
|
2001-02-22 19:39:20 +01:00
|
|
|
/*
|
1996-07-09 08:22:35 +02:00
|
|
|
* back to the old context before we return...
|
|
|
|
*/
|
|
|
|
MemoryContextSwitchTo(oldcxt);
|
2000-07-02 07:38:40 +02:00
|
|
|
|
1996-07-09 08:22:35 +02:00
|
|
|
return cp;
|
|
|
|
}
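The power-of-two requirement on nbuckets is what lets a bucket be chosen with a bit mask instead of a modulo. The sketch below shows both the `(n & -n) == n` test from the Assert and a mask-based index; `toy_is_power_of_two` and `toy_hash_index` are hypothetical helpers, and the mask form is an assumption about how an index macro such as HASH_INDEX can be written, not a quotation of it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * n & -n isolates the lowest set bit of n.  A positive integer is a power
 * of two exactly when that lowest set bit is the only set bit.
 */
static bool
toy_is_power_of_two(int n)
{
	return n > 0 && (n & -n) == n;
}

/*
 * Map a hash value to a bucket number.  With a power-of-two bucket count,
 * masking with (nbuckets - 1) gives the same result as hash % nbuckets.
 */
static uint32_t
toy_hash_index(uint32_t hash_value, int nbuckets)
{
	assert(toy_is_power_of_two(nbuckets));
	return hash_value & (uint32_t) (nbuckets - 1);
}
```

For instance, with 16 buckets only the low four bits of the hash select the bucket, so a hash of 0x12345678 lands in bucket 8.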
|
|
|
|
|
2013-09-05 18:47:56 +02:00
|
|
|
/*
|
|
|
|
* Enlarge a catcache, doubling the number of buckets.
|
|
|
|
*/
|
|
|
|
static void
|
|
|
|
RehashCatCache(CatCache *cp)
|
|
|
|
{
|
|
|
|
dlist_head *newbucket;
|
|
|
|
int newnbuckets;
|
|
|
|
int i;
|
|
|
|
|
|
|
|
elog(DEBUG1, "rehashing catalog cache id %d for %s; %d tups, %d buckets",
|
|
|
|
cp->id, cp->cc_relname, cp->cc_ntup, cp->cc_nbuckets);
|
|
|
|
|
|
|
|
/* Allocate a new, larger, hash table. */
|
|
|
|
newnbuckets = cp->cc_nbuckets * 2;
|
|
|
|
newbucket = (dlist_head *) MemoryContextAllocZero(CacheMemoryContext, newnbuckets * sizeof(dlist_head));
|
|
|
|
|
|
|
|
/* Move all entries from old hash table to new. */
|
|
|
|
for (i = 0; i < cp->cc_nbuckets; i++)
|
|
|
|
{
|
|
|
|
dlist_mutable_iter iter;
|
2014-05-06 18:12:18 +02:00
|
|
|
|
2013-09-05 18:47:56 +02:00
|
|
|
dlist_foreach_modify(iter, &cp->cc_bucket[i])
|
|
|
|
{
|
|
|
|
CatCTup *ct = dlist_container(CatCTup, cache_elem, iter.cur);
|
|
|
|
int hashIndex = HASH_INDEX(ct->hash_value, newnbuckets);
|
|
|
|
|
|
|
|
dlist_delete(iter.cur);
|
|
|
|
dlist_push_head(&newbucket[hashIndex], &ct->cache_elem);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
/* Switch to the new array. */
|
|
|
|
pfree(cp->cc_bucket);
|
|
|
|
cp->cc_nbuckets = newnbuckets;
|
|
|
|
cp->cc_bucket = newbucket;
|
|
|
|
}
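The rehash step can be tried outside the backend with a plain chained hash table. This is a sketch under stated assumptions: `ToyNode`, `ToyTable`, and `toy_rehash` are hypothetical, singly linked chains replace the dlist buckets, and `calloc`/`free` replace MemoryContextAllocZero and pfree.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical chained hash table; not the PostgreSQL structures. */
typedef struct ToyNode
{
	uint32_t	hash_value;		/* stored hash, reused when rehashing */
	struct ToyNode *next;
} ToyNode;

typedef struct ToyTable
{
	ToyNode   **bucket;			/* array of chain heads */
	int			nbuckets;		/* always a power of two */
} ToyTable;

/*
 * Double the number of buckets and move every node to its new chain.
 * Returns 0 on success, -1 if the new array cannot be allocated.
 */
static int
toy_rehash(ToyTable *t)
{
	int			newn = t->nbuckets * 2;
	ToyNode   **newbucket = calloc((size_t) newn, sizeof(ToyNode *));

	if (newbucket == NULL)
		return -1;

	for (int i = 0; i < t->nbuckets; i++)
	{
		ToyNode    *node = t->bucket[i];

		while (node != NULL)
		{
			ToyNode    *next = node->next;	/* save before relinking */
			uint32_t	idx = node->hash_value & (uint32_t) (newn - 1);

			node->next = newbucket[idx];	/* push onto the new chain */
			newbucket[idx] = node;
			node = next;
		}
	}

	free(t->bucket);			/* nodes themselves are not reallocated */
	t->bucket = newbucket;
	t->nbuckets = newn;
	return 0;
}
```

Keeping the full hash value in each node means no key has to be rehashed; each node simply gains one more significant bit in its bucket index, so a chain splits into at most two chains.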
|
|
|
|
|
2002-03-03 18:47:56 +01:00
|
|
|
/*
|
|
|
|
* CatalogCacheInitializeCache
|
|
|
|
*
|
|
|
|
* This function does final initialization of a catcache: obtain the tuple
|
|
|
|
* descriptor and set up the hash and equality function links. We assume
|
|
|
|
* that the relcache entry can be opened at this point!
|
|
|
|
*/
|
|
|
|
#ifdef CACHEDEBUG
|
2005-04-14 22:03:27 +02:00
|
|
|
#define CatalogCacheInitializeCache_DEBUG1 \
|
|
|
|
elog(DEBUG2, "CatalogCacheInitializeCache: cache @%p rel=%u", cache, \
|
|
|
|
cache->cc_reloid)
|
2002-03-03 18:47:56 +01:00
|
|
|
|
|
|
|
#define CatalogCacheInitializeCache_DEBUG2 \
|
|
|
|
do { \
|
2017-10-13 22:16:50 +02:00
|
|
|
if (cache->cc_keyno[i] > 0) { \
|
2003-05-27 19:49:47 +02:00
|
|
|
elog(DEBUG2, "CatalogCacheInitializeCache: load %d/%d w/%d, %u", \
|
2017-10-13 22:16:50 +02:00
|
|
|
i+1, cache->cc_nkeys, cache->cc_keyno[i], \
|
|
|
|
TupleDescAttr(tupdesc, cache->cc_keyno[i] - 1)->atttypid); \
|
2002-03-03 18:47:56 +01:00
|
|
|
} else { \
|
2003-05-27 19:49:47 +02:00
|
|
|
elog(DEBUG2, "CatalogCacheInitializeCache: load %d/%d w/%d", \
|
2017-10-13 22:16:50 +02:00
|
|
|
i+1, cache->cc_nkeys, cache->cc_keyno[i]); \
|
2002-03-03 18:47:56 +01:00
|
|
|
} \
|
|
|
|
} while(0)
|
|
|
|
#else
|
2005-04-14 22:03:27 +02:00
|
|
|
#define CatalogCacheInitializeCache_DEBUG1
|
2002-03-03 18:47:56 +01:00
|
|
|
#define CatalogCacheInitializeCache_DEBUG2
|
|
|
|
#endif
|
|
|
|
|
|
|
|
static void
|
|
|
|
CatalogCacheInitializeCache(CatCache *cache)
|
|
|
|
{
|
|
|
|
Relation relation;
|
|
|
|
MemoryContext oldcxt;
|
|
|
|
TupleDesc tupdesc;
|
|
|
|
int i;
|
|
|
|
|
2005-04-14 22:03:27 +02:00
|
|
|
CatalogCacheInitializeCache_DEBUG1;
|
2002-03-03 18:47:56 +01:00
|
|
|
|
2019-01-21 19:32:19 +01:00
|
|
|
relation = table_open(cache->cc_reloid, AccessShareLock);
|
2002-03-03 18:47:56 +01:00
|
|
|
|
|
|
|
/*
|
|
|
|
* switch to the cache context so our allocations do not vanish at the end
|
|
|
|
* of a transaction
|
|
|
|
*/
|
|
|
|
Assert(CacheMemoryContext != NULL);
|
|
|
|
|
|
|
|
oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* copy the relcache's tuple descriptor to permanent cache storage
|
|
|
|
*/
|
|
|
|
tupdesc = CreateTupleDescCopyConstr(RelationGetDescr(relation));
|
|
|
|
|
|
|
|
/*
|
2005-04-14 22:03:27 +02:00
|
|
|
* save the relation's name and relisshared flag, too (cc_relname is used
|
|
|
|
* only for debugging purposes)
|
2002-03-03 18:47:56 +01:00
|
|
|
*/
|
2005-04-14 22:03:27 +02:00
|
|
|
cache->cc_relname = pstrdup(RelationGetRelationName(relation));
|
2002-03-03 18:47:56 +01:00
|
|
|
cache->cc_relisshared = RelationGetForm(relation)->relisshared;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* return to the caller's memory context and close the rel
|
|
|
|
*/
|
|
|
|
MemoryContextSwitchTo(oldcxt);
|
|
|
|
|
2019-01-21 19:32:19 +01:00
|
|
|
table_close(relation, AccessShareLock);
|
2002-03-03 18:47:56 +01:00
|
|
|
|
2019-02-18 12:32:34 +01:00
|
|
|
CACHE_elog(DEBUG2, "CatalogCacheInitializeCache: %s, %d keys",
|
|
|
|
cache->cc_relname, cache->cc_nkeys);
|
2002-03-03 18:47:56 +01:00
|
|
|
|
|
|
|
/*
|
|
|
|
* initialize cache's key information
|
|
|
|
*/
|
|
|
|
for (i = 0; i < cache->cc_nkeys; ++i)
|
|
|
|
{
|
|
|
|
Oid keytype;
|
2003-11-09 22:30:38 +01:00
|
|
|
RegProcedure eqfunc;
|
2002-03-03 18:47:56 +01:00
|
|
|
|
|
|
|
CatalogCacheInitializeCache_DEBUG2;
|
|
|
|
|
2017-10-13 22:16:50 +02:00
|
|
|
if (cache->cc_keyno[i] > 0)
|
2014-06-21 00:20:56 +02:00
|
|
|
{
|
2017-08-20 20:19:07 +02:00
|
|
|
Form_pg_attribute attr = TupleDescAttr(tupdesc,
|
2017-10-13 22:16:50 +02:00
|
|
|
cache->cc_keyno[i] - 1);
|
2014-06-21 00:20:56 +02:00
|
|
|
|
|
|
|
keytype = attr->atttypid;
|
|
|
|
/* cache key columns should always be NOT NULL */
|
|
|
|
Assert(attr->attnotnull);
|
|
|
|
}
|
2002-03-03 18:47:56 +01:00
|
|
|
else
|
|
|
|
{
|
Remove WITH OIDS support, change oid catalog column visibility.
Previously tables declared WITH OIDS, including a significant fraction
of the catalog tables, stored the oid column not as a normal column,
but as part of the tuple header.
This special column was not shown by default, which was somewhat odd,
as it's often (consider e.g. pg_class.oid) one of the more important
parts of a row. Neither pg_dump nor COPY included the contents of the
oid column by default.
The fact that the oid column was not an ordinary column necessitated a
significant amount of special case code to support oid columns. That
already was painful for the existing, but upcoming work aiming to make
table storage pluggable, would have required expanding and duplicating
that "specialness" significantly.
WITH OIDS has been deprecated since 2005 (commit ff02d0a05280e0).
Remove it.
Removing includes:
- CREATE TABLE and ALTER TABLE syntax for declaring the table to be
WITH OIDS has been removed (WITH (oids[ = true]) will error out)
- pg_dump does not support dumping tables declared WITH OIDS and will
issue a warning when dumping one (and ignore the oid column).
- restoring a pg_dump archive with pg_restore will warn when
restoring a table with oid contents (and ignore the oid column)
- COPY will refuse to load binary dump that includes oids.
- pg_upgrade will error out when encountering tables declared WITH
OIDS, they have to be altered to remove the oid column first.
- Functionality to access the oid of the last inserted row (like
plpgsql's RESULT_OID, spi's SPI_lastoid, ...) has been removed.
The syntax for declaring a table WITHOUT OIDS (or WITH (oids = false)
for CREATE TABLE) is still supported. While that requires a bit of
support code, it seems unnecessary to break applications / dumps that
do not use oids, and are explicit about not using them.
The biggest user of WITH OID columns was postgres' catalog. This
commit changes all 'magic' oid columns to be columns that are normally
declared and stored. To reduce unnecessary query breakage all the
newly added columns are still named 'oid', even if a table's column
naming scheme would indicate 'reloid' or such. This obviously
requires adapting a lot of code, mostly replacing oid access via
HeapTupleGetOid() with access to the underlying Form_pg_*->oid column.
The bootstrap process now assigns oids for all oid columns in
genbki.pl that do not have an explicit value (starting at the largest
oid previously used), only oids assigned later by oids will be above
FirstBootstrapObjectId. As the oid column now is a normal column the
special bootstrap syntax for oids has been removed.
Oids are not automatically assigned during insertion anymore, all
backend code explicitly assigns oids with GetNewOidWithIndex(). For
the rare case that insertions into the catalog via SQL are called for
the new pg_nextoid() function can be used (which only works on catalog
tables).
The fact that oid columns on system tables are now normal columns
means that they will be included in the set of columns expanded
by * (i.e. SELECT * FROM pg_class will now include the table's oid,
previously it did not). It'd not technically be hard to hide oid
column by default, but that'd mean confusing behavior would either
have to be carried forward forever, or it'd cause breakage down the
line.
While it's not unlikely that further adjustments are needed, the
scope/invasiveness of the patch makes it worthwhile to merge this
now. It's painful to maintain externally, too complicated to commit
after the code freeze, and a dependency of a number of other
patches.
Catversion bump, for obvious reasons.
Author: Andres Freund, with contributions by John Naylor
Discussion: https://postgr.es/m/20180930034810.ywp2c7awz7opzcfr@alap3.anarazel.de
2018-11-21 00:36:57 +01:00
|
|
|
if (cache->cc_keyno[i] < 0)
|
|
|
|
elog(FATAL, "sys attributes are not supported in caches");
|
2002-03-03 18:47:56 +01:00
|
|
|
keytype = OIDOID;
|
|
|
|
}
|
|
|
|
|
2003-06-23 00:04:55 +02:00
|
|
|
GetCCHashEqFuncs(keytype,
|
|
|
|
&cache->cc_hashfunc[i],
|
2017-10-13 22:16:50 +02:00
|
|
|
&eqfunc,
|
|
|
|
&cache->cc_fastequal[i]);
|
2002-03-03 18:47:56 +01:00
|
|
|
|
|
|
|
/*
|
2003-06-23 00:04:55 +02:00
|
|
|
* Do equality-function lookup (we assume this won't need a catalog
|
|
|
|
* lookup for any supported type)
|
2002-03-03 18:47:56 +01:00
|
|
|
*/
|
2003-11-09 22:30:38 +01:00
|
|
|
fmgr_info_cxt(eqfunc,
|
2002-03-03 18:47:56 +01:00
|
|
|
&cache->cc_skey[i].sk_func,
|
|
|
|
CacheMemoryContext);
|
|
|
|
|
|
|
|
/* Initialize sk_attno suitably for HeapKeyTest() and heap scans */
|
2017-10-13 22:16:50 +02:00
|
|
|
cache->cc_skey[i].sk_attno = cache->cc_keyno[i];
|
2002-03-03 18:47:56 +01:00
|
|
|
|
2003-11-12 22:15:59 +01:00
|
|
|
/* Fill in sk_strategy as well --- always standard equality */
|
2003-11-09 22:30:38 +01:00
|
|
|
cache->cc_skey[i].sk_strategy = BTEqualStrategyNumber;
|
2003-11-12 22:15:59 +01:00
|
|
|
cache->cc_skey[i].sk_subtype = InvalidOid;
|
Make collation-aware system catalog columns use "C" collation.
Up to now we allowed text columns in system catalogs to use collation
"default", but that isn't really safe because it might mean something
different in template0 than it means in a database cloned from template0.
In particular, this could mean that cloned pg_statistic entries for such
columns weren't entirely valid, possibly leading to bogus planner
estimates, though (probably) not any outright failures.
In the wake of commit 5e0928005, a better solution is available: if we
label such columns with "C" collation, then their pg_statistic entries
will also use that collation and hence will be valid independently of
the database collation.
This also provides a cleaner solution for indexes on such columns than
the hack added by commit 0b28ea79c: the indexes will naturally inherit
"C" collation and don't have to be forced to use text_pattern_ops.
Also, with the planned improvement of type "name" to be collation-aware,
this policy will apply cleanly to both text and name columns.
Because of the pg_statistic angle, we should also apply this policy
to the tables in information_schema. This patch does that by adjusting
information_schema's textual domain types to specify "C" collation.
That has the user-visible effect that order-sensitive comparisons to
textual information_schema view columns will now use "C" collation
by default. The SQL standard says that the collation of those view
columns is implementation-defined, so I think this is legal per spec.
At some point this might allow for translation of such comparisons
into indexable conditions on the underlying "name" columns, although
additional work will be needed before that can happen.
Discussion: https://postgr.es/m/19346.1544895309@sss.pgh.pa.us
2018-12-18 18:48:15 +01:00
|
|
|
/* If a catcache key requires a collation, it must be C collation */
|
|
|
|
cache->cc_skey[i].sk_collation = C_COLLATION_OID;
|
2003-11-09 22:30:38 +01:00
|
|
|
|
2019-02-18 12:32:34 +01:00
|
|
|
CACHE_elog(DEBUG2, "CatalogCacheInitializeCache %s %d %p",
|
|
|
|
cache->cc_relname, i, cache);
|
2002-03-03 18:47:56 +01:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* mark this cache fully initialized
|
|
|
|
*/
|
|
|
|
cache->cc_tupdesc = tupdesc;
|
|
|
|
}
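
CatalogCacheInitializeCache() above switches into CacheMemoryContext before copying the tuple descriptor and relation name, then switches back. The following is only a minimal standalone sketch of that save/switch/restore pattern. SketchContext, sketch_switch_to(), sketch_alloc() and sketch_init_cache() are hypothetical names invented for illustration and are not part of PostgreSQL; the real code uses MemoryContextSwitchTo() and palloc-based allocation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a memory context; it only counts allocations. */
typedef struct SketchContext
{
	const char *name;
	int			nallocs;
} SketchContext;

static SketchContext *sketch_current = NULL;

/* Make 'next' current and return the previous context. */
static SketchContext *
sketch_switch_to(SketchContext *next)
{
	SketchContext *old = sketch_current;

	sketch_current = next;
	return old;
}

/* Pretend allocation: charge it to whichever context is current. */
static void
sketch_alloc(void)
{
	assert(sketch_current != NULL);
	sketch_current->nallocs++;
}

/* Build long-lived state in 'cache_cxt', then restore the caller's context. */
static void
sketch_init_cache(SketchContext *cache_cxt)
{
	SketchContext *oldcxt = sketch_switch_to(cache_cxt);

	sketch_alloc();				/* e.g. the copied tuple descriptor */
	sketch_alloc();				/* e.g. the copied relation name */
	sketch_switch_to(oldcxt);
}
```

In this sketch both allocations are charged to the long-lived context and the caller's context is current again on return, which is the property the real function relies on so that its allocations survive the end of the transaction.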
|
|
|
|
|
|
|
|
/*
|
|
|
|
* InitCatCachePhase2 -- external interface for CatalogCacheInitializeCache
|
|
|
|
*
|
2006-10-06 20:23:35 +02:00
|
|
|
* One reason to call this routine is to ensure that the relcache has
|
|
|
|
* created entries for all the catalogs and indexes referenced by catcaches.
|
|
|
|
* Therefore, provide an option to open the index as well as fixing the
|
|
|
|
* cache itself. An exception is the indexes on pg_am, which we don't use
|
|
|
|
* (cf. IndexScanOK).
|
2002-03-03 18:47:56 +01:00
|
|
|
*/
|
|
|
|
void
|
2006-10-06 20:23:35 +02:00
|
|
|
InitCatCachePhase2(CatCache *cache, bool touch_index)
|
2002-03-03 18:47:56 +01:00
|
|
|
{
|
|
|
|
if (cache->cc_tupdesc == NULL)
|
|
|
|
CatalogCacheInitializeCache(cache);
|
|
|
|
|
2006-10-06 20:23:35 +02:00
|
|
|
if (touch_index &&
|
|
|
|
cache->id != AMOID &&
|
2002-03-03 18:47:56 +01:00
|
|
|
cache->id != AMNAME)
|
|
|
|
{
|
|
|
|
Relation idesc;
|
|
|
|
|
2011-03-22 18:00:24 +01:00
|
|
|
/*
|
|
|
|
* We must lock the underlying catalog before opening the index to
|
|
|
|
* avoid deadlock, since index_open could possibly result in reading
|
|
|
|
* this same catalog, and if anyone else is exclusive-locking this
|
|
|
|
* catalog and index they'll be doing it in that order.
|
|
|
|
*/
|
|
|
|
LockRelationOid(cache->cc_reloid, AccessShareLock);
|
2006-07-31 22:09:10 +02:00
|
|
|
idesc = index_open(cache->cc_indexoid, AccessShareLock);
|
2014-06-21 00:20:56 +02:00
|
|
|
|
|
|
|
/*
|
|
|
|
* While we've got the index open, let's check that it's unique (and
|
|
|
|
* not just deferrable-unique, thank you very much). This is just to
|
|
|
|
* catch thinkos in definitions of new catcaches, so we don't worry
|
|
|
|
* about the pg_am indexes not getting tested.
|
|
|
|
*/
|
|
|
|
Assert(idesc->rd_index->indisunique &&
|
|
|
|
idesc->rd_index->indimmediate);
|
|
|
|
|
2006-07-31 22:09:10 +02:00
|
|
|
index_close(idesc, AccessShareLock);
|
2011-03-22 18:00:24 +01:00
|
|
|
UnlockRelationOid(cache->cc_reloid, AccessShareLock);
|
2002-03-03 18:47:56 +01:00
|
|
|
}
|
|
|
|
}
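
InitCatCachePhase2() above initializes a cache only if cc_tupdesc is still NULL, and optionally opens the index except for the pg_am caches. A rough standalone sketch of that control flow follows. SketchCache and its fields, sketch_initialize() and sketch_phase2() are assumptions made up for this illustration; locking and the uniqueness assertion are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified cache header. */
typedef struct SketchCache
{
	const void *tupdesc;		/* NULL until the cache is initialized */
	int			init_calls;		/* how many times initialization ran */
	int			index_touches;	/* how many times the index was opened */
	bool		skip_index;		/* true for caches whose index is never used */
} SketchCache;

static const int sketch_descriptor = 0; /* dummy object for tupdesc to point at */

static void
sketch_initialize(SketchCache *cache)
{
	cache->init_calls++;
	cache->tupdesc = &sketch_descriptor;	/* marks the cache initialized */
}

static void
sketch_phase2(SketchCache *cache, bool touch_index)
{
	/* Lazy, one-time initialization keyed on the descriptor pointer. */
	if (cache->tupdesc == NULL)
		sketch_initialize(cache);

	/* Open the index only when asked to, and never for exempt caches. */
	if (touch_index && !cache->skip_index)
		cache->index_touches++;
}
```

Calling sketch_phase2() repeatedly runs the initialization at most once, mirroring how setting cc_tupdesc last marks the real cache as fully initialized.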
|
|
|
|
|
1996-07-09 08:22:35 +02:00
|
|
|
|
2001-02-22 19:39:20 +01:00
|
|
|
/*
|
2000-11-10 01:33:12 +01:00
|
|
|
* IndexScanOK
|
1999-11-01 03:29:27 +01:00
|
|
|
*
|
2000-11-10 01:33:12 +01:00
|
|
|
* This function checks for tuples that will be fetched by
|
|
|
|
* IndexSupportInitialize() during relcache initialization for
|
|
|
|
* certain system indexes that support critical syscaches.
|
|
|
|
* We can't use an indexscan to fetch these, else we'll get into
|
|
|
|
* infinite recursion. A plain heap scan will work, however.
|
2002-02-19 21:11:20 +01:00
|
|
|
* Once we have completed relcache initialization (signaled by
|
|
|
|
* criticalRelcachesBuilt), we don't have to worry anymore.
|
2010-04-21 01:48:47 +02:00
|
|
|
*
|
|
|
|
* Similarly, during backend startup we have to be able to use the
|
2021-03-26 18:42:17 +01:00
|
|
|
* pg_authid, pg_auth_members and pg_database syscaches for
|
|
|
|
* authentication even if we don't yet have relcache entries for those
|
|
|
|
* catalogs' indexes.
|
1999-11-01 03:29:27 +01:00
|
|
|
*/
|
2000-11-10 01:33:12 +01:00
|
|
|
static bool
|
|
|
|
IndexScanOK(CatCache *cache, ScanKey cur_skey)
|
1999-11-01 03:29:27 +01:00
|
|
|
{
|
2010-04-21 01:48:47 +02:00
|
|
|
switch (cache->id)
|
1999-11-01 03:29:27 +01:00
|
|
|
{
|
2010-04-21 01:48:47 +02:00
|
|
|
case INDEXRELID:
|
2010-07-06 21:19:02 +02:00
|
|
|
|
2010-04-21 01:48:47 +02:00
|
|
|
/*
|
|
|
|
* Rather than tracking exactly which indexes have to be loaded
|
|
|
|
* before we can use indexscans (which changes from time to time),
|
|
|
|
* just force all pg_index searches to be heap scans until we've
|
|
|
|
* built the critical relcaches.
|
|
|
|
*/
|
|
|
|
if (!criticalRelcachesBuilt)
|
|
|
|
return false;
|
|
|
|
break;
|
|
|
|
|
|
|
|
case AMOID:
|
|
|
|
case AMNAME:
|
2010-07-06 21:19:02 +02:00
|
|
|
|
2010-04-21 01:48:47 +02:00
|
|
|
/*
|
|
|
|
* Always do heap scans in pg_am, because it's so small there's
|
|
|
|
* not much point in an indexscan anyway. We *must* do this when
|
|
|
|
* initially building critical relcache entries, but we might as
|
|
|
|
* well just always do it.
|
|
|
|
*/
|
2000-11-10 01:33:12 +01:00
|
|
|
return false;
|
2010-04-21 01:48:47 +02:00
|
|
|
|
|
|
|
case AUTHNAME:
|
|
|
|
case AUTHOID:
|
|
|
|
case AUTHMEMMEMROLE:
|
2021-03-26 18:42:17 +01:00
|
|
|
case DATABASEOID:
|
2010-07-06 21:19:02 +02:00
|
|
|
|
2010-04-21 01:48:47 +02:00
|
|
|
/*
|
|
|
|
* Protect authentication lookups occurring before relcache has
|
|
|
|
* collected entries for shared indexes.
|
|
|
|
*/
|
|
|
|
if (!criticalSharedRelcachesBuilt)
|
|
|
|
return false;
|
|
|
|
break;
|
|
|
|
|
|
|
|
default:
|
|
|
|
break;
|
2001-08-21 18:36:06 +02:00
|
|
|
}
|
1999-11-01 03:29:27 +01:00
|
|
|
|
2000-11-10 01:33:12 +01:00
|
|
|
/* Normal case, allow index scan */
|
|
|
|
return true;
|
1999-11-01 03:29:27 +01:00
|
|
|
}
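
The decision made by IndexScanOK() above can be modeled as a pure function of the cache identifier and the two relcache-bootstrap flags. The sketch below is an illustration only: the SketchCacheId enum and sketch_index_scan_ok() are hypothetical stand-ins, and the flags are passed as parameters rather than read from the globals criticalRelcachesBuilt and criticalSharedRelcachesBuilt.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the real syscache identifiers. */
typedef enum SketchCacheId
{
	SKETCH_INDEXRELID,
	SKETCH_AMOID,
	SKETCH_AMNAME,
	SKETCH_AUTHNAME,
	SKETCH_AUTHOID,
	SKETCH_AUTHMEMMEMROLE,
	SKETCH_DATABASEOID,
	SKETCH_OTHER
} SketchCacheId;

/* Return true if an index scan may be used, false to force a heap scan. */
static bool
sketch_index_scan_ok(SketchCacheId id,
					 bool critical_built,
					 bool critical_shared_built)
{
	switch (id)
	{
		case SKETCH_INDEXRELID:
			/* pg_index lookups: heap scan until critical relcaches exist */
			return critical_built;

		case SKETCH_AMOID:
		case SKETCH_AMNAME:
			/* pg_am is tiny: always heap scan */
			return false;

		case SKETCH_AUTHNAME:
		case SKETCH_AUTHOID:
		case SKETCH_AUTHMEMMEMROLE:
		case SKETCH_DATABASEOID:
			/* authentication-time lookups depend on the shared indexes */
			return critical_shared_built;

		default:
			return true;
	}
}
```

Taking the flags as arguments makes the three cases easy to exercise in isolation, at the cost of differing from the real function's signature.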
|
|
|
|
|
2001-02-22 19:39:20 +01:00
|
|
|
/*
|
2021-01-13 02:32:21 +01:00
|
|
|
* SearchCatCache
|
1996-07-09 08:22:35 +02:00
|
|
|
*
|
|
|
|
* This call searches a system cache for a tuple, opening the relation
|
2002-03-03 18:47:56 +01:00
|
|
|
* if necessary (on the first access to a particular cache).
|
|
|
|
*
|
|
|
|
* The result is NULL if not found, or a pointer to a HeapTuple in
|
|
|
|
* the cache. The caller must not modify the tuple, and must call
|
|
|
|
* ReleaseCatCache() when done with it.
|
|
|
|
*
|
|
|
|
* The search key values should be expressed as Datums of the key columns'
|
|
|
|
* datatype(s). (Pass zeroes for any unused parameters.) As a special
|
|
|
|
* exception, the passed-in key for a NAME column can be just a C string;
|
|
|
|
* the caller need not go to the trouble of converting it to a fully
|
|
|
|
* null-padded NAME.
|
1996-07-09 08:22:35 +02:00
|
|
|
*/
|
|
|
|
HeapTuple
|
2000-11-16 23:30:52 +01:00
|
|
|
SearchCatCache(CatCache *cache,
|
1996-07-09 08:22:35 +02:00
|
|
|
Datum v1,
|
|
|
|
Datum v2,
|
|
|
|
Datum v3,
|
|
|
|
Datum v4)
|
|
|
|
{
|
2017-10-13 22:16:50 +02:00
|
|
|
return SearchCatCacheInternal(cache, cache->cc_nkeys, v1, v2, v3, v4);
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
/*
|
|
|
|
* SearchCatCacheN() are SearchCatCache() versions for a specific number of
|
|
|
|
* arguments. The compiler can inline the body and unroll loops, making them a
|
|
|
|
* bit faster than SearchCatCache().
|
|
|
|
*/
|
|
|
|
|
|
|
|
HeapTuple
|
|
|
|
SearchCatCache1(CatCache *cache,
|
|
|
|
Datum v1)
|
|
|
|
{
|
|
|
|
return SearchCatCacheInternal(cache, 1, v1, 0, 0, 0);
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
HeapTuple
|
|
|
|
SearchCatCache2(CatCache *cache,
|
|
|
|
Datum v1, Datum v2)
|
|
|
|
{
|
|
|
|
return SearchCatCacheInternal(cache, 2, v1, v2, 0, 0);
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
HeapTuple
|
|
|
|
SearchCatCache3(CatCache *cache,
|
|
|
|
Datum v1, Datum v2, Datum v3)
|
|
|
|
{
|
|
|
|
return SearchCatCacheInternal(cache, 3, v1, v2, v3, 0);
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
HeapTuple
|
|
|
|
SearchCatCache4(CatCache *cache,
|
|
|
|
Datum v1, Datum v2, Datum v3, Datum v4)
|
|
|
|
{
|
|
|
|
return SearchCatCacheInternal(cache, 4, v1, v2, v3, v4);
|
|
|
|
}
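
The SearchCatCacheN() wrappers above pass a compile-time constant key count to an inline worker, so the compiler can specialize the worker's loops for each wrapper. The sketch below shows the same idea on a toy key-combining function. All names here (sketch_hash_internal(), sketch_hash(), sketch_hash1(), sketch_hash2(), SKETCH_MAXKEYS) are hypothetical, and the rotate-and-xor mixing is an arbitrary choice for the example, not the catcache hash. Whether the loop is actually unrolled depends on the compiler and optimization level.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAXKEYS 4

/* Shared worker: combines the first 'nkeys' values, ignoring the rest. */
static inline uint32_t
sketch_hash_internal(int nkeys,
					 uint32_t v1, uint32_t v2, uint32_t v3, uint32_t v4)
{
	uint32_t	keys[SKETCH_MAXKEYS] = {v1, v2, v3, v4};
	uint32_t	h = 0;

	assert(nkeys >= 1 && nkeys <= SKETCH_MAXKEYS);

	for (int i = 0; i < nkeys; i++)
		h = ((h << 5) | (h >> 27)) ^ keys[i];	/* rotate left by 5, then xor */

	return h;
}

/* Generic entry point: key count is only known at run time. */
static uint32_t
sketch_hash(int nkeys, uint32_t v1, uint32_t v2, uint32_t v3, uint32_t v4)
{
	return sketch_hash_internal(nkeys, v1, v2, v3, v4);
}

/* Specialized entry points: the constant lets the inlined loop be unrolled. */
static uint32_t
sketch_hash1(uint32_t v1)
{
	return sketch_hash_internal(1, v1, 0, 0, 0);
}

static uint32_t
sketch_hash2(uint32_t v1, uint32_t v2)
{
	return sketch_hash_internal(2, v1, v2, 0, 0);
}
```

The specialized and generic entry points return the same value for the same keys; only the generated code differs.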
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Work-horse for SearchCatCache/SearchCatCacheN.
|
|
|
|
*/
|
|
|
|
static inline HeapTuple
|
|
|
|
SearchCatCacheInternal(CatCache *cache,
|
|
|
|
int nkeys,
|
|
|
|
Datum v1,
|
|
|
|
Datum v2,
|
|
|
|
Datum v3,
|
|
|
|
Datum v4)
|
|
|
|
{
|
|
|
|
Datum arguments[CATCACHE_MAXKEYS];
|
2002-03-03 18:47:56 +01:00
|
|
|
uint32 hashValue;
|
|
|
|
Index hashIndex;
|
2012-10-19 01:30:43 +02:00
|
|
|
dlist_iter iter;
|
2012-10-16 22:36:30 +02:00
|
|
|
dlist_head *bucket;
|
2000-11-16 23:30:52 +01:00
|
|
|
CatCTup *ct;
|
1997-09-07 07:04:48 +02:00
|
|
|
|
2015-05-20 15:18:11 +02:00
|
|
|
/* Make sure we're in an xact, even if this ends up being a cache hit */
|
2013-07-15 19:31:36 +02:00
|
|
|
Assert(IsTransactionState());
|
|
|
|
|
Improve sys/catcache performance.
The following are the individual improvements:
1) Avoidance of FunctionCallInfo based function calls, replaced by
more efficient functions with a native C argument interface.
2) Don't extract columns from a cache entry's tuple whenever matching
entries - instead store them as a Datum array. This also allows to
get rid of having to build dummy tuples for negative & list
entries, and of a hack for dealing with cstring vs. text weirdness.
3) Reorder members of catcache.h struct, so imortant entries are more
likely to be on one cacheline.
4) Allowing the compiler to specialize critical SearchCatCache for a
specific number of attributes allows to unroll loops and avoid
other nkeys dependant initialization.
5) Only initializing the ScanKey when necessary, i.e. catcache misses,
greatly reduces cache unnecessary cpu cache misses.
6) Split of the cache-miss case from the hash lookup, reducing stack
allocations etc in the common case.
7) CatCTup and their corresponding heaptuple are allocated in one
piece.
This results in making cache lookups themselves roughly three times as
fast - full-system benchmarks obviously improve less than that.
I've also evaluated further techniques:
- replace open coded hash with simplehash - the list walk right now
shows up in profiles. Unfortunately it's not easy to do so safely as
an entry's memory location can change at various times, which
doesn't work well with the refcounting and cache invalidation.
- Cacheline-aligning CatCTup entries - helps some with performance,
but the win isn't big and the code for it is ugly, because the
tuples have to be freed as well.
- add more proper functions, rather than macros for
SearchSysCacheCopyN etc., but right now they don't show up in
profiles.
The reason the macro wrappers for syscache.c/h have to be changed,
rather than just catcache, is that doing otherwise would require
exposing the SysCache array to the outside. That might be a good idea
anyway, but it's for another day.
Author: Andres Freund
Reviewed-By: Robert Haas
Discussion: https://postgr.es/m/20170914061207.zxotvyopetm7lrrp@alap3.anarazel.de
2017-10-13 22:16:50 +02:00

	Assert(cache->cc_nkeys == nkeys);

	/*
	 * one-time startup overhead for each cache
	 */
	if (unlikely(cache->cc_tupdesc == NULL))
		CatalogCacheInitializeCache(cache);

#ifdef CATCACHE_STATS
	cache->cc_searches++;
#endif

	/* Initialize local parameter array */
	arguments[0] = v1;
	arguments[1] = v2;
	arguments[2] = v3;
	arguments[3] = v4;

	/*
	 * find the hash bucket in which to look for the tuple
	 */
	hashValue = CatalogCacheComputeHashValue(cache, nkeys, v1, v2, v3, v4);
	hashIndex = HASH_INDEX(hashValue, cache->cc_nbuckets);

	/*
	 * scan the hash bucket until we find a match or exhaust our tuples
	 *
	 * Note: it's okay to use dlist_foreach here, even though we modify the
	 * dlist within the loop, because we don't continue the loop afterwards.
	 */
	bucket = &cache->cc_bucket[hashIndex];
	dlist_foreach(iter, bucket)
	{
		ct = dlist_container(CatCTup, cache_elem, iter.cur);

		if (ct->dead)
			continue;			/* ignore dead entries */

		if (ct->hash_value != hashValue)
			continue;			/* quickly skip entry if wrong hash val */

		if (!CatalogCacheCompareTuple(cache, nkeys, ct->keys, arguments))
			continue;

		/*
		 * We found a match in the cache.  Move it to the front of the list
		 * for its hashbucket, in order to speed subsequent searches.  (The
		 * most frequently accessed elements in any hashbucket will tend to be
		 * near the front of the hashbucket's list.)
		 */
		dlist_move_head(bucket, &ct->cache_elem);

		/*
		 * If it's a positive entry, bump its refcount and return it.  If it's
		 * negative, we can report failure to the caller.
		 */
		if (!ct->negative)
		{
Make ResourceOwners more easily extensible.
Instead of having a separate array/hash for each resource kind, use a
single array and hash to hold all kinds of resources. This makes it
possible to introduce new resource "kinds" without having to modify
the ResourceOwnerData struct. In particular, this makes it possible
for extensions to register custom resource kinds.
The old approach was to have a small array of resources of each kind,
and if it fills up, switch to a hash table. The new approach also uses
an array and a hash, but now the array and the hash are used at the
same time. The array is used to hold the recently added resources, and
when it fills up, they are moved to the hash. This keeps the access to
recent entries fast, even when there are a lot of long-held resources.
All the resource-specific ResourceOwnerEnlarge*(),
ResourceOwnerRemember*(), and ResourceOwnerForget*() functions have
been replaced with three generic functions that take resource kind as
argument. For convenience, we still define resource-specific wrapper
macros around the generic functions with the old names, but they are
now defined in the source files that use those resource kinds.
The release callback no longer needs to call ResourceOwnerForget on
the resource being released. ResourceOwnerRelease unregisters the
resource from the owner before calling the callback. That needed some
changes in bufmgr.c and some other files, where releasing the
resources previously always called ResourceOwnerForget.
Each resource kind specifies a release priority, and
ResourceOwnerReleaseAll releases the resources in priority order. To
make that possible, we have to restrict what you can do between
phases. After calling ResourceOwnerRelease(), you are no longer
allowed to remember any more resources in it or to forget any
previously remembered resources by calling ResourceOwnerForget. There
was one case where that was done previously. At subtransaction commit,
AtEOSubXact_Inval() would handle the invalidation messages and call
RelationFlushRelation(), which temporarily increased the reference
count on the relation being flushed. We now switch to the parent
subtransaction's resource owner before calling AtEOSubXact_Inval(), so
that there is a valid ResourceOwner to temporarily hold that relcache
reference.
Other end-of-xact routines make similar calls to AtEOXact_Inval()
between release phases, but I didn't see any regression test failures
from those, so I'm not sure if they could reach a codepath that needs
remembering extra resources.
There were two exceptions to how the resource leak WARNINGs on commit
were printed previously: llvmjit silently released the context without
printing the warning, and a leaked buffer io triggered a PANIC. Now
everything prints a WARNING, including those cases.
Add tests in src/test/modules/test_resowner.
Reviewed-by: Aleksander Alekseev, Michael Paquier, Julien Rouhaud
Reviewed-by: Kyotaro Horiguchi, Hayato Kuroda, Álvaro Herrera, Zhihong Yu
Reviewed-by: Peter Eisentraut, Andres Freund
Discussion: https://www.postgresql.org/message-id/cbfabeb0-cd3c-e951-a572-19b365ed314d%40iki.fi
2023-11-08 12:30:50 +01:00
			ResourceOwnerEnlarge(CurrentResourceOwner);
			ct->refcount++;
			ResourceOwnerRememberCatCacheRef(CurrentResourceOwner, &ct->tuple);

			CACHE_elog(DEBUG2, "SearchCatCache(%s): found in bucket %d",
					   cache->cc_relname, hashIndex);

#ifdef CATCACHE_STATS
			cache->cc_hits++;
#endif

			return &ct->tuple;
		}
		else
		{
			CACHE_elog(DEBUG2, "SearchCatCache(%s): found neg entry in bucket %d",
					   cache->cc_relname, hashIndex);

#ifdef CATCACHE_STATS
			cache->cc_neg_hits++;
#endif

			return NULL;
		}
	}

	return SearchCatCacheMiss(cache, nkeys, hashValue, hashIndex, v1, v2, v3, v4);
}
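
The hit path of SearchCatCacheInternal() can be illustrated outside the backend. What follows is a minimal sketch, not PostgreSQL code: `Entry`, `Bucket`, `bucket_push_head`, `bucket_move_head`, and `bucket_lookup` are hypothetical stand-ins for `CatCTup`, `dlist_head`, and the `dlist_*` helpers, and a single `int` key replaces the `Datum` key array. It shows the same sequence of checks (skip dead entries, reject on hash mismatch before comparing keys, move the match to the front of its bucket, then either bump the refcount or report a negative hit).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for CatCTup (not the real struct). */
typedef struct Entry
{
	struct Entry *prev;
	struct Entry *next;
	uint32_t	hash_value;		/* full hash, checked before the key */
	int			key;			/* stand-in for the Datum key array */
	int			refcount;
	bool		dead;
	bool		negative;		/* "known not to exist" marker */
} Entry;

/* Hypothetical stand-in for one hash bucket's doubly linked list. */
typedef struct Bucket
{
	Entry	   *head;
} Bucket;

/* Link an unlinked entry at the front of the bucket. */
void
bucket_push_head(Bucket *b, Entry *e)
{
	e->prev = NULL;
	e->next = b->head;
	if (b->head != NULL)
		b->head->prev = e;
	b->head = e;
}

/* Move an entry that is already in the bucket to its front. */
void
bucket_move_head(Bucket *b, Entry *e)
{
	if (b->head == e)
		return;
	assert(e->prev != NULL);	/* not the head, so it has a predecessor */
	e->prev->next = e->next;
	if (e->next != NULL)
		e->next->prev = e->prev;
	bucket_push_head(b, e);
}

/*
 * Return the matching positive entry with its refcount bumped, or NULL.
 * *found_negative distinguishes a negative-entry hit from a plain miss.
 */
Entry *
bucket_lookup(Bucket *b, uint32_t hash_value, int key, bool *found_negative)
{
	*found_negative = false;

	for (Entry *e = b->head; e != NULL; e = e->next)
	{
		if (e->dead)
			continue;			/* ignore dead entries */
		if (e->hash_value != hash_value)
			continue;			/* cheap rejection before comparing keys */
		if (e->key != key)
			continue;

		/* Modifying the list is safe here: we return without iterating on. */
		bucket_move_head(b, e);

		if (e->negative)
		{
			*found_negative = true;
			return NULL;
		}
		e->refcount++;
		return e;
	}
	return NULL;				/* plain miss: caller takes the slow path */
}
```

In this sketch a plain miss and a negative hit both return NULL; the real function instead falls through to SearchCatCacheMiss() on a plain miss, which is why only the negative case returns NULL from inside the loop.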

/*
 * Search the actual catalogs, rather than the cache.
 *
 * This is kept separate from SearchCatCacheInternal() to keep the fast-path
 * as small as possible.  To avoid that effort being undone by a helpful
 * compiler, try to explicitly forbid inlining.
 */
static pg_noinline HeapTuple
SearchCatCacheMiss(CatCache *cache,
				   int nkeys,
				   uint32 hashValue,
				   Index hashIndex,
				   Datum v1,
				   Datum v2,
				   Datum v3,
				   Datum v4)
{
	ScanKeyData cur_skey[CATCACHE_MAXKEYS];
	Relation	relation;
	SysScanDesc scandesc;
	HeapTuple	ntp;
	CatCTup    *ct;
	Datum		arguments[CATCACHE_MAXKEYS];

	/* Initialize local parameter array */
	arguments[0] = v1;
	arguments[1] = v2;
	arguments[2] = v3;
	arguments[3] = v4;

	/*
	 * Ok, need to make a lookup in the relation, copy the scankey and fill
	 * out any per-call fields.
	 */
	memcpy(cur_skey, cache->cc_skey, sizeof(ScanKeyData) * nkeys);
	cur_skey[0].sk_argument = v1;
	cur_skey[1].sk_argument = v2;
	cur_skey[2].sk_argument = v3;
	cur_skey[3].sk_argument = v4;

	/*
	 * Tuple was not found in cache, so we have to try to retrieve it directly
	 * from the relation.  If found, we will add it to the cache; if not
	 * found, we will add a negative cache entry instead.
	 *
	 * NOTE: it is possible for recursive cache lookups to occur while reading
	 * the relation --- for example, due to shared-cache-inval messages being
	 * processed during table_open().  This is OK.  It's even possible for one
	 * of those lookups to find and enter the very same tuple we are trying to
	 * fetch here.  If that happens, we will enter a second copy of the tuple
	 * into the cache.  The first copy will never be referenced again, and
	 * will eventually age out of the cache, so there's no functional problem.
	 * This case is rare enough that it's not worth expending extra cycles to
	 * detect.
	 */
	relation = table_open(cache->cc_reloid, AccessShareLock);

	scandesc = systable_beginscan(relation,
								  cache->cc_indexoid,
								  IndexScanOK(cache, cur_skey),
Use an MVCC snapshot, rather than SnapshotNow, for catalog scans.
SnapshotNow scans have the undesirable property that, in the face of
concurrent updates, the scan can fail to see either the old or the new
versions of the row. In many cases, we work around this by requiring
DDL operations to hold AccessExclusiveLock on the object being
modified; in some cases, the existing locking is inadequate and random
failures occur as a result. This commit doesn't change anything
related to locking, but will hopefully pave the way to allowing lock
strength reductions in the future.
The major issue that has held us back from making this change in the past
is that taking an MVCC snapshot is significantly more expensive than
using a static special snapshot such as SnapshotNow. However, testing
of various worst-case scenarios reveals that this problem is not
severe except under fairly extreme workloads. To mitigate those
problems, we avoid retaking the MVCC snapshot for each new scan;
instead, we take a new snapshot only when invalidation messages have
been processed. The catcache machinery already requires that
invalidation messages be sent before releasing the related heavyweight
lock; else other backends might rely on locally-cached data rather
than scanning the catalog at all. Thus, making snapshot reuse
dependent on the same guarantees shouldn't break anything that wasn't
already subtly broken.
Patch by me. Review by Michael Paquier and Andres Freund.
2013-07-02 15:47:01 +02:00
								  NULL,
								  nkeys,
								  cur_skey);

	ct = NULL;

	while (HeapTupleIsValid(ntp = systable_getnext(scandesc)))
	{
		ct = CatalogCacheCreateEntry(cache, ntp, arguments,
									 hashValue, hashIndex,
									 false);
		/* immediately set the refcount to 1 */
		ResourceOwnerEnlarge(CurrentResourceOwner);
		ct->refcount++;
		ResourceOwnerRememberCatCacheRef(CurrentResourceOwner, &ct->tuple);
		break;					/* assume only one match */
	}

	systable_endscan(scandesc);

	table_close(relation, AccessShareLock);

	/*
	 * If tuple was not found, we need to build a negative cache entry
	 * containing a fake tuple.  The fake tuple has the correct key columns,
	 * but nulls everywhere else.
	 *
	 * In bootstrap mode, we don't build negative entries, because the cache
	 * invalidation mechanism isn't alive and can't clear them if the tuple
	 * gets created later.  (Bootstrap doesn't do UPDATEs, so it doesn't need
	 * cache inval for that.)
	 */
	if (ct == NULL)
	{
		if (IsBootstrapProcessingMode())
			return NULL;

Improve sys/catcache performance.
The following are the individual improvements:
1) Avoidance of FunctionCallInfo based function calls, replaced by
more efficient functions with a native C argument interface.
2) Don't extract columns from a cache entry's tuple whenever matching
entries - instead store them as a Datum array. This also makes it
possible to get rid of building dummy tuples for negative & list
entries, and of a hack for dealing with cstring vs. text weirdness.
3) Reorder members of catcache.h struct, so important entries are more
likely to be on one cacheline.
4) Allowing the compiler to specialize critical SearchCatCache for a
specific number of attributes makes it possible to unroll loops and
avoid other nkeys-dependent initialization.
5) Only initializing the ScanKey when necessary, i.e. catcache misses,
greatly reduces unnecessary CPU cache misses.
6) Split off the cache-miss case from the hash lookup, reducing stack
allocations etc. in the common case.
7) CatCTup and their corresponding heaptuple are allocated in one
piece.
This results in making cache lookups themselves roughly three times as
fast - full-system benchmarks obviously improve less than that.
I've also evaluated further techniques:
- replace open coded hash with simplehash - the list walk right now
shows up in profiles. Unfortunately it's not easy to do so safely as
an entry's memory location can change at various times, which
doesn't work well with the refcounting and cache invalidation.
- Cacheline-aligning CatCTup entries - helps some with performance,
but the win isn't big and the code for it is ugly, because the
tuples have to be freed as well.
- add more proper functions, rather than macros for
SearchSysCacheCopyN etc., but right now they don't show up in
profiles.
The reason the macro wrappers for syscache.c/h have to be changed,
rather than just catcache, is that doing otherwise would require
exposing the SysCache array to the outside. That might be a good idea
anyway, but it's for another day.
Author: Andres Freund
Reviewed-By: Robert Haas
Discussion: https://postgr.es/m/20170914061207.zxotvyopetm7lrrp@alap3.anarazel.de
2017-10-13 22:16:50 +02:00
|
|
|
ct = CatalogCacheCreateEntry(cache, NULL, arguments,
|
2002-04-06 08:59:25 +02:00
|
|
|
hashValue, hashIndex,
|
|
|
|
true);
|
|
|
|
|
2019-02-18 12:32:34 +01:00
|
|
|
CACHE_elog(DEBUG2, "SearchCatCache(%s): Contains %d/%d tuples",
|
|
|
|
cache->cc_relname, cache->cc_ntup, CacheHdr->ch_ntup);
|
|
|
|
CACHE_elog(DEBUG2, "SearchCatCache(%s): put neg entry in bucket %d",
|
|
|
|
cache->cc_relname, hashIndex);
|
2002-04-06 08:59:25 +02:00
|
|
|
|
|
|
|
/*
|
2004-07-17 05:32:14 +02:00
|
|
|
* We are not returning the negative entry to the caller, so leave its
|
|
|
|
* refcount zero.
|
2002-04-06 08:59:25 +02:00
|
|
|
*/
|
|
|
|
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
|
2019-02-18 12:32:34 +01:00
|
|
|
CACHE_elog(DEBUG2, "SearchCatCache(%s): Contains %d/%d tuples",
|
|
|
|
cache->cc_relname, cache->cc_ntup, CacheHdr->ch_ntup);
|
|
|
|
CACHE_elog(DEBUG2, "SearchCatCache(%s): put in bucket %d",
|
|
|
|
cache->cc_relname, hashIndex);
|
2002-04-06 08:59:25 +02:00
|
|
|
|
|
|
|
#ifdef CATCACHE_STATS
|
|
|
|
cache->cc_newloads++;
|
|
|
|
#endif
|
|
|
|
|
|
|
|
return &ct->tuple;
|
|
|
|
}
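The negative-entry logic in the function above can be illustrated with a small standalone sketch. This is not catcache.c code: `NegCache`, `neg_lookup`, and `neg_backing_fetch` are hypothetical names, the cache is a fixed array rather than hash buckets, and the "catalog scan" is a stand-in that pretends only even keys exist. It only shows the idea that a failed lookup is remembered, so a repeated search for the same missing key does not go back to the slow source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NEG_SLOTS 8

/* Hypothetical cache entry: a key, a value, and a "negative" flag. */
typedef struct NegEntry
{
	bool		used;		/* is this slot occupied? */
	bool		negative;	/* true: key is known to be absent */
	int			key;
	int			value;		/* meaningful only when !negative */
} NegEntry;

typedef struct NegCache
{
	NegEntry	slots[NEG_SLOTS];
	int			backing_reads;	/* trips to the slow backing store */
} NegCache;

/* Stand-in for the catalog scan (assumption: only even keys exist). */
static bool
neg_backing_fetch(NegCache *cache, int key, int *value)
{
	cache->backing_reads++;
	if (key % 2 != 0)
		return false;
	*value = key * 10;
	return true;
}

/* Returns true and sets *value if the key exists, false otherwise. */
static bool
neg_lookup(NegCache *cache, int key, int *value)
{
	NegEntry   *free_slot = NULL;
	int			fetched = 0;
	bool		found;

	for (size_t i = 0; i < NEG_SLOTS; i++)
	{
		NegEntry   *e = &cache->slots[i];

		if (!e->used)
		{
			if (free_slot == NULL)
				free_slot = e;
			continue;
		}
		if (e->key != key)
			continue;
		if (e->negative)
			return false;	/* cached "no such key" answer */
		*value = e->value;
		return true;
	}

	/* Cache miss: ask the backing store, then remember either outcome. */
	found = neg_backing_fetch(cache, key, &fetched);
	if (free_slot != NULL)
	{
		free_slot->used = true;
		free_slot->negative = !found;
		free_slot->key = key;
		free_slot->value = fetched;
	}
	if (found)
		*value = fetched;
	return found;
}
```

Unlike this sketch, the real code has to cope with invalidation: a negative entry must be cleared if the tuple is created later, which is why the comment above says negative entries are not built in bootstrap mode.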
|
|
|
|
|
|
|
|
/*
|
|
|
|
* ReleaseCatCache
|
|
|
|
*
|
|
|
|
* Decrement the reference count of a catcache entry (releasing the
|
|
|
|
* hold grabbed by a successful SearchCatCache).
|
|
|
|
*
|
|
|
|
* NOTE: if compiled with -DCATCACHE_FORCE_RELEASE then catcache entries
|
|
|
|
* will be freed as soon as their refcount goes to zero. In combination
|
|
|
|
* with aset.c's CLOBBER_FREED_MEMORY option, this provides a good test
|
|
|
|
* to catch references to already-released catcache entries.
|
|
|
|
*/
|
|
|
|
void
|
|
|
|
ReleaseCatCache(HeapTuple tuple)
|
Make ResourceOwners more easily extensible.
Instead of having a separate array/hash for each resource kind, use a
single array and hash to hold all kinds of resources. This makes it
possible to introduce new resource "kinds" without having to modify
the ResourceOwnerData struct. In particular, this makes it possible
for extensions to register custom resource kinds.
The old approach was to have a small array of resources of each kind,
and if it fills up, switch to a hash table. The new approach also uses
an array and a hash, but now the array and the hash are used at the
same time. The array is used to hold the recently added resources, and
when it fills up, they are moved to the hash. This keeps the access to
recent entries fast, even when there are a lot of long-held resources.
All the resource-specific ResourceOwnerEnlarge*(),
ResourceOwnerRemember*(), and ResourceOwnerForget*() functions have
been replaced with three generic functions that take resource kind as
argument. For convenience, we still define resource-specific wrapper
macros around the generic functions with the old names, but they are
now defined in the source files that use those resource kinds.
The release callback no longer needs to call ResourceOwnerForget on
the resource being released. ResourceOwnerRelease unregisters the
resource from the owner before calling the callback. That needed some
changes in bufmgr.c and some other files, where releasing the
resources previously always called ResourceOwnerForget.
Each resource kind specifies a release priority, and
ResourceOwnerReleaseAll releases the resources in priority order. To
make that possible, we have to restrict what you can do between
phases. After calling ResourceOwnerRelease(), you are no longer
allowed to remember any more resources in it or to forget any
previously remembered resources by calling ResourceOwnerForget. There
was one case where that was done previously. At subtransaction commit,
AtEOSubXact_Inval() would handle the invalidation messages and call
RelationFlushRelation(), which temporarily increased the reference
count on the relation being flushed. We now switch to the parent
subtransaction's resource owner before calling AtEOSubXact_Inval(), so
that there is a valid ResourceOwner to temporarily hold that relcache
reference.
Other end-of-xact routines make similar calls to AtEOXact_Inval()
between release phases, but I didn't see any regression test failures
from those, so I'm not sure if they could reach a codepath that needs
remembering extra resources.
There were two exceptions to how the resource leak WARNINGs on commit
were printed previously: llvmjit silently released the context without
printing the warning, and a leaked buffer io triggered a PANIC. Now
everything prints a WARNING, including those cases.
Add tests in src/test/modules/test_resowner.
Reviewed-by: Aleksander Alekseev, Michael Paquier, Julien Rouhaud
Reviewed-by: Kyotaro Horiguchi, Hayato Kuroda, Álvaro Herrera, Zhihong Yu
Reviewed-by: Peter Eisentraut, Andres Freund
Discussion: https://www.postgresql.org/message-id/cbfabeb0-cd3c-e951-a572-19b365ed314d%40iki.fi
2023-11-08 12:30:50 +01:00
|
|
|
{
|
|
|
|
ReleaseCatCacheWithOwner(tuple, CurrentResourceOwner);
|
|
|
|
}
|
|
|
|
|
|
|
|
static void
|
|
|
|
ReleaseCatCacheWithOwner(HeapTuple tuple, ResourceOwner resowner)
|
2002-04-06 08:59:25 +02:00
|
|
|
{
|
|
|
|
CatCTup *ct = (CatCTup *) (((char *) tuple) -
|
|
|
|
offsetof(CatCTup, tuple));
|
|
|
|
|
|
|
|
/* Safety checks to ensure we were handed a cache entry */
|
|
|
|
Assert(ct->ct_magic == CT_MAGIC);
|
|
|
|
Assert(ct->refcount > 0);
|
|
|
|
|
|
|
|
ct->refcount--;
|
2023-11-08 12:30:50 +01:00
|
|
|
if (resowner)
|
|
|
|
ResourceOwnerForgetCatCacheRef(CurrentResourceOwner, &ct->tuple);
|
2002-04-06 08:59:25 +02:00
|
|
|
|
2005-08-14 00:18:07 +02:00
|
|
|
if (
|
2002-04-06 08:59:25 +02:00
|
|
|
#ifndef CATCACHE_FORCE_RELEASE
|
2005-08-14 00:18:07 +02:00
|
|
|
ct->dead &&
|
2002-04-06 08:59:25 +02:00
|
|
|
#endif
|
2005-08-14 00:18:07 +02:00
|
|
|
ct->refcount == 0 &&
|
|
|
|
(ct->c_list == NULL || ct->c_list->refcount == 0))
|
2002-04-06 08:59:25 +02:00
|
|
|
CatCacheRemoveCTup(ct->my_cache, ct);
|
|
|
|
}
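The pointer arithmetic at the top of ReleaseCatCacheWithOwner recovers the enclosing cache entry from a pointer to its embedded tuple. A minimal sketch of that pattern, under assumptions: `RefEntry`, `RefTuple`, `ref_acquire`, `ref_release`, and the `REF_MAGIC` value are all hypothetical, and "removal" is modeled as setting a flag instead of freeing memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define REF_MAGIC 0x1234ABCDu	/* hypothetical marker value */

typedef struct RefTuple
{
	int			payload;
} RefTuple;

typedef struct RefEntry
{
	unsigned int magic;		/* sanity marker, checked on release */
	int			refcount;	/* number of outstanding holds */
	bool		dead;		/* entry invalidated, remove when unused */
	bool		removed;	/* stand-in for actually freeing the entry */
	RefTuple	tuple;		/* embedded; callers only see this member */
} RefEntry;

/* Hand out a pointer to the embedded tuple and count the hold. */
static RefTuple *
ref_acquire(RefEntry *entry)
{
	entry->refcount++;
	return &entry->tuple;
}

/* Step back from the tuple to its containing entry, then drop the hold. */
static void
ref_release(RefTuple *tuple)
{
	RefEntry   *entry = (RefEntry *) (((char *) tuple) -
									  offsetof(RefEntry, tuple));

	/* Safety checks that we were handed an entry's embedded tuple */
	assert(entry->magic == REF_MAGIC);
	assert(entry->refcount > 0);

	entry->refcount--;

	/* A dead entry goes away only once nobody holds it any more. */
	if (entry->dead && entry->refcount == 0)
		entry->removed = true;
}
```

The design choice this illustrates is that callers never see the bookkeeping struct; they hold only a tuple pointer, and the release path finds the header by subtracting the member offset.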
|
|
|
|
|
|
|
|
|
2012-03-07 20:51:13 +01:00
|
|
|
/*
|
|
|
|
* GetCatCacheHashValue
|
|
|
|
*
|
|
|
|
* Compute the hash value for a given set of search keys.
|
|
|
|
*
|
|
|
|
* The reason for exposing this as part of the API is that the hash value is
|
|
|
|
* exposed in cache invalidation operations, so there are places outside the
|
|
|
|
* catcache code that need to be able to compute the hash values.
|
|
|
|
*/
|
|
|
|
uint32
|
|
|
|
GetCatCacheHashValue(CatCache *cache,
|
|
|
|
Datum v1,
|
|
|
|
Datum v2,
|
|
|
|
Datum v3,
|
|
|
|
Datum v4)
|
|
|
|
{
|
|
|
|
/*
|
|
|
|
* one-time startup overhead for each cache
|
|
|
|
*/
|
|
|
|
if (cache->cc_tupdesc == NULL)
|
|
|
|
CatalogCacheInitializeCache(cache);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* calculate the hash value
|
|
|
|
*/
|
2017-10-13 22:16:50 +02:00
|
|
|
return CatalogCacheComputeHashValue(cache, cache->cc_nkeys, v1, v2, v3, v4);
|
2012-03-07 20:51:13 +01:00
|
|
|
}
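GetCatCacheHashValue does two things: lazy one-time initialization of the cache, then a hash over the cache's key columns. The sketch below shows that shape only. It is not the real CatalogCacheComputeHashValue: the per-key mixer `hash_one_key`, the rotation amounts, and the `HashDemoCache` struct are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define HASH_MAXKEYS 4

typedef struct HashDemoCache
{
	bool		initialized;	/* has the one-time setup run? */
	int			nkeys;			/* number of key columns actually used */
	int			init_calls;		/* counts how often setup ran */
} HashDemoCache;

/* Rotate left; a rotation by 0 (or a multiple of 32) returns x unchanged. */
static uint32_t
rotl32(uint32_t x, int n)
{
	n &= 31;
	if (n == 0)
		return x;
	return (x << n) | (x >> (32 - n));
}

/* Hypothetical per-key hash; the real code uses per-type hash functions. */
static uint32_t
hash_one_key(uint32_t key)
{
	key *= 0x9E3779B1u;
	key ^= key >> 16;
	return key;
}

static uint32_t
hash_demo_value(HashDemoCache *cache,
				uint32_t k1, uint32_t k2, uint32_t k3, uint32_t k4)
{
	uint32_t	keys[HASH_MAXKEYS] = {k1, k2, k3, k4};
	uint32_t	hash = 0;

	/* one-time startup overhead for each cache */
	if (!cache->initialized)
	{
		cache->initialized = true;
		cache->init_calls++;
	}

	/* Only the first nkeys values take part; each gets its own rotation. */
	for (int i = 0; i < cache->nkeys; i++)
		hash ^= rotl32(hash_one_key(keys[i]), 8 * i);

	return hash;
}
```

Rotating each key's hash by a different amount makes the combined value depend on key position, so swapping two key values generally yields a different hash.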
|
|
|
|
|
|
|
|
|
2002-04-06 08:59:25 +02:00
|
|
|
/*
|
|
|
|
* SearchCatCacheList
|
|
|
|
*
|
|
|
|
* Generate a list of all tuples matching a partial key (that is,
|
|
|
|
* a key specifying just the first K of the cache's N key columns).
|
|
|
|
*
|
2018-01-29 21:13:07 +01:00
|
|
|
* It doesn't make any sense to specify all of the cache's key columns
|
|
|
|
* here: since the key is unique, there could be at most one match, so
|
|
|
|
* you ought to use SearchCatCache() instead. Hence this function takes
|
2021-02-24 08:13:17 +01:00
|
|
|
* one fewer Datum argument than SearchCatCache() does.
|
2018-01-29 21:13:07 +01:00
|
|
|
*
|
2002-04-06 08:59:25 +02:00
|
|
|
* The caller must not modify the list object or the pointed-to tuples,
|
|
|
|
* and must call ReleaseCatCacheList() when done with the list.
|
|
|
|
*/
|
|
|
|
CatCList *
|
|
|
|
SearchCatCacheList(CatCache *cache,
|
|
|
|
int nkeys,
|
|
|
|
Datum v1,
|
|
|
|
Datum v2,
|
2018-01-29 21:13:07 +01:00
|
|
|
Datum v3)
|
2002-04-06 08:59:25 +02:00
|
|
|
{
|
2018-01-29 21:13:07 +01:00
|
|
|
Datum v4 = 0; /* dummy last-column value */
|
2017-10-13 22:16:50 +02:00
|
|
|
Datum arguments[CATCACHE_MAXKEYS];
|
2002-04-06 08:59:25 +02:00
|
|
|
uint32 lHashValue;
|
2012-10-16 22:36:30 +02:00
|
|
|
dlist_iter iter;
|
2002-04-06 08:59:25 +02:00
|
|
|
CatCList *cl;
|
|
|
|
CatCTup *ct;
|
2005-08-08 21:17:23 +02:00
|
|
|
List *volatile ctlist;
|
2004-05-26 06:41:50 +02:00
|
|
|
ListCell *ctlist_item;
|
2002-04-06 08:59:25 +02:00
|
|
|
int nmembers;
|
|
|
|
bool ordered;
|
|
|
|
HeapTuple ntp;
|
|
|
|
MemoryContext oldcxt;
|
|
|
|
int i;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* one-time startup overhead for each cache
|
|
|
|
*/
|
|
|
|
if (cache->cc_tupdesc == NULL)
|
|
|
|
CatalogCacheInitializeCache(cache);
|
|
|
|
|
|
|
|
Assert(nkeys > 0 && nkeys < cache->cc_nkeys);
|
|
|
|
|
|
|
|
#ifdef CATCACHE_STATS
|
|
|
|
cache->cc_lsearches++;
|
|
|
|
#endif
|
|
|
|
|
2017-10-13 22:16:50 +02:00
|
|
|
/* Initialize local parameter array */
|
|
|
|
arguments[0] = v1;
|
|
|
|
arguments[1] = v2;
|
|
|
|
arguments[2] = v3;
|
|
|
|
arguments[3] = v4;
|
2002-03-03 18:47:56 +01:00
|
|
|
|
|
|
|
/*
|
2002-04-06 08:59:25 +02:00
|
|
|
* compute a hash value of the given keys for faster search. We don't
|
|
|
|
* presently divide the CatCList items into buckets, but this still lets
|
|
|
|
* us skip non-matching items quickly most of the time.
|
2002-03-03 18:47:56 +01:00
|
|
|
*/
|
2017-10-13 22:16:50 +02:00
|
|
|
lHashValue = CatalogCacheComputeHashValue(cache, nkeys, v1, v2, v3, v4);
|
1997-09-07 07:04:48 +02:00
|
|
|
|
2001-02-22 19:39:20 +01:00
|
|
|
/*
|
2002-04-06 08:59:25 +02:00
|
|
|
* scan the items until we find a match or exhaust our list
|
2012-10-19 01:30:43 +02:00
|
|
|
*
|
|
|
|
* Note: it's okay to use dlist_foreach here, even though we modify the
|
|
|
|
* dlist within the loop, because we don't continue the loop afterwards.
|
1996-07-09 08:22:35 +02:00
|
|
|
*/
|
2012-10-16 22:36:30 +02:00
|
|
|
dlist_foreach(iter, &cache->cc_lists)
|
1996-07-09 08:22:35 +02:00
|
|
|
{
|
2012-10-16 22:36:30 +02:00
|
|
|
cl = dlist_container(CatCList, cache_elem, iter.cur);
|
2002-04-06 08:59:25 +02:00
|
|
|
|
|
|
|
if (cl->dead)
|
|
|
|
continue; /* ignore dead entries */
|
|
|
|
|
|
|
|
if (cl->hash_value != lHashValue)
|
|
|
|
continue; /* quickly skip entry if wrong hash val */
|
2000-11-10 01:33:12 +01:00
|
|
|
|
2000-11-24 05:16:12 +01:00
|
|
|
/*
|
2002-04-06 08:59:25 +02:00
|
|
|
* see if the cached list matches our key.
|
2000-11-24 05:16:12 +01:00
|
|
|
*/
|
2002-04-06 08:59:25 +02:00
|
|
|
if (cl->nkeys != nkeys)
|
|
|
|
continue;
|
2017-10-13 22:16:50 +02:00
|
|
|
|
|
|
|
if (!CatalogCacheCompareTuple(cache, nkeys, cl->keys, arguments))
|
2002-04-06 08:59:25 +02:00
|
|
|
continue;
|
2000-11-24 05:16:12 +01:00
|
|
|
|
2002-04-06 08:59:25 +02:00
|
|
|
/*
|
2006-06-15 04:08:09 +02:00
|
|
|
* We found a matching list. Move the list to the front of the
|
|
|
|
* cache's list-of-lists, to speed subsequent searches. (We do not
|
2005-08-14 00:18:07 +02:00
|
|
|
* move the members to the fronts of their hashbucket lists, however,
|
2002-04-06 08:59:25 +02:00
|
|
|
* since there's no point in that unless they are searched for
|
2005-08-14 00:18:07 +02:00
|
|
|
* individually.)
|
2002-04-06 08:59:25 +02:00
|
|
|
*/
|
2012-10-16 22:36:30 +02:00
|
|
|
dlist_move_head(&cache->cc_lists, &cl->cache_elem);
|
2002-04-06 08:59:25 +02:00
|
|
|
|
|
|
|
/* Bump the list's refcount and return it */
|
2023-11-08 12:30:50 +01:00
|
|
|
ResourceOwnerEnlarge(CurrentResourceOwner);
|
2002-04-06 08:59:25 +02:00
|
|
|
cl->refcount++;
|
2004-07-17 05:32:14 +02:00
|
|
|
ResourceOwnerRememberCatCacheListRef(CurrentResourceOwner, cl);
|
2002-04-06 08:59:25 +02:00
|
|
|
|
2019-02-18 12:32:34 +01:00
|
|
|
CACHE_elog(DEBUG2, "SearchCatCacheList(%s): found list",
|
|
|
|
cache->cc_relname);
|
2002-04-06 08:59:25 +02:00
|
|
|
|
|
|
|
#ifdef CATCACHE_STATS
|
|
|
|
cache->cc_lhits++;
|
|
|
|
#endif
|
|
|
|
|
|
|
|
return cl;
|
1996-07-09 08:22:35 +02:00
|
|
|
}
|
2002-04-06 08:59:25 +02:00
|
|
|
|
|
|
|
/*
|
|
|
|
* List was not found in cache, so we have to build it by reading the
|
|
|
|
* relation. For each matching tuple found in the relation, use an
|
|
|
|
* existing cache entry if possible, else build a new one.
|
2005-08-08 21:17:23 +02:00
|
|
|
*
|
2005-08-14 00:18:07 +02:00
|
|
|
* We have to bump the member refcounts temporarily to ensure they won't
|
2005-08-08 21:17:23 +02:00
|
|
|
* get dropped from the cache while loading other members. We use a PG_TRY
|
|
|
|
* block to ensure we can undo those refcounts if we get an error before
|
|
|
|
* we finish constructing the CatCList.
|
2002-04-06 08:59:25 +02:00
|
|
|
*/
|
|
|
|
ctlist = NIL;
|
|
|
|
|
2005-08-08 21:17:23 +02:00
|
|
|
PG_TRY();
|
1996-07-09 08:22:35 +02:00
|
|
|
{
|
Improve sys/catcache performance.
The following are the individual improvements:
1) Avoidance of FunctionCallInfo based function calls, replaced by
more efficient functions with a native C argument interface.
2) Don't extract columns from a cache entry's tuple whenever matching
entries - instead store them as a Datum array. This also gets rid of
having to build dummy tuples for negative & list entries, and of a
hack for dealing with cstring vs. text weirdness.
3) Reorder members of catcache.h struct, so important entries are more
likely to be on one cacheline.
4) Allowing the compiler to specialize critical SearchCatCache for a
specific number of attributes allows it to unroll loops and avoid
other nkeys-dependent initialization.
5) Only initializing the ScanKey when necessary, i.e. catcache misses,
greatly reduces unnecessary CPU cache misses.
6) Split the cache-miss case off from the hash lookup, reducing stack
allocations etc. in the common case.
7) CatCTup and their corresponding heaptuple are allocated in one
piece.
This results in making cache lookups themselves roughly three times as
fast - full-system benchmarks obviously improve less than that.
I've also evaluated further techniques:
- replace open coded hash with simplehash - the list walk right now
shows up in profiles. Unfortunately it's not easy to do so safely as
an entry's memory location can change at various times, which
doesn't work well with the refcounting and cache invalidation.
- Cacheline-aligning CatCTup entries - helps some with performance,
but the win isn't big and the code for it is ugly, because the
tuples have to be freed as well.
- add more proper functions, rather than macros for
SearchSysCacheCopyN etc., but right now they don't show up in
profiles.
The reason the macro wrappers for syscache.c/h have to be changed,
rather than just catcache, is that doing otherwise would require
exposing the SysCache array to the outside. That might be a good idea
anyway, but it's for another day.
Author: Andres Freund
Reviewed-By: Robert Haas
Discussion: https://postgr.es/m/20170914061207.zxotvyopetm7lrrp@alap3.anarazel.de
2017-10-13 22:16:50 +02:00
|
|
|
ScanKeyData cur_skey[CATCACHE_MAXKEYS];
|
2005-08-08 21:17:23 +02:00
|
|
|
Relation relation;
|
|
|
|
SysScanDesc scandesc;
|
2000-11-10 01:33:12 +01:00
|
|
|
|
2017-10-13 22:16:50 +02:00
|
|
|
/*
|
|
|
|
* Ok, need to make a lookup in the relation, copy the scankey and
|
|
|
|
* fill out any per-call fields.
|
|
|
|
*/
|
|
|
|
memcpy(cur_skey, cache->cc_skey, sizeof(ScanKeyData) * cache->cc_nkeys);
|
|
|
|
cur_skey[0].sk_argument = v1;
|
|
|
|
cur_skey[1].sk_argument = v2;
|
|
|
|
cur_skey[2].sk_argument = v3;
|
|
|
|
cur_skey[3].sk_argument = v4;
|
|
|
|
|
2019-01-21 19:32:19 +01:00
|
|
|
relation = table_open(cache->cc_reloid, AccessShareLock);
|
2002-04-06 08:59:25 +02:00
|
|
|
|
2005-08-08 21:17:23 +02:00
|
|
|
scandesc = systable_beginscan(relation,
|
|
|
|
cache->cc_indexoid,
|
2010-04-21 01:48:47 +02:00
|
|
|
IndexScanOK(cache, cur_skey),
|
Use an MVCC snapshot, rather than SnapshotNow, for catalog scans.
SnapshotNow scans have the undesirable property that, in the face of
concurrent updates, the scan can fail to see either the old or the new
versions of the row. In many cases, we work around this by requiring
DDL operations to hold AccessExclusiveLock on the object being
modified; in some cases, the existing locking is inadequate and random
failures occur as a result. This commit doesn't change anything
related to locking, but will hopefully pave the way to allowing lock
strength reductions in the future.
The major issue that has held us back from making this change in the past
is that taking an MVCC snapshot is significantly more expensive than
using a static special snapshot such as SnapshotNow. However, testing
of various worst-case scenarios reveals that this problem is not
severe except under fairly extreme workloads. To mitigate those
problems, we avoid retaking the MVCC snapshot for each new scan;
instead, we take a new snapshot only when invalidation messages have
been processed. The catcache machinery already requires that
invalidation messages be sent before releasing the related heavyweight
lock; else other backends might rely on locally-cached data rather
than scanning the catalog at all. Thus, making snapshot reuse
dependent on the same guarantees shouldn't break anything that wasn't
already subtly broken.
Patch by me. Review by Michael Paquier and Andres Freund.
2013-07-02 15:47:01 +02:00
|
|
|
NULL,
|
2005-08-08 21:17:23 +02:00
|
|
|
nkeys,
|
|
|
|
cur_skey);
|
2002-04-06 08:59:25 +02:00
|
|
|
|
2005-08-08 21:17:23 +02:00
|
|
|
/* The list will be ordered iff we are doing an index scan */
|
|
|
|
ordered = (scandesc->irel != NULL);
|
2002-04-06 08:59:25 +02:00
|
|
|
|
2005-08-08 21:17:23 +02:00
|
|
|
while (HeapTupleIsValid(ntp = systable_getnext(scandesc)))
|
|
|
|
{
|
|
|
|
uint32 hashValue;
|
|
|
|
Index hashIndex;
|
2012-10-16 22:36:30 +02:00
|
|
|
bool found = false;
|
|
|
|
dlist_head *bucket;
|
1997-09-07 07:04:48 +02:00
|
|
|
|
2002-04-06 08:59:25 +02:00
|
|
|
/*
|
2005-08-08 21:17:23 +02:00
|
|
|
* See if there's an entry for this tuple already.
|
2002-04-06 08:59:25 +02:00
|
|
|
*/
|
2005-08-08 21:17:23 +02:00
|
|
|
ct = NULL;
|
2017-10-13 22:16:50 +02:00
|
|
|
hashValue = CatalogCacheComputeTupleHashValue(cache, cache->cc_nkeys, ntp);
|
2005-08-08 21:17:23 +02:00
|
|
|
hashIndex = HASH_INDEX(hashValue, cache->cc_nbuckets);
|
1997-09-07 07:04:48 +02:00
|
|
|
|
2012-10-16 22:36:30 +02:00
|
|
|
bucket = &cache->cc_bucket[hashIndex];
|
|
|
|
dlist_foreach(iter, bucket)
|
2005-08-08 21:17:23 +02:00
|
|
|
{
|
2012-10-16 22:36:30 +02:00
|
|
|
ct = dlist_container(CatCTup, cache_elem, iter.cur);
|
2002-04-06 08:59:25 +02:00
|
|
|
|
2005-08-08 21:17:23 +02:00
|
|
|
if (ct->dead || ct->negative)
|
|
|
|
continue; /* ignore dead and negative entries */
|
2002-04-06 08:59:25 +02:00
|
|
|
|
2005-08-08 21:17:23 +02:00
|
|
|
if (ct->hash_value != hashValue)
|
|
|
|
continue; /* quickly skip entry if wrong hash val */
|
|
|
|
|
|
|
|
if (!ItemPointerEquals(&(ct->tuple.t_self), &(ntp->t_self)))
|
|
|
|
continue; /* not same tuple */
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Found a match, but can't use it if it belongs to another
|
|
|
|
* list already
|
|
|
|
*/
|
|
|
|
if (ct->c_list)
|
|
|
|
continue;
|
|
|
|
|
2012-10-16 22:36:30 +02:00
|
|
|
found = true;
|
2006-06-15 04:08:09 +02:00
|
|
|
break; /* A-OK */
|
2005-08-08 21:17:23 +02:00
|
|
|
}
|
|
|
|
|
2012-10-16 22:36:30 +02:00
|
|
|
if (!found)
|
2005-08-08 21:17:23 +02:00
|
|
|
{
|
|
|
|
/* We didn't find a usable entry, so make a new one */
|
2017-10-13 22:16:50 +02:00
|
|
|
ct = CatalogCacheCreateEntry(cache, ntp, arguments,
|
2005-08-08 21:17:23 +02:00
|
|
|
hashValue, hashIndex,
|
|
|
|
false);
|
|
|
|
}
|
|
|
|
|
|
|
|
/* Careful here: add entry to ctlist, then bump its refcount */
|
2005-08-14 00:18:07 +02:00
|
|
|
/* This way leaves state correct if lappend runs out of memory */
|
2005-08-08 21:17:23 +02:00
|
|
|
ctlist = lappend(ctlist, ct);
|
|
|
|
ct->refcount++;
|
1996-07-09 08:22:35 +02:00
|
|
|
}
|
1997-09-07 07:04:48 +02:00
|
|
|
|
2005-08-08 21:17:23 +02:00
|
|
|
systable_endscan(scandesc);
|
|
|
|
|
2019-01-21 19:32:19 +01:00
|
|
|
table_close(relation, AccessShareLock);
|
2005-08-08 21:17:23 +02:00
|
|
|
|
Move a few ResourceOwnerEnlarge() calls for safety and clarity.
These are functions where a lot of things happen between the
ResourceOwnerEnlarge and ResourceOwnerRemember calls. It's important
that there are no unrelated ResourceOwnerRemember calls in the code in
between, otherwise the reserved entry might be used up by the
intervening ResourceOwnerRemember and not be available at the intended
ResourceOwnerRemember call anymore. I don't see any bugs here, but the
longer the code path between the calls is, the harder it is to verify.
In bufmgr.c, there is a function similar to ResourceOwnerEnlarge,
ReservePrivateRefCountEntry(), to ensure that the private refcount
array has enough space. The ReservePrivateRefCountEntry() calls were
made at different places than the ResourceOwnerEnlargeBuffers()
calls. Move the ResourceOwnerEnlargeBuffers() and
ReservePrivateRefCountEntry() calls together for consistency.
Reviewed-by: Aleksander Alekseev, Michael Paquier, Julien Rouhaud
Reviewed-by: Kyotaro Horiguchi, Hayato Kuroda, Álvaro Herrera, Zhihong Yu
Reviewed-by: Peter Eisentraut, Andres Freund
Discussion: https://www.postgresql.org/message-id/cbfabeb0-cd3c-e951-a572-19b365ed314d%40iki.fi
2023-11-08 12:30:46 +01:00
|
|
|
/* Make sure the resource owner has room to remember this entry. */
|
Make ResourceOwners more easily extensible.
Instead of having a separate array/hash for each resource kind, use a
single array and hash to hold all kinds of resources. This makes it
possible to introduce new resource "kinds" without having to modify
the ResourceOwnerData struct. In particular, this makes it possible
for extensions to register custom resource kinds.
The old approach was to have a small array of resources of each kind,
and if it fills up, switch to a hash table. The new approach also uses
an array and a hash, but now the array and the hash are used at the
same time. The array is used to hold the recently added resources, and
when it fills up, they are moved to the hash. This keeps the access to
recent entries fast, even when there are a lot of long-held resources.
All the resource-specific ResourceOwnerEnlarge*(),
ResourceOwnerRemember*(), and ResourceOwnerForget*() functions have
been replaced with three generic functions that take resource kind as
argument. For convenience, we still define resource-specific wrapper
macros around the generic functions with the old names, but they are
now defined in the source files that use those resource kinds.
The release callback no longer needs to call ResourceOwnerForget on
the resource being released. ResourceOwnerRelease unregisters the
resource from the owner before calling the callback. That needed some
changes in bufmgr.c and some other files, where releasing the
resources previously always called ResourceOwnerForget.
Each resource kind specifies a release priority, and
ResourceOwnerReleaseAll releases the resources in priority order. To
make that possible, we have to restrict what you can do between
phases. After calling ResourceOwnerRelease(), you are no longer
allowed to remember any more resources in it or to forget any
previously remembered resources by calling ResourceOwnerForget. There
was one case where that was done previously. At subtransaction commit,
AtEOSubXact_Inval() would handle the invalidation messages and call
RelationFlushRelation(), which temporarily increased the reference
count on the relation being flushed. We now switch to the parent
subtransaction's resource owner before calling AtEOSubXact_Inval(), so
that there is a valid ResourceOwner to temporarily hold that relcache
reference.
Other end-of-xact routines make similar calls to AtEOXact_Inval()
between release phases, but I didn't see any regression test failures
from those, so I'm not sure if they could reach a codepath that needs
remembering extra resources.
There were two exceptions to how the resource leak WARNINGs on commit
were printed previously: llvmjit silently released the context without
printing the warning, and a leaked buffer io triggered a PANIC. Now
everything prints a WARNING, including those cases.
Add tests in src/test/modules/test_resowner.
Reviewed-by: Aleksander Alekseev, Michael Paquier, Julien Rouhaud
Reviewed-by: Kyotaro Horiguchi, Hayato Kuroda, Álvaro Herrera, Zhihong Yu
Reviewed-by: Peter Eisentraut, Andres Freund
Discussion: https://www.postgresql.org/message-id/cbfabeb0-cd3c-e951-a572-19b365ed314d%40iki.fi
2023-11-08 12:30:50 +01:00
|
|
|
ResourceOwnerEnlarge(CurrentResourceOwner);
|
2023-11-08 12:30:46 +01:00
|
|
|
|
2017-10-13 22:16:50 +02:00
|
|
|
/* Now we can build the CatCList entry. */
|
2005-08-08 21:17:23 +02:00
|
|
|
oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
|
|
|
|
nmembers = list_length(ctlist);
|
|
|
|
cl = (CatCList *)
|
2015-02-20 06:11:42 +01:00
|
|
|
palloc(offsetof(CatCList, members) + nmembers * sizeof(CatCTup *));
|
2017-10-13 22:16:50 +02:00
|
|
|
|
|
|
|
/* Extract key values */
|
|
|
|
CatCacheCopyKeys(cache->cc_tupdesc, nkeys, cache->cc_keyno,
|
|
|
|
arguments, cl->keys);
|
2005-08-08 21:17:23 +02:00
|
|
|
MemoryContextSwitchTo(oldcxt);
|
1997-09-07 07:04:48 +02:00
|
|
|
|
2005-08-08 21:17:23 +02:00
|
|
|
/*
|
|
|
|
* We are now past the last thing that could trigger an elog before we
|
|
|
|
* have finished building the CatCList and remembering it in the
|
|
|
|
* resource owner. So it's OK to fall out of the PG_TRY, and indeed
|
|
|
|
* we'd better do so before we start marking the members as belonging
|
|
|
|
* to the list.
|
|
|
|
*/
|
|
|
|
}
|
|
|
|
PG_CATCH();
|
|
|
|
{
|
|
|
|
foreach(ctlist_item, ctlist)
|
|
|
|
{
|
|
|
|
ct = (CatCTup *) lfirst(ctlist_item);
|
|
|
|
Assert(ct->c_list == NULL);
|
|
|
|
Assert(ct->refcount > 0);
|
|
|
|
ct->refcount--;
|
2005-08-14 00:18:07 +02:00
|
|
|
if (
|
2005-08-08 21:17:23 +02:00
|
|
|
#ifndef CATCACHE_FORCE_RELEASE
|
2005-08-14 00:18:07 +02:00
|
|
|
ct->dead &&
|
2005-08-08 21:17:23 +02:00
|
|
|
#endif
|
2005-08-14 00:18:07 +02:00
|
|
|
ct->refcount == 0 &&
|
|
|
|
(ct->c_list == NULL || ct->c_list->refcount == 0))
|
2005-08-08 21:17:23 +02:00
|
|
|
CatCacheRemoveCTup(cache, ct);
|
|
|
|
}
|
2000-11-10 01:33:12 +01:00
|
|
|
|
2005-08-08 21:17:23 +02:00
|
|
|
PG_RE_THROW();
|
|
|
|
}
|
|
|
|
PG_END_TRY();
|
2002-04-06 08:59:25 +02:00
|
|
|
|
|
|
|
cl->cl_magic = CL_MAGIC;
|
|
|
|
cl->my_cache = cache;
|
2004-07-17 05:32:14 +02:00
|
|
|
cl->refcount = 0; /* for the moment */
|
2002-04-06 08:59:25 +02:00
|
|
|
cl->dead = false;
|
|
|
|
cl->ordered = ordered;
|
|
|
|
cl->nkeys = nkeys;
|
|
|
|
cl->hash_value = lHashValue;
|
|
|
|
cl->n_members = nmembers;
|
2004-09-27 06:12:03 +02:00
|
|
|
|
2005-08-08 21:17:23 +02:00
|
|
|
i = 0;
|
|
|
|
foreach(ctlist_item, ctlist)
|
2002-03-03 18:47:56 +01:00
|
|
|
{
|
2005-08-08 21:17:23 +02:00
|
|
|
cl->members[i++] = ct = (CatCTup *) lfirst(ctlist_item);
|
2002-04-06 08:59:25 +02:00
|
|
|
Assert(ct->c_list == NULL);
|
|
|
|
ct->c_list = cl;
|
2005-08-14 00:18:07 +02:00
|
|
|
/* release the temporary refcount on the member */
|
|
|
|
Assert(ct->refcount > 0);
|
|
|
|
ct->refcount--;
|
2002-04-06 08:59:25 +02:00
|
|
|
/* mark list dead if any members already dead */
|
|
|
|
if (ct->dead)
|
|
|
|
cl->dead = true;
|
|
|
|
}
|
2005-08-08 21:17:23 +02:00
|
|
|
Assert(i == nmembers);
|
2002-03-03 18:47:56 +01:00
|
|
|
|
2012-10-16 22:36:30 +02:00
|
|
|
dlist_push_head(&cache->cc_lists, &cl->cache_elem);
|
2002-03-03 18:47:56 +01:00
|
|
|
|
2004-07-17 05:32:14 +02:00
|
|
|
/* Finally, bump the list's refcount and return it */
|
|
|
|
cl->refcount++;
|
|
|
|
ResourceOwnerRememberCatCacheListRef(CurrentResourceOwner, cl);
|
|
|
|
|
2019-02-18 12:32:34 +01:00
|
|
|
CACHE_elog(DEBUG2, "SearchCatCacheList(%s): made list of %d members",
|
|
|
|
cache->cc_relname, nmembers);
|
2005-08-08 21:17:23 +02:00
|
|
|
|
2002-04-06 08:59:25 +02:00
|
|
|
return cl;
|
|
|
|
}
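The temporary member refcounts in SearchCatCacheList, and the PG_CATCH block that undoes them, follow a pin-then-undo-on-error pattern. The following is a minimal standalone sketch of that pattern, not PostgreSQL code: it assumes plain `setjmp`/`longjmp` in place of PG_TRY/PG_CATCH, and `Entry`, `pin_all`, and the `fail_at` parameter are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical stand-in for a cache entry; not PostgreSQL's CatCTup. */
typedef struct Entry
{
	int			refcount;
} Entry;

/*
 * Pin n entries. If fail_at is in [0, n), simulate an error just before
 * pinning that entry. Returns the number of entries pinned on success,
 * or -1 after undoing every pin taken so far on failure.
 */
static int
pin_all(Entry *entries, int n, int fail_at)
{
	jmp_buf		on_error;
	volatile int pinned = 0;	/* volatile: modified before, read after longjmp */

	if (setjmp(on_error) == 0)
	{
		for (int i = 0; i < n; i++)
		{
			if (i == fail_at)
				longjmp(on_error, 1);	/* simulated error */
			entries[i].refcount++;
			pinned = pinned + 1;
		}
		return pinned;
	}

	/* Error path: drop the temporary pins, as the PG_CATCH block does. */
	for (int i = 0; i < pinned; i++)
	{
		assert(entries[i].refcount > 0);
		entries[i].refcount--;
	}
	return -1;
}
```

The sketch returns an error code after cleanup, whereas the real function re-throws with PG_RE_THROW() so that callers further up the stack also see the error.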
|
2002-03-03 18:47:56 +01:00
|
|
|
|
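SearchCatCacheList allocates the CatCList header and its member pointers in a single palloc call sized with `offsetof(CatCList, members)`. The following is a minimal sketch of that sizing idiom, assuming `malloc` instead of palloc and a hypothetical `PtrList` struct and `ptrlist_alloc` helper that are not part of PostgreSQL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical list header with a trailing flexible array of pointers. */
typedef struct PtrList
{
	int			n_members;
	void	   *members[];		/* flexible array member (C99) */
} PtrList;

/*
 * Allocate the header and nmembers pointer slots in one piece.
 * Returns NULL on allocation failure; the caller releases with free().
 */
static PtrList *
ptrlist_alloc(int nmembers)
{
	PtrList    *pl;

	pl = malloc(offsetof(PtrList, members) +
				(size_t) nmembers * sizeof(void *));
	if (pl == NULL)
		return NULL;

	pl->n_members = nmembers;
	for (int i = 0; i < nmembers; i++)
		pl->members[i] = NULL;
	return pl;
}
```

Sizing from `offsetof` rather than `sizeof` of the struct counts exactly the bytes up to the first member slot, and a single allocation means a single free releases the header and the array together.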
2002-04-06 08:59:25 +02:00
|
|
|
/*
|
|
|
|
* ReleaseCatCacheList
|
|
|
|
*
|
2005-08-14 00:18:07 +02:00
|
|
|
* Decrement the reference count of a catcache list.
|
2002-04-06 08:59:25 +02:00
|
|
|
*/
|
|
|
|
void
|
|
|
|
ReleaseCatCacheList(CatCList *list)
|
2023-11-08 12:30:50 +01:00
|
|
|
{
|
|
|
|
ReleaseCatCacheListWithOwner(list, CurrentResourceOwner);
|
|
|
|
}
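Both the PG_CATCH cleanup in SearchCatCacheList and the release path test `dead` together with `refcount == 0` before removing anything from the cache. The following is a minimal sketch of that deferred-removal rule, assuming a hypothetical heap-allocated `Item` and an `item_release` helper; neither is PostgreSQL's CatCTup or CatCList.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical refcounted entry; not PostgreSQL's CatCTup or CatCList. */
typedef struct Item
{
	int			refcount;
	bool		dead;			/* invalidated while still referenced */
} Item;

/*
 * Drop one reference. Frees the item and returns true only when it is
 * both dead and unreferenced; a live item stays cached at refcount 0.
 */
static bool
item_release(Item *item)
{
	assert(item->refcount > 0);
	item->refcount--;

	if (item->dead && item->refcount == 0)
	{
		free(item);
		return true;
	}
	return false;
}
```

In the source, the `dead` test sits inside `#ifndef CATCACHE_FORCE_RELEASE`, so building with that macro defined drops the test and makes an entry eligible for removal as soon as its refcount reaches zero.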
|
|
|
|
|
|
|
|
static void
|
|
|
|
ReleaseCatCacheListWithOwner(CatCList *list, ResourceOwner resowner)
|
2002-04-06 08:59:25 +02:00
|
|
|
{
|
|
|
|
/* Safety checks to ensure we were handed a cache entry */
|
|
|
|
Assert(list->cl_magic == CL_MAGIC);
|
|
|
|
Assert(list->refcount > 0);
|
|
|
|
list->refcount--;
|
2023-11-08 12:30:50 +01:00
|
|
|
if (resowner)
|
|
|
|
ResourceOwnerForgetCatCacheListRef(resowner, list);
|
2002-04-06 08:59:25 +02:00
|
|
|
|
2005-08-14 00:18:07 +02:00
|
|
|
if (
|
2002-04-06 08:59:25 +02:00
|
|
|
#ifndef CATCACHE_FORCE_RELEASE
|
2005-08-14 00:18:07 +02:00
|
|
|
list->dead &&
|
2002-04-06 08:59:25 +02:00
|
|
|
#endif
|
2005-08-14 00:18:07 +02:00
|
|
|
list->refcount == 0)
|
2002-04-06 08:59:25 +02:00
|
|
|
CatCacheRemoveCList(list->my_cache, list);
|
|
|
|
}
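The release logic above follows a common pattern: removal of an invalidated entry is deferred until the last reference is dropped. The following is a minimal, self-contained sketch of that pattern only. `DemoList`, `demo_list_create`, `demo_list_acquire`, `demo_list_release`, `demo_removed` and `DEMO_FORCE_RELEASE` are hypothetical names invented for illustration; they are not part of catcache.c, and the sketch leaves out resource owners and the cache's linked lists entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a cached list; NOT the real CatCList. */
typedef struct DemoList
{
	int			refcount;		/* number of active references */
	bool		dead;			/* invalidated; remove once unreferenced */
} DemoList;

/* Counts how many entries have been physically removed (for illustration). */
static int	demo_removed = 0;

static DemoList *
demo_list_create(void)
{
	DemoList   *list = calloc(1, sizeof(DemoList));

	assert(list != NULL);
	return list;				/* starts with refcount 0, dead = false */
}

static void
demo_list_acquire(DemoList *list)
{
	list->refcount++;
}

/*
 * Drop one reference.  The entry is freed only when it is both dead and
 * unreferenced, unless DEMO_FORCE_RELEASE (an assumed debugging switch,
 * loosely modelled on CATCACHE_FORCE_RELEASE) removes the "dead" test.
 * Returns true if the entry was freed.
 */
static bool
demo_list_release(DemoList *list)
{
	assert(list->refcount > 0);
	list->refcount--;

	if (
#ifndef DEMO_FORCE_RELEASE
		list->dead &&
#endif
		list->refcount == 0)
	{
		free(list);
		demo_removed++;
		return true;
	}
	return false;
}
```

In this sketch a live entry with refcount 0 stays allocated so that later lookups can reuse it, while a dead entry disappears as soon as nobody holds it.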
|
|
|
|
|
|
|
|
|
|
|
|
/*
|
|
|
|
* CatalogCacheCreateEntry
|
|
|
|
* Create a new CatCTup entry, copying the given HeapTuple and other
|
2004-07-17 05:32:14 +02:00
|
|
|
* supplied data into it. The new entry initially has refcount 0.
|
2002-04-06 08:59:25 +02:00
|
|
|
*/
|
|
|
|
static CatCTup *
|
Improve sys/catcache performance.
The following are the individual improvements:
1) Avoidance of FunctionCallInfo based function calls, replaced by
more efficient functions with a native C argument interface.
2) Don't extract columns from a cache entry's tuple whenever matching
entries - instead store them as a Datum array. This also gets rid of
having to build dummy tuples for negative & list entries, and of a
hack for dealing with cstring vs. text weirdness.
3) Reorder members of catcache.h struct, so important entries are more
likely to be on one cacheline.
4) Allowing the compiler to specialize the critical SearchCatCache for a
specific number of attributes lets it unroll loops and avoid
other nkeys dependent initialization.
5) Only initializing the ScanKey when necessary, i.e. on catcache misses,
greatly reduces unnecessary cpu cache misses.
6) Split off the cache-miss case from the hash lookup, reducing stack
allocations etc in the common case.
7) CatCTup and their corresponding heaptuple are allocated in one
piece.
This results in making cache lookups themselves roughly three times as
fast - full-system benchmarks obviously improve less than that.
I've also evaluated further techniques:
- replace open coded hash with simplehash - the list walk right now
shows up in profiles. Unfortunately it's not easy to do so safely as
an entry's memory location can change at various times, which
doesn't work well with the refcounting and cache invalidation.
- Cacheline-aligning CatCTup entries - helps some with performance,
but the win isn't big and the code for it is ugly, because the
tuples have to be freed as well.
- add more proper functions, rather than macros for
SearchSysCacheCopyN etc., but right now they don't show up in
profiles.
The reason the macro wrappers for syscache.c/h have to be changed,
rather than just catcache, is that doing otherwise would require
exposing the SysCache array to the outside. That might be a good idea
anyway, but it's for another day.
Author: Andres Freund
Reviewed-By: Robert Haas
Discussion: https://postgr.es/m/20170914061207.zxotvyopetm7lrrp@alap3.anarazel.de
2017-10-13 22:16:50 +02:00
|
|
|
CatalogCacheCreateEntry(CatCache *cache, HeapTuple ntp, Datum *arguments,
|
|
|
|
uint32 hashValue, Index hashIndex,
|
|
|
|
bool negative)
|
2002-04-06 08:59:25 +02:00
|
|
|
{
|
|
|
|
CatCTup *ct;
|
Fix race condition with toast table access from a stale syscache entry.
If a tuple in a syscache contains an out-of-line toasted field, and we
try to fetch that field shortly after some other transaction has committed
an update or deletion of the tuple, there is a race condition: vacuum
could come along and remove the toast tuples before we can fetch them.
This leads to transient failures like "missing chunk number 0 for toast
value NNNNN in pg_toast_2619", as seen in recent reports from Andrew
Hammond and Tim Uckun.
The design idea of syscache is that access to stale syscache entries
should be prevented by relation-level locks, but that fails for at least
two cases where toasted fields are possible: ANALYZE updates pg_statistic
rows without locking out sessions that might want to plan queries on the
same table, and CREATE OR REPLACE FUNCTION updates pg_proc rows without
any meaningful lock at all.
The least risky fix seems to be an idea that Heikki suggested when we
were dealing with a related problem back in August: forcibly detoast any
out-of-line fields before putting a tuple into syscache in the first place.
This avoids the problem because at the time we fetch the parent tuple from
the catalog, we should be holding an MVCC snapshot that will prevent
removal of the toast tuples, even if the parent tuple is outdated
immediately after we fetch it. (Note: I'm not convinced that this
statement holds true at every instant where we could be fetching a syscache
entry at all, but it does appear to hold true at the times where we could
fetch an entry that could have a toasted field. We will need to be a bit
wary of adding toast tables to low-level catalogs that don't have them
already.) An additional benefit is that subsequent uses of the syscache
entry should be faster, since they won't have to detoast the field.
Back-patch to all supported versions. The problem is significantly harder
to reproduce in pre-9.0 releases, because of their willingness to flush
every entry in a syscache whenever the underlying catalog is vacuumed
(cf CatalogCacheFlushRelation); but there is still a window for trouble.
2011-11-02 00:48:37 +01:00
|
|
|
HeapTuple dtp;
|
2002-04-06 08:59:25 +02:00
|
|
|
MemoryContext oldcxt;
|
|
|
|
|
2017-10-13 22:16:50 +02:00
|
|
|
/* negative entries have no tuple associated */
|
|
|
|
if (ntp)
|
|
|
|
{
|
|
|
|
int i;
|
2011-11-02 00:48:37 +01:00
|
|
|
|
2017-10-13 22:16:50 +02:00
|
|
|
Assert(!negative);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* If there are any out-of-line toasted fields in the tuple, expand
|
|
|
|
* them in-line. This saves cycles during later use of the catcache
|
|
|
|
* entry, and also protects us against the possibility of the toast
|
|
|
|
* tuples being freed before we attempt to fetch them, in case of
|
|
|
|
* something using a slightly stale catcache entry.
|
|
|
|
*/
|
|
|
|
if (HeapTupleHasExternal(ntp))
|
|
|
|
dtp = toast_flatten_tuple(ntp, cache->cc_tupdesc);
|
|
|
|
else
|
|
|
|
dtp = ntp;
|
|
|
|
|
|
|
|
/* Allocate memory for CatCTup and the cached tuple in one go */
|
|
|
|
oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
|
2002-04-06 08:59:25 +02:00
|
|
|
|
2017-10-13 22:16:50 +02:00
|
|
|
ct = (CatCTup *) palloc(sizeof(CatCTup) +
|
|
|
|
MAXIMUM_ALIGNOF + dtp->t_len);
|
|
|
|
ct->tuple.t_len = dtp->t_len;
|
|
|
|
ct->tuple.t_self = dtp->t_self;
|
|
|
|
ct->tuple.t_tableOid = dtp->t_tableOid;
|
|
|
|
ct->tuple.t_data = (HeapTupleHeader)
|
|
|
|
MAXALIGN(((char *) ct) + sizeof(CatCTup));
|
|
|
|
/* copy tuple contents */
|
|
|
|
memcpy((char *) ct->tuple.t_data,
|
|
|
|
(const char *) dtp->t_data,
|
|
|
|
dtp->t_len);
|
|
|
|
MemoryContextSwitchTo(oldcxt);
|
|
|
|
|
|
|
|
if (dtp != ntp)
|
|
|
|
heap_freetuple(dtp);
|
|
|
|
|
|
|
|
/* extract keys - they'll point into the tuple if not by-value */
|
|
|
|
for (i = 0; i < cache->cc_nkeys; i++)
|
|
|
|
{
|
|
|
|
Datum atp;
|
|
|
|
bool isnull;
|
|
|
|
|
|
|
|
atp = heap_getattr(&ct->tuple,
|
|
|
|
cache->cc_keyno[i],
|
|
|
|
cache->cc_tupdesc,
|
|
|
|
&isnull);
|
|
|
|
Assert(!isnull);
|
|
|
|
ct->keys[i] = atp;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
else
|
|
|
|
{
|
|
|
|
Assert(negative);
|
|
|
|
oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
|
|
|
|
ct = (CatCTup *) palloc(sizeof(CatCTup));
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Store keys - they'll point into separately allocated memory if not
|
|
|
|
* by-value.
|
|
|
|
*/
|
|
|
|
CatCacheCopyKeys(cache->cc_tupdesc, cache->cc_nkeys, cache->cc_keyno,
|
|
|
|
arguments, ct->keys);
|
|
|
|
MemoryContextSwitchTo(oldcxt);
|
|
|
|
}
|
2011-11-02 00:48:37 +01:00
|
|
|
|
2001-02-22 19:39:20 +01:00
|
|
|
/*
|
2002-04-06 08:59:25 +02:00
|
|
|
* Finish initializing the CatCTup header, and add it to the cache's
|
2006-06-15 04:08:09 +02:00
|
|
|
* linked list and counts.
|
2000-11-16 23:30:52 +01:00
|
|
|
*/
|
|
|
|
ct->ct_magic = CT_MAGIC;
|
2001-06-18 05:35:07 +02:00
|
|
|
ct->my_cache = cache;
|
2002-04-06 08:59:25 +02:00
|
|
|
ct->c_list = NULL;
|
2004-07-17 05:32:14 +02:00
|
|
|
ct->refcount = 0; /* for the moment */
|
2000-11-16 23:30:52 +01:00
|
|
|
ct->dead = false;
|
2002-04-06 08:59:25 +02:00
|
|
|
ct->negative = negative;
|
2002-03-03 18:47:56 +01:00
|
|
|
ct->hash_value = hashValue;
|
1997-09-07 07:04:48 +02:00
|
|
|
|
2012-10-19 01:04:20 +02:00
|
|
|
dlist_push_head(&cache->cc_bucket[hashIndex], &ct->cache_elem);
|
2002-02-19 21:11:20 +01:00
|
|
|
|
2002-04-06 08:59:25 +02:00
|
|
|
cache->cc_ntup++;
|
|
|
|
CacheHdr->ch_ntup++;
|
|
|
|
|
2013-09-05 18:47:56 +02:00
|
|
|
/*
|
|
|
|
* If the hash table has become too full, enlarge the buckets array. Quite
|
|
|
|
* arbitrarily, we enlarge when fill factor > 2.
|
|
|
|
*/
|
|
|
|
if (cache->cc_ntup > cache->cc_nbuckets * 2)
|
|
|
|
RehashCatCache(cache);
|
|
|
|
|
2005-08-14 00:18:07 +02:00
|
|
|
return ct;
|
|
|
|
}
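The positive-entry branch above allocates the entry header and the tuple copy in one chunk, reserving extra bytes so the payload can start on an aligned address. A minimal, self-contained sketch of that technique follows. It uses plain `malloc` rather than palloc and memory contexts, and `DemoEntry`, `demo_entry_create`, `DEMO_ALIGNOF` and `DEMO_ALIGN` are hypothetical names chosen for illustration; the alignment of 8 is an assumption, not a statement about `MAXIMUM_ALIGNOF` on any particular platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumed alignment for the payload; hypothetical stand-in for MAXIMUM_ALIGNOF. */
#define DEMO_ALIGNOF 8

/* Round an address up to the next multiple of DEMO_ALIGNOF. */
#define DEMO_ALIGN(p) \
	(((uintptr_t) (p) + (DEMO_ALIGNOF - 1)) & ~((uintptr_t) (DEMO_ALIGNOF - 1)))

/* Hypothetical header; NOT the real CatCTup. */
typedef struct DemoEntry
{
	size_t		len;			/* payload length in bytes */
	char	   *data;			/* points into the same allocation */
} DemoEntry;

/*
 * Allocate header and payload in one piece.  DEMO_ALIGNOF spare bytes are
 * reserved so that rounding the payload start up to an aligned address can
 * never run past the end of the allocation.  Returns NULL on failure.
 */
static DemoEntry *
demo_entry_create(const void *payload, size_t len)
{
	DemoEntry  *entry = malloc(sizeof(DemoEntry) + DEMO_ALIGNOF + len);

	if (entry == NULL)
		return NULL;

	entry->len = len;
	entry->data = (char *) DEMO_ALIGN((char *) entry + sizeof(DemoEntry));
	memcpy(entry->data, payload, len);
	return entry;
}
```

Because header and payload share one allocation, a single `free(entry)` releases both, which is the main convenience of this layout.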
|
|
|
|
|
2001-02-22 19:39:20 +01:00
|
|
|
/*
|
2017-10-13 22:16:50 +02:00
|
|
|
* Helper routine that frees keys stored in the keys array.
|
2000-11-16 23:30:52 +01:00
|
|
|
*/
|
2017-10-13 22:16:50 +02:00
|
|
|
static void
|
|
|
|
CatCacheFreeKeys(TupleDesc tupdesc, int nkeys, int *attnos, Datum *keys)
|
2000-11-16 23:30:52 +01:00
|
|
|
{
|
2002-04-06 08:59:25 +02:00
|
|
|
int i;
|
2000-11-16 23:30:52 +01:00
|
|
|
|
2017-10-13 22:16:50 +02:00
|
|
|
for (i = 0; i < nkeys; i++)
|
|
|
|
{
|
|
|
|
int attnum = attnos[i];
|
|
|
|
Form_pg_attribute att;
|
|
|
|
|
Remove WITH OIDS support, change oid catalog column visibility.
Previously tables declared WITH OIDS, including a significant fraction
of the catalog tables, stored the oid column not as a normal column,
but as part of the tuple header.
This special column was not shown by default, which was somewhat odd,
as it's often (consider e.g. pg_class.oid) one of the more important
parts of a row. Neither pg_dump nor COPY included the contents of the
oid column by default.
The fact that the oid column was not an ordinary column necessitated a
significant amount of special case code to support oid columns. That
was already painful for the existing code, and upcoming work aiming to
make table storage pluggable would have required expanding and duplicating
that "specialness" significantly.
WITH OIDS has been deprecated since 2005 (commit ff02d0a05280e0).
Remove it.
Removing includes:
- CREATE TABLE and ALTER TABLE syntax for declaring the table to be
WITH OIDS has been removed (WITH (oids[ = true]) will error out)
- pg_dump does not support dumping tables declared WITH OIDS and will
issue a warning when dumping one (and ignore the oid column).
- restoring a pg_dump archive with pg_restore will warn when
restoring a table with oid contents (and ignore the oid column)
- COPY will refuse to load a binary dump that includes oids.
- pg_upgrade will error out when encountering tables declared WITH
OIDS, they have to be altered to remove the oid column first.
- Functionality to access the oid of the last inserted row (like
plpgsql's RESULT_OID, spi's SPI_lastoid, ...) has been removed.
The syntax for declaring a table WITHOUT OIDS (or WITH (oids = false)
for CREATE TABLE) is still supported. While that requires a bit of
support code, it seems unnecessary to break applications / dumps that
do not use oids, and are explicit about not using them.
The biggest user of WITH OID columns was postgres' catalog. This
commit changes all 'magic' oid columns to be columns that are normally
declared and stored. To reduce unnecessary query breakage all the
newly added columns are still named 'oid', even if a table's column
naming scheme would indicate 'reloid' or such. This obviously
requires adapting a lot of code, mostly replacing oid access via
HeapTupleGetOid() with access to the underlying Form_pg_*->oid column.
The bootstrap process now assigns oids for all oid columns in
genbki.pl that do not have an explicit value (starting at the largest
oid previously used), only oids assigned later by oids will be above
FirstBootstrapObjectId. As the oid column now is a normal column the
special bootstrap syntax for oids has been removed.
Oids are not automatically assigned during insertion anymore, all
backend code explicitly assigns oids with GetNewOidWithIndex(). For
the rare case that insertions into the catalog via SQL are called for,
the new pg_nextoid() function can be used (which only works on catalog
tables).
The fact that oid columns on system tables are now normal columns
means that they will be included in the set of columns expanded
by * (i.e. SELECT * FROM pg_class will now include the table's oid,
previously it did not). It'd not technically be hard to hide oid
column by default, but that'd mean confusing behavior would either
have to be carried forward forever, or it'd cause breakage down the
line.
While it's not unlikely that further adjustments are needed, the
scope/invasiveness of the patch makes it worthwhile to get this merged
now. It's painful to maintain externally, too complicated to commit
after the code freeze, and a dependency of a number of other
patches.
Catversion bump, for obvious reasons.
Author: Andres Freund, with contributions by John Naylor
Discussion: https://postgr.es/m/20180930034810.ywp2c7awz7opzcfr@alap3.anarazel.de
2018-11-21 00:36:57 +01:00
|
|
|
/* system attributes are not supported in caches */
|
Improve sys/catcache performance.
The following are the individual improvements:
1) Avoidance of FunctionCallInfo based function calls, replaced by
more efficient functions with a native C argument interface.
2) Don't extract columns from a cache entry's tuple whenever matching
entries - instead store them as a Datum array. This also makes it
possible to get rid of having to build dummy tuples for negative & list
entries, and of a hack for dealing with cstring vs. text weirdness.
3) Reorder members of catcache.h struct, so important entries are more
likely to be on one cacheline.
4) Allowing the compiler to specialize critical SearchCatCache for a
specific number of attributes makes it possible to unroll loops and
avoid other nkeys dependent initialization.
5) Only initializing the ScanKey when necessary, i.e. catcache misses,
greatly reduces unnecessary cpu cache misses.
6) Split off the cache-miss case from the hash lookup, reducing stack
allocations etc in the common case.
7) CatCTup and their corresponding heaptuple are allocated in one
piece.
This results in making cache lookups themselves roughly three times as
fast - full-system benchmarks obviously improve less than that.
I've also evaluated further techniques:
- replace open coded hash with simplehash - the list walk right now
shows up in profiles. Unfortunately it's not easy to do so safely as
an entry's memory location can change at various times, which
doesn't work well with the refcounting and cache invalidation.
- Cacheline-aligning CatCTup entries - helps some with performance,
but the win isn't big and the code for it is ugly, because the
tuples have to be freed as well.
- add more proper functions, rather than macros for
SearchSysCacheCopyN etc., but right now they don't show up in
profiles.
The reason the macro wrappers for syscache.c/h have to be changed,
rather than just catcache, is that doing otherwise would require
exposing the SysCache array to the outside. That might be a good idea
anyway, but it's for another day.
Author: Andres Freund
Reviewed-By: Robert Haas
Discussion: https://postgr.es/m/20170914061207.zxotvyopetm7lrrp@alap3.anarazel.de
2017-10-13 22:16:50 +02:00
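Item 4 of the commit message above can be illustrated with a minimal sketch. It assumes nothing about the real catcache code beyond what the message says: a generic inline routine takes the key count as a parameter, and thin wrappers pass that count as a literal, which leaves the compiler free to unroll the loop for each count. All names here (`sketch_keys_equal_internal`, `sketch_keys_equal1`, `sketch_keys_equal2`, `SKETCH_MAX_KEYS`) are hypothetical, and keys are modeled as plain `uintptr_t` values rather than Datums.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAX_KEYS 4		/* hypothetical limit used only in this sketch */

/*
 * Generic comparison loop.  It is static inline and every wrapper below
 * passes a literal nkeys, so a compiler may specialize and unroll the loop
 * separately for each wrapper.
 */
static inline bool
sketch_keys_equal_internal(int nkeys, const uintptr_t *a, const uintptr_t *b)
{
	assert(nkeys >= 1 && nkeys <= SKETCH_MAX_KEYS);

	for (int i = 0; i < nkeys; i++)
	{
		if (a[i] != b[i])
			return false;
	}
	return true;
}

/* One-key entry point: nkeys is the compile-time constant 1. */
bool
sketch_keys_equal1(const uintptr_t *a, const uintptr_t *b)
{
	return sketch_keys_equal_internal(1, a, b);
}

/* Two-key entry point: nkeys is the compile-time constant 2. */
bool
sketch_keys_equal2(const uintptr_t *a, const uintptr_t *b)
{
	return sketch_keys_equal_internal(2, a, b);
}
```

Whether the loop is actually unrolled depends on the compiler and its optimization settings; the sketch only shows the code shape that makes such specialization possible.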
		Assert(attnum > 0);

		att = TupleDescAttr(tupdesc, attnum - 1);

		if (!att->attbyval)
			pfree(DatumGetPointer(keys[i]));
	}
}

/*
 * Helper routine that copies the keys in the srckeys array into the dstkeys
 * one, guaranteeing that the datums are fully allocated in the current memory
 * context.
 */
static void
CatCacheCopyKeys(TupleDesc tupdesc, int nkeys, int *attnos,
				 Datum *srckeys, Datum *dstkeys)
{
	int			i;
2000-11-16 23:30:52 +01:00

Improve sys/catcache performance.
2017-10-13 22:16:50 +02:00
	/*
	 * XXX: memory and lookup performance could possibly be improved by
	 * storing all keys in one allocation.
	 */
2000-11-16 23:30:52 +01:00

2002-04-06 08:59:25 +02:00
	for (i = 0; i < nkeys; i++)
	{
Improve sys/catcache performance.
2017-10-13 22:16:50 +02:00
		int			attnum = attnos[i];
Remove WITH OIDS support, change oid catalog column visibility.
Previously tables declared WITH OIDS, including a significant fraction
of the catalog tables, stored the oid column not as a normal column,
but as part of the tuple header.
This special column was not shown by default, which was somewhat odd,
as it's often (consider e.g. pg_class.oid) one of the more important
parts of a row. Neither pg_dump nor COPY included the contents of the
oid column by default.
The fact that the oid column was not an ordinary column necessitated a
significant amount of special case code to support oid columns. That
already was painful for the existing, but upcoming work aiming to make
table storage pluggable, would have required expanding and duplicating
that "specialness" significantly.
WITH OIDS has been deprecated since 2005 (commit ff02d0a05280e0).
Remove it.
Removing includes:
- CREATE TABLE and ALTER TABLE syntax for declaring the table to be
WITH OIDS has been removed (WITH (oids[ = true]) will error out)
- pg_dump does not support dumping tables declared WITH OIDS and will
issue a warning when dumping one (and ignore the oid column).
- restoring a pg_dump archive with pg_restore will warn when
restoring a table with oid contents (and ignore the oid column)
- COPY will refuse to load binary dump that includes oids.
- pg_upgrade will error out when encountering tables declared WITH
OIDS, they have to be altered to remove the oid column first.
- Functionality to access the oid of the last inserted row (like
plpgsql's RESULT_OID, spi's SPI_lastoid, ...) has been removed.
The syntax for declaring a table WITHOUT OIDS (or WITH (oids = false)
for CREATE TABLE) is still supported. While that requires a bit of
support code, it seems unnecessary to break applications / dumps that
do not use oids, and are explicit about not using them.
The biggest user of WITH OID columns was postgres' catalog. This
commit changes all 'magic' oid columns to be columns that are normally
declared and stored. To reduce unnecessary query breakage all the
newly added columns are still named 'oid', even if a table's column
naming scheme would indicate 'reloid' or such. This obviously
requires adapting a lot of code, mostly replacing oid access via
HeapTupleGetOid() with access to the underlying Form_pg_*->oid column.
The bootstrap process now assigns oids for all oid columns in
genbki.pl that do not have an explicit value (starting at the largest
oid previously used), only oids assigned later by oids will be above
FirstBootstrapObjectId. As the oid column now is a normal column the
special bootstrap syntax for oids has been removed.
Oids are not automatically assigned during insertion anymore, all
backend code explicitly assigns oids with GetNewOidWithIndex(). For
the rare case that insertions into the catalog via SQL are called for,
the new pg_nextoid() function can be used (which only works on catalog
tables).
The fact that oid columns on system tables are now normal columns
means that they will be included in the set of columns expanded
by * (i.e. SELECT * FROM pg_class will now include the table's oid,
previously it did not). It'd not technically be hard to hide the oid
column by default, but that'd mean confusing behavior would either
have to be carried forward forever, or it'd cause breakage down the
line.
While it's not unlikely that further adjustments are needed, the
scope/invasiveness of the patch makes it worthwhile to get this merged
now. It's painful to maintain externally, too complicated to commit
after the code freeze, and a dependency of a number of other
patches.
Catversion bump, for obvious reasons.
Author: Andres Freund, with contributions by John Naylor
Discussion: https://postgr.es/m/20180930034810.ywp2c7awz7opzcfr@alap3.anarazel.de
2018-11-21 00:36:57 +01:00
		Form_pg_attribute att = TupleDescAttr(tupdesc, attnum - 1);
		Datum		src = srckeys[i];
		NameData	srcname;
2002-04-06 08:59:25 +02:00

Remove WITH OIDS support, change oid catalog column visibility.
2018-11-21 00:36:57 +01:00
		/*
Make collation-aware system catalog columns use "C" collation.
Up to now we allowed text columns in system catalogs to use collation
"default", but that isn't really safe because it might mean something
different in template0 than it means in a database cloned from template0.
In particular, this could mean that cloned pg_statistic entries for such
columns weren't entirely valid, possibly leading to bogus planner
estimates, though (probably) not any outright failures.
In the wake of commit 5e0928005, a better solution is available: if we
label such columns with "C" collation, then their pg_statistic entries
will also use that collation and hence will be valid independently of
the database collation.
This also provides a cleaner solution for indexes on such columns than
the hack added by commit 0b28ea79c: the indexes will naturally inherit
"C" collation and don't have to be forced to use text_pattern_ops.
Also, with the planned improvement of type "name" to be collation-aware,
this policy will apply cleanly to both text and name columns.
Because of the pg_statistic angle, we should also apply this policy
to the tables in information_schema. This patch does that by adjusting
information_schema's textual domain types to specify "C" collation.
That has the user-visible effect that order-sensitive comparisons to
textual information_schema view columns will now use "C" collation
by default. The SQL standard says that the collation of those view
columns is implementation-defined, so I think this is legal per spec.
At some point this might allow for translation of such comparisons
into indexable conditions on the underlying "name" columns, although
additional work will be needed before that can happen.
Discussion: https://postgr.es/m/19346.1544895309@sss.pgh.pa.us
2018-12-18 18:48:15 +01:00
		 * Must be careful in case the caller passed a C string where a NAME
		 * is wanted: convert the given argument to a correctly padded NAME.
		 * Otherwise the memcpy() done by datumCopy() could fall off the end
		 * of memory.
Remove WITH OIDS support, change oid catalog column visibility.
2018-11-21 00:36:57 +01:00
		 */
		if (att->atttypid == NAMEOID)
Improve sys/catcache performance.
2017-10-13 22:16:50 +02:00
		{
Remove WITH OIDS support, change oid catalog column visibility.
2018-11-21 00:36:57 +01:00
			namestrcpy(&srcname, DatumGetCString(src));
			src = NameGetDatum(&srcname);
Improve sys/catcache performance.
2017-10-13 22:16:50 +02:00
		}

Remove WITH OIDS support, change oid catalog column visibility.
Previously tables declared WITH OIDS, including a significant fraction
of the catalog tables, stored the oid column not as a normal column,
but as part of the tuple header.
This special column was not shown by default, which was somewhat odd,
as it's often (consider e.g. pg_class.oid) one of the more important
parts of a row. Neither pg_dump nor COPY included the contents of the
oid column by default.
The fact that the oid column was not an ordinary column necessitated a
significant amount of special case code to support oid columns. That
already was painful for the existing, but upcoming work aiming to make
table storage pluggable, would have required expanding and duplicating
that "specialness" significantly.
WITH OIDS has been deprecated since 2005 (commit ff02d0a05280e0).
Remove it.
Removing includes:
- CREATE TABLE and ALTER TABLE syntax for declaring the table to be
WITH OIDS has been removed (WITH (oids[ = true]) will error out)
- pg_dump does not support dumping tables declared WITH OIDS and will
issue a warning when dumping one (and ignore the oid column).
- restoring an pg_dump archive with pg_restore will warn when
restoring a table with oid contents (and ignore the oid column)
- COPY will refuse to load binary dump that includes oids.
- pg_upgrade will error out when encountering tables declared WITH
OIDS, they have to be altered to remove the oid column first.
- Functionality to access the oid of the last inserted row (like
plpgsql's RESULT_OID, spi's SPI_lastoid, ...) has been removed.
The syntax for declaring a table WITHOUT OIDS (or WITH (oids = false)
for CREATE TABLE) is still supported. While that requires a bit of
support code, it seems unnecessary to break applications / dumps that
do not use oids, and are explicit about not using them.
The biggest user of WITH OID columns was postgres' catalog. This
commit changes all 'magic' oid columns to be columns that are normally
declared and stored. To reduce unnecessary query breakage all the
newly added columns are still named 'oid', even if a table's column
naming scheme would indicate 'reloid' or such. This obviously
requires adapting a lot code, mostly replacing oid access via
HeapTupleGetOid() with access to the underlying Form_pg_*->oid column.
The bootstrap process now assigns oids for all oid columns in
genbki.pl that do not have an explicit value (starting at the largest
oid previously used), only oids assigned later by oids will be above
FirstBootstrapObjectId. As the oid column now is a normal column the
special bootstrap syntax for oids has been removed.
Oids are not automatically assigned during insertion anymore; all
backend code explicitly assigns oids with GetNewOidWithIndex(). For
the rare case that insertions into the catalog via SQL are called for,
the new pg_nextoid() function can be used (which only works on catalog
tables).
The fact that oid columns on system tables are now normal columns
means that they will be included in the set of columns expanded
by * (i.e. SELECT * FROM pg_class will now include the table's oid,
previously it did not). It'd not technically be hard to hide the oid
column by default, but that'd mean confusing behavior would either
have to be carried forward forever, or it'd cause breakage down the
line.
While it's not unlikely that further adjustments are needed, the
scope/invasiveness of the patch makes it worthwhile to get this merged
now. It's painful to maintain externally, too complicated to commit
after the code freeze, and a dependency of a number of other
patches.
Catversion bump, for obvious reasons.
Author: Andres Freund, with contributions by John Naylor
Discussion: https://postgr.es/m/20180930034810.ywp2c7awz7opzcfr@alap3.anarazel.de
2018-11-21 00:36:57 +01:00
|
|
|
dstkeys[i] = datumCopy(src,
|
|
|
|
att->attbyval,
|
|
|
|
att->attlen);
|
2002-04-06 08:59:25 +02:00
|
|
|
}
|
1996-07-09 08:22:35 +02:00
|
|
|
}
|
|
|
|
|
2001-02-22 19:39:20 +01:00
|
|
|
/*
|
2001-01-05 23:54:37 +01:00
|
|
|
* PrepareToInvalidateCacheTuple()
|
1996-07-09 08:22:35 +02:00
|
|
|
*
|
2001-01-05 23:54:37 +01:00
|
|
|
* This is part of a rather subtle chain of events, so pay attention:
|
|
|
|
*
|
2002-03-03 18:47:56 +01:00
|
|
|
* When a tuple is inserted or deleted, it cannot be flushed from the
|
2001-06-18 05:35:07 +02:00
|
|
|
* catcaches immediately, for reasons explained at the top of cache/inval.c.
|
2001-01-05 23:54:37 +01:00
|
|
|
* Instead we have to add entry(s) for the tuple to a list of pending tuple
|
|
|
|
* invalidations that will be done at the end of the command or transaction.
|
|
|
|
*
|
|
|
|
* The lists of tuples that need to be flushed are kept by inval.c. This
|
|
|
|
* routine is a helper routine for inval.c. Given a tuple belonging to
|
|
|
|
* the specified relation, find all catcaches it could be in, compute the
|
2011-08-17 01:27:46 +02:00
|
|
|
* correct hash value for each such catcache, and call the specified
|
|
|
|
* function to record the cache id and hash value in inval.c's lists.
|
2017-05-13 00:17:29 +02:00
|
|
|
* SysCacheInvalidate will be called later, if appropriate,
|
2001-01-05 23:54:37 +01:00
|
|
|
* using the recorded information.
|
|
|
|
*
|
2011-08-17 01:27:46 +02:00
|
|
|
* For an insert or delete, tuple is the target tuple and newtuple is NULL.
|
|
|
|
* For an update, we are called just once, with tuple being the old tuple
|
|
|
|
* version and newtuple the new version. We should make two list entries
|
|
|
|
* if the tuple's hash value changed, but only one if it didn't.
|
|
|
|
*
|
2001-01-05 23:54:37 +01:00
|
|
|
* Note that it is irrelevant whether the given tuple is actually loaded
|
|
|
|
* into the catcache at the moment. Even if it's not there now, it might
|
2002-03-03 18:47:56 +01:00
|
|
|
* be by the end of the command, or there might be a matching negative entry
|
|
|
|
* to flush --- or other backends' caches might have such entries --- so
|
|
|
|
* we have to make list entries to flush it later.
|
2001-01-05 23:54:37 +01:00
|
|
|
*
|
|
|
|
* Also note that it's not an error if there are no catcaches for the
|
|
|
|
* specified relation. inval.c doesn't know exactly which rels have
|
|
|
|
* catcaches --- it will call this routine for any tuple that's in a
|
|
|
|
* system relation.
|
1996-07-09 08:22:35 +02:00
|
|
|
*/
|
|
|
|
void
|
2001-01-05 23:54:37 +01:00
|
|
|
PrepareToInvalidateCacheTuple(Relation relation,
|
|
|
|
HeapTuple tuple,
|
2011-08-17 01:27:46 +02:00
|
|
|
HeapTuple newtuple,
|
|
|
|
void (*function) (int, uint32, Oid))
|
1996-07-09 08:22:35 +02:00
|
|
|
{
|
2012-10-16 22:36:30 +02:00
|
|
|
slist_iter iter;
|
2002-03-03 18:47:56 +01:00
|
|
|
Oid reloid;
|
1997-09-07 07:04:48 +02:00
|
|
|
|
2019-02-18 12:32:34 +01:00
|
|
|
CACHE_elog(DEBUG2, "PrepareToInvalidateCacheTuple: called");
|
2001-06-18 05:35:07 +02:00
|
|
|
|
2001-02-22 19:39:20 +01:00
|
|
|
/*
|
1996-07-09 08:22:35 +02:00
|
|
|
* sanity checks
|
|
|
|
*/
|
|
|
|
Assert(RelationIsValid(relation));
|
|
|
|
Assert(HeapTupleIsValid(tuple));
|
1998-10-12 02:53:42 +02:00
|
|
|
Assert(PointerIsValid(function));
|
2001-06-18 05:35:07 +02:00
|
|
|
Assert(CacheHdr != NULL);
|
1997-09-07 07:04:48 +02:00
|
|
|
|
2002-03-03 18:47:56 +01:00
|
|
|
reloid = RelationGetRelid(relation);
|
|
|
|
|
1996-07-09 08:22:35 +02:00
|
|
|
/* ----------------
|
|
|
|
* for each cache
|
|
|
|
* if the cache contains tuples from the specified relation
|
2011-08-17 01:27:46 +02:00
|
|
|
* compute the tuple's hash value(s) in this cache,
|
2001-01-05 23:54:37 +01:00
|
|
|
* and call the passed function to register the information.
|
1996-07-09 08:22:35 +02:00
|
|
|
* ----------------
|
|
|
|
*/
|
1997-09-07 07:04:48 +02:00
|
|
|
|
2012-10-19 01:30:43 +02:00
|
|
|
slist_foreach(iter, &CacheHdr->ch_caches)
|
1996-07-09 08:22:35 +02:00
|
|
|
{
|
2012-10-19 01:30:43 +02:00
|
|
|
CatCache *ccp = slist_container(CatCache, cc_next, iter.cur);
|
2011-08-17 01:27:46 +02:00
|
|
|
uint32 hashvalue;
|
|
|
|
Oid dbid;
|
|
|
|
|
2008-03-05 18:01:26 +01:00
|
|
|
if (ccp->cc_reloid != reloid)
|
|
|
|
continue;
|
|
|
|
|
2000-11-10 01:33:12 +01:00
|
|
|
/* Just in case cache hasn't finished initialization yet... */
|
|
|
|
if (ccp->cc_tupdesc == NULL)
|
|
|
|
CatalogCacheInitializeCache(ccp);
|
|
|
|
|
Improve sys/catcache performance.
The following are the individual improvements:
1) Avoidance of FunctionCallInfo based function calls, replaced by
more efficient functions with a native C argument interface.
2) Don't extract columns from a cache entry's tuple whenever matching
entries - instead store them as a Datum array. This also makes it
possible to get rid of having to build dummy tuples for negative & list
entries, and of a hack for dealing with cstring vs. text weirdness.
3) Reorder members of catcache.h struct, so important entries are more
likely to be on one cacheline.
4) Allowing the compiler to specialize critical SearchCatCache for a
specific number of attributes makes it possible to unroll loops and
avoid other nkeys dependent initialization.
5) Only initializing the ScanKey when necessary, i.e. catcache misses,
greatly reduces unnecessary CPU cache misses.
6) Split off the cache-miss case from the hash lookup, reducing stack
allocations etc in the common case.
7) CatCTup and their corresponding heaptuple are allocated in one
piece.
This results in making cache lookups themselves roughly three times as
fast - full-system benchmarks obviously improve less than that.
I've also evaluated further techniques:
- replace open coded hash with simplehash - the list walk right now
shows up in profiles. Unfortunately it's not easy to do so safely as
an entry's memory location can change at various times, which
doesn't work well with the refcounting and cache invalidation.
- Cacheline-aligning CatCTup entries - helps some with performance,
but the win isn't big and the code for it is ugly, because the
tuples have to be freed as well.
- add more proper functions, rather than macros for
SearchSysCacheCopyN etc., but right now they don't show up in
profiles.
The reason the macro wrappers for syscache.c/h have to be changed,
rather than just catcache, is that doing otherwise would require
exposing the SysCache array to the outside. That might be a good idea
anyway, but it's for another day.
Author: Andres Freund
Reviewed-By: Robert Haas
Discussion: https://postgr.es/m/20170914061207.zxotvyopetm7lrrp@alap3.anarazel.de
2017-10-13 22:16:50 +02:00
|
|
|
hashvalue = CatalogCacheComputeTupleHashValue(ccp, ccp->cc_nkeys, tuple);
|
2011-08-17 01:27:46 +02:00
|
|
|
dbid = ccp->cc_relisshared ? (Oid) 0 : MyDatabaseId;
|
|
|
|
|
|
|
|
(*function) (ccp->id, hashvalue, dbid);
|
|
|
|
|
|
|
|
if (newtuple)
|
|
|
|
{
|
|
|
|
uint32 newhashvalue;
|
|
|
|
|
2017-10-13 22:16:50 +02:00
|
|
|
newhashvalue = CatalogCacheComputeTupleHashValue(ccp, ccp->cc_nkeys, newtuple);
|
2011-08-17 01:27:46 +02:00
|
|
|
|
|
|
|
if (newhashvalue != hashvalue)
|
|
|
|
(*function) (ccp->id, newhashvalue, dbid);
|
|
|
|
}
|
1996-07-09 08:22:35 +02:00
|
|
|
}
|
1997-09-07 07:04:48 +02:00
|
|
|
}
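The registration rule described in the comment above (one list entry for an insert or delete, and a second entry for an update only when the hash value changed) can be illustrated with a small standalone sketch. Everything here is a hypothetical stand-in and not PostgreSQL code: `DemoCache`, `DemoLog`, `demo_register`, `demo_hash`, and `demo_prepare_invalidate` are invented names, the key is a single integer, and the toy hash is not the one catcache.c uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a catcache: an id plus the relation it caches. */
typedef struct DemoCache
{
	int			id;
	uint32_t	reloid;
} DemoCache;

/* Hypothetical log of the (cache id, hash) pairs handed to the callback. */
typedef struct DemoLog
{
	int			count;
	int			ids[8];
	uint32_t	hashes[8];
} DemoLog;

static DemoLog demo_log;

/* Callback that records one pending invalidation entry. */
static void
demo_register(int cache_id, uint32_t hash)
{
	if (demo_log.count < 8)
	{
		demo_log.ids[demo_log.count] = cache_id;
		demo_log.hashes[demo_log.count] = hash;
		demo_log.count++;
	}
}

/* Toy hash over a single integer key; an assumption, not the real hash. */
static uint32_t
demo_hash(uint32_t key)
{
	return key * 2654435761u;
}

/*
 * For every cache on the given relation, register the old key's hash.
 * If new_key is non-NULL (an update), also register the new key's hash,
 * but only when it differs from the old one.
 */
static void
demo_prepare_invalidate(const DemoCache *caches, size_t ncaches,
						uint32_t reloid,
						uint32_t old_key, const uint32_t *new_key,
						void (*function) (int, uint32_t))
{
	for (size_t i = 0; i < ncaches; i++)
	{
		uint32_t	hashvalue;

		if (caches[i].reloid != reloid)
			continue;			/* cache is on some other relation */

		hashvalue = demo_hash(old_key);
		function(caches[i].id, hashvalue);

		if (new_key != NULL)
		{
			uint32_t	newhashvalue = demo_hash(*new_key);

			if (newhashvalue != hashvalue)
				function(caches[i].id, newhashvalue);
		}
	}
}
```

With two of three demo caches on the target relation, a delete registers two entries, an update that keeps the key also registers two, and an update that changes the key registers four.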
|
2005-03-25 19:30:28 +01:00
|
|
|
|
Make ResourceOwners more easily extensible.
Instead of having a separate array/hash for each resource kind, use a
single array and hash to hold all kinds of resources. This makes it
possible to introduce new resource "kinds" without having to modify
the ResourceOwnerData struct. In particular, this makes it possible
for extensions to register custom resource kinds.
The old approach was to have a small array of resources of each kind,
and if it fills up, switch to a hash table. The new approach also uses
an array and a hash, but now the array and the hash are used at the
same time. The array is used to hold the recently added resources, and
when it fills up, they are moved to the hash. This keeps the access to
recent entries fast, even when there are a lot of long-held resources.
All the resource-specific ResourceOwnerEnlarge*(),
ResourceOwnerRemember*(), and ResourceOwnerForget*() functions have
been replaced with three generic functions that take resource kind as
argument. For convenience, we still define resource-specific wrapper
macros around the generic functions with the old names, but they are
now defined in the source files that use those resource kinds.
The release callback no longer needs to call ResourceOwnerForget on
the resource being released. ResourceOwnerRelease unregisters the
resource from the owner before calling the callback. That needed some
changes in bufmgr.c and some other files, where releasing the
resources previously always called ResourceOwnerForget.
Each resource kind specifies a release priority, and
ResourceOwnerReleaseAll releases the resources in priority order. To
make that possible, we have to restrict what you can do between
phases. After calling ResourceOwnerRelease(), you are no longer
allowed to remember any more resources in it or to forget any
previously remembered resources by calling ResourceOwnerForget. There
was one case where that was done previously. At subtransaction commit,
AtEOSubXact_Inval() would handle the invalidation messages and call
RelationFlushRelation(), which temporarily increased the reference
count on the relation being flushed. We now switch to the parent
subtransaction's resource owner before calling AtEOSubXact_Inval(), so
that there is a valid ResourceOwner to temporarily hold that relcache
reference.
Other end-of-xact routines make similar calls to AtEOXact_Inval()
between release phases, but I didn't see any regression test failures
from those, so I'm not sure if they could reach a codepath that needs
remembering extra resources.
There were two exceptions to how the resource leak WARNINGs on commit
were printed previously: llvmjit silently released the context without
printing the warning, and a leaked buffer io triggered a PANIC. Now
everything prints a WARNING, including those cases.
Add tests in src/test/modules/test_resowner.
Reviewed-by: Aleksander Alekseev, Michael Paquier, Julien Rouhaud
Reviewed-by: Kyotaro Horiguchi, Hayato Kuroda, Álvaro Herrera, Zhihong Yu
Reviewed-by: Peter Eisentraut, Andres Freund
Discussion: https://www.postgresql.org/message-id/cbfabeb0-cd3c-e951-a572-19b365ed314d%40iki.fi
2023-11-08 12:30:50 +01:00
|
|
|
/* ResourceOwner callbacks */
|
2005-03-25 19:30:28 +01:00
|
|
|
|
2023-11-08 12:30:50 +01:00
|
|
|
static void
|
|
|
|
ResOwnerReleaseCatCache(Datum res)
|
2005-03-25 19:30:28 +01:00
|
|
|
{
|
2023-11-08 12:30:50 +01:00
|
|
|
ReleaseCatCacheWithOwner((HeapTuple) DatumGetPointer(res), NULL);
|
|
|
|
}
|
|
|
|
|
|
|
|
static char *
|
|
|
|
ResOwnerPrintCatCache(Datum res)
|
|
|
|
{
|
|
|
|
HeapTuple tuple = (HeapTuple) DatumGetPointer(res);
|
2005-03-25 19:30:28 +01:00
|
|
|
CatCTup *ct = (CatCTup *) (((char *) tuple) -
|
|
|
|
offsetof(CatCTup, tuple));
|
|
|
|
|
|
|
|
/* Safety check to ensure we were handed a cache entry */
|
|
|
|
Assert(ct->ct_magic == CT_MAGIC);
|
|
|
|
|
2023-11-08 12:30:50 +01:00
|
|
|
return psprintf("cache %s (%d), tuple %u/%u has count %d",
|
|
|
|
ct->my_cache->cc_relname, ct->my_cache->id,
|
|
|
|
ItemPointerGetBlockNumber(&(tuple->t_self)),
|
|
|
|
ItemPointerGetOffsetNumber(&(tuple->t_self)),
|
|
|
|
ct->refcount);
|
2005-03-25 19:30:28 +01:00
|
|
|
}
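ResOwnerPrintCatCache recovers the enclosing cache entry from a pointer to its embedded tuple by subtracting the member offset, then checks a magic number. The same pattern is shown below in isolation as a minimal sketch. `DemoTuple`, `DemoEntry`, `DEMO_MAGIC`, and `demo_entry_from_tuple` are hypothetical names chosen for illustration, and the struct layout is an assumption, not the real `CatCTup`.

```c
#include <assert.h>
#include <stddef.h>

/* Arbitrary marker value for this sketch; not taken from catcache.h. */
#define DEMO_MAGIC 0x1234abcd

/* Hypothetical embedded member, standing in for the heap tuple header. */
typedef struct DemoTuple
{
	int			payload;
} DemoTuple;

/* Hypothetical container, standing in for a cache entry. */
typedef struct DemoEntry
{
	int			magic;			/* must equal DEMO_MAGIC */
	int			refcount;
	DemoTuple	tuple;			/* callers only ever see a pointer to this */
} DemoEntry;

/*
 * Given a pointer to the embedded tuple, step back by the member's offset
 * to reach the start of the enclosing entry, then sanity-check the magic.
 */
static DemoEntry *
demo_entry_from_tuple(DemoTuple *tuple)
{
	DemoEntry  *entry = (DemoEntry *) ((char *) tuple -
									   offsetof(DemoEntry, tuple));

	assert(entry->magic == DEMO_MAGIC);
	return entry;
}
```

The magic check only catches a wrong pointer in assert-enabled builds, and only if the bytes at that address happen not to match the marker.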
|
|
|
|
|
2023-11-08 12:30:50 +01:00
|
|
|
static void
|
|
|
|
ResOwnerReleaseCatCacheList(Datum res)
|
2005-03-25 19:30:28 +01:00
|
|
|
{
|
2023-11-08 12:30:50 +01:00
|
|
|
ReleaseCatCacheListWithOwner((CatCList *) DatumGetPointer(res), NULL);
|
|
|
|
}
|
|
|
|
|
|
|
|
static char *
|
|
|
|
ResOwnerPrintCatCacheList(Datum res)
|
|
|
|
{
|
|
|
|
CatCList *list = (CatCList *) DatumGetPointer(res);
|
|
|
|
|
|
|
|
return psprintf("cache %s (%d), list %p has count %d",
|
|
|
|
list->my_cache->cc_relname, list->my_cache->id,
|
|
|
|
list, list->refcount);
|
2005-03-25 19:30:28 +01:00
|
|
|
}
|