Replace CLOBBER_CACHE_ALWAYS with run-time GUC

Forced cache invalidation (CLOBBER_CACHE_ALWAYS) has been impractical
to use for testing in PostgreSQL because it's so slow and because it's
toggled on/off only at build time.  It is helpful when hunting bugs in
any code that uses the sycache/relcache because causes cache
invalidations to be injected whenever it would be possible for an
invalidation to occur, whether or not one was really pending.

Address this by providing run-time control over cache clobber
behaviour using the new debug_invalidate_system_caches_always GUC.
Support is not compiled in at all unless assertions are enabled or
CLOBBER_CACHE_ENABLED is explicitly defined at compile time.  It
defaults to 0 if compiled in, so it has negligible effect on assert
build performance by default.

When support is compiled in, test code can now set
debug_invalidate_system_caches_always=1 locally to a backend to test
specific queries, functions, extensions, etc.  Or tests can toggle it
globally for a specific test case while retaining normal performance
during test setup and teardown.

For backwards compatibility with existing test harnesses and scripts,
debug_invalidate_system_caches_always defaults to 1 if
CLOBBER_CACHE_ALWAYS is defined, and to 3 if CLOBBER_CACHE_RECURSIVE
is defined.

CLOBBER_CACHE_ENABLED is now visible in pg_config_manual.h, as is the
related RECOVER_RELATION_BUILD_MEMORY setting for the relcache.

Author: Craig Ringer <craig.ringer@2ndquadrant.com>
Discussion: https://www.postgresql.org/message-id/flat/CAMsr+YF=+ctXBZj3ywmvKNUjWpxmuTuUKuv-rgbHGX5i5pLstQ@mail.gmail.com
This commit is contained in:
Peter Eisentraut 2021-01-06 10:15:19 +01:00
parent 8900b5a9d5
commit 4656e3d668
9 changed files with 169 additions and 54 deletions

View File

@ -9989,6 +9989,43 @@ dynamic_library_path = 'C:\tools\postgresql;H:\my_project\lib;$libdir'
</listitem>
</varlistentry>
<varlistentry id="guc-debug-invalidate-system-caches-always" xreflabel="debug_invalidate_system_caches_always">
<term><varname>debug_invalidate_system_caches_always</varname> (<type>integer</type>)
<indexterm>
<primary><varname>debug_invalidate_system_caches_always</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
When set to 1, each cache lookup for a system catalog entry is
invalidated at the first possible opportunity, irrespective of whether
anything that would render it invalid really occurred. Caching of
system catalogs is effectively disabled as a result, so the server
will run extremely slowly. Higher values run the cache invalidation
recursively, which is even slower and useful only useful for testing
in very specific scenarios.
</para>
<para>
This option can be very helpful when trying to trigger
hard-to-reproduce bugs involving concurrency and catalog changes but
is otherwise rarely needed. See the source code files
<filename>inval.c</filename> and
<filename>pg_config_manual.h</filename> for details.
</para>
<para>
This setting is supported but off by default (0) when
<symbol>CLOBBER_CACHE_ENABLED</symbol> is defined at compile time
(which happens automatically when using the
<literal>configure</literal> option
<option>--enable-cassert</option>). In production builds, its value
will always be <literal>0</literal> and attempts to set it to another
value will raise an error.
</para>
</listitem>
</varlistentry>
<varlistentry id="guc-ignore-system-indexes" xreflabel="ignore_system_indexes">
<term><varname>ignore_system_indexes</varname> (<type>boolean</type>)
<indexterm>

View File

@ -372,7 +372,8 @@ make check EXTRA_REGRESS_OPTS="--temp-config=test_postgresql.conf"
<para>
This can be useful to enable additional logging, adjust resource limits,
or enable extra run-time checks.
or enable extra run-time checks such as <xref
linkend="guc-debug-invalidate-system-caches-always"/>.
</para>
</sect2>

View File

@ -629,7 +629,7 @@ pg_isolation_test_session_is_blocked(PG_FUNCTION_ARGS)
* Check if any of these are in the list of interesting PIDs, that being
* the sessions that the isolation tester is running. We don't use
* "arrayoverlaps" here, because it would lead to cache lookups and one of
* our goals is to run quickly under CLOBBER_CACHE_ALWAYS. We expect
* our goals is to run quickly with debug_invalidate_system_caches_always > 0. We expect
* blocking_pids to be usually empty and otherwise a very small number in
* isolation tester cases, so make that the outer loop of a naive search
* for a match.

View File

@ -109,6 +109,7 @@
#include "storage/sinval.h"
#include "storage/smgr.h"
#include "utils/catcache.h"
#include "utils/guc.h"
#include "utils/inval.h"
#include "utils/memdebug.h"
#include "utils/memutils.h"
@ -179,6 +180,8 @@ static SharedInvalidationMessage *SharedInvalidMessagesArray;
static int numSharedInvalidMessagesArray;
static int maxSharedInvalidMessagesArray;
/* GUC storage */
int debug_invalidate_system_caches_always = 0;
/*
* Dynamically-registered callback functions. Current implementation
@ -689,35 +692,32 @@ AcceptInvalidationMessages(void)
/*
* Test code to force cache flushes anytime a flush could happen.
*
* If used with CLOBBER_FREED_MEMORY, CLOBBER_CACHE_ALWAYS provides a
* fairly thorough test that the system contains no cache-flush hazards.
* However, it also makes the system unbelievably slow --- the regression
* tests take about 100 times longer than normal.
* This helps detect intermittent faults caused by code that reads a
* cache entry and then performs an action that could invalidate the entry,
* but rarely actually does so. This can spot issues that would otherwise
* only arise with badly timed concurrent DDL, for example.
*
* If you're a glutton for punishment, try CLOBBER_CACHE_RECURSIVELY. This
* slows things by at least a factor of 10000, so I wouldn't suggest
* trying to run the entire regression tests that way. It's useful to try
* a few simple tests, to make sure that cache reload isn't subject to
* internal cache-flush hazards, but after you've done a few thousand
* recursive reloads it's unlikely you'll learn more.
* The default debug_invalidate_system_caches_always = 0 does no forced cache flushes.
*
* If used with CLOBBER_FREED_MEMORY, debug_invalidate_system_caches_always = 1
* (CLOBBER_CACHE_ALWAYS) provides a fairly thorough test that the system
* contains no cache-flush hazards. However, it also makes the system
* unbelievably slow --- the regression tests take about 100 times longer
* than normal.
*
* If you're a glutton for punishment, try debug_invalidate_system_caches_always = 3
* (CLOBBER_CACHE_RECURSIVELY). This slows things by at least a factor
* of 10000, so I wouldn't suggest trying to run the entire regression
* tests that way. It's useful to try a few simple tests, to make sure
* that cache reload isn't subject to internal cache-flush hazards, but
* after you've done a few thousand recursive reloads it's unlikely
* you'll learn more.
*/
#if defined(CLOBBER_CACHE_ALWAYS)
{
static bool in_recursion = false;
if (!in_recursion)
{
in_recursion = true;
InvalidateSystemCaches();
in_recursion = false;
}
}
#elif defined(CLOBBER_CACHE_RECURSIVELY)
#ifdef CLOBBER_CACHE_ENABLED
{
static int recursion_depth = 0;
/* Maximum depth is arbitrary depending on your threshold of pain */
if (recursion_depth < 3)
if (recursion_depth < debug_invalidate_system_caches_always)
{
recursion_depth++;
InvalidateSystemCaches();

View File

@ -897,7 +897,7 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
* rejected a generic plan, it's possible to reach here with is_valid
* false due to an invalidation while making the generic plan. In theory
* the invalidation must be a false positive, perhaps a consequence of an
* sinval reset event or the CLOBBER_CACHE_ALWAYS debug code. But for
* sinval reset event or the debug_invalidate_system_caches_always code. But for
* safety, let's treat it as real and redo the RevalidateCachedQuery call.
*/
if (!plansource->is_valid)

View File

@ -91,15 +91,15 @@
#define RELCACHE_INIT_FILEMAGIC 0x573266 /* version ID value */
/*
* Default policy for whether to apply RECOVER_RELATION_BUILD_MEMORY:
* do so in clobber-cache builds but not otherwise. This choice can be
* overridden at compile time with -DRECOVER_RELATION_BUILD_MEMORY=1 or =0.
* Whether to bother checking if relation cache memory needs to be freed
* eagerly. See also RelationBuildDesc() and pg_config_manual.h.
*/
#ifndef RECOVER_RELATION_BUILD_MEMORY
#if defined(CLOBBER_CACHE_ALWAYS) || defined(CLOBBER_CACHE_RECURSIVELY)
#define RECOVER_RELATION_BUILD_MEMORY 1
#if defined(RECOVER_RELATION_BUILD_MEMORY) && (RECOVER_RELATION_BUILD_MEMORY != 0)
#define MAYBE_RECOVER_RELATION_BUILD_MEMORY 1
#else
#define RECOVER_RELATION_BUILD_MEMORY 0
#ifdef CLOBBER_CACHE_ENABLED
#define MAYBE_RECOVER_RELATION_BUILD_MEMORY 1
#endif
#endif
@ -1040,19 +1040,25 @@ RelationBuildDesc(Oid targetRelId, bool insertIt)
* scope, and relcache loads shouldn't happen so often that it's essential
* to recover transient data before end of statement/transaction. However
* that's definitely not true in clobber-cache test builds, and perhaps
* it's not true in other cases. If RECOVER_RELATION_BUILD_MEMORY is not
* zero, arrange to allocate the junk in a temporary context that we'll
* free before returning. Make it a child of caller's context so that it
* will get cleaned up appropriately if we error out partway through.
* it's not true in other cases.
*
* When cache clobbering is enabled or when forced to by
* RECOVER_RELATION_BUILD_MEMORY=1, arrange to allocate the junk in a
* temporary context that we'll free before returning. Make it a child
* of caller's context so that it will get cleaned up appropriately if
* we error out partway through.
*/
#if RECOVER_RELATION_BUILD_MEMORY
MemoryContext tmpcxt;
MemoryContext oldcxt;
#ifdef MAYBE_RECOVER_RELATION_BUILD_MEMORY
MemoryContext tmpcxt = NULL;
MemoryContext oldcxt = NULL;
tmpcxt = AllocSetContextCreate(CurrentMemoryContext,
"RelationBuildDesc workspace",
ALLOCSET_DEFAULT_SIZES);
oldcxt = MemoryContextSwitchTo(tmpcxt);
if (RECOVER_RELATION_BUILD_MEMORY || debug_invalidate_system_caches_always > 0)
{
tmpcxt = AllocSetContextCreate(CurrentMemoryContext,
"RelationBuildDesc workspace",
ALLOCSET_DEFAULT_SIZES);
oldcxt = MemoryContextSwitchTo(tmpcxt);
}
#endif
/*
@ -1065,10 +1071,13 @@ RelationBuildDesc(Oid targetRelId, bool insertIt)
*/
if (!HeapTupleIsValid(pg_class_tuple))
{
#if RECOVER_RELATION_BUILD_MEMORY
/* Return to caller's context, and blow away the temporary context */
MemoryContextSwitchTo(oldcxt);
MemoryContextDelete(tmpcxt);
#ifdef MAYBE_RECOVER_RELATION_BUILD_MEMORY
if (tmpcxt)
{
/* Return to caller's context, and blow away the temporary context */
MemoryContextSwitchTo(oldcxt);
MemoryContextDelete(tmpcxt);
}
#endif
return NULL;
}
@ -1247,10 +1256,13 @@ RelationBuildDesc(Oid targetRelId, bool insertIt)
/* It's fully valid */
relation->rd_isvalid = true;
#if RECOVER_RELATION_BUILD_MEMORY
/* Return to caller's context, and blow away the temporary context */
MemoryContextSwitchTo(oldcxt);
MemoryContextDelete(tmpcxt);
#ifdef MAYBE_RECOVER_RELATION_BUILD_MEMORY
if (tmpcxt)
{
/* Return to caller's context, and blow away the temporary context */
MemoryContextSwitchTo(oldcxt);
MemoryContextDelete(tmpcxt);
}
#endif
return relation;
@ -1646,8 +1658,9 @@ LookupOpclassInfo(Oid operatorClassOid,
* while we are loading the info, and it's very hard to provoke that if
* this happens only once per opclass per backend.
*/
#if defined(CLOBBER_CACHE_ALWAYS)
opcentry->valid = false;
#ifdef CLOBBER_CACHE_ENABLED
if (debug_invalidate_system_caches_always > 0)
opcentry->valid = false;
#endif
if (opcentry->valid)

View File

@ -99,6 +99,7 @@
#include "utils/rls.h"
#include "utils/snapmgr.h"
#include "utils/tzparser.h"
#include "utils/inval.h"
#include "utils/varlena.h"
#include "utils/xml.h"
@ -3402,6 +3403,29 @@ static struct config_int ConfigureNamesInt[] =
check_huge_page_size, NULL, NULL
},
{
{"debug_invalidate_system_caches_always", PGC_SUSET, DEVELOPER_OPTIONS,
gettext_noop("Aggressively invalidate system caches for debugging purposes."),
NULL,
GUC_NOT_IN_SAMPLE
},
&debug_invalidate_system_caches_always,
#ifdef CLOBBER_CACHE_ENABLED
/* Set default based on older compile-time-only cache clobber macros */
#if defined(CLOBBER_CACHE_RECURSIVELY)
3,
#elif defined(CLOBBER_CACHE_ALWAYS)
1,
#else
0,
#endif
0, 5,
#else /* not CLOBBER_CACHE_ENABLED */
0, 0, 0,
#endif /* not CLOBBER_CACHE_ENABLED */
NULL, NULL, NULL
},
/* End-of-list marker */
{
{NULL, 0, 0, NULL, NULL}, NULL, 0, 0, 0, NULL, NULL, NULL

View File

@ -309,6 +309,45 @@
*/
/* #define RANDOMIZE_ALLOCATED_MEMORY */
/*
* For cache invalidation debugging, define CLOBBER_CACHE_ENABLED to enable
* use of the debug_invalidate_system_caches_always GUC to aggressively flush
* syscache/relcache entries whenever it's possible to deliver invalidations.
* See AcceptInvalidationMessages() in src/backend/utils/cache/inval.c for
* details.
*
* USE_ASSERT_CHECKING builds default to enabling this. It's possible to use
* CLOBBER_CACHE_ENABLED without a cassert build and the implied
* CLOBBER_FREED_MEMORY and MEMORY_CONTEXT_CHECKING options but it's unlikely
* to be as effective at identifying problems.
*/
/* #define CLOBBER_CACHE_ENABLED */
#if defined(USE_ASSERT_CHECKING) && !defined(CLOBBER_CACHE_ENABLED)
#define CLOBBER_CACHE_ENABLED
#endif
/*
* Backwards compatibility for the older compile-time-only cache clobber
* macros.
*/
#if !defined(CLOBBER_CACHE_ENABLED) && (defined(CLOBBER_CACHE_ALWAYS) || defined(CLOBBER_CACHE_RECURSIVELY))
#define CLOBBER_CACHE_ENABLED
#endif
/*
* Recover memory used for relcache entries when invalidated. See
* RelationBuildDescr() in src/backend/utils/cache/relcache.c.
*
* This is active automatically for clobber cache builds when clobbering is
* active, but can be overridden here by explicitly defining
* RECOVER_RELATION_BUILD_MEMORY. Define to 1 to always free relation cache
* memory even when clobber is off, or to 0 to never free relation cache
* memory even when clobbering is on.
*/
/* #define RECOVER_RELATION_BUILD_MEMORY 0 */ /* Force disable */
/* #define RECOVER_RELATION_BUILD_MEMORY 1 */ /* Force enable */
/*
* Define this to force all parse and plan trees to be passed through
* copyObject(), to facilitate catching errors and omissions in

View File

@ -18,6 +18,7 @@
#include "storage/relfilenode.h"
#include "utils/relcache.h"
extern PGDLLIMPORT int debug_invalidate_system_caches_always;
typedef void (*SyscacheCallbackFunction) (Datum arg, int cacheid, uint32 hashvalue);
typedef void (*RelcacheCallbackFunction) (Datum arg, Oid relid);