2010-02-26 02:55:35 +01:00
|
|
|
ACCESS_ALLOWED_ACE
|
2018-04-26 20:45:04 +02:00
|
|
|
ACL
|
2010-02-26 02:55:35 +01:00
|
|
|
ACL_SIZE_INFORMATION
|
|
|
|
AFFIX
|
|
|
|
ASN1_INTEGER
|
2016-05-02 15:23:55 +02:00
|
|
|
ASN1_OBJECT
|
2010-02-26 02:55:35 +01:00
|
|
|
ASN1_OCTET_STRING
|
|
|
|
ASN1_STRING
|
|
|
|
AV
|
|
|
|
A_ArrayExpr
|
|
|
|
A_Const
|
|
|
|
A_Expr
|
|
|
|
A_Expr_Kind
|
|
|
|
A_Indices
|
|
|
|
A_Indirection
|
|
|
|
A_Star
|
|
|
|
AbsoluteTime
|
2016-04-27 17:47:28 +02:00
|
|
|
AccessMethodInfo
|
2010-02-26 02:55:35 +01:00
|
|
|
AccessPriv
|
|
|
|
Acl
|
|
|
|
AclItem
|
|
|
|
AclMaskHow
|
|
|
|
AclMode
|
|
|
|
AclResult
|
2012-06-10 21:15:31 +02:00
|
|
|
AcquireSampleRowsFunc
|
2016-04-27 17:47:28 +02:00
|
|
|
ActionList
|
2010-02-26 02:55:35 +01:00
|
|
|
ActiveSnapshotElt
|
2012-06-10 21:15:31 +02:00
|
|
|
AddForeignUpdateTargets_function
|
2010-02-26 02:55:35 +01:00
|
|
|
AffixNode
|
|
|
|
AffixNodeData
|
|
|
|
AfterTriggerEvent
|
|
|
|
AfterTriggerEventChunk
|
|
|
|
AfterTriggerEventData
|
|
|
|
AfterTriggerEventList
|
|
|
|
AfterTriggerShared
|
|
|
|
AfterTriggerSharedData
|
|
|
|
AfterTriggersData
|
|
|
|
AfterTriggersQueryData
|
|
|
|
AfterTriggersTableData
|
|
|
|
AfterTriggersTransData
|
|
|
|
Agg
|
2011-06-09 20:01:49 +02:00
|
|
|
AggClauseCosts
|
2010-02-26 02:55:35 +01:00
|
|
|
AggInfo
|
2016-04-27 17:47:28 +02:00
|
|
|
AggPath
|
2016-08-15 19:42:51 +02:00
|
|
|
AggSplit
|
2010-02-26 02:55:35 +01:00
|
|
|
AggState
|
|
|
|
AggStatePerAgg
|
|
|
|
AggStatePerGroup
|
2017-05-17 21:52:16 +02:00
|
|
|
AggStatePerHash
|
2015-05-24 03:20:37 +02:00
|
|
|
AggStatePerPhase
|
2016-05-02 15:23:55 +02:00
|
|
|
AggStatePerTrans
|
2010-02-26 02:55:35 +01:00
|
|
|
AggStrategy
|
2021-05-12 19:14:10 +02:00
|
|
|
AggTransInfo
|
2010-02-26 02:55:35 +01:00
|
|
|
Aggref
|
2016-04-27 17:47:28 +02:00
|
|
|
AggregateInstrumentation
|
2013-05-29 22:58:43 +02:00
|
|
|
AlenState
|
2010-02-26 02:55:35 +01:00
|
|
|
Alias
|
|
|
|
AllocBlock
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context; instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if either of the two 30-bit
fields is not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3 bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk: we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB; it's just that these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
AllocFreeListLink
|
2010-02-26 02:55:35 +01:00
|
|
|
AllocPointer
|
|
|
|
AllocSet
|
|
|
|
AllocSetContext
|
2018-04-26 20:45:04 +02:00
|
|
|
AllocSetFreeList
|
2010-02-26 02:55:35 +01:00
|
|
|
AllocateDesc
|
|
|
|
AllocateDescKind
|
2017-05-17 21:52:16 +02:00
|
|
|
AlterCollationStmt
|
2010-02-26 02:55:35 +01:00
|
|
|
AlterDatabaseRefreshCollStmt
|
|
|
|
AlterDatabaseSetStmt
|
|
|
|
AlterDatabaseStmt
|
|
|
|
AlterDefaultPrivilegesStmt
|
|
|
|
AlterDomainStmt
|
2011-04-09 05:11:37 +02:00
|
|
|
AlterEnumStmt
|
2010-02-26 02:55:35 +01:00
|
|
|
AlterEventTrigStmt
|
2011-04-09 05:11:37 +02:00
|
|
|
AlterExtensionContentsStmt
|
|
|
|
AlterExtensionStmt
|
2010-02-26 02:55:35 +01:00
|
|
|
AlterFdwStmt
|
|
|
|
AlterForeignServerStmt
|
|
|
|
AlterFunctionStmt
|
2016-04-05 23:38:54 +02:00
|
|
|
AlterObjectDependsStmt
|
2010-02-26 02:55:35 +01:00
|
|
|
AlterObjectSchemaStmt
|
|
|
|
AlterOpFamilyStmt
|
2016-04-27 17:47:28 +02:00
|
|
|
AlterOperatorStmt
|
2010-02-26 02:55:35 +01:00
|
|
|
AlterOwnerStmt
|
2015-05-24 03:20:37 +02:00
|
|
|
AlterPolicyStmt
|
2017-05-17 21:52:16 +02:00
|
|
|
AlterPublicationAction
|
|
|
|
AlterPublicationStmt
|
2010-02-26 02:55:35 +01:00
|
|
|
AlterRoleSetStmt
|
|
|
|
AlterRoleStmt
|
|
|
|
AlterSeqStmt
|
|
|
|
AlterStatsStmt
|
2017-05-17 21:52:16 +02:00
|
|
|
AlterSubscriptionStmt
|
|
|
|
AlterSubscriptionType
|
2014-05-06 15:08:14 +02:00
|
|
|
AlterSystemStmt
|
2015-05-24 03:20:37 +02:00
|
|
|
AlterTSConfigType
|
2010-02-26 02:55:35 +01:00
|
|
|
AlterTSConfigurationStmt
|
|
|
|
AlterTSDictionaryStmt
|
|
|
|
AlterTableCmd
|
2014-08-22 01:06:17 +02:00
|
|
|
AlterTableMoveAllStmt
|
2014-05-06 15:08:14 +02:00
|
|
|
AlterTableSpaceOptionsStmt
|
2010-02-26 02:55:35 +01:00
|
|
|
AlterTableStmt
|
|
|
|
AlterTableType
|
|
|
|
AlterTableUtilityContext
|
2017-05-17 21:52:16 +02:00
|
|
|
AlterTypeRecurseParams
|
2010-02-26 02:55:35 +01:00
|
|
|
AlterTypeStmt
|
|
|
|
AlterUserMappingStmt
|
|
|
|
AlteredTableInfo
|
|
|
|
AlternativeSubPlan
|
|
|
|
AmcheckOptions
|
2012-06-10 21:15:31 +02:00
|
|
|
AnalyzeAttrComputeStatsFunc
|
2010-02-26 02:55:35 +01:00
|
|
|
AnalyzeAttrFetchFunc
|
2012-06-10 21:15:31 +02:00
|
|
|
AnalyzeForeignTable_function
|
2021-05-12 19:14:10 +02:00
|
|
|
AnlExprData
|
2010-02-26 02:55:35 +01:00
|
|
|
AnlIndexData
|
2015-05-24 03:20:37 +02:00
|
|
|
AnyArrayType
|
2010-02-26 02:55:35 +01:00
|
|
|
Append
|
|
|
|
AppendPath
|
|
|
|
AppendRelInfo
|
|
|
|
AppendState
|
2021-08-27 05:00:23 +02:00
|
|
|
ApplyErrorCallbackArg
|
2012-06-10 21:15:31 +02:00
|
|
|
ApplyExecutionData
|
Add support for streaming to built-in logical replication.
To add support for streaming of in-progress transactions into the
built-in logical replication, we need to do three things:
* Extend the logical replication protocol to identify in-progress
transactions, and allow adding additional bits of information (e.g.
XID of subtransactions).
* Modify the output plugin (pgoutput) to implement the new stream
API callbacks, by leveraging the extended replication protocol.
* Modify the replication apply worker to properly handle streamed
in-progress transactions by spilling the data to disk and then
replaying them on commit.
We must, however, explicitly disable streaming during replication
slot creation, even if the plugin supports it. We don't need to
replicate the changes accumulated during this phase, and moreover we
don't have a replication connection open, so we have nowhere to send
the data anyway.
Author: Tomas Vondra, Dilip Kumar and Amit Kapila
Reviewed-by: Amit Kapila, Kuntal Ghosh and Ajin Cherian
Tested-by: Neha Sharma, Mahendra Singh Thalor and Ajin Cherian
Discussion: https://postgr.es/m/688b0b7f-2f6c-d827-c27b-216a8e3ea700@2ndquadrant.com
2020-09-03 04:24:07 +02:00
|
|
|
ApplySubXactData
|
2010-02-26 02:55:35 +01:00
|
|
|
Archive
|
2022-05-12 21:17:30 +02:00
|
|
|
ArchiveCheckConfiguredCB
|
2017-05-17 21:52:16 +02:00
|
|
|
ArchiveEntryPtrType
|
2022-05-12 21:17:30 +02:00
|
|
|
ArchiveFileCB
|
2010-02-26 02:55:35 +01:00
|
|
|
ArchiveFormat
|
|
|
|
ArchiveHandle
|
|
|
|
ArchiveMode
|
|
|
|
ArchiveModuleCallbacks
|
|
|
|
ArchiveModuleInit
|
Redesign archive modules
A new callback named startup_cb, called shortly after a module is
loaded, is added. This makes it possible to initialize any additional
state data required by a module. This initial state data can be saved
in an ArchiveModuleState, which is now passed down to all the callbacks
that can be defined in a module. With this design, it is possible to
have a per-module state, aimed at opening the door to the support of
more than one archive module.
The initialization of the callbacks is changed so that
_PG_archive_module_init() no longer receives as input an
ArchiveModuleCallbacks that a module has to fill in with callback
definitions. Instead, a module now needs to return a const
ArchiveModuleCallbacks.
All the structure and callback definitions of archive modules are moved
into their own header, named archive_module.h, from pgarch.h.
Command-based archiving follows the same line, with a new set of files
named shell_archive.{c,h}.
There are a few more items that are under discussion to improve the
design of archive modules, like the fact that basic_archive calls
sigsetjmp() by itself to define its own error-handling flow. These will
be adjusted later; the changes done here already cover a good portion
of what has been discussed.
Any modules created for v15 will need to be adjusted to this new
design.
Author: Nathan Bossart
Reviewed-by: Andres Freund
Discussion: https://postgr.es/m/20230130194810.6fztfgbn32e7qarj@awork3.anarazel.de
2023-02-17 06:26:42 +01:00
|
|
|
ArchiveModuleState
|
2019-05-22 18:55:34 +02:00
|
|
|
ArchiveOpts
|
2014-05-06 15:08:14 +02:00
|
|
|
ArchiveShutdownCB
|
2010-02-26 02:55:35 +01:00
|
|
|
ArchiveStreamState
|
2012-06-10 21:15:31 +02:00
|
|
|
ArchiverOutput
|
2010-02-26 02:55:35 +01:00
|
|
|
ArchiverStage
|
2012-06-10 21:15:31 +02:00
|
|
|
ArrayAnalyzeExtraData
|
2010-02-26 02:55:35 +01:00
|
|
|
ArrayBuildState
|
2015-05-24 03:20:37 +02:00
|
|
|
ArrayBuildStateAny
|
|
|
|
ArrayBuildStateArr
|
2010-02-26 02:55:35 +01:00
|
|
|
ArrayCoerceExpr
|
|
|
|
ArrayConstIterState
|
|
|
|
ArrayExpr
|
|
|
|
ArrayExprIterState
|
2017-05-17 21:52:16 +02:00
|
|
|
ArrayIOData
|
2011-04-09 05:11:37 +02:00
|
|
|
ArrayIterator
|
2010-02-26 02:55:35 +01:00
|
|
|
ArrayMapState
|
|
|
|
ArrayMetaState
|
|
|
|
ArrayParseState
|
Implementation of subscripting for jsonb
Subscripting for jsonb does not support slices, does not have a limit on the
number of subscripts, and an assignment expects the replacement value to be
of jsonb type. There is also one functional difference between assignment via
subscripting and assignment via jsonb_set(): when the original jsonb container
is NULL, subscripting replaces it with an empty jsonb and proceeds with
the assignment.
For the sake of code reuse, we rearrange some parts of the jsonb functionality
to allow the same functions to be used for jsonb_set and the subscripting
assignment operation.
The original idea belongs to Oleg Bartunov.
Catversion is bumped.
Discussion: https://postgr.es/m/CA%2Bq6zcV8qvGcDXurwwgUbwACV86Th7G80pnubg42e-p9gsSf%3Dg%40mail.gmail.com
Discussion: https://postgr.es/m/CA%2Bq6zcX3mdxGCgdThzuySwH-ApyHHM-G4oB1R0fn0j2hZqqkLQ%40mail.gmail.com
Discussion: https://postgr.es/m/CA%2Bq6zcVDuGBv%3DM0FqBYX8DPebS3F_0KQ6OVFobGJPM507_SZ_w%40mail.gmail.com
Discussion: https://postgr.es/m/CA%2Bq6zcVovR%2BXY4mfk-7oNk-rF91gH0PebnNfuUjuuDsyHjOcVA%40mail.gmail.com
Author: Dmitry Dolgov
Reviewed-by: Tom Lane, Arthur Zakirov, Pavel Stehule, Dian M Fay
Reviewed-by: Andrew Dunstan, Chapman Flack, Merlin Moncure, Peter Geoghegan
Reviewed-by: Alvaro Herrera, Jim Nasby, Josh Berkus, Victor Wagner
Reviewed-by: Aleksander Alekseev, Robert Haas, Oleg Bartunov
2021-01-31 21:50:40 +01:00
|
|
|
ArraySubWorkspace
|
2010-02-26 02:55:35 +01:00
|
|
|
ArrayType
|
|
|
|
AsyncQueueControl
|
|
|
|
AsyncQueueEntry
|
Defer flushing of SLRU files.
Previously, we called fsync() after writing out individual pg_xact,
pg_multixact and pg_commit_ts pages due to cache pressure, leading to
regular I/O stalls in user backends and recovery. Collapse requests for
the same file into a single system call as part of the next checkpoint,
as we already did for relation files, using the infrastructure developed
by commit 3eb77eba. This can cause a significant improvement to
recovery performance, especially when it's otherwise CPU-bound.
Hoist ProcessSyncRequests() up into CheckPointGuts() to make it clearer
that it applies to all the SLRU mini-buffer-pools as well as the main
buffer pool. Rearrange things so that data collected in CheckpointStats
includes SLRU activity.
Also remove the Shutdown{CLOG,CommitTS,SUBTRANS,MultiXact}() functions,
because they were redundant after the shutdown checkpoint that
immediately precedes them. (I'm not sure if they were ever needed, but
they aren't now.)
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> (parts)
Tested-by: Jakub Wartak <Jakub.Wartak@tomtom.com>
Discussion: https://postgr.es/m/CA+hUKGLJ=84YT+NvhkEEDAuUtVHMfQ9i-N7k_o50JmQ6Rpj_OQ@mail.gmail.com
2020-09-25 08:49:43 +02:00
|
|
|
AsyncRequest
|
2010-02-26 02:55:35 +01:00
|
|
|
AttInMetadata
|
2017-05-17 21:52:16 +02:00
|
|
|
AttStatsSlot
|
2010-02-26 02:55:35 +01:00
|
|
|
AttoptCacheEntry
|
|
|
|
AttoptCacheKey
|
|
|
|
AttrDefInfo
|
|
|
|
AttrDefault
|
2019-12-18 08:23:02 +01:00
|
|
|
AttrMap
|
2018-04-26 20:45:04 +02:00
|
|
|
AttrMissing
|
2010-02-26 02:55:35 +01:00
|
|
|
AttrNumber
|
|
|
|
AttributeOpts
|
|
|
|
AuthRequest
|
Refactor code related to pg_hba_file_rules() into new file
hba.c is growing big, and more contents are planned for it. In order to
prepare for this future work, this commit moves all the code related to
the system function processing the contents of pg_hba.conf,
pg_hba_file_rules() to a new file called hbafuncs.c, which will be used
as the location for the SQL portion of the authentication file parsing.
While on it, HbaToken, the structure holding a string token lexed from a
configuration file related to authentication, is renamed to a more
generic AuthToken, as it gets used not only for pg_hba.conf, but also
for pg_ident.conf. TokenizedLine is now named TokenizedAuthLine.
The size of hba.c is reduced by ~12%.
Author: Julien Rouhaud
Reviewed-by: Aleksander Alekseev, Michael Paquier
Discussion: https://postgr.es/m/20220223045959.35ipdsvbxcstrhya@jrouhaud
2022-03-24 04:42:30 +01:00
|
|
|
AuthToken
|
2017-08-21 20:43:00 +02:00
|
|
|
AutoPrewarmSharedState
|
2010-02-26 02:55:35 +01:00
|
|
|
AutoVacOpts
|
|
|
|
AutoVacuumShmemStruct
|
2017-05-17 21:52:16 +02:00
|
|
|
AutoVacuumWorkItem
|
|
|
|
AutoVacuumWorkItemType
|
2010-02-26 02:55:35 +01:00
|
|
|
AuxProcType
|
|
|
|
BF_ctx
|
|
|
|
BF_key
|
|
|
|
BF_word
|
2011-11-14 18:12:23 +01:00
|
|
|
BF_word_signed
|
2010-02-26 02:55:35 +01:00
|
|
|
BIGNUM
|
|
|
|
BIO
|
|
|
|
BIO_METHOD
|
|
|
|
BITVECP
|
2012-06-10 21:15:31 +02:00
|
|
|
BMS_Comparison
|
2010-02-26 02:55:35 +01:00
|
|
|
BMS_Membership
|
|
|
|
BN_CTX
|
|
|
|
BOOL
|
|
|
|
BOOLEAN
|
|
|
|
BOX
|
2011-11-14 18:12:23 +01:00
|
|
|
BTArrayKeyInfo
|
2010-02-26 02:55:35 +01:00
|
|
|
BTBuildState
|
|
|
|
BTCycleId
|
2020-05-14 19:06:38 +02:00
|
|
|
BTDedupInterval
|
|
|
|
BTDedupState
|
2014-05-06 15:08:14 +02:00
|
|
|
BTDedupStateData
|
2010-02-26 02:55:35 +01:00
|
|
|
BTDeletedPageData
|
|
|
|
BTIndexStat
|
|
|
|
BTInsertState
|
2014-05-06 15:08:14 +02:00
|
|
|
BTInsertStateData
|
Support parallel btree index builds.
To make this work, tuplesort.c and logtape.c must also support
parallelism, so this patch adds that infrastructure and then applies
it to the particular case of parallel btree index builds. Testing
to date shows that this can often be 2-3x faster than a serial
index build.
The model for deciding how many workers to use is fairly primitive
at present, but it's better than not having the feature. We can
refine it as we get more experience.
Peter Geoghegan with some help from Rushabh Lathia. While Heikki
Linnakangas is not an author of this patch, he wrote other patches
without which this feature would not have been possible, and
therefore the release notes should possibly credit him as an author
of this feature. Reviewed by Claudio Freire, Heikki Linnakangas,
Thomas Munro, Tels, Amit Kapila, me.
Discussion: http://postgr.es/m/CAM3SWZQKM=Pzc=CAHzRixKjp2eO5Q0Jg1SoFQqeXFQ647JiwqQ@mail.gmail.com
Discussion: http://postgr.es/m/CAH2-Wz=AxWqDoVvGU7dq856S4r6sJAj6DBn7VMtigkB33N5eyg@mail.gmail.com
2018-02-02 19:25:55 +01:00
|
|
|
BTLeader
|
2010-02-26 02:55:35 +01:00
|
|
|
BTMetaPageData
|
|
|
|
BTOneVacInfo
|
2019-11-25 01:40:53 +01:00
|
|
|
BTOptions
|
2017-05-17 21:52:16 +02:00
|
|
|
BTPS_State
|
2010-02-26 02:55:35 +01:00
|
|
|
BTPageOpaque
|
|
|
|
BTPageOpaqueData
|
|
|
|
BTPageStat
|
|
|
|
BTPageState
|
2017-02-15 13:41:14 +01:00
|
|
|
BTParallelScanDesc
|
2021-05-12 19:14:10 +02:00
|
|
|
BTPendingFSM
|
2019-05-22 18:55:34 +02:00
|
|
|
BTScanInsert
|
2014-05-06 15:08:14 +02:00
|
|
|
BTScanInsertData
|
2010-02-26 02:55:35 +01:00
|
|
|
BTScanOpaque
|
|
|
|
BTScanOpaqueData
|
2015-05-24 03:20:37 +02:00
|
|
|
BTScanPos
|
2010-02-26 02:55:35 +01:00
|
|
|
BTScanPosData
|
|
|
|
BTScanPosItem
|
2018-02-02 19:25:55 +01:00
|
|
|
BTShared
|
2011-11-14 18:12:23 +01:00
|
|
|
BTSortArrayContext
|
2010-02-26 02:55:35 +01:00
|
|
|
BTSpool
|
|
|
|
BTStack
|
|
|
|
BTStackData
|
|
|
|
BTVacInfo
|
|
|
|
BTVacState
|
2020-05-14 19:06:38 +02:00
|
|
|
BTVacuumPosting
|
|
|
|
BTVacuumPostingData
|
2010-02-26 02:55:35 +01:00
|
|
|
BTWriteState
|
2021-05-12 19:14:10 +02:00
|
|
|
BUF_MEM
|
2010-02-26 02:55:35 +01:00
|
|
|
BYTE
|
|
|
|
BY_HANDLE_FILE_INFORMATION
|
|
|
|
Backend
|
|
|
|
BackendId
|
2011-04-09 05:11:37 +02:00
|
|
|
BackendParameters
|
2012-06-10 21:15:31 +02:00
|
|
|
BackendState
|
2017-05-17 21:52:16 +02:00
|
|
|
BackendType
|
2013-05-29 22:58:43 +02:00
|
|
|
BackgroundWorker
|
2014-05-06 15:08:14 +02:00
|
|
|
BackgroundWorkerArray
|
|
|
|
BackgroundWorkerHandle
|
|
|
|
BackgroundWorkerSlot
|
2022-09-26 04:15:47 +02:00
|
|
|
BackupState
|
2018-04-26 20:45:04 +02:00
|
|
|
Barrier
|
2011-04-09 05:11:37 +02:00
|
|
|
BaseBackupCmd
|
|
|
|
BaseBackupTargetHandle
|
|
|
|
BaseBackupTargetType
|
2016-05-02 15:23:55 +02:00
|
|
|
BeginDirectModify_function
|
2018-04-26 20:45:04 +02:00
|
|
|
BeginForeignInsert_function
|
2011-04-09 05:11:37 +02:00
|
|
|
BeginForeignModify_function
|
|
|
|
BeginForeignScan_function
|
2016-05-02 15:23:55 +02:00
|
|
|
BeginSampleScan_function
|
2015-05-24 03:20:37 +02:00
|
|
|
BernoulliSamplerData
|
2013-05-29 22:58:43 +02:00
|
|
|
BgWorkerStartTime
|
2014-05-06 15:08:14 +02:00
|
|
|
BgwHandleStatus
|
2019-05-22 18:55:34 +02:00
|
|
|
BinaryArithmFunc
|
2010-02-26 02:55:35 +01:00
|
|
|
BindParamCbData
|
2015-05-24 03:20:37 +02:00
|
|
|
BipartiteMatchState
|
2022-05-12 21:17:30 +02:00
|
|
|
BitString
|
2010-02-26 02:55:35 +01:00
|
|
|
BitmapAnd
|
|
|
|
BitmapAndPath
|
|
|
|
BitmapAndState
|
|
|
|
BitmapHeapPath
|
|
|
|
BitmapHeapScan
|
|
|
|
BitmapHeapScanState
|
|
|
|
BitmapIndexScan
|
|
|
|
BitmapIndexScanState
|
|
|
|
BitmapOr
|
|
|
|
BitmapOrPath
|
|
|
|
BitmapOrState
|
|
|
|
Bitmapset
|
|
|
|
BlobInfo
|
|
|
|
Block
|
|
|
|
BlockId
|
|
|
|
BlockIdData
|
2017-08-21 20:43:00 +02:00
|
|
|
BlockInfoRecord
|
2010-02-26 02:55:35 +01:00
|
|
|
BlockNumber
|
|
|
|
BlockSampler
|
|
|
|
BlockSamplerData
|
2016-04-27 17:47:28 +02:00
|
|
|
BlockedProcData
|
|
|
|
BlockedProcsData
|
|
|
|
BloomBuildState
|
2018-04-01 02:49:41 +02:00
|
|
|
BloomFilter
|
2016-05-02 15:23:55 +02:00
|
|
|
BloomMetaPageData
|
|
|
|
BloomOpaque
|
2016-04-27 17:47:28 +02:00
|
|
|
BloomOptions
|
2016-05-02 15:23:55 +02:00
|
|
|
BloomPageOpaque
|
|
|
|
BloomPageOpaqueData
|
|
|
|
BloomScanOpaque
|
|
|
|
BloomScanOpaqueData
|
2016-06-09 18:15:33 +02:00
|
|
|
BloomSignatureWord
|
2016-04-27 17:47:28 +02:00
|
|
|
BloomState
|
|
|
|
BloomTuple
|
2014-05-06 15:08:14 +02:00
|
|
|
BoolAggState
|
2010-02-26 02:55:35 +01:00
|
|
|
BoolExpr
|
|
|
|
BoolExprType
|
|
|
|
BoolTestType
|
2022-05-12 21:17:30 +02:00
|
|
|
Boolean
|
2010-02-26 02:55:35 +01:00
|
|
|
BooleanTest
|
|
|
|
BpChar
|
2015-05-24 03:20:37 +02:00
|
|
|
BrinBuildState
|
|
|
|
BrinDesc
|
|
|
|
BrinMemTuple
|
|
|
|
BrinMetaPageData
|
|
|
|
BrinOpaque
|
|
|
|
BrinOpcInfo
|
|
|
|
BrinOptions
|
|
|
|
BrinRevmap
|
|
|
|
BrinSpecialSpace
|
2017-05-17 21:52:16 +02:00
|
|
|
BrinStatsData
|
2015-05-24 03:20:37 +02:00
|
|
|
BrinTuple
|
|
|
|
BrinValues
|
Add amcheck extension to contrib.
This is the beginning of a collection of SQL-callable functions to
verify the integrity of data files. For now it only contains code to
verify B-Tree indexes.
This adds two SQL-callable functions, validating B-Tree consistency to
a varying degree. Check the extensive docs for details.
The goal is to later extend the coverage of the module to further
access methods, possibly including the heap. Once checks for
additional access methods exist, we'll likely add some "dispatch"
functions that cover multiple access methods.
Author: Peter Geoghegan, editorialized by Andres Freund
Reviewed-By: Andres Freund, Tomas Vondra, Thomas Munro,
Anastasia Lubennikova, Robert Haas, Amit Langote
Discussion: CAM3SWZQzLMhMwmBqjzK+pRKXrNUZ4w90wYMUWfkeV8mZ3Debvw@mail.gmail.com
2017-03-10 00:50:40 +01:00
|
|
|
BtreeCheckState
|
|
|
|
BtreeLevel
|
2010-02-26 02:55:35 +01:00
|
|
|
Bucket
|
|
|
|
BufFile
|
|
|
|
Buffer
|
|
|
|
BufferAccessStrategy
|
|
|
|
BufferAccessStrategyType
|
|
|
|
BufferCachePagesContext
|
|
|
|
BufferCachePagesRec
|
|
|
|
BufferDesc
|
2015-05-24 03:20:37 +02:00
|
|
|
BufferDescPadded
|
2010-02-26 02:55:35 +01:00
|
|
|
BufferHeapTupleTableSlot
|
|
|
|
BufferLookupEnt
|
|
|
|
BufferStrategyControl
|
|
|
|
BufferTag
|
|
|
|
BufferUsage
|
|
|
|
BuildAccumulator
|
2016-04-27 17:47:28 +02:00
|
|
|
BuiltinScript
|
2010-02-26 02:55:35 +01:00
|
|
|
BulkInsertState
|
|
|
|
BulkInsertStateData
|
|
|
|
CACHESIGN
|
|
|
|
CAC_state
|
Improve sys/catcache performance.
The following are the individual improvements:
1) Avoidance of FunctionCallInfo based function calls, replaced by
more efficient functions with a native C argument interface.
2) Don't extract columns from a cache entry's tuple whenever matching
entries - instead store them as a Datum array. This also allows us to
get rid of having to build dummy tuples for negative & list
entries, and of a hack for dealing with cstring vs. text weirdness.
3) Reorder members of the catcache.h structs, so important entries are
more likely to be on one cacheline.
4) Allowing the compiler to specialize the critical SearchCatCache for a
specific number of attributes allows it to unroll loops and avoid
other nkeys-dependent initialization.
5) Only initializing the ScanKey when necessary, i.e. on catcache
misses, greatly reduces unnecessary CPU cache misses.
6) Splitting the cache-miss case off from the hash lookup reduces stack
allocations etc. in the common case.
7) CatCTup and their corresponding heaptuple are allocated in one
piece.
This results in making cache lookups themselves roughly three times as
fast - full-system benchmarks obviously improve less than that.
I've also evaluated further techniques:
- replace open coded hash with simplehash - the list walk right now
shows up in profiles. Unfortunately it's not easy to do so safely as
an entry's memory location can change at various times, which
doesn't work well with the refcounting and cache invalidation.
- Cacheline-aligning CatCTup entries - helps some with performance,
but the win isn't big and the code for it is ugly, because the
tuples have to be freed as well.
- add more proper functions, rather than macros for
SearchSysCacheCopyN etc., but right now they don't show up in
profiles.
The reason the macro wrapper for syscache.c/h have to be changed,
rather than just catcache, is that doing otherwise would require
exposing the SysCache array to the outside. That might be a good idea
anyway, but it's for another day.
Author: Andres Freund
Reviewed-By: Robert Haas
Discussion: https://postgr.es/m/20170914061207.zxotvyopetm7lrrp@alap3.anarazel.de
2017-10-13 22:16:50 +02:00
|
|
|
CCFastEqualFN
|
2017-11-29 15:24:24 +01:00
|
|
|
CCHashFN
|
2015-05-24 03:20:37 +02:00
|
|
|
CEOUC_WAIT_MODE
|
2010-02-26 02:55:35 +01:00
|
|
|
CFuncHashTabEntry
|
|
|
|
CHAR
|
2011-04-09 05:11:37 +02:00
|
|
|
CHECKPOINT
|
2010-02-26 02:55:35 +01:00
|
|
|
CHKVAL
|
|
|
|
CIRCLE
|
|
|
|
CMPDAffix
|
|
|
|
CONTEXT
|
|
|
|
COP
|
|
|
|
CRITICAL_SECTION
|
2017-05-17 21:52:16 +02:00
|
|
|
CRSSnapshotAction
|
2010-02-26 02:55:35 +01:00
|
|
|
CState
|
2021-05-12 19:14:10 +02:00
|
|
|
CTECycleClause
|
2019-03-27 22:59:19 +01:00
|
|
|
CTEMaterialize
|
2021-05-12 19:14:10 +02:00
|
|
|
CTESearchClause
|
2010-02-26 02:55:35 +01:00
|
|
|
CV
|
2019-05-22 18:55:34 +02:00
|
|
|
CachedExpression
|
2010-02-26 02:55:35 +01:00
|
|
|
CachedPlan
|
|
|
|
CachedPlanSource
|
2018-04-26 20:45:04 +02:00
|
|
|
CallContext
|
|
|
|
CallStmt
|
2010-02-26 02:55:35 +01:00
|
|
|
CancelRequestPacket
|
2022-05-12 21:17:30 +02:00
|
|
|
Cardinality
|
2010-02-26 02:55:35 +01:00
|
|
|
CaseExpr
|
|
|
|
CaseTestExpr
|
|
|
|
CaseWhen
|
|
|
|
Cash
|
|
|
|
CastInfo
|
|
|
|
CatCList
|
|
|
|
CatCTup
|
|
|
|
CatCache
|
|
|
|
CatCacheHeader
|
|
|
|
CatalogId
|
2017-05-17 21:52:16 +02:00
|
|
|
CatalogIdMapEntry
|
2010-02-26 02:55:35 +01:00
|
|
|
CatalogIndexState
|
|
|
|
ChangeVarNodes_context
|
|
|
|
CheckPoint
|
|
|
|
CheckPointStmt
|
|
|
|
CheckpointStatsData
|
2012-06-10 21:15:31 +02:00
|
|
|
CheckpointerRequest
|
|
|
|
CheckpointerShmemStruct
|
2010-02-26 02:55:35 +01:00
|
|
|
Chromosome
|
2016-02-19 21:17:51 +01:00
|
|
|
CkptSortItem
|
|
|
|
CkptTsStatus
|
2011-04-09 05:11:37 +02:00
|
|
|
ClientAuthentication_hook_type
|
2019-05-22 18:55:34 +02:00
|
|
|
ClientCertMode
|
|
|
|
ClientCertName
|
Allow parallel workers to retrieve some data from Port
This commit moves authn_id into a new global structure called
ClientConnectionInfo (mapping to a MyClientConnectionInfo for each
backend) which is intended to hold all the client information that
should be shared between the backend and any of its parallel workers,
access for extensions and triggers being the primary use case. There is
no need to push all the data of Port to the workers, and authn_id is
quite a generic concept so using a separate structure provides the best
balance (the name of the structure has been suggested by Robert Haas).
While on it, and per discussion as this would be useful for a potential
SYSTEM_USER that can be accessed through parallel workers, a second
field is added for the authentication method, copied directly from
Port.
ClientConnectionInfo is serialized and restored using a new parallel
key and a structure tracks the length of the authn_id, making the
addition of more fields straightforward.
Author: Jacob Champion
Reviewed-by: Bertrand Drouvot, Stephen Frost, Robert Haas, Tom Lane,
Michael Paquier, Julien Rouhaud
Discussion: https://postgr.es/m/793d990837ae5c06a558d58d62de9378ab525d83.camel@vmware.com
2022-08-24 05:57:13 +02:00
|
|
|
ClientConnectionInfo
|
2010-02-26 02:55:35 +01:00
|
|
|
ClientData
|
2017-05-17 21:52:16 +02:00
|
|
|
ClonePtrType
|
2010-02-26 02:55:35 +01:00
|
|
|
ClosePortalStmt
|
2017-05-17 21:52:16 +02:00
|
|
|
ClosePtrType
|
2010-02-26 02:55:35 +01:00
|
|
|
Clump
|
2010-07-06 21:18:19 +02:00
|
|
|
ClusterInfo
|
2021-01-18 06:03:10 +01:00
|
|
|
ClusterParams
|
2010-02-26 02:55:35 +01:00
|
|
|
ClusterStmt
|
|
|
|
CmdType
|
|
|
|
CoalesceExpr
|
|
|
|
CoerceParamHook
|
|
|
|
CoerceToDomain
|
|
|
|
CoerceToDomainValue
|
|
|
|
CoerceViaIO
|
|
|
|
CoercionContext
|
|
|
|
CoercionForm
|
|
|
|
CoercionPathType
|
2017-08-14 23:29:33 +02:00
|
|
|
CollAliasData
|
2011-04-09 05:11:37 +02:00
|
|
|
CollInfo
|
|
|
|
CollateClause
|
|
|
|
CollateExpr
|
|
|
|
CollateStrength
|
2015-05-24 03:20:37 +02:00
|
|
|
CollectedATSubcmd
|
|
|
|
CollectedCommand
|
|
|
|
CollectedCommandType
|
2013-05-29 22:58:43 +02:00
|
|
|
ColorTrgm
|
|
|
|
ColorTrgmInfo
|
2010-02-26 02:55:35 +01:00
|
|
|
ColumnCompareData
|
|
|
|
ColumnDef
|
|
|
|
ColumnIOData
|
|
|
|
ColumnRef
|
|
|
|
ColumnsHashData
|
2017-05-17 21:52:16 +02:00
|
|
|
CombinationGenerator
|
2010-02-26 02:55:35 +01:00
|
|
|
ComboCidEntry
|
|
|
|
ComboCidEntryData
|
|
|
|
ComboCidKey
|
|
|
|
ComboCidKeyData
|
|
|
|
Command
|
|
|
|
CommandDest
|
|
|
|
CommandId
|
2020-05-14 19:06:38 +02:00
|
|
|
CommandTag
|
|
|
|
CommandTagBehavior
|
2010-02-26 02:55:35 +01:00
|
|
|
CommentItem
|
|
|
|
CommentStmt
|
2015-05-24 03:20:37 +02:00
|
|
|
CommitTimestampEntry
|
|
|
|
CommitTimestampShared
|
2011-11-14 18:12:23 +01:00
|
|
|
CommonEntry
|
2010-02-26 02:55:35 +01:00
|
|
|
CommonTableExpr
|
|
|
|
CompareScalarsContext
|
2018-03-20 10:20:46 +01:00
|
|
|
CompiledExprState
|
2017-05-17 21:52:16 +02:00
|
|
|
CompositeIOData
|
2010-02-26 02:55:35 +01:00
|
|
|
CompositeTypeStmt
|
2016-04-27 17:47:28 +02:00
|
|
|
CompoundAffixFlag
|
2023-02-23 18:33:30 +01:00
|
|
|
CompressFileHandle
|
2010-02-26 02:55:35 +01:00
|
|
|
CompressionLocation
|
2011-04-09 05:11:37 +02:00
|
|
|
CompressorState
|
snapshot scalability: Don't compute global horizons while building snapshots.
To make GetSnapshotData() more scalable, it cannot look at each proc's
xmin: While snapshot contents do not need to change whenever a read-only
transaction commits or a snapshot is released, a proc's xmin is modified in
those cases. The frequency of xmin modifications leads to, particularly on
higher core count systems, many cache misses inside GetSnapshotData(), despite
the data underlying a snapshot not changing. That is the most
significant source of GetSnapshotData() scaling poorly on larger systems.
Without accessing xmins, GetSnapshotData() cannot calculate accurate horizons /
thresholds as it has so far. But we don't really have to: The horizons don't
actually change that much between GetSnapshotData() calls. Nor are the horizons
actually used every time a snapshot is built.
The trick this commit introduces is to delay computation of accurate horizons
until their use, and to use horizon boundaries to determine whether accurate
horizons need to be computed.
The use of RecentGlobal[Data]Xmin to decide whether a row version could be
removed has been replaced with new GlobalVisTest* functions. These use two
thresholds to determine whether a row can be pruned:
1) definitely_needed, indicating that rows deleted by XIDs >= definitely_needed
are definitely still visible.
2) maybe_needed, indicating that rows deleted by XIDs < maybe_needed can
definitely be removed.
GetSnapshotData() updates definitely_needed to be the xmin of the computed
snapshot.
When testing whether a row can be removed (with GlobalVisTestIsRemovableXid())
and the tested XID falls in between the two (i.e. XID >= maybe_needed && XID <
definitely_needed) the boundaries can be recomputed to be more accurate. As it
is not cheap to compute accurate boundaries, we limit the number of times that
happens in short succession. As the boundaries used by
GlobalVisTestIsRemovableXid() are never reset (with maybe_needed updated by
GetSnapshotData()), it is likely that further tests can benefit from an earlier
computation of accurate horizons.
To avoid regressing performance when old_snapshot_threshold is set (as that
requires an accurate horizon to be computed), heap_page_prune_opt() doesn't
unconditionally call TransactionIdLimitedForOldSnapshots() anymore. Both the
computation of the limited horizon, and the triggering of errors (with
SetOldSnapshotThresholdTimestamp()) is now only done when necessary to remove
tuples.
This commit just removes the accesses to PGXACT->xmin from
GetSnapshotData(), but other members of PGXACT residing in the same
cache line are accessed. Therefore this in itself does not result in a
significant improvement. Subsequent commits will take advantage of the
fact that GetSnapshotData() now does not need to access xmins anymore.
Note: This contains a workaround in heap_page_prune_opt() to keep the
snapshot_too_old tests working. While that workaround is ugly, the tests
currently are not meaningful, and it seems best to address them separately.
Author: Andres Freund <andres@anarazel.de>
Reviewed-By: Robert Haas <robertmhaas@gmail.com>
Reviewed-By: Thomas Munro <thomas.munro@gmail.com>
Reviewed-By: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/20200301083601.ews6hz5dduc3w2se@alap3.anarazel.de
2020-08-13 01:03:49 +02:00
|
|
|
ComputeXidHorizonsResult
|
2016-12-13 16:51:32 +01:00
|
|
|
ConditionVariable
|
2021-03-10 22:05:58 +01:00
|
|
|
ConditionVariableMinimallyPadded
|
2017-05-17 21:52:16 +02:00
|
|
|
ConditionalStack
|
2016-04-27 17:47:28 +02:00
|
|
|
ConfigData
|
2011-04-09 05:11:37 +02:00
|
|
|
ConfigVariable
|
2010-02-26 02:55:35 +01:00
|
|
|
ConnCacheEntry
|
2013-05-29 22:58:43 +02:00
|
|
|
ConnCacheKey
|
Refactor and generalize the ParallelSlot machinery.
Create a wrapper object, ParallelSlotArray, to encapsulate the
number of slots and the slot array itself, plus some other relevant
bits of information. This reduces the number of parameters we have
to pass around all over the place.
Allow for a ParallelSlotArray to contain slots connected to
different databases within a single cluster. The current clients
of this mechanism don't need this, but it is expected to be used
by future patches.
Defer connecting to databases until we actually need the connection
for something. This is a slight behavior change for vacuumdb and
reindexdb. If you specify a number of jobs that is larger than the
number of objects, the extra connections will now not be used.
But, on the other hand, if you specify a number of jobs that is
so large that it's going to fail, the failure would previously have
happened before any operations were actually started, and now it
won't.
Mark Dilger, reviewed by me.
Discussion: http://postgr.es/m/12ED3DA8-25F0-4B68-937D-D907CFBF08E7@enterprisedb.com
Discussion: http://postgr.es/m/BA592F2D-F928-46FF-9516-2B827F067F57@enterprisedb.com
2021-03-11 19:17:46 +01:00
|
|
|
ConnParams
|
2010-02-26 02:55:35 +01:00
|
|
|
ConnStatusType
|
|
|
|
ConnType
|
2016-12-13 16:51:32 +01:00
|
|
|
ConnectionStateEnum
|
2011-11-14 18:12:23 +01:00
|
|
|
ConsiderSplitContext
|
2010-02-26 02:55:35 +01:00
|
|
|
Const
|
|
|
|
ConstrCheck
|
|
|
|
ConstrType
|
|
|
|
Constraint
|
|
|
|
ConstraintCategory
|
|
|
|
ConstraintInfo
|
|
|
|
ConstraintsSetStmt
|
2010-07-06 21:18:19 +02:00
|
|
|
ControlData
|
2010-02-26 02:55:35 +01:00
|
|
|
ControlFileData
|
|
|
|
ConvInfo
|
|
|
|
ConvProcInfo
|
|
|
|
ConversionLocation
|
|
|
|
ConvertRowtypeExpr
|
|
|
|
CookedConstraint
|
|
|
|
CopyDest
|
|
|
|
CopyFormatOptions
|
2021-05-12 19:14:10 +02:00
|
|
|
CopyFromState
|
2010-02-26 02:55:35 +01:00
|
|
|
CopyFromStateData
|
2022-05-12 21:17:30 +02:00
|
|
|
CopyHeaderChoice
|
tableam: Add table_multi_insert() and revamp/speed-up COPY FROM buffering.
This adds table_multi_insert(), and converts COPY FROM, the only user
of heap_multi_insert, to it.
A simple conversion of COPY FROM use slots would have yielded a
slowdown when inserting into a partitioned table for some
workloads. Different partitions might need different slots (both slot
types and their descriptors), and dropping / creating slots when
there are constant partition changes is measurable.
Thus instead revamp the COPY FROM buffering for partitioned tables to
allow to buffer inserts into multiple tables, flushing only when
limits are reached across all partition buffers. By only dropping
slots when there've been inserts into too many different partitions,
the aforementioned overhead is gone. By allowing larger batches, even
when there are frequent partition changes, we actually speed such cases
up significantly.
By using slots, COPY of very narrow rows into unlogged / temporary tables
might slow down very slightly (due to the indirect function calls).
Author: David Rowley, Andres Freund, Haribabu Kommi
Discussion:
https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
https://postgr.es/m/20190327054923.t3epfuewxfqdt22e@alap3.anarazel.de
2019-04-05 00:47:19 +02:00
|
|
|
CopyInsertMethod
|
|
|
|
CopyMultiInsertBuffer
|
|
|
|
CopyMultiInsertInfo
|
2010-02-26 02:55:35 +01:00
|
|
|
CopySource
|
|
|
|
CopyStmt
|
2021-05-12 19:14:10 +02:00
|
|
|
CopyToState
|
2010-02-26 02:55:35 +01:00
|
|
|
CopyToStateData
|
|
|
|
Cost
|
|
|
|
CostSelector
|
|
|
|
Counters
|
2013-06-01 16:18:59 +02:00
|
|
|
CoverExt
|
2010-02-26 02:55:35 +01:00
|
|
|
CoverPos
|
2016-04-27 17:47:28 +02:00
|
|
|
CreateAmStmt
|
2010-02-26 02:55:35 +01:00
|
|
|
CreateCastStmt
|
|
|
|
CreateConversionStmt
|
Add new block-by-block strategy for CREATE DATABASE.
Because this strategy logs changes on a block-by-block basis, it
avoids the need to checkpoint before and after the operation.
However, because it logs each changed block individually, it might
generate a lot of extra write-ahead logging if the template database
is large. Therefore, the older strategy remains available via a new
STRATEGY parameter to CREATE DATABASE, and a corresponding --strategy
option to createdb.
Somewhat controversially, this patch assembles the list of relations
to be copied to the new database by reading the pg_class relation of
the template database. Cross-database access like this isn't normally
possible, but it can be made to work here because there can't be any
connections to the database being copied, nor can it contain any
in-doubt transactions. Even so, we have to use lower-level interfaces
than normal, since the table scan and relcache interfaces will not
work for a database to which we're not connected. The advantage of
this approach is that we do not need to rely on the filesystem to
determine what ought to be copied, but instead on PostgreSQL's own
knowledge of the database structure. This avoids, for example,
copying stray files that happen to be located in the source database
directory.
Dilip Kumar, with a fairly large number of cosmetic changes by me.
Reviewed and tested by Ashutosh Sharma, Andres Freund, John Naylor,
Greg Nancarrow, Neha Sharma. Additional feedback from Bruce Momjian,
Heikki Linnakangas, Julien Rouhaud, Adam Brusselback, Kyotaro
Horiguchi, Tomas Vondra, Andrew Dunstan, Álvaro Herrera, and others.
Discussion: http://postgr.es/m/CA+TgmoYtcdxBjLh31DLxUXHxFVMPGzrU5_T=CYCvRyFHywSBUQ@mail.gmail.com
2022-03-29 17:31:43 +02:00
|
|
|
CreateDBRelInfo
|
|
|
|
CreateDBStrategy
|
2010-02-26 02:55:35 +01:00
|
|
|
CreateDomainStmt
|
|
|
|
CreateEnumStmt
|
|
|
|
CreateEventTrigStmt
|
2011-04-09 05:11:37 +02:00
|
|
|
CreateExtensionStmt
|
2010-02-26 02:55:35 +01:00
|
|
|
CreateFdwStmt
|
|
|
|
CreateForeignServerStmt
|
2011-04-09 05:11:37 +02:00
|
|
|
CreateForeignTableStmt
|
2010-02-26 02:55:35 +01:00
|
|
|
CreateFunctionStmt
|
|
|
|
CreateOpClassItem
|
|
|
|
CreateOpClassStmt
|
|
|
|
CreateOpFamilyStmt
|
|
|
|
CreatePLangStmt
|
2015-05-24 03:20:37 +02:00
|
|
|
CreatePolicyStmt
|
2017-05-17 21:52:16 +02:00
|
|
|
CreatePublicationStmt
|
2011-11-14 18:12:23 +01:00
|
|
|
CreateRangeStmt
|
2014-05-06 15:08:14 +02:00
|
|
|
CreateReplicationSlotCmd
|
2010-02-26 02:55:35 +01:00
|
|
|
CreateRoleStmt
|
|
|
|
CreateSchemaStmt
|
|
|
|
CreateSchemaStmtContext
|
|
|
|
CreateSeqStmt
|
2017-05-17 21:52:16 +02:00
|
|
|
CreateStatsStmt
|
2010-02-26 02:55:35 +01:00
|
|
|
CreateStmt
|
|
|
|
CreateStmtContext
|
2017-05-17 21:52:16 +02:00
|
|
|
CreateSubscriptionStmt
|
2012-06-10 21:15:31 +02:00
|
|
|
CreateTableAsStmt
|
2010-02-26 02:55:35 +01:00
|
|
|
CreateTableSpaceStmt
|
2015-05-24 03:20:37 +02:00
|
|
|
CreateTransformStmt
|
2010-02-26 02:55:35 +01:00
|
|
|
CreateTrigStmt
|
|
|
|
CreateUserMappingStmt
|
|
|
|
CreatedbStmt
|
2011-04-09 05:11:37 +02:00
|
|
|
CredHandle
|
2010-02-26 02:55:35 +01:00
|
|
|
CteItem
|
|
|
|
CteScan
|
|
|
|
CteScanState
|
|
|
|
CteState
|
|
|
|
CtlCommand
|
2011-04-09 05:11:37 +02:00
|
|
|
CtxtHandle
|
2010-02-26 02:55:35 +01:00
|
|
|
CurrentOfExpr
|
2015-05-24 03:20:37 +02:00
|
|
|
CustomExecMethods
|
2017-05-17 21:52:16 +02:00
|
|
|
CustomOutPtrType
|
2015-05-24 03:20:37 +02:00
|
|
|
CustomPath
|
|
|
|
CustomScan
|
|
|
|
CustomScanMethods
|
|
|
|
CustomScanState
|
2010-02-26 02:55:35 +01:00
|
|
|
CycleCtr
|
|
|
|
DBState
|
|
|
|
DCHCacheEntry
|
|
|
|
DEADLOCK_INFO
|
2012-06-10 21:15:31 +02:00
|
|
|
DECountItem
|
2010-02-26 02:55:35 +01:00
|
|
|
DH
|
|
|
|
DIR
|
2014-05-06 15:08:14 +02:00
|
|
|
DNSServiceErrorType
|
|
|
|
DNSServiceRef
|
2010-02-26 02:55:35 +01:00
|
|
|
DR_copy
|
|
|
|
DR_intorel
|
|
|
|
DR_printtup
|
|
|
|
DR_sqlfunction
|
2013-05-29 22:58:43 +02:00
|
|
|
DR_transientrel
|
2017-05-17 21:52:16 +02:00
|
|
|
DSA
|
2010-02-26 02:55:35 +01:00
|
|
|
DWORD
|
|
|
|
DataDumperPtr
|
|
|
|
DataPageDeleteStack
|
2021-05-12 19:14:10 +02:00
|
|
|
DatabaseInfo
|
2010-02-26 02:55:35 +01:00
|
|
|
DateADT
|
|
|
|
Datum
|
|
|
|
DatumTupleFields
|
2010-07-06 21:18:19 +02:00
|
|
|
DbInfo
|
|
|
|
DbInfoArr
|
2017-05-17 21:52:16 +02:00
|
|
|
DeClonePtrType
|
2010-02-26 02:55:35 +01:00
|
|
|
DeadLockState
|
|
|
|
DeallocateStmt
|
|
|
|
DeclareCursorStmt
|
2015-05-24 03:20:37 +02:00
|
|
|
DecodedBkpBlock
|
2022-03-18 05:45:04 +01:00
|
|
|
DecodedXLogRecord
|
2014-05-06 15:08:14 +02:00
|
|
|
DecodingOutputState
|
2010-02-26 02:55:35 +01:00
|
|
|
DefElem
|
|
|
|
DefElemAction
|
|
|
|
DefaultACLInfo
|
|
|
|
DefineStmt
|
|
|
|
DeleteStmt
|
2017-05-17 21:52:16 +02:00
|
|
|
DependencyGenerator
|
|
|
|
DependencyGeneratorData
|
2010-02-26 02:55:35 +01:00
|
|
|
DependencyType
|
|
|
|
DestReceiver
|
|
|
|
DictISpell
|
|
|
|
DictInt
|
|
|
|
DictSimple
|
|
|
|
DictSnowball
|
|
|
|
DictSubState
|
|
|
|
DictSyn
|
|
|
|
DictThesaurus
|
2011-04-09 05:11:37 +02:00
|
|
|
DimensionInfo
|
2016-12-13 16:51:32 +01:00
|
|
|
DirectoryMethodData
|
|
|
|
DirectoryMethodFile
|
2013-05-29 22:58:43 +02:00
|
|
|
DisableTimeoutParams
|
2010-02-26 02:55:35 +01:00
|
|
|
DiscardMode
|
|
|
|
DiscardStmt
|
2021-05-12 19:14:10 +02:00
|
|
|
DistanceValue
|
2010-02-26 02:55:35 +01:00
|
|
|
DistinctExpr
|
|
|
|
DoStmt
|
|
|
|
DocRepresentation
|
2015-05-24 03:20:37 +02:00
|
|
|
DomainConstraintCache
|
|
|
|
DomainConstraintRef
|
2010-02-26 02:55:35 +01:00
|
|
|
DomainConstraintState
|
|
|
|
DomainConstraintType
|
|
|
|
DomainIOData
|
|
|
|
DropBehavior
|
|
|
|
DropOwnedStmt
|
2014-02-01 04:45:17 +01:00
|
|
|
DropReplicationSlotCmd
|
2010-02-26 02:55:35 +01:00
|
|
|
DropRoleStmt
|
|
|
|
DropStmt
|
2017-05-17 21:52:16 +02:00
|
|
|
DropSubscriptionStmt
|
2010-02-26 02:55:35 +01:00
|
|
|
DropTableSpaceStmt
|
|
|
|
DropUserMappingStmt
|
|
|
|
DropdbStmt
|
2016-05-02 15:23:55 +02:00
|
|
|
DumpComponents
|
2010-02-26 02:55:35 +01:00
|
|
|
DumpId
|
2015-05-24 03:20:37 +02:00
|
|
|
DumpOptions
|
2016-06-09 18:15:33 +02:00
|
|
|
DumpSignalInformation
|
2022-05-12 21:17:30 +02:00
|
|
|
DumpableAcl
|
2010-02-26 02:55:35 +01:00
|
|
|
DumpableObject
|
|
|
|
DumpableObjectType
|
|
|
|
DumpableObjectWithAcl
|
|
|
|
DynamicFileList
|
2015-05-24 03:20:37 +02:00
|
|
|
DynamicZoneAbbrev
|
2014-05-06 15:08:14 +02:00
|
|
|
EC_KEY
|
2010-02-26 02:55:35 +01:00
|
|
|
EDGE
|
|
|
|
ENGINE
|
2015-05-24 03:20:37 +02:00
|
|
|
EOM_flatten_into_method
|
|
|
|
EOM_get_flat_size_method
|
2010-02-26 02:55:35 +01:00
|
|
|
EPQState
|
|
|
|
EPlan
|
|
|
|
EState
|
2022-05-12 21:17:30 +02:00
|
|
|
EStatus
|
2016-12-13 16:51:32 +01:00
|
|
|
EVP_CIPHER
|
|
|
|
EVP_CIPHER_CTX
|
2010-02-26 02:55:35 +01:00
|
|
|
EVP_MD
|
|
|
|
EVP_MD_CTX
|
|
|
|
EVP_PKEY
|
2013-05-29 22:58:43 +02:00
|
|
|
EachState
|
2010-02-26 02:55:35 +01:00
|
|
|
Edge
|
2016-04-27 17:47:28 +02:00
|
|
|
EditableObjectType
|
2013-05-29 22:58:43 +02:00
|
|
|
ElementsState
|
|
|
|
EnableTimeoutParams
|
2017-05-17 21:52:16 +02:00
|
|
|
EndBlobPtrType
|
|
|
|
EndBlobsPtrType
|
|
|
|
EndDataPtrType
|
2016-05-02 15:23:55 +02:00
|
|
|
EndDirectModify_function
|
2018-04-26 20:45:04 +02:00
|
|
|
EndForeignInsert_function
|
2011-04-09 05:11:37 +02:00
|
|
|
EndForeignModify_function
|
|
|
|
EndForeignScan_function
|
2022-02-16 08:30:38 +01:00
|
|
|
EndOfWalRecoveryInfo
|
2016-05-02 15:23:55 +02:00
|
|
|
EndSampleScan_function
|
2011-04-09 05:11:37 +02:00
|
|
|
EnumItem
|
2010-02-26 02:55:35 +01:00
|
|
|
EolType
|
2017-05-17 21:52:16 +02:00
|
|
|
EphemeralNameRelationType
|
|
|
|
EphemeralNamedRelation
|
|
|
|
EphemeralNamedRelationData
|
|
|
|
EphemeralNamedRelationMetadata
|
|
|
|
EphemeralNamedRelationMetadataData
|
2010-02-26 02:55:35 +01:00
|
|
|
EquivalenceClass
|
|
|
|
EquivalenceMember
|
|
|
|
ErrorContextCallback
|
|
|
|
ErrorData
|
2022-12-09 15:58:38 +01:00
|
|
|
ErrorSaveContext
|
2016-05-02 15:23:55 +02:00
|
|
|
EstimateDSMForeignScan_function
|
|
|
|
EstimationInfo
|
2010-02-26 02:55:35 +01:00
|
|
|
EventTriggerCacheEntry
|
|
|
|
EventTriggerCacheItem
|
|
|
|
EventTriggerCacheStateType
|
|
|
|
EventTriggerData
|
|
|
|
EventTriggerEvent
|
|
|
|
EventTriggerInfo
|
|
|
|
EventTriggerQueryState
|
|
|
|
ExceptionLabelMap
|
2011-04-09 05:11:37 +02:00
|
|
|
ExceptionMap
|
|
|
|
ExecAuxRowMark
|
2018-04-26 20:45:04 +02:00
|
|
|
ExecEvalBoolSubroutine
|
2010-02-26 02:55:35 +01:00
|
|
|
ExecEvalJsonExprContext
|
2018-04-26 20:45:04 +02:00
|
|
|
ExecEvalSubroutine
|
2011-04-09 05:11:37 +02:00
|
|
|
ExecForeignBatchInsert_function
|
2012-06-10 21:15:31 +02:00
|
|
|
ExecForeignDelete_function
|
2011-04-09 05:11:37 +02:00
|
|
|
ExecForeignInsert_function
|
|
|
|
ExecForeignTruncate_function
|
|
|
|
ExecForeignUpdate_function
|
2016-04-27 17:47:28 +02:00
|
|
|
ExecParallelEstimateContext
|
|
|
|
ExecParallelInitializeDSMContext
|
|
|
|
ExecPhraseData
|
2017-08-14 23:29:33 +02:00
|
|
|
ExecProcNodeMtd
|
2010-02-26 02:55:35 +01:00
|
|
|
ExecRowMark
|
|
|
|
ExecScanAccessMtd
|
|
|
|
ExecScanRecheckMtd
|
|
|
|
ExecStatus
|
|
|
|
ExecStatusType
|
|
|
|
ExecuteStmt
|
2011-04-09 05:11:37 +02:00
|
|
|
ExecutorCheckPerms_hook_type
|
2010-02-26 02:55:35 +01:00
|
|
|
ExecutorEnd_hook_type
|
2011-04-09 05:11:37 +02:00
|
|
|
ExecutorFinish_hook_type
|
2010-02-26 02:55:35 +01:00
|
|
|
ExecutorRun_hook_type
|
|
|
|
ExecutorStart_hook_type
|
2015-05-24 03:20:37 +02:00
|
|
|
ExpandedArrayHeader
|
|
|
|
ExpandedObjectHeader
|
|
|
|
ExpandedObjectMethods
|
2021-05-12 19:14:10 +02:00
|
|
|
ExpandedRange
|
2018-04-26 20:45:04 +02:00
|
|
|
ExpandedRecordFieldInfo
|
|
|
|
ExpandedRecordHeader
|
2016-05-02 15:23:55 +02:00
|
|
|
ExplainDirectModify_function
|
2011-04-09 05:11:37 +02:00
|
|
|
ExplainForeignModify_function
|
|
|
|
ExplainForeignScan_function
|
2010-02-26 02:55:35 +01:00
|
|
|
ExplainFormat
|
|
|
|
ExplainOneQuery_hook_type
|
|
|
|
ExplainState
|
|
|
|
ExplainStmt
|
|
|
|
ExplainWorkersState
|
2017-06-21 20:09:24 +02:00
|
|
|
ExportedSnapshot
|
2010-02-26 02:55:35 +01:00
|
|
|
Expr
|
|
|
|
ExprContext
|
|
|
|
ExprContextCallbackFunction
|
|
|
|
ExprContext_CB
|
|
|
|
ExprDoneCond
|
Faster expression evaluation and targetlist projection.
This replaces the old, recursive tree-walk based evaluation, with
non-recursive, opcode dispatch based, expression evaluation.
Projection is now implemented as part of expression evaluation.
This both leads to significant performance improvements, and makes
future just-in-time compilation of expressions easier.
The speed gains primarily come from:
- non-recursive implementation reduces stack usage / overhead
- simple sub-expressions are implemented with a single jump, without
function calls
- sharing some state between different sub-expressions
- reduced amount of indirect/hard to predict memory accesses by laying
out operation metadata sequentially; including the avoidance of
nearly all of the previously used linked lists
- more code has been moved to expression initialization, avoiding
constant re-checks at evaluation time
Future just-in-time compilation (JIT) has become easier, as
demonstrated by released patches intended to be merged in a later
release, for primarily two reasons: Firstly, due to a stricter split
between expression initialization and evaluation, less code has to be
handled by the JIT. Secondly, due to the non-recursive nature of the
generated "instructions", less performance-critical code-paths can
easily be shared between interpreted and compiled evaluation.
The new framework allows for significant future optimizations. E.g.:
- basic infrastructure to later reduce the per executor-startup
overhead of expression evaluation, by caching state in prepared
statements. That'd be helpful in OLTPish scenarios where
initialization overhead is measurable.
- optimizing the generated "code". A number of proposals for potential
work has already been made.
- optimizing the interpreter. Similarly a number of proposals have
been made here too.
The move of logic into the expression initialization step leads to some
backward-incompatible changes:
- Function permission checks are now done during expression
initialization, whereas previously they were done during
execution. In edge cases this can lead to errors being raised that
previously wouldn't have been, e.g. a NULL array being coerced to a
different array type previously didn't perform checks.
- The set of domain constraints to be checked, is now evaluated once
during expression initialization, previously it was re-built
every time a domain check was evaluated. For normal queries this
doesn't change much, but e.g. for plpgsql functions, which cache
ExprStates, the old set could stick around longer. This behavior
might still change.
Author: Andres Freund, with significant changes by Tom Lane,
changes by Heikki Linnakangas
Reviewed-By: Tom Lane, Heikki Linnakangas
Discussion: https://postgr.es/m/20161206034955.bh33paeralxbtluv@alap3.anarazel.de
2017-03-14 23:45:36 +01:00
|
|
|
ExprEvalOp
|
2018-01-09 22:25:38 +01:00
|
|
|
ExprEvalOpLookup
|
2011-04-09 05:11:37 +02:00
|
|
|
ExprEvalRowtypeCache
|
2017-03-14 23:45:36 +01:00
|
|
|
ExprEvalStep
|
2017-05-17 21:52:16 +02:00
|
|
|
ExprState
|
|
|
|
ExprStateEvalFunc
|
2016-04-27 17:47:28 +02:00
|
|
|
ExtensibleNode
|
|
|
|
ExtensibleNodeEntry
|
|
|
|
ExtensibleNodeMethods
|
2011-04-09 05:11:37 +02:00
|
|
|
ExtensionControlFile
|
|
|
|
ExtensionInfo
|
|
|
|
ExtensionVersionInfo
|
2013-05-29 22:58:43 +02:00
|
|
|
FDWCollateState
|
2011-04-09 05:11:37 +02:00
|
|
|
FD_SET
|
2010-02-26 02:55:35 +01:00
|
|
|
FILE
|
|
|
|
FILETIME
|
|
|
|
FPI
|
|
|
|
FSMAddress
|
|
|
|
FSMPage
|
|
|
|
FSMPageData
|
|
|
|
FakeRelCacheEntry
|
|
|
|
FakeRelCacheEntryData
|
2011-11-14 18:12:23 +01:00
|
|
|
FastPathStrongRelationLockData
|
2010-02-26 02:55:35 +01:00
|
|
|
FdwInfo
|
2011-04-09 05:11:37 +02:00
|
|
|
FdwRoutine
|
2010-02-26 02:55:35 +01:00
|
|
|
FetchDirection
|
|
|
|
FetchStmt
|
|
|
|
FieldSelect
|
|
|
|
FieldStore
|
|
|
|
File
|
2011-04-09 05:11:37 +02:00
|
|
|
FileFdwExecutionState
|
2012-06-10 21:15:31 +02:00
|
|
|
FileFdwPlanState
|
2010-07-06 21:18:19 +02:00
|
|
|
FileNameMap
|
2021-08-30 05:15:35 +02:00
|
|
|
FileSet
|
2019-04-04 10:56:03 +02:00
|
|
|
FileTag
|
2015-05-24 03:20:37 +02:00
|
|
|
FinalPathExtraData
|
2010-02-26 02:55:35 +01:00
|
|
|
FindColsContext
|
|
|
|
FindSplitData
|
|
|
|
FindSplitStrat
|
2015-05-24 03:20:37 +02:00
|
|
|
FixedParallelExecutorState
|
|
|
|
FixedParallelState
|
2010-02-26 02:55:35 +01:00
|
|
|
FixedParamState
|
2016-04-27 17:47:28 +02:00
|
|
|
FlagMode
|
2022-05-12 21:17:30 +02:00
|
|
|
Float
|
2017-05-17 21:52:16 +02:00
|
|
|
FlushPosition
|
2010-02-26 02:55:35 +01:00
|
|
|
FmgrBuiltin
|
2011-04-09 05:11:37 +02:00
|
|
|
FmgrHookEventType
|
2010-02-26 02:55:35 +01:00
|
|
|
FmgrInfo
|
2020-05-14 19:06:38 +02:00
|
|
|
ForBothCellState
|
|
|
|
ForBothState
|
|
|
|
ForEachState
|
|
|
|
ForFiveState
|
|
|
|
ForFourState
|
|
|
|
ForThreeState
|
2011-04-09 05:11:37 +02:00
|
|
|
ForeignAsyncConfigureWait_function
|
|
|
|
ForeignAsyncNotify_function
|
2016-05-02 15:23:55 +02:00
|
|
|
ForeignAsyncRequest_function
|
2010-02-26 02:55:35 +01:00
|
|
|
ForeignDataWrapper
|
|
|
|
ForeignKeyCacheInfo
|
|
|
|
ForeignKeyOptInfo
|
2011-04-09 05:11:37 +02:00
|
|
|
ForeignPath
|
|
|
|
ForeignScan
|
|
|
|
ForeignScanState
|
2010-02-26 02:55:35 +01:00
|
|
|
ForeignServer
|
|
|
|
ForeignServerInfo
|
2011-04-09 05:11:37 +02:00
|
|
|
ForeignTable
|
Allow TRUNCATE command to truncate foreign tables.
This commit introduces new foreign data wrapper API for TRUNCATE.
It extends TRUNCATE command so that it accepts foreign tables as
the targets to truncate and invokes that API. Also it extends postgres_fdw
so that it can issue TRUNCATE command to foreign servers, by adding
new routine for that TRUNCATE API.
The information about options specified in TRUNCATE command, e.g.,
ONLY, CASCADE, etc. is passed to FDW via API. The list of foreign tables to
truncate is also passed to FDW. FDW truncates the foreign data sources
that the passed foreign tables specify, based on those information.
For example, postgres_fdw constructs TRUNCATE command using them
and issues it to the foreign server.
For performance, TRUNCATE command invokes the FDW routine for
TRUNCATE once per foreign server that foreign tables to truncate belong to.
Author: Kazutaka Onishi, Kohei KaiGai, slightly modified by Fujii Masao
Reviewed-by: Bharath Rupireddy, Michael Paquier, Zhihong Yu, Alvaro Herrera, Stephen Frost, Ashutosh Bapat, Amit Langote, Daniel Gustafsson, Ibrar Ahmed, Fujii Masao
Discussion: https://postgr.es/m/CAOP8fzb_gkReLput7OvOK+8NHgw-RKqNv59vem7=524krQTcWA@mail.gmail.com
Discussion: https://postgr.es/m/CAJuF6cMWDDqU-vn_knZgma+2GMaout68YUgn1uyDnexRhqqM5Q@mail.gmail.com
2021-04-08 13:56:08 +02:00
|
|
|
ForeignTruncateInfo
|
2010-02-26 02:55:35 +01:00
|
|
|
ForkNumber
|
|
|
|
FormData_pg_aggregate
|
|
|
|
FormData_pg_am
|
|
|
|
FormData_pg_amop
|
|
|
|
FormData_pg_amproc
|
|
|
|
FormData_pg_attrdef
|
|
|
|
FormData_pg_attribute
|
|
|
|
FormData_pg_auth_members
|
|
|
|
FormData_pg_authid
|
|
|
|
FormData_pg_cast
|
|
|
|
FormData_pg_class
|
2011-04-09 05:11:37 +02:00
|
|
|
FormData_pg_collation
|
2010-02-26 02:55:35 +01:00
|
|
|
FormData_pg_constraint
|
|
|
|
FormData_pg_conversion
|
|
|
|
FormData_pg_database
|
|
|
|
FormData_pg_default_acl
|
|
|
|
FormData_pg_depend
|
|
|
|
FormData_pg_enum
|
|
|
|
FormData_pg_event_trigger
|
2011-04-09 05:11:37 +02:00
|
|
|
FormData_pg_extension
|
2010-02-26 02:55:35 +01:00
|
|
|
FormData_pg_foreign_data_wrapper
|
|
|
|
FormData_pg_foreign_server
|
2011-04-09 05:11:37 +02:00
|
|
|
FormData_pg_foreign_table
|
2010-02-26 02:55:35 +01:00
|
|
|
FormData_pg_index
|
|
|
|
FormData_pg_inherits
|
|
|
|
FormData_pg_language
|
|
|
|
FormData_pg_largeobject
|
|
|
|
FormData_pg_largeobject_metadata
|
|
|
|
FormData_pg_namespace
|
|
FormData_pg_opclass
FormData_pg_operator
FormData_pg_opfamily
FormData_pg_partitioned_table
FormData_pg_policy
FormData_pg_proc
FormData_pg_publication
FormData_pg_publication_namespace
FormData_pg_publication_rel
FormData_pg_range
FormData_pg_replication_origin
FormData_pg_rewrite
FormData_pg_sequence
FormData_pg_sequence_data
FormData_pg_shdepend
FormData_pg_statistic
FormData_pg_statistic_ext
FormData_pg_statistic_ext_data
FormData_pg_subscription
FormData_pg_subscription_rel
FormData_pg_tablespace
FormData_pg_transform
FormData_pg_trigger
FormData_pg_ts_config
FormData_pg_ts_config_map
FormData_pg_ts_dict
FormData_pg_ts_parser
FormData_pg_ts_template
FormData_pg_type
FormData_pg_user_mapping
Form_pg_aggregate
Form_pg_am
Form_pg_amop
Form_pg_amproc
Form_pg_attrdef
Form_pg_attribute
Form_pg_auth_members
Form_pg_authid
Form_pg_cast
Form_pg_class
Form_pg_collation
Form_pg_constraint
Form_pg_conversion
Form_pg_database
Form_pg_default_acl
Form_pg_depend
Form_pg_enum
Form_pg_event_trigger
Form_pg_extension
Form_pg_foreign_data_wrapper
Form_pg_foreign_server
Form_pg_foreign_table
Form_pg_index
Form_pg_inherits
Form_pg_language
Form_pg_largeobject
Form_pg_largeobject_metadata
Form_pg_namespace
Form_pg_opclass
Form_pg_operator
Form_pg_opfamily
Form_pg_partitioned_table
Form_pg_policy
Form_pg_proc
Form_pg_publication
Form_pg_publication_namespace
Form_pg_publication_rel
Form_pg_range
Form_pg_replication_origin
Form_pg_rewrite
Form_pg_sequence
Form_pg_sequence_data
Form_pg_shdepend
Form_pg_statistic
Form_pg_statistic_ext
Form_pg_statistic_ext_data
Form_pg_subscription
Form_pg_subscription_rel
Form_pg_tablespace
Form_pg_transform
Form_pg_trigger
Form_pg_ts_config
Form_pg_ts_config_map
Form_pg_ts_dict
Form_pg_ts_parser
Form_pg_ts_template
Form_pg_type
Form_pg_user_mapping
FormatNode
FreeBlockNumberArray
FreeListData
FreePageBtree
FreePageBtreeHeader
FreePageBtreeInternalKey
FreePageBtreeLeafKey
FreePageBtreeSearchResult
FreePageManager
FreePageSpanLeader
FromCharDateMode
FromExpr
FullTransactionId
FuncCall
FuncCallContext
FuncCandidateList
FuncDetailCode
FuncExpr
FuncInfo
FuncLookupError
FunctionCallInfo
FunctionCallInfoBaseData
FunctionParameter
FunctionParameterMode
FunctionScan
FunctionScanPerFuncState
FunctionScanState
FuzzyAttrMatchState
GBT_NUMKEY
GBT_NUMKEY_R
GBT_VARKEY
GBT_VARKEY_R
GENERAL_NAME
GISTBuildBuffers
GISTBuildState
GISTDeletedPageContents
GISTENTRY
GISTInsertStack
GISTInsertState
GISTIntArrayBigOptions
GISTIntArrayOptions
GISTNodeBuffer
GISTNodeBufferPage
GISTPageOpaque
GISTPageOpaqueData
GISTPageSplitInfo
GISTSTATE
GISTScanOpaque
GISTScanOpaqueData
GISTSearchHeapItem
GISTSearchItem
GISTTYPE
GIST_SPLITVEC
GMReaderTupleBuffer
GROUP
GV
Gather
GatherMerge
GatherMergePath
GatherMergeState
GatherPath
GatherState
Gene
GeneratePruningStepsContext
GenerationBlock
GenerationContext
GenerationPointer
GenericCosts
GenericXLogState
GeqoPrivateData
GetEPQSlotArg
GetForeignJoinPaths_function
GetForeignModifyBatchSize_function
GetForeignPaths_function
GetForeignPlan_function
GetForeignRelSize_function
GetForeignRowMarkType_function
GetForeignUpperPaths_function
GetState
GiSTOptions
GinBtree
GinBtreeData
GinBtreeDataLeafInsertData
GinBtreeEntryInsertData
GinBtreeStack
GinBuildState
GinChkVal
GinEntries
GinEntryAccumulator
GinIndexStat
GinMetaPageData
GinNullCategory
GinOptions
GinPageOpaque
GinPageOpaqueData
GinPlaceToPageRC
GinPostingList
GinQualCounts
GinScanEntry
GinScanKey
GinScanOpaque
GinScanOpaqueData
GinState
GinStatsData
GinTernaryValue
GinTupleCollector
GinVacuumState
GistBuildMode
GistEntryVector
GistHstoreOptions
GistInetKey
GistNSN
GistOptBufferingMode
GistSortedBuildLevelState
GistSplitUnion
GistSplitVector
GistTsVectorOptions
GistVacState
GlobalTransaction
GlobalVisHorizonKind
GlobalVisState
GrantRoleStmt
GrantStmt
GrantTargetType
Group
GroupClause
GroupPath
GroupPathExtraData
GroupResultPath
GroupState
GroupVarInfo
GroupingFunc
GroupingSet
GroupingSetData
GroupingSetKind
GroupingSetsPath
GucAction
GucBoolAssignHook
GucBoolCheckHook
GucContext
GucEnumAssignHook
GucEnumCheckHook
GucIntAssignHook
GucIntCheckHook
GucRealAssignHook
GucRealCheckHook
GucShowHook
GucSource
GucStack
GucStackState
GucStringAssignHook
GucStringCheckHook
GzipCompressorState
HANDLE
HASHACTION
HASHBUCKET
HASHCTL
HASHELEMENT
HASHHDR
HASHSEGMENT
HASH_SEQ_STATUS
HE
HEntry
HIST_ENTRY
HKEY
HLOCAL
HMAC_CTX
HMODULE
HOldEntry
HRESULT
HSParser
HSpool
HStore
HTAB
HTSV_Result
HV
Hash
HashAggBatch
HashAggSpill
HashAllocFunc
HashBuildState
HashCompareFunc
HashCopyFunc
HashIndexStat
HashInstrumentation
HashJoin
HashJoinState
HashJoinTable
HashJoinTuple
HashMemoryChunk
HashMetaPage
HashMetaPageData
HashOptions
HashPageOpaque
HashPageOpaqueData
HashPageStat
HashPath
HashScanOpaque
HashScanOpaqueData
HashScanPosData
HashScanPosItem
HashSkewBucket
HashState
HashValueFunc
HbaLine
HeadlineJsonState
HeadlineParsedText
HeadlineWordEntry
HeapCheckContext
HeapPageFreeze
HeapScanDesc
HeapTuple
HeapTupleData
HeapTupleFields
HeapTupleForceOption
HeapTupleFreeze
HeapTupleHeader
HeapTupleHeaderData
HeapTupleTableSlot
HistControl
HotStandbyState
I32
ICU_Convert_Func
ID
INFIX
INT128
INTERFACE_INFO
|
pgstat: Infrastructure for more detailed IO statistics
This commit adds the infrastructure for more detailed IO statistics. The calls
to actually count IOs, a system view to access the new statistics,
documentation and tests will be added in subsequent commits, to make review
easier.
While we already had some IO statistics, e.g. in pg_stat_bgwriter and
pg_stat_database, they did not provide sufficient detail to understand what
the main sources of IO are, or whether configuration changes could avoid
IO. E.g., pg_stat_bgwriter.buffers_backend does contain the number of buffers
written out by a backend, but as that includes extending relations (always
done by backends) and writes triggered by the use of buffer access strategies,
it cannot easily be used to tune background writer or checkpointer. Similarly,
pg_stat_database.blks_read cannot easily be used to tune shared_buffers /
compute a cache hit ratio, as the use of buffer access strategies will often
prevent a large fraction of the read blocks to end up in shared_buffers.
The new IO statistics count IO operations (evict, extend, fsync, read, reuse,
and write), and are aggregated for each combination of backend type (backend,
autovacuum worker, bgwriter, etc), target object of the IO (relations, temp
relations) and context of the IO (normal, vacuum, bulkread, bulkwrite).
What is tracked in this series of patches, is sufficient to perform the
aforementioned analyses. Further details, e.g. tracking the number of buffer
hits, would make that even easier, but was left out for now, to keep the scope
of the already large patchset manageable.
Bumps PGSTAT_FILE_FORMAT_ID.
Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20200124195226.lth52iydq2n2uilq@alap3.anarazel.de
2023-02-09 05:53:42 +01:00
|
|
|
IOContext
|
2010-02-26 02:55:35 +01:00
|
|
|
IOFuncSelector
|
pgstat: Infrastructure for more detailed IO statistics
This commit adds the infrastructure for more detailed IO statistics. The calls
to actually count IOs, a system view to access the new statistics,
documentation and tests will be added in subsequent commits, to make review
easier.
While we already had some IO statistics, e.g. in pg_stat_bgwriter and
pg_stat_database, they did not provide sufficient detail to understand what
the main sources of IO are, or whether configuration changes could avoid
IO. E.g., pg_stat_bgwriter.buffers_backend does contain the number of buffers
written out by a backend, but as that includes extending relations (always
done by backends) and writes triggered by the use of buffer access strategies,
it cannot easily be used to tune background writer or checkpointer. Similarly,
pg_stat_database.blks_read cannot easily be used to tune shared_buffers /
compute a cache hit ratio, as the use of buffer access strategies will often
prevent a large fraction of the read blocks to end up in shared_buffers.
The new IO statistics count IO operations (evict, extend, fsync, read, reuse,
and write), and are aggregated for each combination of backend type (backend,
autovacuum worker, bgwriter, etc), target object of the IO (relations, temp
relations) and context of the IO (normal, vacuum, bulkread, bulkwrite).
What is tracked in this series of patches, is sufficient to perform the
aforementioned analyses. Further details, e.g. tracking the number of buffer
hits, would make that even easier, but was left out for now, to keep the scope
of the already large patchset manageable.
Bumps PGSTAT_FILE_FORMAT_ID.
Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20200124195226.lth52iydq2n2uilq@alap3.anarazel.de
2023-02-09 05:53:42 +01:00
|
|
|
IOObject
|
|
|
|
IOOp
|
2010-02-26 02:55:35 +01:00
|
|
|
IPCompareMethod
|
|
|
|
ITEM
|
|
|
|
IV
|
2013-05-29 22:58:43 +02:00
|
|
|
IdentLine
|
2010-02-26 02:55:35 +01:00
|
|
|
IdentifierLookup
|
2011-04-09 05:11:37 +02:00
|
|
|
IdentifySystemCmd
|
2017-05-17 21:52:16 +02:00
|
|
|
IfStackElem
|
2015-05-24 03:20:37 +02:00
|
|
|
ImportForeignSchemaStmt
|
|
|
|
ImportForeignSchemaType
|
|
|
|
ImportForeignSchema_function
|
|
|
|
ImportQual
|
2021-10-24 03:36:38 +02:00
|
|
|
InProgressEnt
|
2017-05-17 21:52:16 +02:00
|
|
|
IncludeWal
|
2015-05-24 03:20:37 +02:00
|
|
|
InclusionOpaque
|
2010-02-26 02:55:35 +01:00
|
|
|
IncrementVarSublevelsUp_context
|
|
|
|
IncrementalSort
|
2016-04-27 17:47:28 +02:00
|
|
|
IncrementalSortExecutionStatus
|
2010-02-26 02:55:35 +01:00
|
|
|
IncrementalSortGroupInfo
|
|
|
|
IncrementalSortInfo
|
|
|
|
IncrementalSortPath
|
|
|
|
IncrementalSortState
|
|
|
|
Index
|
2016-08-15 19:42:51 +02:00
|
|
|
IndexAMProperty
|
2016-04-27 17:47:28 +02:00
|
|
|
IndexAmRoutine
|
2010-02-26 02:55:35 +01:00
|
|
|
IndexArrayKeyInfo
|
2018-04-26 20:45:04 +02:00
|
|
|
IndexAttachInfo
IndexAttrBitmapKind
IndexBuildCallback
IndexBuildResult
IndexBulkDeleteCallback
IndexBulkDeleteResult
IndexClause
IndexClauseSet
IndexDeleteCounts
IndexDeletePrefetchState
IndexElem
IndexFetchHeapData
IndexFetchTableData
IndexInfo
IndexList
IndexOnlyScan
IndexOnlyScanState
IndexOptInfo
IndexOrderByDistance
IndexPath
IndexRuntimeKeyInfo
IndexScan
IndexScanDesc
IndexScanState
IndexStateFlagsAction
IndexStmt
IndexTuple
IndexTupleData
IndexUniqueCheck
IndexVacuumInfo
IndxInfo
InferClause
InferenceElem
InfoItem
InhInfo
InheritableSocket
InitSampleScan_function
InitializeDSMForeignScan_function
InitializeWorkerForeignScan_function
InlineCodeBlock
InsertStmt
Instrumentation
Int128AggState
Int8TransTypeData
IntRBTreeNode
Integer
IntegerSet
InternalDefaultACL
InternalGrant
Interval
IntoClause
InvalMessageArray
InvalidationMsgsGroup
IpcMemoryId
IpcMemoryKey
IpcMemoryState
IpcSemaphoreId
IpcSemaphoreKey
IsForeignPathAsyncCapable_function
IsForeignRelUpdatable_function
IsForeignScanParallelSafe_function
IsoConnInfo
IspellDict
Item
ItemId
ItemIdData
ItemPointer
ItemPointerData
IterateDirectModify_function
IterateForeignScan_function
IterateJsonStringValuesState
JEntry
JHashState
JOBOBJECTINFOCLASS
JOBOBJECT_BASIC_LIMIT_INFORMATION
JOBOBJECT_BASIC_UI_RESTRICTIONS
JOBOBJECT_SECURITY_LIMIT_INFORMATION
JitContext
JitInstrumentation
JitProviderCallbacks
JitProviderCompileExprCB
JitProviderInit
JitProviderReleaseContextCB
JitProviderResetAfterErrorCB
Join
JoinCostWorkspace
JoinExpr
JoinHashEntry
JoinPath
JoinPathExtraData
JoinState
JoinType
JsObject
JsValue
JsonAggConstructor
JsonAggState
JsonArgument
JsonArrayAgg
JsonArrayConstructor
JsonArrayQueryConstructor
JsonBaseObjectInfo
JsonBehavior
JsonBehaviorType
JsonCoercion
JsonCommon
JsonConstructorExpr
JsonConstructorType
JsonEncoding
JsonExpr
JsonExprOp
JsonFormat
JsonFormatType
JsonFunc
JsonFuncExpr
JsonHashEntry
JsonIsPredicate
JsonItemCoercions
JsonIterateStringValuesAction
JsonKeyValue
JsonLexContext
JsonLikeRegexContext
JsonManifestFileField
JsonManifestParseContext
JsonManifestParseState
JsonManifestSemanticState
JsonManifestWALRangeField
JsonObjectAgg
JsonObjectConstructor
JsonOutput
JsonParseContext
JsonParseErrorType
JsonParseExpr
JsonPath
JsonPathBool
JsonPathDatatypeStatus
JsonPathExecContext
JsonPathExecResult
JsonPathGinAddPathItemFunc
JsonPathGinContext
JsonPathGinExtractNodesFunc
JsonPathGinNode
JsonPathGinNodeType
JsonPathGinPath
JsonPathGinPathItem
JsonPathItem
JsonPathItemType
JsonPathKeyword
JsonPathMutableContext
JsonPathParseItem
JsonPathParseResult
JsonPathPredicateCallback
JsonPathString
JsonPathVarCallback
JsonPathVariableEvalContext
JsonQuotes
JsonReturning
JsonScalarExpr
JsonSemAction
JsonTokenType
JsonTransformStringValuesAction
JsonTypeCategory
JsonUniqueBuilderState
JsonUniqueCheckState
JsonUniqueHashEntry
JsonUniqueParsingState
JsonUniqueStackEntry
JsonValueExpr
JsonValueList
JsonValueListIterator
JsonValueType
JsonWrapper
Jsonb
JsonbAggState
JsonbContainer
JsonbInState
JsonbIterState
JsonbIterator
JsonbIteratorToken
JsonbPair
JsonbParseState
JsonbSubWorkspace
JsonbTypeCategory
JsonbValue
JumbleState
JunkFilter
KeyAction
KeyActions
KeyArray
KeySuffix
KeyWord
LARGE_INTEGER
LDAP
LDAPMessage
LDAPURLDesc
LDAP_TIMEVAL
LINE
LLVMAttributeRef
LLVMBasicBlockRef
LLVMBuilderRef
LLVMIntPredicate
LLVMJitContext
|
Support for optimizing and emitting code in LLVM JIT provider.
This commit introduces the ability to actually generate code using
LLVM. In particular, this adds:
- Ability to emit code both in heavily optimized and largely
unoptimized fashion
- Batching facility to allow functions to be defined in small
increments, but optimized and emitted in executable form in larger
batches (for performance and memory efficiency)
- Type and function declaration synchronization between runtime
generated code and normal postgres code. This is critical to be able
to access struct fields etc.
- Developer-oriented jit_dump_bitcode GUC for inspecting/debugging
the generated code.
- Per-JitContext statistics on the number of functions and the time
spent generating, optimizing, and emitting code. These will later be
employed for EXPLAIN support.
This commit doesn't yet contain any code actually generating
functions. That'll follow in later commits.
Documentation for GUCs added, and for JIT in general, will be added in
later commits.
Author: Andres Freund, with contributions by Pierre Ducroquet
Testing-By: Thomas Munro, Peter Eisentraut
Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de
2018-03-22 19:05:22 +01:00
|
|
|
LLVMJitHandle
|
2018-04-26 20:45:04 +02:00
|
|
|
LLVMMemoryBufferRef
|
|
|
|
LLVMModuleRef
|
|
|
|
LLVMOrcJITStackRef
|
|
|
|
LLVMOrcModuleHandle
|
|
|
|
LLVMOrcTargetAddress
|
|
|
|
LLVMPassManagerBuilderRef
|
|
|
|
LLVMPassManagerRef
|
|
|
|
LLVMSharedModuleRef
|
|
|
|
LLVMTargetMachineRef
|
|
|
|
LLVMTargetRef
|
2018-03-22 19:05:22 +01:00
|
|
|
LLVMTypeRef
|
|
|
|
LLVMValueRef
|
2010-02-26 02:55:35 +01:00
|
|
|
LOCALLOCK
|
|
|
|
LOCALLOCKOWNER
|
|
|
|
LOCALLOCKTAG
|
Implement genuine serializable isolation level.
Until now, our Serializable mode has in fact been what's called Snapshot
Isolation, which allows some anomalies that could not occur in any
serialized ordering of the transactions. This patch fixes that using a
method called Serializable Snapshot Isolation, based on research papers by
Michael J. Cahill (see README-SSI for full references). In Serializable
Snapshot Isolation, transactions run like they do in Snapshot Isolation,
but a predicate lock manager observes the reads and writes performed and
aborts transactions if it detects that an anomaly might occur. This method
produces some false positives, i.e. it sometimes aborts transactions even
though there is no anomaly.
To track reads we implement predicate locking, see storage/lmgr/predicate.c.
Whenever a tuple is read, a predicate lock is acquired on the tuple. Shared
memory is finite, so when a transaction takes many tuple-level locks on a
page, the locks are promoted to a single page-level lock, and further to a
single relation level lock if necessary. To lock key values with no matching
tuple, a sequential scan always takes a relation-level lock, and an index
scan acquires a page-level lock that covers the search key, whether or not
there are any matching keys at the moment.
A predicate lock doesn't conflict with any regular locks or with other
predicate locks in the normal sense. They're only used by the predicate lock
manager to detect the danger of anomalies. Only serializable transactions
participate in predicate locking, so there should be no extra overhead
for other transactions.
Predicate locks can't be released at commit, but must be remembered until
all the transactions that overlapped with it have completed. That means that
we need to remember an unbounded amount of predicate locks, so we apply a
lossy but conservative method of tracking locks for committed transactions.
If we run short of shared memory, we overflow to a new "pg_serial" SLRU
pool.
We don't currently allow Serializable transactions in Hot Standby mode.
That would be hard, because even read-only transactions can cause anomalies
that wouldn't otherwise occur.
Serializable isolation mode now means the new fully serializable level.
Repeatable Read gives you the old Snapshot Isolation level that we have
always had.
Kevin Grittner and Dan Ports, reviewed by Jeff Davis, Heikki Linnakangas and
Anssi Kääriäinen
2011-02-07 22:46:51 +01:00
|
|
|
LOCALPREDICATELOCK
|
2010-02-26 02:55:35 +01:00
|
|
|
LOCK
|
|
|
|
LOCKMASK
|
|
|
|
LOCKMETHODID
|
|
|
|
LOCKMODE
|
|
|
|
LOCKTAG
|
|
|
|
LONG
|
2018-04-26 20:45:04 +02:00
|
|
|
LONG_PTR
|
2011-04-09 05:11:37 +02:00
|
|
|
LOOP
|
2010-02-26 02:55:35 +01:00
|
|
|
LPBYTE
|
|
|
|
LPCTSTR
|
|
|
|
LPCWSTR
|
|
|
|
LPDWORD
|
2022-05-12 21:17:30 +02:00
|
|
|
LPFILETIME
|
2010-02-26 02:55:35 +01:00
|
|
|
LPSECURITY_ATTRIBUTES
|
|
|
|
LPSERVICE_STATUS
|
|
|
|
LPSTR
|
|
|
|
LPTHREAD_START_ROUTINE
|
|
|
|
LPTSTR
|
|
|
|
LPVOID
|
|
|
|
LPWSTR
|
|
|
|
LSEG
|
2018-04-26 20:45:04 +02:00
|
|
|
LUID
|
|
|
|
LVPagePruneState
|
2010-02-26 02:55:35 +01:00
|
|
|
LVRelState
|
2020-07-01 04:28:36 +02:00
|
|
|
LVSavedErrInfo
|
2010-02-26 02:55:35 +01:00
|
|
|
LWLock
|
2015-05-24 03:20:37 +02:00
|
|
|
LWLockHandle
|
2010-02-26 02:55:35 +01:00
|
|
|
LWLockMode
|
|
|
|
LWLockPadded
|
2023-02-23 21:19:19 +01:00
|
|
|
LZ4CompressorState
|
2010-02-26 02:55:35 +01:00
|
|
|
LZ4F_compressionContext_t
|
Rename backup_compression.{c,h} to compression.{c,h}
Compression option handling (level, algorithm or even workers) can be
used across several parts of the system and not only base backups.
Structures, objects and routines are renamed accordingly, to remove
the concept of base backups from this part of the code, making this
change straightforward.
pg_receivewal, that has gained support for LZ4 since babbbb5, will make
use of this infrastructure for its set of compression options, bringing
more consistency with pg_basebackup. This cleanup needs to be done
before releasing a beta of 15. pg_dump is a potential future target, as
well, and adding more compression options to it may happen in 16~.
Author: Michael Paquier
Reviewed-by: Robert Haas, Georgios Kokolatos
Discussion: https://postgr.es/m/YlPQGNAAa04raObK@paquier.xyz
2022-04-12 06:38:54 +02:00
|
|
|
LZ4F_decompressOptions_t
|
2010-02-26 02:55:35 +01:00
|
|
|
LZ4F_decompressionContext_t
|
2022-05-12 21:17:30 +02:00
|
|
|
LZ4F_errorCode_t
|
2010-02-26 02:55:35 +01:00
|
|
|
LZ4F_preferences_t
|
2023-02-23 21:19:19 +01:00
|
|
|
LZ4File
|
2011-04-09 05:11:37 +02:00
|
|
|
LabelProvider
|
2019-05-22 18:55:34 +02:00
|
|
|
LagTracker
|
2010-02-26 02:55:35 +01:00
|
|
|
LargeObjectDesc
|
Faster expression evaluation and targetlist projection.
This replaces the old recursive tree-walk-based evaluation with
non-recursive, opcode-dispatch-based expression evaluation.
Projection is now implemented as part of expression evaluation.
This both leads to significant performance improvements, and makes
future just-in-time compilation of expressions easier.
The speed gains primarily come from:
- non-recursive implementation reduces stack usage / overhead
- simple sub-expressions are implemented with a single jump, without
function calls
- sharing some state between different sub-expressions
- reduced amount of indirect/hard to predict memory accesses by laying
out operation metadata sequentially; including the avoidance of
nearly all of the previously used linked lists
- more code has been moved to expression initialization, avoiding
constant re-checks at evaluation time
Future just-in-time compilation (JIT) has become easier, as
demonstrated by released patches intended to be merged in a later
release, for primarily two reasons: Firstly, due to a stricter split
between expression initialization and evaluation, less code has to be
handled by the JIT. Secondly, due to the non-recursive nature of the
generated "instructions", less performance-critical code-paths can
easily be shared between interpreted and compiled evaluation.
The new framework allows for significant future optimizations. E.g.:
- basic infrastructure to later reduce the per-executor-startup
overhead of expression evaluation, by caching state in prepared
statements. That'd be helpful in OLTPish scenarios where
initialization overhead is measurable.
- optimizing the generated "code". A number of proposals for potential
work has already been made.
- optimizing the interpreter. Similarly a number of proposals have
been made here too.
The move of logic into the expression initialization step leads to some
backward-incompatible changes:
- Function permission checks are now done during expression
initialization, whereas previously they were done during
execution. In edge cases this can lead to errors being raised that
previously wouldn't have been, e.g. a NULL array being coerced to a
different array type previously didn't perform checks.
- The set of domain constraints to be checked, is now evaluated once
during expression initialization, previously it was re-built
every time a domain check was evaluated. For normal queries this
doesn't change much, but e.g. for plpgsql functions, which caches
ExprStates, the old set could stick around longer. The behavior
around this might still change.
Author: Andres Freund, with significant changes by Tom Lane,
changes by Heikki Linnakangas
Reviewed-By: Tom Lane, Heikki Linnakangas
Discussion: https://postgr.es/m/20161206034955.bh33paeralxbtluv@alap3.anarazel.de
2017-03-14 23:45:36 +01:00
|
|
|
LastAttnumInfo
|
2011-04-09 05:11:37 +02:00
|
|
|
Latch
|
Allow locking updated tuples in tuple_update() and tuple_delete()
Currently, in read committed transaction isolation mode (default), we have the
following sequence of actions when tuple_update()/tuple_delete() finds
the tuple updated by concurrent transaction.
1. Attempt to update/delete tuple with tuple_update()/tuple_delete(), which
returns TM_Updated.
2. Lock tuple with tuple_lock().
3. Re-evaluate plan qual (recheck if we still need to update/delete and
calculate the new tuple for update).
4. Second attempt to update/delete tuple with tuple_update()/tuple_delete().
This attempt should be successful, since the tuple was previously locked.
This patch eliminates step 2 by taking the lock during first
tuple_update()/tuple_delete() call. The heap table access method saves some
effort by checking the updated tuple once instead of twice. Future
undo-based table access methods, which will start from the latest row version,
can immediately place a lock there.
The code in nodeModifyTable.c is simplified by removing the nested switch/case.
Discussion: https://postgr.es/m/CAPpHfdua-YFw3XTprfutzGp28xXLigFtzNbuFY8yPhqeq6X5kg%40mail.gmail.com
Reviewed-by: Aleksander Alekseev, Pavel Borisov, Vignesh C, Mason Sharp
Reviewed-by: Andres Freund, Chris Travers
2023-03-22 22:13:37 +01:00
|
|
|
LazyTupleTableSlot
|
2014-05-06 15:08:14 +02:00
|
|
|
LerpFunc
|
2010-02-26 02:55:35 +01:00
|
|
|
LexDescr
|
|
|
|
LexemeEntry
|
|
|
|
LexemeHashKey
|
|
|
|
LexemeInfo
|
|
|
|
LexemeKey
|
|
|
|
LexizeData
|
2019-05-22 18:55:34 +02:00
|
|
|
LibraryInfo
|
2010-02-26 02:55:35 +01:00
|
|
|
Limit
|
2020-05-14 19:06:38 +02:00
|
|
|
LimitOption
|
2016-04-27 17:47:28 +02:00
|
|
|
LimitPath
|
2010-02-26 02:55:35 +01:00
|
|
|
LimitState
|
|
|
|
LimitStateCond
|
|
|
|
List
|
|
|
|
ListCell
|
|
|
|
ListDictionary
|
|
|
|
ListParsedLex
|
|
|
|
ListenAction
|
|
|
|
ListenActionKind
|
|
|
|
ListenStmt
|
|
|
|
LoadStmt
|
|
|
|
LocalBufferLookupEnt
|
2014-05-06 15:08:14 +02:00
|
|
|
LocalPgBackendStatus
|
2010-02-26 02:55:35 +01:00
|
|
|
LocalTransactionId
|
|
|
|
LocationIndex
|
2012-06-10 21:15:31 +02:00
|
|
|
LocationLen
|
2010-02-26 02:55:35 +01:00
|
|
|
LockAcquireResult
|
|
|
|
LockClauseStrength
|
|
|
|
LockData
|
|
|
|
LockInfoData
|
2011-11-14 18:12:23 +01:00
|
|
|
LockInstanceData
|
2010-02-26 02:55:35 +01:00
|
|
|
LockMethod
|
|
|
|
LockMethodData
|
|
|
|
LockRelId
|
|
|
|
LockRows
|
2016-04-27 17:47:28 +02:00
|
|
|
LockRowsPath
|
2010-02-26 02:55:35 +01:00
|
|
|
LockRowsState
|
|
|
|
LockStmt
|
|
|
|
LockTagType
|
|
|
|
LockTupleMode
|
2018-04-26 20:45:04 +02:00
|
|
|
LockViewRecurse_context
|
2015-05-24 03:20:37 +02:00
|
|
|
LockWaitPolicy
|
2010-02-26 02:55:35 +01:00
|
|
|
LockingClause
|
2011-04-09 05:11:37 +02:00
|
|
|
LogOpts
|
2010-02-26 02:55:35 +01:00
|
|
|
LogStmtLevel
|
Introduce logical decoding.
This feature, building on previous commits, allows the write-ahead log
stream to be decoded into a series of logical changes; that is,
inserts, updates, and deletes and the transactions which contain them.
It is capable of handling decoding even across changes to the schema
of the affected tables. The output format is controlled by a
so-called "output plugin"; an example is included. To make use of
this in a real replication system, the output plugin will need to be
modified to produce output in the format appropriate to that system,
and to perform filtering.
Currently, information can be extracted from the logical decoding
system only via SQL; future commits will add the ability to stream
changes via walsender.
Andres Freund, with review and other contributions from many other
people, including Álvaro Herrera, Abhijit Menon-Sen, Peter Geoghegan,
Kevin Grittner, Robert Haas, Heikki Linnakangas, Fujii Masao,
Michael Paquier, Simon Riggs, Craig Ringer, and Steve Singer.
2014-03-03 22:32:18 +01:00
|
|
|
LogicalDecodeBeginCB
|
2020-12-30 11:47:26 +01:00
|
|
|
LogicalDecodeBeginPrepareCB
|
2014-03-03 22:32:18 +01:00
|
|
|
LogicalDecodeChangeCB
|
|
|
|
LogicalDecodeCommitCB
|
2020-12-30 11:47:26 +01:00
|
|
|
LogicalDecodeCommitPreparedCB
|
2015-05-24 03:20:37 +02:00
|
|
|
LogicalDecodeFilterByOriginCB
|
2020-12-30 11:47:26 +01:00
|
|
|
LogicalDecodeFilterPrepareCB
|
2016-05-02 15:23:55 +02:00
|
|
|
LogicalDecodeMessageCB
|
2020-12-30 11:47:26 +01:00
|
|
|
LogicalDecodePrepareCB
|
|
|
|
LogicalDecodeRollbackPreparedCB
|
2014-05-06 15:08:14 +02:00
|
|
|
LogicalDecodeShutdownCB
|
2020-12-30 11:47:26 +01:00
|
|
|
LogicalDecodeStartupCB
|
|
|
|
LogicalDecodeStreamAbortCB
|
|
|
|
LogicalDecodeStreamChangeCB
|
|
|
|
LogicalDecodeStreamCommitCB
|
|
|
|
LogicalDecodeStreamMessageCB
|
|
|
|
LogicalDecodeStreamPrepareCB
|
2014-05-06 15:08:14 +02:00
|
|
|
LogicalDecodeStreamStartCB
|
|
|
|
LogicalDecodeStreamStopCB
|
2018-04-26 20:45:04 +02:00
|
|
|
LogicalDecodeStreamTruncateCB
|
|
|
|
LogicalDecodeTruncateCB
|
2014-03-03 22:32:18 +01:00
|
|
|
LogicalDecodingContext
|
2014-05-06 15:08:14 +02:00
|
|
|
LogicalErrorCallbackState
|
|
|
|
LogicalOutputPluginInit
|
2014-03-03 22:32:18 +01:00
|
|
|
LogicalOutputPluginWriterPrepareWrite
|
2017-05-17 21:52:16 +02:00
|
|
|
LogicalOutputPluginWriterUpdateProgress
|
2014-03-03 22:32:18 +01:00
|
|
|
LogicalOutputPluginWriterWrite
|
2017-05-17 21:52:16 +02:00
|
|
|
LogicalRepBeginData
|
|
|
|
LogicalRepCommitData
|
Add support for prepared transactions to built-in logical replication.
To add support for streaming transactions at prepare time into the
built-in logical replication, we need to do the following things:
* Modify the output plugin (pgoutput) to implement the new two-phase API
callbacks, by leveraging the extended replication protocol.
* Modify the replication apply worker, to properly handle two-phase
transactions by replaying them on prepare.
* Add a new SUBSCRIPTION option "two_phase" to allow users to enable
two-phase transactions. We enable the two_phase once the initial data sync
is over.
However, we must explicitly disable replication of two-phase transactions
during replication slot creation, even if the plugin supports it. We
don't need to replicate the changes accumulated during this phase,
and moreover, we don't have a replication connection open so we don't know
where to send the data anyway.
The streaming option is not allowed with this new two_phase option. This
can be done as a separate patch.
We don't allow toggling the two_phase option of a subscription because it can
lead to an inconsistent replica. For the same reason, we don't allow
refreshing the publication once two_phase is enabled for a subscription
unless the copy_data option is false.
Author: Peter Smith, Ajin Cherian and Amit Kapila based on previous work by Nikhil Sontakke and Stas Kelvich
Reviewed-by: Amit Kapila, Sawada Masahiko, Vignesh C, Dilip Kumar, Takamichi Osumi, Greg Nancarrow
Tested-By: Haiying Tang
Discussion: https://postgr.es/m/02DA5F5E-CECE-4D9C-8B4B-418077E2C010@postgrespro.ru
Discussion: https://postgr.es/m/CAA4eK1+opiV4aFTmWWUF9h_32=HfPOW9vZASHarT0UA5oBrtGw@mail.gmail.com
2021-07-14 04:03:50 +02:00
|
|
|
LogicalRepCommitPreparedTxnData
|
2017-05-17 21:52:16 +02:00
|
|
|
LogicalRepCtxStruct
|
2023-01-30 03:32:08 +01:00
|
|
|
LogicalRepMode
|
2017-05-17 21:52:16 +02:00
|
|
|
LogicalRepMsgType
|
|
|
|
LogicalRepPartMapEntry
|
2021-07-14 04:03:50 +02:00
|
|
|
LogicalRepPreparedTxnData
|
2017-05-17 21:52:16 +02:00
|
|
|
LogicalRepRelId
|
|
|
|
LogicalRepRelMapEntry
|
|
|
|
LogicalRepRelation
|
2021-07-14 04:03:50 +02:00
|
|
|
LogicalRepRollbackPreparedTxnData
|
Perform apply of large transactions by parallel workers.
Currently, for large transactions, the publisher sends the data in
multiple streams (changes divided into chunks depending upon
logical_decoding_work_mem), and then on the subscriber-side, the apply
worker writes the changes into temporary files and once it receives the
commit, it reads from those files and applies the entire transaction. To
improve the performance of such transactions, we can instead allow them to
be applied via parallel workers.
In this approach, we assign a new parallel apply worker (if available) as
soon as the xact's first stream is received and the leader apply worker
will send changes to this new worker via shared memory. The parallel apply
worker will directly apply the change instead of writing it to temporary
files. However, if the leader apply worker times out while attempting to
send a message to the parallel apply worker, it will switch to
"partial serialize" mode - in this mode, the leader serializes all
remaining changes to a file and notifies the parallel apply workers to
read and apply them at the end of the transaction. We use a non-blocking
way to send the messages from the leader apply worker to the parallel
apply worker to avoid deadlocks. We keep this parallel apply worker assigned
until the transaction commit is received and also wait for the worker to
finish at commit. This preserves commit ordering and avoids writing to and reading
from files in most cases. We still need to spill if there is no worker
available.
This patch also extends the SUBSCRIPTION 'streaming' parameter so that the
user can control whether to apply the streaming transaction in a parallel
apply worker or spill the change to disk. The user can set the streaming
parameter to 'on/off', or 'parallel'. The parameter value 'parallel' means
the streaming will be applied via a parallel apply worker, if available.
The parameter value 'on' means the streaming transaction will be spilled
to disk. The default value is 'off' (same as current behaviour).
In addition, the patch extends the logical replication STREAM_ABORT
message so that abort_lsn and abort_time can also be sent which can be
used to update the replication origin in parallel apply worker when the
streaming transaction is aborted. Because this message extension is needed
to support parallel streaming, parallel streaming is not supported for
publications on servers < PG16.
Author: Hou Zhijie, Wang wei, Amit Kapila with design inputs from Sawada Masahiko
Reviewed-by: Sawada Masahiko, Peter Smith, Dilip Kumar, Shi yu, Kuroda Hayato, Shveta Mallik
Discussion: https://postgr.es/m/CAA4eK1+wyN6zpaHUkCLorEWNx75MG0xhMwcFhvjqm2KURZEAGw@mail.gmail.com
2023-01-09 02:30:39 +01:00
|
|
|
LogicalRepStreamAbortData
|
2017-05-17 21:52:16 +02:00
|
|
|
LogicalRepTupleData
|
|
|
|
LogicalRepTyp
|
|
|
|
LogicalRepWorker
|
2014-05-06 15:08:14 +02:00
|
|
|
LogicalRewriteMappingData
|
2010-02-26 02:55:35 +01:00
|
|
|
LogicalTape
|
|
|
|
LogicalTapeSet
|
2022-04-07 09:28:40 +02:00
|
|
|
LsnReadQueue
|
|
|
|
LsnReadQueueNextFun
|
|
|
|
LsnReadQueueNextStatus
|
Implement operator class parameters
PostgreSQL provides a set of template index access methods, where opclasses have
much freedom in the semantics of indexing. These index AMs are GiST, GIN,
SP-GiST and BRIN. Their opclasses define the representation of keys, operations on
them and supported search strategies. So it's natural that opclasses may face
some tradeoffs, which require a user-side decision. This commit implements
opclass parameters allowing users to set some values, which tell the opclass how
to index the particular dataset.
This commit doesn't introduce new storage in the system catalog. Instead it uses
pg_attribute.attoptions, which is used for table column storage options but
unused for index attributes.
In order to avoid changing the signature of each opclass support function, we
implement a unified way to pass options to opclass support functions. Options
are set in fn_expr as a constant bytea expression. This is possible due to the
fact that opclass support functions are executed outside of expressions, so
fn_expr is unused for them.
This commit comes with some examples of opclass options usage. We parametrize
signature length in GiST. That applies to multiple opclasses: tsvector_ops,
gist__intbig_ops, gist_ltree_ops, gist__ltree_ops, gist_trgm_ops and
gist_hstore_ops. Also we parametrize maximum number of integer ranges for
gist__int_ops. However, the main future usage of this feature is expected
to be json, where users would be able to specify which way to index particular
json parts.
Catversion is bumped.
Discussion: https://postgr.es/m/d22c3a18-31c7-1879-fc11-4c1ce2f5e5af%40postgrespro.ru
Author: Nikita Glukhov, revised by me
Reviewed-by: Nikolay Shaplov, Robert Haas, Tom Lane, Tomas Vondra, Alvaro Herrera
2020-03-30 18:17:11 +02:00
|
|
|
LtreeGistOptions
|
|
|
|
LtreeSignature
|
2010-02-26 02:55:35 +01:00
|
|
|
MAGIC
|
|
|
|
MBuf
|
2019-05-22 18:55:34 +02:00
|
|
|
MCVItem
|
|
|
|
MCVList
|
2014-05-06 15:08:14 +02:00
|
|
|
MEMORY_BASIC_INFORMATION
|
2022-05-12 21:17:30 +02:00
|
|
|
MGVTBL
|
2011-04-09 05:11:37 +02:00
|
|
|
MINIDUMPWRITEDUMP
|
|
|
|
MINIDUMP_TYPE
|
2010-07-06 21:18:19 +02:00
|
|
|
MJEvalResult
|
2021-05-12 19:14:10 +02:00
|
|
|
MTTargetRelLookup
|
2017-05-17 21:52:16 +02:00
|
|
|
MVDependencies
|
|
|
|
MVDependency
|
|
|
|
MVNDistinct
|
|
|
|
MVNDistinctItem
|
2010-02-26 02:55:35 +01:00
|
|
|
Material
|
|
|
|
MaterialPath
|
|
|
|
MaterialState
|
|
|
|
MdfdVec
|
2021-07-14 02:43:58 +02:00
|
|
|
Memoize
|
|
|
|
MemoizeEntry
|
|
|
|
MemoizeInstrumentation
|
|
|
|
MemoizeKey
|
|
|
|
MemoizePath
|
|
|
|
MemoizeState
|
|
|
|
MemoizeTuple
|
Improve performance of and reduce overheads of memory management
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context that owns the chunk by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
2022-08-29 07:15:00 +02:00
|
|
|
MemoryChunk
|
2010-02-26 02:55:35 +01:00
|
|
|
MemoryContext
|
2015-05-24 03:20:37 +02:00
|
|
|
MemoryContextCallback
|
|
|
|
MemoryContextCallbackFunction
|
2016-04-27 17:47:28 +02:00
|
|
|
MemoryContextCounters
|
2010-02-26 02:55:35 +01:00
|
|
|
MemoryContextData
|
2022-08-29 07:15:00 +02:00
|
|
|
MemoryContextMethodID
|
2022-11-02 02:06:05 +01:00
|
|
|
MemoryContextMethods
|
2018-04-26 20:45:04 +02:00
|
|
|
MemoryStatsPrintFunc
|
2022-03-28 16:45:58 +02:00
|
|
|
MergeAction
|
|
|
|
MergeActionState
|
2011-04-09 05:11:37 +02:00
|
|
|
MergeAppend
|
|
|
|
MergeAppendPath
|
|
|
|
MergeAppendState
|
2010-02-26 02:55:35 +01:00
|
|
|
MergeJoin
|
|
|
|
MergeJoinClause
|
|
|
|
MergeJoinState
|
|
|
|
MergePath
|
|
|
|
MergeScanSelCache
|
2022-03-28 16:45:58 +02:00
|
|
|
MergeStmt
|
|
|
|
MergeWhenClause
|
2017-11-29 15:24:24 +01:00
|
|
|
MetaCommand
|
2010-02-26 02:55:35 +01:00
|
|
|
MinMaxAggInfo
|
2016-04-27 17:47:28 +02:00
|
|
|
MinMaxAggPath
|
2010-02-26 02:55:35 +01:00
|
|
|
MinMaxExpr
|
2021-05-12 19:14:10 +02:00
|
|
|
MinMaxMultiOptions
|
2010-02-26 02:55:35 +01:00
|
|
|
MinMaxOp
|
|
|
|
MinimalTuple
|
|
|
|
MinimalTupleData
|
|
|
|
MinimalTupleTableSlot
|
2015-05-24 03:20:37 +02:00
|
|
|
MinmaxMultiOpaque
|
|
|
|
MinmaxOpaque
|
2010-02-26 02:55:35 +01:00
|
|
|
ModifyTable
|
2022-03-17 11:47:04 +01:00
|
|
|
ModifyTableContext
|
2016-04-27 17:47:28 +02:00
|
|
|
ModifyTablePath
|
2010-02-26 02:55:35 +01:00
|
|
|
ModifyTableState
|
|
|
|
MonotonicFunction
|
2016-04-27 17:47:28 +02:00
|
|
|
MorphOpaque
|
2010-02-26 02:55:35 +01:00
|
|
|
MsgType
|
2015-05-24 03:20:37 +02:00
|
|
|
MultiAssignRef
|
2017-05-17 21:52:16 +02:00
|
|
|
MultiSortSupport
|
|
|
|
MultiSortSupportData
|
2010-02-26 02:55:35 +01:00
|
|
|
MultiXactId
|
2013-05-29 22:58:43 +02:00
|
|
|
MultiXactMember
|
2010-02-26 02:55:35 +01:00
|
|
|
MultiXactOffset
|
|
|
|
MultiXactStateData
|
|
|
|
MultiXactStatus
|
Multirange datatypes
Multiranges are basically sorted arrays of non-overlapping ranges with
set-theoretic operations defined over them.
Since v14, each range type automatically gets a corresponding multirange
datatype. There are both manual and automatic mechanisms for naming multirange
types. One can specify a multirange type name using the multirange_type_name
attribute in CREATE TYPE. Otherwise, a multirange type name is generated
automatically. If the range type name contains "range" then we change that to
"multirange". Otherwise, we add "_multirange" to the end.
Implementation of multiranges comes with a space-efficient internal
representation format, which evades extra paddings and duplicated storage of
oids. Altogether this format allows fetching a particular range by its index
in O(n).
Statistic gathering and selectivity estimation are implemented for multiranges.
For this purpose, stored multirange is approximated as union range without gaps.
This field will likely need improvements in the future.
Catversion is bumped.
Discussion: https://postgr.es/m/CALNJ-vSUpQ_Y%3DjXvTxt1VYFztaBSsWVXeF1y6gTYQ4bOiWDLgQ%40mail.gmail.com
Discussion: https://postgr.es/m/a0b8026459d1e6167933be2104a6174e7d40d0ab.camel%40j-davis.com#fe7218c83b08068bfffb0c5293eceda0
Author: Paul Jungwirth, revised by me
Reviewed-by: David Fetter, Corey Huinker, Jeff Davis, Pavel Stehule
Reviewed-by: Alvaro Herrera, Tom Lane, Isaac Morland, David G. Johnston
Reviewed-by: Zhihong Yu, Alexander Korotkov
2020-12-20 05:20:33 +01:00
|
|
|
MultirangeIOData
|
|
|
|
MultirangeParseState
|
|
|
|
MultirangeType
|
2010-02-26 02:55:35 +01:00
|
|
|
NDBOX
|
|
|
|
NODE
|
2021-05-12 19:14:10 +02:00
|
|
|
NTSTATUS
|
2010-02-26 02:55:35 +01:00
|
|
|
NUMCacheEntry
|
|
|
|
NUMDesc
|
|
|
|
NUMProc
|
|
|
|
NV
|
|
|
|
Name
|
|
|
|
NameData
|
2016-04-27 17:47:28 +02:00
|
|
|
NameHashEntry
|
2010-02-26 02:55:35 +01:00
|
|
|
NamedArgExpr
|
2016-04-27 17:47:28 +02:00
|
|
|
NamedLWLockTranche
|
|
|
|
NamedLWLockTrancheRequest
|
2017-05-17 21:52:16 +02:00
|
|
|
NamedTuplestoreScan
|
|
|
|
NamedTuplestoreScanState
|
2010-02-26 02:55:35 +01:00
|
|
|
NamespaceInfo
|
|
|
|
NestLoop
|
2011-04-09 05:11:37 +02:00
|
|
|
NestLoopParam
|
2010-02-26 02:55:35 +01:00
|
|
|
NestLoopState
|
|
|
|
NestPath
|
|
|
|
NewColumnValue
|
|
|
|
NewConstraint
|
2016-05-02 15:23:55 +02:00
|
|
|
NextSampleBlock_function
|
|
|
|
NextSampleTuple_function
|
2017-05-17 21:52:16 +02:00
|
|
|
NextValueExpr
|
2010-02-26 02:55:35 +01:00
|
|
|
Node
|
|
|
|
NodeTag
|
2012-06-10 21:15:31 +02:00
|
|
|
NonEmptyRange
|
2010-02-26 02:55:35 +01:00
|
|
|
Notification
|
|
|
|
NotificationHash
|
|
|
|
NotificationList
|
|
|
|
NotifyStmt
|
|
|
|
Nsrt
|
2022-05-12 21:17:30 +02:00
|
|
|
NtDllRoutine
|
2010-02-26 02:55:35 +01:00
|
|
|
NullIfExpr
|
|
|
|
NullTest
|
|
|
|
NullTestType
|
Change function call information to be variable length.
Before this change, FunctionCallInfoData, the struct that arguments etc. for
V1 function calls are stored in, always had space for
FUNC_MAX_ARGS/100 arguments, storing datums and their nullness in two
arrays. For nearly every function call 100 arguments is far more than
needed, therefore wasting memory. Arg and argnull being two separate
arrays also guarantees that to access a single argument, two
cachelines have to be touched.
Change the layout so there's a single variable-length array with pairs
of value / isnull. That drastically reduces memory consumption for
most function calls (on x86-64 a two argument function now uses
64bytes, previously 936 bytes), and makes it very likely that argument
value and its nullness are on the same cacheline.
Arguments are stored in a new NullableDatum struct, which, due to
padding, needs more memory per argument than before. But as usually
far fewer arguments are stored, and individual arguments are cheaper
to access, that's still a clear win. It's likely that there's other
places where conversion to NullableDatum arrays would make sense,
e.g. TupleTableSlots, but that's for another commit.
Because the function call information is now variable-length,
allocations have to take the number of arguments into account. For
heap allocations that can be done with SizeForFunctionCallInfoData(),
for on-stack allocations there's a new LOCAL_FCINFO(name, nargs) macro
that helps to allocate an appropriately sized and aligned variable.
Some places with stack allocation function call information don't know
the number of arguments at compile time, and currently variably sized
stack allocations aren't allowed in postgres. Therefore allow for
FUNC_MAX_ARGS space in these cases. They're not that common, so for
now that seems acceptable.
Because of the need to allocate FunctionCallInfo of the appropriate
size, older extensions may need to update their code. To avoid subtle
breakages, the FunctionCallInfoData struct has been renamed to
FunctionCallInfoBaseData. Most code only references FunctionCallInfo,
so that shouldn't cause much collateral damage.
This change is also a prerequisite for more efficient expression JIT
compilation (by allocating the function call information on the stack,
allowing LLVM to optimize it away); previously the size of the call
information caused problems inside LLVM's optimizer.
Author: Andres Freund
Reviewed-By: Tom Lane
Discussion: https://postgr.es/m/20180605172952.x34m5uz6ju6enaem@alap3.anarazel.de
2019-01-26 23:17:52 +01:00
|
|
|
NullableDatum
|
2010-02-26 02:55:35 +01:00
|
|
|
Numeric
|
2014-05-06 15:08:14 +02:00
|
|
|
NumericAggState
|
2010-02-26 02:55:35 +01:00
|
|
|
NumericDigit
|
2015-05-24 03:20:37 +02:00
|
|
|
NumericSortSupport
|
2016-12-13 16:51:32 +01:00
|
|
|
NumericSumAccum
|
2010-02-26 02:55:35 +01:00
|
|
|
NumericVar
|
2013-05-29 22:58:43 +02:00
|
|
|
OM_uint32
|
2010-02-26 02:55:35 +01:00
|
|
|
OP
|
2014-05-06 15:08:14 +02:00
|
|
|
OSAPerGroupState
|
|
|
|
OSAPerQueryState
|
2011-04-09 05:11:37 +02:00
|
|
|
OSInfo
|
2016-12-13 16:51:32 +01:00
|
|
|
OSSLCipher
|
2010-02-26 02:55:35 +01:00
|
|
|
OSSLDigest
|
|
|
|
OVERLAPPED
|
2012-06-10 21:15:31 +02:00
|
|
|
ObjectAccessDrop
|
2011-04-09 05:11:37 +02:00
|
|
|
ObjectAccessNamespaceSearch
|
|
|
|
ObjectAccessPostAlter
|
|
|
|
ObjectAccessPostCreate
|
|
|
|
ObjectAccessType
|
2010-02-26 02:55:35 +01:00
|
|
|
ObjectAddress
|
|
|
|
ObjectAddressAndFlags
|
|
|
|
ObjectAddressExtra
|
|
|
|
ObjectAddressStack
|
|
|
|
ObjectAddresses
|
|
|
|
ObjectClass
|
2011-11-14 18:12:23 +01:00
|
|
|
ObjectPropertyType
|
2010-02-26 02:55:35 +01:00
|
|
|
ObjectType
|
2016-12-28 18:00:00 +01:00
|
|
|
ObjectWithArgs
|
2010-02-26 02:55:35 +01:00
|
|
|
Offset
|
|
|
|
OffsetNumber
|
|
|
|
OffsetVarNodes_context
|
|
|
|
Oid
|
|
|
|
OidOptions
|
2013-05-29 22:58:43 +02:00
|
|
|
OkeysState
|
2016-04-11 22:43:52 +02:00
|
|
|
OldSnapshotControlData
|
|
|
|
OldSnapshotTimeMapping
|
2010-02-26 02:55:35 +01:00
|
|
|
OldToNewMapping
|
|
|
|
OldToNewMappingData
|
|
|
|
OnCommitAction
|
|
|
|
OnCommitItem
|
2015-05-24 03:20:37 +02:00
|
|
|
OnConflictAction
|
|
|
|
OnConflictClause
|
|
|
|
OnConflictExpr
|
2018-04-26 20:45:04 +02:00
|
|
|
OnConflictSetState
|
2011-11-14 18:12:23 +01:00
|
|
|
OpBtreeInterpretation
|
2010-02-26 02:55:35 +01:00
|
|
|
OpClassCacheEnt
|
|
|
|
OpExpr
|
|
|
|
OpFamilyMember
|
2016-04-27 17:47:28 +02:00
|
|
|
OpFamilyOpFuncGroup
|
2010-02-26 02:55:35 +01:00
|
|
|
OpclassInfo
|
|
|
|
Operator
|
2016-08-15 19:42:51 +02:00
|
|
|
OperatorElement
|
2010-02-26 02:55:35 +01:00
|
|
|
OpfamilyInfo
|
|
|
|
OprCacheEntry
|
|
|
|
OprCacheKey
|
|
|
|
OprInfo
|
|
|
|
OprProofCacheEntry
|
|
|
|
OprProofCacheKey
|
|
|
|
OutputContext
|
Introduce logical decoding.
This feature, building on previous commits, allows the write-ahead log
stream to be decoded into a series of logical changes; that is,
inserts, updates, and deletes and the transactions which contain them.
It is capable of handling decoding even across changes to the schema
of the affected tables. The output format is controlled by a
so-called "output plugin"; an example is included. To make use of
this in a real replication system, the output plugin will need to be
modified to produce output in the format appropriate to that system,
and to perform filtering.
Currently, information can be extracted from the logical decoding
system only via SQL; future commits will add the ability to stream
changes via walsender.
Andres Freund, with review and other contributions from many other
people, including Álvaro Herrera, Abhijit Menon-Sen, Peter Geoghegan,
Kevin Grittner, Robert Haas, Heikki Linnakangas, Fujii Masao, Michael
Paquier, Simon Riggs, Craig Ringer, and Steve Singer.
2014-03-03 22:32:18 +01:00
|
|
|
OutputPluginCallbacks
|
2014-05-06 15:08:14 +02:00
|
|
|
OutputPluginOptions
|
|
|
|
OutputPluginOutputType
|
2010-02-26 02:55:35 +01:00
|
|
|
OverrideSearchPath
|
|
|
|
OverrideStackEntry
|
2017-05-17 21:52:16 +02:00
|
|
|
OverridingKind
|
2010-02-26 02:55:35 +01:00
|
|
|
PACE_HEADER
|
|
|
|
PACL
|
|
|
|
PATH
|
|
|
|
PBOOL
|
2011-04-09 05:11:37 +02:00
|
|
|
PCtxtHandle
|
2012-06-10 21:15:31 +02:00
|
|
|
PERL_CONTEXT
|
2022-05-12 21:17:30 +02:00
|
|
|
PERL_SI
|
2010-02-26 02:55:35 +01:00
|
|
|
PFN
|
2019-05-22 18:55:34 +02:00
|
|
|
PGAlignedBlock
|
|
|
|
PGAlignedXLogBlock
|
2010-02-26 02:55:35 +01:00
|
|
|
PGAsyncStatusType
|
|
|
|
PGCALL2
|
Add options to enable and disable checksums in pg_checksums
An offline cluster can now work with more modes in pg_checksums:
- --enable enables checksums in a cluster, updating all blocks with a
correct checksum, and updating the control file at the end.
- --disable disables checksums in a cluster, updating only the control
file.
- --check is an extra option able to verify checksums for a cluster, and
the default used if no mode is specified.
When running --enable or --disable, the data folder gets fsync'd for
durability, and then it is followed by a control file update and flush
to keep the operation consistent should the tool be interrupted, killed
or the host unplugged. If no mode is specified in the options, then
--check is used for compatibility with older versions of pg_checksums
(named pg_verify_checksums in v11 where it was introduced).
Author: Michael Banck, Michael Paquier
Reviewed-by: Fabien Coelho, Magnus Hagander, Sergei Kornilov
Discussion: https://postgr.es/m/20181221201616.GD4974@nighthawk.caipicrew.dd-dns.de
2019-03-23 00:12:55 +01:00
|
|
|
PGChecksummablePage
|
2016-04-27 17:47:28 +02:00
|
|
|
PGContextVisibility
|
2010-02-26 02:55:35 +01:00
|
|
|
PGEvent
|
|
|
|
PGEventConnDestroy
|
|
|
|
PGEventConnReset
|
|
|
|
PGEventId
|
|
|
|
PGEventProc
|
|
|
|
PGEventRegister
|
|
|
|
PGEventResultCopy
|
|
|
|
PGEventResultCreate
|
|
|
|
PGEventResultDestroy
|
|
|
|
PGFInfoFunction
|
2020-09-07 08:11:46 +02:00
|
|
|
PGFileType
|
2010-02-26 02:55:35 +01:00
|
|
|
PGFunction
|
|
|
|
PGLZ_HistEntry
|
|
|
|
PGLZ_Strategy
|
|
|
|
PGMessageField
|
|
|
|
PGModuleMagicFunction
|
|
|
|
PGNoticeHooks
|
2017-05-17 21:52:16 +02:00
|
|
|
PGOutputData
|
Skip empty transactions for logical replication.
The current logical replication behavior is to send every transaction to the
subscriber even if the transaction is empty. This can happen because the
transaction doesn't contain changes from the selected publications or all
the changes got filtered. It is a waste of CPU cycles and network
bandwidth to build/transmit these empty transactions.
This patch addresses the above problem by postponing the BEGIN message
until the first change is sent. While processing a COMMIT message, if
there was no other change for that transaction, do not send the COMMIT
message. This allows us to skip sending BEGIN/COMMIT messages for empty
transactions.
When skipping empty transactions in synchronous replication mode, we send
a keepalive message to avoid delaying such transactions.
Author: Ajin Cherian, Hou Zhijie, Euler Taveira
Reviewed-by: Peter Smith, Takamichi Osumi, Shi Yu, Masahiko Sawada, Greg Nancarrow, Vignesh C, Amit Kapila
Discussion: https://postgr.es/m/CAMkU=1yohp9-dv48FLoSPrMqYEyyS5ZWkaZGD41RJr10xiNo_Q@mail.gmail.com
2022-03-30 04:11:05 +02:00
|
|
|
PGOutputTxnData
|
2010-02-26 02:55:35 +01:00
|
|
|
PGPROC
|
|
|
|
PGP_CFB
|
|
|
|
PGP_Context
|
|
|
|
PGP_MPI
|
|
|
|
PGP_PubKey
|
|
|
|
PGP_S2K
|
2011-04-09 05:11:37 +02:00
|
|
|
PGPing
|
2010-02-26 02:55:35 +01:00
|
|
|
PGQueryClass
|
|
|
|
PGRUsage
|
|
|
|
PGSemaphore
|
|
|
|
PGSemaphoreData
|
|
|
|
PGShmemHeader
|
2010-07-06 21:18:19 +02:00
|
|
|
PGTargetServerType
|
2021-05-12 19:14:10 +02:00
|
|
|
PGTernaryBool
|
2010-02-26 02:55:35 +01:00
|
|
|
PGTransactionStatusType
|
|
|
|
PGVerbosity
|
2011-06-09 20:01:49 +02:00
|
|
|
PG_Locale_Strategy
|
2010-02-26 02:55:35 +01:00
|
|
|
PG_Lock_Status
|
|
|
|
PG_init_t
|
|
|
|
PGcancel
|
2021-03-15 22:13:42 +01:00
|
|
|
PGcmdQueueEntry
|
2010-02-26 02:55:35 +01:00
|
|
|
PGconn
|
2012-06-10 21:15:31 +02:00
|
|
|
PGdataValue
|
2010-02-26 02:55:35 +01:00
|
|
|
PGlobjfuncs
|
|
|
|
PGnotify
|
2021-03-15 22:13:42 +01:00
|
|
|
PGpipelineStatus
|
2010-02-26 02:55:35 +01:00
|
|
|
PGresAttDesc
|
|
|
|
PGresAttValue
|
|
|
|
PGresParamDesc
|
|
|
|
PGresult
|
|
|
|
PGresult_data
|
|
|
|
PHANDLE
|
|
|
|
PLAINTREE
|
|
|
|
PLAssignStmt
|
|
|
|
PLUID_AND_ATTRIBUTES
|
|
|
|
PLcword
|
|
|
|
PLpgSQL_case_when
|
|
|
|
PLpgSQL_condition
|
|
|
|
PLpgSQL_datum
|
2016-12-13 16:51:32 +01:00
|
|
|
PLpgSQL_datum_type
|
2010-02-26 02:55:35 +01:00
|
|
|
PLpgSQL_diag_item
|
|
|
|
PLpgSQL_exception
|
|
|
|
PLpgSQL_exception_block
|
|
|
|
PLpgSQL_execstate
|
|
|
|
PLpgSQL_expr
|
|
|
|
PLpgSQL_func_hashkey
|
|
|
|
PLpgSQL_function
|
2016-12-13 16:51:32 +01:00
|
|
|
PLpgSQL_getdiag_kind
|
2011-11-14 18:12:23 +01:00
|
|
|
PLpgSQL_if_elsif
|
2016-12-13 16:51:32 +01:00
|
|
|
PLpgSQL_label_type
|
2010-02-26 02:55:35 +01:00
|
|
|
PLpgSQL_nsitem
|
2016-12-13 16:51:32 +01:00
|
|
|
PLpgSQL_nsitem_type
|
2010-02-26 02:55:35 +01:00
|
|
|
PLpgSQL_plugin
|
2018-04-26 20:45:04 +02:00
|
|
|
PLpgSQL_promise_type
|
2010-02-26 02:55:35 +01:00
|
|
|
PLpgSQL_raise_option
|
2016-12-13 16:51:32 +01:00
|
|
|
PLpgSQL_raise_option_type
|
2010-02-26 02:55:35 +01:00
|
|
|
PLpgSQL_rec
|
|
|
|
PLpgSQL_recfield
|
|
|
|
PLpgSQL_resolve_option
|
|
|
|
PLpgSQL_row
|
|
|
|
PLpgSQL_stmt
|
2015-05-24 03:20:37 +02:00
|
|
|
PLpgSQL_stmt_assert
|
2010-02-26 02:55:35 +01:00
|
|
|
PLpgSQL_stmt_assign
|
|
|
|
PLpgSQL_stmt_block
|
2018-04-26 20:45:04 +02:00
|
|
|
PLpgSQL_stmt_call
|
2010-02-26 02:55:35 +01:00
|
|
|
PLpgSQL_stmt_case
|
|
|
|
PLpgSQL_stmt_close
|
2018-04-26 20:45:04 +02:00
|
|
|
PLpgSQL_stmt_commit
|
2010-02-26 02:55:35 +01:00
|
|
|
PLpgSQL_stmt_dynexecute
|
|
|
|
PLpgSQL_stmt_dynfors
|
|
|
|
PLpgSQL_stmt_execsql
|
|
|
|
PLpgSQL_stmt_exit
|
|
|
|
PLpgSQL_stmt_fetch
|
|
|
|
PLpgSQL_stmt_forc
|
2011-04-09 05:11:37 +02:00
|
|
|
PLpgSQL_stmt_foreach_a
|
2010-02-26 02:55:35 +01:00
|
|
|
PLpgSQL_stmt_fori
|
|
|
|
PLpgSQL_stmt_forq
|
|
|
|
PLpgSQL_stmt_fors
|
|
|
|
PLpgSQL_stmt_getdiag
|
|
|
|
PLpgSQL_stmt_if
|
|
|
|
PLpgSQL_stmt_loop
|
|
|
|
PLpgSQL_stmt_open
|
|
|
|
PLpgSQL_stmt_perform
|
|
|
|
PLpgSQL_stmt_raise
|
|
|
|
PLpgSQL_stmt_return
|
|
|
|
PLpgSQL_stmt_return_next
|
|
|
|
PLpgSQL_stmt_return_query
|
2018-04-26 20:45:04 +02:00
|
|
|
PLpgSQL_stmt_rollback
|
2016-12-13 16:51:32 +01:00
|
|
|
PLpgSQL_stmt_type
|
2010-02-26 02:55:35 +01:00
|
|
|
PLpgSQL_stmt_while
|
|
|
|
PLpgSQL_trigtype
|
|
|
|
PLpgSQL_type
|
2016-12-13 16:51:32 +01:00
|
|
|
PLpgSQL_type_type
|
2010-02-26 02:55:35 +01:00
|
|
|
PLpgSQL_var
|
|
|
|
PLpgSQL_variable
|
|
|
|
PLwdatum
|
|
|
|
PLword
|
2017-11-29 15:24:24 +01:00
|
|
|
PLyArrayToOb
|
2012-06-10 21:15:31 +02:00
|
|
|
PLyCursorObject
|
2010-02-26 02:55:35 +01:00
|
|
|
PLyDatumToOb
|
|
|
|
PLyDatumToObFunc
|
2011-04-09 05:11:37 +02:00
|
|
|
PLyExceptionEntry
|
2012-06-10 21:15:31 +02:00
|
|
|
PLyExecutionContext
|
2017-11-29 15:24:24 +01:00
|
|
|
PLyObToArray
|
2010-02-26 02:55:35 +01:00
|
|
|
PLyObToDatum
|
|
|
|
PLyObToDatumFunc
|
2017-11-29 15:24:24 +01:00
|
|
|
PLyObToDomain
|
|
|
|
PLyObToScalar
|
|
|
|
PLyObToTransform
|
2010-02-26 02:55:35 +01:00
|
|
|
PLyObToTuple
|
2016-12-13 16:51:32 +01:00
|
|
|
PLyObject_AsString_t
|
2010-02-26 02:55:35 +01:00
|
|
|
PLyPlanObject
|
|
|
|
PLyProcedure
|
2011-04-09 05:11:37 +02:00
|
|
|
PLyProcedureEntry
|
|
|
|
PLyProcedureKey
|
2010-02-26 02:55:35 +01:00
|
|
|
PLyResultObject
|
2016-04-27 17:47:28 +02:00
|
|
|
PLySRFState
|
|
|
|
PLySavedArgs
|
2017-11-29 15:24:24 +01:00
|
|
|
PLyScalarToOb
|
2011-04-09 05:11:37 +02:00
|
|
|
PLySubtransactionData
|
|
|
|
PLySubtransactionObject
|
2017-11-29 15:24:24 +01:00
|
|
|
PLyTransformToOb
|
2010-02-26 02:55:35 +01:00
|
|
|
PLyTupleToOb
|
2016-12-13 16:51:32 +01:00
|
|
|
PLyUnicode_FromStringAndSize_t
|
2018-04-26 20:45:04 +02:00
|
|
|
PLy_elog_impl_t
|
2011-04-09 05:11:37 +02:00
|
|
|
PMINIDUMP_CALLBACK_INFORMATION
|
|
|
|
PMINIDUMP_EXCEPTION_INFORMATION
|
|
|
|
PMINIDUMP_USER_STREAM_INFORMATION
|
2010-02-26 02:55:35 +01:00
|
|
|
PMSignalData
|
|
|
|
PMSignalReason
|
|
|
|
PMState
|
|
|
|
POLYGON
|
|
|
|
PQArgBlock
|
|
|
|
PQEnvironmentOption
|
|
|
|
PQExpBuffer
|
|
|
|
PQExpBufferData
|
2015-05-24 03:20:37 +02:00
|
|
|
PQcommMethods
|
2010-02-26 02:55:35 +01:00
|
|
|
PQconninfoOption
|
|
|
|
PQnoticeProcessor
|
|
|
|
PQnoticeReceiver
|
|
|
|
PQprintOpt
|
|
|
|
PQsslKeyPassHook_OpenSSL_type
|
Implement genuine serializable isolation level.
Until now, our Serializable mode has in fact been what's called Snapshot
Isolation, which allows some anomalies that could not occur in any
serialized ordering of the transactions. This patch fixes that using a
method called Serializable Snapshot Isolation, based on research papers by
Michael J. Cahill (see README-SSI for full references). In Serializable
Snapshot Isolation, transactions run like they do in Snapshot Isolation,
but a predicate lock manager observes the reads and writes performed and
aborts transactions if it detects that an anomaly might occur. This method
produces some false positives, ie. it sometimes aborts transactions even
though there is no anomaly.
To track reads we implement predicate locking, see storage/lmgr/predicate.c.
Whenever a tuple is read, a predicate lock is acquired on the tuple. Shared
memory is finite, so when a transaction takes many tuple-level locks on a
page, the locks are promoted to a single page-level lock, and further to a
single relation level lock if necessary. To lock key values with no matching
tuple, a sequential scan always takes a relation-level lock, and an index
scan acquires a page-level lock that covers the search key, whether or not
there are any matching keys at the moment.
A predicate lock doesn't conflict with any regular locks or with other
predicate locks in the normal sense. They're only used by the predicate lock
manager to detect the danger of anomalies. Only serializable transactions
participate in predicate locking, so there should be no extra overhead
for other transactions.
Predicate locks can't be released at commit, but must be remembered until
all the transactions that overlapped with it have completed. That means that
we need to remember an unbounded amount of predicate locks, so we apply a
lossy but conservative method of tracking locks for committed transactions.
If we run short of shared memory, we overflow to a new "pg_serial" SLRU
pool.
We don't currently allow Serializable transactions in Hot Standby mode.
That would be hard, because even read-only transactions can cause anomalies
that wouldn't otherwise occur.
Serializable isolation mode now means the new fully serializable level.
Repeatable Read gives you the old Snapshot Isolation level that we have
always had.
Kevin Grittner and Dan Ports, reviewed by Jeff Davis, Heikki Linnakangas and
Anssi Kääriäinen
2011-02-07 22:46:51 +01:00
|
|
|
PREDICATELOCK
|
|
|
|
PREDICATELOCKTAG
|
|
|
|
PREDICATELOCKTARGET
|
|
|
|
PREDICATELOCKTARGETTAG
|
2010-02-26 02:55:35 +01:00
|
|
|
PROCESS_INFORMATION
|
|
|
|
PROCLOCK
|
|
|
|
PROCLOCKTAG
|
|
|
|
PROC_HDR
|
|
|
|
PSID
|
|
|
|
PSID_AND_ATTRIBUTES
|
2015-05-24 03:20:37 +02:00
|
|
|
PSQL_COMP_CASE
|
2010-02-26 02:55:35 +01:00
|
|
|
PSQL_ECHO
|
|
|
|
PSQL_ECHO_HIDDEN
|
|
|
|
PSQL_ERROR_ROLLBACK
|
2017-05-17 21:52:16 +02:00
|
|
|
PTEntryArray
|
|
|
|
PTIterationArray
|
2018-04-26 20:45:04 +02:00
|
|
|
PTOKEN_PRIVILEGES
|
2010-02-26 02:55:35 +01:00
|
|
|
PTOKEN_USER
|
2022-05-12 21:17:30 +02:00
|
|
|
PULONG
|
2016-12-13 16:51:32 +01:00
|
|
|
PUTENVPROC
|
2021-12-23 07:12:52 +01:00
|
|
|
PVIndStats
|
|
|
|
PVIndVacStatus
|
2010-02-26 02:55:35 +01:00
|
|
|
PVOID
|
2021-12-23 07:12:52 +01:00
|
|
|
PVShared
|
2010-02-26 02:55:35 +01:00
|
|
|
PX_Alias
|
|
|
|
PX_Cipher
|
|
|
|
PX_Combo
|
|
|
|
PX_HMAC
|
|
|
|
PX_MD
|
|
|
|
Page
|
2016-04-27 17:47:28 +02:00
|
|
|
PageData
|
2013-05-29 22:58:43 +02:00
|
|
|
PageGistNSN
|
2010-02-26 02:55:35 +01:00
|
|
|
PageHeader
|
|
|
|
PageHeaderData
|
|
|
|
PageXLogRecPtr
|
|
|
|
PagetableEntry
|
|
|
|
Pairs
|
2018-04-26 20:45:04 +02:00
|
|
|
ParallelAppendState
|
Perform apply of large transactions by parallel workers.
Currently, for large transactions, the publisher sends the data in
multiple streams (changes divided into chunks depending upon
logical_decoding_work_mem), and then on the subscriber-side, the apply
worker writes the changes into temporary files and once it receives the
commit, it reads from those files and applies the entire transaction. To
improve the performance of such transactions, we can instead allow them to
be applied via parallel workers.
In this approach, we assign a new parallel apply worker (if available) as
soon as the xact's first stream is received and the leader apply worker
will send changes to this new worker via shared memory. The parallel apply
worker will directly apply the change instead of writing it to temporary
files. However, if the leader apply worker times out while attempting to
send a message to the parallel apply worker, it will switch to
"partial serialize" mode - in this mode, the leader serializes all
remaining changes to a file and notifies the parallel apply workers to
read and apply them at the end of the transaction. We use a non-blocking
way to send the messages from the leader apply worker to the parallel
apply worker to avoid deadlocks. We keep this parallel apply worker
assigned till the transaction commit is received and also wait for the
worker to finish at commit. This preserves commit ordering and avoids
writing to and reading
from files in most cases. We still need to spill if there is no worker
available.
This patch also extends the SUBSCRIPTION 'streaming' parameter so that the
user can control whether to apply the streaming transaction in a parallel
apply worker or spill the change to disk. The user can set the streaming
parameter to 'on/off', or 'parallel'. The parameter value 'parallel' means
the streaming will be applied via a parallel apply worker, if available.
The parameter value 'on' means the streaming transaction will be spilled
to disk. The default value is 'off' (same as current behaviour).
In addition, the patch extends the logical replication STREAM_ABORT
message so that abort_lsn and abort_time can also be sent which can be
used to update the replication origin in parallel apply worker when the
streaming transaction is aborted. Because this message extension is needed
to support parallel streaming, parallel streaming is not supported for
publications on servers < PG16.
Author: Hou Zhijie, Wang wei, Amit Kapila with design inputs from Sawada Masahiko
Reviewed-by: Sawada Masahiko, Peter Smith, Dilip Kumar, Shi yu, Kuroda Hayato, Shveta Mallik
Discussion: https://postgr.es/m/CAA4eK1+wyN6zpaHUkCLorEWNx75MG0xhMwcFhvjqm2KURZEAGw@mail.gmail.com
2023-01-09 02:30:39 +01:00
|
|
|
ParallelApplyWorkerEntry
|
|
|
|
ParallelApplyWorkerInfo
|
|
|
|
ParallelApplyWorkerShared
|
2017-05-17 21:52:16 +02:00
|
|
|
ParallelBitmapHeapState
|
tableam: Add and use scan APIs.
To allow table accesses to be not directly dependent on heap, several
new abstractions are needed. Specifically:
1) Heap scans need to be generalized into table scans. Do this by
introducing TableScanDesc, which will be the "base class" for
individual AMs. This contains the AM independent fields from
HeapScanDesc.
The previous heap_{beginscan,rescan,endscan} et al. have been
replaced with a table_ version.
There's no direct replacement for heap_getnext(), as that returned
a HeapTuple, which is undesirable for other AMs. Instead there's
table_scan_getnextslot(). But note that heap_getnext() lives on,
it's still used widely to access catalog tables.
This is achieved by new scan_begin, scan_end, scan_rescan,
scan_getnextslot callbacks.
2) The portion of parallel scans that's shared between backends needs
to be managed without the user doing per-AM work. To achieve
that new parallelscan_{estimate, initialize, reinitialize}
callbacks are introduced, which operate on a new
ParallelTableScanDesc, which again can be subclassed by AMs.
As it is likely that several AMs are going to be block oriented,
block oriented callbacks that can be shared between such AMs are
provided and used by heap. table_block_parallelscan_{estimate,
initialize, reinitialize} as callbacks, and
table_block_parallelscan_{nextpage, init} for use in AMs. These
operate on a ParallelBlockTableScanDesc.
3) Index scans need to be able to access tables to return a tuple, and
there needs to be state across individual accesses to the heap to
store state like buffers. That's now handled by introducing a
sort-of-scan IndexFetchTable, which again is intended to be
subclassed by individual AMs (for heap IndexFetchHeap).
The relevant callbacks for an AM are index_fetch_{end, begin,
reset} to create the necessary state, and index_fetch_tuple to
retrieve an indexed tuple. Note that index_fetch_tuple
implementations need to be smarter than just blindly fetching the
tuples for AMs that have optimizations similar to heap's HOT - the
currently alive tuple in the update chain needs to be fetched if
appropriate.
Similar to table_scan_getnextslot(), it's undesirable to continue
to return HeapTuples. Thus index_fetch_heap (might want to rename
that later) now accepts a slot as an argument. Core code doesn't
have a lot of call sites performing index scans without going
through the systable_* API (in contrast to loads of heap_getnext
calls and working directly with HeapTuples).
Index scans now store the result of a search in
IndexScanDesc->xs_heaptid, rather than xs_ctup->t_self. As the
target is not generally a HeapTuple anymore that seems cleaner.
To be able to sensibly adapt code to use the above, two further
callbacks have been introduced:
a) slot_callbacks returns a TupleTableSlotOps* suitable for creating
slots capable of holding a tuple of the AM's
type. table_slot_callbacks() and table_slot_create() are based
upon that, but have additional logic to deal with views, foreign
tables, etc.
While this change could have been done separately, nearly all the
call sites that needed to be adapted for the rest of this commit
also would have been needed to be adapted for
table_slot_callbacks(), making separation not worthwhile.
b) tuple_satisfies_snapshot checks whether the tuple in a slot is
currently visible according to a snapshot. That's required as a few
places now don't have a buffer + HeapTuple around, but a
slot (which in heap's case internally has that information).
Additionally a few infrastructure changes were needed:
I) SysScanDesc, as used by systable_{beginscan, getnext} et al. now
internally uses a slot to keep track of tuples. While
systable_getnext() still returns HeapTuples, and will so for the
foreseeable future, the index API (see 1) above) now only deals with
slots.
The remainder, and largest part, of this commit is then adjusting all
scans in postgres to use the new APIs.
Author: Andres Freund, Haribabu Kommi, Alvaro Herrera
Discussion:
https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
https://postgr.es/m/20160812231527.GA690404@alvherre.pgsql
2019-03-11 20:46:41 +01:00
|
|
|
ParallelBlockTableScanDesc
|
|
|
|
ParallelBlockTableScanWorker
|
|
|
|
ParallelBlockTableScanWorkerData
|
2016-12-13 16:51:32 +01:00
|
|
|
ParallelCompletionPtr
|
2015-05-24 03:20:37 +02:00
|
|
|
ParallelContext
|
2016-04-27 17:47:28 +02:00
|
|
|
ParallelExecutorInfo
|
Add parallel-aware hash joins.
Introduce parallel-aware hash joins that appear in EXPLAIN plans as Parallel
Hash Join with Parallel Hash. While hash joins could already appear in
parallel queries, they were previously always parallel-oblivious and had a
partial subplan only on the outer side, meaning that the work of the inner
subplan was duplicated in every worker.
After this commit, the planner will consider using a partial subplan on the
inner side too, using the Parallel Hash node to divide the work over the
available CPU cores and combine its results in shared memory. If the join
needs to be split into multiple batches in order to respect work_mem, then
workers process different batches as much as possible and then work together
on the remaining batches.
The advantages of a parallel-aware hash join over a parallel-oblivious hash
join used in a parallel query are that it:
* avoids wasting memory on duplicated hash tables
* avoids wasting disk space on duplicated batch files
* divides the work of building the hash table over the CPUs
One disadvantage is that there is some communication between the participating
CPUs which might outweigh the benefits of parallelism in the case of small
hash tables. This is avoided by the planner's existing reluctance to supply
partial plans for small scans, but it may be necessary to estimate
synchronization costs in future if that situation changes. Another is that
outer batch 0 must be written to disk if multiple batches are required.
A potential future advantage of parallel-aware hash joins is that right and
full outer joins could be supported, since there is a single set of matched
bits for each hashtable, but that is not yet implemented.
A new GUC enable_parallel_hash is defined to control the feature, defaulting
to on.
Author: Thomas Munro
Reviewed-By: Andres Freund, Robert Haas
Tested-By: Rafia Sabih, Prabhat Sahu
Discussion:
https://postgr.es/m/CAEepm=2W=cOkiZxcg6qiFQP-dHUe09aqTrEMM7yJDrHMhDv_RA@mail.gmail.com
https://postgr.es/m/CAEepm=37HKyJ4U6XOLi=JgfSHM3o6B-GaeO-6hkOmneTDkH+Uw@mail.gmail.com
2017-12-21 08:39:21 +01:00
|
|
|
ParallelHashGrowth
|
|
|
|
ParallelHashJoinBatch
|
|
|
|
ParallelHashJoinBatchAccessor
|
|
|
|
ParallelHashJoinState
|
2017-05-17 21:52:16 +02:00
|
|
|
ParallelIndexScanDesc
|
|
|
|
ParallelReadyList
|
2010-02-26 02:55:35 +01:00
|
|
|
ParallelSlot
|
Refactor and generalize the ParallelSlot machinery.
Create a wrapper object, ParallelSlotArray, to encapsulate the
number of slots and the slot array itself, plus some other relevant
bits of information. This reduces the number of parameters we have
to pass around all over the place.
Allow for a ParallelSlotArray to contain slots connected to
different databases within a single cluster. The current clients
of this mechanism don't need this, but it is expected to be used
by future patches.
Defer connecting to databases until we actually need the connection
for something. This is a slight behavior change for vacuumdb and
reindexdb. If you specify a number of jobs that is larger than the
number of objects, the extra connections will now not be used.
But, on the other hand, if you specify a number of jobs that is
so large that it's going to fail, the failure would previously have
happened before any operations were actually started, and now it
won't.
Mark Dilger, reviewed by me.
Discussion: http://postgr.es/m/12ED3DA8-25F0-4B68-937D-D907CFBF08E7@enterprisedb.com
Discussion: http://postgr.es/m/BA592F2D-F928-46FF-9516-2B827F067F57@enterprisedb.com
2021-03-11 19:17:46 +01:00
|
|
|
ParallelSlotArray
|
2019-03-11 20:46:41 +01:00
|
|
|
ParallelSlotResultHandler
|
2012-06-10 21:15:31 +02:00
|
|
|
ParallelState
|
2019-03-11 20:46:41 +01:00
|
|
|
ParallelTableScanDesc
|
|
|
|
ParallelTableScanDescData
|
2023-01-09 02:30:39 +01:00
|
|
|
ParallelTransState
|
2021-12-23 07:12:52 +01:00
|
|
|
ParallelVacuumState
|
2017-11-17 02:28:11 +01:00
|
|
|
ParallelWorkerContext
|
2015-05-24 03:20:37 +02:00
|
|
|
ParallelWorkerInfo
|
2010-02-26 02:55:35 +01:00
|
|
|
Param
|
2018-04-26 20:45:04 +02:00
|
|
|
ParamCompileHook
|
2010-02-26 02:55:35 +01:00
|
|
|
ParamExecData
|
|
|
|
ParamExternData
|
|
|
|
ParamFetchHook
|
|
|
|
ParamKind
|
|
|
|
ParamListInfo
|
2012-06-10 21:15:31 +02:00
|
|
|
ParamPathInfo
|
2010-02-26 02:55:35 +01:00
|
|
|
ParamRef
|
|
|
|
ParamsErrorCbData
|
2012-06-10 21:15:31 +02:00
|
|
|
ParentMapEntry
|
2010-02-26 02:55:35 +01:00
|
|
|
ParseCallbackState
|
2013-05-29 22:58:43 +02:00
|
|
|
ParseExprKind
|
2010-02-26 02:55:35 +01:00
|
|
|
ParseNamespaceColumn
|
|
|
|
ParseNamespaceItem
|
|
|
|
ParseParamRefHook
|
|
|
|
ParseState
|
|
|
|
ParsedLex
|
2016-04-27 17:47:28 +02:00
|
|
|
ParsedScript
|
2010-02-26 02:55:35 +01:00
|
|
|
ParsedText
|
|
|
|
ParsedWord
|
|
|
|
ParserSetupHook
|
|
|
|
ParserState
|
2018-04-26 20:45:04 +02:00
|
|
|
PartClauseInfo
|
|
|
|
PartClauseMatchStatus
|
|
|
|
PartClauseTarget
|
2023-01-09 02:30:39 +01:00
|
|
|
PartialFileSetState
|
2016-12-13 16:51:32 +01:00
|
|
|
PartitionBoundInfo
|
Implement table partitioning.
Table partitioning is like table inheritance and reuses much of the
existing infrastructure, but there are some important differences.
The parent is called a partitioned table and is always empty; it may
not have indexes or non-inherited constraints, since those make no
sense for a relation with no data of its own. The children are called
partitions and contain all of the actual data. Each partition has an
implicit partitioning constraint. Multiple inheritance is not
allowed, and partitioning and inheritance can't be mixed. Partitions
can't have extra columns and may not allow nulls unless the parent
does. Tuples inserted into the parent are automatically routed to the
correct partition, so tuple-routing ON INSERT triggers are not needed.
Tuple routing isn't yet supported for partitions which are foreign
tables, and it doesn't handle updates that cross partition boundaries.
Currently, tables can be range-partitioned or list-partitioned. List
partitioning is limited to a single column, but range partitioning can
involve multiple columns. A partitioning "column" can be an
expression.
Because table partitioning is less general than table inheritance, it
is hoped that it will be easier to reason about properties of
partitions, and therefore that this will serve as a better foundation
for a variety of possible optimizations, including query planner
optimizations. The tuple routing which this patch does based on
the implicit partitioning constraints is an example of this, but it
seems likely that many other useful optimizations are also possible.
Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat,
Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova,
Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
|
|
|
PartitionBoundInfoData
|
|
|
|
PartitionBoundSpec
|
|
|
|
PartitionCmd
|
2016-12-13 16:51:32 +01:00
|
|
|
PartitionDesc
|
|
|
|
PartitionDescData
|
|
|
|
PartitionDirectory
|
2010-02-26 02:55:35 +01:00
|
|
|
PartitionDirectoryEntry
|
2016-12-13 16:51:32 +01:00
|
|
|
PartitionDispatch
|
|
|
|
PartitionElem
|
Add hash partitioning.
Hash partitioning is useful when you want to partition a growing data
set evenly. This can be useful to keep table sizes reasonable, which
makes maintenance operations such as VACUUM faster, or to enable
partition-wise join.
At present, we still depend on constraint exclusion for partitioning
pruning, and the shape of the partition constraints for hash
partitioning is such that that doesn't work. Work is underway to fix
that, which should both improve performance and make partitioning
pruning work with hash partitioning.
Amul Sul, reviewed and tested by Dilip Kumar, Ashutosh Bapat, Yugo
Nagata, Rajkumar Raghuwanshi, Jesper Pedersen, and by me. A few
final tweaks also by me.
Discussion: http://postgr.es/m/CAAJ_b96fhpJAP=ALbETmeLk1Uni_GFZD938zgenhF49qgDTjaQ@mail.gmail.com
2017-11-10 00:07:25 +01:00
|
|
|
PartitionHashBound
|
2016-12-13 16:51:32 +01:00
|
|
|
PartitionKey
|
2016-12-07 19:17:43 +01:00
|
|
|
PartitionListValue
|
2018-04-26 20:45:04 +02:00
|
|
|
PartitionMap
|
|
|
|
PartitionPruneCombineOp
|
|
|
|
PartitionPruneContext
|
|
|
|
PartitionPruneInfo
|
|
|
|
PartitionPruneState
|
|
|
|
PartitionPruneStep
|
|
|
|
PartitionPruneStepCombine
|
|
|
|
PartitionPruneStepOp
|
|
|
|
PartitionPruningData
|
Implement table partitioning.
2016-12-07 19:17:43 +01:00
|
|
|
PartitionRangeBound
|
2016-12-13 16:51:32 +01:00
|
|
|
PartitionRangeDatum
|
Use MINVALUE/MAXVALUE instead of UNBOUNDED for range partition bounds.
Previously, UNBOUNDED meant no lower bound when used in the FROM list,
and no upper bound when used in the TO list, which was OK for
single-column range partitioning, but problematic with multiple
columns. For example, an upper bound of (10.0, UNBOUNDED) would not be
collocated with a lower bound of (10.0, UNBOUNDED), thus making it
difficult or impossible to define contiguous multi-column range
partitions in some cases.
Fix this by using MINVALUE and MAXVALUE instead of UNBOUNDED to
represent a partition column that is unbounded below or above
respectively. This syntax removes any ambiguity, and ensures that if
one partition's lower bound equals another partition's upper bound,
then the partitions are contiguous.
Also drop the constraint prohibiting finite values after an unbounded
column, and just document the fact that any values after MINVALUE or
MAXVALUE are ignored. Previously it was necessary to repeat UNBOUNDED
multiple times, which was needlessly verbose.
Note: Forces a post-PG 10 beta2 initdb.
Report by Amul Sul, original patch by Amit Langote with some
additional hacking by me.
Discussion: https://postgr.es/m/CAAJ_b947mowpLdxL3jo3YLKngRjrq9+Ej4ymduQTfYR+8=YAYQ@mail.gmail.com
2017-07-21 10:20:47 +02:00
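A minimal sketch of the syntax described above (the table is hypothetical). Because one partition's lower bound can equal another's upper bound, multi-column range partitions can be made contiguous:

```sql
CREATE TABLE t (a int, b int) PARTITION BY RANGE (a, b);

-- Everything up to and including a = 10, for any b ...
CREATE TABLE t_low PARTITION OF t
    FOR VALUES FROM (MINVALUE, MINVALUE) TO (10, MAXVALUE);

-- ... is contiguous with everything above: this lower bound equals
-- t_low's upper bound exactly, so no gap or overlap is possible.
CREATE TABLE t_high PARTITION OF t
    FOR VALUES FROM (10, MAXVALUE) TO (MAXVALUE, MAXVALUE);
```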
|
|
|
PartitionRangeDatumKind
|
2017-05-17 21:52:16 +02:00
|
|
|
PartitionScheme
|
2016-12-13 16:51:32 +01:00
|
|
|
PartitionSpec
|
Allow UPDATE to move rows between partitions.
When an UPDATE causes a row to no longer match the partition
constraint, try to move it to a different partition where it does
match the partition constraint. In essence, the UPDATE is split into
a DELETE from the old partition and an INSERT into the new one. This
can lead to surprising behavior in concurrency scenarios because
EvalPlanQual rechecks won't work as they normally would; the known
problems are documented. (There is a pending patch to improve the
situation further, but it needs more review.)
Amit Khandekar, reviewed and tested by Amit Langote, David Rowley,
Rajkumar Raghuwanshi, Dilip Kumar, Amul Sul, Thomas Munro, Álvaro
Herrera, Amit Kapila, and me. A few final revisions by me.
Discussion: http://postgr.es/m/CAJ3gD9do9o2ccQ7j7+tSgiE1REY65XRiMb=yJO3u3QhyP8EEPQ@mail.gmail.com
2018-01-19 21:33:06 +01:00
|
|
|
PartitionTupleRouting
|
2018-04-26 20:45:04 +02:00
|
|
|
PartitionedRelPruneInfo
|
|
|
|
PartitionedRelPruningData
|
Implement partition-wise grouping/aggregation.
If the partition keys of the input relation are part of the GROUP BY
clause, all the rows belonging to a given group come from a single
partition. This allows aggregation/grouping over a partitioned
relation to be broken down into aggregation/grouping on each
partition. This should be no worse, and often better, than the normal
approach.
If the GROUP BY clause does not contain all the partition keys, we can
still perform partial aggregation for each partition and then finalize
aggregation after appending the partial results. This is less certain
to be a win, but it's still useful.
Jeevan Chalke, Ashutosh Bapat, Robert Haas. The larger patch series
of which this patch is a part was also reviewed and tested by Antonin
Houska, Rajkumar Raghuwanshi, David Rowley, Dilip Kumar, Konstantin
Knizhnik, Pascal Legrand, and Rafia Sabih.
Discussion: http://postgr.es/m/CAM2+6=V64_xhstVHie0Rz=KPEQnLJMZt_e314P0jaT_oJ9MR8A@mail.gmail.com
2018-03-22 17:49:48 +01:00
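A short sketch of how this might be exercised, assuming a hypothetical table t with columns a and b, partitioned on a (the GUC enable_partitionwise_aggregate controls the feature and is off by default):

```sql
SET enable_partitionwise_aggregate = on;

-- GROUP BY contains the partition key: each partition can be
-- aggregated independently and the per-partition results appended.
EXPLAIN SELECT a, count(*) FROM t GROUP BY a;

-- GROUP BY lacks the partition key: partial aggregation per
-- partition, finalized after appending the partial results.
EXPLAIN SELECT b, count(*) FROM t GROUP BY b;
```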
|
|
|
PartitionwiseAggregateType
|
2017-05-17 21:52:16 +02:00
|
|
|
PasswordType
|
2010-02-26 02:55:35 +01:00
|
|
|
Path
|
|
|
|
PathClauseUsage
|
2012-06-10 21:15:31 +02:00
|
|
|
PathCostComparison
|
2014-05-06 15:08:14 +02:00
|
|
|
PathHashStack
|
2010-02-26 02:55:35 +01:00
|
|
|
PathKey
|
2022-05-12 21:17:30 +02:00
|
|
|
PathKeyInfo
|
2010-02-26 02:55:35 +01:00
|
|
|
PathKeysComparison
|
2016-04-27 17:47:28 +02:00
|
|
|
PathTarget
|
2015-05-24 03:20:37 +02:00
|
|
|
PathkeyMutatorState
|
2010-02-26 02:55:35 +01:00
|
|
|
PathkeySortCost
|
2021-05-12 19:14:10 +02:00
|
|
|
PatternInfo
|
|
|
|
PatternInfoArray
|
2010-02-26 02:55:35 +01:00
|
|
|
Pattern_Prefix_Status
|
|
|
|
Pattern_Type
|
2019-04-04 10:56:03 +02:00
|
|
|
PendingFsyncEntry
|
2010-02-26 02:55:35 +01:00
|
|
|
PendingRelDelete
|
|
|
|
PendingRelSync
|
|
|
|
PendingUnlinkEntry
|
Allow to trigger kernel writeback after a configurable number of writes.
Currently writes to the main data files of postgres all go through the
OS page cache. This means that some operating systems can end up
collecting a large number of dirty buffers in their respective page
caches. When these dirty buffers are flushed to storage rapidly, be it
because of fsync(), timeouts, or dirty ratios, latency for other reads
and writes can increase massively. This is the primary reason for
regular massive stalls observed in real world scenarios and artificial
benchmarks; on rotating disks stalls on the order of hundreds of seconds
have been observed.
On linux it is possible to control this by reducing the global dirty
limits significantly, reducing the above problem. But global
configuration is rather problematic because it'll affect other
applications; also PostgreSQL itself doesn't always generally want this
behavior, e.g. for temporary files it's undesirable.
Several operating systems allow some control over the kernel page
cache. Linux has sync_file_range(2), several posix systems have msync(2)
and posix_fadvise(2). sync_file_range(2) is preferable because it
requires no special setup, whereas msync() requires the to-be-flushed
range to be mmap'ed. For the purpose of flushing dirty data
posix_fadvise(2) is the worst alternative, as flushing dirty data is
just a side-effect of POSIX_FADV_DONTNEED, which also removes the pages
from the page cache. Thus the feature is enabled by default only on
linux, but can be enabled on all systems that have any of the above
APIs.
While desirable and likely possible, this patch does not contain an
implementation for Windows.
With the infrastructure added, writes made via checkpointer, bgwriter
and normal user backends can be flushed after a configurable number of
writes. Each of these sources of writes is controlled by a separate GUC,
checkpointer_flush_after, bgwriter_flush_after and backend_flush_after
respectively; they're separate because the number of writes after which
flushing is beneficial differs, and because the performance
considerations of controlled flushing for each of these are different.
A later patch will add checkpoint sorting - after that, flushes from the
checkpoint will almost always be desirable. Bgwriter flushes are most of
the time going to be random, which are slow on lots of storage hardware.
Flushing in backends works well if the storage and bgwriter can keep up,
but if not it can have negative consequences. This patch is likely to
have negative performance consequences without checkpoint sorting, but
unfortunately so has sorting without flush control.
Discussion: alpine.DEB.2.10.1506011320000.28433@sto
Author: Fabien Coelho and Andres Freund
2016-02-19 21:13:05 +01:00
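The three GUCs named above can be set like any other; a sketch (the values shown are illustrative defaults, not recommendations — each GUC accepts a size, with 0 disabling flush control for that source):

```sql
ALTER SYSTEM SET checkpoint_flush_after = '256kB';
ALTER SYSTEM SET bgwriter_flush_after = '512kB';
ALTER SYSTEM SET backend_flush_after = 0;  -- 0 disables controlled flushing
SELECT pg_reload_conf();
```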
|
|
|
PendingWriteback
|
2022-05-12 21:17:30 +02:00
|
|
|
PerLockTagEntry
|
2010-02-26 02:55:35 +01:00
|
|
|
PerlInterpreter
|
2012-06-10 21:15:31 +02:00
|
|
|
Perl_ppaddr_t
|
2014-05-06 15:08:14 +02:00
|
|
|
Permutation
|
|
|
|
PermutationStep
|
|
|
|
PermutationStepBlocker
|
2010-02-26 02:55:35 +01:00
|
|
|
PermutationStepBlockerType
|
Make archiver process an auxiliary process.
This commit changes WAL archiver process so that it's treated as
an auxiliary process and can use shared memory. This is an infrastructure
patch required for upcoming shared-memory based stats collector patch
series. These patch series basically need any process, including the
archiver, that reports statistics to have access to shared memory. Since
this patch by itself is useful to simplify the code and to let users
monitor the status of the archiver, it's committed separately in advance.
This commit simplifies the code for WAL archiving. For example,
previously backends needed to signal the archiver via the postmaster to
notify it that there are WAL files to archive. This commit removes that
signal to the postmaster and enables backends to notify the archiver
directly using a shared latch.
Also, as a side effect of this change, information about the archiver
process becomes viewable in the pg_stat_activity view.
Author: Kyotaro Horiguchi
Reviewed-by: Andres Freund, Álvaro Herrera, Julien Rouhaud, Tomas Vondra, Arthur Zakirov, Fujii Masao
Discussion: https://postgr.es/m/20180629.173418.190173462.horiguchi.kyotaro@lab.ntt.co.jp
2021-03-15 05:13:14 +01:00
|
|
|
PgArchData
|
2015-05-24 03:20:37 +02:00
|
|
|
PgBackendGSSStatus
|
|
|
|
PgBackendSSLStatus
|
2010-02-26 02:55:35 +01:00
|
|
|
PgBackendStatus
|
2015-05-24 03:20:37 +02:00
|
|
|
PgBenchExpr
|
2016-05-02 15:23:55 +02:00
|
|
|
PgBenchExprLink
|
|
|
|
PgBenchExprList
|
2015-05-24 03:20:37 +02:00
|
|
|
PgBenchExprType
|
2016-04-27 17:47:28 +02:00
|
|
|
PgBenchFunction
|
|
|
|
PgBenchValue
|
|
|
|
PgBenchValueType
|
Add options to enable and disable checksums in pg_checksums
An offline cluster can now work with more modes in pg_checksums:
- --enable enables checksums in a cluster, updating all blocks with a
correct checksum, and updating the control file at the end.
- --disable disables checksums in a cluster, updating only the control
file.
- --check is an extra option able to verify checksums for a cluster, and
the default used if no mode is specified.
When running --enable or --disable, the data folder gets fsync'd for
durability, and then it is followed by a control file update and flush
to keep the operation consistent should the tool be interrupted, killed
or the host unplugged. If no mode is specified in the options, then
--check is used for compatibility with older versions of pg_checksums
(named pg_verify_checksums in v11 where it was introduced).
Author: Michael Banck, Michael Paquier
Reviewed-by: Fabien Coelho, Magnus Hagander, Sergei Kornilov
Discussion: https://postgr.es/m/20181221201616.GD4974@nighthawk.caipicrew.dd-dns.de
2019-03-23 00:12:55 +01:00
|
|
|
PgChecksumMode
|
2010-02-26 02:55:35 +01:00
|
|
|
PgFdwAnalyzeState
|
|
|
|
PgFdwConnState
|
2016-04-27 17:47:28 +02:00
|
|
|
PgFdwDirectModifyState
|
2010-02-26 02:55:35 +01:00
|
|
|
PgFdwModifyState
|
2013-05-29 22:58:43 +02:00
|
|
|
PgFdwOption
|
2015-05-24 03:20:37 +02:00
|
|
|
PgFdwPathExtraData
|
2010-02-26 02:55:35 +01:00
|
|
|
PgFdwRelationInfo
|
|
|
|
PgFdwScanState
|
|
|
|
PgIfAddrCallback
|
2022-04-07 06:29:46 +02:00
|
|
|
PgStatShared_Archiver
|
|
|
|
PgStatShared_BgWriter
|
|
|
|
PgStatShared_Checkpointer
|
|
|
|
PgStatShared_Common
|
|
|
|
PgStatShared_Database
|
|
|
|
PgStatShared_Function
|
|
|
|
PgStatShared_HashEntry
|
pgstat: Infrastructure for more detailed IO statistics
This commit adds the infrastructure for more detailed IO statistics. The calls
to actually count IOs, a system view to access the new statistics,
documentation and tests will be added in subsequent commits, to make review
easier.
While we already had some IO statistics, e.g. in pg_stat_bgwriter and
pg_stat_database, they did not provide sufficient detail to understand what
the main sources of IO are, or whether configuration changes could avoid
IO. E.g., pg_stat_bgwriter.buffers_backend does contain the number of buffers
written out by a backend, but as that includes extending relations (always
done by backends) and writes triggered by the use of buffer access strategies,
it cannot easily be used to tune background writer or checkpointer. Similarly,
pg_stat_database.blks_read cannot easily be used to tune shared_buffers /
compute a cache hit ratio, as the use of buffer access strategies will often
prevent a large fraction of the read blocks from ending up in shared_buffers.
The new IO statistics count IO operations (evict, extend, fsync, read, reuse,
and write), and are aggregated for each combination of backend type (backend,
autovacuum worker, bgwriter, etc), target object of the IO (relations, temp
relations) and context of the IO (normal, vacuum, bulkread, bulkwrite).
What is tracked in this series of patches is sufficient to perform the
aforementioned analyses. Further details, e.g. tracking the number of buffer
hits, would make that even easier, but was left out for now, to keep the scope
of the already large patchset manageable.
Bumps PGSTAT_FILE_FORMAT_ID.
Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20200124195226.lth52iydq2n2uilq@alap3.anarazel.de
2023-02-09 05:53:42 +01:00
|
|
|
PgStatShared_IO
|
2022-04-07 06:29:46 +02:00
|
|
|
PgStatShared_Relation
|
|
|
|
PgStatShared_ReplSlot
|
|
|
|
PgStatShared_SLRU
|
|
|
|
PgStatShared_Subscription
|
|
|
|
PgStatShared_Wal
|
2014-05-06 15:08:14 +02:00
|
|
|
PgStat_ArchiverStats
|
2022-04-07 06:29:46 +02:00
|
|
|
PgStat_BackendSubEntry
|
2014-05-06 15:08:14 +02:00
|
|
|
PgStat_BgWriterStats
|
pgstat: Infrastructure for more detailed IO statistics
2023-02-09 05:53:42 +01:00
|
|
|
PgStat_BktypeIO
|
2014-05-06 15:08:14 +02:00
|
|
|
PgStat_CheckpointerStats
|
2010-02-26 02:55:35 +01:00
|
|
|
PgStat_Counter
|
2022-04-07 06:29:46 +02:00
|
|
|
PgStat_EntryRef
|
|
|
|
PgStat_EntryRefHashEntry
|
|
|
|
PgStat_FetchConsistency
|
2010-02-26 02:55:35 +01:00
|
|
|
PgStat_FunctionCallUsage
|
|
|
|
PgStat_FunctionCounts
|
2022-04-07 06:29:46 +02:00
|
|
|
PgStat_HashKey
|
pgstat: Infrastructure for more detailed IO statistics
2023-02-09 05:53:42 +01:00
|
|
|
PgStat_IO
|
2022-04-07 02:56:19 +02:00
|
|
|
PgStat_Kind
|
2022-04-07 06:29:46 +02:00
|
|
|
PgStat_KindInfo
|
|
|
|
PgStat_LocalState
|
pgstat: scaffolding for transactional stats creation / drop.
One problematic part of the current statistics collector design is that there
is no reliable way of getting rid of statistics entries. Because of that
pgstat_vacuum_stat() (called by [auto-]vacuum) matches all stats for the
current database with the catalog contents and tries to drop now-superfluous
entries. That's quite expensive. What's worse, it doesn't work on physical
replicas, despite physical replicas collecting statistics entries.
This commit introduces infrastructure to create / drop statistics entries
transactionally, together with the underlying catalog objects (functions,
relations, subscriptions). pgstat_xact.c maintains a list of stats entries
created / dropped transactionally in the current transaction. To ensure the
removal of statistics entries is durable, dropped statistics entries are
included in commit / abort (and prepare) records, which also ensures that
stats entries are dropped on standbys.
Statistics entries created separately from creating the underlying catalog
object (e.g. when stats were previously lost due to an immediate restart)
are *not* WAL logged. However that can only happen outside of the transaction
creating the catalog object, so it does not lead to "leaked" statistics
entries.
For this to work, functions creating / dropping functions / relations /
subscriptions need to call into pgstat. For subscriptions this was already
done when dropping subscriptions, via pgstat_report_subscription_drop() (now
renamed to pgstat_drop_subscription()).
This commit does not actually drop stats yet, it just provides the
infrastructure. It is however a largely independent piece of infrastructure,
so committing it separately makes sense.
Bumps XLOG_PAGE_MAGIC.
Author: Andres Freund <andres@anarazel.de>
Reviewed-By: Thomas Munro <thomas.munro@gmail.com>
Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
2022-04-07 03:22:22 +02:00
|
|
|
PgStat_PendingDroppedStatsItem
|
2010-02-26 02:55:35 +01:00
|
|
|
PgStat_SLRUStats
|
2022-04-07 06:29:46 +02:00
|
|
|
PgStat_ShmemControl
|
|
|
|
PgStat_Snapshot
|
|
|
|
PgStat_SnapshotEntry
|
2010-02-26 02:55:35 +01:00
|
|
|
PgStat_StatDBEntry
|
|
|
|
PgStat_StatFuncEntry
|
2021-04-27 05:39:11 +02:00
|
|
|
PgStat_StatReplSlotEntry
|
2022-03-01 01:47:52 +01:00
|
|
|
PgStat_StatSubEntry
|
2010-02-26 02:55:35 +01:00
|
|
|
PgStat_StatTabEntry
|
|
|
|
PgStat_SubXactStatus
|
|
|
|
PgStat_TableCounts
|
|
|
|
PgStat_TableStatus
|
|
|
|
PgStat_TableXactStatus
|
2020-10-02 03:17:11 +02:00
|
|
|
PgStat_WalStats
|
2011-11-14 18:12:23 +01:00
|
|
|
PgXmlErrorContext
|
|
|
|
PgXmlStrictness
|
2010-02-26 02:55:35 +01:00
|
|
|
Pg_finfo_record
|
|
|
|
Pg_magic_struct
|
|
|
|
PipeProtoChunk
|
|
|
|
PipeProtoHeader
|
|
|
|
PlaceHolderInfo
|
|
|
|
PlaceHolderVar
|
|
|
|
Plan
|
2016-05-02 15:23:55 +02:00
|
|
|
PlanDirectModify_function
|
2011-04-09 05:11:37 +02:00
|
|
|
PlanForeignModify_function
|
2010-02-26 02:55:35 +01:00
|
|
|
PlanInvalItem
|
|
|
|
PlanRowMark
|
|
|
|
PlanState
|
|
|
|
PlannedStmt
|
|
|
|
PlannerGlobal
|
|
|
|
PlannerInfo
|
|
|
|
PlannerParamItem
|
|
|
|
Point
|
|
|
|
Pointer
|
2015-05-24 03:20:37 +02:00
|
|
|
PolicyInfo
|
|
|
|
PolyNumAggState
|
2010-02-26 02:55:35 +01:00
|
|
|
Pool
|
2017-05-17 21:52:16 +02:00
|
|
|
PopulateArrayContext
|
|
|
|
PopulateArrayState
|
|
|
|
PopulateRecordCache
|
2010-02-26 02:55:35 +01:00
|
|
|
PopulateRecordsetState
|
|
|
|
Port
|
|
|
|
Portal
|
|
|
|
PortalHashEnt
|
|
|
|
PortalStatus
|
|
|
|
PortalStrategy
|
|
|
|
PostParseColumnRefHook
|
|
|
|
PostgresPollingStatusType
|
|
|
|
PostingItem
|
2014-05-06 15:08:14 +02:00
|
|
|
PostponedQual
|
2010-02-26 02:55:35 +01:00
|
|
|
PreParseColumnRefHook
|
|
|
|
PredClass
|
|
|
|
PredIterInfo
|
|
|
|
PredIterInfoData
|
2011-04-09 05:11:37 +02:00
|
|
|
PredXactList
|
|
|
|
PredXactListElement
|
Implement genuine serializable isolation level.
Until now, our Serializable mode has in fact been what's called Snapshot
Isolation, which allows some anomalies that could not occur in any
serialized ordering of the transactions. This patch fixes that using a
method called Serializable Snapshot Isolation, based on research papers by
Michael J. Cahill (see README-SSI for full references). In Serializable
Snapshot Isolation, transactions run like they do in Snapshot Isolation,
but a predicate lock manager observes the reads and writes performed and
aborts transactions if it detects that an anomaly might occur. This method
produces some false positives, ie. it sometimes aborts transactions even
though there is no anomaly.
To track reads we implement predicate locking, see storage/lmgr/predicate.c.
Whenever a tuple is read, a predicate lock is acquired on the tuple. Shared
memory is finite, so when a transaction takes many tuple-level locks on a
page, the locks are promoted to a single page-level lock, and further to a
single relation level lock if necessary. To lock key values with no matching
tuple, a sequential scan always takes a relation-level lock, and an index
scan acquires a page-level lock that covers the search key, whether or not
there are any matching keys at the moment.
A predicate lock doesn't conflict with any regular locks or with other
predicate locks in the normal sense. They're only used by the predicate lock
manager to detect the danger of anomalies. Only serializable transactions
participate in predicate locking, so there should be no extra overhead
for other transactions.
Predicate locks can't be released at commit, but must be remembered until
all the transactions that overlapped with them have completed. That means that
we need to remember an unbounded amount of predicate locks, so we apply a
lossy but conservative method of tracking locks for committed transactions.
If we run short of shared memory, we overflow to a new "pg_serial" SLRU
pool.
We don't currently allow Serializable transactions in Hot Standby mode.
That would be hard, because even read-only transactions can cause anomalies
that wouldn't otherwise occur.
Serializable isolation mode now means the new fully serializable level.
Repeatable Read gives you the old Snapshot Isolation level that we have
always had.
Kevin Grittner and Dan Ports, reviewed by Jeff Davis, Heikki Linnakangas and
Anssi Kääriäinen
2011-02-07 22:46:51 +01:00
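A minimal sketch of what this means for applications (the statement between BEGIN and COMMIT is a placeholder):

```sql
-- Serializable now means true serializability (SSI); the old Snapshot
-- Isolation behavior remains available as REPEATABLE READ.
BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
-- ... reads and writes here are observed by the predicate lock manager ...
COMMIT;  -- may fail with a serialization failure, requiring a retry
```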
|
|
|
PredicateLockData
|
2011-04-09 05:11:37 +02:00
|
|
|
PredicateLockTargetType
|
2020-05-14 19:06:38 +02:00
|
|
|
PrefetchBufferResult
|
2017-05-17 21:52:16 +02:00
|
|
|
PrepParallelRestorePtrType
|
2010-02-26 02:55:35 +01:00
|
|
|
PrepareStmt
|
|
|
|
PreparedStatement
|
2020-05-14 19:06:38 +02:00
|
|
|
PresortedKeyData
|
2014-05-06 15:08:14 +02:00
|
|
|
PrewarmType
|
2017-05-17 21:52:16 +02:00
|
|
|
PrintExtraTocPtrType
|
|
|
|
PrintTocDataPtrType
|
2011-04-09 05:11:37 +02:00
|
|
|
PrintfArgType
|
|
|
|
PrintfArgValue
|
|
|
|
PrintfTarget
|
2010-02-26 02:55:35 +01:00
|
|
|
PrinttupAttrInfo
|
|
|
|
PrivTarget
|
2015-05-24 03:20:37 +02:00
|
|
|
PrivateRefCountEntry
|
2010-02-26 02:55:35 +01:00
|
|
|
ProcArrayStruct
|
|
|
|
ProcLangInfo
|
|
|
|
ProcSignalBarrierType
|
|
|
|
ProcSignalHeader
|
|
|
|
ProcSignalReason
|
|
|
|
ProcSignalSlot
|
|
|
|
ProcState
|
2021-05-12 19:14:10 +02:00
|
|
|
ProcWaitStatus
|
2010-02-26 02:55:35 +01:00
|
|
|
ProcessUtilityContext
|
|
|
|
ProcessUtility_hook_type
|
|
|
|
ProcessingMode
|
2016-05-02 15:23:55 +02:00
|
|
|
ProgressCommandType
|
2017-05-17 21:52:16 +02:00
|
|
|
ProjectSet
|
|
|
|
ProjectSetPath
|
|
|
|
ProjectSetState
|
2010-02-26 02:55:35 +01:00
|
|
|
ProjectionInfo
|
2016-04-27 17:47:28 +02:00
|
|
|
ProjectionPath
|
2018-04-26 20:45:04 +02:00
|
|
|
PromptInterruptContext
|
2010-02-26 02:55:35 +01:00
|
|
|
ProtocolVersion
|
|
|
|
PrsStorage
|
|
|
|
PruneState
|
2018-04-26 20:45:04 +02:00
|
|
|
PruneStepResult
|
2016-04-27 17:47:28 +02:00
|
|
|
PsqlScanCallbacks
|
2017-05-17 21:52:16 +02:00
|
|
|
PsqlScanQuoteType
|
2010-02-26 02:55:35 +01:00
|
|
|
PsqlScanResult
|
|
|
|
PsqlScanState
|
|
|
|
PsqlScanStateData
|
|
|
|
PsqlSettings
|
2017-05-17 21:52:16 +02:00
|
|
|
Publication
|
|
|
|
PublicationActions
|
Allow specifying row filters for logical replication of tables.
This feature adds row filtering for publication tables. When a publication
is defined or modified, an optional WHERE clause can be specified. Rows
that don't satisfy this WHERE clause will be filtered out. This allows a
set of tables to be partially replicated. The row filter is per table. A
new row filter can be added simply by specifying a WHERE clause after the
table name. The WHERE clause must be enclosed by parentheses.
The row filter WHERE clause for a table added to a publication that
publishes UPDATE and/or DELETE operations must contain only columns that
are covered by REPLICA IDENTITY. The row filter WHERE clause for a table
added to a publication that publishes INSERT can use any column. If the
row filter evaluates to NULL, it is regarded as "false". The WHERE clause
only allows simple expressions that don't have user-defined functions,
user-defined operators, user-defined types, user-defined collations,
non-immutable built-in functions, or references to system columns. These
restrictions could be addressed in the future.
If you choose to do the initial table synchronization, only data that
satisfies the row filters is copied to the subscriber. If the subscription
has several publications in which a table has been published with
different WHERE clauses, rows that satisfy ANY of the expressions will be
copied. If a subscriber is a pre-15 version, the initial table
synchronization won't use row filters even if they are defined in the
publisher.
The row filters are applied before publishing the changes. If the
subscription has several publications in which the same table has been
published with different filters (for the same publish operation), those
expressions get OR'ed together so that rows satisfying any of the
expressions will be replicated.
This means all the other filters become redundant if (a) one of the
publications has no filter at all, (b) one of the publications was
created using FOR ALL TABLES, (c) one of the publications was created
using FOR ALL TABLES IN SCHEMA and the table belongs to that same schema.
If your publication contains a partitioned table, the publication
parameter publish_via_partition_root determines if it uses the partition's
row filter (if the parameter is false, the default) or the root
partitioned table's row filter.
Psql commands \dRp+ and \d <table-name> will display any row filters.
Author: Hou Zhijie, Euler Taveira, Peter Smith, Ajin Cherian
Reviewed-by: Greg Nancarrow, Haiying Tang, Amit Kapila, Tomas Vondra, Dilip Kumar, Vignesh C, Alvaro Herrera, Andres Freund, Wei Wang
Discussion: https://www.postgresql.org/message-id/flat/CAHE3wggb715X%2BmK_DitLXF25B%3DjE6xyNCH4YOwM860JR7HarGQ%40mail.gmail.com
2022-02-22 03:24:12 +01:00
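A minimal sketch of the syntax described above (table and column names are hypothetical; note that the WHERE clause must be enclosed in parentheses):

```sql
-- For a publication that publishes UPDATE/DELETE, the filter columns
-- must be covered by the replica identity.
ALTER TABLE accounts REPLICA IDENTITY FULL;

-- Rows for which the filter is false or NULL are not replicated.
CREATE PUBLICATION active_accounts
    FOR TABLE accounts WHERE (active IS TRUE);
```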
|
|
|
PublicationDesc
|
2017-05-17 21:52:16 +02:00
|
|
|
PublicationInfo
|
Allow publishing the tables of schema.
A new option "FOR ALL TABLES IN SCHEMA" in Create/Alter Publication allows
one or more schemas to be specified, whose tables are selected by the
publisher for sending the data to the subscriber.
The new syntax allows specifying both the tables and schemas. For example:
CREATE PUBLICATION pub1 FOR TABLE t1,t2,t3, ALL TABLES IN SCHEMA s1,s2;
OR
ALTER PUBLICATION pub1 ADD TABLE t1,t2,t3, ALL TABLES IN SCHEMA s1,s2;
A new system table "pg_publication_namespace" has been added, to maintain
the schemas that the user wants to publish through the publication.
Modified the output plugin (pgoutput) to publish the changes if the
relation is part of schema publication.
Updates pg_dump to identify and dump schema publications. Updates the \d
family of commands to display schema publications, and the \dRp+ variant
will now display associated schemas, if any.
Author: Vignesh C, Hou Zhijie, Amit Kapila
Syntax-Suggested-by: Tom Lane, Alvaro Herrera
Reviewed-by: Greg Nancarrow, Masahiko Sawada, Hou Zhijie, Amit Kapila, Haiying Tang, Ajin Cherian, Rahila Syed, Bharath Rupireddy, Mark Dilger
Tested-by: Haiying Tang
Discussion: https://www.postgresql.org/message-id/CALDaNm0OANxuJ6RXqwZsM1MSY4s19nuH3734j4a72etDwvBETQ@mail.gmail.com
2021-10-27 04:14:52 +02:00
|
|
|
PublicationObjSpec
|
|
|
|
PublicationObjSpecType
|
2017-05-17 21:52:16 +02:00
|
|
|
PublicationPartOpt
|
|
|
|
PublicationRelInfo
|
Allow publishing the tables of schema.
A new option "FOR ALL TABLES IN SCHEMA" in Create/Alter Publication allows
one or more schemas to be specified, whose tables are selected by the
publisher for sending the data to the subscriber.
The new syntax allows specifying both the tables and schemas. For example:
CREATE PUBLICATION pub1 FOR TABLE t1,t2,t3, ALL TABLES IN SCHEMA s1,s2;
OR
ALTER PUBLICATION pub1 ADD TABLE t1,t2,t3, ALL TABLES IN SCHEMA s1,s2;
A new system table "pg_publication_namespace" has been added, to maintain
the schemas that the user wants to publish through the publication.
Modified the output plugin (pgoutput) to publish the changes if the
relation is part of a schema publication.
Updates pg_dump to identify and dump schema publications. Updates the \d
family of commands to display schema publications; the \dRp+ variant will
now display associated schemas, if any.
Author: Vignesh C, Hou Zhijie, Amit Kapila
Syntax-Suggested-by: Tom Lane, Alvaro Herrera
Reviewed-by: Greg Nancarrow, Masahiko Sawada, Hou Zhijie, Amit Kapila, Haiying Tang, Ajin Cherian, Rahila Syed, Bharath Rupireddy, Mark Dilger
Tested-by: Haiying Tang
Discussion: https://www.postgresql.org/message-id/CALDaNm0OANxuJ6RXqwZsM1MSY4s19nuH3734j4a72etDwvBETQ@mail.gmail.com
2021-10-27 04:14:52 +02:00
|
|
|
PublicationSchemaInfo
|
2021-09-06 19:24:50 +02:00
|
|
|
PublicationTable
|
2010-02-26 02:55:35 +01:00
|
|
|
PullFilter
|
|
|
|
PullFilterOps
|
|
|
|
PushFilter
|
|
|
|
PushFilterOps
|
|
|
|
PushFunction
|
2016-05-02 15:23:55 +02:00
|
|
|
PyCFunction
|
2012-06-10 21:15:31 +02:00
|
|
|
PyMappingMethods
|
2010-02-26 02:55:35 +01:00
|
|
|
PyMethodDef
|
2016-05-02 15:23:55 +02:00
|
|
|
PyModuleDef
|
2010-02-26 02:55:35 +01:00
|
|
|
PyObject
|
|
|
|
PySequenceMethods
|
|
|
|
PyTypeObject
|
|
|
|
Py_ssize_t
|
|
|
|
QPRS_STATE
|
|
|
|
QTN2QTState
|
|
|
|
QTNode
|
|
|
|
QUERYTYPE
|
|
|
|
QUERY_SECURITY_CONTEXT_TOKEN_FN
|
|
|
|
QualCost
|
|
|
|
QualItem
|
|
|
|
Query
|
2020-05-14 19:06:38 +02:00
|
|
|
QueryCompletion
|
2010-02-26 02:55:35 +01:00
|
|
|
QueryDesc
|
2017-05-17 21:52:16 +02:00
|
|
|
QueryEnvironment
|
2010-02-26 02:55:35 +01:00
|
|
|
QueryInfo
|
|
|
|
QueryItem
|
|
|
|
QueryItemType
|
|
|
|
QueryMode
|
|
|
|
QueryOperand
|
|
|
|
QueryOperator
|
|
|
|
QueryRepresentation
|
2016-04-27 17:47:28 +02:00
|
|
|
QueryRepresentationOperand
|
2010-02-26 02:55:35 +01:00
|
|
|
QuerySource
|
|
|
|
QueueBackendStatus
|
|
|
|
QueuePosition
|
|
|
|
QuitSignalReason
|
|
|
|
RBTNode
|
|
|
|
RBTOrderControl
|
|
|
|
RBTree
|
2016-12-13 16:51:32 +01:00
|
|
|
RBTreeIterator
|
2011-04-09 05:11:37 +02:00
|
|
|
REPARSE_JUNCTION_DATA_BUFFER
|
2010-02-26 02:55:35 +01:00
|
|
|
RIX
|
|
|
|
RI_CompareHashEntry
|
|
|
|
RI_CompareKey
|
|
|
|
RI_ConstraintInfo
|
|
|
|
RI_QueryHashEntry
|
|
|
|
RI_QueryKey
|
|
|
|
RTEKind
|
2022-12-06 16:09:24 +01:00
|
|
|
RTEPermissionInfo
|
2011-04-09 05:11:37 +02:00
|
|
|
RWConflict
|
|
|
|
RWConflictPoolHeader
|
2016-04-27 17:47:28 +02:00
|
|
|
Range
|
2011-11-14 18:12:23 +01:00
|
|
|
RangeBound
|
2016-04-27 17:47:28 +02:00
|
|
|
RangeBox
|
2010-02-26 02:55:35 +01:00
|
|
|
RangeFunction
|
2012-06-10 21:15:31 +02:00
|
|
|
RangeIOData
|
2010-02-26 02:55:35 +01:00
|
|
|
RangeQueryClause
|
|
|
|
RangeSubselect
|
2017-05-17 21:52:16 +02:00
|
|
|
RangeTableFunc
|
|
|
|
RangeTableFuncCol
|
2015-05-24 03:20:37 +02:00
|
|
|
RangeTableSample
|
2010-02-26 02:55:35 +01:00
|
|
|
RangeTblEntry
|
2014-05-06 15:08:14 +02:00
|
|
|
RangeTblFunction
|
2010-02-26 02:55:35 +01:00
|
|
|
RangeTblRef
|
2011-11-14 18:12:23 +01:00
|
|
|
RangeType
|
2010-02-26 02:55:35 +01:00
|
|
|
RangeVar
|
2012-06-10 21:15:31 +02:00
|
|
|
RangeVarGetRelidCallback
|
2021-05-12 19:14:10 +02:00
|
|
|
Ranges
|
2010-02-26 02:55:35 +01:00
|
|
|
RawColumnDefault
|
2021-05-12 19:14:10 +02:00
|
|
|
RawParseMode
|
2017-05-17 21:52:16 +02:00
|
|
|
RawStmt
|
2016-05-02 15:23:55 +02:00
|
|
|
ReInitializeDSMForeignScan_function
|
2011-04-09 05:11:37 +02:00
|
|
|
ReScanForeignScan_function
|
2017-05-17 21:52:16 +02:00
|
|
|
ReadBufPtrType
|
2010-02-26 02:55:35 +01:00
|
|
|
ReadBufferMode
|
2017-05-17 21:52:16 +02:00
|
|
|
ReadBytePtrType
|
|
|
|
ReadExtraTocPtrType
|
2011-04-09 05:11:37 +02:00
|
|
|
ReadFunc
|
2010-02-26 02:55:35 +01:00
|
|
|
ReadLocalXLogPageNoWaitPrivate
|
2021-10-25 00:40:42 +02:00
|
|
|
ReadReplicationSlotCmd
|
2010-02-26 02:55:35 +01:00
|
|
|
ReassignOwnedStmt
|
2016-05-02 15:23:55 +02:00
|
|
|
RecheckForeignScan_function
|
2010-02-26 02:55:35 +01:00
|
|
|
RecordCacheEntry
|
|
|
|
RecordCompareData
|
|
|
|
RecordIOData
|
2018-06-30 18:07:27 +02:00
|
|
|
RecoveryLockListsEntry
|
2020-04-24 01:48:28 +02:00
|
|
|
RecoveryPauseState
|
|
|
|
RecoveryState
|
2015-05-24 03:20:37 +02:00
|
|
|
RecoveryTargetTimeLineGoal
|
2010-07-06 21:18:19 +02:00
|
|
|
RecoveryTargetType
|
2016-04-27 17:47:28 +02:00
|
|
|
RectBox
|
2010-02-26 02:55:35 +01:00
|
|
|
RecursionContext
|
|
|
|
RecursiveUnion
|
2016-04-27 17:47:28 +02:00
|
|
|
RecursiveUnionPath
|
2010-02-26 02:55:35 +01:00
|
|
|
RecursiveUnionState
|
2015-05-24 03:20:37 +02:00
|
|
|
RefetchForeignRow_function
|
2013-05-29 22:58:43 +02:00
|
|
|
RefreshMatViewStmt
|
2010-02-26 02:55:35 +01:00
|
|
|
RegProcedure
|
|
|
|
Regis
|
|
|
|
RegisNode
|
2013-05-29 22:58:43 +02:00
|
|
|
RegisteredBgWorker
|
Add support for partitioned tables and indexes in REINDEX
Until now, REINDEX was not able to work with partitioned tables and
indexes, forcing users to reindex partitions one by one. This extends
REINDEX INDEX and REINDEX TABLE so that they can accept a partitioned
index and table as input, respectively, to reindex all the partitions
assigned to them with physical storage (foreign tables, partitioned
tables and indexes are then discarded).
This shares some logic with schema and database REINDEX, as each
partition gets processed in its own transaction after building a list of
relations to work on. This choice has the advantage of minimizing the
number of invalid indexes to one partition with REINDEX CONCURRENTLY in
the event of an in-flight cancellation or failure, as the only indexes
handled at once in a single REINDEX CONCURRENTLY loop are the ones from
the partition being worked on.
Isolation tests are added to emulate some cases I bumped into while
developing this feature, particularly with the concurrent drop of a
leaf partition being reindexed. However, this is rather limited, as LOCK would
cause REINDEX to block in the first transaction building the list of
partitions.
Per its multi-transaction nature, this new flavor cannot run in a
transaction block, similarly to REINDEX SCHEMA, SYSTEM and DATABASE.
Author: Justin Pryzby, Michael Paquier
Reviewed-by: Anastasia Lubennikova
Discussion: https://postgr.es/m/db12e897-73ff-467e-94cb-4af03705435f.adger.lj@alibaba-inc.com
2020-09-08 03:09:22 +02:00
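As a hedged sketch of the behavior this commit message describes (the table and index names below are hypothetical, not from the source):

```sql
-- Hypothetical partitioned setup; REINDEX now recurses into the partitions.
CREATE TABLE measurement (logdate date, city_id int) PARTITION BY RANGE (logdate);
CREATE TABLE measurement_y2020 PARTITION OF measurement
    FOR VALUES FROM ('2020-01-01') TO ('2021-01-01');
CREATE INDEX measurement_logdate_idx ON measurement (logdate);

-- Reindexes every leaf partition, each in its own transaction; like
-- REINDEX SCHEMA/SYSTEM/DATABASE, this cannot run inside a transaction block.
REINDEX TABLE measurement;
REINDEX INDEX measurement_logdate_idx;
```

With REINDEX CONCURRENTLY, per the commit message, a mid-run failure leaves invalid indexes only on the partition currently being processed.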
|
|
|
ReindexErrorInfo
|
2021-01-12 21:04:49 +01:00
|
|
|
ReindexIndexInfo
|
2015-05-24 03:20:37 +02:00
|
|
|
ReindexObjectType
|
2021-01-18 06:03:10 +01:00
|
|
|
ReindexParams
|
2010-02-26 02:55:35 +01:00
|
|
|
ReindexStmt
|
2019-07-02 04:36:53 +02:00
|
|
|
ReindexType
|
Change internal RelFileNode references to RelFileNumber or RelFileLocator.
We have been using the term RelFileNode to refer to either (1) the
integer that is used to name the sequence of files for a certain relation
within the directory set aside for that tablespace/database combination;
or (2) that value plus the OIDs of the tablespace and database; or
occasionally (3) the whole series of files created for a relation
based on those values. Using the same name for more than one thing is
confusing.
Replace RelFileNode with RelFileNumber when we're talking about just the
single number, i.e. (1) from above, and with RelFileLocator when we're
talking about all the things that are needed to locate a relation's files
on disk, i.e. (2) from above. In the places where we refer to (3) as
a relfilenode, instead refer to "relation storage".
Since there is a ton of SQL code in the world that knows about
pg_class.relfilenode, don't change the name of that column, or of other
SQL-facing things that derive their name from it.
On the other hand, do adjust closely-related internal terminology. For
example, the structure member names dbNode and spcNode appear to be
derived from the fact that the structure itself was called RelFileNode,
so change those to dbOid and spcOid. Likewise, various variables with
names like rnode and relnode get renamed appropriately, according to
how they're being used in context.
Hopefully, this is clearer than before. It is also preparation for
future patches that intend to widen the relfilenumber fields from their
current width of 32 bits. Variables that store a relfilenumber are now
declared as type RelFileNumber rather than type Oid; right now, these
are the same, but that can now more easily be changed.
Dilip Kumar, per an idea from me. Reviewed also by Andres Freund.
I fixed some whitespace issues, changed a couple of words in a
comment, and made one other minor correction.
Discussion: http://postgr.es/m/CA+TgmoamOtXbVAQf9hWFzonUo6bhhjS6toZQd7HZ-pmojtAmag@mail.gmail.com
Discussion: http://postgr.es/m/CA+Tgmobp7+7kmi4gkq7Y+4AM9fTvL+O1oQ4-5gFTT+6Ng-dQ=g@mail.gmail.com
Discussion: http://postgr.es/m/CAFiTN-vTe79M8uDH1yprOU64MNFE+R3ODRuA+JWf27JbhY4hJw@mail.gmail.com
2022-07-06 17:39:09 +02:00
|
|
|
RelFileLocator
|
|
|
|
RelFileLocatorBackend
|
2010-02-26 02:55:35 +01:00
|
|
|
RelIdCacheEnt
|
2010-07-06 21:18:19 +02:00
|
|
|
RelInfo
|
|
|
|
RelInfoArr
|
2010-02-26 02:55:35 +01:00
|
|
|
RelMapFile
|
|
|
|
RelMapping
|
|
|
|
RelOptInfo
|
|
|
|
RelOptKind
|
|
|
|
RelToCheck
|
|
|
|
RelToCluster
|
|
|
|
RelabelType
|
|
|
|
Relation
|
|
|
|
RelationData
|
|
|
|
RelationInfo
|
|
|
|
RelationPtr
|
2017-05-17 21:52:16 +02:00
|
|
|
RelationSyncEntry
|
2010-02-26 02:55:35 +01:00
|
|
|
RelcacheCallbackFunction
|
2022-05-12 21:17:30 +02:00
|
|
|
ReleaseMatchCB
|
Change internal RelFileNode references to RelFileNumber or RelFileLocator.
We have been using the term RelFileNode to refer to either (1) the
integer that is used to name the sequence of files for a certain relation
within the directory set aside for that tablespace/database combination;
or (2) that value plus the OIDs of the tablespace and database; or
occasionally (3) the whole series of files created for a relation
based on those values. Using the same name for more than one thing is
confusing.
Replace RelFileNode with RelFileNumber when we're talking about just the
single number, i.e. (1) from above, and with RelFileLocator when we're
talking about all the things that are needed to locate a relation's files
on disk, i.e. (2) from above. In the places where we refer to (3) as
a relfilenode, instead refer to "relation storage".
Since there is a ton of SQL code in the world that knows about
pg_class.relfilenode, don't change the name of that column, or of other
SQL-facing things that derive their name from it.
On the other hand, do adjust closely-related internal terminology. For
example, the structure member names dbNode and spcNode appear to be
derived from the fact that the structure itself was called RelFileNode,
so change those to dbOid and spcOid. Likewise, various variables with
names like rnode and relnode get renamed appropriately, according to
how they're being used in context.
Hopefully, this is clearer than before. It is also preparation for
future patches that intend to widen the relfilenumber fields from their
current width of 32 bits. Variables that store a relfilenumber are now
declared as type RelFileNumber rather than type Oid; right now, these
are the same, but that can now more easily be changed.
Dilip Kumar, per an idea from me. Reviewed also by Andres Freund.
I fixed some whitespace issues, changed a couple of words in a
comment, and made one other minor correction.
Discussion: http://postgr.es/m/CA+TgmoamOtXbVAQf9hWFzonUo6bhhjS6toZQd7HZ-pmojtAmag@mail.gmail.com
Discussion: http://postgr.es/m/CA+Tgmobp7+7kmi4gkq7Y+4AM9fTvL+O1oQ4-5gFTT+6Ng-dQ=g@mail.gmail.com
Discussion: http://postgr.es/m/CAFiTN-vTe79M8uDH1yprOU64MNFE+R3ODRuA+JWf27JbhY4hJw@mail.gmail.com
2022-07-06 17:39:09 +02:00
|
|
|
RelfilenumberMapEntry
|
|
|
|
RelfilenumberMapKey
|
2010-02-26 02:55:35 +01:00
|
|
|
Relids
|
2011-11-14 18:12:23 +01:00
|
|
|
RelocationBufferInfo
|
2016-12-13 16:51:32 +01:00
|
|
|
RelptrFreePageBtree
|
|
|
|
RelptrFreePageManager
|
|
|
|
RelptrFreePageSpanLeader
|
2010-02-26 02:55:35 +01:00
|
|
|
RenameStmt
|
2017-05-17 21:52:16 +02:00
|
|
|
ReopenPtrType
|
Introduce logical decoding.
This feature, building on previous commits, allows the write-ahead log
stream to be decoded into a series of logical changes; that is,
inserts, updates, and deletes and the transactions which contain them.
It is capable of handling decoding even across changes to the schema
of the affected tables. The output format is controlled by a
so-called "output plugin"; an example is included. To make use of
this in a real replication system, the output plugin will need to be
modified to produce output in the format appropriate to that system,
and to perform filtering.
Currently, information can be extracted from the logical decoding
system only via SQL; future commits will add the ability to stream
changes via walsender.
Andres Freund, with review and other contributions from many other
people, including Álvaro Herrera, Abhijit Menon-Sen, Peter Geoghegan,
Kevin Grittner, Robert Haas, Heikki Linnakangas, Fujii Masao, Michael
Paquier, Simon Riggs, Craig Ringer, and Steve Singer.
2014-03-03 22:32:18 +01:00
|
|
|
ReorderBuffer
|
|
|
|
ReorderBufferApplyChangeCB
|
2018-04-26 20:45:04 +02:00
|
|
|
ReorderBufferApplyTruncateCB
|
Introduce logical decoding.
This feature, building on previous commits, allows the write-ahead log
stream to be decoded into a series of logical changes; that is,
inserts, updates, and deletes and the transactions which contain them.
It is capable of handling decoding even across changes to the schema
of the affected tables. The output format is controlled by a
so-called "output plugin"; an example is included. To make use of
this in a real replication system, the output plugin will need to be
modified to produce output in the format appropriate to that system,
and to perform filtering.
Currently, information can be extracted from the logical decoding
system only via SQL; future commits will add the ability to stream
changes via walsender.
Andres Freund, with review and other contributions from many other
people, including Álvaro Herrera, Abhijit Menon-Sen, Peter Geoghegan,
Kevin Grittner, Robert Haas, Heikki Linnakangas, Fujii Masao, Michael
Paquier, Simon Riggs, Craig Ringer, and Steve Singer.
2014-03-03 22:32:18 +01:00
|
|
|
ReorderBufferBeginCB
|
|
|
|
ReorderBufferChange
|
Allow specifying row filters for logical replication of tables.
This feature adds row filtering for publication tables. When a publication
is defined or modified, an optional WHERE clause can be specified. Rows
that don't satisfy this WHERE clause will be filtered out. This allows a
set of tables to be partially replicated. The row filter is per table. A
new row filter can be added simply by specifying a WHERE clause after the
table name. The WHERE clause must be enclosed by parentheses.
The row filter WHERE clause for a table added to a publication that
publishes UPDATE and/or DELETE operations must contain only columns that
are covered by REPLICA IDENTITY. The row filter WHERE clause for a table
added to a publication that publishes INSERT can use any column. If the
row filter evaluates to NULL, it is regarded as "false". The WHERE clause
only allows simple expressions that don't have user-defined functions,
user-defined operators, user-defined types, user-defined collations,
non-immutable built-in functions, or references to system columns. These
restrictions could be addressed in the future.
If you choose to do the initial table synchronization, only data that
satisfies the row filters is copied to the subscriber. If the subscription
has several publications in which a table has been published with
different WHERE clauses, rows that satisfy ANY of the expressions will be
copied. If a subscriber is a pre-15 version, the initial table
synchronization won't use row filters even if they are defined in the
publisher.
The row filters are applied before publishing the changes. If the
subscription has several publications in which the same table has been
published with different filters (for the same publish operation), those
expressions get OR'ed together so that rows satisfying any of the
expressions will be replicated.
This means all the other filters become redundant if (a) one of the
publications has no filter at all, (b) one of the publications was
created using FOR ALL TABLES, (c) one of the publications was created
using FOR ALL TABLES IN SCHEMA and the table belongs to that same schema.
If your publication contains a partitioned table, the publication
parameter publish_via_partition_root determines if it uses the partition's
row filter (if the parameter is false, the default) or the root
partitioned table's row filter.
Psql commands \dRp+ and \d <table-name> will display any row filters.
Author: Hou Zhijie, Euler Taveira, Peter Smith, Ajin Cherian
Reviewed-by: Greg Nancarrow, Haiying Tang, Amit Kapila, Tomas Vondra, Dilip Kumar, Vignesh C, Alvaro Herrera, Andres Freund, Wei Wang
Discussion: https://www.postgresql.org/message-id/flat/CAHE3wggb715X%2BmK_DitLXF25B%3DjE6xyNCH4YOwM860JR7HarGQ%40mail.gmail.com
2022-02-22 03:24:12 +01:00
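A minimal, hedged illustration of the row-filter syntax this commit message describes (the publication, table, and column names below are hypothetical):

```sql
-- The WHERE clause follows the table name and must be parenthesized;
-- rows for which it evaluates to NULL are treated as "false" and skipped.
CREATE PUBLICATION pub_active FOR TABLE employees WHERE (active IS TRUE);

-- Filters can also be changed after the fact:
ALTER PUBLICATION pub_active SET TABLE employees WHERE (salary > 1000);
```

For a publication that also publishes UPDATE or DELETE, the filter may only reference columns covered by the table's REPLICA IDENTITY, as stated above.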
|
|
|
ReorderBufferChangeType
|
Introduce logical decoding.
This feature, building on previous commits, allows the write-ahead log
stream to be decoded into a series of logical changes; that is,
inserts, updates, and deletes and the transactions which contain them.
It is capable of handling decoding even across changes to the schema
of the affected tables. The output format is controlled by a
so-called "output plugin"; an example is included. To make use of
this in a real replication system, the output plugin will need to be
modified to produce output in the format appropriate to that system,
and to perform filtering.
Currently, information can be extracted from the logical decoding
system only via SQL; future commits will add the ability to stream
changes via walsender.
Andres Freund, with review and other contributions from many other
people, including Álvaro Herrera, Abhijit Menon-Sen, Peter Geoghegan,
Kevin Grittner, Robert Haas, Heikki Linnakangas, Fujii Masao, Michael
Paquier, Simon Riggs, Craig Ringer, and Steve Singer.
2014-03-03 22:32:18 +01:00
|
|
|
ReorderBufferCommitCB
|
|
|
|
ReorderBufferCommitPreparedCB
|
|
|
|
ReorderBufferDiskChange
|
|
|
|
ReorderBufferIterTXNEntry
|
|
|
|
ReorderBufferIterTXNState
|
2016-05-02 15:23:55 +02:00
|
|
|
ReorderBufferMessageCB
|
|
|
|
ReorderBufferPrepareCB
|
2020-12-30 11:47:26 +01:00
|
|
|
ReorderBufferRollbackPreparedCB
|
Introduce logical decoding.
This feature, building on previous commits, allows the write-ahead log
stream to be decoded into a series of logical changes; that is,
inserts, updates, and deletes and the transactions which contain them.
It is capable of handling decoding even across changes to the schema
of the affected tables. The output format is controlled by a
so-called "output plugin"; an example is included. To make use of
this in a real replication system, the output plugin will need to be
modified to produce output in the format appropriate to that system,
and to perform filtering.
Currently, information can be extracted from the logical decoding
system only via SQL; future commits will add the ability to stream
changes via walsender.
Andres Freund, with review and other contributions from many other
people, including Álvaro Herrera, Abhijit Menon-Sen, Peter Geoghegan,
Kevin Grittner, Robert Haas, Heikki Linnakangas, Fujii Masao, Michael
Paquier, Simon Riggs, Craig Ringer, and Steve Singer.
2014-03-03 22:32:18 +01:00
|
|
|
ReorderBufferStreamAbortCB
|
|
|
|
ReorderBufferStreamChangeCB
|
|
|
|
ReorderBufferStreamCommitCB
|
2016-05-02 15:23:55 +02:00
|
|
|
ReorderBufferStreamMessageCB
|
2018-04-26 20:45:04 +02:00
|
|
|
ReorderBufferStreamPrepareCB
|
Introduce logical decoding.
This feature, building on previous commits, allows the write-ahead log
stream to be decoded into a series of logical changes; that is,
inserts, updates, and deletes and the transactions which contain them.
It is capable of handling decoding even across changes to the schema
of the affected tables. The output format is controlled by a
so-called "output plugin"; an example is included. To make use of
this in a real replication system, the output plugin will need to be
modified to produce output in the format appropriate to that system,
and to perform filtering.
Currently, information can be extracted from the logical decoding
system only via SQL; future commits will add the ability to stream
changes via walsender.
Andres Freund, with review and other contributions from many other
people, including Álvaro Herrera, Abhijit Menon-Sen, Peter Geoghegan,
Kevin Grittner, Robert Haas, Heikki Linnakangas, Fujii Masao, Michael
Paquier, Simon Riggs, Craig Ringer, and Steve Singer.
2014-03-03 22:32:18 +01:00
|
|
|
ReorderBufferStreamStartCB
|
2018-04-26 20:45:04 +02:00
|
|
|
ReorderBufferStreamStopCB
|
|
|
|
ReorderBufferStreamTruncateCB
|
2014-05-06 15:08:14 +02:00
|
|
|
ReorderBufferTXN
|
|
|
|
ReorderBufferTXNByIdEnt
|
Introduce logical decoding.
This feature, building on previous commits, allows the write-ahead log
stream to be decoded into a series of logical changes; that is,
inserts, updates, and deletes and the transactions which contain them.
It is capable of handling decoding even across changes to the schema
of the affected tables. The output format is controlled by a
so-called "output plugin"; an example is included. To make use of
this in a real replication system, the output plugin will need to be
modified to produce output in the format appropriate to that system,
and to perform filtering.
Currently, information can be extracted from the logical decoding
system only via SQL; future commits will add the ability to stream
changes via walsender.
Andres Freund, with review and other contributions from many other
people, including Álvaro Herrera, Abhijit Menon-Sen, Peter Geoghegan,
Kevin Grittner, Robert Haas, Heikki Linnakangas, Fujii Masao, Michael
Paquier, Simon Riggs, Craig Ringer, and Steve Singer.
2014-03-03 22:32:18 +01:00
|
|
|
ReorderBufferToastEnt
|
|
|
|
ReorderBufferTupleBuf
|
|
|
|
ReorderBufferTupleCidEnt
|
|
|
|
ReorderBufferTupleCidKey
|
2023-02-08 03:28:25 +01:00
|
|
|
ReorderBufferUpdateProgressTxnCB
|
2015-05-24 03:20:37 +02:00
|
|
|
ReorderTuple
|
|
|
|
RepOriginId
|
|
|
|
ReparameterizeForeignPathByChild_function
|
2010-02-26 02:55:35 +01:00
|
|
|
ReplaceVarsFromTargetList_context
|
|
|
|
ReplaceVarsNoMatchOption
|
2014-05-06 15:08:14 +02:00
|
|
|
ReplicaIdentityStmt
|
|
|
|
ReplicationKind
|
|
|
|
ReplicationSlot
|
|
|
|
ReplicationSlotCtlData
|
|
|
|
ReplicationSlotOnDisk
|
|
|
|
ReplicationSlotPersistency
|
|
|
|
ReplicationSlotPersistentData
|
2015-05-24 03:20:37 +02:00
|
|
|
ReplicationState
|
|
|
|
ReplicationStateCtl
|
|
|
|
ReplicationStateOnDisk
|
2010-02-26 02:55:35 +01:00
|
|
|
ResTarget
|
2015-05-24 03:20:37 +02:00
|
|
|
ReservoirState
|
|
|
|
ReservoirStateData
|
2016-04-27 17:47:28 +02:00
|
|
|
ResourceArray
|
2010-02-26 02:55:35 +01:00
|
|
|
ResourceOwner
|
|
|
|
ResourceReleaseCallback
|
|
|
|
ResourceReleaseCallbackItem
|
|
|
|
ResourceReleasePhase
|
|
|
|
RestoreOptions
|
2017-08-14 23:29:33 +02:00
|
|
|
RestorePass
|
2010-02-26 02:55:35 +01:00
|
|
|
RestrictInfo
|
|
|
|
Result
|
|
|
|
ResultRelInfo
|
|
|
|
ResultState
|
|
|
|
ReturnSetInfo
|
2021-05-12 19:14:10 +02:00
|
|
|
ReturnStmt
|
2015-05-24 03:20:37 +02:00
|
|
|
RevmapContents
|
2014-05-06 15:08:14 +02:00
|
|
|
RewriteMappingDataEntry
|
|
|
|
RewriteMappingFile
|
2010-02-26 02:55:35 +01:00
|
|
|
RewriteRule
|
|
|
|
RewriteState
|
|
|
|
RmgrData
|
2013-05-29 22:58:43 +02:00
|
|
|
RmgrDescData
|
2010-02-26 02:55:35 +01:00
|
|
|
RmgrId
|
2014-05-06 15:08:14 +02:00
|
|
|
RoleNameItem
|
2015-05-24 03:20:37 +02:00
|
|
|
RoleSpec
|
|
|
|
RoleSpecType
|
2010-02-26 02:55:35 +01:00
|
|
|
RoleStmtType
|
2017-05-17 21:52:16 +02:00
|
|
|
RollupData
|
2010-02-26 02:55:35 +01:00
|
|
|
RowCompareExpr
|
|
|
|
RowCompareType
|
|
|
|
RowExpr
|
2021-05-12 19:14:10 +02:00
|
|
|
RowIdentityVarInfo
|
2010-02-26 02:55:35 +01:00
|
|
|
RowMarkClause
|
|
|
|
RowMarkType
|
2015-05-24 03:20:37 +02:00
|
|
|
RowSecurityDesc
|
|
|
|
RowSecurityPolicy
|
2022-05-12 21:17:30 +02:00
|
|
|
RtlGetLastNtStatus_t
|
2010-02-26 02:55:35 +01:00
|
|
|
RuleInfo
|
|
|
|
RuleLock
|
|
|
|
RuleStmt
|
|
|
|
RunningTransactions
|
|
|
|
RunningTransactionsData
|
|
|
|
SC_HANDLE
|
|
|
|
SECURITY_ATTRIBUTES
|
2011-04-09 05:11:37 +02:00
|
|
|
SECURITY_STATUS
|
2010-02-26 02:55:35 +01:00
|
|
|
SEG
|
Implement genuine serializable isolation level.
Until now, our Serializable mode has in fact been what's called Snapshot
Isolation, which allows some anomalies that could not occur in any
serialized ordering of the transactions. This patch fixes that using a
method called Serializable Snapshot Isolation, based on research papers by
Michael J. Cahill (see README-SSI for full references). In Serializable
Snapshot Isolation, transactions run like they do in Snapshot Isolation,
but a predicate lock manager observes the reads and writes performed and
aborts transactions if it detects that an anomaly might occur. This method
produces some false positives, i.e. it sometimes aborts transactions even
though there is no anomaly.
To track reads we implement predicate locking, see storage/lmgr/predicate.c.
Whenever a tuple is read, a predicate lock is acquired on the tuple. Shared
memory is finite, so when a transaction takes many tuple-level locks on a
page, the locks are promoted to a single page-level lock, and further to a
single relation-level lock if necessary. To lock key values with no matching
tuple, a sequential scan always takes a relation-level lock, and an index
scan acquires a page-level lock that covers the search key, whether or not
there are any matching keys at the moment.
A predicate lock doesn't conflict with any regular locks or with other
predicate locks in the normal sense. They're only used by the predicate lock
manager to detect the danger of anomalies. Only serializable transactions
participate in predicate locking, so there should be no extra overhead
for other transactions.
Predicate locks can't be released at commit, but must be remembered until
all the transactions that overlapped with them have completed. That means that
we need to remember an unbounded number of predicate locks, so we apply a
lossy but conservative method of tracking locks for committed transactions.
If we run short of shared memory, we overflow to a new "pg_serial" SLRU
pool.
We don't currently allow Serializable transactions in Hot Standby mode.
That would be hard, because even read-only transactions can cause anomalies
that wouldn't otherwise occur.
Serializable isolation mode now means the new fully serializable level.
Repeatable Read gives you the old Snapshot Isolation level that we have
always had.
Kevin Grittner and Dan Ports, reviewed by Jeff Davis, Heikki Linnakangas and
Anssi Kääriäinen
2011-02-07 22:46:51 +01:00
|
|
|
SERIALIZABLEXACT
|
|
|
|
SERIALIZABLEXID
|
|
|
|
SERIALIZABLEXIDTAG
|
2010-02-26 02:55:35 +01:00
|
|
|
SERVICE_STATUS
|
|
|
|
SERVICE_STATUS_HANDLE
|
|
|
|
SERVICE_TABLE_ENTRY
|
|
|
|
SID_AND_ATTRIBUTES
|
|
|
|
SID_IDENTIFIER_AUTHORITY
|
|
|
|
SID_NAME_USE
|
|
|
|
SISeg
|
2018-04-26 20:45:04 +02:00
|
|
|
SIZE_T
|
2010-02-26 02:55:35 +01:00
|
|
|
SMgrRelation
|
|
|
|
SMgrRelationData
|
2020-05-14 19:06:38 +02:00
|
|
|
SMgrSortArray
|
2013-05-29 22:58:43 +02:00
|
|
|
SOCKADDR
|
2011-04-09 05:11:37 +02:00
|
|
|
SOCKET
|
2010-02-26 02:55:35 +01:00
|
|
|
SPELL
|
2017-05-17 21:52:16 +02:00
|
|
|
SPICallbackArg
|
2016-04-27 17:47:28 +02:00
|
|
|
SPIExecuteOptions
|
2011-04-09 05:11:37 +02:00
|
|
|
SPIParseOpenOptions
|
2010-02-26 02:55:35 +01:00
|
|
|
SPIPlanPtr
|
|
|
|
SPIPrepareOptions
|
|
|
|
SPITupleTable
|
|
|
|
SPLITCOST
|
|
|
|
SPNode
|
|
|
|
SPNodeData
|
2012-06-10 21:15:31 +02:00
|
|
|
SPPageDesc
|
2013-05-29 22:58:43 +02:00
|
|
|
SQLDropObject
|
2010-02-26 02:55:35 +01:00
|
|
|
SQLFunctionCache
|
|
|
|
SQLFunctionCachePtr
|
2011-04-09 05:11:37 +02:00
|
|
|
SQLFunctionParseInfo
|
|
|
|
SQLFunctionParseInfoPtr
|
2010-02-26 02:55:35 +01:00
|
|
|
SSL
|
2016-04-27 17:47:28 +02:00
|
|
|
SSLExtensionInfoContext
|
2010-02-26 02:55:35 +01:00
|
|
|
SSL_CTX
|
|
|
|
STARTUPINFO
|
|
|
|
STRLEN
|
|
|
|
SV
|
2015-05-24 03:20:37 +02:00
|
|
|
SYNCHRONIZATION_BARRIER
|
|
|
|
SampleScan
|
2016-05-02 15:23:55 +02:00
|
|
|
SampleScanGetSampleSize_function
|
2015-05-24 03:20:37 +02:00
|
|
|
SampleScanState
|
|
|
|
SavedTransactionCharacteristics
|
2010-02-26 02:55:35 +01:00
|
|
|
ScalarArrayOpExpr
|
|
|
|
ScalarArrayOpExprHashEntry
|
|
|
|
ScalarArrayOpExprHashTable
|
2017-05-17 21:52:16 +02:00
|
|
|
ScalarIOData
|
2010-02-26 02:55:35 +01:00
|
|
|
ScalarItem
|
|
|
|
ScalarMCVItem
|
|
|
|
Scan
|
|
|
|
ScanDirection
|
|
|
|
ScanKey
|
|
|
|
ScanKeyData
|
|
|
|
ScanKeywordHashFunc
|
2012-06-10 21:15:31 +02:00
|
|
|
ScanKeywordList
|
2010-02-26 02:55:35 +01:00
|
|
|
ScanState
|
|
|
|
ScanTypeControl
|
2014-05-06 15:08:14 +02:00
|
|
|
ScannerCallbackState
|
2010-02-26 02:55:35 +01:00
|
|
|
SchemaQuery
|
2011-04-09 05:11:37 +02:00
|
|
|
SecBuffer
|
|
|
|
SecBufferDesc
|
|
|
|
SecLabelItem
|
|
|
|
SecLabelStmt
|
2017-05-17 21:52:16 +02:00
|
|
|
SeenRelsEntry
|
2020-05-14 19:06:38 +02:00
|
|
|
SelectLimit
|
2010-02-26 02:55:35 +01:00
|
|
|
SelectStmt
|
|
|
|
Selectivity
|
2019-05-22 18:55:34 +02:00
|
|
|
SemTPadded
|
2012-06-10 21:15:31 +02:00
|
|
|
SemiAntiJoinFactors
|
2010-02-26 02:55:35 +01:00
|
|
|
SeqScan
|
|
|
|
SeqScanState
|
|
|
|
SeqTable
|
|
|
|
SeqTableData
|
2011-04-09 05:11:37 +02:00
|
|
|
SerCommitSeqNo
|
2020-05-16 17:49:14 +02:00
|
|
|
SerialControl
|
Implement genuine serializable isolation level.
Until now, our Serializable mode has in fact been what's called Snapshot
Isolation, which allows some anomalies that could not occur in any
serialized ordering of the transactions. This patch fixes that using a
method called Serializable Snapshot Isolation, based on research papers by
Michael J. Cahill (see README-SSI for full references). In Serializable
Snapshot Isolation, transactions run like they do in Snapshot Isolation,
but a predicate lock manager observes the reads and writes performed and
aborts transactions if it detects that an anomaly might occur. This method
produces some false positives, i.e. it sometimes aborts transactions even
though there is no anomaly.
To track reads we implement predicate locking, see storage/lmgr/predicate.c.
Whenever a tuple is read, a predicate lock is acquired on the tuple. Shared
memory is finite, so when a transaction takes many tuple-level locks on a
page, the locks are promoted to a single page-level lock, and further to a
single relation-level lock if necessary. To lock key values with no matching
tuple, a sequential scan always takes a relation-level lock, and an index
scan acquires a page-level lock that covers the search key, whether or not
there are any matching keys at the moment.
A predicate lock doesn't conflict with any regular locks or with other
predicate locks in the normal sense. They're only used by the predicate lock
manager to detect the danger of anomalies. Only serializable transactions
participate in predicate locking, so there should be no extra overhead
for other transactions.
Predicate locks can't be released at commit, but must be remembered until
all the transactions that overlapped with them have completed. That means that
we need to remember an unbounded number of predicate locks, so we apply a
lossy but conservative method of tracking locks for committed transactions.
If we run short of shared memory, we overflow to a new "pg_serial" SLRU
pool.
We don't currently allow Serializable transactions in Hot Standby mode.
That would be hard, because even read-only transactions can cause anomalies
that wouldn't otherwise occur.
Serializable isolation mode now means the new fully serializable level.
Repeatable Read gives you the old Snapshot Isolation level that we have
always had.
Kevin Grittner and Dan Ports, reviewed by Jeff Davis, Heikki Linnakangas and
Anssi Kääriäinen
2011-02-07 22:46:51 +01:00
|
|
|
SerializableXactHandle
|
2019-03-27 22:59:19 +01:00
|
|
|
SerializedActiveRelMaps
|
Allow parallel workers to retrieve some data from Port
This commit moves authn_id into a new global structure called
ClientConnectionInfo (mapping to a MyClientConnectionInfo for each
backend) which is intended to hold all the client information that
should be shared between the backend and any of its parallel workers,
access for extensions and triggers being the primary use case. There is
no need to push all the data of Port to the workers, and authn_id is
quite a generic concept so using a separate structure provides the best
balance (the name of the structure has been suggested by Robert Haas).
While on it, and per discussion as this would be useful for a potential
SYSTEM_USER that can be accessed through parallel workers, a second
field is added for the authentication method, copied directly from
Port.
ClientConnectionInfo is serialized and restored using a new parallel
key and a structure tracks the length of the authn_id, making the
addition of more fields straight-forward.
Author: Jacob Champion
Reviewed-by: Bertrand Drouvot, Stephen Frost, Robert Haas, Tom Lane,
Michael Paquier, Julien Rouhaud
Discussion: https://postgr.es/m/793d990837ae5c06a558d58d62de9378ab525d83.camel@vmware.com
2022-08-24 05:57:13 +02:00
|
|
|
SerializedClientConnectionInfo
|
2019-03-27 22:59:19 +01:00
|
|
|
SerializedRanges
|
2018-04-26 20:45:04 +02:00
|
|
|
SerializedReindexState
|
2015-05-24 03:20:37 +02:00
|
|
|
SerializedSnapshotData
|
2019-03-27 22:59:19 +01:00
|
|
|
SerializedTransactionState
|
2014-05-06 15:08:14 +02:00
|
|
|
Session
|
2017-05-17 21:52:16 +02:00
|
|
|
SessionBackupState
|
2021-05-12 19:14:10 +02:00
|
|
|
SessionEndType
|
2010-02-26 02:55:35 +01:00
|
|
|
SetConstraintState
|
|
|
|
SetConstraintStateData
|
|
|
|
SetConstraintTriggerData
|
Faster expression evaluation and targetlist projection.
This replaces the old, recursive tree-walk based evaluation, with
non-recursive, opcode dispatch based, expression evaluation.
Projection is now implemented as part of expression evaluation.
This both leads to significant performance improvements, and makes
future just-in-time compilation of expressions easier.
The speed gains primarily come from:
- non-recursive implementation reduces stack usage / overhead
- simple sub-expressions are implemented with a single jump, without
function calls
- sharing some state between different sub-expressions
- reduced amount of indirect/hard to predict memory accesses by laying
out operation metadata sequentially; including the avoidance of
nearly all of the previously used linked lists
- more code has been moved to expression initialization, avoiding
constant re-checks at evaluation time
Future just-in-time compilation (JIT) has become easier, as
demonstrated by released patches intended to be merged in a later
release, for primarily two reasons: Firstly, due to a stricter split
between expression initialization and evaluation, less code has to be
handled by the JIT. Secondly, due to the non-recursive nature of the
generated "instructions", less performance-critical code-paths can
easily be shared between interpreted and compiled evaluation.
The new framework allows for significant future optimizations. E.g.:
- basic infrastructure to later reduce the per executor-startup
overhead of expression evaluation, by caching state in prepared
statements. That'd be helpful in OLTPish scenarios where
initialization overhead is measurable.
- optimizing the generated "code". A number of proposals for potential
work has already been made.
- optimizing the interpreter. Similarly a number of proposals have
been made here too.
The move of logic into the expression initialization step leads to some
backward-incompatible changes:
- Function permission checks are now done during expression
initialization, whereas previously they were done during
execution. In edge cases this can lead to errors being raised that
previously wouldn't have been, e.g. a NULL array being coerced to a
different array type previously didn't perform checks.
- The set of domain constraints to be checked is now evaluated once
during expression initialization, previously it was re-built
every time a domain check was evaluated. For normal queries this
doesn't change much, but e.g. for plpgsql functions, which cache
ExprStates, the old set could stick around longer. The behavior
around this might still change.
Author: Andres Freund, with significant changes by Tom Lane,
changes by Heikki Linnakangas
Reviewed-By: Tom Lane, Heikki Linnakangas
Discussion: https://postgr.es/m/20161206034955.bh33paeralxbtluv@alap3.anarazel.de
2017-03-14 23:45:36 +01:00
|
|
|
SetExprState
|
2010-02-26 02:55:35 +01:00
|
|
|
SetFunctionReturnMode
|
|
|
|
SetOp
|
|
|
|
SetOpCmd
|
2016-04-27 17:47:28 +02:00
|
|
|
SetOpPath
|
2010-02-26 02:55:35 +01:00
|
|
|
SetOpState
|
|
|
|
SetOpStatePerGroup
|
|
|
|
SetOpStrategy
|
|
|
|
SetOperation
|
|
|
|
SetOperationStmt
|
2021-05-12 19:14:10 +02:00
|
|
|
SetQuantifier
|
2010-02-26 02:55:35 +01:00
|
|
|
SetToDefault
|
2017-05-17 21:52:16 +02:00
|
|
|
SetupWorkerPtrType
|
2016-04-27 17:47:28 +02:00
|
|
|
ShDependObjectInfo
|
2017-11-29 15:24:24 +01:00
|
|
|
SharedAggInfo
|
2017-05-17 21:52:16 +02:00
|
|
|
SharedBitmapState
|
2016-04-27 17:47:28 +02:00
|
|
|
SharedDependencyObjectType
|
2010-02-26 02:55:35 +01:00
|
|
|
SharedDependencyType
|
2016-04-27 17:47:28 +02:00
|
|
|
SharedExecutorInstrumentation
|
2017-12-02 01:30:56 +01:00
|
|
|
SharedFileSet
|
2018-04-26 20:45:04 +02:00
|
|
|
SharedHashInfo
|
2017-11-29 15:24:24 +01:00
|
|
|
SharedIncrementalSortInfo
|
2010-02-26 02:55:35 +01:00
|
|
|
SharedInvalCatalogMsg
|
|
|
|
SharedInvalCatcacheMsg
|
|
|
|
SharedInvalRelcacheMsg
|
|
|
|
SharedInvalRelmapMsg
|
|
|
|
SharedInvalSmgrMsg
|
2014-05-06 15:08:14 +02:00
|
|
|
SharedInvalSnapshotMsg
|
2010-02-26 02:55:35 +01:00
|
|
|
SharedInvalidationMessage
|
2016-04-27 17:47:28 +02:00
|
|
|
SharedJitInstrumentation
|
2021-07-14 02:43:58 +02:00
|
|
|
SharedMemoizeInfo
|
2017-09-15 04:59:21 +02:00
|
|
|
SharedRecordTableEntry
|
|
|
|
SharedRecordTableKey
|
|
|
|
SharedRecordTypmodRegistry
|
2017-11-29 15:24:24 +01:00
|
|
|
SharedSortInfo
|
2017-12-18 23:23:19 +01:00
|
|
|
SharedTuplestore
|
|
|
|
SharedTuplestoreAccessor
|
2018-04-26 20:45:04 +02:00
|
|
|
SharedTuplestoreChunk
|
|
|
|
SharedTuplestoreParticipant
|
2017-09-15 04:59:21 +02:00
|
|
|
SharedTypmodTableEntry
|
Support parallel btree index builds.
To make this work, tuplesort.c and logtape.c must also support
parallelism, so this patch adds that infrastructure and then applies
it to the particular case of parallel btree index builds. Testing
to date shows that this can often be 2-3x faster than a serial
index build.
The model for deciding how many workers to use is fairly primitive
at present, but it's better than not having the feature. We can
refine it as we get more experience.
Peter Geoghegan with some help from Rushabh Lathia. While Heikki
Linnakangas is not an author of this patch, he wrote other patches
without which this feature would not have been possible, and
therefore the release notes should possibly credit him as an author
of this feature. Reviewed by Claudio Freire, Heikki Linnakangas,
Thomas Munro, Tels, Amit Kapila, me.
Discussion: http://postgr.es/m/CAM3SWZQKM=Pzc=CAHzRixKjp2eO5Q0Jg1SoFQqeXFQ647JiwqQ@mail.gmail.com
Discussion: http://postgr.es/m/CAH2-Wz=AxWqDoVvGU7dq856S4r6sJAj6DBn7VMtigkB33N5eyg@mail.gmail.com
2018-02-02 19:25:55 +01:00
|
|
|
Sharedsort
|
2010-02-26 02:55:35 +01:00
|
|
|
ShellTypeInfo
|
2016-04-27 17:47:28 +02:00
|
|
|
ShippableCacheEntry
|
|
|
|
ShippableCacheKey
|
2010-02-26 02:55:35 +01:00
|
|
|
ShmemIndexEnt
|
2017-05-17 21:52:16 +02:00
|
|
|
ShutdownForeignScan_function
|
2012-06-10 21:15:31 +02:00
|
|
|
ShutdownInformation
|
2010-02-26 02:55:35 +01:00
|
|
|
ShutdownMode
|
|
|
|
SignTSVector
|
2016-04-27 17:47:28 +02:00
|
|
|
SimpleActionList
|
|
|
|
SimpleActionListCell
|
2010-02-26 02:55:35 +01:00
|
|
|
SimpleEcontextStackEntry
|
|
|
|
SimpleOidList
|
|
|
|
SimpleOidListCell
|
|
|
|
SimplePtrList
|
|
|
|
SimplePtrListCell
|
2016-04-27 17:47:28 +02:00
|
|
|
SimpleStats
|
2010-02-26 02:55:35 +01:00
|
|
|
SimpleStringList
|
|
|
|
SimpleStringListCell
|
2012-06-10 21:15:31 +02:00
|
|
|
SingleBoundSortItem
|
2010-02-26 02:55:35 +01:00
|
|
|
Size
|
2020-10-22 14:44:18 +02:00
|
|
|
SkipPages
|
Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, and larger
allocations always go directly to libc rather than having their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There are likely further potential uses of this allocator besides
reorderbuffer.c.
There are potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 12:41:44 +01:00
|
|
|
SlabBlock
|
2017-05-17 21:52:16 +02:00
|
|
|
SlabContext
|
2016-12-13 16:51:32 +01:00
|
|
|
SlabSlot
|
2011-04-09 05:11:37 +02:00
|
|
|
SlotNumber
|
2010-02-26 02:55:35 +01:00
|
|
|
SlruCtl
|
|
|
|
SlruCtlData
|
|
|
|
SlruErrorCause
|
|
|
|
SlruPageStatus
|
2011-11-14 18:12:23 +01:00
|
|
|
SlruScanCallback
|
2010-02-26 02:55:35 +01:00
|
|
|
SlruShared
|
|
|
|
SlruSharedData
|
Defer flushing of SLRU files.
Previously, we called fsync() after writing out individual pg_xact,
pg_multixact and pg_commit_ts pages due to cache pressure, leading to
regular I/O stalls in user backends and recovery. Collapse requests for
the same file into a single system call as part of the next checkpoint,
as we already did for relation files, using the infrastructure developed
by commit 3eb77eba. This can cause a significant improvement to
recovery performance, especially when it's otherwise CPU-bound.
Hoist ProcessSyncRequests() up into CheckPointGuts() to make it clearer
that it applies to all the SLRU mini-buffer-pools as well as the main
buffer pool. Rearrange things so that data collected in CheckpointStats
includes SLRU activity.
Also remove the Shutdown{CLOG,CommitTS,SUBTRANS,MultiXact}() functions,
because they were redundant after the shutdown checkpoint that
immediately precedes them. (I'm not sure if they were ever needed, but
they aren't now.)
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> (parts)
Tested-by: Jakub Wartak <Jakub.Wartak@tomtom.com>
Discussion: https://postgr.es/m/CA+hUKGLJ=84YT+NvhkEEDAuUtVHMfQ9i-N7k_o50JmQ6Rpj_OQ@mail.gmail.com
2020-09-25 08:49:43 +02:00
|
|
|
SlruWriteAll
|
|
|
|
SlruWriteAllData
|
2014-05-06 15:08:14 +02:00
|
|
|
SnapBuild
|
|
|
|
SnapBuildOnDisk
|
|
|
|
SnapBuildState
|
2010-02-26 02:55:35 +01:00
|
|
|
Snapshot
|
|
|
|
SnapshotData
|
2019-01-22 02:03:15 +01:00
|
|
|
SnapshotType
|
2010-02-26 02:55:35 +01:00
|
|
|
SockAddr
|
|
|
|
Sort
|
|
|
|
SortBy
|
|
|
|
SortByDir
|
|
|
|
SortByNulls
|
2018-02-02 19:25:55 +01:00
|
|
|
SortCoordinate
|
2010-02-26 02:55:35 +01:00
|
|
|
SortGroupClause
|
2017-05-17 21:52:16 +02:00
|
|
|
SortItem
|
2016-04-27 17:47:28 +02:00
|
|
|
SortPath
|
2012-06-10 21:15:31 +02:00
|
|
|
SortShimExtra
|
2010-02-26 02:55:35 +01:00
|
|
|
SortState
|
2012-06-10 21:15:31 +02:00
|
|
|
SortSupport
|
|
|
|
SortSupportData
|
2010-02-26 02:55:35 +01:00
|
|
|
SortTuple
|
2012-06-10 21:15:31 +02:00
|
|
|
SortTupleComparator
|
|
|
|
SortedPoint
|
|
|
|
SpGistBuildState
|
|
|
|
SpGistCache
|
|
|
|
SpGistDeadTuple
|
|
|
|
SpGistDeadTupleData
|
|
|
|
SpGistInnerTuple
|
|
|
|
SpGistInnerTupleData
|
|
|
|
SpGistLUPCache
|
|
|
|
SpGistLastUsedPage
|
|
|
|
SpGistLeafTuple
|
|
|
|
SpGistLeafTupleData
|
|
|
|
SpGistMetaPageData
|
|
|
|
SpGistNodeTuple
|
|
|
|
SpGistNodeTupleData
|
2019-11-25 01:40:53 +01:00
|
|
|
SpGistOptions
|
2012-06-10 21:15:31 +02:00
|
|
|
SpGistPageOpaque
|
|
|
|
SpGistPageOpaqueData
|
|
|
|
SpGistScanOpaque
|
|
|
|
SpGistScanOpaqueData
|
2011-04-09 05:11:37 +02:00
|
|
|
SpGistSearchItem
|
2012-06-10 21:15:31 +02:00
|
|
|
SpGistState
|
|
|
|
SpGistTypeDesc
|
2010-02-26 02:55:35 +01:00
|
|
|
SpecialJoinInfo
|
Allow Pin/UnpinBuffer to operate in a lockfree manner.
Pinning/Unpinning a buffer is a very frequent operation; especially in
read-mostly cache resident workloads. Benchmarking shows that in various
scenarios the spinlock protecting a buffer header's state becomes a
significant bottleneck. The problem can be reproduced with pgbench -S on
larger machines, but can be considerably worse for queries which touch
the same buffers over and over at a high frequency (e.g. nested loops
over a small inner table).
To allow atomic operations to be used, cram BufferDesc's flags,
usage_count, buf_hdr_lock, refcount into a single 32bit atomic variable;
that allows manipulating them together using 32bit compare-and-swap
operations. This requires reducing MAX_BACKENDS to 2^18-1 (which could
be lifted by using a 64bit field, but it's not a realistic configuration
atm).
As not all operations can easily be implemented in a lockfree manner,
implement the previous buf_hdr_lock via a flag bit in the atomic
variable. That way we can continue to lock the header in places where
it's needed, but can get away without acquiring it in the more frequent
hot-paths. There's some additional operations which can be done without
the lock, but aren't in this patch; but the most important places are
covered.
As bufmgr.c now essentially re-implements spinlocks, abstract the delay
logic from s_lock.c into something more generic. It already has two
users, and more are coming up; there's a follow-up patch for lwlock.c at
least.
This patch is based on a proof-of-concept written by me, which Alexander
Korotkov made into a fully working patch; the committed version is again
revised by me. Benchmarking and testing has, amongst others, been
provided by Dilip Kumar, Alexander Korotkov, Robert Haas.
On a large x86 system improvements for readonly pgbench, with a high
client count, of a factor of 8 have been observed.
Author: Alexander Korotkov and Andres Freund
Discussion: 2400449.GjM57CE0Yg@dinodell
2016-04-11 05:12:32 +02:00
|
|
|
SpinDelayStatus
|
2011-11-14 18:12:23 +01:00
|
|
|
SplitInterval
|
2012-06-10 21:15:31 +02:00
|
|
|
SplitLR
|
2019-05-22 18:55:34 +02:00
|
|
|
SplitPoint
|
2017-05-17 21:52:16 +02:00
|
|
|
SplitTextOutputData
|
2010-02-26 02:55:35 +01:00
|
|
|
SplitVar
|
|
|
|
SplitedPageLayout
|
|
|
|
StackElem
|
2017-05-17 21:52:16 +02:00
|
|
|
StartBlobPtrType
|
|
|
|
StartBlobsPtrType
|
|
|
|
StartDataPtrType
|
2011-04-09 05:11:37 +02:00
|
|
|
StartReplicationCmd
|
2016-04-27 17:47:28 +02:00
|
|
|
StartupStatusEnum
|
2010-02-26 02:55:35 +01:00
|
|
|
StatEntry
|
2017-05-17 21:52:16 +02:00
|
|
|
StatExtEntry
|
2015-05-24 03:20:37 +02:00
|
|
|
StateFileChunk
|
2017-05-17 21:52:16 +02:00
|
|
|
StatisticExtInfo
|
2021-05-12 19:14:10 +02:00
|
|
|
StatsBuildData
|
2016-04-27 17:47:28 +02:00
|
|
|
StatsData
|
2021-05-12 19:14:10 +02:00
|
|
|
StatsElem
|
2017-05-17 21:52:16 +02:00
|
|
|
StatsExtInfo
|
2010-02-26 02:55:35 +01:00
|
|
|
StdAnalyzeData
|
2021-06-28 17:05:54 +02:00
|
|
|
StdRdOptIndexCleanup
|
2010-02-26 02:55:35 +01:00
|
|
|
StdRdOptions
|
2014-05-06 15:08:14 +02:00
|
|
|
Step
|
2010-02-26 02:55:35 +01:00
|
|
|
StopList
|
|
|
|
StrategyNumber
|
2016-04-27 17:47:28 +02:00
|
|
|
StreamCtl
|
Add support for streaming to built-in logical replication.
To add support for streaming of in-progress transactions into the
built-in logical replication, we need to do three things:
* Extend the logical replication protocol to identify in-progress
transactions, and allow adding additional bits of information (e.g.
XID of subtransactions).
* Modify the output plugin (pgoutput) to implement the new stream
API callbacks, by leveraging the extended replication protocol.
* Modify the replication apply worker, to properly handle streamed
in-progress transaction by spilling the data to disk and then
replaying them on commit.
We however must explicitly disable streaming replication during
replication slot creation, even if the plugin supports it. We
don't need to replicate the changes accumulated during this phase,
and moreover we don't have a replication connection open so we
don't have anywhere to send the data anyway.
Author: Tomas Vondra, Dilip Kumar and Amit Kapila
Reviewed-by: Amit Kapila, Kuntal Ghosh and Ajin Cherian
Tested-by: Neha Sharma, Mahendra Singh Thalor and Ajin Cherian
Discussion: https://postgr.es/m/688b0b7f-2f6c-d827-c27b-216a8e3ea700@2ndquadrant.com
2020-09-03 04:24:07 +02:00
|
|
|
String
|
2010-02-26 02:55:35 +01:00
|
|
|
StringInfo
|
|
|
|
StringInfoData
|
2015-05-24 03:20:37 +02:00
|
|
|
StripnullState
|
2010-02-26 02:55:35 +01:00
|
|
|
SubLink
|
|
|
|
SubLinkType
|
2021-07-06 04:16:50 +02:00
|
|
|
SubOpts
|
2010-02-26 02:55:35 +01:00
|
|
|
SubPlan
|
|
|
|
SubPlanState
|
Allow multiple xacts during table sync in logical replication.
For the initial table data synchronization in logical replication, we use
a single transaction to copy the entire table and then synchronize the
position in the stream with the main apply worker.
There are multiple downsides of this approach: (a) We have to perform the
entire copy operation again if there is any error (network breakdown,
error in the database operation, etc.) while we synchronize the WAL
position between tablesync worker and apply worker; this will be onerous
especially for large copies, (b) Using a single transaction in the
synchronization-phase (where we can receive WAL from multiple
transactions) will have the risk of exceeding the CID limit, (c) The slot
will hold the WAL till the entire sync is complete because we never commit
till the end.
This patch solves all the above downsides by allowing multiple
transactions during the tablesync phase. The initial copy is done in a
single transaction and after that, we commit each transaction as we
receive. To allow recovery after any error or crash, we use a permanent
slot and origin to track the progress. The slot and origin will be removed
once we finish the synchronization of the table. We also remove slot and
origin of tablesync workers if the user performs DROP SUBSCRIPTION .. or
ALTER SUBSCRIPTION .. REFRESH and some of the table syncs are still not
finished.
The commands ALTER SUBSCRIPTION ... REFRESH PUBLICATION and
ALTER SUBSCRIPTION ... SET PUBLICATION ... with refresh option as true
cannot be executed inside a transaction block because they can now drop
the slots for which we have no provision to rollback.
This will also open up the path for logical replication of 2PC
transactions on the subscriber side. Previously, we couldn't do that because
of the requirement of maintaining a single transaction in tablesync
workers.
Bump catalog version due to change of state in the catalog
(pg_subscription_rel).
Author: Peter Smith, Amit Kapila, and Takamichi Osumi
Reviewed-by: Ajin Cherian, Petr Jelinek, Hou Zhijie and Amit Kapila
Discussion: https://postgr.es/m/CAA4eK1KHJxaZS-fod-0fey=0tq3=Gkn4ho=8N4-5HWiCfu0H1A@mail.gmail.com
2021-02-12 03:11:51 +01:00
|
|
|
SubRemoveRels
|
2010-02-26 02:55:35 +01:00
|
|
|
SubTransactionId
|
|
|
|
SubXactCallback
|
|
|
|
SubXactCallbackItem
|
|
|
|
SubXactEvent
|
2020-11-15 06:56:31 +01:00
|
|
|
SubXactInfo
|
2010-02-26 02:55:35 +01:00
|
|
|
SubqueryScan
|
2016-04-27 17:47:28 +02:00
|
|
|
SubqueryScanPath
|
2010-02-26 02:55:35 +01:00
|
|
|
SubqueryScanState
|
|
|
|
SubqueryScanStatus
|
2017-05-17 21:52:16 +02:00
|
|
|
SubscriptExecSetup
|
|
|
|
SubscriptExecSteps
|
|
|
|
SubscriptRoutines
|
|
|
|
SubscriptTransform
|
|
|
|
SubscriptingRef
|
|
|
|
SubscriptingRefState
|
|
|
|
Subscription
|
|
|
|
SubscriptionInfo
|
|
|
|
SubscriptionRelState
|
2019-05-22 18:55:34 +02:00
|
|
|
SupportRequestCost
|
2016-04-27 17:47:28 +02:00
|
|
|
SupportRequestIndexCondition
|
2019-05-22 18:55:34 +02:00
|
|
|
SupportRequestRows
|
2010-02-26 02:55:35 +01:00
|
|
|
SupportRequestSelectivity
|
2019-05-22 18:55:34 +02:00
|
|
|
SupportRequestSimplify
|
2016-04-27 17:47:28 +02:00
|
|
|
SupportRequestWFuncMonotonic
|
2010-02-26 02:55:35 +01:00
|
|
|
Syn
|
2019-04-04 10:56:03 +02:00
|
|
|
SyncOps
|
2016-04-27 17:47:28 +02:00
|
|
|
SyncRepConfigData
|
|
|
|
SyncRepStandbyData
|
2020-09-25 08:49:43 +02:00
|
|
|
SyncRequestHandler
|
2019-04-04 10:56:03 +02:00
|
|
|
SyncRequestType
|
2021-05-12 19:14:10 +02:00
|
|
|
SysFKRelationship
|
2010-02-26 02:55:35 +01:00
|
|
|
SysScanDesc
|
|
|
|
SyscacheCallbackFunction
|
2016-04-27 17:47:28 +02:00
|
|
|
SystemRowsSamplerData
|
2015-05-24 03:20:37 +02:00
|
|
|
SystemSamplerData
|
2016-04-27 17:47:28 +02:00
|
|
|
SystemTimeSamplerData
|
2010-02-26 02:55:35 +01:00
|
|
|
TAR_MEMBER
|
|
|
|
TBMIterateResult
|
2017-05-17 21:52:16 +02:00
|
|
|
TBMIteratingState
|
2010-02-26 02:55:35 +01:00
|
|
|
TBMIterator
|
2017-05-17 21:52:16 +02:00
|
|
|
TBMSharedIterator
|
|
|
|
TBMSharedIteratorState
|
2010-02-26 02:55:35 +01:00
|
|
|
TBMStatus
|
|
|
|
TBlockState
|
|
|
|
TIDBitmap
|
tableam: Add tuple_{insert, delete, update, lock} and use.
This adds new, required, table AM callbacks for insert/delete/update
and lock_tuple. To be able to reasonably use those, the EvalPlanQual
mechanism had to be adapted, moving more logic into the AM.
Previously both delete/update/lock call-sites and the EPQ mechanism had
to have awareness of the specific tuple format to be able to fetch the
latest version of a tuple. Obviously that needs to be abstracted
away. To do so, move the logic that finds the latest row version into
the AM. lock_tuple has a new flag argument,
TUPLE_LOCK_FLAG_FIND_LAST_VERSION, that forces it to lock the last
version, rather than the current one. It'd have been possible to do
so via a separate callback as well, but finding the last version
usually also necessitates locking the newest version, making it
sensible to combine the two. This replaces the previous use of
EvalPlanQualFetch(). Additionally HeapTupleUpdated, which previously
signaled either a concurrent update or delete, is now split into two,
to avoid callers needing AM specific knowledge to differentiate.
The move of finding the latest row version into tuple_lock means that
encountering a row concurrently moved into another partition will now
raise an error about "tuple to be locked" rather than "tuple to be
updated/deleted" - which is accurate, as that always happens when
locking rows. While possibly slightly less helpful for users, it seems
like an acceptable trade-off.
As part of this commit HTSU_Result has been renamed to TM_Result, and
its members have been expanded to differentiate between updating and
deleting. HeapUpdateFailureData has been renamed to TM_FailureData.
The interface to speculative insertion is changed so nodeModifyTable.c
does not have to set the speculative token itself anymore. Instead
there's a version of tuple_insert, tuple_insert_speculative, that
performs the speculative insertion (without requiring a flag to signal
that fact), and the speculative insertion is either made permanent
with table_complete_speculative(succeeded = true) or aborted with
succeeded = false.
Note that multi_insert is not yet routed through tableam, nor is
COPY. Changing multi_insert requires changes to copy.c that are large
enough to better be done separately.
Similarly, although simpler, CREATE TABLE AS and CREATE MATERIALIZED
VIEW are also only going to be adjusted in a later commit.
Author: Andres Freund and Haribabu Kommi
Discussion:
https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
https://postgr.es/m/20190313003903.nwvrxi7rw3ywhdel@alap3.anarazel.de
https://postgr.es/m/20160812231527.GA690404@alvherre.pgsql
2019-03-24 03:55:57 +01:00
|
|
|
TM_FailureData
|
2021-05-12 19:14:10 +02:00
|
|
|
TM_IndexDelete
|
|
|
|
TM_IndexDeleteOp
|
|
|
|
TM_IndexStatus
|
2019-03-24 03:55:57 +01:00
|
|
|
TM_Result
|
2010-02-26 02:55:35 +01:00
|
|
|
TOKEN_DEFAULT_DACL
|
|
|
|
TOKEN_INFORMATION_CLASS
|
2018-04-26 20:45:04 +02:00
|
|
|
TOKEN_PRIVILEGES
|
2010-02-26 02:55:35 +01:00
|
|
|
TOKEN_USER
|
|
|
|
TParser
|
|
|
|
TParserCharTest
|
|
|
|
TParserPosition
|
|
|
|
TParserSpecial
|
|
|
|
TParserState
|
|
|
|
TParserStateAction
|
|
|
|
TParserStateActionItem
|
2016-05-02 15:23:55 +02:00
|
|
|
TQueueDestReceiver
|
2010-02-26 02:55:35 +01:00
|
|
|
TRGM
|
|
|
|
TSAnyCacheEntry
|
|
|
|
TSConfigCacheEntry
|
|
|
|
TSConfigInfo
|
|
|
|
TSDictInfo
|
|
|
|
TSDictionaryCacheEntry
|
2017-05-17 21:52:16 +02:00
|
|
|
TSExecuteCallback
|
2010-02-26 02:55:35 +01:00
|
|
|
TSLexeme
|
|
|
|
TSParserCacheEntry
|
|
|
|
TSParserInfo
|
|
|
|
TSQuery
|
|
|
|
TSQueryData
|
|
|
|
TSQueryParserState
|
|
|
|
TSQuerySign
|
|
|
|
TSReadPointer
|
|
|
|
TSTemplateInfo
|
2014-05-06 15:08:14 +02:00
|
|
|
TSTernaryValue
|
2010-02-26 02:55:35 +01:00
|
|
|
TSTokenTypeStorage
|
|
|
|
TSVector
|
2017-05-17 21:52:16 +02:00
|
|
|
TSVectorBuildState
|
2010-02-26 02:55:35 +01:00
|
|
|
TSVectorData
|
|
|
|
TSVectorParseState
|
|
|
|
TSVectorStat
|
|
|
|
TState
|
2022-05-12 21:17:30 +02:00
|
|
|
TStatus
|
2010-02-26 02:55:35 +01:00
|
|
|
TStoreState
|
2020-05-14 19:06:38 +02:00
|
|
|
TXNEntryFile
|
2010-02-26 02:55:35 +01:00
|
|
|
TYPCATEGORY
|
2013-05-29 22:58:43 +02:00
|
|
|
T_Action
|
|
|
|
T_WorkerStatus
|
tableam: introduce table AM infrastructure.
This introduces the concept of table access methods, i.e. CREATE
ACCESS METHOD ... TYPE TABLE and
CREATE TABLE ... USING (storage-engine).
No table access functionality is delegated to table AMs as of this
commit, that'll be done in following commits.
Subsequent commits will incrementally abstract table access
functionality to be routed through table access methods. That change
is too large to be reviewed & committed at once, so it'll be done
incrementally.
Docs will be updated at the end, as adding them incrementally would
likely make them less coherent, and definitely is a lot more work,
without a lot of benefit.
Table access methods are specified similar to index access methods,
i.e. pg_am.amhandler returns, as INTERNAL, a pointer to a struct with
callbacks. In contrast to index AMs that struct needs to live as long
as a backend, typically that's achieved by just returning a pointer to
a constant struct.
Psql's \d+ now displays a table's access method. That can be disabled
with HIDE_TABLEAM=true, which is mainly useful so regression tests can
be run against different AMs. It's quite possible that this behaviour
still needs to be fine-tuned.
For now it's not allowed to set a table AM for a partitioned table, as
we've not resolved how partitions would inherit that. Disallowing
allows us to introduce, if we decide that's the way forward, such a
behaviour without a compatibility break.
Catversion bumped, to add the heap table AM and references to it.
Author: Haribabu Kommi, Andres Freund, Alvaro Herrera, Dimitri Golgov and others
Discussion:
https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
https://postgr.es/m/20160812231527.GA690404@alvherre.pgsql
https://postgr.es/m/20190107235616.6lur25ph22u5u5av@alap3.anarazel.de
https://postgr.es/m/20190304234700.w5tmhducs5wxgzls@alap3.anarazel.de
2019-03-06 18:54:38 +01:00
|
|
|
TableAmRoutine
|
2010-02-26 02:55:35 +01:00
|
|
|
TableAttachInfo
|
|
|
|
TableDataInfo
|
2017-05-17 21:52:16 +02:00
|
|
|
TableFunc
|
|
|
|
TableFuncRoutine
|
|
|
|
TableFuncScan
|
|
|
|
TableFuncScanState
|
2010-02-26 02:55:35 +01:00
|
|
|
TableInfo
|
2012-06-10 21:15:31 +02:00
|
|
|
TableLikeClause
|
2015-05-24 03:20:37 +02:00
|
|
|
TableSampleClause
|
tableam: Add and use scan APIs.
To allow table accesses to not be directly dependent on heap, several
new abstractions are needed. Specifically:
1) Heap scans need to be generalized into table scans. Do this by
introducing TableScanDesc, which will be the "base class" for
individual AMs. This contains the AM independent fields from
HeapScanDesc.
The previous heap_{beginscan,rescan,endscan} et al. have been
replaced with a table_ version.
There's no direct replacement for heap_getnext(), as that returned
a HeapTuple, which is undesirable for other AMs. Instead there's
table_scan_getnextslot(). But note that heap_getnext() lives on,
it's still used widely to access catalog tables.
This is achieved by new scan_begin, scan_end, scan_rescan,
scan_getnextslot callbacks.
2) The portion of parallel scans that's shared between backends needs
to work without the user doing per-AM work. To achieve
that new parallelscan_{estimate, initialize, reinitialize}
callbacks are introduced, which operate on a new
ParallelTableScanDesc, which again can be subclassed by AMs.
As it is likely that several AMs are going to be block oriented,
block oriented callbacks that can be shared between such AMs are
provided and used by heap. table_block_parallelscan_{estimate,
initialize, reinitialize} as callbacks, and
table_block_parallelscan_{nextpage, init} for use in AMs. These
operate on a ParallelBlockTableScanDesc.
3) Index scans need to be able to access tables to return a tuple, and
there needs to be state across individual accesses to the heap to
store state like buffers. That's now handled by introducing a
sort-of-scan IndexFetchTable, which again is intended to be
subclassed by individual AMs (for heap IndexFetchHeap).
The relevant callbacks for an AM are index_fetch_{end, begin,
reset} to create the necessary state, and index_fetch_tuple to
retrieve an indexed tuple. Note that index_fetch_tuple
implementations need to be smarter than just blindly fetching the
tuples for AMs that have optimizations similar to heap's HOT - the
currently alive tuple in the update chain needs to be fetched if
appropriate.
Similar to table_scan_getnextslot(), it's undesirable to continue
to return HeapTuples. Thus index_fetch_heap (might want to rename
that later) now accepts a slot as an argument. Core code doesn't
have a lot of call sites performing index scans without going
through the systable_* API (in contrast to loads of heap_getnext
calls and working directly with HeapTuples).
Index scans now store the result of a search in
IndexScanDesc->xs_heaptid, rather than xs_ctup->t_self. As the
target is not generally a HeapTuple anymore that seems cleaner.
To be able to sensibly adapt code to use the above, two further
callbacks have been introduced:
a) slot_callbacks returns a TupleTableSlotOps* suitable for creating
slots capable of holding a tuple of the AMs
type. table_slot_callbacks() and table_slot_create() are based
upon that, but have additional logic to deal with views, foreign
tables, etc.
While this change could have been done separately, nearly all the
call sites that needed to be adapted for the rest of this commit
would also have needed to be adapted for
table_slot_callbacks(), making separation not worthwhile.
b) tuple_satisfies_snapshot checks whether the tuple in a slot is
currently visible according to a snapshot. That's required as a few
places now don't have a buffer + HeapTuple around, but a
slot (which in heap's case internally has that information).
Additionally a few infrastructure changes were needed:
I) SysScanDesc, as used by systable_{beginscan, getnext} et al. now
internally uses a slot to keep track of tuples. While
systable_getnext() still returns HeapTuples, and will do so for the
foreseeable future, the index API (see 1) above) now only deals with
slots.
The remainder, and largest part, of this commit is then adjusting all
scans in postgres to use the new APIs.
Author: Andres Freund, Haribabu Kommi, Alvaro Herrera
Discussion:
https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
https://postgr.es/m/20160812231527.GA690404@alvherre.pgsql
2019-03-11 20:46:41 +01:00
|
|
|
TableScanDesc
|
|
|
|
TableScanDescData
|
2010-02-26 02:55:35 +01:00
|
|
|
TableSpaceCacheEntry
|
|
|
|
TableSpaceOpts
|
2014-05-06 15:08:14 +02:00
|
|
|
TablespaceList
|
|
|
|
TablespaceListCell
|
2017-05-17 21:52:16 +02:00
|
|
|
TapeBlockTrailer
|
Support parallel btree index builds.
To make this work, tuplesort.c and logtape.c must also support
parallelism, so this patch adds that infrastructure and then applies
it to the particular case of parallel btree index builds. Testing
to date shows that this can often be 2-3x faster than a serial
index build.
The model for deciding how many workers to use is fairly primitive
at present, but it's better than not having the feature. We can
refine it as we get more experience.
Peter Geoghegan with some help from Rushabh Lathia. While Heikki
Linnakangas is not an author of this patch, he wrote other patches
without which this feature would not have been possible, and
therefore the release notes should possibly credit him as an author
of this feature. Reviewed by Claudio Freire, Heikki Linnakangas,
Thomas Munro, Tels, Amit Kapila, me.
Discussion: http://postgr.es/m/CAM3SWZQKM=Pzc=CAHzRixKjp2eO5Q0Jg1SoFQqeXFQ647JiwqQ@mail.gmail.com
Discussion: http://postgr.es/m/CAH2-Wz=AxWqDoVvGU7dq856S4r6sJAj6DBn7VMtigkB33N5eyg@mail.gmail.com
2018-02-02 19:25:55 +01:00
|
|
|
TapeShare
|
2016-12-13 16:51:32 +01:00
|
|
|
TarMethodData
|
|
|
|
TarMethodFile
|
2010-02-26 02:55:35 +01:00
|
|
|
TargetEntry
|
2016-04-27 17:47:28 +02:00
|
|
|
TclExceptionNameMap
|
2010-02-26 02:55:35 +01:00
|
|
|
Tcl_DString
|
|
|
|
Tcl_FileProc
|
|
|
|
Tcl_HashEntry
|
|
|
|
Tcl_HashTable
|
|
|
|
Tcl_Interp
|
|
|
|
Tcl_NotifierProcs
|
2016-04-27 17:47:28 +02:00
|
|
|
Tcl_Obj
|
2010-02-26 02:55:35 +01:00
|
|
|
Tcl_Time
|
2011-04-09 05:11:37 +02:00
|
|
|
TempNamespaceStatus
|
2014-05-06 15:08:14 +02:00
|
|
|
TestDecodingData
|
2020-11-17 07:44:53 +01:00
|
|
|
TestDecodingTxnData
|
2014-05-06 15:08:14 +02:00
|
|
|
TestSpec
|
2010-02-26 02:55:35 +01:00
|
|
|
TextFreq
|
|
|
|
TextPositionState
|
|
|
|
TheLexeme
|
|
|
|
TheSubstitute
|
2017-05-17 21:52:16 +02:00
|
|
|
TidExpr
|
2021-05-12 19:14:10 +02:00
|
|
|
TidExprType
|
2010-02-26 02:55:35 +01:00
|
|
|
TidHashKey
|
2021-05-12 19:14:10 +02:00
|
|
|
TidOpExpr
|
2010-02-26 02:55:35 +01:00
|
|
|
TidPath
|
2021-05-12 19:14:10 +02:00
|
|
|
TidRangePath
|
|
|
|
TidRangeScan
|
2010-02-26 02:55:35 +01:00
|
|
|
TidRangeScanState
|
|
|
|
TidScan
|
|
|
|
TidScanState
|
|
|
|
TimeADT
|
2013-05-29 22:58:43 +02:00
|
|
|
TimeLineHistoryCmd
|
|
|
|
TimeLineHistoryEntry
|
2010-02-26 02:55:35 +01:00
|
|
|
TimeLineID
|
|
|
|
TimeOffset
|
2011-04-09 05:11:37 +02:00
|
|
|
TimeStamp
|
2010-02-26 02:55:35 +01:00
|
|
|
TimeTzADT
|
2011-04-09 05:11:37 +02:00
|
|
|
TimeZoneAbbrevTable
|
2013-05-29 22:58:43 +02:00
|
|
|
TimeoutId
|
|
|
|
TimeoutType
|
2010-02-26 02:55:35 +01:00
|
|
|
Timestamp
|
|
|
|
TimestampTz
|
|
|
|
TmFromChar
|
|
|
|
TmToChar
|
2020-03-23 22:54:33 +01:00
|
|
|
ToastAttrInfo
|
2015-05-24 03:20:37 +02:00
|
|
|
ToastCompressionId
|
2020-03-23 22:54:33 +01:00
|
|
|
ToastTupleContext
|
amcheck: fix multiple problems with TOAST pointer validation
First, don't perform database access while holding a buffer lock.
When checking a heap, we can validate that TOAST pointers are sane by
performing a scan on the TOAST index and looking up the chunks that
correspond to each value ID that appears in a TOAST pointer in the main
table. But, to do that while holding a buffer lock at least risks
causing other backends to wait uninterruptibly, and probably can cause
undetected and uninterruptible deadlocks. So, instead, make a list of
checks to perform while holding the lock, and then perform the checks
after releasing it.
Second, adjust things so that we don't try to follow TOAST pointers
for tuples that are already eligible to be pruned. The TOAST tuples
become eligible for pruning at the same time that the main tuple does,
so trying to check them may lead to spurious reports of corruption,
as observed in the buildfarm. The necessary infrastructure to decide
whether or not the tuple being checked is prunable was added by
commit 3b6c1259f9ca8e21860aaf24ec6735a8e5598ea0, but it wasn't
actually used for its intended purpose prior to this patch.
Mark Dilger, adjusted by me to avoid a memory leak.
Discussion: http://postgr.es/m/AC5479E4-6321-473D-AC92-5EC36299FBC2@enterprisedb.com
2021-04-07 19:28:35 +02:00
|
|
|
ToastedAttribute
|
2010-02-26 02:55:35 +01:00
|
|
|
TocEntry
|
|
|
|
TokenAuxData
|
Refactor code related to pg_hba_file_rules() into new file
hba.c is growing big, and more contents are planned for it. In order to
prepare for this future work, this commit moves all the code related to
the system function processing the contents of pg_hba.conf,
pg_hba_file_rules() to a new file called hbafuncs.c, which will be used
as the location for the SQL portion of the authentication file parsing.
While at it, HbaToken, the structure holding a string token lexed from a
configuration file related to authentication, is renamed to a more
generic AuthToken, as it gets used not only for pg_hba.conf, but also
for pg_ident.conf. TokenizedLine is now named TokenizedAuthLine.
The size of hba.c is reduced by ~12%.
Author: Julien Rouhaud
Reviewed-by: Aleksander Alekseev, Michael Paquier
Discussion: https://postgr.es/m/20220223045959.35ipdsvbxcstrhya@jrouhaud
2022-03-24 04:42:30 +01:00
|
|
|
TokenizedAuthLine
|
2010-02-26 02:55:35 +01:00
|
|
|
TrackItem
|
Perform apply of large transactions by parallel workers.
Currently, for large transactions, the publisher sends the data in
multiple streams (changes divided into chunks depending upon
logical_decoding_work_mem), and then on the subscriber-side, the apply
worker writes the changes into temporary files and once it receives the
commit, it reads from those files and applies the entire transaction. To
improve the performance of such transactions, we can instead allow them to
be applied via parallel workers.
In this approach, we assign a new parallel apply worker (if available) as
soon as the xact's first stream is received and the leader apply worker
will send changes to this new worker via shared memory. The parallel apply
worker will directly apply the change instead of writing it to temporary
files. However, if the leader apply worker times out while attempting to
send a message to the parallel apply worker, it will switch to
"partial serialize" mode - in this mode, the leader serializes all
remaining changes to a file and notifies the parallel apply workers to
read and apply them at the end of the transaction. We use a non-blocking
way to send the messages from the leader apply worker to the parallel
apply worker to avoid deadlocks. We keep this parallel apply worker assigned until the
transaction commit is received and also wait for the worker to finish at
commit. This preserves commit ordering and avoids writing to and reading
from files in most cases. We still need to spill if there is no worker
available.
This patch also extends the SUBSCRIPTION 'streaming' parameter so that the
user can control whether to apply the streaming transaction in a parallel
apply worker or spill the change to disk. The user can set the streaming
parameter to 'on/off', or 'parallel'. The parameter value 'parallel' means
the streaming will be applied via a parallel apply worker, if available.
The parameter value 'on' means the streaming transaction will be spilled
to disk. The default value is 'off' (same as current behaviour).
In addition, the patch extends the logical replication STREAM_ABORT
message so that abort_lsn and abort_time can also be sent which can be
used to update the replication origin in parallel apply worker when the
streaming transaction is aborted. Because this message extension is needed
to support parallel streaming, parallel streaming is not supported for
publications on servers < PG16.
Author: Hou Zhijie, Wang wei, Amit Kapila with design inputs from Sawada Masahiko
Reviewed-by: Sawada Masahiko, Peter Smith, Dilip Kumar, Shi yu, Kuroda Hayato, Shveta Mallik
Discussion: https://postgr.es/m/CAA4eK1+wyN6zpaHUkCLorEWNx75MG0xhMwcFhvjqm2KURZEAGw@mail.gmail.com
2023-01-09 02:30:39 +01:00
|
|
|
TransApplyAction
|
2010-02-26 02:55:35 +01:00
|
|
|
TransInvalidationInfo
|
|
|
|
TransState
|
|
|
|
TransactionId
|
|
|
|
TransactionState
|
|
|
|
TransactionStateData
|
|
|
|
TransactionStmt
|
|
|
|
TransactionStmtKind
|
2015-05-24 03:20:37 +02:00
|
|
|
TransformInfo
|
2017-05-17 21:52:16 +02:00
|
|
|
TransformJsonStringValuesState
|
2010-02-26 02:55:35 +01:00
|
|
|
TransitionCaptureState
|
2013-05-29 22:58:43 +02:00
|
|
|
TrgmArc
|
|
|
|
TrgmArcInfo
|
2018-04-26 20:45:04 +02:00
|
|
|
TrgmBound
|
2013-05-29 22:58:43 +02:00
|
|
|
TrgmColor
|
|
|
|
TrgmColorInfo
|
Implement operator class parameters
PostgreSQL provides a set of template index access methods, where opclasses have
much freedom in the semantics of indexing. These index AMs are GiST, GIN,
SP-GiST and BRIN. In them, opclasses define the representation of keys, operations on
them and supported search strategies. So, it's natural that opclasses may be
faced with some tradeoffs, which require a user-side decision. This commit implements
opclass parameters allowing users to set some values, which tell opclass how to
index the particular dataset.
This commit doesn't introduce new storage in system catalog. Instead it uses
pg_attribute.attoptions, which is used for table column storage options but
unused for index attributes.
In order to avoid changing the signature of each opclass support function, we
implement a unified way to pass options to opclass support functions. Options
are set to fn_expr as the constant bytea expression. It's possible due to the
fact that opclass support functions are executed outside of expressions, so
fn_expr is unused for them.
This commit comes with some examples of opclass options usage. We parametrize
signature length in GiST. That applies to multiple opclasses: tsvector_ops,
gist__intbig_ops, gist_ltree_ops, gist__ltree_ops, gist_trgm_ops and
gist_hstore_ops. Also we parametrize maximum number of integer ranges for
gist__int_ops. However, the main future usage of this feature is expected
to be json, where users would be able to specify which way to index particular
json parts.
Catversion is bumped.
Discussion: https://postgr.es/m/d22c3a18-31c7-1879-fc11-4c1ce2f5e5af%40postgrespro.ru
Author: Nikita Glukhov, revised by me
Reviewed-by: Nikolay Shaplov, Robert Haas, Tom Lane, Tomas Vondra, Alvaro Herrera
2020-03-30 18:17:11 +02:00
|
|
|
TrgmGistOptions
|
2013-05-29 22:58:43 +02:00
|
|
|
TrgmNFA
|
|
|
|
TrgmPackArcInfo
|
|
|
|
TrgmPackedArc
|
|
|
|
TrgmPackedGraph
|
|
|
|
TrgmPackedState
|
|
|
|
TrgmPrefix
|
|
|
|
TrgmState
|
|
|
|
TrgmStateKey
|
|
|
|
TrieChar
|
2010-02-26 02:55:35 +01:00
|
|
|
Trigger
|
|
|
|
TriggerData
|
|
|
|
TriggerDesc
|
|
|
|
TriggerEvent
|
|
|
|
TriggerFlags
|
|
|
|
TriggerInfo
|
2016-12-13 16:51:32 +01:00
|
|
|
TriggerTransition
|
2010-02-26 02:55:35 +01:00
|
|
|
TruncateStmt
|
2016-04-27 17:47:28 +02:00
|
|
|
TsmRoutine
|
2010-02-26 02:55:35 +01:00
|
|
|
TupOutputState
|
|
|
|
TupSortStatus
|
|
|
|
TupStoreStatus
|
|
|
|
TupleConstr
|
|
|
|
TupleConversionMap
|
|
|
|
TupleDesc
|
|
|
|
TupleHashEntry
|
|
|
|
TupleHashEntryData
|
|
|
|
TupleHashIterator
|
|
|
|
TupleHashTable
|
Modify tqueue infrastructure to support transient record types.
Commit 4a4e6893aa080b9094dadbe0e65f8a75fee41ac6, which introduced this
mechanism, failed to account for the fact that the RECORD pseudo-type
uses transient typmods that are only meaningful within a single
backend. Transferring such tuples without modification between two
cooperating backends does not work. This commit installs a system
for passing the tuple descriptors over the same shm_mq being used to
send the tuples themselves. The two sides might not assign the same
transient typmod to any given tuple descriptor, so we must also
substitute the appropriate receiver-side typmod for the one used by
the sender. That adds some CPU overhead, but still seems better than
being unable to pass records between cooperating parallel processes.
Along the way, move the logic for handling multiple tuple queues from
tqueue.c to nodeGather.c; tqueue.c now provides a TupleQueueReader,
which reads from a single queue, rather than a TupleQueueFunnel, which
potentially reads from multiple queues. This change was suggested
previously as a way to make sure that nodeGather.c rather than tqueue.c
had policy control over the order in which to read from queues, but
it wasn't clear to me until now how good an idea it was. typmod
mapping needs to be performed separately for each queue, and it is
much simpler if the tqueue.c code handles that and leaves multiplexing
multiple queues to higher layers of the stack.
2015-11-06 22:58:45 +01:00
|
|
|
TupleQueueReader
|
2010-02-26 02:55:35 +01:00
|
|
|
TupleTableSlot
|
2019-03-11 20:46:41 +01:00
|
|
|
TupleTableSlotOps
|
2022-07-27 07:28:10 +02:00
|
|
|
TuplesortClusterArg
|
|
|
|
TuplesortDatumArg
|
|
|
|
TuplesortIndexArg
|
|
|
|
TuplesortIndexBTreeArg
|
|
|
|
TuplesortIndexHashArg
|
2016-04-27 17:47:28 +02:00
|
|
|
TuplesortInstrumentation
|
2017-11-29 15:24:24 +01:00
|
|
|
TuplesortMethod
|
2022-07-27 07:28:10 +02:00
|
|
|
TuplesortPublic
|
2010-02-26 02:55:35 +01:00
|
|
|
TuplesortSpaceType
|
|
|
|
Tuplesortstate
|
|
|
|
Tuplestorestate
|
|
|
|
TwoPhaseCallback
|
|
|
|
TwoPhaseFileHeader
|
|
|
|
TwoPhaseLockRecord
|
|
|
|
TwoPhasePgStatRecord
|
Implement genuine serializable isolation level.
Until now, our Serializable mode has in fact been what's called Snapshot
Isolation, which allows some anomalies that could not occur in any
serialized ordering of the transactions. This patch fixes that using a
method called Serializable Snapshot Isolation, based on research papers by
Michael J. Cahill (see README-SSI for full references). In Serializable
Snapshot Isolation, transactions run like they do in Snapshot Isolation,
but a predicate lock manager observes the reads and writes performed and
aborts transactions if it detects that an anomaly might occur. This method
produces some false positives, i.e. it sometimes aborts transactions even
though there is no anomaly.
To track reads we implement predicate locking, see storage/lmgr/predicate.c.
Whenever a tuple is read, a predicate lock is acquired on the tuple. Shared
memory is finite, so when a transaction takes many tuple-level locks on a
page, the locks are promoted to a single page-level lock, and further to a
single relation level lock if necessary. To lock key values with no matching
tuple, a sequential scan always takes a relation-level lock, and an index
scan acquires a page-level lock that covers the search key, whether or not
there are any matching keys at the moment.
A predicate lock doesn't conflict with any regular locks or with other
predicate locks in the normal sense. They're only used by the predicate lock
manager to detect the danger of anomalies. Only serializable transactions
participate in predicate locking, so there should be no extra overhead
for other transactions.
Predicate locks can't be released at commit, but must be remembered until
all the transactions that overlapped with it have completed. That means that
we need to remember an unbounded amount of predicate locks, so we apply a
lossy but conservative method of tracking locks for committed transactions.
If we run short of shared memory, we overflow to a new "pg_serial" SLRU
pool.
We don't currently allow Serializable transactions in Hot Standby mode.
That would be hard, because even read-only transactions can cause anomalies
that wouldn't otherwise occur.
Serializable isolation mode now means the new fully serializable level.
Repeatable Read gives you the old Snapshot Isolation level that we have
always had.
Kevin Grittner and Dan Ports, reviewed by Jeff Davis, Heikki Linnakangas and
Anssi Kääriäinen
2011-02-07 22:46:51 +01:00
|
|
|
TwoPhasePredicateLockRecord
|
|
|
|
TwoPhasePredicateRecord
|
2011-04-09 05:11:37 +02:00
|
|
|
TwoPhasePredicateRecordType
|
2011-02-07 22:46:51 +01:00
|
|
|
TwoPhasePredicateXactRecord
|
2010-02-26 02:55:35 +01:00
|
|
|
TwoPhaseRecordOnDisk
|
|
|
|
TwoPhaseRmgrId
|
|
|
|
TwoPhaseStateData
|
|
|
|
Type
|
|
|
|
TypeCacheEntry
|
2011-04-09 05:11:37 +02:00
|
|
|
TypeCacheEnumData
|
2010-02-26 02:55:35 +01:00
|
|
|
TypeCast
|
2017-05-17 21:52:16 +02:00
|
|
|
TypeCat
|
2010-02-26 02:55:35 +01:00
|
|
|
TypeFuncClass
|
|
|
|
TypeInfo
|
|
|
|
TypeName
|
2018-04-26 20:45:04 +02:00
|
|
|
U
|
2010-02-26 02:55:35 +01:00
|
|
|
U32
|
|
|
|
U8
|
2017-05-17 21:52:16 +02:00
|
|
|
UChar
|
|
|
|
UCharIterator
|
2019-05-22 18:55:34 +02:00
|
|
|
UColAttribute
|
|
|
|
UColAttributeValue
|
2017-05-17 21:52:16 +02:00
|
|
|
UCollator
|
|
|
|
UConverter
|
|
|
|
UErrorCode
|
2010-02-26 02:55:35 +01:00
|
|
|
UINT
|
|
|
|
ULARGE_INTEGER
|
|
|
|
ULONG
|
|
|
|
ULONG_PTR
|
|
|
|
UV
|
2017-05-17 21:52:16 +02:00
|
|
|
UVersionInfo
|
2015-05-24 03:20:37 +02:00
|
|
|
UnicodeNormalizationForm
|
|
|
|
UnicodeNormalizationQC
|
2010-02-26 02:55:35 +01:00
|
|
|
Unique
|
|
|
|
UniquePath
|
|
|
|
UniquePathMethod
|
|
|
|
UniqueState
|
|
|
|
UnlistenStmt
|
|
|
|
UnresolvedTup
|
|
|
|
UnresolvedTupData
|
2022-03-17 11:47:04 +01:00
|
|
|
UpdateContext
|
2010-02-26 02:55:35 +01:00
|
|
|
UpdateStmt
|
2016-04-27 17:47:28 +02:00
|
|
|
UpperRelationKind
|
|
|
|
UpperUniquePath
|
2010-02-26 02:55:35 +01:00
|
|
|
UserAuth
|
|
|
|
UserMapping
|
2011-04-09 05:11:37 +02:00
|
|
|
UserOpts
|
2010-02-26 02:55:35 +01:00
|
|
|
VacAttrStats
|
|
|
|
VacAttrStatsP
|
2021-12-22 03:25:14 +01:00
|
|
|
VacDeadItems
|
Introduce vacuum errcontext to display additional information.
The additional information displayed will be the block number for an error
occurring while processing the heap, and the index name for an error
occurring while processing an index.
This will help us in diagnosing problems that occur during a vacuum.
For example, if we get some error while vacuuming due to corruption
(caused either by bad hardware or by some bug), it can help us identify
the block in the heap and/or the relevant index.
It sets up an error context callback to display additional information
with the error. During different phases of vacuum (heap scan, heap
vacuum, index vacuum, index clean up, heap truncate), we update the error
context callback to display appropriate information. We can extend it to
a bit more granular level like adding the phases for FSM operations or for
prefetching the blocks while truncating. However, I felt that it requires
adding many more error callback function calls and can make the code a bit
complex, so left those for now.
Author: Justin Pryzby, with few changes by Amit Kapila
Reviewed-by: Alvaro Herrera, Amit Kapila, Andres Freund, Michael Paquier
and Sawada Masahiko
Discussion: https://www.postgresql.org/message-id/20191120210600.GC30362@telsasoft.com
2020-03-30 04:03:38 +02:00
|
|
|
VacErrPhase
|
2014-05-06 15:08:14 +02:00
|
|
|
VacOptValue
|
2015-05-24 03:20:37 +02:00
|
|
|
VacuumParams
|
2017-11-29 15:24:24 +01:00
|
|
|
VacuumRelation
|
2010-02-26 02:55:35 +01:00
|
|
|
VacuumStmt
|
2019-03-28 03:59:06 +01:00
|
|
|
ValidateIndexState
|
2010-02-26 02:55:35 +01:00
|
|
|
ValuesScan
|
|
|
|
ValuesScanState
|
|
|
|
Var
|
|
|
|
VarBit
|
|
|
|
VarChar
|
|
|
|
VarParamState
|
2016-05-02 15:23:55 +02:00
|
|
|
VarString
|
2016-04-27 17:47:28 +02:00
|
|
|
VarStringSortSupport
|
2010-02-26 02:55:35 +01:00
|
|
|
Variable
|
|
|
|
VariableAssignHook
|
|
|
|
VariableCache
|
|
|
|
VariableCacheData
|
|
|
|
VariableSetKind
|
|
|
|
VariableSetStmt
|
|
|
|
VariableShowStmt
|
|
|
|
VariableSpace
|
|
|
|
VariableStatData
|
2017-05-17 21:52:16 +02:00
|
|
|
VariableSubstituteHook
|
2022-05-12 21:17:30 +02:00
|
|
|
Variables
|
2018-04-26 20:45:04 +02:00
|
|
|
VersionedQuery
|
2010-02-26 02:55:35 +01:00
|
|
|
Vfd
|
2014-05-06 15:08:14 +02:00
|
|
|
ViewCheckOption
|
2020-03-19 19:40:45 +01:00
|
|
|
ViewOptCheckOption
|
2014-07-14 23:24:40 +02:00
|
|
|
ViewOptions
|
2010-02-26 02:55:35 +01:00
|
|
|
ViewStmt
|
|
|
|
VirtualTransactionId
|
|
|
|
VirtualTupleTableSlot
|
2017-05-17 21:52:16 +02:00
|
|
|
VolatileFunctionStatus
|
2010-02-26 02:55:35 +01:00
|
|
|
Vsrt
|
|
|
|
WAIT_ORDER
|
2020-05-14 19:06:38 +02:00
|
|
|
WALAvailability
|
2014-05-06 15:08:14 +02:00
|
|
|
WALInsertLock
|
|
|
|
WALInsertLockPadded
|
2020-05-14 19:06:38 +02:00
|
|
|
WALOpenSegment
|
|
|
|
WALReadError
|
2022-11-08 08:36:36 +01:00
|
|
|
WalRcvWakeupReason
|
2020-05-14 19:06:38 +02:00
|
|
|
WALSegmentCloseCB
|
2010-02-26 02:55:35 +01:00
|
|
|
WALSegmentContext
|
2020-05-14 19:06:38 +02:00
|
|
|
WALSegmentOpenCB
|
2010-02-26 02:55:35 +01:00
|
|
|
WCHAR
|
2015-05-24 03:20:37 +02:00
|
|
|
WCOKind
|
2016-12-13 16:51:32 +01:00
|
|
|
WFW_WaitOption
|
2014-05-06 15:08:14 +02:00
|
|
|
WIDGET
|
2010-02-26 02:55:35 +01:00
|
|
|
WORD
|
|
|
|
WORKSTATE
|
2011-04-09 05:11:37 +02:00
|
|
|
WSABUF
|
|
|
|
WSADATA
|
|
|
|
WSANETWORKEVENTS
|
|
|
|
WSAPROTOCOL_INFO
|
2016-04-27 17:47:28 +02:00
|
|
|
WaitEvent
|
2016-12-13 16:51:32 +01:00
|
|
|
WaitEventActivity
|
|
|
|
WaitEventClient
|
2017-05-17 21:52:16 +02:00
|
|
|
WaitEventIO
|
2016-12-13 16:51:32 +01:00
|
|
|
WaitEventIPC
|
2016-04-27 17:47:28 +02:00
|
|
|
WaitEventSet
|
2016-12-13 16:51:32 +01:00
|
|
|
WaitEventTimeout
|
2017-08-14 23:29:33 +02:00
|
|
|
WaitPMResult
|
2016-12-13 16:51:32 +01:00
|
|
|
WalCloseMethod
|
Add support for LZ4 with compression of full-page writes in WAL
The logic is implemented so as there can be a choice in the compression
used when building a WAL record, and an extra per-record bit is used to
track down if a block is compressed with PGLZ, LZ4 or nothing.
wal_compression, the existing parameter, is changed to an enum with
support for the following backward-compatible values:
- "off", the default, to not use compression.
- "pglz" or "on", to compress FPWs with PGLZ.
- "lz4", the new mode, to compress FPWs with LZ4.
Benchmarking has showed that LZ4 outclasses easily PGLZ. ZSTD would be
also an interesting choice, but going just with LZ4 for now makes the
patch minimalistic as toast compression is already able to use LZ4, so
there is no need to worry about any build-related needs for this
implementation.
Author: Andrey Borodin, Justin Pryzby
Reviewed-by: Dilip Kumar, Michael Paquier
Discussion: https://postgr.es/m/3037310D-ECB7-4BF1-AF20-01C10BB33A33@yandex-team.ru
2021-06-29 04:17:55 +02:00
|
|
|
WalCompression
|
2010-07-06 21:18:19 +02:00
|
|
|
WalLevel
|
2010-02-26 02:55:35 +01:00
|
|
|
WalRcvData
|
2017-05-17 21:52:16 +02:00
|
|
|
WalRcvExecResult
|
|
|
|
WalRcvExecStatus
|
2010-02-26 02:55:35 +01:00
|
|
|
WalRcvState
|
2017-05-17 21:52:16 +02:00
|
|
|
WalRcvStreamOptions
|
2016-12-13 16:51:32 +01:00
|
|
|
WalReceiverConn
|
|
|
|
WalReceiverFunctionsType
|
2010-02-26 02:55:35 +01:00
|
|
|
WalSnd
|
|
|
|
WalSndCtlData
|
2014-03-10 18:50:28 +01:00
|
|
|
WalSndSendDataCallback
|
2011-04-09 05:11:37 +02:00
|
|
|
WalSndState
|
2017-05-17 21:52:16 +02:00
|
|
|
WalTimeSample
|
2020-04-04 06:32:08 +02:00
|
|
|
WalUsage
|
2016-12-13 16:51:32 +01:00
|
|
|
WalWriteMethod
|
|
|
|
Walfile
|
2010-02-26 02:55:35 +01:00
|
|
|
WindowAgg
|
2016-04-27 17:47:28 +02:00
|
|
|
WindowAggPath
|
2010-02-26 02:55:35 +01:00
|
|
|
WindowAggState
|
|
|
|
WindowAggStatus
|
|
|
|
WindowClause
|
|
|
|
WindowClauseSortData
|
|
|
|
WindowDef
|
|
|
|
WindowFunc
|
|
|
|
WindowFuncExprState
|
|
|
|
WindowFuncLists
|
|
|
|
WindowObject
|
|
|
|
WindowObjectData
|
|
|
|
WindowStatePerAgg
|
|
|
|
WindowStatePerAggData
|
|
|
|
WindowStatePerFunc
|
2014-05-06 15:08:14 +02:00
|
|
|
WithCheckOption
|
2010-02-26 02:55:35 +01:00
|
|
|
WithClause
|
|
|
|
WordEntry
|
|
|
|
WordEntryIN
|
|
|
|
WordEntryPos
|
|
|
|
WordEntryPosVector
|
2015-05-24 03:20:37 +02:00
|
|
|
WordEntryPosVector1
|
2010-02-26 02:55:35 +01:00
|
|
|
WorkTableScan
|
|
|
|
WorkTableScanState
|
|
|
|
WorkerInfo
|
|
|
|
WorkerInfoData
|
2016-04-27 17:47:28 +02:00
|
|
|
WorkerInstrumentation
|
2017-05-17 21:52:16 +02:00
|
|
|
WorkerJobDumpPtrType
|
|
|
|
WorkerJobRestorePtrType
|
2010-02-26 02:55:35 +01:00
|
|
|
Working_State
|
2017-05-17 21:52:16 +02:00
|
|
|
WriteBufPtrType
|
|
|
|
WriteBytePtrType
|
2014-03-10 18:50:28 +01:00
|
|
|
WriteDataCallback
|
2017-05-17 21:52:16 +02:00
|
|
|
WriteDataPtrType
|
|
|
|
WriteExtraTocPtrType
|
2011-04-09 05:11:37 +02:00
|
|
|
WriteFunc
|
2010-02-26 02:55:35 +01:00
|
|
|
WriteManifestState
|
|
|
|
WriteTarState
|
Allow to trigger kernel writeback after a configurable number of writes.
Currently writes to the main data files of postgres all go through the
OS page cache. This means that some operating systems can end up
collecting a large number of dirty buffers in their respective page
caches. When these dirty buffers are flushed to storage rapidly, be it
because of fsync(), timeouts, or dirty ratios, latency for other reads
and writes can increase massively. This is the primary reason for
regular massive stalls observed in real world scenarios and artificial
benchmarks; on rotating disks stalls on the order of hundreds of seconds
have been observed.
On linux it is possible to control this by reducing the global dirty
limits significantly, reducing the above problem. But global
configuration is rather problematic because it'll affect other
applications; also PostgreSQL itself doesn't always generally want this
behavior, e.g. for temporary files it's undesirable.
Several operating systems allow some control over the kernel page
cache. Linux has sync_file_range(2), several posix systems have msync(2)
and posix_fadvise(2). sync_file_range(2) is preferable because it
requires no special setup, whereas msync() requires the to-be-flushed
range to be mmap'ed. For the purpose of flushing dirty data
posix_fadvise(2) is the worst alternative, as flushing dirty data is
just a side-effect of POSIX_FADV_DONTNEED, which also removes the pages
from the page cache. Thus the feature is enabled by default only on
linux, but can be enabled on all systems that have any of the above
APIs.
While desirable and likely possible this patch does not contain an
implementation for windows.
With the infrastructure added, writes made via checkpointer, bgwriter
and normal user backends can be flushed after a configurable number of
writes. Each of these sources of writes controlled by a separate GUC,
checkpointer_flush_after, bgwriter_flush_after and backend_flush_after
respectively; they're separate because the number of flushes that are
good are separate, and because the performance considerations of
controlled flushing for each of these are different.
A later patch will add checkpoint sorting - after that flushes from the
ckeckpoint will almost always be desirable. Bgwriter flushes are most of
the time going to be random, which are slow on lots of storage hardware.
Flushing in backends works well if the storage and bgwriter can keep up,
but if not it can have negative consequences. This patch is likely to
have negative performance consequences without checkpoint sorting, but
unfortunately so has sorting without flush control.
Discussion: alpine.DEB.2.10.1506011320000.28433@sto
Author: Fabien Coelho and Andres Freund
2016-02-19 21:13:05 +01:00
|
|
|
WritebackContext
|
2010-02-26 02:55:35 +01:00
|
|
|
X509
|
2016-05-02 15:23:55 +02:00
|
|
|
X509_EXTENSION
|
2010-02-26 02:55:35 +01:00
|
|
|
X509_NAME
|
|
|
|
X509_NAME_ENTRY
|
|
|
|
X509_STORE
|
|
|
|
X509_STORE_CTX
|
2014-05-06 15:08:14 +02:00
|
|
|
XLTW_Oper
|
2010-02-26 02:55:35 +01:00
|
|
|
XLogCtlData
|
|
|
|
XLogCtlInsert
|
2013-05-29 22:58:43 +02:00
|
|
|
XLogDumpConfig
|
|
|
|
XLogDumpPrivate
|
2010-02-26 02:55:35 +01:00
|
|
|
XLogLongPageHeader
|
|
|
|
XLogLongPageHeaderData
|
|
|
|
XLogPageHeader
|
|
|
|
XLogPageHeaderData
|
|
|
|
XLogPageReadCB
|
|
|
|
XLogPageReadPrivate
|
2022-03-18 05:45:04 +01:00
|
|
|
XLogPageReadResult
|
2022-04-07 09:28:40 +02:00
|
|
|
XLogPrefetchStats
|
|
|
|
XLogPrefetcher
|
|
|
|
XLogPrefetcherFilter
|
2010-02-26 02:55:35 +01:00
|
|
|
XLogReaderRoutine
|
|
|
|
XLogReaderState
|
|
|
|
XLogRecData
|
|
|
|
XLogRecPtr
|
2022-04-07 09:28:40 +02:00
|
|
|
XLogRecStats
|
2010-02-26 02:55:35 +01:00
|
|
|
XLogRecord
|
2015-05-24 03:20:37 +02:00
|
|
|
XLogRecordBlockCompressHeader
|
|
|
|
XLogRecordBlockHeader
|
|
|
|
XLogRecordBlockImageHeader
|
Introduce logical decoding.
This feature, building on previous commits, allows the write-ahead log
stream to be decoded into a series of logical changes; that is,
inserts, updates, and deletes and the transactions which contain them.
It is capable of handling decoding even across changes to the schema
of the effected tables. The output format is controlled by a
so-called "output plugin"; an example is included. To make use of
this in a real replication system, the output plugin will need to be
modified to produce output in the format appropriate to that system,
and to perform filtering.
Currently, information can be extracted from the logical decoding
system only via SQL; future commits will add the ability to stream
changes via walsender.
Andres Freund, with review and other contributions from many other
people, including Álvaro Herrera, Abhijit Menon-Sen, Peter Gheogegan,
Kevin Grittner, Robert Haas, Heikki Linnakangas, Fujii Masao, Abhijit
Menon-Sen, Michael Paquier, Simon Riggs, Craig Ringer, and Steve
Singer.
2014-03-03 22:32:18 +01:00
|
|
|
XLogRecordBuffer
|
2022-02-16 08:30:38 +01:00
|
|
|
XLogRecoveryCtlData
|
2015-05-24 03:20:37 +02:00
|
|
|
XLogRedoAction
|
2013-05-29 22:58:43 +02:00
|
|
|
XLogSegNo
|
|
|
|
XLogSource
|
2022-05-12 21:17:30 +02:00
|
|
|
XLogStats
|
2010-02-26 02:55:35 +01:00
|
|
|
XLogwrtResult
|
|
|
|
XLogwrtRqst
|
2022-05-12 21:17:30 +02:00
|
|
|
XPV
|
2010-02-26 02:55:35 +01:00
|
|
|
XPVIV
|
2019-05-22 18:55:34 +02:00
|
|
|
XPVMG
|
2010-02-26 02:55:35 +01:00
|
|
|
XactCallback
|
|
|
|
XactCallbackItem
|
|
|
|
XactEvent
|
2014-05-06 15:08:14 +02:00
|
|
|
XactLockTableWaitInfo
|
2020-10-22 14:44:18 +02:00
|
|
|
XidBoundsViolation
|
|
|
|
XidCacheStatus
|
|
|
|
XidCommitStatus
|
2010-02-26 02:55:35 +01:00
|
|
|
XidStatus
|
|
|
|
XmlExpr
|
|
|
|
XmlExprOp
|
|
|
|
XmlOptionType
|
|
|
|
XmlSerialize
|
2017-05-17 21:52:16 +02:00
|
|
|
XmlTableBuilderData
|
2010-02-26 02:55:35 +01:00
|
|
|
YYLTYPE
|
|
|
|
YYSTYPE
|
|
|
|
YY_BUFFER_STATE
|
2022-05-12 21:17:30 +02:00
|
|
|
ZSTD_CCtx
|
|
|
|
ZSTD_DCtx
|
|
|
|
ZSTD_inBuffer
|
|
|
|
ZSTD_outBuffer
|
2010-02-26 02:55:35 +01:00
|
|
|
_SPI_connection
|
|
|
|
_SPI_plan
|
|
|
|
__AssignProcessToJobObject
|
|
|
|
__CreateJobObject
|
|
|
|
__CreateRestrictedToken
|
|
|
|
__IsProcessInJob
|
|
|
|
__QueryInformationJobObject
|
|
|
|
__SetInformationJobObject
|
2021-05-12 19:14:10 +02:00
|
|
|
__time64_t
|
|
|
|
_dev_t
|
|
|
|
_ino_t
|
2022-05-12 21:17:30 +02:00
|
|
|
_locale_t
|
2014-05-06 15:08:14 +02:00
|
|
|
_resultmap
|
|
|
|
_stringlist
|
|
|
|
acquireLocksOnSubLinks_context
|
2012-06-10 21:15:31 +02:00
|
|
|
adjust_appendrel_attrs_context
|
2021-05-12 19:14:10 +02:00
|
|
|
aff_regex_struct
|
2011-04-09 05:11:37 +02:00
|
|
|
allocfunc
|
2017-05-17 21:52:16 +02:00
|
|
|
amadjustmembers_function
|
2016-05-02 15:23:55 +02:00
|
|
|
ambeginscan_function
|
|
|
|
ambuild_function
|
|
|
|
ambuildempty_function
|
|
|
|
ambuildphasename_function
|
|
|
|
ambulkdelete_function
|
|
|
|
amcanreturn_function
|
|
|
|
amcostestimate_function
|
|
|
|
amendscan_function
|
2017-05-17 21:52:16 +02:00
|
|
|
amestimateparallelscan_function
|
2016-05-02 15:23:55 +02:00
|
|
|
amgetbitmap_function
|
|
|
|
amgettuple_function
|
2017-05-17 21:52:16 +02:00
|
|
|
aminitparallelscan_function
|
2016-05-02 15:23:55 +02:00
|
|
|
aminsert_function
|
|
|
|
ammarkpos_function
|
|
|
|
amoptions_function
|
2017-05-17 21:52:16 +02:00
|
|
|
amparallelrescan_function
|
2016-05-02 15:23:55 +02:00
|
|
|
amproperty_function
|
|
|
|
amrescan_function
|
|
|
|
amrestrpos_function
|
|
|
|
amvacuumcleanup_function
|
|
|
|
amvalidate_function
|
2015-05-24 03:20:37 +02:00
|
|
|
array_iter
|
2010-02-26 02:55:35 +01:00
|
|
|
array_unnest_fctx
|
2011-04-09 05:11:37 +02:00
|
|
|
assign_collations_context
|
2010-02-26 02:55:35 +01:00
|
|
|
autovac_table
|
|
|
|
av_relation
|
|
|
|
avl_dbase
|
2016-04-27 17:47:28 +02:00
|
|
|
avl_node
|
|
|
|
avl_tree
|
2010-02-26 02:55:35 +01:00
|
|
|
avw_dbase
|
|
|
|
backslashResult
|
2011-04-09 05:11:37 +02:00
|
|
|
backup_manifest_info
|
|
|
|
backup_manifest_option
|
2010-02-26 02:55:35 +01:00
|
|
|
base_yy_extra_type
|
2011-04-09 05:11:37 +02:00
|
|
|
basebackup_options
|
2022-05-12 21:17:30 +02:00
|
|
|
bbsink
|
|
|
|
bbsink_copystream
|
|
|
|
bbsink_gzip
|
|
|
|
bbsink_lz4
|
|
|
|
bbsink_ops
|
|
|
|
bbsink_server
|
|
|
|
bbsink_shell
|
|
|
|
bbsink_state
|
|
|
|
bbsink_throttle
|
|
|
|
bbsink_zstd
|
|
|
|
bbstreamer
|
2010-02-26 02:55:35 +01:00
|
|
|
bbstreamer_archive_context
|
|
|
|
bbstreamer_extractor
|
2015-05-24 03:20:37 +02:00
|
|
|
bbstreamer_gzip_decompressor
|
2022-05-12 21:17:30 +02:00
|
|
|
bbstreamer_gzip_writer
|
2011-11-14 18:12:23 +01:00
|
|
|
bbstreamer_lz4_frame
|
2022-05-12 21:17:30 +02:00
|
|
|
bbstreamer_member
|
|
|
|
bbstreamer_ops
|
|
|
|
bbstreamer_plain_writer
|
2016-05-02 15:23:55 +02:00
|
|
|
bbstreamer_recovery_injector
|
2022-04-07 06:29:46 +02:00
|
|
|
bbstreamer_tar_archiver
|
2011-11-14 18:12:23 +01:00
|
|
|
bbstreamer_tar_parser
|
|
|
|
bbstreamer_zstd_frame
|
2013-05-29 22:58:43 +02:00
|
|
|
bgworker_main_type
|
|
|
|
binaryheap
|
2010-02-26 02:55:35 +01:00
|
|
|
binaryheap_comparator
|
|
|
|
bitmapword
|
2017-11-29 15:24:24 +01:00
|
|
|
bits16
|
2010-02-26 02:55:35 +01:00
|
|
|
bits32
|
|
|
|
bits8
|
2018-04-01 02:49:41 +02:00
|
|
|
bloom_filter
|
2022-05-12 21:17:30 +02:00
|
|
|
boolKEY
|
2015-05-24 03:20:37 +02:00
|
|
|
brin_column_state
|
2011-04-09 05:11:37 +02:00
|
|
|
brin_serialize_callback_type
|
2010-02-26 02:55:35 +01:00
|
|
|
bytea
|
|
|
|
cached_re_str
|
2019-03-27 22:59:19 +01:00
|
|
|
canonicalize_state
|
2010-02-26 02:55:35 +01:00
|
|
|
cashKEY
|
2022-05-12 21:17:30 +02:00
|
|
|
catalogid_hash
|
2011-04-09 05:11:37 +02:00
|
|
|
cfp
|
2010-02-26 02:55:35 +01:00
|
|
|
check_agg_arguments_context
|
|
|
|
check_function_callback
|
|
|
|
check_network_data
|
2011-04-09 05:11:37 +02:00
|
|
|
check_object_relabel_type
|
2010-02-26 02:55:35 +01:00
|
|
|
check_password_hook_type
|
|
|
|
check_ungrouped_columns_context
|
|
|
|
chr
|
|
|
|
clock_t
|
2011-04-09 05:11:37 +02:00
|
|
|
cmpEntriesArg
|
2010-02-26 02:55:35 +01:00
|
|
|
codes_t
|
2011-04-09 05:11:37 +02:00
|
|
|
collation_cache_entry
|
2010-02-26 02:55:35 +01:00
|
|
|
color
|
2016-12-13 16:51:32 +01:00
|
|
|
colormaprange
|
2010-02-26 02:55:35 +01:00
|
|
|
compare_context
|
2011-04-09 05:11:37 +02:00
|
|
|
config_var_value
|
2010-02-26 02:55:35 +01:00
|
|
|
contain_aggs_of_level_context
|
|
|
|
convert_testexpr_context
|
2022-10-11 04:45:52 +02:00
|
|
|
copy_data_dest_cb
|
2017-05-17 21:52:16 +02:00
|
|
|
copy_data_source_cb
|
2010-02-26 02:55:35 +01:00
|
|
|
core_YYSTYPE
|
|
|
|
core_yy_extra_type
|
|
|
|
core_yyscan_t
|
2016-06-15 20:33:58 +02:00
|
|
|
corrupt_items
|
2010-02-26 02:55:35 +01:00
|
|
|
cost_qual_eval_context
|
2021-05-12 19:14:10 +02:00
|
|
|
cp_hash_func
|
2016-05-02 15:23:55 +02:00
|
|
|
create_upper_paths_hook_type
|
2010-02-26 02:55:35 +01:00
|
|
|
createdb_failure_params
|
|
|
|
crosstab_HashEnt
|
|
|
|
crosstab_cat_desc
|
2015-05-24 03:20:37 +02:00
|
|
|
datapagemap_iterator_t
|
|
|
|
datapagemap_t
|
2010-02-26 02:55:35 +01:00
|
|
|
dateKEY
|
|
|
|
datetkn
|
2015-05-24 03:20:37 +02:00
|
|
|
dce_uuid_t
|
2022-11-02 02:06:05 +01:00
|
|
|
dclist_head
|
2010-02-26 02:55:35 +01:00
|
|
|
decimal
|
|
|
|
deparse_columns
|
|
|
|
deparse_context
|
|
|
|
deparse_expr_cxt
|
|
|
|
deparse_namespace
|
|
|
|
destructor
|
|
|
|
dev_t
|
2018-04-26 20:45:04 +02:00
|
|
|
digit
|
2014-05-06 15:08:14 +02:00
|
|
|
disassembledLeaf
|
2013-05-29 22:58:43 +02:00
|
|
|
dlist_head
|
|
|
|
dlist_iter
|
|
|
|
dlist_mutable_iter
|
|
|
|
dlist_node
|
2010-02-26 02:55:35 +01:00
|
|
|
ds_state
|
2016-12-02 18:34:36 +01:00
|
|
|
dsa_area
|
|
|
|
dsa_area_control
|
|
|
|
dsa_area_pool
|
|
|
|
dsa_area_span
|
2016-12-13 16:51:32 +01:00
|
|
|
dsa_handle
|
|
|
|
dsa_pointer
|
2018-04-26 20:45:04 +02:00
|
|
|
dsa_pointer_atomic
|
2016-12-02 18:34:36 +01:00
|
|
|
dsa_segment_header
|
2016-12-13 16:51:32 +01:00
|
|
|
dsa_segment_index
|
2016-12-02 18:34:36 +01:00
|
|
|
dsa_segment_map
|
2017-08-23 07:41:32 +02:00
|
|
|
dshash_compare_function
|
|
|
|
dshash_hash
|
|
|
|
dshash_hash_function
|
|
|
|
dshash_parameters
|
|
|
|
dshash_partition
|
2022-03-10 21:54:54 +01:00
|
|
|
dshash_seq_status
|
2017-08-23 07:41:32 +02:00
|
|
|
dshash_table
|
|
|
|
dshash_table_control
|
|
|
|
dshash_table_handle
|
|
|
|
dshash_table_item
|
2014-05-06 15:08:14 +02:00
|
|
|
dsm_control_header
|
|
|
|
dsm_control_item
|
|
|
|
dsm_handle
|
|
|
|
dsm_op
|
|
|
|
dsm_segment
|
|
|
|
dsm_segment_detach_callback
|
2010-07-06 21:18:19 +02:00
|
|
|
eLogType
|
2010-02-26 02:55:35 +01:00
|
|
|
ean13
|
|
|
|
eary
|
2011-04-09 05:11:37 +02:00
|
|
|
ec_matches_callback_type
|
2013-05-29 22:58:43 +02:00
|
|
|
ec_member_foreign_arg
|
|
|
|
ec_member_matches_arg
|
2012-06-10 21:15:31 +02:00
|
|
|
emit_log_hook_type
|
2010-02-26 02:55:35 +01:00
|
|
|
eval_const_expressions_context
|
2013-05-29 22:58:43 +02:00
|
|
|
exec_thread_arg
|
2010-02-26 02:55:35 +01:00
|
|
|
execution_state
|
|
|
|
explain_get_index_name_hook_type
|
|
|
|
f_smgr
|
|
|
|
fd_set
|
2017-05-17 21:52:16 +02:00
|
|
|
fe_scram_state
|
|
|
|
fe_scram_state_enum
|
2020-11-04 10:21:18 +01:00
|
|
|
fetch_range_request
|
2015-05-24 03:20:37 +02:00
|
|
|
file_action_t
|
|
|
|
file_entry_t
|
|
|
|
file_type_t
|
2020-11-04 10:21:18 +01:00
|
|
|
filehash_hash
|
2016-10-15 02:22:51 +02:00
|
|
|
filehash_iterator
|
2015-05-24 03:20:37 +02:00
|
|
|
filemap_t
|
Implement operator class parameters
PostgreSQL provides set of template index access methods, where opclasses have
much freedom in the semantics of indexing. These index AMs are GiST, GIN,
SP-GiST and BRIN. There opclasses define representation of keys, operations on
them and supported search strategies. So, it's natural that opclasses may be
faced some tradeoffs, which require user-side decision. This commit implements
opclass parameters allowing users to set some values, which tell opclass how to
index the particular dataset.
This commit doesn't introduce new storage in system catalog. Instead it uses
pg_attribute.attoptions, which is used for table column storage options but
unused for index attributes.
In order to evade changing signature of each opclass support function, we
implement unified way to pass options to opclass support functions. Options
are set to fn_expr as the constant bytea expression. It's possible due to the
fact that opclass support functions are executed outside of expressions, so
fn_expr is unused for them.
This commit comes with some examples of opclass options usage. We parametrize
signature length in GiST. That applies to multiple opclasses: tsvector_ops,
gist__intbig_ops, gist_ltree_ops, gist__ltree_ops, gist_trgm_ops and
gist_hstore_ops. Also we parametrize maximum number of integer ranges for
gist__int_ops. However, the main future usage of this feature is expected
to be json, where users would be able to specify which way to index particular
json parts.
Catversion is bumped.
Discussion: https://postgr.es/m/d22c3a18-31c7-1879-fc11-4c1ce2f5e5af%40postgrespro.ru
Author: Nikita Glukhov, revised by me
Reviwed-by: Nikolay Shaplov, Robert Haas, Tom Lane, Tomas Vondra, Alvaro Herrera
2020-03-30 18:17:11 +02:00
|
|
|
fill_string_relopt
|
2010-02-26 02:55:35 +01:00
|
|
|
finalize_primnode_context
|
|
|
|
find_dependent_phvs_context
|
|
|
|
find_expr_references_context
|
|
|
|
fix_join_expr_context
|
|
|
|
fix_scan_expr_context
|
|
|
|
fix_upper_expr_context
|
|
|
|
fix_windowagg_cond_context
|
|
|
|
flatten_join_alias_vars_context
|
2022-12-06 16:09:24 +01:00
|
|
|
flatten_rtes_walker_context
|
2010-02-26 02:55:35 +01:00
|
|
|
float4
|
|
|
|
float4KEY
|
|
|
|
float8
|
|
|
|
float8KEY
|
2019-05-22 18:55:34 +02:00
|
|
|
floating_decimal_32
|
|
|
|
floating_decimal_64
|
2014-05-06 15:08:14 +02:00
|
|
|
fmAggrefPtr
|
2015-05-24 03:20:37 +02:00
|
|
|
fmExprContextCallbackFunction
|
2010-02-26 02:55:35 +01:00
|
|
|
fmNodePtr
|
2012-06-10 21:15:31 +02:00
|
|
|
fmStringInfo
|
2011-04-09 05:11:37 +02:00
|
|
|
fmgr_hook_type
|
2013-05-29 22:58:43 +02:00
|
|
|
foreign_glob_cxt
|
|
|
|
foreign_loc_cxt
|
2011-04-09 05:11:37 +02:00
|
|
|
freeaddrinfo_ptr_t
|
2010-02-26 02:55:35 +01:00
|
|
|
freefunc
|
|
|
|
fsec_t
|
2011-06-09 20:01:49 +02:00
|
|
|
gbt_vsrt_arg
|
2010-02-26 02:55:35 +01:00
|
|
|
gbtree_ninfo
|
|
|
|
gbtree_vinfo
|
|
|
|
generate_series_fctx
|
2015-05-24 03:20:37 +02:00
|
|
|
generate_series_numeric_fctx
|
2010-02-26 02:55:35 +01:00
|
|
|
generate_series_timestamp_fctx
|
|
|
|
generate_series_timestamptz_fctx
|
|
|
|
generate_subscripts_fctx
|
|
|
|
get_attavgwidth_hook_type
|
|
|
|
get_index_stats_hook_type
|
|
|
|
get_relation_info_hook_type
|
|
|
|
get_relation_stats_hook_type
|
2011-04-09 05:11:37 +02:00
|
|
|
getaddrinfo_ptr_t
|
|
|
|
getnameinfo_ptr_t
|
2010-02-26 02:55:35 +01:00
|
|
|
gid_t
|
2015-05-24 03:20:37 +02:00
|
|
|
gin_leafpage_items_state
|
2010-02-26 02:55:35 +01:00
|
|
|
ginxlogCreatePostingTree
|
|
|
|
ginxlogDeleteListPages
|
|
|
|
ginxlogDeletePage
|
|
|
|
ginxlogInsert
|
2014-05-06 15:08:14 +02:00
|
|
|
ginxlogInsertDataInternal
|
|
|
|
ginxlogInsertEntry
|
2010-02-26 02:55:35 +01:00
|
|
|
ginxlogInsertListPage
|
2014-05-06 15:08:14 +02:00
|
|
|
ginxlogRecompressDataLeaf
|
2010-02-26 02:55:35 +01:00
|
|
|
ginxlogSplit
|
|
|
|
ginxlogUpdateMeta
|
2014-05-06 15:08:14 +02:00
|
|
|
ginxlogVacuumDataLeafPage
|
2010-02-26 02:55:35 +01:00
|
|
|
gistxlogDelete
|
|
|
|
gistxlogPage
|
|
|
|
gistxlogPageDelete
|
|
|
|
gistxlogPageReuse
|
|
|
|
gistxlogPageSplit
|
|
|
|
gistxlogPageUpdate
|
2017-05-17 21:52:16 +02:00
|
|
|
grouping_sets_data
|
2011-04-09 05:11:37 +02:00
|
|
|
gseg_picksplit_item
|
2010-02-26 02:55:35 +01:00
|
|
|
gss_buffer_desc
|
2013-05-29 22:58:43 +02:00
|
|
|
gss_cred_id_t
|
|
|
|
gss_ctx_id_t
|
|
|
|
gss_name_t
|
2010-02-26 02:55:35 +01:00
|
|
|
gtrgm_consistent_cache
|
|
|
|
gzFile
|
|
|
|
hashfunc
|
|
|
|
hbaPort
|
|
|
|
heap_page_items_state
|
|
|
|
help_handler
|
|
|
|
hlCheck
|
2016-12-13 16:51:32 +01:00
|
|
|
hstoreCheckKeyLen_t
|
|
|
|
hstoreCheckValLen_t
|
|
|
|
hstorePairs_t
|
|
|
|
hstoreUniquePairs_t
|
|
|
|
hstoreUpgrade_t
|
2015-05-24 03:20:37 +02:00
|
|
|
hyperLogLogState
|
2017-05-17 21:52:16 +02:00
|
|
|
ifState
|
2018-04-26 20:45:04 +02:00
|
|
|
ilist
|
2015-05-24 03:20:37 +02:00
|
|
|
import_error_callback_arg
|
2010-02-26 02:55:35 +01:00
|
|
|
indexed_tlist
|
|
|
|
inet
|
|
|
|
inetKEY
|
|
|
|
inet_struct
|
2014-05-06 15:08:14 +02:00
|
|
|
init_function
|
2010-02-26 02:55:35 +01:00
|
|
|
inline_cte_walker_context
|
2010-07-06 21:18:19 +02:00
|
|
|
inline_error_callback_arg
|
2010-02-26 02:55:35 +01:00
|
|
|
ino_t
|
|
|
|
instr_time
|
2015-05-24 03:20:37 +02:00
|
|
|
int128
|
2010-02-26 02:55:35 +01:00
|
|
|
int16
|
|
|
|
int16KEY
|
|
|
|
int2vector
|
|
|
|
int32
|
|
|
|
int32KEY
|
|
|
|
int32_t
|
|
|
|
int64
|
|
|
|
int64KEY
|
|
|
|
int8
|
|
|
|
internalPQconninfoOption
|
|
|
|
intptr_t
|
2019-05-22 18:55:34 +02:00
|
|
|
intset_internal_node
|
|
|
|
intset_leaf_node
|
|
|
|
intset_node
|
2010-02-26 02:55:35 +01:00
|
|
|
intvKEY
|
Add pg_stat_io view, providing more detailed IO statistics
Builds on 28e626bde00 and f30d62c2fc6. See the former for motivation.
Rows of the view show IO operations for a particular backend type, IO target
object, IO context combination (e.g. a client backend's operations on
permanent relations in shared buffers) and each column in the view is the
total number of IO Operations done (e.g. writes). So a cell in the view would
be, for example, the number of blocks of relation data written from shared
buffers by client backends since the last stats reset.
In anticipation of tracking WAL IO and non-block-oriented IO (such as
temporary file IO), the "op_bytes" column specifies the unit of the "reads",
"writes", and "extends" columns for a given row.
Rows for combinations of IO operation, backend type, target object and context
that never occur, are ommitted entirely. For example, checkpointer will never
operate on temporary relations.
Similarly, if an IO operation never occurs for such a combination, the IO
operation's cell will be null, to distinguish from 0 observed IO
operations. For example, bgwriter should not perform reads.
Note that some of the cells in the view are redundant with fields in
pg_stat_bgwriter (e.g. buffers_backend). For now, these have been kept for
backwards compatibility.
Bumps catversion.
Author: Melanie Plageman <melanieplageman@gmail.com>
Author: Samay Sharma <smilingsamay@gmail.com>
Reviewed-by: Maciek Sakrejda <m.sakrejda@gmail.com>
Reviewed-by: Lukas Fittl <lukas@fittl.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/20200124195226.lth52iydq2n2uilq@alap3.anarazel.de
2023-02-11 18:51:58 +01:00
|
|
|
io_stat_col
|
2010-02-26 02:55:35 +01:00
|
|
|
itemIdCompact
|
|
|
|
itemIdCompactData
|
2018-04-26 20:45:04 +02:00
|
|
|
iterator
|
2010-02-26 02:55:35 +01:00
|
|
|
jmp_buf
|
|
|
|
join_search_hook_type
|
2013-05-29 22:58:43 +02:00
|
|
|
json_aelem_action
|
2015-05-24 03:20:37 +02:00
|
|
|
json_manifest_error_callback
|
2012-06-10 21:15:31 +02:00
|
|
|
json_manifest_perfile_callback
|
|
|
|
json_manifest_perwalrange_callback
|
2013-05-29 22:58:43 +02:00
|
|
|
json_ofield_action
|
|
|
|
json_scalar_action
|
|
|
|
json_struct_action
|
2011-04-09 05:11:37 +02:00
|
|
|
keyEntryData
|
2010-02-26 02:55:35 +01:00
|
|
|
key_t
|
|
|
|
lclContext
|
|
|
|
lclTocEntry
|
2014-05-06 15:08:14 +02:00
|
|
|
leafSegmentInfo
|
2019-05-22 18:55:34 +02:00
|
|
|
leaf_item
|
2020-11-04 10:21:18 +01:00
|
|
|
libpq_source
|
2010-02-26 02:55:35 +01:00
|
|
|
line_t
|
2016-12-13 16:51:32 +01:00
|
|
|
lineno_t
|
2018-04-26 20:45:04 +02:00
|
|
|
list_sort_comparator
|
Implement operator class parameters
PostgreSQL provides set of template index access methods, where opclasses have
much freedom in the semantics of indexing. These index AMs are GiST, GIN,
SP-GiST and BRIN. There opclasses define representation of keys, operations on
them and supported search strategies. So, it's natural that opclasses may be
faced some tradeoffs, which require user-side decision. This commit implements
opclass parameters allowing users to set some values, which tell opclass how to
index the particular dataset.
This commit doesn't introduce new storage in system catalog. Instead it uses
pg_attribute.attoptions, which is used for table column storage options but
unused for index attributes.
In order to evade changing signature of each opclass support function, we
implement unified way to pass options to opclass support functions. Options
are set to fn_expr as the constant bytea expression. It's possible due to the
fact that opclass support functions are executed outside of expressions, so
fn_expr is unused for them.
This commit comes with some examples of opclass options usage. We parametrize
signature length in GiST. That applies to multiple opclasses: tsvector_ops,
gist__intbig_ops, gist_ltree_ops, gist__ltree_ops, gist_trgm_ops and
gist_hstore_ops. Also we parametrize maximum number of integer ranges for
gist__int_ops. However, the main future usage of this feature is expected
to be json, where users would be able to specify which way to index particular
json parts.
Catversion is bumped.
Discussion: https://postgr.es/m/d22c3a18-31c7-1879-fc11-4c1ce2f5e5af%40postgrespro.ru
Author: Nikita Glukhov, revised by me
Reviwed-by: Nikolay Shaplov, Robert Haas, Tom Lane, Tomas Vondra, Alvaro Herrera
2020-03-30 18:17:11 +02:00
|
|
|
local_relopt
|
|
|
|
local_relopts
|
2020-11-04 10:21:18 +01:00
|
|
|
local_source
|
2011-04-09 05:11:37 +02:00
|
|
|
locale_t
|
2010-02-26 02:55:35 +01:00
|
|
|
locate_agg_of_level_context
|
|
|
|
locate_var_of_level_context
|
|
|
|
locate_windowfunc_context
|
2011-11-14 18:12:23 +01:00
|
|
|
logstreamer_param
|
2010-02-26 02:55:35 +01:00
|
|
|
lquery
|
|
|
|
lquery_level
|
|
|
|
lquery_variant
|
|
|
|
ltree
|
|
|
|
ltree_gist
|
|
|
|
ltree_level
|
|
|
|
ltxtquery
|
|
|
|
mXactCacheEnt
|
2017-05-17 21:52:16 +02:00
|
|
|
mac8KEY
|
2010-02-26 02:55:35 +01:00
|
|
|
macKEY
|
|
|
|
macaddr
|
2017-05-17 21:52:16 +02:00
|
|
|
macaddr8
|
|
|
|
macaddr_sortsupport_state
|
2020-05-14 19:06:38 +02:00
|
|
|
manifest_file
|
|
|
|
manifest_files_hash
|
2016-12-13 16:51:32 +01:00
|
|
|
manifest_files_iterator
|
2020-05-14 19:06:38 +02:00
|
|
|
manifest_wal_range
|
2010-02-26 02:55:35 +01:00
|
|
|
map_variable_attnos_context
|
2016-12-13 16:51:32 +01:00
|
|
|
max_parallel_hazard_context
|
2010-02-26 02:55:35 +01:00
|
|
|
mb2wchar_with_len_converter
|
|
|
|
mbchar_verifier
|
2011-11-14 18:12:23 +01:00
|
|
|
mbcharacter_incrementer
|
2010-02-26 02:55:35 +01:00
|
|
|
mbdisplaylen_converter
|
|
|
|
mblen_converter
|
|
|
|
mbstr_verifier
|
2021-07-14 02:43:58 +02:00
|
|
|
memoize_hash
|
|
|
|
memoize_iterator
|
2010-02-26 02:55:35 +01:00
|
|
|
metastring
|
|
|
|
mix_data_t
|
|
|
|
mixedStruct
|
|
|
|
mode_t
|
|
|
|
movedb_failure_params
|
Multirange datatypes
Multiranges are basically sorted arrays of non-overlapping ranges with
set-theoretic operations defined over them.
Since v14, each range type automatically gets a corresponding multirange
datatype. There are both manual and automatic mechanisms for naming multirange
types. Once can specify multirange type name using multirange_type_name
attribute in CREATE TYPE. Otherwise, a multirange type name is generated
automatically. If the range type name contains "range" then we change that to
"multirange". Otherwise, we add "_multirange" to the end.
Implementation of multiranges comes with a space-efficient internal
representation format, which evades extra paddings and duplicated storage of
oids. Altogether this format allows fetching a particular range by its index
in O(n).
Statistic gathering and selectivity estimation are implemented for multiranges.
For this purpose, stored multirange is approximated as union range without gaps.
This field will likely need improvements in the future.
Catversion is bumped.
Discussion: https://postgr.es/m/CALNJ-vSUpQ_Y%3DjXvTxt1VYFztaBSsWVXeF1y6gTYQ4bOiWDLgQ%40mail.gmail.com
Discussion: https://postgr.es/m/a0b8026459d1e6167933be2104a6174e7d40d0ab.camel%40j-davis.com#fe7218c83b08068bfffb0c5293eceda0
Author: Paul Jungwirth, revised by me
Reviewed-by: David Fetter, Corey Huinker, Jeff Davis, Pavel Stehule
Reviewed-by: Alvaro Herrera, Tom Lane, Isaac Morland, David G. Johnston
Reviewed-by: Zhihong Yu, Alexander Korotkov
2020-12-20 05:20:33 +01:00
|
|
|
multirange_bsearch_comparison
|
2010-02-26 02:55:35 +01:00
|
|
|
multirange_unnest_fctx
|
2013-05-29 22:58:43 +02:00
|
|
|
mxact
|
|
|
|
mxtruncinfo
|
2011-04-09 05:11:37 +02:00
|
|
|
needs_fmgr_hook_type
|
2017-05-17 21:52:16 +02:00
|
|
|
network_sortsupport_state
|
2010-02-26 02:55:35 +01:00
|
|
|
nodeitem
|
|
|
|
normal_rand_fctx
|
|
|
|
ntile_context
|
|
|
|
numeric
|
2011-04-09 05:11:37 +02:00
|
|
|
object_access_hook_type
|
|
|
|
object_access_hook_type_str
|
2010-02-26 02:55:35 +01:00
|
|
|
off_t
|
|
|
|
oidKEY
|
|
|
|
oidvector
|
2014-05-06 15:08:14 +02:00
|
|
|
on_dsm_detach_callback
|
2012-06-10 21:15:31 +02:00
|
|
|
on_exit_nicely_callback
|
2011-04-09 05:11:37 +02:00
|
|
|
openssl_tls_init_hook_typ
|
2016-12-13 16:51:32 +01:00
|
|
|
ossl_EVP_cipher_func
|
2018-04-26 20:45:04 +02:00
|
|
|
other
|
2015-05-24 03:20:37 +02:00
|
|
|
output_type
|
2016-12-13 16:51:32 +01:00
|
|
|
pagetable_hash
|
|
|
|
pagetable_iterator
|
2015-05-24 03:20:37 +02:00
|
|
|
pairingheap
|
|
|
|
pairingheap_comparator
|
|
|
|
pairingheap_node
|
|
|
|
parallel_worker_main_type
|
2010-07-06 21:18:19 +02:00
|
|
|
parse_error_callback_arg
|
2010-02-26 02:55:35 +01:00
|
|
|
parser_context
|
Implement partition-wise grouping/aggregation.
If the partition keys of the input relation are part of the GROUP BY
clause, all the rows belonging to a given group come from a single
partition. This allows aggregation/grouping over a partitioned
relation to be broken down into aggregation/grouping on each
partition. This should be no worse, and often better, than the normal
approach.
If the GROUP BY clause does not contain all the partition keys, we can
still perform partial aggregation for each partition and then finalize
aggregation after appending the partial results. This is less certain
to be a win, but it's still useful.
Jeevan Chalke, Ashutosh Bapat, Robert Haas. The larger patch series
of which this patch is a part was also reviewed and tested by Antonin
Houska, Rajkumar Raghuwanshi, David Rowley, Dilip Kumar, Konstantin
Knizhnik, Pascal Legrand, and Rafia Sabih.
Discussion: http://postgr.es/m/CAM2+6=V64_xhstVHie0Rz=KPEQnLJMZt_e314P0jaT_oJ9MR8A@mail.gmail.com
2018-03-22 17:49:48 +01:00
|
|
|
partition_method_t
|
2010-02-26 02:55:35 +01:00
|
|
|
pendingPosition
|
|
|
|
pgParameterStatus
|
2015-05-24 03:20:37 +02:00
|
|
|
pg_atomic_flag
|
|
|
|
pg_atomic_uint32
|
|
|
|
pg_atomic_uint64
|
2021-07-07 03:55:15 +02:00
|
|
|
pg_be_sasl_mech
|
2010-02-26 02:55:35 +01:00
|
|
|
pg_checksum_context
|
|
|
|
pg_checksum_raw_context
|
2012-06-10 21:15:31 +02:00
|
|
|
pg_checksum_type
|
Rename backup_compression.{c,h} to compression.{c,h}
Compression option handling (level, algorithm or even workers) can be
used across several parts of the system and not only base backups.
Structures, objects and routines are renamed accordingly, removing the
concept of base backups from this part of the code and making this
change straightforward.
pg_receivewal, which has gained support for LZ4 since babbbb5, will make
use of this infrastructure for its set of compression options, bringing
more consistency with pg_basebackup. This cleanup needs to be done
before releasing a beta of 15. pg_dump is a potential future target, as
well, and adding more compression options to it may happen in 16~.
Author: Michael Paquier
Reviewed-by: Robert Haas, Georgios Kokolatos
Discussion: https://postgr.es/m/YlPQGNAAa04raObK@paquier.xyz
2022-04-12 06:38:54 +02:00
|
|
|
pg_compress_algorithm
|
|
|
|
pg_compress_specification
|
2016-12-13 16:51:32 +01:00
|
|
|
pg_conn_host
|
|
|
|
pg_conn_host_type
|
2010-02-26 02:55:35 +01:00
|
|
|
pg_conv_map
|
|
|
|
pg_crc32
|
2015-05-24 03:20:37 +02:00
|
|
|
pg_crc32c
|
Move SHA2 routines to a new generic API layer for crypto hashes
Two new routines to allocate a hash context and to free it are created,
as these become necessary for the goal behind this refactoring: switch
all the cryptohash implementations for OpenSSL to use EVP (for FIPS and
also because upstream has not recommended the use of low-level cryptohash
functions for 20 years). Note that OpenSSL hides the internals of
cryptohash contexts since 1.1.0, so it is necessary to leave the
allocation to OpenSSL itself, explaining the need for those two new
routines. This part is going to require more work to properly track
hash contexts with resource owners, but this is not introduced here.
Still, this refactoring makes the move possible.
This reduces the number of routines for all SHA2 implementations from
twelve (SHA{224,256,384,512} with init, update and final calls) to five
(create, free, init, update and final calls) by incorporating the hash
type directly into the hash context data.
The new cryptohash routines are moved to a new file, called cryptohash.c
for the fallback implementations, with SHA2 specifics becoming a part
internal to src/common/. OpenSSL specifics are part of
cryptohash_openssl.c. This infrastructure is usable for more hash
types, like MD5 or HMAC.
Any code paths using the internal SHA2 routines are adapted to report
errors correctly, which accounts for most of the changes in this commit.
zones mostly impacted are checksum manifests, libpq and SCRAM.
Note that e21cbb4 was a first attempt to switch SHA2 to EVP, but it
lacked the refactoring needed for libpq, as done here.
This patch has been tested on Linux and Windows, with and without
OpenSSL, and down to 1.0.1, the oldest version supported on HEAD.
Author: Michael Paquier
Reviewed-by: Daniel Gustafsson
Discussion: https://postgr.es/m/20200924025314.GE7405@paquier.xyz
2020-12-02 02:37:20 +01:00
|
|
|
pg_cryptohash_ctx
|
Improve error handling of cryptohash computations
The existing cryptohash facility was causing problems in some code paths
related to MD5 (frontend and backend) that relied on the fact that the
only type of error that could happen would be an OOM, as the MD5
implementation used in PostgreSQL ~13 (the in-core implementation is
used when compiling with or without OpenSSL in those older versions),
could fail only under this circumstance.
The new cryptohash facilities can fail for reasons other than OOMs, like
attempting MD5 when FIPS is enabled (upstream OpenSSL allows that up to
1.0.2, Fedora and Photon patch OpenSSL 1.1.1 to allow that), so this
would cause incorrect reports to show up.
This commit extends the cryptohash APIs so that callers of those routines
can fetch more context when an error happens, by using a new routine
called pg_cryptohash_error(). The error states are stored within each
implementation's internal context data, so that it is possible to extend
the logic depending on what's suited for an implementation. The default
implementation requires few error states, but OpenSSL could report
various issues depending on its internal state so more is needed in
cryptohash_openssl.c, and the code is shaped so that we are always able to
grab the necessary information.
The core code is changed to adapt to the new error routine, painting
more "const" across the call stack where the static errors are stored,
particularly in authentication code paths on variables that provide
log details. This way, any future changes would warn if attempting to
free these strings. The MD5 authentication code was also a bit blurry
about the handling of "logdetail" (LOG sent to the postmaster), so
improve the related comments while at it.
The origin of the problem is 87ae969, which introduced the centralized
cryptohash facility. Extra changes are done for pgcrypto in v14 for the
non-OpenSSL code path to cope with the improvements done by this
commit.
Reported-by: Michael Mühlbeyer
Author: Michael Paquier
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/89B7F072-5BBE-4C92-903E-D83E865D9367@trivadis.com
Backpatch-through: 14
2022-01-11 01:55:16 +01:00
|
|
|
pg_cryptohash_errno
|
Move SHA2 routines to a new generic API layer for crypto hashes
Two new routines to allocate a hash context and to free it are created,
as these become necessary for the goal behind this refactoring: switch
all the cryptohash implementations for OpenSSL to use EVP (for FIPS and
also because upstream has not recommended the use of low-level cryptohash
functions for 20 years). Note that OpenSSL hides the internals of
cryptohash contexts since 1.1.0, so it is necessary to leave the
allocation to OpenSSL itself, explaining the need for those two new
routines. This part is going to require more work to properly track
hash contexts with resource owners, but this is not introduced here.
Still, this refactoring makes the move possible.
This reduces the number of routines for all SHA2 implementations from
twelve (SHA{224,256,384,512} with init, update and final calls) to five
(create, free, init, update and final calls) by incorporating the hash
type directly into the hash context data.
The new cryptohash routines are moved to a new file, called cryptohash.c
for the fallback implementations, with SHA2 specifics becoming a part
internal to src/common/. OpenSSL specifics are part of
cryptohash_openssl.c. This infrastructure is usable for more hash
types, like MD5 or HMAC.
Any code paths using the internal SHA2 routines are adapted to report
errors correctly, which accounts for most of the changes in this commit.
zones mostly impacted are checksum manifests, libpq and SCRAM.
Note that e21cbb4 was a first attempt to switch SHA2 to EVP, but it
lacked the refactoring needed for libpq, as done here.
This patch has been tested on Linux and Windows, with and without
OpenSSL, and down to 1.0.1, the oldest version supported on HEAD.
Author: Michael Paquier
Reviewed-by: Daniel Gustafsson
Discussion: https://postgr.es/m/20200924025314.GE7405@paquier.xyz
2020-12-02 02:37:20 +01:00
|
|
|
pg_cryptohash_type
|
2012-06-10 21:15:31 +02:00
|
|
|
pg_ctype_cache
|
2010-02-26 02:55:35 +01:00
|
|
|
pg_enc
|
|
|
|
pg_enc2gettext
|
|
|
|
pg_enc2name
|
|
|
|
pg_encname
|
2021-07-07 03:55:15 +02:00
|
|
|
pg_fe_sasl_mech
|
2021-05-12 19:14:10 +02:00
|
|
|
pg_funcptr_t
|
2010-02-26 02:55:35 +01:00
|
|
|
pg_gssinfo
|
Refactor HMAC implementations
Similarly to the cryptohash implementations, this refactors the existing
HMAC code into a single set of APIs that can be plugged with any crypto
libraries PostgreSQL is built with (only OpenSSL currently). If there
is no such libraries, a fallback implementation is available. Those new
APIs are designed similarly to the existing cryptohash layer, so there
is no real new design here, with the same logic around buffer bound
checks and memory handling.
HMAC has a dependency on cryptohashes, so all the cryptohash types
supported by cryptohash{_openssl}.c can be used with HMAC. This
refactoring is an advantage mainly for SCRAM, which included its own
implementation of HMAC with SHA256 without relying on the existing
crypto libraries even if PostgreSQL was built with their support.
This code has been tested on Windows and Linux, with and without
OpenSSL, across all the versions supported on HEAD from 1.1.1 down to
1.0.1. I have also checked that the implementations are working fine
using some sample results, a custom extension of my own, and doing
cross-checks across different major versions with SCRAM with the client
and the backend.
Author: Michael Paquier
Reviewed-by: Bruce Momjian
Discussion: https://postgr.es/m/X9m0nkEJEzIPXjeZ@paquier.xyz
2021-04-03 10:30:49 +02:00
|
|
|
pg_hmac_ctx
|
Improve error handling of HMAC computations
This is similar to b69aba7, except that this completes the work for
HMAC with a new routine called pg_hmac_error() that would provide more
context about the type of error that happened during a HMAC computation:
- The fallback HMAC implementation in hmac.c relies on cryptohashes, so
in some code paths it is necessary to return the error generated by
cryptohashes.
- For the OpenSSL implementation (hmac_openssl.c), the logic is very
similar to cryptohash_openssl.c, where the error context comes from
OpenSSL if one of its internal routines failed, with different error
codes if something internal to hmac_openssl.c failed or was incorrect.
Any in-core code paths that use the centralized HMAC interface are
related to SCRAM, for errors that are unlikely to happen, with
only SHA-256. It would be possible to see errors, for example, when
computing some HMACs with MD5 and OpenSSL FIPS enabled, and this commit
would help in reporting the correct errors but nothing in core uses
that. So, at the end, no backpatch to v14 is done, at least for now.
Errors in SCRAM related to the computation of the server key, stored
key, etc. need to pass down the potential error context string across
more layers of their respective call stacks for the frontend and the
backend, so each surrounding routine is adapted for this purpose.
Reviewed-by: Sergey Shinderuk
Discussion: https://postgr.es/m/Yd0N9tSAIIkFd+qi@paquier.xyz
2022-01-13 08:17:21 +01:00
|
|
|
pg_hmac_errno
|
2013-05-29 22:58:43 +02:00
|
|
|
pg_int64
|
2010-02-26 02:55:35 +01:00
|
|
|
pg_local_to_utf_combined
|
2011-04-09 05:11:37 +02:00
|
|
|
pg_locale_t
|
2017-05-17 21:52:16 +02:00
|
|
|
pg_mb_radix_tree
|
Refactor MD5 implementations according to new cryptohash infrastructure
This commit heavily reorganizes the MD5 implementations that exist in
the tree in various aspects.
First, MD5 is added to the list of options available in cryptohash.c and
cryptohash_openssl.c. This means that if building with OpenSSL, EVP is
used for MD5 instead of the fallback implementation that Postgres had
for ages. With the recent refactoring work for cryptohash functions,
this change is straightforward. If not building with OpenSSL, a
fallback implementation internal to src/common/ is used.
Second, this reduces the number of MD5 implementations present in the
tree from two to one, by moving the KAME implementation from pgcrypto to
src/common/, and by removing the implementation that existed in
src/common/. KAME was already structured with an init/update/final set
of routines by pgcrypto (see original pgcrypto/md5.h) for compatibility
with OpenSSL, so moving it to src/common/ has proved to be a
straightforward move, requiring no actual manipulation of the internals
of each routine. Some benchmarking has not shown any performance gap
between the two implementations.
Similarly to the fallback implementation used for SHA2, the fallback
implementation of MD5 is moved to src/common/md5.c with an internal
header called md5_int.h for the init, update and final routines. This
gets then consumed by cryptohash.c.
The original routines used for MD5-hashed passwords are moved to a
separate file called md5_common.c, also in src/common/, aimed at being
shared between all MD5 implementations as utility routines to keep
compatibility with any code relying on them.
Like the SHA2 changes, this commit had its round of tests on both Linux
and Windows, across all versions of OpenSSL supported on HEAD, with and
even without OpenSSL.
Author: Michael Paquier
Reviewed-by: Daniel Gustafsson
Discussion: https://postgr.es/m/20201106073434.GA4961@paquier.xyz
2020-12-10 03:59:10 +01:00
|
|
|
pg_md5_ctx
|
2010-02-26 02:55:35 +01:00
|
|
|
pg_on_exit_callback
|
2022-05-12 21:17:30 +02:00
|
|
|
pg_prng_state
|
2010-02-26 02:55:35 +01:00
|
|
|
pg_re_flags
|
2017-05-17 21:52:16 +02:00
|
|
|
pg_saslprep_rc
|
2021-01-23 03:33:04 +01:00
|
|
|
pg_sha1_ctx
|
2017-05-17 21:52:16 +02:00
|
|
|
pg_sha224_ctx
|
|
|
|
pg_sha256_ctx
|
|
|
|
pg_sha384_ctx
|
|
|
|
pg_sha512_ctx
|
2020-04-07 01:33:56 +02:00
|
|
|
pg_snapshot
|
2012-06-10 21:15:31 +02:00
|
|
|
pg_stack_base_t
|
2010-02-26 02:55:35 +01:00
|
|
|
pg_time_t
|
2021-03-10 04:09:50 +01:00
|
|
|
pg_time_usec_t
|
2010-02-26 02:55:35 +01:00
|
|
|
pg_tz
|
|
|
|
pg_tz_cache
|
|
|
|
pg_tzenum
|
2017-05-17 21:52:16 +02:00
|
|
|
pg_unicode_decompinfo
|
|
|
|
pg_unicode_decomposition
|
2020-10-11 12:09:01 +02:00
|
|
|
pg_unicode_norminfo
|
2017-05-17 21:52:16 +02:00
|
|
|
pg_unicode_normprops
|
2020-10-11 12:09:01 +02:00
|
|
|
pg_unicode_recompinfo
|
2010-02-26 02:55:35 +01:00
|
|
|
pg_utf_to_local_combined
|
|
|
|
pg_uuid_t
|
2012-06-10 21:15:31 +02:00
|
|
|
pg_wc_probefunc
|
2010-02-26 02:55:35 +01:00
|
|
|
pg_wchar
|
|
|
|
pg_wchar_tbl
|
2015-05-24 03:20:37 +02:00
|
|
|
pgp_armor_headers_state
|
2010-02-26 02:55:35 +01:00
|
|
|
pgsocket
|
|
|
|
pgsql_thing_t
|
|
|
|
pgssEntry
|
2020-11-26 13:18:05 +01:00
|
|
|
pgssGlobalStats
|
2010-02-26 02:55:35 +01:00
|
|
|
pgssHashKey
|
|
|
|
pgssSharedState
|
Allow pg_stat_statements to track planning statistics.
This commit makes pg_stat_statements support new GUC
pg_stat_statements.track_planning. If this option is enabled,
pg_stat_statements tracks the planning statistics of the statements,
e.g., the number of times the statement was planned, the total time
spent planning the statement, etc. This feature is useful for finding
statements that take a long time to plan. Previously, since
pg_stat_statements tracked only execution statistics, it could not be
used for that purpose.
The planning and execution statistics are stored separately at the end of
each phase, so there is not always a one-to-one relationship
between them. For example, if the statement is successfully planned
but fails in the execution phase, only its planning statistics are stored.
This may cause users to see different pg_stat_statements
results than in previous versions. To avoid this,
pg_stat_statements.track_planning needs to be disabled.
This commit bumps the version of pg_stat_statements to 1.8
since it changes the definition of pg_stat_statements function.
Author: Julien Rouhaud, Pascal Legrand, Thomas Munro, Fujii Masao
Reviewed-by: Sergei Kornilov, Tomas Vondra, Yoshikazu Imai, Haribabu Kommi, Tom Lane
Discussion: https://postgr.es/m/CAHGQGwFx_=DO-Gu-MfPW3VQ4qC7TfVdH2zHmvZfrGv6fQ3D-Tw@mail.gmail.com
Discussion: https://postgr.es/m/CAEepm=0e59Y_6Q_YXYCTHZkqOc6H2pJ54C_Xe=VFu50Aqqp_sA@mail.gmail.com
Discussion: https://postgr.es/m/DB6PR0301MB21352F6210E3B11934B0DCC790B00@DB6PR0301MB2135.eurprd03.prod.outlook.com
2020-04-02 04:20:19 +02:00
|
|
|
pgssStoreKind
|
2014-05-06 15:08:14 +02:00
|
|
|
pgssVersion
|
2022-04-07 06:29:46 +02:00
|
|
|
pgstat_entry_ref_hash_hash
|
|
|
|
pgstat_entry_ref_hash_iterator
|
2010-02-26 02:55:35 +01:00
|
|
|
pgstat_page
|
2022-04-07 06:29:46 +02:00
|
|
|
pgstat_snapshot_hash
|
2010-02-26 02:55:35 +01:00
|
|
|
pgstattuple_type
|
|
|
|
pgthreadlock_t
|
|
|
|
pid_t
|
2016-04-27 17:47:28 +02:00
|
|
|
pivot_field
|
2010-02-26 02:55:35 +01:00
|
|
|
planner_hook_type
|
2011-04-09 05:11:37 +02:00
|
|
|
plperl_array_info
|
2010-02-26 02:55:35 +01:00
|
|
|
plperl_call_data
|
2011-04-09 05:11:37 +02:00
|
|
|
plperl_interp_desc
|
2010-02-26 02:55:35 +01:00
|
|
|
plperl_proc_desc
|
2011-04-09 05:11:37 +02:00
|
|
|
plperl_proc_key
|
|
|
|
plperl_proc_ptr
|
2010-02-26 02:55:35 +01:00
|
|
|
plperl_query_desc
|
|
|
|
plperl_query_entry
|
2015-05-24 03:20:37 +02:00
|
|
|
plpgsql_CastHashEntry
|
|
|
|
plpgsql_CastHashKey
|
2010-02-26 02:55:35 +01:00
|
|
|
plpgsql_HashEnt
|
2016-12-13 16:51:32 +01:00
|
|
|
pltcl_call_state
|
2011-04-09 05:11:37 +02:00
|
|
|
pltcl_interp_desc
|
2010-02-26 02:55:35 +01:00
|
|
|
pltcl_proc_desc
|
2011-04-09 05:11:37 +02:00
|
|
|
pltcl_proc_key
|
|
|
|
pltcl_proc_ptr
|
2010-02-26 02:55:35 +01:00
|
|
|
pltcl_query_desc
|
2018-04-26 20:45:04 +02:00
|
|
|
pointer
|
2020-05-14 19:06:38 +02:00
|
|
|
polymorphic_actuals
|
2016-04-27 17:47:28 +02:00
|
|
|
pos_trgm
|
2012-06-10 21:15:31 +02:00
|
|
|
post_parse_analyze_hook_type
|
2016-05-02 15:23:55 +02:00
|
|
|
postprocess_result_function
|
2010-02-26 02:55:35 +01:00
|
|
|
pqbool
|
|
|
|
pqsigfunc
|
|
|
|
printQueryOpt
|
|
|
|
printTableContent
|
|
|
|
printTableFooter
|
|
|
|
printTableOpt
|
|
|
|
printTextFormat
|
|
|
|
printTextLineFormat
|
|
|
|
printTextLineWrap
|
|
|
|
printTextRule
|
2022-07-25 20:24:50 +02:00
|
|
|
printXheaderWidthType
|
2018-04-26 20:45:04 +02:00
|
|
|
printfunc
|
2010-02-26 02:55:35 +01:00
|
|
|
priv_map
|
2015-05-24 03:20:37 +02:00
|
|
|
process_file_callback_t
|
2010-02-26 02:55:35 +01:00
|
|
|
process_sublinks_context
|
2016-08-16 00:09:55 +02:00
|
|
|
proclist_head
|
|
|
|
proclist_mutable_iter
|
|
|
|
proclist_node
|
2010-02-26 02:55:35 +01:00
|
|
|
promptStatus_t
|
2021-03-10 03:40:17 +01:00
|
|
|
pthread_barrier_t
|
2010-02-26 02:55:35 +01:00
|
|
|
pthread_cond_t
|
|
|
|
pthread_key_t
|
|
|
|
pthread_mutex_t
|
|
|
|
pthread_once_t
|
|
|
|
pthread_t
|
2016-12-13 16:51:32 +01:00
|
|
|
ptrdiff_t
|
2010-02-26 02:55:35 +01:00
|
|
|
pull_var_clause_context
|
2011-11-14 18:12:23 +01:00
|
|
|
pull_varattnos_context
|
2010-02-26 02:55:35 +01:00
|
|
|
pull_varnos_context
|
|
|
|
pull_vars_context
|
|
|
|
pullup_replace_vars_context
|
2015-05-24 03:20:37 +02:00
|
|
|
pushdown_safety_info
|
2021-05-12 19:14:10 +02:00
|
|
|
qc_hash_func
|
2010-02-26 02:55:35 +01:00
|
|
|
qsort_arg_comparator
|
|
|
|
qsort_comparator
|
|
|
|
query_pathkeys_callback
|
|
|
|
radius_attribute
|
|
|
|
radius_packet
|
|
|
|
rangeTableEntry_used_context
|
|
|
|
rank_context
|
2011-04-09 05:11:37 +02:00
|
|
|
rbt_allocfunc
|
|
|
|
rbt_combiner
|
2010-02-26 02:55:35 +01:00
|
|
|
rbt_comparator
|
|
|
|
rbt_freefunc
|
|
|
|
reduce_outer_joins_state
|
2018-04-26 20:45:04 +02:00
|
|
|
reference
|
2013-05-29 22:58:43 +02:00
|
|
|
regex_arc_t
|
2010-02-26 02:55:35 +01:00
|
|
|
regex_t
|
2014-05-06 15:08:14 +02:00
|
|
|
regexp
|
2010-02-26 02:55:35 +01:00
|
|
|
regexp_matches_ctx
|
2015-05-24 03:20:37 +02:00
|
|
|
registered_buffer
|
2010-02-26 02:55:35 +01:00
|
|
|
regmatch_t
|
|
|
|
regoff_t
|
|
|
|
regproc
|
|
|
|
relopt_bool
|
2020-03-19 19:40:45 +01:00
|
|
|
relopt_enum
|
|
|
|
relopt_enum_elt_def
|
2010-02-26 02:55:35 +01:00
|
|
|
relopt_gen
|
|
|
|
relopt_int
|
|
|
|
relopt_kind
|
|
|
|
relopt_parse_elt
|
|
|
|
relopt_real
|
|
|
|
relopt_string
|
|
|
|
relopt_type
|
|
|
|
relopt_value
|
Implement operator class parameters
PostgreSQL provides a set of template index access methods, where opclasses
have much freedom in the semantics of indexing. These index AMs are GiST, GIN,
SP-GiST and BRIN. There, opclasses define the representation of keys,
operations on them, and the supported search strategies. So, it's natural that
opclasses may face some tradeoffs that require user-side decisions. This
commit implements opclass parameters, allowing users to set some values that
tell the opclass how to index the particular dataset.
This commit doesn't introduce new storage in system catalog. Instead it uses
pg_attribute.attoptions, which is used for table column storage options but
unused for index attributes.
To avoid changing the signature of each opclass support function, we
implement a unified way to pass options to opclass support functions. Options
are set to fn_expr as the constant bytea expression. It's possible due to the
fact that opclass support functions are executed outside of expressions, so
fn_expr is unused for them.
This commit comes with some examples of opclass options usage. We parametrize
signature length in GiST. That applies to multiple opclasses: tsvector_ops,
gist__intbig_ops, gist_ltree_ops, gist__ltree_ops, gist_trgm_ops and
gist_hstore_ops. Also we parametrize maximum number of integer ranges for
gist__int_ops. However, the main future usage of this feature is expected
to be json, where users would be able to specify which way to index particular
json parts.
Catversion is bumped.
Discussion: https://postgr.es/m/d22c3a18-31c7-1879-fc11-4c1ce2f5e5af%40postgrespro.ru
Author: Nikita Glukhov, revised by me
Reviewed-by: Nikolay Shaplov, Robert Haas, Tom Lane, Tomas Vondra, Alvaro Herrera
2020-03-30 18:17:11 +02:00
|
|
|
relopts_validator
|
2010-02-26 02:55:35 +01:00
|
|
|
remoteConn
|
|
|
|
remoteConnHashEnt
|
|
|
|
remoteDep
|
|
|
|
rendezvousHashEntry
|
|
|
|
replace_rte_variables_callback
|
|
|
|
replace_rte_variables_context
|
2020-05-16 17:49:14 +02:00
|
|
|
ret_type
|
2020-11-04 10:21:18 +01:00
|
|
|
rewind_source
|
2010-02-26 02:55:35 +01:00
|
|
|
rewrite_event
|
Allow specifying row filters for logical replication of tables.
This feature adds row filtering for publication tables. When a publication
is defined or modified, an optional WHERE clause can be specified. Rows
that don't satisfy this WHERE clause will be filtered out. This allows a
set of tables to be partially replicated. The row filter is per table. A
new row filter can be added simply by specifying a WHERE clause after the
table name. The WHERE clause must be enclosed by parentheses.
The row filter WHERE clause for a table added to a publication that
publishes UPDATE and/or DELETE operations must contain only columns that
are covered by REPLICA IDENTITY. The row filter WHERE clause for a table
added to a publication that publishes INSERT can use any column. If the
row filter evaluates to NULL, it is regarded as "false". The WHERE clause
only allows simple expressions that don't have user-defined functions,
user-defined operators, user-defined types, user-defined collations,
non-immutable built-in functions, or references to system columns. These
restrictions could be addressed in the future.
If you choose to do the initial table synchronization, only data that
satisfies the row filters is copied to the subscriber. If the subscription
has several publications in which a table has been published with
different WHERE clauses, rows that satisfy ANY of the expressions will be
copied. If a subscriber is a pre-15 version, the initial table
synchronization won't use row filters even if they are defined in the
publisher.
The row filters are applied before publishing the changes. If the
subscription has several publications in which the same table has been
published with different filters (for the same publish operation), those
expressions get OR'ed together so that rows satisfying any of the
expressions will be replicated.
This means all the other filters become redundant if (a) one of the
publications has no filter at all, (b) one of the publications was
created using FOR ALL TABLES, (c) one of the publications was created
using FOR ALL TABLES IN SCHEMA and the table belongs to that same schema.
If your publication contains a partitioned table, the publication
parameter publish_via_partition_root determines if it uses the partition's
row filter (if the parameter is false, the default) or the root
partitioned table's row filter.
psql commands \dRp+ and \d <table-name> will display any row filters.
Author: Hou Zhijie, Euler Taveira, Peter Smith, Ajin Cherian
Reviewed-by: Greg Nancarrow, Haiying Tang, Amit Kapila, Tomas Vondra, Dilip Kumar, Vignesh C, Alvaro Herrera, Andres Freund, Wei Wang
Discussion: https://www.postgresql.org/message-id/flat/CAHE3wggb715X%2BmK_DitLXF25B%3DjE6xyNCH4YOwM860JR7HarGQ%40mail.gmail.com
2022-02-22 03:24:12 +01:00
|
|
|
rf_context
|
2010-02-26 02:55:35 +01:00
|
|
|
rm_detail_t
|
2011-04-09 05:11:37 +02:00
|
|
|
role_auth_extra
|
2015-05-24 03:20:37 +02:00
|
|
|
row_security_policy_hook_type
|
2010-02-26 02:55:35 +01:00
|
|
|
rsv_callback
|
2021-05-12 19:14:10 +02:00
|
|
|
saophash_hash
|
2010-02-26 02:55:35 +01:00
|
|
|
save_buffer
|
2017-05-17 21:52:16 +02:00
|
|
|
scram_state
|
|
|
|
scram_state_enum
|
2016-12-13 16:51:32 +01:00
|
|
|
sem_t
|
2010-02-26 02:55:35 +01:00
|
|
|
sequence_magic
|
2015-05-24 03:20:37 +02:00
|
|
|
set_join_pathlist_hook_type
|
|
|
|
set_rel_pathlist_hook_type
|
2014-05-06 15:08:14 +02:00
|
|
|
shm_mq
|
|
|
|
shm_mq_handle
|
2015-05-24 03:20:37 +02:00
|
|
|
shm_mq_iovec
|
2014-05-06 15:08:14 +02:00
|
|
|
shm_mq_result
|
|
|
|
shm_toc
|
|
|
|
shm_toc_entry
|
|
|
|
shm_toc_estimator
|
2022-05-13 15:31:06 +02:00
|
|
|
shmem_request_hook_type
|
2010-02-26 02:55:35 +01:00
|
|
|
shmem_startup_hook_type
|
|
|
|
sig_atomic_t
|
|
|
|
sigjmp_buf
|
|
|
|
signedbitmapword
|
|
|
|
sigset_t
|
|
|
|
size_t
|
2013-05-29 22:58:43 +02:00
|
|
|
slist_head
|
|
|
|
slist_iter
|
2010-02-26 02:55:35 +01:00
|
|
|
slist_mutable_iter
|
2013-05-29 22:58:43 +02:00
|
|
|
slist_node
|
2010-02-26 02:55:35 +01:00
|
|
|
slock_t
|
|
|
|
socket_set
|
2022-05-12 21:17:30 +02:00
|
|
|
socklen_t
|
2012-06-10 21:15:31 +02:00
|
|
|
spgBulkDeleteState
|
|
|
|
spgChooseIn
|
|
|
|
spgChooseOut
|
|
|
|
spgChooseResultType
|
|
|
|
spgConfigIn
|
|
|
|
spgConfigOut
|
|
|
|
spgInnerConsistentIn
|
|
|
|
spgInnerConsistentOut
|
|
|
|
spgLeafConsistentIn
|
|
|
|
spgLeafConsistentOut
|
|
|
|
spgNodePtr
|
|
|
|
spgPickSplitIn
|
|
|
|
spgPickSplitOut
|
|
|
|
spgVacPendingItem
|
|
|
|
spgxlogAddLeaf
|
|
|
|
spgxlogAddNode
|
|
|
|
spgxlogMoveLeafs
|
|
|
|
spgxlogPickSplit
|
|
|
|
spgxlogSplitTuple
|
|
|
|
spgxlogState
|
|
|
|
spgxlogVacuumLeaf
|
|
|
|
spgxlogVacuumRedirect
|
|
|
|
spgxlogVacuumRoot
|
2017-05-17 21:52:16 +02:00
|
|
|
split_pathtarget_context
|
2018-06-30 18:07:27 +02:00
|
|
|
split_pathtarget_item
|
2010-02-26 02:55:35 +01:00
|
|
|
sql_error_callback_arg
|
2012-06-10 21:15:31 +02:00
|
|
|
sqlparseInfo
|
|
|
|
sqlparseState
|
2010-02-26 02:55:35 +01:00
|
|
|
ss_lru_item_t
|
|
|
|
ss_scan_location_t
|
|
|
|
ss_scan_locations_t
|
|
|
|
ssize_t
|
2013-05-29 22:58:43 +02:00
|
|
|
standard_qp_extra
|
2010-02-26 02:55:35 +01:00
|
|
|
stemmer_module
|
|
|
|
stmtCacheEntry
|
2012-06-10 21:15:31 +02:00
|
|
|
storeInfo
|
|
|
|
storeRes_func
|
|
|
|
stream_stop_callback
|
2018-04-26 20:45:04 +02:00
|
|
|
string
|
2010-02-26 02:55:35 +01:00
|
|
|
substitute_actual_parameters_context
|
|
|
|
substitute_actual_srf_parameters_context
|
|
|
|
substitute_phv_relids_context
|
|
|
|
symbol
|
2011-04-09 05:11:37 +02:00
|
|
|
tablespaceinfo
|
2010-02-26 02:55:35 +01:00
|
|
|
teSection
|
2011-04-09 05:11:37 +02:00
|
|
|
temp_tablespaces_extra
|
2014-05-06 15:08:14 +02:00
|
|
|
test_re_flags
|
|
|
|
test_regex_ctx
|
|
|
|
test_shm_mq_header
|
2019-05-22 18:55:34 +02:00
|
|
|
test_spec
|
2017-05-17 21:52:16 +02:00
|
|
|
test_start_function
|
2010-02-26 02:55:35 +01:00
|
|
|
text
|
|
|
|
timeKEY
|
|
|
|
time_t
|
2013-05-29 22:58:43 +02:00
|
|
|
timeout_handler_proc
|
|
|
|
timeout_params
|
2011-04-09 05:11:37 +02:00
|
|
|
timerCA
|
2010-02-26 02:55:35 +01:00
|
|
|
tlist_vinfo
|
2015-05-24 03:20:37 +02:00
|
|
|
toast_compress_header
|
2022-11-14 03:58:10 +01:00
|
|
|
tokenize_error_callback_arg
|
2010-07-06 21:18:19 +02:00
|
|
|
transferMode
|
2012-06-10 21:15:31 +02:00
|
|
|
transfer_thread_arg
|
2010-02-26 02:55:35 +01:00
|
|
|
trgm
|
2013-05-29 22:58:43 +02:00
|
|
|
trgm_mb_char
|
2015-05-24 03:20:37 +02:00
|
|
|
trivalue
|
2010-02-26 02:55:35 +01:00
|
|
|
tsKEY
|
2018-04-26 20:45:04 +02:00
|
|
|
ts_parserstate
|
|
|
|
ts_tokenizer
|
2010-02-26 02:55:35 +01:00
|
|
|
ts_tokentype
|
|
|
|
tsearch_readline_state
|
2016-10-15 02:22:51 +02:00
|
|
|
tuplehash_hash
|
|
|
|
tuplehash_iterator
|
2018-04-26 20:45:04 +02:00
|
|
|
type
|
2010-02-26 02:55:35 +01:00
|
|
|
tzEntry
|
|
|
|
u_char
|
|
|
|
u_int
|
|
|
|
uchr
|
|
|
|
uid_t
|
2015-05-24 03:20:37 +02:00
|
|
|
uint128
|
2010-02-26 02:55:35 +01:00
|
|
|
uint16
|
2011-04-09 05:11:37 +02:00
|
|
|
uint16_t
|
2010-02-26 02:55:35 +01:00
|
|
|
uint32
|
|
|
|
uint32_t
|
|
|
|
uint64
|
2018-04-26 20:45:04 +02:00
|
|
|
uint64_t
|
2010-02-26 02:55:35 +01:00
|
|
|
uint8
|
2015-05-24 03:20:37 +02:00
|
|
|
uint8_t
|
2010-02-26 02:55:35 +01:00
|
|
|
uintptr_t
|
2015-05-24 03:20:37 +02:00
|
|
|
unicodeStyleBorderFormat
|
|
|
|
unicodeStyleColumnFormat
|
|
|
|
unicodeStyleFormat
|
|
|
|
unicodeStyleRowFormat
|
|
|
|
unicode_linestyle
|
|
|
|
unit_conversion
|
2011-04-09 05:11:37 +02:00
|
|
|
unlogged_relation_entry
|
2015-05-24 03:20:37 +02:00
|
|
|
utf_local_conversion_func
|
2016-12-13 16:51:32 +01:00
|
|
|
uuidKEY
|
2010-02-26 02:55:35 +01:00
|
|
|
uuid_rc_t
|
2016-04-27 17:47:28 +02:00
|
|
|
uuid_sortsupport_state
|
2010-02-26 02:55:35 +01:00
|
|
|
uuid_t
|
|
|
|
va_list
|
2015-05-24 03:20:37 +02:00
|
|
|
vacuumingOptions
|
2010-02-26 02:55:35 +01:00
|
|
|
validate_string_relopt
|
2015-05-24 03:20:37 +02:00
|
|
|
varatt_expanded
|
2010-02-26 02:55:35 +01:00
|
|
|
varattrib_1b
|
|
|
|
varattrib_1b_e
|
|
|
|
varattrib_4b
|
2016-04-27 17:47:28 +02:00
|
|
|
vbits
verifier_context
walrcv_check_conninfo_fn
walrcv_connect_fn
walrcv_create_slot_fn
walrcv_disconnect_fn
walrcv_endstreaming_fn
walrcv_exec_fn
walrcv_get_backend_pid_fn
walrcv_get_conninfo_fn
walrcv_get_senderinfo_fn
walrcv_identify_system_fn
walrcv_readtimelinehistoryfile_fn
walrcv_receive_fn
walrcv_send_fn
walrcv_server_version_fn
walrcv_startstreaming_fn
wchar2mb_with_len_converter
wchar_t
win32_deadchild_waitinfo
wint_t
worker_state
worktable
wrap
xl_brin_createidx
xl_brin_desummarize
xl_brin_insert
xl_brin_revmap_extend
xl_brin_samepage_update
xl_brin_update
xl_btree_dedup
xl_btree_delete
xl_btree_insert
xl_btree_mark_page_halfdead
xl_btree_metadata
xl_btree_newroot
xl_btree_reuse_page
xl_btree_split
xl_btree_unlink_page
xl_btree_update
xl_btree_vacuum
xl_clog_truncate
xl_commit_ts_truncate
xl_dbase_create_file_copy_rec
xl_dbase_create_wal_log_rec
xl_dbase_drop_rec
xl_end_of_recovery
xl_hash_add_ovfl_page
xl_hash_delete
xl_hash_init_bitmap_page
xl_hash_init_meta_page
xl_hash_insert
xl_hash_move_page_contents
xl_hash_split_allocate_page
xl_hash_split_complete
xl_hash_squeeze_page
xl_hash_update_meta_page
xl_hash_vacuum_one_page
xl_heap_confirm
xl_heap_delete
xl_heap_freeze_page
xl_heap_freeze_plan
xl_heap_freeze_tuple
xl_heap_header
xl_heap_inplace
xl_heap_insert
xl_heap_lock
xl_heap_lock_updated
xl_heap_multi_insert
xl_heap_new_cid
xl_heap_prune
xl_heap_rewrite_mapping
xl_heap_truncate
xl_heap_update
xl_heap_vacuum
xl_heap_visible
xl_invalid_page
xl_invalid_page_key
xl_invalidations
xl_logical_message
xl_multi_insert_tuple
xl_multixact_create
xl_multixact_truncate
xl_overwrite_contrecord
xl_parameter_change
xl_relmap_update
xl_replorigin_drop
xl_replorigin_set
xl_restore_point
xl_running_xacts
xl_seq_rec
xl_smgr_create
xl_smgr_truncate
xl_standby_lock
xl_standby_locks
xl_tblspc_create_rec
xl_tblspc_drop_rec
xl_xact_abort
xl_xact_assignment
xl_xact_commit
xl_xact_dbinfo
xl_xact_invals
xl_xact_origin
xl_xact_parsed_abort
xl_xact_parsed_commit
xl_xact_parsed_prepare
xl_xact_prepare
xl_xact_relfilelocators
xl_xact_stats_item
xl_xact_stats_items
xl_xact_subxacts
xl_xact_twophase
xl_xact_xinfo
xmlBuffer
xmlBufferPtr
xmlChar
xmlDocPtr
xmlErrorPtr
xmlExternalEntityLoader
xmlGenericErrorFunc
xmlNodePtr
xmlNodeSetPtr
xmlParserCtxtPtr
xmlParserInputPtr
xmlStructuredErrorFunc
xmlTextWriter
xmlTextWriterPtr
xmlXPathCompExprPtr
xmlXPathContextPtr
xmlXPathObjectPtr
xmltype
xpath_workspace
xsltSecurityPrefsPtr
xsltStylesheetPtr
xsltTransformContextPtr
yy_parser
yy_size_t
yyscan_t
z_stream
z_streamp
zic_t