src/backend/access/gin/README

Gin for PostgreSQL
==================

Gin was sponsored by jfg://networks (http://www.jfg-networks.com/)

Gin stands for Generalized Inverted Index and should be considered as a genie,
not a drink.

Generalized means that the index does not know which operation it accelerates.
It instead works with custom strategies, defined for specific data types (read
"Index Method Strategies" in the PostgreSQL documentation). In that sense, Gin
is similar to GiST and differs from btree indices, which have predefined,
comparison-based operations.

An inverted index is an index structure storing a set of (key, posting list)
pairs, where 'posting list' is a set of heap rows in which the key occurs.
(A text document would usually contain many keys.) The primary goal of
Gin indices is support for highly scalable, full-text search in PostgreSQL.

A Gin index consists of a B-tree index constructed over key values,
where each key is an element of one or more indexed items (element of array,
lexeme for tsvector) and where each tuple in a leaf page contains either a
pointer to a B-tree over item pointers (posting tree), or a simple list of item
pointers (posting list) if the list is small enough.

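The (key, posting list) idea described above can be sketched in a few lines
of C. This is only an illustration under simplifying assumptions, not the GIN
code: keys are plain ints, "rows" are small integer ids instead of heap item
pointers, a sorted array stands in for the entry B-tree, and every name
(InvertedIndex, Entry, index_item, lookup, MAX_KEYS, MAX_POSTINGS) is
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_KEYS     16   /* hypothetical capacity limits for the sketch */
#define MAX_POSTINGS 16

typedef struct
{
    int     key;                      /* the indexed key value */
    int     postings[MAX_POSTINGS];   /* ids of rows containing the key, ascending */
    size_t  npostings;
} Entry;

typedef struct
{
    Entry   entries[MAX_KEYS];        /* kept sorted by key, standing in for the entry tree */
    size_t  nentries;
} InvertedIndex;

/* Binary search: returns the slot where key is, or where it should be inserted. */
size_t find_slot(const InvertedIndex *idx, int key, int *found)
{
    size_t lo = 0, hi = idx->nentries;

    while (lo < hi)
    {
        size_t mid = lo + (hi - lo) / 2;

        if (idx->entries[mid].key < key)
            lo = mid + 1;
        else
            hi = mid;
    }
    *found = (lo < idx->nentries && idx->entries[lo].key == key);
    return lo;
}

/*
 * Add every key of one item (row) to the index.  Rows are assumed to be
 * added in ascending id order, so each posting list stays sorted.
 */
void index_item(InvertedIndex *idx, int row, const int *keys, size_t nkeys)
{
    for (size_t i = 0; i < nkeys; i++)
    {
        int     found;
        size_t  pos = find_slot(idx, keys[i], &found);
        Entry  *e;

        if (!found)
        {
            assert(idx->nentries < MAX_KEYS);
            /* shift larger keys right to keep the entries ordered */
            for (size_t j = idx->nentries; j > pos; j--)
                idx->entries[j] = idx->entries[j - 1];
            idx->entries[pos].key = keys[i];
            idx->entries[pos].npostings = 0;
            idx->nentries++;
        }
        e = &idx->entries[pos];
        /* a key repeated inside one item is recorded only once */
        if (e->npostings > 0 && e->postings[e->npostings - 1] == row)
            continue;
        assert(e->npostings < MAX_POSTINGS);
        e->postings[e->npostings++] = row;
    }
}

/* Return the entry (and so the posting list) for key, or NULL if absent. */
const Entry *lookup(const InvertedIndex *idx, int key)
{
    int     found;
    size_t  pos = find_slot(idx, key, &found);

    return found ? &idx->entries[pos] : NULL;
}
```

Indexing row 10 with keys {3, 1} and row 11 with keys {1, 2, 1} leaves three
entries, and lookup(&idx, 1) returns the posting list {10, 11}. In the real
index a posting list that grows too large is moved out into a posting tree;
the fixed-size array here simply asserts instead.
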
Note: There is no delete operation in the key (entry) tree. The reason for
this is that in our experience, the set of distinct words in a large corpus
changes very slowly. This greatly simplifies the code and concurrency
algorithms.

Core PostgreSQL includes built-in Gin support for one-dimensional arrays
(e.g. integer[], text[]). The following operations are available:

  * contains: value_array @> query_array
  * overlaps: value_array && query_array
  * is contained by: value_array <@ query_array

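As a rough illustration of what the three operations above mean in terms of
keys, here is a small C sketch. It assumes integer arrays without NULL
elements and uses plain linear search; the function names (has_key,
arr_contains, arr_overlaps, arr_contained_by) are made up for this example and
are not part of the opclass.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* True if key occurs anywhere in arr[0..n). */
bool has_key(const int *arr, size_t n, int key)
{
    assert(arr != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (arr[i] == key)
            return true;
    return false;
}

/* value @> query: every query key must occur in value. */
bool arr_contains(const int *value, size_t nv, const int *query, size_t nq)
{
    for (size_t i = 0; i < nq; i++)
        if (!has_key(value, nv, query[i]))
            return false;
    return true;                /* an empty query is contained in anything */
}

/* value && query: at least one key is shared. */
bool arr_overlaps(const int *value, size_t nv, const int *query, size_t nq)
{
    for (size_t i = 0; i < nq; i++)
        if (has_key(value, nv, query[i]))
            return true;
    return false;               /* an empty query overlaps nothing */
}

/* value <@ query: every value key must occur in query. */
bool arr_contained_by(const int *value, size_t nv, const int *query, size_t nq)
{
    return arr_contains(query, nq, value, nv);
}
```

With value = {1, 2, 3} and query = {2, 3}, arr_contains and arr_overlaps are
true while arr_contained_by is false. The empty-array cases are what make
these operators awkward for a purely key-driven index: an item with no keys
can still satisfy "is contained by".
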
Synopsis
--------

=# create index txt_idx on aa using gin(a);

Features
--------

  * Concurrency
  * Write-Ahead Logging (WAL). (Recoverability from crashes.)
  * User-defined opclasses. (The scheme is similar to GiST.)
  * Optimized index creation (Makes use of maintenance_work_mem to accumulate
    postings in memory.)
  * Text search support via an opclass
  * Soft upper limit on the returned result set using a GUC variable:
    gin_fuzzy_search_limit

Gin Fuzzy Limit
---------------

There are often situations when a full-text search returns a very large set of
results. Since reading tuples from the disk and sorting them could take a
lot of time, this is unacceptable for production. (Note that the search
itself is very fast.)

Such queries usually contain very frequent lexemes, so the results are not
very helpful. To facilitate execution of such queries, Gin has a configurable
soft upper limit on the size of the returned set, determined by the
'gin_fuzzy_search_limit' GUC variable. This is set to 0 by default (no
limit).

If a non-zero search limit is set, then the returned set is a subset of the
whole result set, chosen at random.

"Soft" means that the actual number of returned results could differ
from the specified limit, depending on the query and the quality of the
system's random number generator.

From experience, a value of 'gin_fuzzy_search_limit' in the thousands
(e.g. 5000-20000) works well. This means that 'gin_fuzzy_search_limit' will
have no effect for queries returning a result set with fewer tuples than this
number.

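One plausible way to get such a soft, random limit is to keep each candidate
result with probability limit / predicted, where "predicted" is an estimate of
the full result size. The C sketch below illustrates that idea only; it is not
taken from the GIN code, and the names (Rng, rng_next, fuzzy_filter) as well as
the small deterministic generator are assumptions made so the example is
reproducible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Small deterministic generator, a hypothetical stand-in for the system RNG. */
typedef struct
{
    uint32_t state;
} Rng;

/* Returns a pseudo-random value uniformly distributed in [0, 1). */
double rng_next(Rng *r)
{
    r->state = r->state * 1664525u + 1013904223u;   /* linear congruential step */
    return (double) (r->state >> 8) / 16777216.0;   /* top 24 bits / 2^24 */
}

/*
 * Copy a random subset of results[0..n) into out and return its size.
 * limit == 0 means "no limit"; predicted is the estimated full result count.
 * The relative order of the kept results is preserved.
 */
size_t fuzzy_filter(const int *results, size_t n, size_t predicted,
                    size_t limit, Rng *rng, int *out)
{
    size_t kept = 0;
    double keep_prob;

    assert(results != NULL && out != NULL && rng != NULL);

    if (limit == 0 || predicted <= limit)
    {
        /* the limit has no effect on result sets smaller than the limit */
        for (size_t i = 0; i < n; i++)
            out[i] = results[i];
        return n;
    }

    keep_prob = (double) limit / (double) predicted;
    for (size_t i = 0; i < n; i++)
        if (rng_next(rng) < keep_prob)
            out[kept++] = results[i];
    return kept;
}
```

Because each result is kept or dropped independently, the size of the returned
set only approximates the limit, which is the "soft" behaviour described
above.
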
Index structure
---------------

The "items" that a GIN index indexes are composite values that contain
zero or more "keys". For example, an item might be an integer array, and
then the keys would be the individual integer values. The index actually
stores and searches for the key values, not the items per se. In the
pg_opclass entry for a GIN opclass, the opcintype is the data type of the
items, and the opckeytype is the data type of the keys. GIN is optimized
for cases where items contain many keys and the same key values appear
in many different items.

A GIN index contains a metapage, a btree of key entries, and possibly
"posting tree" pages, which hold the overflow when a key entry acquires
too many heap tuple pointers to fit in a btree page. Additionally, if the
fast-update feature is enabled, there can be "list pages" holding "pending"
key entries that haven't yet been merged into the main btree. The list
pages have to be scanned linearly when doing a search, so the pending
entries should be merged into the main btree before there get to be too
many of them. The advantage of the pending list is that bulk insertion of
a few thousand entries can be much faster than retail insertion. (The win
comes mainly from not having to do multiple searches/insertions when the
same key appears in multiple new heap tuples.)

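The saving from the pending list can be made concrete with a small C sketch.
It assumes pending entries are simple (key, row) pairs of ints and models a
merge as "sort, then do one main-tree insertion per distinct key"; the names
(Pending, cmp_pending, merge_cost) are hypothetical and the real merge code is
more involved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct
{
    int key;    /* key value extracted from a new heap tuple */
    int row;    /* stand-in for the heap tuple's item pointer */
} Pending;

/* Order pending entries by key, then by row. */
int cmp_pending(const void *a, const void *b)
{
    const Pending *x = a;
    const Pending *y = b;

    if (x->key != y->key)
        return (x->key < y->key) ? -1 : 1;
    if (x->row != y->row)
        return (x->row < y->row) ? -1 : 1;
    return 0;
}

/*
 * Sort the pending entries and return how many main-tree insertions a bulk
 * merge needs: one per distinct key, however many rows share that key.
 * Retail insertion would need one search/insertion per entry, that is, n.
 */
size_t merge_cost(Pending *list, size_t n)
{
    size_t descents = 0;

    assert(list != NULL || n == 0);
    if (n == 0)
        return 0;

    qsort(list, n, sizeof(Pending), cmp_pending);
    for (size_t i = 0; i < n; i++)
        if (i == 0 || list[i].key != list[i - 1].key)
            descents++;
    return descents;
}
```

For six pending entries that share only two distinct keys, merge_cost returns
2, against six separate insertions in the retail case.
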
Key entries are nominally of the same IndexTuple format as used in other
index types, but since a leaf key entry typically refers to multiple heap
tuples, there are significant differences. (See GinFormTuple, which works
by building a "normal" index tuple and then modifying it.) The points to
know are:

* In a single-column index, a key tuple just contains the key datum, but
  in a multi-column index, a key tuple contains the pair (column number,
  key datum) where the column number is stored as an int2. This is needed
  to support different key data types in different columns. This much of
  the tuple is built by index_form_tuple according to the usual rules.
  The column number (if present) can never be null, but the key datum can
  be, in which case a null bitmap is present as usual. (As usual for index
  tuples, the size of the null bitmap is fixed at INDEX_MAX_KEYS.)

* If the key datum is null (i.e., IndexTupleHasNulls() is true), then
  just after the nominal index data (i.e., at offset IndexInfoFindDataOffset
  or IndexInfoFindDataOffset + sizeof(int2)) there is a byte indicating
  the "category" of the null entry. These are the possible categories:
      1 = ordinary null key value extracted from an indexable item
      2 = placeholder for zero-key indexable item
      3 = placeholder for null indexable item
  Placeholder null entries are inserted into the index because otherwise
  there would be no index entry at all for an empty or null indexable item,
  which would mean that full index scans couldn't be done and various corner
  cases would give wrong answers. The different categories of null entries
  are treated as distinct keys by the btree, but heap itempointers for the
  same category of null entry are merged into one index entry just as happens
  with ordinary key entries.

* In a key entry at the btree leaf level, at the next SHORTALIGN boundary,
  there is a list of item pointers, in compressed format (see Posting List
  Compression section), pointing to the heap tuples for which the indexable
  items contain this key. This is called the "posting list".

  If the list would be too big for the index tuple to fit on an index page, the
  ItemPointers are pushed out to a separate posting page or pages, and none
  appear in the key entry itself. The separate pages are called a "posting
  tree" (see below). Note that in either case, the ItemPointers associated with
  a key can easily be read out in sorted order; this is relied on by the scan
  algorithms.

* The index tuple header fields of a leaf key entry are abused as follows:
|
|
|
|
|
|
|
|
1) Posting list case:
|
|
|
|
|
|
|
|
* ItemPointerGetBlockNumber(&itup->t_tid) contains the offset from index
|
|
|
|
tuple start to the posting list.
|
|
|
|
Access macros: GinGetPostingOffset(itup) / GinSetPostingOffset(itup,n)
|
|
|
|
|
|
|
|
* ItemPointerGetOffsetNumber(&itup->t_tid) contains the number of elements
|
|
|
|
in the posting list (number of heap itempointers).
|
|
|
|
Access macros: GinGetNPosting(itup) / GinSetNPosting(itup,n)
|
|
|
|
|
|
|
|
* If IndexTupleHasNulls(itup) is true, the null category byte can be
|
|
|
|
accessed/set with GinGetNullCategory(itup,gs) / GinSetNullCategory(itup,gs,c)
|
|
|
|
|
|
|
|
* The posting list can be accessed with GinGetPosting(itup)
|
|
|
|
|
2019-07-16 06:23:53 +02:00
|
|
|
* If GinItupIsCompressed(itup), the posting list is stored in compressed
|
2014-01-22 17:51:48 +01:00
|
|
|
format. Otherwise it is just an array of ItemPointers. New tuples are always
|
|
|
|
stored in compressed format, uncompressed items can be present if the
|
|
|
|
database was migrated from 9.3 or earlier version.
|
|
|
|
|
2) Posting tree case:

* ItemPointerGetBlockNumber(&itup->t_tid) contains the index block number
  of the root of the posting tree.
  Access macros: GinGetPostingTree(itup) / GinSetPostingTree(itup, blkno)

* ItemPointerGetOffsetNumber(&itup->t_tid) contains the magic number
  GIN_TREE_POSTING, which distinguishes this from the posting-list case
  (it's large enough that that many heap itempointers couldn't possibly
  fit on an index page). This value is inserted automatically by the
  GinSetPostingTree macro.

* If IndexTupleHasNulls(itup) is true, the null category byte can be
  accessed/set with GinGetNullCategory(itup,gs) / GinSetNullCategory(itup,gs,c)

* The posting list is not present and must not be accessed.

Use the macro GinIsPostingTree(itup) to determine which case applies.

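The way the two t_tid fields are reused in the two leaf-tuple cases can be
sketched roughly as follows. This is a simplified model for illustration, not
the GIN code itself: DemoTid, the demo_* functions and the value chosen for
DEMO_TREE_POSTING are hypothetical names made up for this example, and the
real macros operate on an IndexTuple rather than on a bare struct.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two halves of itup->t_tid. */
typedef struct
{
    uint32_t blkno;   /* posting list offset, or posting tree root block */
    uint16_t offnum;  /* number of items, or the magic marker */
} DemoTid;

/* Marker value assumed for this sketch only. */
#define DEMO_TREE_POSTING 0xffff

/* Posting list case: remember where the list starts and how long it is. */
void
demo_set_posting_list(DemoTid *tid, uint32_t list_offset, uint16_t nitems)
{
    /* A real item count can never be as large as the marker. */
    assert(nitems != DEMO_TREE_POSTING);
    tid->blkno = list_offset;
    tid->offnum = nitems;
}

/* Posting tree case: store the root block; the marker is set automatically. */
void
demo_set_posting_tree(DemoTid *tid, uint32_t root_blkno)
{
    tid->blkno = root_blkno;
    tid->offnum = DEMO_TREE_POSTING;
}

/* Tell the two cases apart by looking only at the offset-number field. */
bool
demo_is_posting_tree(const DemoTid *tid)
{
    return tid->offnum == DEMO_TREE_POSTING;
}
```

A caller would test demo_is_posting_tree() first and only then decide whether
the block-number field is an offset within the tuple or a page to descend to.
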
In both cases, itup->t_info & INDEX_SIZE_MASK contains the actual total size
of the tuple, and the INDEX_VAR_MASK and INDEX_NULL_MASK bits have their
normal meanings as set by index_form_tuple.

Index tuples in non-leaf levels of the btree contain the optional column
number, key datum, and null category byte as above. They do not contain
a posting list. ItemPointerGetBlockNumber(&itup->t_tid) is the downlink
to the next lower btree level, and ItemPointerGetOffsetNumber(&itup->t_tid)
is InvalidOffsetNumber. Use the access macros GinGetDownlink/GinSetDownlink
to get/set the downlink.

Index entries that appear in "pending list" pages work a tad differently as
well. The optional column number, key datum, and null category byte are as
for other GIN index entries. However, there is always exactly one heap
itempointer associated with a pending entry, and it is stored in the t_tid
header field just as in non-GIN indexes. There is no posting list.
Furthermore, the code that searches the pending list assumes that all
entries for a given heap tuple appear consecutively in the pending list and
are sorted by the column-number-plus-key-datum. The GIN_LIST_FULLROW page
flag bit tells whether entries for a given heap tuple are spread across
multiple pending-list pages. If GIN_LIST_FULLROW is set, the page contains
all the entries for one or more heap tuples. If GIN_LIST_FULLROW is clear,
the page contains entries for only one heap tuple, *and* they are not all
the entries for that tuple. (Thus, a heap tuple whose entries do not all
fit on one pending-list page must have those pages to itself, even if this
results in wasting much of the space on the preceding page and the last
page for the tuple.)

Posting tree
------------

If a posting list is too large to store in-line in a key entry, a posting tree
is created. A posting tree is a B-tree structure, where the ItemPointer is
used as the key.

Internal posting tree pages use the standard PageHeader and the same "opaque"
struct as other GIN pages, but do not contain regular index tuples. Instead,
the content of the page is an array of PostingItem structs. Each PostingItem
consists of the block number of the child page, and the right bound of that
child page, as an ItemPointer. The right bound of the page itself is stored
right after the page header, before the PostingItem array.

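Choosing a child on an internal posting tree page amounts to finding the
first PostingItem whose right bound is not less than the item being looked
up. The sketch below shows that idea with a binary search. DemoItemPointer,
DemoPostingItem and the demo_* functions are hypothetical stand-ins, and the
fallback to the last child when the target exceeds every bound is an
assumption of this example rather than a statement about the real code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for ItemPointerData and PostingItem. */
typedef struct
{
    uint32_t blkno;
    uint16_t offnum;
} DemoItemPointer;

typedef struct
{
    uint32_t        child_blkno;  /* block number of the child page */
    DemoItemPointer right_bound;  /* highest item that belongs on the child */
} DemoPostingItem;

/* Order item pointers by block number, then by offset number. */
int
demo_compare_itemptr(const DemoItemPointer *a, const DemoItemPointer *b)
{
    if (a->blkno != b->blkno)
        return (a->blkno < b->blkno) ? -1 : 1;
    if (a->offnum != b->offnum)
        return (a->offnum < b->offnum) ? -1 : 1;
    return 0;
}

/* Return the child block whose key range should contain 'target'. */
uint32_t
demo_find_child(const DemoPostingItem *items, size_t nitems,
                DemoItemPointer target)
{
    size_t lo = 0;
    size_t hi = nitems;     /* search the half-open range [lo, hi) */

    assert(nitems > 0);
    while (lo < hi)
    {
        size_t mid = lo + (hi - lo) / 2;

        if (demo_compare_itemptr(&items[mid].right_bound, &target) < 0)
            lo = mid + 1;   /* child 'mid' ends before the target */
        else
            hi = mid;       /* child 'mid' could contain the target */
    }
    if (lo == nitems)
        lo = nitems - 1;    /* assumed: the last child catches the rest */
    return items[lo].child_blkno;
}
```
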
Posting tree leaf pages also use the standard PageHeader and opaque struct,
and the right bound of the page is stored right after the page header, but
the page content consists of a number of compressed posting lists. The
compressed posting lists are stored one after another, between the page header
and pd_lower. The space between pd_lower and pd_upper is unused, which allows
full-page images of posting tree leaf pages to skip the unused space in the
middle (buffer_std = true in XLogRecData).

The item pointers are stored in a number of independent compressed posting
lists (also called segments), instead of one big one, to make random access
to a given item pointer faster: to find an item in a compressed list, you
have to read the list from the beginning, but when the items are split into
multiple lists, you can first skip over to the list containing the item you're
looking for, and read only that segment. Also, an update only needs to
re-encode the affected segment.

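The benefit of segments can be illustrated with a small sketch: because each
segment starts with an uncompressed first item, the right segment can be
picked by looking at segment headers only, without decoding any packed bytes.
DemoSegment and demo_find_segment are hypothetical, and items are modelled as
plain 64-bit integers. On a real page the next segment would be reached by
skipping over the previous one's bytes rather than by indexing an array.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical segment header: only the parts needed to skip segments. */
typedef struct
{
    uint64_t first;   /* first item of the segment, stored uncompressed */
    uint16_t nbytes;  /* size of the packed remainder, used for skipping */
} DemoSegment;

/*
 * Return the index of the only segment that can contain 'target', or -1 if
 * 'target' sorts before the first item of the first segment.
 */
int
demo_find_segment(const DemoSegment *segs, int nsegs, uint64_t target)
{
    int found = -1;

    for (int i = 0; i < nsegs; i++)
    {
        if (segs[i].first > target)
            break;          /* this and all later segments start too late */
        found = i;          /* best candidate so far */
    }
    return found;
}
```

Only the segment returned here would then have to be decoded from its
beginning, and only that segment re-encoded after an update.
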
Posting List Compression
------------------------

To fit as many item pointers on a page as possible, posting tree leaf pages
and posting lists stored inline in entry tree leaf tuples use a lightweight
form of compression. We take advantage of the fact that the item pointers
are stored in sorted order. Instead of storing the block and offset number of
each item pointer separately, we store the difference from the previous item.
That in itself doesn't do much, but it allows us to use so-called varbyte
encoding to compress them.

Varbyte encoding is a method to encode integers, allowing smaller numbers to
take less space at the cost of larger numbers. Each integer is represented by
a variable number of bytes. The high bit of each byte in varbyte encoding
determines whether the next byte is still part of this number. Therefore, to
read a single varbyte encoded number, you have to read bytes until you find a
byte with the high bit not set.

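A minimal sketch of varbyte encoding and decoding is shown below. The
function names are hypothetical, and the sketch assumes the low-order 7 bits
are written first; it illustrates the technique and is not a copy of the
routines in ginpostinglist.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Encode 'val' into 'out' (which must have room for 10 bytes) and return
 * the number of bytes written. Each byte carries 7 data bits; the high bit
 * is set on every byte except the last one.
 */
size_t
demo_varbyte_encode(uint64_t val, unsigned char *out)
{
    size_t n = 0;

    while (val > 0x7F)
    {
        out[n++] = (unsigned char) (0x80 | (val & 0x7F));
        val >>= 7;
    }
    out[n++] = (unsigned char) val;   /* high bit clear: end of number */
    return n;
}

/*
 * Decode one number starting at 'in', store it in '*val', and return the
 * number of bytes consumed. The input is assumed to be well-formed.
 */
size_t
demo_varbyte_decode(const unsigned char *in, uint64_t *val)
{
    uint64_t      result = 0;
    unsigned      shift = 0;
    size_t        n = 0;
    unsigned char c;

    do
    {
        c = in[n++];
        result |= (uint64_t) (c & 0x7F) << shift;
        shift += 7;
    } while (c & 0x80);               /* keep reading while high bit is set */

    *val = result;
    return n;
}
```

With this scheme the value 5 takes one byte, 300 takes two, and a full
43-bit value takes seven.
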
When encoding, the block and offset number forming the item pointer are
combined into a single integer. The offset number is stored in the 11 low
bits (see MaxHeapTuplesPerPageBits in ginpostinglist.c), and the block number
is stored in the higher bits. That requires 43 bits in total (32 for the block
number plus 11 for the offset number), which conveniently fits in at most
6 bytes as a plain integer. Once varbyte-encoded, with 7 data bits per byte,
a value that large takes 7 bytes.

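The combination of block and offset number into one integer can be sketched
as follows. DEMO_OFFSET_BITS mirrors the 11 bits mentioned above; the macro
and function names are hypothetical and only illustrate the bit layout.

```c
#include <assert.h>
#include <stdint.h>

/* Number of low bits reserved for the offset number (11, as in the text). */
#define DEMO_OFFSET_BITS 11
#define DEMO_OFFSET_MASK ((1u << DEMO_OFFSET_BITS) - 1)

/* Combine block and offset number into one integer that sorts the same way. */
uint64_t
demo_itemptr_to_uint64(uint32_t blkno, uint16_t offnum)
{
    /* The offset number must fit in the low 11 bits. */
    assert((offnum & ~DEMO_OFFSET_MASK) == 0);
    return ((uint64_t) blkno << DEMO_OFFSET_BITS) | offnum;
}

/* Split the integer back into block and offset number. */
void
demo_uint64_to_itemptr(uint64_t val, uint32_t *blkno, uint16_t *offnum)
{
    *offnum = (uint16_t) (val & DEMO_OFFSET_MASK);
    *blkno = (uint32_t) (val >> DEMO_OFFSET_BITS);
}
```

Two item pointers on the same heap page differ only in the low bits, so the
difference between neighbouring items is usually a small number, which is
what makes the varbyte encoding pay off.
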
A compressed posting list is passed around and stored on disk in a
PackedPostingList struct. The first item in the list is stored uncompressed
as a regular ItemPointerData, followed by the length of the list in bytes,
followed by the packed items.

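The layout described above (first item, byte length, packed items) can be
modelled with the following sketch. DemoPackedList is a hypothetical
fixed-size stand-in and not the on-disk struct: items are represented as
already-combined 64-bit integers, and the deltas are varbyte-encoded with the
low-order bits first, which is an assumption of this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory model of a packed posting list. */
typedef struct
{
    uint64_t      first;      /* first item, kept uncompressed */
    uint16_t      nbytes;     /* number of bytes used in bytes[] */
    unsigned char bytes[64];  /* varbyte-encoded deltas */
} DemoPackedList;

/* Pack a strictly increasing array; returns how many items were stored. */
size_t
demo_pack(const uint64_t *items, size_t nitems, DemoPackedList *out)
{
    size_t i;

    assert(nitems > 0);
    out->first = items[0];
    out->nbytes = 0;
    for (i = 1; i < nitems; i++)
    {
        uint64_t      delta = items[i] - items[i - 1];
        unsigned char buf[10];
        size_t        n = 0;

        assert(items[i] > items[i - 1]);    /* input must be sorted */
        while (delta > 0x7F)
        {
            buf[n++] = (unsigned char) (0x80 | (delta & 0x7F));
            delta >>= 7;
        }
        buf[n++] = (unsigned char) delta;
        if (out->nbytes + n > sizeof(out->bytes))
            break;                          /* no room for this item */
        for (size_t k = 0; k < n; k++)
            out->bytes[out->nbytes++] = buf[k];
    }
    return i;
}

/* Unpack at most 'max' items into 'dst'; returns the number decoded. */
size_t
demo_unpack(const DemoPackedList *in, uint64_t *dst, size_t max)
{
    size_t   count = 0;
    size_t   pos = 0;
    uint64_t cur = in->first;

    if (max == 0)
        return 0;
    dst[count++] = cur;
    while (pos < in->nbytes && count < max)
    {
        uint64_t      delta = 0;
        unsigned      shift = 0;
        unsigned char c;

        do
        {
            c = in->bytes[pos++];
            delta |= (uint64_t) (c & 0x7F) << shift;
            shift += 7;
        } while ((c & 0x80) && pos < in->nbytes);
        cur += delta;                       /* undo the delta encoding */
        dst[count++] = cur;
    }
    return count;
}
```
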
Concurrency
-----------

The entry tree and each posting tree are B-trees, with right-links connecting
sibling pages at the same level. This is the same structure that is used in
the regular B-tree indexam (invented by Lehman & Yao), but we don't support
scanning GIN trees backwards, so we don't need left-links.

To avoid deadlocks, B-tree pages must always be locked in the same order:
left to right, and bottom to top. When searching, the tree is traversed from
top to bottom, so the lock on the parent page must be released before
descending to the next level. Concurrent page splits move the keyspace to
the right, so after following a downlink, the page actually containing the key
we're looking for might be somewhere to the right of the page we landed on.
In that case, we follow the right-links until we find the page we're looking
for.

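The "move right" step can be sketched as below. This is a single-threaded
model with hypothetical names (DemoPage, demo_move_right): pages live in an
array, right-links are array indexes, keys are plain integers, and no locks
are taken. A concurrent implementation would have to lock each page it
visits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical page model: just the fields needed for moving right. */
typedef struct
{
    uint64_t right_bound;  /* highest key that belongs on this page */
    int      rightlink;    /* index of the right sibling, -1 if rightmost */
} DemoPage;

/*
 * Starting from the page a downlink led us to, follow right-links until we
 * reach the page whose key range covers 'key'. The rightmost page is taken
 * to cover everything beyond its left siblings.
 */
int
demo_move_right(const DemoPage *pages, int start, uint64_t key)
{
    int cur = start;

    while (pages[cur].rightlink >= 0 && key > pages[cur].right_bound)
    {
        /* A concurrent version would lock the sibling before stepping. */
        cur = pages[cur].rightlink;
    }
    return cur;
}
```
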
To delete a page, the page's left sibling, the target page, and its parent
are locked in that order, and the page is marked as deleted. However, a
concurrent search might already have read a pointer to the page, and might be
just about to follow it. A page can be reached via the right-link of its left
sibling, or via its downlink in the parent.

To prevent a backend from reaching a deleted page via a right-link, when
following a right-link the lock on the previous page is not released until
the lock on the next page has been acquired.

The downlink is more tricky. A search descending the tree must release the
lock on the parent page before locking the child, or it could deadlock with
a concurrent split of the child page; a page split locks the parent, while
already holding a lock on the child page. So, a deleted page cannot be
reclaimed immediately. Instead, we have to wait for every transaction that
might still reference this page to finish. Corresponding processes must
observe that the page is marked deleted and recover accordingly.

The previous paragraph's reasoning only applies to searches, and only to
posting trees. To protect from inserters following a downlink to a deleted
page, vacuum simply locks out all concurrent insertions to the posting tree,
by holding a super-exclusive lock on the posting tree root. Inserters hold a
pin on the root page, but searches do not, so while new searches cannot begin
while the root page is locked, any already-in-progress scans can continue
concurrently with vacuum. In the entry tree, we never delete pages.

(This is quite different from the mechanism the btree indexam uses to make
page-deletions safe; it stamps the deleted pages with an XID and keeps the
deleted pages around with the right-link intact until all concurrent scans
have finished.)

Predicate Locking
-----------------

GIN supports predicate locking, for serializable snapshot isolation.
A predicate lock represents that a scan has scanned a range of values. Such
locks are not concerned with physical pages as such, but with the logical key
values. A predicate lock on a page covers the key range that would belong on
that page, whether or not there are any matching tuples there currently. In
other words, a predicate lock on an index page covers the "gaps" between the
index tuples. To minimize false positives, predicate locks are acquired at the
finest level possible.

* Like in the B-tree index, it is enough to lock only leaf pages, because all
  insertions happen at the leaf level.

* In an equality search (i.e. not a partial match search), if a key entry has
  a posting tree, we lock the posting tree root page, to represent a lock on
  just that key entry. Otherwise, we lock the entry tree page. We also lock
  the entry tree page if no match is found, to lock the "gap" where the entry
  would've been, had there been one.

* In a partial match search, we lock all the entry leaf pages that we scan,
  in addition to locks on posting tree roots, to represent the "gaps" between
  values.

* In addition to the locks on entry leaf pages and posting tree roots, all
  scans grab a lock on the metapage. This is to interlock with insertions to
  the fast update pending list. An insertion to the pending list can really
  belong anywhere in the tree, and the lock on the metapage represents that.

The interlock for fastupdate pending lists means that with fastupdate=on,
we effectively always grab a full-index lock, so you could get a lot of false
positives.

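The rules above can be condensed into a small decision function. This is only
a sketch of the rules as stated in this section, for a single key entry
visited by a scan; the DEMO_LOCK_* flags and demo_predicate_locks are
hypothetical names and do not correspond to functions in the GIN code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flags naming the kinds of page a scan may predicate-lock. */
#define DEMO_LOCK_METAPAGE     1u
#define DEMO_LOCK_ENTRY_LEAF   2u
#define DEMO_LOCK_POSTING_ROOT 4u

/*
 * Return the set of predicate locks taken while a scan visits one key entry
 * (or the place where the entry would have been, if there is no match).
 */
unsigned
demo_predicate_locks(bool partial_match, bool match_found,
                     bool has_posting_tree)
{
    /* Every scan locks the metapage, to interlock with the pending list. */
    unsigned locks = DEMO_LOCK_METAPAGE;

    if (partial_match)
    {
        /* Every scanned entry leaf page is locked, to cover the gaps. */
        locks |= DEMO_LOCK_ENTRY_LEAF;
        if (match_found && has_posting_tree)
            locks |= DEMO_LOCK_POSTING_ROOT;
    }
    else if (match_found && has_posting_tree)
    {
        /* Equality search: the posting tree root stands for just this key. */
        locks |= DEMO_LOCK_POSTING_ROOT;
    }
    else
    {
        /* Inline posting list, or no match: lock the entry leaf page. */
        locks |= DEMO_LOCK_ENTRY_LEAF;
    }
    return locks;
}
```
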
Compatibility
-------------

Compression of TIDs was introduced in 9.4. Some GIN indexes could remain in
uncompressed format because of pg_upgrade from 9.3 or earlier versions.
For compatibility, the old uncompressed format is also supported. The
following rules are used to handle it:

* The GIN_ITUP_COMPRESSED flag marks index tuples that contain a compressed
  posting list. This flag is stored in the high bit of
  ItemPointerGetBlockNumber(&itup->t_tid).
  Use GinItupIsCompressed(itup) to check the flag.

* Posting tree pages in the new format are marked with the GIN_COMPRESSED flag.
  Macros GinPageIsCompressed(page) and GinPageSetCompressed(page) are used to
  check and set this flag.

* All scan operations check the format of the posting list and use the
  corresponding code to read its content.

* When updating an index tuple containing an uncompressed posting list, it
  will be replaced with a new index tuple containing a compressed list.

* When updating an uncompressed posting tree leaf page, it's compressed.

* If vacuum finds some dead TIDs in uncompressed posting lists, they are
  converted into compressed posting lists. This assumes that the compressed
  posting list fits in the space occupied by the uncompressed list. IOW, we
  assume that the compressed version of the page, with the dead items removed,
  takes less space than the old uncompressed version.

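Storing a flag in the high bit of the block-number field can be sketched as
follows. In the posting-list case that field holds the offset of the posting
list within the tuple, which is small enough to leave the high bit free. The
names below are hypothetical and the field is modelled as a bare 32-bit
integer; this is an illustration of the bit trick, not the real macros.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag occupying the high bit of the 32-bit block field. */
#define DEMO_ITUP_COMPRESSED (1u << 31)

/* Store the posting list offset and mark the list as compressed. */
uint32_t
demo_set_posting_offset(uint32_t offset)
{
    /* The offset itself must leave the flag bit free. */
    assert((offset & DEMO_ITUP_COMPRESSED) == 0);
    return offset | DEMO_ITUP_COMPRESSED;
}

/* Check whether the field carries the "compressed" flag. */
bool
demo_is_compressed(uint32_t blkfield)
{
    return (blkfield & DEMO_ITUP_COMPRESSED) != 0;
}

/* Recover the posting list offset, whether or not the flag is set. */
uint32_t
demo_get_posting_offset(uint32_t blkfield)
{
    return blkfield & ~DEMO_ITUP_COMPRESSED;
}
```

A tuple written in the old format never has the bit set, so a reader can
choose the decoding path per tuple, as the rules above require.
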
Limitations
-----------

* Gin doesn't use scan->kill_prior_tuple & scan->ignore_killed_tuples
* Gin searches entries only by equality matching, or simple range
  matching using the "partial match" feature.

TODO
----

Nearest future:

* Opclasses for more types (no programming, just many catalog changes)

Distant future:

* Replace B-tree of entries with something like GiST

Authors
-------

Original work was done by Teodor Sigaev (teodor@sigaev.ru) and Oleg Bartunov
(oleg@sai.msu.su).