Improve SP-GiST opclass API to better support unlabeled nodes.

Previously, the spgSplitTuple action could only create a new upper tuple
containing a single labeled node.  This made it useless for opclasses
that prefer to work with fixed sets of nodes (labeled or otherwise),
which meant that restrictive prefixes could not be used with such
node definitions.  Change the output field set for the choose() method
to allow it to specify any valid node set for the new upper tuple,
and to specify which of these nodes to place the modified lower tuple in.
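
As a rough illustration of the new output fields, the sketch below fills in a split request for an opclass with a fixed, unlabeled node set and then applies the same range checks that this commit adds to spgSplitNodeAction. It is not code from the patch: `SplitTupleSketch`, `fill_fixed_split`, `split_request_is_sane` and `SKETCH_MAX_NODES` are hypothetical names, and `Datum` is replaced by a plain integer type so the fragment compiles outside PostgreSQL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins so the sketch compiles without PostgreSQL headers (assumptions). */
typedef uintptr_t Datum;
#define SKETCH_MAX_NODES 0x1FFF		/* assumed stand-in for SGITMAXNNODES */

/* Mirrors only the splitTuple fields of spgChooseOut shown in this commit. */
typedef struct SplitTupleSketch
{
	bool		prefixHasPrefix;	/* upper tuple should have a prefix? */
	Datum		prefixPrefixDatum;	/* if so, its value */
	int			prefixNNodes;		/* number of nodes in upper tuple */
	Datum	   *prefixNodeLabels;	/* their labels, or NULL for no labels */
	int			childNodeN;			/* which node gets the lower tuple */
	bool		postfixHasPrefix;	/* lower tuple should have a prefix? */
	Datum		postfixPrefixDatum; /* if so, its value */
} SplitTupleSketch;

/* Request an upper tuple with nnodes unlabeled nodes (hypothetical helper). */
static void
fill_fixed_split(SplitTupleSketch *st, Datum looserPrefix, int nnodes,
				 int childNode, Datum oldPrefix)
{
	assert(nnodes > 0);
	st->prefixHasPrefix = true;
	st->prefixPrefixDatum = looserPrefix;
	st->prefixNNodes = nnodes;
	st->prefixNodeLabels = NULL;	/* fixed node set: no labels needed */
	st->childNodeN = childNode;
	st->postfixHasPrefix = true;
	st->postfixPrefixDatum = oldPrefix;
}

/* Same range checks the core applies before building the upper tuple. */
static bool
split_request_is_sane(const SplitTupleSketch *st)
{
	if (st->prefixNNodes <= 0 || st->prefixNNodes > SKETCH_MAX_NODES)
		return false;
	if (st->childNodeN < 0 || st->childNodeN >= st->prefixNNodes)
		return false;
	return true;
}
```

A real choose() would set these fields on out->result.splitTuple; the core still rejects the request if the resulting upper tuple is larger than the tuple it replaces.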

In addition to its primary use for fixed node sets, this feature could
allow existing opclasses that use variable node sets to skip a separate
spgAddNode action when splitting a tuple, by setting up the node needed
for the incoming value as part of the spgSplitTuple action.  However, care
would have to be taken to add the extra node only when it would not make
the tuple bigger than before.  (spgAddNode can enlarge the tuple,
spgSplitTuple can't.)
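
The size caveat above could be handled along the following lines. This is only a sketch under stated assumptions: `plan_split_nodes` and `SplitNodesSketch` are hypothetical, the three size arguments are assumed to be computed by the opclass (this commit provides no such helper), `malloc` stands in for `palloc`, and any ordering an opclass expects among its node labels is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef uintptr_t Datum;		/* stand-in for PostgreSQL's Datum */

/* Only the node-set fields of the spgSplitTuple result, for illustration. */
typedef struct SplitNodesSketch
{
	int			prefixNNodes;
	Datum	   *prefixNodeLabels;
	int			childNodeN;
} SplitNodesSketch;

/*
 * Decide whether the upper tuple can carry a second node for the incoming
 * value.  The extra node is requested only if the tuple would still be no
 * larger than the tuple being replaced; otherwise fall back to one node and
 * leave the rest to a later spgAddNode.  Returns the number of nodes set up.
 */
static int
plan_split_nodes(SplitNodesSketch *st, Datum oldLabel, Datum newLabel,
				 size_t oldTupleSize, size_t sizeWithOneNode,
				 size_t extraNodeSize)
{
	bool		addExtra = (sizeWithOneNode + extraNodeSize <= oldTupleSize);
	int			nnodes = addExtra ? 2 : 1;

	st->prefixNodeLabels = malloc(sizeof(Datum) * (size_t) nnodes);
	assert(st->prefixNodeLabels != NULL);
	st->prefixNodeLabels[0] = oldLabel; /* node that gets the lower tuple */
	if (addExtra)
		st->prefixNodeLabels[1] = newLabel; /* node for the incoming value */
	st->prefixNNodes = nnodes;
	st->childNodeN = 0;
	return nnodes;
}
```

When the extra node does not fit, the next choose() call on the replacement tuple would return spgAddNode as before.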

This is a prerequisite for an upcoming SP-GiST inet opclass, but is
being committed separately to increase the visibility of the API change.

In passing, improve the documentation about the traverse-values feature
that was added by commit ccd6eb49a.

Emre Hasegeli, with cosmetic adjustments and documentation rework by me

Discussion: <CAE2gYzxtth9qatW_OAqdOjykS0bxq7AYHLuyAQLPgT7H9ZU0Cw@mail.gmail.com>
Tom Lane 2016-08-23 12:10:25 -04:00
parent 86f31695f3
commit d2ddee63b4
4 changed files with 115 additions and 63 deletions


@@ -114,7 +114,7 @@
 </row>
 <row>
 <entry><literal>box_ops</></entry>
-<entry>box</entry>
+<entry><type>box</></entry>
 <entry>
 <literal>&lt;&lt;</>
 <literal>&amp;&lt;</>
@@ -183,11 +183,14 @@
 Inner tuples are more complex, since they are branching points in the
 search tree. Each inner tuple contains a set of one or more
 <firstterm>nodes</>, which represent groups of similar leaf values.
-A node contains a downlink that leads to either another, lower-level inner
-tuple, or a short list of leaf tuples that all lie on the same index page.
-Each node has a <firstterm>label</> that describes it; for example,
+A node contains a downlink that leads either to another, lower-level inner
+tuple, or to a short list of leaf tuples that all lie on the same index page.
+Each node normally has a <firstterm>label</> that describes it; for example,
 in a radix tree the node label could be the next character of the string
-value. Optionally, an inner tuple can have a <firstterm>prefix</> value
+value. (Alternatively, an operator class can omit the node labels, if it
+works with a fixed set of nodes for all inner tuples;
+see <xref linkend="spgist-null-labels">.)
+Optionally, an inner tuple can have a <firstterm>prefix</> value
 that describes all its members. In a radix tree this could be the common
 prefix of the represented strings. The prefix value is not necessarily
 really a prefix, but can be any data needed by the operator class;
@@ -202,7 +205,8 @@
 tuple, so the <acronym>SP-GiST</acronym> core provides the possibility for
 operator classes to manage level counting while descending the tree.
 There is also support for incrementally reconstructing the represented
-value when that is needed.
+value when that is needed, and for passing down additional data (called
+<firstterm>traverse values</>) during a tree descent.
 </para>
 <note>
@@ -343,10 +347,13 @@ typedef struct spgChooseOut
         } addNode;
         struct /* results for spgSplitTuple */
         {
-            /* Info to form new inner tuple with one node */
+            /* Info to form new upper-level inner tuple with one child tuple */
             bool prefixHasPrefix; /* tuple should have a prefix? */
             Datum prefixPrefixDatum; /* if so, its value */
-            Datum nodeLabel; /* node's label */
+            int prefixNNodes; /* number of nodes */
+            Datum *prefixNodeLabels; /* their labels (or NULL for
+                                      * no labels) */
+            int childNodeN; /* which node gets child tuple */
             /* Info to form new lower-level inner tuple with all old nodes */
             bool postfixHasPrefix; /* tuple should have a prefix? */
@@ -416,29 +423,33 @@ typedef struct spgChooseOut
 set <structfield>resultType</> to <literal>spgSplitTuple</>.
 This action moves all the existing nodes into a new lower-level
 inner tuple, and replaces the existing inner tuple with a tuple
-having a single node that links to the new lower-level inner tuple.
+having a single downlink pointing to the new lower-level inner tuple.
 Set <structfield>prefixHasPrefix</> to indicate whether the new
 upper tuple should have a prefix, and if so set
 <structfield>prefixPrefixDatum</> to the prefix value. This new
 prefix value must be sufficiently less restrictive than the original
-to accept the new value to be indexed, and it should be no longer
-than the original prefix.
-Set <structfield>nodeLabel</> to the label to be used for the
-node that will point to the new lower-level inner tuple.
+to accept the new value to be indexed.
+Set <structfield>prefixNNodes</> to the number of nodes needed in the
+tuple, and set <structfield>prefixNodeLabels</> to a palloc'd array
+holding their labels, or to NULL if node labels are not required.
+Note that the total size of the new upper tuple must be no more
+than the total size of the tuple it is replacing; this constrains
+the lengths of the new prefix and new labels.
+Set <structfield>childNodeN</> to the index (from zero) of the node
+that will downlink to the new lower-level inner tuple.
 Set <structfield>postfixHasPrefix</> to indicate whether the new
 lower-level inner tuple should have a prefix, and if so set
 <structfield>postfixPrefixDatum</> to the prefix value. The
-combination of these two prefixes and the additional label must
-have the same meaning as the original prefix, because there is
-no opportunity to alter the node labels that are moved to the new
-lower-level tuple, nor to change any child index entries.
+combination of these two prefixes and the downlink node's label
+(if any) must have the same meaning as the original prefix, because
+there is no opportunity to alter the node labels that are moved to
+the new lower-level tuple, nor to change any child index entries.
 After the node has been split, the <function>choose</function>
 function will be called again with the replacement inner tuple.
-That call will usually result in an <literal>spgAddNode</> result,
-since presumably the node label added in the split step will not
-match the new value; so after that, there will be a third call
-that finally returns <literal>spgMatchNode</> and allows the
-insertion to descend to the leaf level.
+That call may return an <literal>spgAddNode</> result, if no suitable
+node was created by the <literal>spgSplitTuple</> action. Eventually
+<function>choose</function> must return <literal>spgMatchNode</> to
+allow the insertion to descend to the next level.
 </para>
 </listitem>
 </varlistentry>
@@ -492,9 +503,8 @@ typedef struct spgPickSplitOut
 <structfield>prefixDatum</> to the prefix value.
 Set <structfield>nNodes</> to indicate the number of nodes that
 the new inner tuple will contain, and
-set <structfield>nodeLabels</> to an array of their label values.
-(If the nodes do not require labels, set <structfield>nodeLabels</>
-to NULL; see <xref linkend="spgist-null-labels"> for details.)
+set <structfield>nodeLabels</> to an array of their label values,
+or to NULL if node labels are not required.
 Set <structfield>mapTuplesToNodes</> to an array that gives the index
 (from zero) of the node that each leaf tuple should be assigned to.
 Set <structfield>leafTupleDatums</> to an array of the values to
@@ -561,7 +571,7 @@ typedef struct spgInnerConsistentIn
     Datum reconstructedValue; /* value reconstructed at parent */
     void *traversalValue; /* opclass-specific traverse value */
-    MemoryContext traversalMemoryContext;
+    MemoryContext traversalMemoryContext; /* put new traverse values here */
     int level; /* current level (counting from zero) */
     bool returnData; /* original data must be returned? */
@@ -580,7 +590,6 @@ typedef struct spgInnerConsistentOut
     int *levelAdds; /* increment level by this much for each */
     Datum *reconstructedValues; /* associated reconstructed values */
     void **traversalValues; /* opclass-specific traverse values */
 } spgInnerConsistentOut;
 </programlisting>
@@ -599,6 +608,11 @@ typedef struct spgInnerConsistentOut
 parent tuple; it is <literal>(Datum) 0</> at the root level or if the
 <function>inner_consistent</> function did not provide a value at the
 parent level.
+<structfield>traversalValue</> is a pointer to any traverse data
+passed down from the previous call of <function>inner_consistent</>
+on the parent index tuple, or NULL at the root level.
+<structfield>traversalMemoryContext</> is the memory context in which
+to store output traverse values (see below).
 <structfield>level</> is the current inner tuple's level, starting at
 zero for the root level.
 <structfield>returnData</> is <literal>true</> if reconstructed data is
@@ -615,9 +629,6 @@ typedef struct spgInnerConsistentOut
 inner tuple, and
 <structfield>nodeLabels</> is an array of their label values, or
 NULL if the nodes do not have labels.
-<structfield>traversalValue</> is a pointer to data that
-<function>inner_consistent</> gets when called on child nodes from an
-outer call of <function>inner_consistent</> on parent nodes.
 </para>
 <para>
@@ -633,17 +644,19 @@ typedef struct spgInnerConsistentOut
 <structfield>reconstructedValues</> to an array of the values
 reconstructed for each child node to be visited; otherwise, leave
 <structfield>reconstructedValues</> as NULL.
+If it is desired to pass down additional out-of-band information
+(<quote>traverse values</>) to lower levels of the tree search,
+set <structfield>traversalValues</> to an array of the appropriate
+traverse values, one for each child node to be visited; otherwise,
+leave <structfield>traversalValues</> as NULL.
 Note that the <function>inner_consistent</> function is
 responsible for palloc'ing the
-<structfield>nodeNumbers</>, <structfield>levelAdds</> and
-<structfield>reconstructedValues</> arrays.
-Sometimes accumulating some information is needed, while
-descending from parent to child node was happened. In this case
-<structfield>traversalValues</> array keeps pointers to
-specific data you need to accumulate for every child node.
-Memory for <structfield>traversalValues</> should be allocated in
-the default context, but each element of it should be allocated in
-<structfield>traversalMemoryContext</>.
+<structfield>nodeNumbers</>, <structfield>levelAdds</>,
+<structfield>reconstructedValues</>, and
+<structfield>traversalValues</> arrays in the current memory context.
+However, any output traverse values pointed to by
+the <structfield>traversalValues</> array should be allocated
+in <structfield>traversalMemoryContext</>.
 </para>
 </listitem>
 </varlistentry>
@@ -670,8 +683,8 @@ typedef struct spgLeafConsistentIn
     ScanKey scankeys; /* array of operators and comparison values */
     int nkeys; /* length of array */
-    void *traversalValue; /* opclass-specific traverse value */
     Datum reconstructedValue; /* value reconstructed at parent */
+    void *traversalValue; /* opclass-specific traverse value */
     int level; /* current level (counting from zero) */
     bool returnData; /* original data must be returned? */
@@ -700,6 +713,9 @@ typedef struct spgLeafConsistentOut
 parent tuple; it is <literal>(Datum) 0</> at the root level or if the
 <function>inner_consistent</> function did not provide a value at the
 parent level.
+<structfield>traversalValue</> is a pointer to any traverse data
+passed down from the previous call of <function>inner_consistent</>
+on the parent index tuple, or NULL at the root level.
 <structfield>level</> is the current leaf tuple's level, starting at
 zero for the root level.
 <structfield>returnData</> is <literal>true</> if reconstructed data is
@@ -797,7 +813,10 @@ typedef struct spgLeafConsistentOut
 point. In such a case the code typically works with the nodes by
 number, and there is no need for explicit node labels. To suppress
 node labels (and thereby save some space), the <function>picksplit</>
-function can return NULL for the <structfield>nodeLabels</> array.
+function can return NULL for the <structfield>nodeLabels</> array,
+and likewise the <function>choose</> function can return NULL for
+the <structfield>prefixNodeLabels</> array during
+a <literal>spgSplitTuple</> action.
 This will in turn result in <structfield>nodeLabels</> being NULL during
 subsequent calls to <function>choose</> and <function>inner_consistent</>.
 In principle, node labels could be used for some inner tuples and omitted
@@ -807,10 +826,7 @@ typedef struct spgLeafConsistentOut
 <para>
 When working with an inner tuple having unlabeled nodes, it is an error
 for <function>choose</> to return <literal>spgAddNode</>, since the set
-of nodes is supposed to be fixed in such cases. Also, there is no
-provision for generating an unlabeled node in <literal>spgSplitTuple</>
-actions, since it is expected that an <literal>spgAddNode</> action will
-be needed as well.
+of nodes is supposed to be fixed in such cases.
 </para>
 </sect2>
@@ -859,11 +875,10 @@ typedef struct spgLeafConsistentOut
 <para>
 The <productname>PostgreSQL</productname> source distribution includes
-several examples of index operator classes for
-<acronym>SP-GiST</acronym>. The core system currently provides radix
-trees over text columns and two types of trees over points: quad-tree and
-k-d tree. Look into <filename>src/backend/access/spgist/</> to see the
-code.
+several examples of index operator classes for <acronym>SP-GiST</acronym>,
+as described in <xref linkend="spgist-builtin-opclasses-table">. Look
+into <filename>src/backend/access/spgist/</>
+and <filename>src/backend/utils/adt/</> to see the code.
 </para>
 </sect1>


@@ -1705,17 +1705,40 @@ spgSplitNodeAction(Relation index, SpGistState *state,
     /* Should not be applied to nulls */
     Assert(!SpGistPageStoresNulls(current->page));
+    /* Check opclass gave us sane values */
+    if (out->result.splitTuple.prefixNNodes <= 0 ||
+        out->result.splitTuple.prefixNNodes > SGITMAXNNODES)
+        elog(ERROR, "invalid number of prefix nodes: %d",
+             out->result.splitTuple.prefixNNodes);
+    if (out->result.splitTuple.childNodeN < 0 ||
+        out->result.splitTuple.childNodeN >=
+        out->result.splitTuple.prefixNNodes)
+        elog(ERROR, "invalid child node number: %d",
+             out->result.splitTuple.childNodeN);
     /*
-     * Construct new prefix tuple, containing a single node with the specified
-     * label. (We'll update the node's downlink to point to the new postfix
-     * tuple, below.)
+     * Construct new prefix tuple with requested number of nodes. We'll fill
+     * in the childNodeN'th node's downlink below.
      */
-    node = spgFormNodeTuple(state, out->result.splitTuple.nodeLabel, false);
+    nodes = (SpGistNodeTuple *) palloc(sizeof(SpGistNodeTuple) *
+                                       out->result.splitTuple.prefixNNodes);
+    for (i = 0; i < out->result.splitTuple.prefixNNodes; i++)
+    {
+        Datum label = (Datum) 0;
+        bool labelisnull;
+        labelisnull = (out->result.splitTuple.prefixNodeLabels == NULL);
+        if (!labelisnull)
+            label = out->result.splitTuple.prefixNodeLabels[i];
+        nodes[i] = spgFormNodeTuple(state, label, labelisnull);
+    }
     prefixTuple = spgFormInnerTuple(state,
                                     out->result.splitTuple.prefixHasPrefix,
                                     out->result.splitTuple.prefixPrefixDatum,
-                                    1, &node);
+                                    out->result.splitTuple.prefixNNodes,
+                                    nodes);
     /* it must fit in the space that innerTuple now occupies */
     if (prefixTuple->size > innerTuple->size)
@@ -1807,10 +1830,12 @@ spgSplitNodeAction(Relation index, SpGistState *state,
      * the postfix tuple first.) We have to update the local copy of the
      * prefixTuple too, because that's what will be written to WAL.
      */
-    spgUpdateNodeLink(prefixTuple, 0, postfixBlkno, postfixOffset);
+    spgUpdateNodeLink(prefixTuple, out->result.splitTuple.childNodeN,
+                      postfixBlkno, postfixOffset);
     prefixTuple = (SpGistInnerTuple) PageGetItem(current->page,
                           PageGetItemId(current->page, current->offnum));
-    spgUpdateNodeLink(prefixTuple, 0, postfixBlkno, postfixOffset);
+    spgUpdateNodeLink(prefixTuple, out->result.splitTuple.childNodeN,
+                      postfixBlkno, postfixOffset);
     MarkBufferDirty(current->buffer);


@@ -212,9 +212,14 @@ spg_text_choose(PG_FUNCTION_ARGS)
                 out->result.splitTuple.prefixPrefixDatum =
                     formTextDatum(prefixStr, commonLen);
             }
-            out->result.splitTuple.nodeLabel =
+            out->result.splitTuple.prefixNNodes = 1;
+            out->result.splitTuple.prefixNodeLabels =
+                (Datum *) palloc(sizeof(Datum));
+            out->result.splitTuple.prefixNodeLabels[0] =
                 Int16GetDatum(*(unsigned char *) (prefixStr + commonLen));
+            out->result.splitTuple.childNodeN = 0;
             if (prefixSize - commonLen == 1)
             {
                 out->result.splitTuple.postfixHasPrefix = false;
@@ -280,7 +285,10 @@ spg_text_choose(PG_FUNCTION_ARGS)
         out->resultType = spgSplitTuple;
         out->result.splitTuple.prefixHasPrefix = in->hasPrefix;
         out->result.splitTuple.prefixPrefixDatum = in->prefixDatum;
-        out->result.splitTuple.nodeLabel = Int16GetDatum(-2);
+        out->result.splitTuple.prefixNNodes = 1;
+        out->result.splitTuple.prefixNodeLabels = (Datum *) palloc(sizeof(Datum));
+        out->result.splitTuple.prefixNodeLabels[0] = Int16GetDatum(-2);
+        out->result.splitTuple.childNodeN = 0;
         out->result.splitTuple.postfixHasPrefix = false;
     }
     else


@@ -90,10 +90,13 @@ typedef struct spgChooseOut
         } addNode;
         struct /* results for spgSplitTuple */
         {
-            /* Info to form new inner tuple with one node */
+            /* Info to form new upper-level inner tuple with one child tuple */
             bool prefixHasPrefix; /* tuple should have a prefix? */
             Datum prefixPrefixDatum; /* if so, its value */
-            Datum nodeLabel; /* node's label */
+            int prefixNNodes; /* number of nodes */
+            Datum *prefixNodeLabels; /* their labels (or NULL for
+                                      * no labels) */
+            int childNodeN; /* which node gets child tuple */
             /* Info to form new lower-level inner tuple with all old nodes */
             bool postfixHasPrefix; /* tuple should have a prefix? */
@@ -134,7 +137,8 @@ typedef struct spgInnerConsistentIn
     Datum reconstructedValue; /* value reconstructed at parent */
     void *traversalValue; /* opclass-specific traverse value */
-    MemoryContext traversalMemoryContext;
+    MemoryContext traversalMemoryContext; /* put new traverse values
+                                           * here */
     int level; /* current level (counting from zero) */
     bool returnData; /* original data must be returned? */
@@ -163,8 +167,8 @@ typedef struct spgLeafConsistentIn
     ScanKey scankeys; /* array of operators and comparison values */
     int nkeys; /* length of array */
-    void *traversalValue; /* opclass-specific traverse value */
     Datum reconstructedValue; /* value reconstructed at parent */
+    void *traversalValue; /* opclass-specific traverse value */
     int level; /* current level (counting from zero) */
     bool returnData; /* original data must be returned? */