<!-- $PostgreSQL: pgsql/doc/src/sgml/xindex.sgml,v 1.65 2010/02/24 03:33:49 momjian Exp $ -->

<sect1 id="xindex">
 <title>Interfacing Extensions To Indexes</title>

 <indexterm zone="xindex">
  <primary>index</primary>
  <secondary>for user-defined data type</secondary>
 </indexterm>

 <para>
  The procedures described thus far let you define new types, new
  functions, and new operators. However, we cannot yet define an
  index on a column of a new data type. To do this, we must define an
  <firstterm>operator class</> for the new data type. Later in this
  section, we will illustrate this concept in an example: a new
  operator class for the B-tree index method that stores and sorts
  complex numbers in ascending absolute value order.
 </para>

 <para>
  Operator classes can be grouped into <firstterm>operator families</>
  to show the relationships between semantically compatible classes.
  When only a single data type is involved, an operator class is sufficient,
  so we'll focus on that case first and then return to operator families.
 </para>

 <sect2 id="xindex-opclass">
  <title>Index Methods and Operator Classes</title>

  <para>
   The <classname>pg_am</classname> table contains one row for every
   index method (internally known as access method). Support for
   regular access to tables is built into
   <productname>PostgreSQL</productname>, but all index methods are
   described in <classname>pg_am</classname>. It is possible to add a
   new index method by defining the required interface routines and
   then creating a row in <classname>pg_am</classname> &mdash; but that is
   beyond the scope of this chapter (see <xref linkend="indexam">).
  </para>

  <para>
   The routines for an index method do not directly know anything
   about the data types that the index method will operate on.
   Instead, an <firstterm>operator
   class</><indexterm><primary>operator class</></indexterm>
   identifies the set of operations that the index method needs to use
   to work with a particular data type. Operator classes are so
   called because one thing they specify is the set of
   <literal>WHERE</>-clause operators that can be used with an index
   (i.e., can be converted into an index-scan qualification). An
   operator class can also specify some <firstterm>support
   procedures</> that are needed by the internal operations of the
   index method, but do not directly correspond to any
   <literal>WHERE</>-clause operator that can be used with the index.
  </para>

  <para>
   It is possible to define multiple operator classes for the same
   data type and index method. By doing this, multiple
   sets of indexing semantics can be defined for a single data type.
   For example, a B-tree index requires a sort ordering to be defined
   for each data type it works on.
   It might be useful for a complex-number data type
   to have one B-tree operator class that sorts the data by complex
   absolute value, another that sorts by real part, and so on.
   Typically, one of the operator classes will be deemed most commonly
   useful and will be marked as the default operator class for that
   data type and index method.
  </para>

  <para>
   The same operator class name
   can be used for several different index methods (for example, both B-tree
   and hash index methods have operator classes named
   <literal>int4_ops</literal>), but each such class is an independent
   entity and must be defined separately.
  </para>
 </sect2>

 <sect2 id="xindex-strategies">
  <title>Index Method Strategies</title>

  <para>
   The operators associated with an operator class are identified by
   <quote>strategy numbers</>, which serve to identify the semantics of
   each operator within the context of its operator class.
   For example, B-trees impose a strict ordering on keys, lesser to greater,
   and so operators like <quote>less than</> and <quote>greater than or equal
   to</> are interesting with respect to a B-tree.
   Because
   <productname>PostgreSQL</productname> allows the user to define operators,
   <productname>PostgreSQL</productname> cannot look at the name of an operator
   (e.g., <literal>&lt;</> or <literal>&gt;=</>) and tell what kind of
   comparison it is. Instead, the index method defines a set of
   <quote>strategies</>, which can be thought of as generalized operators.
   Each operator class specifies which actual operator corresponds to each
   strategy for a particular data type and interpretation of the index
   semantics.
  </para>

  <para>
   The B-tree index method defines five strategies, shown in <xref
   linkend="xindex-btree-strat-table">.
  </para>

   <table tocentry="1" id="xindex-btree-strat-table">
    <title>B-tree Strategies</title>
    <tgroup cols="2">
     <thead>
      <row>
       <entry>Operation</entry>
       <entry>Strategy Number</entry>
      </row>
     </thead>
     <tbody>
      <row>
       <entry>less than</entry>
       <entry>1</entry>
      </row>
      <row>
       <entry>less than or equal</entry>
       <entry>2</entry>
      </row>
      <row>
       <entry>equal</entry>
       <entry>3</entry>
      </row>
      <row>
       <entry>greater than or equal</entry>
       <entry>4</entry>
      </row>
      <row>
       <entry>greater than</entry>
       <entry>5</entry>
      </row>
     </tbody>
    </tgroup>
</table>
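
  <para>
   As a rough illustration of why strategies can be treated as generalized
   operators, the following sketch maps each B-tree strategy number from the
   table above onto a test of a three-way comparison result. The enum and
   function names are hypothetical and exist only for this example; they are
   not taken from the <productname>PostgreSQL</productname> sources.
  </para>

```c
/* Hypothetical names; the values follow the B-tree strategy table above. */
enum btree_strategy
{
    STRAT_LESS = 1,
    STRAT_LESS_EQUAL = 2,
    STRAT_EQUAL = 3,
    STRAT_GREATER_EQUAL = 4,
    STRAT_GREATER = 5
};

/*
 * Given a strategy number and the result of a three-way comparison
 * (negative, zero, or positive for "key versus constant"), report
 * whether the key satisfies the generalized operator.
 * Returns 1 for true, 0 for false, and -1 for an unknown strategy.
 */
int
strategy_satisfied(int strategy, int cmp)
{
    switch (strategy)
    {
        case STRAT_LESS:
            return cmp < 0;
        case STRAT_LESS_EQUAL:
            return cmp <= 0;
        case STRAT_EQUAL:
            return cmp == 0;
        case STRAT_GREATER_EQUAL:
            return cmp >= 0;
        case STRAT_GREATER:
            return cmp > 0;
        default:
            return -1;
    }
}
```

  <para>
   The point of the sketch is that the index method reasons only about the
   numbers 1 through 5; which named operator stands behind each number is
   left to the operator class.
  </para>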

  <para>
   Hash indexes support only equality comparisons, and so they use only one
   strategy, shown in <xref linkend="xindex-hash-strat-table">.
  </para>

   <table tocentry="1" id="xindex-hash-strat-table">
    <title>Hash Strategies</title>
    <tgroup cols="2">
     <thead>
      <row>
       <entry>Operation</entry>
       <entry>Strategy Number</entry>
      </row>
     </thead>
     <tbody>
      <row>
       <entry>equal</entry>
       <entry>1</entry>
      </row>
     </tbody>
    </tgroup>
   </table>

  <para>
   GiST indexes are more flexible: they do not have a fixed set of
   strategies at all. Instead, the <quote>consistency</> support routine
   of each particular GiST operator class interprets the strategy numbers
   however it likes. As an example, several of the built-in GiST index
   operator classes index two-dimensional geometric objects, providing
   the <quote>R-tree</> strategies shown in
   <xref linkend="xindex-rtree-strat-table">. Four of these are true
   two-dimensional tests (overlaps, same, contains, contained by);
   four of them consider only the X direction; and the other four
   provide the same tests in the Y direction.
  </para>

   <table tocentry="1" id="xindex-rtree-strat-table">
    <title>GiST Two-Dimensional <quote>R-tree</> Strategies</title>
    <tgroup cols="2">
     <thead>
      <row>
       <entry>Operation</entry>
       <entry>Strategy Number</entry>
      </row>
     </thead>
     <tbody>
      <row>
       <entry>strictly left of</entry>
       <entry>1</entry>
      </row>
      <row>
       <entry>does not extend to right of</entry>
       <entry>2</entry>
      </row>
      <row>
       <entry>overlaps</entry>
       <entry>3</entry>
      </row>
      <row>
       <entry>does not extend to left of</entry>
       <entry>4</entry>
      </row>
      <row>
       <entry>strictly right of</entry>
       <entry>5</entry>
      </row>
      <row>
       <entry>same</entry>
       <entry>6</entry>
      </row>
      <row>
       <entry>contains</entry>
       <entry>7</entry>
      </row>
      <row>
       <entry>contained by</entry>
       <entry>8</entry>
      </row>
      <row>
       <entry>does not extend above</entry>
       <entry>9</entry>
      </row>
      <row>
       <entry>strictly below</entry>
       <entry>10</entry>
      </row>
      <row>
       <entry>strictly above</entry>
       <entry>11</entry>
      </row>
      <row>
       <entry>does not extend below</entry>
       <entry>12</entry>
      </row>
     </tbody>
    </tgroup>
   </table>

  <para>
   GIN indexes are similar to GiST indexes in flexibility: they don't have a
   fixed set of strategies. Instead the support routines of each operator
   class interpret the strategy numbers according to the operator class's
   definition. As an example, the strategy numbers used by the built-in
   operator classes for arrays are
   shown in <xref linkend="xindex-gin-array-strat-table">.
  </para>

   <table tocentry="1" id="xindex-gin-array-strat-table">
    <title>GIN Array Strategies</title>
    <tgroup cols="2">
     <thead>
      <row>
       <entry>Operation</entry>
       <entry>Strategy Number</entry>
      </row>
     </thead>
     <tbody>
      <row>
       <entry>overlap</entry>
       <entry>1</entry>
      </row>
      <row>
       <entry>contains</entry>
       <entry>2</entry>
      </row>
      <row>
       <entry>is contained by</entry>
       <entry>3</entry>
      </row>
      <row>
       <entry>equal</entry>
       <entry>4</entry>
      </row>
     </tbody>
    </tgroup>
   </table>

  <para>
   Notice that all strategy operators return Boolean values. In
   practice, all operators defined as index method strategies must
   return type <type>boolean</type>, since they must appear at the top
   level of a <literal>WHERE</> clause to be used with an index.
  </para>
 </sect2>

 <sect2 id="xindex-support">
  <title>Index Method Support Routines</title>

  <para>
   Strategies aren't usually enough information for the system to figure
   out how to use an index. In practice, the index methods require
   additional support routines in order to work. For example, the B-tree
   index method must be able to compare two keys and determine whether one
   is greater than, equal to, or less than the other. Similarly, the
   hash index method must be able to compute hash codes for key values.
   These operations do not correspond to operators used in qualifications in
   SQL commands; they are administrative routines used by
   the index methods, internally.
  </para>

  <para>
   Just as with strategies, the operator class identifies which specific
   functions should play each of these roles for a given data type and
   semantic interpretation. The index method defines the set
   of functions it needs, and the operator class identifies the correct
   functions to use by assigning them to the <quote>support function numbers</>
   specified by the index method.
  </para>

  <para>
   B-trees require a single support function, shown in <xref
   linkend="xindex-btree-support-table">.
  </para>

   <table tocentry="1" id="xindex-btree-support-table">
    <title>B-tree Support Functions</title>
    <tgroup cols="2">
     <thead>
      <row>
       <entry>Function</entry>
       <entry>Support Number</entry>
      </row>
     </thead>
     <tbody>
      <row>
       <entry>
        Compare two keys and return an integer less than zero, zero, or
        greater than zero, indicating whether the first key is less than,
        equal to, or greater than the second
       </entry>
       <entry>1</entry>
      </row>
     </tbody>
    </tgroup>
</table>
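
  <para>
   To make the contract in the table above concrete, here is a minimal
   sketch of a three-way comparison for plain C integers. It is an
   illustration only: the function name is hypothetical, and a real support
   function would be written against the
   <productname>PostgreSQL</productname> function-call interface rather than
   as a bare C function.
  </para>

```c
/*
 * Illustrative three-way comparison for plain C ints (not PostgreSQL's
 * own code).  Comparing explicitly, instead of returning a - b, avoids
 * signed overflow when the operands have opposite signs and large
 * magnitudes.
 */
int
int_three_way_cmp(int a, int b)
{
    if (a < b)
        return -1;
    if (a > b)
        return 1;
    return 0;
}
```

  <para>
   Only the sign of the result matters to the caller, so returning exactly
   -1, 0, or 1 is a convenience rather than a requirement of the contract
   described in the table.
  </para>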

  <para>
   Hash indexes likewise require one support function, shown in <xref
   linkend="xindex-hash-support-table">.
  </para>

   <table tocentry="1" id="xindex-hash-support-table">
    <title>Hash Support Functions</title>
    <tgroup cols="2">
     <thead>
      <row>
       <entry>Function</entry>
       <entry>Support Number</entry>
      </row>
     </thead>
     <tbody>
      <row>
       <entry>Compute the hash value for a key</entry>
       <entry>1</entry>
      </row>
     </tbody>
    </tgroup>
</table>
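
  <para>
   The sketch below shows what <quote>compute the hash value for a key</>
   can look like in its simplest form. It uses the 32-bit FNV-1a algorithm
   purely as a compact stand-in; this is an assumption made for the example
   and is not the hash function that <productname>PostgreSQL</productname>
   itself uses. The function name is likewise hypothetical.
  </para>

```c
/*
 * Illustrative 32-bit FNV-1a hash over the bytes of a key.  This is a
 * stand-in chosen for brevity, not the hash function PostgreSQL uses.
 * Keys that the equality operator treats as equal must hash alike,
 * so only bytes that are significant for equality should be hashed.
 */
unsigned long
key_hash(const unsigned char *key, unsigned long len)
{
    unsigned long hash = 2166136261UL;      /* FNV offset basis */
    unsigned long i;

    for (i = 0; i < len; i++)
    {
        hash ^= key[i];
        /* multiply by the FNV prime and keep only the low 32 bits */
        hash = (hash * 16777619UL) & 0xFFFFFFFFUL;
    }
    return hash;
}
```

  <para>
   Whatever algorithm is chosen, the property that matters to a hash index
   is consistency with the equality operator of the same operator class.
  </para>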

  <para>
   GiST indexes require seven support functions,
   shown in <xref linkend="xindex-gist-support-table">.
  </para>

   <table tocentry="1" id="xindex-gist-support-table">
    <title>GiST Support Functions</title>
    <tgroup cols="2">
     <thead>
      <row>
       <entry>Function</entry>
       <entry>Support Number</entry>
      </row>
     </thead>
     <tbody>
      <row>
       <entry>consistent - determine whether key satisfies the
        query qualifier</entry>
       <entry>1</entry>
      </row>
      <row>
       <entry>union - compute union of a set of keys</entry>
       <entry>2</entry>
      </row>
      <row>
       <entry>compress - compute a compressed representation of a key or value
        to be indexed</entry>
       <entry>3</entry>
      </row>
      <row>
       <entry>decompress - compute a decompressed representation of a
        compressed key</entry>
       <entry>4</entry>
      </row>
      <row>
       <entry>penalty - compute penalty for inserting new key into subtree
        with given subtree's key</entry>
       <entry>5</entry>
      </row>
      <row>
       <entry>picksplit - determine which entries of a page are to be moved
        to the new page and compute the union keys for resulting pages</entry>
       <entry>6</entry>
      </row>
      <row>
       <entry>equal - compare two keys and return true if they are equal</entry>
       <entry>7</entry>
      </row>
     </tbody>
    </tgroup>
</table>
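
  <para>
   To give a feel for two of the roles listed above, the following sketch
   implements <quote>union</> and an overlaps-style <quote>consistent</>
   test for a one-dimensional interval key. The type and function names are
   hypothetical, and the signatures are deliberately simplified: real GiST
   support functions use the <productname>PostgreSQL</productname>
   function-call interface and different argument types.
  </para>

```c
/* Hypothetical one-dimensional key: the closed interval [lo, hi]. */
typedef struct
{
    double lo;
    double hi;
} interval_key;

/* "union": the smallest interval covering every key in the set (n >= 1). */
interval_key
interval_union(const interval_key *keys, int n)
{
    interval_key result = keys[0];
    int i;

    for (i = 1; i < n; i++)
    {
        if (keys[i].lo < result.lo)
            result.lo = keys[i].lo;
        if (keys[i].hi > result.hi)
            result.hi = keys[i].hi;
    }
    return result;
}

/*
 * "consistent" for an overlaps query: returns 1 if the key (a leaf
 * entry, or the union key of a subtree) could overlap the query
 * interval, and 0 if the entry or subtree can be skipped.
 */
int
interval_consistent_overlaps(interval_key key, interval_key query)
{
    if (key.lo > query.hi)
        return 0;
    if (query.lo > key.hi)
        return 0;
    return 1;
}
```

  <para>
   Because a union key covers everything beneath it, a negative answer for
   the union key lets a search skip the whole subtree, while a positive
   answer only means the subtree has to be examined.
  </para>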

  <para>
   GIN indexes require four support functions and accept an optional fifth,
   all shown in <xref linkend="xindex-gin-support-table">.
  </para>

   <table tocentry="1" id="xindex-gin-support-table">
    <title>GIN Support Functions</title>
    <tgroup cols="2">
     <thead>
      <row>
       <entry>Function</entry>
       <entry>Support Number</entry>
      </row>
     </thead>
     <tbody>
      <row>
       <entry>
        compare - compare two keys and return an integer less than zero, zero,
        or greater than zero, indicating whether the first key is less than,
        equal to, or greater than the second
       </entry>
       <entry>1</entry>
      </row>
      <row>
       <entry>extractValue - extract keys from a value to be indexed</entry>
       <entry>2</entry>
      </row>
      <row>
       <entry>extractQuery - extract keys from a query condition</entry>
       <entry>3</entry>
      </row>
      <row>
       <entry>consistent - determine whether value matches query condition</entry>
       <entry>4</entry>
      </row>
      <row>
       <entry>comparePartial - (optional method) compare partial key from
        query and key from index, and return an integer less than zero, zero,
        or greater than zero, indicating whether GIN should ignore this index
        entry, treat the entry as a match, or stop the index scan</entry>
       <entry>5</entry>
      </row>
     </tbody>
    </tgroup>
</table>
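
  <para>
   The three-way result of <quote>comparePartial</> is easiest to see with
   prefix matching. The sketch below is a simplified illustration with a
   hypothetical name and plain C strings; it assumes that index keys are
   visited in ascending byte order, and it does not follow the real GIN
   function signature.
  </para>

```c
/*
 * Illustrative prefix match in the style of comparePartial.
 * Returns a negative value to ignore the index entry, zero to treat it
 * as a match, and a positive value to stop the scan.  Assumes keys are
 * visited in ascending byte order.
 */
int
prefix_compare_partial(const char *partial, const char *key)
{
    int i;

    for (i = 0; partial[i] != '\0'; i++)
    {
        unsigned char p = (unsigned char) partial[i];
        unsigned char k = (unsigned char) key[i];

        if (k < p)
            return -1;      /* key sorts before the prefix: ignore entry */
        if (k > p)
            return 1;       /* key sorts after every match: stop the scan */
    }
    return 0;               /* key starts with the prefix: match */
}
```

  <para>
   A key that is shorter than the prefix ends at its terminating zero byte,
   which compares lower than any prefix character, so the loop returns
   before reading past the end of the key.
  </para>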

  <para>
   Unlike strategy operators, support functions return whichever data
   type the particular index method expects; for example in the case
   of the comparison function for B-trees, a signed integer. The number
   and types of the arguments to each support function are likewise
   dependent on the index method. For B-tree and hash the support functions
   take the same input data types as do the operators included in the operator
   class, but this is not the case for most GIN and GiST support functions.
  </para>
 </sect2>

 <sect2 id="xindex-example">
  <title>An Example</title>

  <para>
   Now that we have seen the ideas, here is the promised example of
   creating a new operator class.
   (You can find a working copy of this example in
   <filename>src/tutorial/complex.c</filename> and
   <filename>src/tutorial/complex.sql</filename> in the source
   distribution.)
   The operator class encapsulates
   operators that sort complex numbers in absolute value order, so we
   choose the name <literal>complex_abs_ops</literal>. First, we need
   a set of operators. The procedure for defining operators was
   discussed in <xref linkend="xoper">. For an operator class on
   B-trees, the operators we require are:

   <itemizedlist spacing="compact">
    <listitem><simpara>absolute-value less-than (strategy 1)</></>
    <listitem><simpara>absolute-value less-than-or-equal (strategy 2)</></>
    <listitem><simpara>absolute-value equal (strategy 3)</></>
    <listitem><simpara>absolute-value greater-than-or-equal (strategy 4)</></>
    <listitem><simpara>absolute-value greater-than (strategy 5)</></>
   </itemizedlist>
  </para>

  <para>
   The least error-prone way to define a related set of comparison operators
   is to write the B-tree comparison support function first, and then write the
   other functions as one-line wrappers around the support function. This
   reduces the odds of getting inconsistent results for corner cases.
   Following this approach, we first write:

<programlisting><![CDATA[
/* Squared magnitude; it orders values the same way as the absolute value */
#define Mag(c)  ((c)->x*(c)->x + (c)->y*(c)->y)

static int
complex_abs_cmp_internal(Complex *a, Complex *b)
{
    double      amag = Mag(a),
                bmag = Mag(b);

    if (amag < bmag)
        return -1;
    if (amag > bmag)
        return 1;
    return 0;
}
]]>
</programlisting>

   Now the less-than function looks like:

<programlisting><![CDATA[
PG_FUNCTION_INFO_V1(complex_abs_lt);

Datum
complex_abs_lt(PG_FUNCTION_ARGS)
{
    Complex    *a = (Complex *) PG_GETARG_POINTER(0);
    Complex    *b = (Complex *) PG_GETARG_POINTER(1);

    /* true when a's magnitude is strictly smaller than b's */
    PG_RETURN_BOOL(complex_abs_cmp_internal(a, b) < 0);
}
]]>
</programlisting>

   The other four functions differ only in how they compare the internal
   function's result to zero.
</para>
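
  <para>
   The pattern of one comparison routine plus five thin wrappers can be
   tried outside the server. The sketch below restates it in standalone C,
   using a stand-in struct and hypothetical function names in place of the
   tutorial's <type>Complex</type> type and the
   <productname>PostgreSQL</productname> function-call macros, so it
   illustrates the structure only and is not the tutorial code itself.
  </para>

```c
/* Stand-in for the tutorial's Complex type; no PostgreSQL headers needed. */
typedef struct
{
    double x;
    double y;
} complex_sketch;

/* Three-way comparison on squared magnitude, as in the listing above. */
static int
sketch_abs_cmp(const complex_sketch *a, const complex_sketch *b)
{
    double amag = a->x * a->x + a->y * a->y;
    double bmag = b->x * b->x + b->y * b->y;

    if (amag < bmag)
        return -1;
    if (amag > bmag)
        return 1;
    return 0;
}

/* Each operator function is a one-line test of the comparison result. */
int
sketch_abs_lt(const complex_sketch *a, const complex_sketch *b)
{
    return sketch_abs_cmp(a, b) < 0;
}

int
sketch_abs_le(const complex_sketch *a, const complex_sketch *b)
{
    return sketch_abs_cmp(a, b) <= 0;
}

int
sketch_abs_eq(const complex_sketch *a, const complex_sketch *b)
{
    return sketch_abs_cmp(a, b) == 0;
}

int
sketch_abs_ge(const complex_sketch *a, const complex_sketch *b)
{
    return sketch_abs_cmp(a, b) >= 0;
}

int
sketch_abs_gt(const complex_sketch *a, const complex_sketch *b)
{
    return sketch_abs_cmp(a, b) > 0;
}
```

  <para>
   Since every wrapper defers to the same comparison, the five results can
   never contradict one another for a given pair of inputs, which is the
   consistency the approach is meant to secure.
  </para>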

  <para>
   Next we declare the functions and the operators based on the functions
   to SQL:

<programlisting>
CREATE FUNCTION complex_abs_lt(complex, complex) RETURNS bool
    AS '<replaceable>filename</replaceable>', 'complex_abs_lt'
    LANGUAGE C IMMUTABLE STRICT;

CREATE OPERATOR &lt; (
   leftarg = complex, rightarg = complex, procedure = complex_abs_lt,
   commutator = &gt; , negator = &gt;= ,
   restrict = scalarltsel, join = scalarltjoinsel
);
</programlisting>
   It is important to specify the correct commutator and negator operators,
   as well as suitable restriction and join selectivity
   functions; otherwise the optimizer will be unable to make effective
   use of the index. Note that the less-than, equal, and
   greater-than cases should use different selectivity functions.
  </para>

  <para>
   Other things worth noting are happening here:

  <itemizedlist>
   <listitem>
    <para>
     There can only be one operator named, say, <literal>=</literal>
     and taking type <type>complex</type> for both operands. In this
     case we don't have any other operator <literal>=</literal> for
     <type>complex</type>, but if we were building a practical data
     type we'd probably want <literal>=</literal> to be the ordinary
     equality operation for complex numbers (and not the equality of
     the absolute values). In that case, we'd need to use some other
     operator name for <function>complex_abs_eq</>.
    </para>
   </listitem>

   <listitem>
    <para>
     Although <productname>PostgreSQL</productname> can cope with
     functions having the same SQL name as long as they have different
     argument data types, C can only cope with one global function
     having a given name. So we shouldn't name the C function
     something simple like <filename>abs_eq</filename>. Usually it's
     a good practice to include the data type name in the C function
     name, so as not to conflict with functions for other data types.
    </para>
   </listitem>

   <listitem>
    <para>
     We could have made the SQL name
     of the function <filename>abs_eq</filename>, relying on
     <productname>PostgreSQL</productname> to distinguish it by
     argument data types from any other SQL function of the same name.
     To keep the example simple, we give the function the same
     name at the C level and the SQL level.
    </para>
   </listitem>
  </itemizedlist>
  </para>

  <para>
   The next step is the registration of the support routine required
   by B-trees. The example C code that implements this is in the same
   file that contains the operator functions. This is how we declare
   the function:

<programlisting>
CREATE FUNCTION complex_abs_cmp(complex, complex)
    RETURNS integer
    AS '<replaceable>filename</replaceable>'
    LANGUAGE C IMMUTABLE STRICT;
</programlisting>
  </para>

  <para>
   Now that we have the required operators and support routine,
   we can finally create the operator class:

<programlisting><![CDATA[
CREATE OPERATOR CLASS complex_abs_ops
    DEFAULT FOR TYPE complex USING btree AS
        OPERATOR        1       < ,
        OPERATOR        2       <= ,
        OPERATOR        3       = ,
        OPERATOR        4       >= ,
        OPERATOR        5       > ,
        FUNCTION        1       complex_abs_cmp(complex, complex);
]]>
</programlisting>
  </para>

  <para>
   And we're done! It should now be possible to create
   and use B-tree indexes on <type>complex</type> columns.
  </para>

  <para>
   We could have written the operator entries more verbosely, as in:
<programlisting>
        OPERATOR        1       &lt; (complex, complex) ,
</programlisting>
   but there is no need to do so when the operators take the same data type
   we are defining the operator class for.
  </para>
|
2002-01-07 03:29:15 +01:00
|
|
|
|
2002-07-30 07:24:56 +02:00
|
|
|
<para>
|
|
|
|
The above example assumes that you want to make this new operator class the
|
|
|
|
default B-tree operator class for the <type>complex</type> data type.
|
|
|
|
If you don't, just leave out the word <literal>DEFAULT</>.
|
2002-01-07 03:29:15 +01:00
|
|
|
</para>
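
<para>
A sketch of the non-default variant follows. Without
<literal>DEFAULT</>, the operator class has to be named explicitly in
<command>CREATE INDEX</>. The table <literal>test_complex</> and the index
name are hypothetical.

<programlisting><![CDATA[
CREATE OPERATOR CLASS complex_abs_ops
    FOR TYPE complex USING btree AS
        OPERATOR        1       < ,
        OPERATOR        2       <= ,
        OPERATOR        3       = ,
        OPERATOR        4       >= ,
        OPERATOR        5       > ,
        FUNCTION        1       complex_abs_cmp(complex, complex);

-- hypothetical table; the operator class is named after the column
CREATE TABLE test_complex (a complex);
CREATE INDEX test_cplx_abs_ind ON test_complex USING btree (a complex_abs_ops);
]]>
</programlisting>
</para>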
</sect2>

<sect2 id="xindex-opfamily">
<title>Operator Classes and Operator Families</title>

<para>
So far we have implicitly assumed that an operator class deals with
only one data type. While there certainly can be only one data type in
a particular index column, it is often useful to index operations that
compare an indexed column to a value of a different data type. Also,
if there is use for a cross-data-type operator in connection with an
operator class, it is often the case that the other data type has a
related operator class of its own. It is helpful to make the connections
between related classes explicit, because this can aid the planner in
optimizing SQL queries (particularly for B-tree operator classes, since
the planner contains a great deal of knowledge about how to work with them).
</para>

<para>
To handle these needs, <productname>PostgreSQL</productname>
uses the concept of an <firstterm>operator
family</><indexterm><primary>operator family</></indexterm>.
An operator family contains one or more operator classes, and can also
contain indexable operators and corresponding support functions that
belong to the family as a whole but not to any single class within the
family. We say that such operators and functions are <quote>loose</>
within the family, as opposed to being bound into a specific class.
Typically each operator class contains single-data-type operators
while cross-data-type operators are loose in the family.
</para>

<para>
All the operators and functions in an operator family must have compatible
semantics, where the compatibility requirements are set by the index
method. You might therefore wonder why we bother to single out particular
subsets of the family as operator classes; and indeed for many purposes
the class divisions are irrelevant and the family is the only interesting
grouping. The reason for defining operator classes is that they specify
how much of the family is needed to support any particular index.
If there is an index using an operator class, then that operator class
cannot be dropped without dropping the index &mdash; but other parts of
the operator family, namely other operator classes and loose operators,
could be dropped. Thus, an operator class should be specified to contain
the minimum set of operators and functions that are reasonably needed
to work with an index on a specific data type, and then related but
non-essential operators can be added as loose members of the operator
family.
</para>
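
<para>
The difference in dependencies can be sketched as follows. The family
<literal>my_integer_ops</> and the class <literal>my_int8_ops</> are
hypothetical user-defined objects, assumed to hold loose
<type>int8</>-versus-<type>int2</> members and to back an existing index
on an <type>int8</> column.

<programlisting>
-- loose members can be removed even though indexes use the family
ALTER OPERATOR FAMILY my_integer_ops USING btree DROP
    OPERATOR 1 (int8, int2) ,
    OPERATOR 2 (int8, int2) ,
    OPERATOR 3 (int8, int2) ,
    OPERATOR 4 (int8, int2) ,
    OPERATOR 5 (int8, int2) ,
    FUNCTION 1 (int8, int2) ;

-- expected to be rejected while an index still depends on the class
DROP OPERATOR CLASS my_int8_ops USING btree;
</programlisting>
</para>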

<para>
As an example, <productname>PostgreSQL</productname> has a built-in
B-tree operator family <literal>integer_ops</>, which includes operator
classes <literal>int8_ops</>, <literal>int4_ops</>, and
<literal>int2_ops</> for indexes on <type>bigint</> (<type>int8</>),
<type>integer</> (<type>int4</>), and <type>smallint</> (<type>int2</>)
columns respectively. The family also contains cross-data-type comparison
operators allowing any two of these types to be compared, so that an index
on one of these types can be searched using a comparison value of another
type. The family could be duplicated by these definitions:

<programlisting><![CDATA[
CREATE OPERATOR FAMILY integer_ops USING btree;

CREATE OPERATOR CLASS int8_ops
DEFAULT FOR TYPE int8 USING btree FAMILY integer_ops AS
  -- standard int8 comparisons
  OPERATOR 1 < ,
  OPERATOR 2 <= ,
  OPERATOR 3 = ,
  OPERATOR 4 >= ,
  OPERATOR 5 > ,
  FUNCTION 1 btint8cmp(int8, int8) ;

CREATE OPERATOR CLASS int4_ops
DEFAULT FOR TYPE int4 USING btree FAMILY integer_ops AS
  -- standard int4 comparisons
  OPERATOR 1 < ,
  OPERATOR 2 <= ,
  OPERATOR 3 = ,
  OPERATOR 4 >= ,
  OPERATOR 5 > ,
  FUNCTION 1 btint4cmp(int4, int4) ;

CREATE OPERATOR CLASS int2_ops
DEFAULT FOR TYPE int2 USING btree FAMILY integer_ops AS
  -- standard int2 comparisons
  OPERATOR 1 < ,
  OPERATOR 2 <= ,
  OPERATOR 3 = ,
  OPERATOR 4 >= ,
  OPERATOR 5 > ,
  FUNCTION 1 btint2cmp(int2, int2) ;

ALTER OPERATOR FAMILY integer_ops USING btree ADD
  -- cross-type comparisons int8 vs int2
  OPERATOR 1 < (int8, int2) ,
  OPERATOR 2 <= (int8, int2) ,
  OPERATOR 3 = (int8, int2) ,
  OPERATOR 4 >= (int8, int2) ,
  OPERATOR 5 > (int8, int2) ,
  FUNCTION 1 btint82cmp(int8, int2) ,

  -- cross-type comparisons int8 vs int4
  OPERATOR 1 < (int8, int4) ,
  OPERATOR 2 <= (int8, int4) ,
  OPERATOR 3 = (int8, int4) ,
  OPERATOR 4 >= (int8, int4) ,
  OPERATOR 5 > (int8, int4) ,
  FUNCTION 1 btint84cmp(int8, int4) ,

  -- cross-type comparisons int4 vs int2
  OPERATOR 1 < (int4, int2) ,
  OPERATOR 2 <= (int4, int2) ,
  OPERATOR 3 = (int4, int2) ,
  OPERATOR 4 >= (int4, int2) ,
  OPERATOR 5 > (int4, int2) ,
  FUNCTION 1 btint42cmp(int4, int2) ,

  -- cross-type comparisons int4 vs int8
  OPERATOR 1 < (int4, int8) ,
  OPERATOR 2 <= (int4, int8) ,
  OPERATOR 3 = (int4, int8) ,
  OPERATOR 4 >= (int4, int8) ,
  OPERATOR 5 > (int4, int8) ,
  FUNCTION 1 btint48cmp(int4, int8) ,

  -- cross-type comparisons int2 vs int8
  OPERATOR 1 < (int2, int8) ,
  OPERATOR 2 <= (int2, int8) ,
  OPERATOR 3 = (int2, int8) ,
  OPERATOR 4 >= (int2, int8) ,
  OPERATOR 5 > (int2, int8) ,
  FUNCTION 1 btint28cmp(int2, int8) ,

  -- cross-type comparisons int2 vs int4
  OPERATOR 1 < (int2, int4) ,
  OPERATOR 2 <= (int2, int4) ,
  OPERATOR 3 = (int2, int4) ,
  OPERATOR 4 >= (int2, int4) ,
  OPERATOR 5 > (int2, int4) ,
  FUNCTION 1 btint24cmp(int2, int4) ;
]]>
</programlisting>

Notice that this definition <quote>overloads</> the operator strategy and
support function numbers: each number occurs multiple times within the
family. This is allowed so long as each instance of a
particular number has distinct input data types. The instances that have
both input types equal to an operator class's input type are the
primary operators and support functions for that operator class,
and in most cases should be declared as part of the operator class rather
than as loose members of the family.
</para>
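
<para>
One way to see this overloading is to list the family's members from the
system catalogs. The query below is a sketch that assumes the
<classname>pg_amop</>, <classname>pg_opfamily</>, and
<classname>pg_am</> layouts of a release that has operator families; each
strategy number should appear once per pair of input types.

<programlisting>
SELECT ao.amoplefttype::regtype  AS lefttype,
       ao.amoprighttype::regtype AS righttype,
       ao.amopstrategy           AS strategy,
       ao.amopopr::regoperator   AS operator
FROM pg_amop ao
     JOIN pg_opfamily f ON f.oid = ao.amopfamily
     JOIN pg_am am ON am.oid = f.opfmethod
WHERE f.opfname = 'integer_ops'
  AND am.amname = 'btree'
ORDER BY 1, 2, 3;
</programlisting>
</para>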

<para>
In a B-tree operator family, all the operators in the family must sort
compatibly, meaning that the transitive laws hold across all the data types
supported by the family: <quote>if A = B and B = C, then A =
C</>, and <quote>if A &lt; B and B &lt; C, then A &lt; C</>. For each
operator in the family there must be a support function having the same
two input data types as the operator. It is recommended that a family be
complete, i.e., for each combination of data types, all operators are
included. Each operator class should include just the non-cross-type
operators and support function for its data type.
</para>
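
<para>
Because the cross-type members sort compatibly with the single-type ones,
an index on one integer type can be searched with a value of another.
The following sketch uses a hypothetical table <literal>orders</>; whether
the planner actually picks the index depends on table size and statistics.

<programlisting>
-- hypothetical table with an int4 column
CREATE TABLE orders (id int4);
CREATE INDEX orders_id_idx ON orders USING btree (id);

-- the comparison value is int8; the int4 = int8 operator in the
-- integer_ops family lets the int4 index be considered
EXPLAIN SELECT * FROM orders WHERE id = 42::int8;
</programlisting>
</para>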

<para>
To build a multiple-data-type hash operator family, compatible hash
support functions must be created for each data type supported by the
family. Here compatibility means that the functions are guaranteed to
return the same hash code for any two values that are considered equal
by the family's equality operators, even when the values are of different
types. This is usually difficult to accomplish when the types have
different physical representations, but it can be done in some cases.
Notice that there is only one support function per data type, not one
per equality operator. It is recommended that a family be complete, i.e.,
provide an equality operator for each combination of data types.
Each operator class should include just the non-cross-type equality
operator and the support function for its data type.
</para>
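
<para>
The compatibility requirement can be checked informally with the built-in
integer hash functions. This sketch assumes that
<function>hashint2</>, <function>hashint4</>, and <function>hashint8</>
are the hash support functions of the integer types; if they are
compatible in the sense described above, both result columns should be
true for any value that fits in all three types.

<programlisting>
SELECT hashint2(42::int2) = hashint4(42::int4) AS int2_matches_int4,
       hashint4(42::int4) = hashint8(42::int8) AS int4_matches_int8;
</programlisting>
</para>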

<para>
GIN and GiST indexes do not have any explicit notion of cross-data-type
operations. The set of operators supported is just whatever the primary
support functions for a given operator class can handle.
</para>

<note>
<para>
Prior to <productname>PostgreSQL</productname> 8.3, there was no concept
of operator families, and so any cross-data-type operators intended to be
used with an index had to be bound directly into the index's operator
class. While this approach still works, it is deprecated because it
makes an index's dependencies too broad, and because the planner can
handle cross-data-type comparisons more effectively when both data types
have operators in the same operator family.
</para>
</note>
</sect2>

<sect2 id="xindex-opclass-dependencies">
<title>System Dependencies on Operator Classes</title>

<indexterm>
<primary>ordering operator</primary>
</indexterm>

<para>
<productname>PostgreSQL</productname> uses operator classes to infer the
properties of operators in more ways than just whether they can be used
with indexes. Therefore, you might want to create operator classes
even if you have no intention of indexing any columns of your data type.
</para>

<para>
In particular, there are SQL features such as <literal>ORDER BY</> and
<literal>DISTINCT</> that require comparison and sorting of values.
To implement these features on a user-defined data type,
<productname>PostgreSQL</productname> looks for the default B-tree operator
class for the data type. The <quote>equals</> member of this operator
class defines the system's notion of equality of values for
<literal>GROUP BY</> and <literal>DISTINCT</>, and the sort ordering
imposed by the operator class defines the default <literal>ORDER BY</>
ordering.
</para>
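
<para>
As an illustration, once a default B-tree operator class such as
<literal>complex_abs_ops</> exists for <type>complex</>, queries like the
following become possible even if no index is ever created. The table
<literal>measurements</> and its column <literal>z</> are hypothetical.

<programlisting>
-- hypothetical table; no index is required for these queries
CREATE TABLE measurements (z complex);

-- ordering comes from the default B-tree operator class
SELECT z FROM measurements ORDER BY z;

-- equality for grouping and duplicate removal comes from its "equals" member
SELECT DISTINCT z FROM measurements;
SELECT z, count(*) FROM measurements GROUP BY z;
</programlisting>
</para>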

<para>
Comparison of arrays of user-defined types also relies on the semantics
defined by the default B-tree operator class.
</para>

<para>
If there is no default B-tree operator class for a data type, the system
will look for a default hash operator class. But since that kind of
operator class only provides equality, in practice it is only enough
to support array equality.
</para>

<para>
When there is no default operator class for a data type, you will get
errors like <quote>could not identify an ordering operator</> if you
try to use these SQL features with the data type.
</para>
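
<para>
If such an error appears, a catalog query along the following lines can
show which default operator classes, if any, exist for the type. This is
a sketch that assumes a type named <type>complex</> and the
<classname>pg_opclass</> columns of a release that has operator families.

<programlisting>
SELECT am.amname  AS index_method,
       oc.opcname AS operator_class
FROM pg_opclass oc
     JOIN pg_am am ON am.oid = oc.opcmethod
WHERE oc.opcintype = 'complex'::regtype
  AND oc.opcdefault;
</programlisting>

An empty result, or one with no <literal>btree</> row, would be consistent
with the ordering features being unavailable for the type.
</para>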

<para>
Another important point is that an operator that
appears in a hash operator family is a candidate for hash joins,
hash aggregation, and related optimizations. The hash operator family
is essential here since it identifies the hash function(s) to use.
</para>
</sect2>

<sect2 id="xindex-opclass-features">
<title>Special Features of Operator Classes</title>

<para>
There are two special features of operator classes that we have
not discussed yet, mainly because they are not useful
with the most commonly used index methods.
</para>

<para>
Normally, declaring an operator as a member of an operator class
(or family) means that the index method can retrieve exactly the set of rows
that satisfy a <literal>WHERE</> condition using the operator. For example:
<programlisting>
SELECT * FROM table WHERE integer_column &lt; 4;
</programlisting>
can be satisfied exactly by a B-tree index on the integer column.
But there are cases where an index is useful as an inexact guide to
the matching rows. For example, if a GiST index stores only bounding boxes
for geometric objects, then it cannot exactly satisfy a <literal>WHERE</>
condition that tests overlap between nonrectangular objects such as
polygons. Yet we could use the index to find objects whose bounding
box overlaps the bounding box of the target object, and then do the
exact overlap test only on the objects found by the index. If this
scenario applies, the index is said to be <quote>lossy</> for the
operator. Lossy index searches are implemented by having the index
method return a <firstterm>recheck</> flag when a row might or might
not really satisfy the query condition. The core system will then
test the original query condition on the retrieved row to see whether
it should be returned as a valid match. This approach works if
the index is guaranteed to return all the required rows, plus perhaps
some additional rows, which can be eliminated by performing the original
operator invocation. The index methods that support lossy searches
(currently, GiST and GIN) allow the support functions of individual
operator classes to set the recheck flag, and so this is essentially an
operator-class feature.
</para>
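
<para>
The polygon case can be tried with the built-in geometric types. In this
sketch the table <literal>shapes</> and the index name are hypothetical.
The GiST index can only narrow the search by bounding box, so the
<literal>&amp;&amp;</> overlap condition would have to be rechecked
against each candidate row.

<programlisting><![CDATA[
-- hypothetical table of polygons
CREATE TABLE shapes (outline polygon);
CREATE INDEX shapes_outline_idx ON shapes USING gist (outline);

-- overlap test against a triangle; index matches are candidates only
SELECT * FROM shapes
WHERE outline && '((0,0),(2,0),(1,2))'::polygon;
]]>
</programlisting>
</para>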

<para>
Consider again the situation where we are storing in the index only
the bounding box of a complex object such as a polygon. In this
case there's not much value in storing the whole polygon in the index
entry &mdash; we might as well store just a simpler object of type
<type>box</>. This situation is expressed by the <literal>STORAGE</>
option in <command>CREATE OPERATOR CLASS</>: we'd write something like:

<programlisting>
CREATE OPERATOR CLASS polygon_ops
    DEFAULT FOR TYPE polygon USING gist AS
        ...
        STORAGE box;
</programlisting>

At present, only the GiST and GIN index methods support a
<literal>STORAGE</> type that's different from the column data type.
The GiST <function>compress</> and <function>decompress</> support
routines must deal with data-type conversion when <literal>STORAGE</>
is used. In GIN, the <literal>STORAGE</> type identifies the type of
the <quote>key</> values, which normally is different from the type
of the indexed column &mdash; for example, an operator class for
integer-array columns might have keys that are just integers. The
GIN <function>extractValue</> and <function>extractQuery</> support
routines are responsible for extracting keys from indexed values.
</para>
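
<para>
The GIN case can be sketched with a built-in array type. The table
<literal>docs</> and the index name are hypothetical. The second query
assumes the <structfield>opckeytype</> column of
<classname>pg_opclass</>, which records a storage type that differs from
the indexed type.

<programlisting>
-- hypothetical table; the index keys are the individual integers
CREATE TABLE docs (tags integer[]);
CREATE INDEX docs_tags_idx ON docs USING gin (tags);

-- "contains" search that the GIN index can support
SELECT * FROM docs WHERE tags @> ARRAY[1, 2];

-- compare indexed type and key (storage) type of the GIN operator classes
SELECT oc.opcname,
       oc.opcintype::regtype  AS indexed_type,
       oc.opckeytype::regtype AS key_type
FROM pg_opclass oc
     JOIN pg_am am ON am.oid = oc.opcmethod
WHERE am.amname = 'gin';
</programlisting>
</para>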
</sect2>

</sect1>