<!--
$Header: /cvsroot/pgsql/doc/src/sgml/xindex.sgml,v 1.29 2002/09/21 18:32:54 petere Exp $
PostgreSQL documentation
-->

<chapter id="xindex">
 <title>Interfacing Extensions To Indexes</title>

 <sect1 id="xindex-intro">
  <title>Introduction</title>

  <para>
   The procedures described thus far let you define new types, new
   functions, and new operators. However, we cannot yet define an
   index (using the B-tree, R-tree, or hash access method) on a column
   of a new type, nor use operators of a new type in index scans.
   To do these things, we must define an <firstterm>operator class</>
   for the new data type. We will describe operator classes in the
   context of a running example: a new operator class for the B-tree
   access method that stores and sorts complex numbers in ascending
   order of absolute value.
  </para>

  <note>
   <para>
    Prior to <productname>PostgreSQL</productname> release 7.3, it was
    necessary to make manual additions to
    <classname>pg_amop</>, <classname>pg_amproc</>, and
    <classname>pg_opclass</> in order to create a user-defined
    operator class. That approach is now deprecated in favor of
    using <command>CREATE OPERATOR CLASS</>, which is a much simpler
    and less error-prone way of creating the necessary catalog entries.
   </para>
  </note>
 </sect1>

 <sect1 id="xindex-am">
  <title>Access Methods and Operator Classes</title>

  <para>
   The <classname>pg_am</classname> table contains one row for every
   index access method. Support for access to regular tables is
   built into <productname>PostgreSQL</productname>, but all index access
   methods are described in <classname>pg_am</classname>. It is possible
   to add a new index access method by defining the required interface
   routines and then creating a row in <classname>pg_am</classname>,
   but that is far beyond the scope of this chapter.
  </para>

  <para>
   The routines for an index access method do not directly know anything
   about the data types the access method will operate on. Instead, an
   <firstterm>operator class</> identifies the set of operations that the
   access method needs in order to work with a particular data type.
   Operator classes are so called because one thing they specify is the set
   of WHERE-clause operators that can be used with an index (i.e., can be
   converted into an index scan qualification). An operator class may also
   specify some <firstterm>support procedures</> that are needed by the
   internal operations of the index access method, but do not directly
   correspond to any WHERE-clause operator that can be used with the index.
  </para>

  <para>
   It is possible to define multiple operator classes for the same
   input data type and index access method. By doing this, multiple
   sets of indexing semantics can be defined for a single data type.
   For example, a B-tree index requires a sort ordering to be defined
   for each data type it works on.
   It might be useful for a complex-number data type
   to have one B-tree operator class that sorts the data by complex
   absolute value, another that sorts by real part, and so on.
   Typically, one of the operator classes is deemed the most commonly
   useful and is marked as the default operator class for that
   data type and index access method.
  </para>

  <para>
   The same operator class name
   can be used for several different access methods (for example, both B-tree
   and hash access methods have operator classes named
   <literal>oid_ops</literal>), but each such class is an independent
   entity and must be defined separately.
  </para>
 </sect1>

 <sect1 id="xindex-strategies">
  <title>Access Method Strategies</title>

  <para>
   The operators associated with an operator class are identified by
   <quote>strategy numbers</>, which serve to identify the semantics of
   each operator within the context of its operator class.
   For example, B-trees impose a strict ordering on keys, lesser to greater,
   and so operators like <quote>less than</> and <quote>greater than or equal
   to</> are interesting with respect to a B-tree.
   Because
   <productname>PostgreSQL</productname> allows the user to define operators,
   it cannot look at the name of an operator
   (e.g., <literal><</> or <literal>>=</>) and tell what kind of
   comparison it is. Instead, the index access method defines a set of
   <quote>strategies</>, which can be thought of as generalized operators.
   Each operator class shows which actual operator corresponds to each
   strategy for a particular data type and interpretation of the index
   semantics.
  </para>

  <para>
   B-tree indexes define five strategies, as shown in <xref
   linkend="xindex-btree-strat-table">.
  </para>

  <table tocentry="1" id="xindex-btree-strat-table">
   <title>B-tree Strategies</title>
   <titleabbrev>B-tree</titleabbrev>
   <tgroup cols="2">
    <thead>
     <row>
      <entry>Operation</entry>
      <entry>Strategy Number</entry>
     </row>
    </thead>
    <tbody>
     <row>
      <entry>less than</entry>
      <entry>1</entry>
     </row>
     <row>
      <entry>less than or equal</entry>
      <entry>2</entry>
     </row>
     <row>
      <entry>equal</entry>
      <entry>3</entry>
     </row>
     <row>
      <entry>greater than or equal</entry>
      <entry>4</entry>
     </row>
     <row>
      <entry>greater than</entry>
      <entry>5</entry>
     </row>
    </tbody>
   </tgroup>
  </table>
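
  <para>
   As a rough illustration of what a strategy number means, the following
   C sketch maps each B-tree strategy onto the result of a three-way
   comparison. The enum and the function <function>strategy_matches</function>
   are hypothetical names invented for this example; they are not part of
   <productname>PostgreSQL</productname>.
  </para>

```c
#include <stdbool.h>

/* B-tree strategy numbers, as listed in the table above.
 * The identifiers are illustrative assumptions, not PostgreSQL names. */
enum btree_strategy
{
    BT_LESS = 1,
    BT_LESS_EQUAL = 2,
    BT_EQUAL = 3,
    BT_GREATER_EQUAL = 4,
    BT_GREATER = 5
};

/*
 * Decide whether "key OP value" holds for the given strategy, where cmp
 * is a three-way comparison result for (key, value): negative, zero, or
 * positive. An unknown strategy number never matches.
 */
bool
strategy_matches(int strategy, int cmp)
{
    switch (strategy)
    {
        case BT_LESS:
            return cmp < 0;
        case BT_LESS_EQUAL:
            return cmp <= 0;
        case BT_EQUAL:
            return cmp == 0;
        case BT_GREATER_EQUAL:
            return cmp >= 0;
        case BT_GREATER:
            return cmp > 0;
        default:
            return false;
    }
}
```

  <para>
   The point of the sketch is that the access method reasons only in terms
   of the numbers 1 through 5; which SQL operator stands behind each number
   is supplied by the operator class.
  </para>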

  <para>
   Hash indexes express only bitwise equality, and so they define only one
   strategy, as shown in <xref linkend="xindex-hash-strat-table">.
  </para>

  <table tocentry="1" id="xindex-hash-strat-table">
   <title>Hash Strategies</title>
   <titleabbrev>Hash</titleabbrev>
   <tgroup cols="2">
    <thead>
     <row>
      <entry>Operation</entry>
      <entry>Strategy Number</entry>
     </row>
    </thead>
    <tbody>
     <row>
      <entry>equal</entry>
      <entry>1</entry>
     </row>
    </tbody>
   </tgroup>
  </table>

  <para>
   R-tree indexes express rectangle-containment relationships.
   They define eight strategies, as shown in <xref linkend="xindex-rtree-strat-table">.
  </para>

  <table tocentry="1" id="xindex-rtree-strat-table">
   <title>R-tree Strategies</title>
   <titleabbrev>R-tree</titleabbrev>
   <tgroup cols="2">
    <thead>
     <row>
      <entry>Operation</entry>
      <entry>Strategy Number</entry>
     </row>
    </thead>
    <tbody>
     <row>
      <entry>left of</entry>
      <entry>1</entry>
     </row>
     <row>
      <entry>left of or overlapping</entry>
      <entry>2</entry>
     </row>
     <row>
      <entry>overlapping</entry>
      <entry>3</entry>
     </row>
     <row>
      <entry>right of or overlapping</entry>
      <entry>4</entry>
     </row>
     <row>
      <entry>right of</entry>
      <entry>5</entry>
     </row>
     <row>
      <entry>same</entry>
      <entry>6</entry>
     </row>
     <row>
      <entry>contains</entry>
      <entry>7</entry>
     </row>
     <row>
      <entry>contained by</entry>
      <entry>8</entry>
     </row>
    </tbody>
   </tgroup>
  </table>

  <para>
   GiST indexes are even more flexible: they do not have a fixed set of
   strategies at all. Instead, the <quote>consistency</> support routine
   of a particular GiST operator class interprets the strategy numbers
   however it likes.
  </para>

  <para>
   By the way, the <structfield>amorderstrategy</structfield> column
   in <classname>pg_am</> tells whether
   the access method supports ordered scans. Zero means it doesn't; if it
   does, <structfield>amorderstrategy</structfield> is the strategy
   number that corresponds to the ordering operator. For example, B-tree
   has <structfield>amorderstrategy</structfield> = 1, which is its
   <quote>less than</quote> strategy number.
  </para>

  <para>
   In short, an operator class must specify a set of operators that express
   each of these semantic ideas for the operator class's data type.
  </para>
 </sect1>

 <sect1 id="xindex-support">
  <title>Access Method Support Routines</title>

  <para>
   Strategies aren't usually enough information for the system to figure
   out how to use an index. In practice, the access methods require
   additional support routines in order to work. For example, the B-tree
   access method must be able to compare two keys and determine whether one
   is greater than, equal to, or less than the other. Similarly, the
   R-tree access method must be able to compute
   intersections, unions, and sizes of rectangles. These
   operations do not correspond to operators used in qualifications in
   SQL queries; they are administrative routines used
   internally by the access methods.
  </para>

  <para>
   Just as with operators, the operator class identifies which specific
   functions should play each of these roles for a given data type and
   semantic interpretation. The index access method specifies the set
   of functions it needs, and the operator class identifies the correct
   functions to use by assigning <quote>support function numbers</> to them.
  </para>

  <para>
   B-trees require a single support function, as shown in <xref
   linkend="xindex-btree-support-table">.
  </para>

  <table tocentry="1" id="xindex-btree-support-table">
   <title>B-tree Support Functions</title>
   <titleabbrev>B-tree</titleabbrev>
   <tgroup cols="2">
    <thead>
     <row>
      <entry>Function</entry>
      <entry>Support Number</entry>
     </row>
    </thead>
    <tbody>
     <row>
      <entry>
       Compare two keys and return an integer less than zero, zero, or
       greater than zero, indicating whether the first key is less than, equal to,
       or greater than the second.
      </entry>
      <entry>1</entry>
     </row>
    </tbody>
   </tgroup>
  </table>

  <para>
   Hash indexes likewise require one support function, as shown in <xref
   linkend="xindex-hash-support-table">.
  </para>

  <table tocentry="1" id="xindex-hash-support-table">
   <title>Hash Support Functions</title>
   <titleabbrev>Hash</titleabbrev>
   <tgroup cols="2">
    <thead>
     <row>
      <entry>Function</entry>
      <entry>Support Number</entry>
     </row>
    </thead>
    <tbody>
     <row>
      <entry>Compute the hash value for a key</entry>
      <entry>1</entry>
     </row>
    </tbody>
   </tgroup>
  </table>
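
  <para>
   To give a feel for what a hash support function does, here is a minimal
   C sketch that hashes the bytes of an integer key with the well-known
   32-bit FNV-1a algorithm. This is only an assumption chosen for
   illustration: it is not the hash function
   <productname>PostgreSQL</productname> itself uses, and the names
   <function>hash_bytes_fnv1a</function> and
   <function>hash_int32_key</function> are hypothetical.
  </para>

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * 32-bit FNV-1a over a byte string. Equal byte strings always produce
 * equal hash values, which is the property a hash index relies on.
 */
uint32_t
hash_bytes_fnv1a(const unsigned char *data, size_t len)
{
    uint32_t h = 2166136261u;       /* FNV offset basis */
    size_t   i;

    for (i = 0; i < len; i++)
    {
        h ^= data[i];
        h *= 16777619u;             /* FNV prime */
    }
    return h;
}

/* Hash a 4-byte integer key by hashing its in-memory representation. */
uint32_t
hash_int32_key(int32_t key)
{
    unsigned char buf[sizeof key];

    memcpy(buf, &key, sizeof key);
    return hash_bytes_fnv1a(buf, sizeof buf);
}
```

  <para>
   Whatever function is used, it must return the same value for any two
   keys that the operator class's equality operator considers equal.
  </para>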

  <para>
   R-tree indexes require three support functions,
   as shown in <xref linkend="xindex-rtree-support-table">.
  </para>

  <table tocentry="1" id="xindex-rtree-support-table">
   <title>R-tree Support Functions</title>
   <titleabbrev>R-tree</titleabbrev>
   <tgroup cols="2">
    <thead>
     <row>
      <entry>Function</entry>
      <entry>Support Number</entry>
     </row>
    </thead>
    <tbody>
     <row>
      <entry>union</entry>
      <entry>1</entry>
     </row>
     <row>
      <entry>intersection</entry>
      <entry>2</entry>
     </row>
     <row>
      <entry>size</entry>
      <entry>3</entry>
     </row>
    </tbody>
   </tgroup>
  </table>
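
  <para>
   The three R-tree support operations can be sketched in plain C for
   axis-aligned rectangles as follows. The <type>Rect</type> structure and
   the function names are assumptions made for this example and do not
   reflect the internal representation or calling convention used by
   <productname>PostgreSQL</productname>.
  </para>

```c
#include <stdbool.h>

/* Axis-aligned rectangle; assumes xlo <= xhi and ylo <= yhi. */
typedef struct Rect
{
    double xlo, ylo, xhi, yhi;
} Rect;

static double min_d(double a, double b) { return a < b ? a : b; }
static double max_d(double a, double b) { return a > b ? a : b; }

/* Support number 1: smallest rectangle enclosing both inputs. */
Rect
rect_union(Rect a, Rect b)
{
    Rect u = { min_d(a.xlo, b.xlo), min_d(a.ylo, b.ylo),
               max_d(a.xhi, b.xhi), max_d(a.yhi, b.yhi) };
    return u;
}

/*
 * Support number 2: common area of both inputs. Returns false and leaves
 * *out untouched when the rectangles do not overlap at all.
 */
bool
rect_intersection(Rect a, Rect b, Rect *out)
{
    Rect i = { max_d(a.xlo, b.xlo), max_d(a.ylo, b.ylo),
               min_d(a.xhi, b.xhi), min_d(a.yhi, b.yhi) };

    if (i.xlo > i.xhi || i.ylo > i.yhi)
        return false;
    *out = i;
    return true;
}

/* Support number 3: area of the rectangle. */
double
rect_size(Rect r)
{
    return (r.xhi - r.xlo) * (r.yhi - r.ylo);
}
```

  <para>
   None of these routines would ever appear in a query; the access method
   calls them while deciding where to place a new entry or how to split a
   page.
  </para>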

  <para>
   GiST indexes require seven support functions,
   as shown in <xref linkend="xindex-gist-support-table">.
  </para>

  <table tocentry="1" id="xindex-gist-support-table">
   <title>GiST Support Functions</title>
   <titleabbrev>GiST</titleabbrev>
   <tgroup cols="2">
    <thead>
     <row>
      <entry>Function</entry>
      <entry>Support Number</entry>
     </row>
    </thead>
    <tbody>
     <row>
      <entry>consistent</entry>
      <entry>1</entry>
     </row>
     <row>
      <entry>union</entry>
      <entry>2</entry>
     </row>
     <row>
      <entry>compress</entry>
      <entry>3</entry>
     </row>
     <row>
      <entry>decompress</entry>
      <entry>4</entry>
     </row>
     <row>
      <entry>penalty</entry>
      <entry>5</entry>
     </row>
     <row>
      <entry>picksplit</entry>
      <entry>6</entry>
     </row>
     <row>
      <entry>equal</entry>
      <entry>7</entry>
     </row>
    </tbody>
   </tgroup>
  </table>
 </sect1>

 <sect1 id="xindex-operators">
  <title>Creating the Operators and Support Routines</title>

  <para>
   Now that we have seen the ideas, here is the promised example
   of creating a new operator class. First, we need a set of operators.
   The procedure for
   defining operators was discussed in <xref linkend="xoper">.
   For the <literal>complex_abs_ops</literal> operator class on B-trees,
   the operators we require are:

   <itemizedlist spacing="compact">
    <listitem><simpara>absolute-value less-than (strategy 1)</></>
    <listitem><simpara>absolute-value less-than-or-equal (strategy 2)</></>
    <listitem><simpara>absolute-value equal (strategy 3)</></>
    <listitem><simpara>absolute-value greater-than-or-equal (strategy 4)</></>
    <listitem><simpara>absolute-value greater-than (strategy 5)</></>
   </itemizedlist>
  </para>

  <para>
   Suppose the code that implements these functions
   is stored in the file
   <filename><replaceable>PGROOT</replaceable>/src/tutorial/complex.c</filename>,
   which we have compiled into
   <filename><replaceable>PGROOT</replaceable>/src/tutorial/complex.so</filename>.
   Part of the C code looks like this:

<programlisting>
/* Squared magnitude; it orders values the same way the absolute value does. */
#define Mag(c) ((c)->x*(c)->x + (c)->y*(c)->y)

bool
complex_abs_eq(Complex *a, Complex *b)
{
    double amag = Mag(a), bmag = Mag(b);
    return (amag==bmag);
}
</programlisting>
   (Note that we will only show the equality operator in this text.
   The other four operators are very similar. Refer to
   <filename>complex.c</filename> or
   <filename>complex.source</filename> for the details.)
  </para>
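
  <para>
   For readers who do not have the tutorial sources at hand, the other
   four operator functions might look roughly like the sketch below. It is
   written to be compilable on its own, so the <type>Complex</type>
   structure (two <type>double</type> fields <literal>x</> and
   <literal>y</>, as implied by the <literal>Mag</> macro) and the
   function names, which simply follow the pattern of
   <function>complex_abs_eq</>, are assumptions; consult
   <filename>complex.c</filename> for the actual definitions.
  </para>

```c
#include <stdbool.h>

/* Assumed layout of the complex type: real part x, imaginary part y. */
typedef struct Complex
{
    double x;
    double y;
} Complex;

/* Squared magnitude, as in the excerpt above. */
#define Mag(c) ((c)->x*(c)->x + (c)->y*(c)->y)

/* Strategy 1: absolute-value less-than */
bool
complex_abs_lt(Complex *a, Complex *b)
{
    double amag = Mag(a), bmag = Mag(b);
    return (amag < bmag);
}

/* Strategy 2: absolute-value less-than-or-equal */
bool
complex_abs_le(Complex *a, Complex *b)
{
    double amag = Mag(a), bmag = Mag(b);
    return (amag <= bmag);
}

/* Strategy 4: absolute-value greater-than-or-equal */
bool
complex_abs_ge(Complex *a, Complex *b)
{
    double amag = Mag(a), bmag = Mag(b);
    return (amag >= bmag);
}

/* Strategy 5: absolute-value greater-than */
bool
complex_abs_gt(Complex *a, Complex *b)
{
    double amag = Mag(a), bmag = Mag(b);
    return (amag > bmag);
}
```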

  <para>
   We make the function known to <productname>PostgreSQL</productname> like this:
<programlisting>
CREATE FUNCTION complex_abs_eq(complex, complex) RETURNS boolean
    AS '<replaceable>PGROOT</replaceable>/src/tutorial/complex'
    LANGUAGE C;
</programlisting>
  </para>

  <para>
   There are some important things that are happening here:

   <itemizedlist>
    <listitem>
     <para>
      First, note that operators for less-than, less-than-or-equal, equal,
      greater-than-or-equal, and greater-than for <type>complex</type>
      are being defined. We can only have one operator named, say, = and
      taking type <type>complex</type> for both operands. In this case
      we don't have any other operator = for <type>complex</type>,
      but if we were building a practical data type we'd probably want = to
      be the ordinary equality operation for complex numbers. In that case,
      we'd need to use some other operator name for <function>complex_abs_eq</>.
     </para>
    </listitem>

    <listitem>
     <para>
      Second, although <productname>PostgreSQL</productname> can cope with operators having
      the same name as long as they have different input data types, C can only
      cope with one global routine having a given name, period. So we shouldn't
      name the C function something simple like <function>abs_eq</function>.
      Usually it's a good practice to include the data type name in the C
      function name, so as not to conflict with functions for other data types.
     </para>
    </listitem>

    <listitem>
     <para>
      Third, we could have made the <productname>PostgreSQL</productname> name of the function
      <function>abs_eq</function>, relying on <productname>PostgreSQL</productname> to distinguish it
      by input data types from any other <productname>PostgreSQL</productname> function of the same name.
      To keep the example simple, we give the function the same name
      at the C level and the <productname>PostgreSQL</productname> level.
     </para>
    </listitem>

    <listitem>
     <para>
      Finally, note that these operator functions return Boolean values.
      In practice, all operators defined as index access method
      strategies must return type <type>boolean</type>, since they must
      appear at the top level of a <literal>WHERE</> clause to be used with an index.
      (On the other hand, support functions return whatever the
      particular access method expects; in the case of the comparison
      function for B-trees, that is a signed integer.)
     </para>
    </listitem>
   </itemizedlist>
  </para>

  <para>
   Now we are ready to define the operators:

<programlisting>
CREATE OPERATOR = (
    leftarg = complex, rightarg = complex,
    procedure = complex_abs_eq,
    restrict = eqsel, join = eqjoinsel
);
</programlisting>

   The important
   things here are the procedure names (which are the C
   functions defined above) and the restriction and join selectivity
   functions. You should just use the selectivity functions used in
   the example (see <filename>complex.source</filename>).
   Note that there
   are different such functions for the less-than, equal, and greater-than
   cases. These must be supplied, or the optimizer will be unable to
   make effective use of the index.
  </para>

  <para>
   The next step is the registration of the comparison <quote>support
   routine</quote> required by B-trees. The C code that implements this
   is in the same file that contains the operator procedures.
   We register it like this:

<programlisting>
CREATE FUNCTION complex_abs_cmp(complex, complex)
    RETURNS integer
    AS '<replaceable>PGROOT</replaceable>/src/tutorial/complex'
    LANGUAGE C;
</programlisting>
  </para>
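
  <para>
   The C side of such a comparison routine is not shown in this chapter.
   A minimal sketch, assuming the same <type>Complex</type> layout and
   <literal>Mag</> macro as in the equality example, could look like the
   following; the body is an illustration and may differ from what
   <filename>complex.c</filename> actually contains.
  </para>

```c
/* Assumed layout of the complex type: real part x, imaginary part y. */
typedef struct Complex
{
    double x;
    double y;
} Complex;

/* Squared magnitude, as in the equality example. */
#define Mag(c) ((c)->x*(c)->x + (c)->y*(c)->y)

/*
 * Three-way comparison by absolute value: returns a negative number,
 * zero, or a positive number when a sorts before, equal to, or after b.
 */
int
complex_abs_cmp(Complex *a, Complex *b)
{
    double amag = Mag(a), bmag = Mag(b);

    if (amag < bmag)
        return -1;
    if (amag > bmag)
        return 1;
    return 0;
}
```

  <para>
   The result must agree with the five operators: for instance, the
   routine returns zero exactly when the equality operator would return
   true for the same pair of values.
  </para>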

 </sect1>

 <sect1 id="xindex-opclass">
  <title>Creating the Operator Class</title>

  <para>
   Now that we have the required operators and support routine,
   we can finally create the operator class:

<programlisting>
CREATE OPERATOR CLASS complex_abs_ops
    DEFAULT FOR TYPE complex USING btree AS
        OPERATOR        1       < ,
        OPERATOR        2       <= ,
        OPERATOR        3       = ,
        OPERATOR        4       >= ,
        OPERATOR        5       > ,
        FUNCTION        1       complex_abs_cmp(complex, complex);
</programlisting>
  </para>

  <para>
   And we're done! (Whew.) It should now be possible to create
   and use B-tree indexes on <type>complex</type> columns.
  </para>

  <para>
   We could have written the operator entries more verbosely, as in
<programlisting>
        OPERATOR        1       < (complex, complex) ,
</programlisting>
   but there is no need to do so when the operators take the same data type
   we are defining the operator class for.
  </para>

  <para>
   The above example assumes that you want to make this new operator class the
   default B-tree operator class for the <type>complex</type> data type.
   If you don't, just leave out the word <literal>DEFAULT</>.
  </para>
 </sect1>

 <sect1 id="xindex-opclass-features">
  <title>Special Features of Operator Classes</title>

  <para>
   There are two special features of operator classes that we have
   not discussed yet, mainly because they are not very useful
   with the default B-tree index access method.
  </para>

  <para>
   Normally, declaring an operator as a member of an operator class means
   that the index access method can retrieve exactly the set of rows
   that satisfy a WHERE condition using the operator. For example,
<programlisting>
SELECT * FROM table WHERE integer_column < 4;
</programlisting>
   can be satisfied exactly by a B-tree index on the integer column.
   But there are cases where an index is useful as an inexact guide to
   the matching rows. For example, if an R-tree index stores only
   bounding boxes for objects, then it cannot exactly satisfy a WHERE
   condition that tests overlap between nonrectangular objects such as
   polygons. Yet we could use the index to find objects whose bounding
   box overlaps the bounding box of the target object, and then do the
   exact overlap test only on the objects found by the index. If this
   scenario applies, the index is said to be <quote>lossy</> for the
   operator, and we add <literal>RECHECK</> to the <literal>OPERATOR</> clause
   in the <command>CREATE OPERATOR CLASS</> command.
   <literal>RECHECK</> is valid if the index is guaranteed to return
   all the required tuples, plus perhaps some additional tuples, which
   can be eliminated by performing the original operator comparison.
  </para>
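
  <para>
   The two-step evaluation behind a lossy index can be sketched in C as
   follows. The example uses circles as the nonrectangular objects; the
   bounding-box test plays the role of the index, and the exact test plays
   the role of the recheck. All type and function names here are
   hypothetical and chosen only for this illustration.
  </para>

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct Circle { double cx, cy, r; } Circle;
typedef struct Box { double xlo, ylo, xhi, yhi; } Box;

/* Bounding box of a circle: what a lossy index would store. */
Box
circle_bbox(Circle c)
{
    Box b = { c.cx - c.r, c.cy - c.r, c.cx + c.r, c.cy + c.r };
    return b;
}

/* Inexact test: do the bounding boxes overlap? */
bool
box_overlap(Box a, Box b)
{
    return a.xlo <= b.xhi && b.xlo <= a.xhi &&
           a.ylo <= b.yhi && b.ylo <= a.yhi;
}

/* Exact test: do the circles themselves overlap? */
bool
circle_overlap(Circle a, Circle b)
{
    double dx = a.cx - b.cx, dy = a.cy - b.cy, rsum = a.r + b.r;
    return dx * dx + dy * dy <= rsum * rsum;
}

/*
 * Count the items that truly overlap target. *candidates receives the
 * number of items that passed the bounding-box filter, i.e. the rows a
 * lossy index scan would hand back for rechecking.
 */
size_t
count_overlapping(const Circle *items, size_t n, Circle target,
                  size_t *candidates)
{
    size_t i, matches = 0;

    *candidates = 0;
    for (i = 0; i < n; i++)
    {
        if (!box_overlap(circle_bbox(items[i]), circle_bbox(target)))
            continue;           /* index filter rejects the row */
        (*candidates)++;
        if (circle_overlap(items[i], target))
            matches++;          /* recheck confirms the row */
    }
    return matches;
}
```

  <para>
   The filter may let through rows that fail the exact test, but it never
   rejects a row that would pass it, which is precisely the guarantee
   <literal>RECHECK</> requires.
  </para>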

  <para>
   Consider again the situation where we are storing in the index only
   the bounding box of a complex object such as a polygon. In this
   case there's not much value in storing the whole polygon in the index
   entry; we may as well store just a simpler object of type
   <literal>box</>. This situation is expressed by the <literal>STORAGE</>
   option in <command>CREATE OPERATOR CLASS</>: we'd write something like

<programlisting>
CREATE OPERATOR CLASS polygon_ops
    DEFAULT FOR TYPE polygon USING gist AS
        ...
        STORAGE box;
</programlisting>

   At present, only the GiST access method supports a
   <literal>STORAGE</> type that's different from the column data type.
   The GiST <literal>compress</> and <literal>decompress</> support
   routines must deal with data-type conversion when <literal>STORAGE</>
   is used.
  </para>
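
  <para>
   Conceptually, the conversion a <literal>compress</> routine performs
   in this situation is the reduction of a polygon to its bounding box.
   The following C sketch shows only that geometric step; it does not use
   the real GiST support-function interface, and the
   <type>Point</type> and <type>Box</type> structures as well as the name
   <function>polygon_to_box</function> are assumptions made for the
   example.
  </para>

```c
#include <stddef.h>

typedef struct Point { double x, y; } Point;
typedef struct Box { double xlo, ylo, xhi, yhi; } Box;

/*
 * Compute the smallest axis-aligned box enclosing a polygon given by
 * its vertices. Requires npts >= 1.
 */
Box
polygon_to_box(const Point *pts, size_t npts)
{
    Box    b = { pts[0].x, pts[0].y, pts[0].x, pts[0].y };
    size_t i;

    for (i = 1; i < npts; i++)
    {
        if (pts[i].x < b.xlo) b.xlo = pts[i].x;
        if (pts[i].x > b.xhi) b.xhi = pts[i].x;
        if (pts[i].y < b.ylo) b.ylo = pts[i].y;
        if (pts[i].y > b.yhi) b.yhi = pts[i].y;
    }
    return b;
}
```

  <para>
   Because the box discards the polygon's exact shape, an operator class
   that stores boxes in this way would normally be lossy for the overlap
   operator.
  </para>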

 </sect1>

</chapter>

<!-- Keep this comment at the end of the file
Local variables:
mode:sgml
sgml-omittag:nil
sgml-shorttag:t
sgml-minimize-attributes:nil
sgml-always-quote-attributes:t
sgml-indent-step:1
sgml-indent-data:t
sgml-parent-document:nil
sgml-default-dtd-file:"./reference.ced"
sgml-exposed-tags:nil
sgml-local-catalogs:("/usr/lib/sgml/catalog")
sgml-local-ecat-files:nil
End:
-->