diff --git a/doc/src/sgml/xindex.sgml b/doc/src/sgml/xindex.sgml index 8fbea2cf7c..79610c30ed 100644 --- a/doc/src/sgml/xindex.sgml +++ b/doc/src/sgml/xindex.sgml @@ -1,4 +1,4 @@ - + Interfacing Extensions To Indexes @@ -18,20 +18,14 @@ complex numbers in ascending absolute value order. - - - Prior to PostgreSQL release 7.3, it was - necessary to make manual additions to the system catalogs - pg_amop, pg_amproc, and - pg_opclass in order to create a user-defined - operator class. That approach is now deprecated in favor of using - , - which is a much simpler and less error-prone way of creating the - necessary catalog entries. - - + + Operator classes can be grouped into operator families + to show the relationships between semantically compatible classes. + When only a single data type is involved, an operator class is sufficient, + so we'll focus on that case first and then return to operator families. + - + Index Methods and Operator Classes @@ -282,7 +276,7 @@ - Note that all strategy operators return Boolean values. In + Notice that all strategy operators return Boolean values. In practice, all operators defined as index method strategies must return type boolean, since they must appear at the top level of a WHERE clause to be used with an index. @@ -309,7 +303,8 @@ functions should play each of these roles for a given data type and semantic interpretation. The index method defines the set of functions it needs, and the operator class identifies the correct - functions to use by assigning them to the support function numbers. + functions to use by assigning them to the support function numbers + specified by the index method. @@ -329,9 +324,9 @@ - Compare two keys and return an integer less than zero, zero, or - greater than zero, indicating whether the first key is less than, equal to, - or greater than the second. + Compare two keys and return an integer less than zero, zero, or + greater than zero, indicating whether the first key is less than, + equal to, or greater than the second. 1 @@ -456,7 +451,11 @@ Unlike strategy operators, support functions return whichever data type the particular index method expects; for example in the case - of the comparison function for B-trees, a signed integer. + of the comparison function for B-trees, a signed integer. The number + and types of the arguments to each support function are likewise + dependent on the index method. For B-tree and hash the support functions + take the same input data types as do the operators included in the operator + class, but this is not the case for most GIN and GiST support functions. @@ -644,37 +643,99 @@ CREATE OPERATOR CLASS complex_abs_ops - - Cross-Data-Type Operator Classes + + Operator Classes and Operator Families So far we have implicitly assumed that an operator class deals with only one data type. While there certainly can be only one data type in a particular index column, it is often useful to index operations that - compare an indexed column to a value of a different data type. This is - presently supported by the B-tree and GiST index methods. + compare an indexed column to a value of a different data type. Also, + if there is use for a cross-data-type operator in connection with an + operator class, it is often the case that the other data type has a + related operator class of its own. It is helpful to make the connections + between related classes explicit, because this can aid the planner in + optimizing SQL queries (particularly for B-tree operator classes, since + the planner contains a great deal of knowledge about how to work with them). - B-trees require the left-hand operand of each operator to be the indexed - data type, but the right-hand operand can be of a different type. There - must be a support function having a matching signature. For example, - the built-in operator class for type bigint (int8) - allows cross-type comparisons to int4 and int2. It - could be duplicated by this definition: + To handle these needs, PostgreSQL + uses the concept of an operator + familyoperator family. + An operator family contains one or more operator classes, and may also + contain indexable operators and corresponding support functions that + belong to the family as a whole but not to any single class within the + family. We say that such operators and functions are loose + within the family, as opposed to being bound into a specific class. + Typically each operator class contains single-data-type operators + while cross-data-type operators are loose in the family. + + + + All the operators and functions in an operator family must have compatible + semantics, where the compatibility requirements are set by the index + method. You might therefore wonder why bother to single out particular + subsets of the family as operator classes; and indeed for many purposes + the class divisions are irrelevant and the family is the only interesting + grouping. The reason for defining operator classes is that they specify + how much of the family is needed to support any particular index. + If there is an index using an operator class, then that operator class + cannot be dropped without dropping the index — but other parts of + the operator family, namely other operator classes and loose operators, + could be dropped. Thus, an operator class should be specified to contain + the minimum set of operators and functions that are reasonably needed + to work with an index on a specific data type, and then related but + non-essential operators can be added as loose members of the operator + family. + + + + As an example, PostgreSQL has a built-in + B-tree operator family integer_ops, which includes operator + classes int8_ops, int4_ops, and + int2_ops for indexes on bigint (int8), + integer (int4), and smallint (int2) + columns respectively. The family also contains cross-data-type comparison + operators allowing any two of these types to be compared, so that an index + on one of these types can be searched using a comparison value of another + type. The family could be duplicated by these definitions: +CREATE OPERATOR FAMILY integer_ops USING btree; + CREATE OPERATOR CLASS int8_ops -DEFAULT FOR TYPE int8 USING btree AS +DEFAULT FOR TYPE int8 USING btree FAMILY integer_ops AS -- standard int8 comparisons OPERATOR 1 < , OPERATOR 2 <= , OPERATOR 3 = , OPERATOR 4 >= , OPERATOR 5 > , - FUNCTION 1 btint8cmp(int8, int8) , + FUNCTION 1 btint8cmp(int8, int8) ; - -- cross-type comparisons to int2 (smallint) +CREATE OPERATOR CLASS int4_ops +DEFAULT FOR TYPE int4 USING btree FAMILY integer_ops AS + -- standard int4 comparisons + OPERATOR 1 < , + OPERATOR 2 <= , + OPERATOR 3 = , + OPERATOR 4 >= , + OPERATOR 5 > , + FUNCTION 1 btint4cmp(int4, int4) ; + +CREATE OPERATOR CLASS int2_ops +DEFAULT FOR TYPE int2 USING btree FAMILY integer_ops AS + -- standard int2 comparisons + OPERATOR 1 < , + OPERATOR 2 <= , + OPERATOR 3 = , + OPERATOR 4 >= , + OPERATOR 5 > , + FUNCTION 1 btint2cmp(int2, int2) ; + +ALTER OPERATOR FAMILY integer_ops USING btree ADD + -- cross-type comparisons int8 vs int2 OPERATOR 1 < (int8, int2) , OPERATOR 2 <= (int8, int2) , OPERATOR 3 = (int8, int2) , @@ -682,31 +743,92 @@ DEFAULT FOR TYPE int8 USING btree AS OPERATOR 5 > (int8, int2) , FUNCTION 1 btint82cmp(int8, int2) , - -- cross-type comparisons to int4 (integer) + -- cross-type comparisons int8 vs int4 OPERATOR 1 < (int8, int4) , OPERATOR 2 <= (int8, int4) , OPERATOR 3 = (int8, int4) , OPERATOR 4 >= (int8, int4) , OPERATOR 5 > (int8, int4) , - FUNCTION 1 btint84cmp(int8, int4) ; + FUNCTION 1 btint84cmp(int8, int4) , + + -- cross-type comparisons int4 vs int2 + OPERATOR 1 < (int4, int2) , + OPERATOR 2 <= (int4, int2) , + OPERATOR 3 = (int4, int2) , + OPERATOR 4 >= (int4, int2) , + OPERATOR 5 > (int4, int2) , + FUNCTION 1 btint42cmp(int4, int2) , + + -- cross-type comparisons int4 vs int8 + OPERATOR 1 < (int4, int8) , + OPERATOR 2 <= (int4, int8) , + OPERATOR 3 = (int4, int8) , + OPERATOR 4 >= (int4, int8) , + OPERATOR 5 > (int4, int8) , + FUNCTION 1 btint48cmp(int4, int8) , + + -- cross-type comparisons int2 vs int8 + OPERATOR 1 < (int2, int8) , + OPERATOR 2 <= (int2, int8) , + OPERATOR 3 = (int2, int8) , + OPERATOR 4 >= (int2, int8) , + OPERATOR 5 > (int2, int8) , + FUNCTION 1 btint28cmp(int2, int8) , + + -- cross-type comparisons int2 vs int4 + OPERATOR 1 < (int2, int4) , + OPERATOR 2 <= (int2, int4) , + OPERATOR 3 = (int2, int4) , + OPERATOR 4 >= (int2, int4) , + OPERATOR 5 > (int2, int4) , + FUNCTION 1 btint24cmp(int2, int4) ; Notice that this definition overloads the operator strategy and - support function numbers. This is allowed (for B-tree operator classes - only) so long as each instance of a particular number has a different - right-hand data type. The instances that are not cross-type are the - default or primary operators of the operator class. + support function numbers: each number occurs multiple times within the + family. This is allowed so long as each instance of a + particular number has distinct input data types. The instances that have + both input types equal to an operator class's input type are the + primary operators and support functions for that operator class, + and in most cases should be declared as part of the operator class rather + than as loose members of the family. - GiST indexes do not allow overloading of strategy or support function - numbers, but it is still possible to get the effect of supporting - multiple right-hand data types, by assigning a distinct strategy number - to each operator that needs to be supported. The consistent - support function must determine what it needs to do based on the strategy - number, and must be prepared to accept comparison values of the appropriate - data types. + In a B-tree operator family, all the operators in the family must sort + compatibly, meaning that the transitive laws hold across all the data types + supported by the family: if A = B and B = C, then A = + C, and if A < B and B < C, then A < C. For each + operator in the family there must be a support function having the same + two input data types as the operator. It is recommended that a family be + complete, i.e., for each combination of data types, all operators are + included. An operator class should include just the non-cross-type + operators and support function for its data type. + + + At this writing, hash indexes do not support cross-type operations, + and so there is little use for a hash operator family larger than one + operator class. This is expected to be relaxed in the future. + + + + GIN and GiST indexes do not have any explicit notion of cross-data-type + operations. The set of operators supported is just whatever the primary + support functions for a given operator class can handle. + + + + + Prior to PostgreSQL 8.3, there was no concept + of operator families, and so any cross-data-type operators intended to be + used with an index had to be bound directly into the index's operator + class. While this approach still works, it is deprecated because it + makes an index's dependencies too broad, and because the planner can + handle cross-data-type comparisons more effectively when both data types + have operators in the same operator family. + + @@ -774,7 +896,8 @@ DEFAULT FOR TYPE int8 USING btree AS - Normally, declaring an operator as a member of an operator class means + Normally, declaring an operator as a member of an operator class + (or family) means that the index method can retrieve exactly the set of rows that satisfy a WHERE condition using the operator. For example,