Reduce match_pattern_prefix()'s dependencies on index opfamilies.

Historically, the planner's LIKE/regex index optimizations were only
carried out for specific index opfamilies.  That's never been a great
idea from the standpoint of extensibility, but it didn't matter so
much as long as we had no practical way to extend such behaviors anyway.
With the addition of planner support functions, and in view of ongoing
work to support additional table and index AMs, it seems like a good
time to relax this.

Hence, recast the decisions in match_pattern_prefix() so that rather
than decide which operators to generate by looking at what the index
opfamily contains, we decide which operators to generate a priori
and then see if the opfamily supports them.  This is much more
defensible from a semantic standpoint anyway, since we know the
semantics of the chosen operators precisely, and we only need to
assume that the opfamily correctly implements operators it claims
to support.
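
For illustration only, the new control flow can be modelled with a toy
sketch.  Everything below (the ToyOp identifiers, ToyOpfamily, and the
toy_* functions) is hypothetical and stands in for the real catalog
lookups; it is not the code in this commit, and it ignores the case
where no "greater" string can be built.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy operator identifiers (hypothetical; not real pg_operator OIDs). */
enum ToyOp
{
    TOY_TEXT_EQ = 1,
    TOY_TEXT_LT,
    TOY_TEXT_GE,
    TOY_TEXT_PATTERN_LT,
    TOY_TEXT_PATTERN_GE
};

/* Toy opfamily: a "pattern" flag plus the operators it claims to support. */
typedef struct ToyOpfamily
{
    bool        is_pattern;     /* one of the legacy "pattern" opfamilies? */
    const int  *members;
    size_t      nmembers;
} ToyOpfamily;

/* Stand-in for op_in_opfamily(): a plain membership test. */
static bool
toy_op_in_opfamily(int op, const ToyOpfamily *fam)
{
    for (size_t i = 0; i < fam->nmembers; i++)
        if (fam->members[i] == op)
            return true;
    return false;
}

/*
 * Returns how many index quals would be generated for a text column:
 * 0 = none, 1 = "=" or ">=" only, 2 = ">=" plus "<".
 */
static int
toy_match_prefix(const ToyOpfamily *fam, bool exact, bool index_collation_is_c)
{
    int         eqopr;
    int         ltopr;
    int         geopr;
    bool        collation_aware;

    assert(fam != NULL);

    /* Step 1: choose operators from known semantics, not from the opfamily. */
    eqopr = TOY_TEXT_EQ;
    if (fam->is_pattern)
    {
        ltopr = TOY_TEXT_PATTERN_LT;
        geopr = TOY_TEXT_PATTERN_GE;
        collation_aware = false;
    }
    else
    {
        ltopr = TOY_TEXT_LT;
        geopr = TOY_TEXT_GE;
        collation_aware = true;
    }

    /* Step 2: collation-aware operators need a "C" index unless exact. */
    if (collation_aware && !exact && !index_collation_is_c)
        return 0;

    /* Step 3: only now ask whether the opfamily supports what we picked. */
    if (exact)
        return toy_op_in_opfamily(eqopr, fam) ? 1 : 0;
    if (!toy_op_in_opfamily(geopr, fam))
        return 0;
    return toy_op_in_opfamily(ltopr, fam) ? 2 : 1;
}
```

In this model a hash-like opfamily that lists only the equality
operator yields one qual for an exact-match pattern and none for a
prefix pattern, with no opfamily-specific code involved.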

The existing "pattern" opfamilies put a crimp in this approach, since
we need to select the pattern operators if we want those to work.
So we still have to special-case those opfamilies.  But that seems
all right, since in view of the addition of collations, the pattern
opfamilies seem like a legacy hack that nobody will be building on.

The only immediate effect of this change, so far as the core code is
concerned, is that anchored LIKE/regex patterns can be mapped onto
BRIN index searches, and exact-match patterns can be mapped onto hash
indexes, not only btree and spgist indexes as before.  That's not a
terribly exciting result, but it does fix an omission mentioned in
the ancient comments here.
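
As a reminder of what "mapped onto index searches" means, an anchored
prefix becomes the range "x >= prefix AND x < greater string".  The
sketch below is a simplified byte-wise analogue under C-locale (strcmp)
ordering; toy_upper_bound() and toy_in_prefix_range() are hypothetical
names, and this is not make_greater_string(), which also has to cope
with multibyte encodings and collations.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Write into 'out' the smallest string that sorts after every string
 * starting with 'prefix' under byte-wise ordering.  Returns false if no
 * such bound exists (empty prefix, or all bytes are 0xFF) or if 'out'
 * is too small.
 */
static bool
toy_upper_bound(const char *prefix, char *out, size_t outsz)
{
    size_t      len = strlen(prefix);

    if (len + 1 > outsz)
        return false;
    memcpy(out, prefix, len + 1);

    while (len > 0)
    {
        unsigned char c = (unsigned char) out[len - 1];

        if (c < 0xFF)
        {
            /* Increment the last byte and truncate there. */
            out[len - 1] = (char) (c + 1);
            out[len] = '\0';
            return true;
        }
        len--;                  /* drop a trailing 0xFF byte and retry */
    }
    return false;
}

/* Does 'value' satisfy "x >= prefix" and, when available, "x < upper"? */
static bool
toy_in_prefix_range(const char *value, const char *prefix)
{
    char        upper[256];

    if (strcmp(value, prefix) < 0)
        return false;           /* fails x >= prefix */
    if (!toy_upper_bound(prefix, upper, sizeof(upper)))
        return true;            /* only the lower bound is usable */
    return strcmp(value, upper) < 0;
}
```

Any index that supports ">=" and "<" with these semantics can serve
such a range, which is why BRIN now qualifies; an exact-match pattern
needs only "=", which is why hash does.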

Note: no catversion bump, even though this touches pg_operator.dat,
because it's only adding OID macros not changing the contents of
postgres.bki.
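
For reference, each oid_symbol entry is expected to surface as a C
macro in the generated catalog header, roughly as sketched below.  The
exact header layout is an assumption; the names and OIDs are the ones
in the pg_operator.dat changes of this commit.

```c
/* Sketch of the macros implied by the new oid_symbol fields. */
#define NameEqualTextOperator 254
#define NameLessTextOperator 255
#define NameGreaterEqualTextOperator 257
#define TextLessOperator 664
#define TextGreaterEqualOperator 667
#define BpcharEqualOperator 1054
#define BpcharLessOperator 1058
#define BpcharGreaterEqualOperator 1061
#define ByteaEqualOperator 1955
#define ByteaLessOperator 1957
#define ByteaGreaterEqualOperator 1960
#define TextPatternLessOperator 2314
#define TextPatternGreaterEqualOperator 2317
#define BpcharPatternLessOperator 2326
#define BpcharPatternGreaterEqualOperator 2329
```

Since these only name existing rows, the catalog contents themselves
are unchanged.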

Per consideration of a report from Manuel Rigger.

Discussion: https://postgr.es/m/CA+u7OA7nnGYy8rY0vdTe811NuA+Frr9nbcBO9u2Z+JxqNaud+g@mail.gmail.com
Tom Lane 2019-11-20 14:13:04 -05:00
parent 86be6453ba
commit 2ddedcafca
2 changed files with 108 additions and 72 deletions

@@ -39,6 +39,7 @@
 #include "access/htup_details.h"
 #include "access/stratnum.h"
 #include "catalog/pg_collation.h"
+#include "catalog/pg_operator.h"
 #include "catalog/pg_opfamily.h"
 #include "catalog/pg_statistic.h"
 #include "catalog/pg_type.h"
@@ -240,7 +241,10 @@ match_pattern_prefix(Node *leftop,
     Pattern_Prefix_Status pstatus;
     Oid ldatatype;
     Oid rdatatype;
-    Oid oproid;
+    Oid eqopr;
+    Oid ltopr;
+    Oid geopr;
+    bool collation_aware;
     Expr *expr;
     FmgrInfo ltproc;
     Const *greaterstr;
@@ -284,62 +288,89 @@ match_pattern_prefix(Node *leftop,
         return NIL;
 
     /*
-     * Must also check that index's opfamily supports the operators we will
-     * want to apply. (A hash index, for example, will not support ">=".)
-     * Currently, only btree and spgist support the operators we need.
-     *
-     * Note: actually, in the Pattern_Prefix_Exact case, we only need "=" so a
-     * hash index would work. Currently it doesn't seem worth checking for
-     * that, however.
-     *
-     * We insist on the opfamily being one of the specific ones we expect,
-     * else we'd do the wrong thing if someone were to make a reverse-sort
-     * opfamily with the same operators.
-     *
-     * The non-pattern opclasses will not sort the way we need in most non-C
-     * locales. We can use such an index anyway for an exact match (simple
-     * equality), but not for prefix-match cases. Note that here we are
-     * looking at the index's collation, not the expression's collation --
-     * this test is *not* dependent on the LIKE/regex operator's collation.
-     *
-     * While we're at it, identify the type the comparison constant(s) should
-     * have, based on the opfamily.
+     * Identify the operators we want to use, based on the type of the
+     * left-hand argument. Usually these are just the type's regular
+     * comparison operators, but if we are considering one of the semi-legacy
+     * "pattern" opclasses, use the "pattern" operators instead. Those are
+     * not collation-sensitive but always use C collation, as we want. The
+     * selected operators also determine the needed type of the prefix
+     * constant.
      */
-    switch (opfamily)
+    ldatatype = exprType(leftop);
+    switch (ldatatype)
     {
-        case TEXT_BTREE_FAM_OID:
-            if (!(pstatus == Pattern_Prefix_Exact ||
-                  lc_collate_is_c(indexcollation)))
-                return NIL;
+        case TEXTOID:
+            if (opfamily == TEXT_PATTERN_BTREE_FAM_OID ||
+                opfamily == TEXT_SPGIST_FAM_OID)
+            {
+                eqopr = TextEqualOperator;
+                ltopr = TextPatternLessOperator;
+                geopr = TextPatternGreaterEqualOperator;
+                collation_aware = false;
+            }
+            else
+            {
+                eqopr = TextEqualOperator;
+                ltopr = TextLessOperator;
+                geopr = TextGreaterEqualOperator;
+                collation_aware = true;
+            }
             rdatatype = TEXTOID;
             break;
+        case NAMEOID:
 
-        case TEXT_PATTERN_BTREE_FAM_OID:
-        case TEXT_SPGIST_FAM_OID:
+            /*
+             * Note that here, we need the RHS type to be text, so that the
+             * comparison value isn't improperly truncated to NAMEDATALEN.
+             */
+            eqopr = NameEqualTextOperator;
+            ltopr = NameLessTextOperator;
+            geopr = NameGreaterEqualTextOperator;
+            collation_aware = true;
             rdatatype = TEXTOID;
             break;
-
-        case BPCHAR_BTREE_FAM_OID:
-            if (!(pstatus == Pattern_Prefix_Exact ||
-                  lc_collate_is_c(indexcollation)))
-                return NIL;
+        case BPCHAROID:
+            if (opfamily == BPCHAR_PATTERN_BTREE_FAM_OID)
+            {
+                eqopr = BpcharEqualOperator;
+                ltopr = BpcharPatternLessOperator;
+                geopr = BpcharPatternGreaterEqualOperator;
+                collation_aware = false;
+            }
+            else
+            {
+                eqopr = BpcharEqualOperator;
+                ltopr = BpcharLessOperator;
+                geopr = BpcharGreaterEqualOperator;
+                collation_aware = true;
+            }
             rdatatype = BPCHAROID;
             break;
-
-        case BPCHAR_PATTERN_BTREE_FAM_OID:
-            rdatatype = BPCHAROID;
-            break;
-
-        case BYTEA_BTREE_FAM_OID:
+        case BYTEAOID:
+            eqopr = ByteaEqualOperator;
+            ltopr = ByteaLessOperator;
+            geopr = ByteaGreaterEqualOperator;
+            collation_aware = false;
             rdatatype = BYTEAOID;
             break;
-
         default:
+            /* Can't get here unless we're attached to the wrong operator */
             return NIL;
     }
 
-    /* OK, prepare to create the indexqual(s) */
-    ldatatype = exprType(leftop);
+    /*
+     * If necessary, verify that the index's collation behavior is compatible.
+     * For an exact-match case, we don't have to be picky. Otherwise, insist
+     * that the index collation be "C". Note that here we are looking at the
+     * index's collation, not the expression's collation -- this test is *not*
+     * dependent on the LIKE/regex operator's collation.
+     */
+    if (collation_aware)
+    {
+        if (!(pstatus == Pattern_Prefix_Exact ||
+              lc_collate_is_c(indexcollation)))
+            return NIL;
+    }
 
     /*
      * If necessary, coerce the prefix constant to the right type. The given
@@ -358,16 +389,17 @@ match_pattern_prefix(Node *leftop,
     /*
      * If we found an exact-match pattern, generate an "=" indexqual.
      *
-     * (Despite the checks above, we might fail to find a suitable operator in
-     * some cases with binary-compatible opclasses. Just punt if so.)
+     * Here and below, check to see whether the desired operator is actually
+     * supported by the index opclass, and fail quietly if not. This allows
+     * us to not be concerned with specific opclasses (except for the legacy
+     * "pattern" cases); any index that correctly implements the operators
+     * will work.
      */
     if (pstatus == Pattern_Prefix_Exact)
     {
-        oproid = get_opfamily_member(opfamily, ldatatype, rdatatype,
-                                     BTEqualStrategyNumber);
-        if (oproid == InvalidOid)
+        if (!op_in_opfamily(eqopr, opfamily))
             return NIL;
-        expr = make_opclause(oproid, BOOLOID, false,
+        expr = make_opclause(eqopr, BOOLOID, false,
                              (Expr *) leftop, (Expr *) prefix,
                              InvalidOid, indexcollation);
         result = list_make1(expr);
@@ -379,11 +411,9 @@ match_pattern_prefix(Node *leftop,
      *
      * We can always say "x >= prefix".
      */
-    oproid = get_opfamily_member(opfamily, ldatatype, rdatatype,
-                                 BTGreaterEqualStrategyNumber);
-    if (oproid == InvalidOid)
+    if (!op_in_opfamily(geopr, opfamily))
         return NIL;
-    expr = make_opclause(oproid, BOOLOID, false,
+    expr = make_opclause(geopr, BOOLOID, false,
                          (Expr *) leftop, (Expr *) prefix,
                          InvalidOid, indexcollation);
     result = list_make1(expr);
@@ -396,15 +426,13 @@ match_pattern_prefix(Node *leftop,
      * using a C-locale index collation.
      *-------
      */
-    oproid = get_opfamily_member(opfamily, ldatatype, rdatatype,
-                                 BTLessStrategyNumber);
-    if (oproid == InvalidOid)
+    if (!op_in_opfamily(ltopr, opfamily))
         return result;
-    fmgr_info(get_opcode(oproid), &ltproc);
+    fmgr_info(get_opcode(ltopr), &ltproc);
     greaterstr = make_greater_string(prefix, &ltproc, indexcollation);
     if (greaterstr)
     {
-        expr = make_opclause(oproid, BOOLOID, false,
+        expr = make_opclause(ltopr, BOOLOID, false,
                              (Expr *) leftop, (Expr *) greaterstr,
                              InvalidOid, indexcollation);
         result = lappend(result, expr);

@@ -107,12 +107,12 @@
   oprcode => 'starts_with', oprrest => 'prefixsel',
   oprjoin => 'prefixjoinsel' },
 
-{ oid => '254', descr => 'equal',
+{ oid => '254', oid_symbol => 'NameEqualTextOperator', descr => 'equal',
   oprname => '=', oprcanmerge => 't', oprcanhash => 't', oprleft => 'name',
   oprright => 'text', oprresult => 'bool', oprcom => '=(text,name)',
   oprnegate => '<>(name,text)', oprcode => 'nameeqtext', oprrest => 'eqsel',
   oprjoin => 'eqjoinsel' },
-{ oid => '255', descr => 'less than',
+{ oid => '255', oid_symbol => 'NameLessTextOperator', descr => 'less than',
   oprname => '<', oprleft => 'name', oprright => 'text', oprresult => 'bool',
   oprcom => '>(text,name)', oprnegate => '>=(name,text)',
   oprcode => 'namelttext', oprrest => 'scalarltsel',
@@ -122,7 +122,8 @@
   oprcom => '>=(text,name)', oprnegate => '>(name,text)',
   oprcode => 'nameletext', oprrest => 'scalarlesel',
   oprjoin => 'scalarlejoinsel' },
-{ oid => '257', descr => 'greater than or equal',
+{ oid => '257', oid_symbol => 'NameGreaterEqualTextOperator',
+  descr => 'greater than or equal',
   oprname => '>=', oprleft => 'name', oprright => 'text', oprresult => 'bool',
   oprcom => '<=(text,name)', oprnegate => '<(name,text)',
   oprcode => 'namegetext', oprrest => 'scalargesel',
@@ -789,7 +790,7 @@
   oprname => '>=', oprleft => 'name', oprright => 'name', oprresult => 'bool',
   oprcom => '<=(name,name)', oprnegate => '<(name,name)', oprcode => 'namege',
   oprrest => 'scalargesel', oprjoin => 'scalargejoinsel' },
-{ oid => '664', descr => 'less than',
+{ oid => '664', oid_symbol => 'TextLessOperator', descr => 'less than',
   oprname => '<', oprleft => 'text', oprright => 'text', oprresult => 'bool',
   oprcom => '>(text,text)', oprnegate => '>=(text,text)', oprcode => 'text_lt',
   oprrest => 'scalarltsel', oprjoin => 'scalarltjoinsel' },
@@ -801,7 +802,8 @@
   oprname => '>', oprleft => 'text', oprright => 'text', oprresult => 'bool',
   oprcom => '<(text,text)', oprnegate => '<=(text,text)', oprcode => 'text_gt',
   oprrest => 'scalargtsel', oprjoin => 'scalargtjoinsel' },
-{ oid => '667', descr => 'greater than or equal',
+{ oid => '667', oid_symbol => 'TextGreaterEqualOperator',
+  descr => 'greater than or equal',
   oprname => '>=', oprleft => 'text', oprright => 'text', oprresult => 'bool',
   oprcom => '<=(text,text)', oprnegate => '<(text,text)', oprcode => 'text_ge',
   oprrest => 'scalargesel', oprjoin => 'scalargejoinsel' },
@@ -1160,7 +1162,7 @@
   oprname => '@@', oprkind => 'l', oprleft => '0', oprright => 'polygon',
   oprresult => 'point', oprcode => 'poly_center' },
 
-{ oid => '1054', descr => 'equal',
+{ oid => '1054', oid_symbol => 'BpcharEqualOperator', descr => 'equal',
   oprname => '=', oprcanmerge => 't', oprcanhash => 't', oprleft => 'bpchar',
   oprright => 'bpchar', oprresult => 'bool', oprcom => '=(bpchar,bpchar)',
   oprnegate => '<>(bpchar,bpchar)', oprcode => 'bpchareq', oprrest => 'eqsel',
@@ -1180,7 +1182,7 @@
   oprresult => 'bool', oprcom => '<>(bpchar,bpchar)',
   oprnegate => '=(bpchar,bpchar)', oprcode => 'bpcharne', oprrest => 'neqsel',
   oprjoin => 'neqjoinsel' },
-{ oid => '1058', descr => 'less than',
+{ oid => '1058', oid_symbol => 'BpcharLessOperator', descr => 'less than',
   oprname => '<', oprleft => 'bpchar', oprright => 'bpchar',
   oprresult => 'bool', oprcom => '>(bpchar,bpchar)',
   oprnegate => '>=(bpchar,bpchar)', oprcode => 'bpcharlt',
@@ -1195,7 +1197,8 @@
   oprresult => 'bool', oprcom => '<(bpchar,bpchar)',
   oprnegate => '<=(bpchar,bpchar)', oprcode => 'bpchargt',
   oprrest => 'scalargtsel', oprjoin => 'scalargtjoinsel' },
-{ oid => '1061', descr => 'greater than or equal',
+{ oid => '1061', oid_symbol => 'BpcharGreaterEqualOperator',
+  descr => 'greater than or equal',
   oprname => '>=', oprleft => 'bpchar', oprright => 'bpchar',
   oprresult => 'bool', oprcom => '<=(bpchar,bpchar)',
   oprnegate => '<(bpchar,bpchar)', oprcode => 'bpcharge',
@@ -2330,7 +2333,7 @@
   oprresult => 'numeric', oprcode => 'numeric_uplus' },
 
 # bytea operators
-{ oid => '1955', descr => 'equal',
+{ oid => '1955', oid_symbol => 'ByteaEqualOperator', descr => 'equal',
   oprname => '=', oprcanmerge => 't', oprcanhash => 't', oprleft => 'bytea',
   oprright => 'bytea', oprresult => 'bool', oprcom => '=(bytea,bytea)',
   oprnegate => '<>(bytea,bytea)', oprcode => 'byteaeq', oprrest => 'eqsel',
@@ -2339,7 +2342,7 @@
   oprname => '<>', oprleft => 'bytea', oprright => 'bytea', oprresult => 'bool',
   oprcom => '<>(bytea,bytea)', oprnegate => '=(bytea,bytea)',
   oprcode => 'byteane', oprrest => 'neqsel', oprjoin => 'neqjoinsel' },
-{ oid => '1957', descr => 'less than',
+{ oid => '1957', oid_symbol => 'ByteaLessOperator', descr => 'less than',
   oprname => '<', oprleft => 'bytea', oprright => 'bytea', oprresult => 'bool',
   oprcom => '>(bytea,bytea)', oprnegate => '>=(bytea,bytea)',
   oprcode => 'bytealt', oprrest => 'scalarltsel',
@@ -2354,7 +2357,8 @@
   oprcom => '<(bytea,bytea)', oprnegate => '<=(bytea,bytea)',
   oprcode => 'byteagt', oprrest => 'scalargtsel',
   oprjoin => 'scalargtjoinsel' },
-{ oid => '1960', descr => 'greater than or equal',
+{ oid => '1960', oid_symbol => 'ByteaGreaterEqualOperator',
+  descr => 'greater than or equal',
   oprname => '>=', oprleft => 'bytea', oprright => 'bytea', oprresult => 'bool',
   oprcom => '<=(bytea,bytea)', oprnegate => '<(bytea,bytea)',
   oprcode => 'byteage', oprrest => 'scalargesel',
@@ -2416,7 +2420,8 @@
   oprresult => 'timestamp', oprcode => 'timestamp_mi_interval' },
 
 # character-by-character (not collation order) comparison operators for character types
-{ oid => '2314', descr => 'less than',
+{ oid => '2314', oid_symbol => 'TextPatternLessOperator',
+  descr => 'less than',
   oprname => '~<~', oprleft => 'text', oprright => 'text', oprresult => 'bool',
   oprcom => '~>~(text,text)', oprnegate => '~>=~(text,text)',
   oprcode => 'text_pattern_lt', oprrest => 'scalarltsel',
@@ -2426,7 +2431,8 @@
   oprcom => '~>=~(text,text)', oprnegate => '~>~(text,text)',
   oprcode => 'text_pattern_le', oprrest => 'scalarlesel',
   oprjoin => 'scalarlejoinsel' },
-{ oid => '2317', descr => 'greater than or equal',
+{ oid => '2317', oid_symbol => 'TextPatternGreaterEqualOperator',
+  descr => 'greater than or equal',
   oprname => '~>=~', oprleft => 'text', oprright => 'text', oprresult => 'bool',
   oprcom => '~<=~(text,text)', oprnegate => '~<~(text,text)',
   oprcode => 'text_pattern_ge', oprrest => 'scalargesel',
@@ -2437,7 +2443,8 @@
   oprcode => 'text_pattern_gt', oprrest => 'scalargtsel',
   oprjoin => 'scalargtjoinsel' },
 
-{ oid => '2326', descr => 'less than',
+{ oid => '2326', oid_symbol => 'BpcharPatternLessOperator',
+  descr => 'less than',
   oprname => '~<~', oprleft => 'bpchar', oprright => 'bpchar',
   oprresult => 'bool', oprcom => '~>~(bpchar,bpchar)',
   oprnegate => '~>=~(bpchar,bpchar)', oprcode => 'bpchar_pattern_lt',
@@ -2447,7 +2454,8 @@
   oprresult => 'bool', oprcom => '~>=~(bpchar,bpchar)',
   oprnegate => '~>~(bpchar,bpchar)', oprcode => 'bpchar_pattern_le',
   oprrest => 'scalarlesel', oprjoin => 'scalarlejoinsel' },
-{ oid => '2329', descr => 'greater than or equal',
+{ oid => '2329', oid_symbol => 'BpcharPatternGreaterEqualOperator',
+  descr => 'greater than or equal',
   oprname => '~>=~', oprleft => 'bpchar', oprright => 'bpchar',
   oprresult => 'bool', oprcom => '~<=~(bpchar,bpchar)',
   oprnegate => '~<~(bpchar,bpchar)', oprcode => 'bpchar_pattern_ge',