Commit Graph

11 Commits

Author SHA1 Message Date
Bruce Momjian 7559d8ebfa Update copyrights for 2020
Backpatch-through: update all files in master, backpatch legal files through 9.4
2020-01-01 12:21:45 -05:00
Tom Lane 9ff5b699ed Sync patternsel_common's operator selection logic with pattern_prefix's.
Make patternsel_common() select the comparison operators to use with
hardwired logic that matches pattern_prefix()'s new logic, eliminating
its dependencies on particular index opfamilies.

This shouldn't change any behavior, as it's just replacing runtime
operator lookups with the same values hard-wired.  But it makes these
closely-related functions look more alike, and saving some runtime
syscache lookups is worth something.

Actually, it's not quite true that this is zero behavioral change:
when estimating for a column of type "name", the comparison constant
will be kept as "text" not coerced to "name".  But that's more correct
anyway, and it allows additional simplification of the coercion logic,
again syncing this more closely with pattern_prefix().

Per consideration of a report from Manuel Rigger.

Discussion: https://postgr.es/m/CA+u7OA7nnGYy8rY0vdTe811NuA+Frr9nbcBO9u2Z+JxqNaud+g@mail.gmail.com
2019-11-20 15:00:18 -05:00
Tom Lane 2ddedcafca Reduce match_pattern_prefix()'s dependencies on index opfamilies.
Historically, the planner's LIKE/regex index optimizations were only
carried out for specific index opfamilies.  That's never been a great
idea from the standpoint of extensibility, but it didn't matter so
much as long as we had no practical way to extend such behaviors anyway.
With the addition of planner support functions, and in view of ongoing
work to support additional table and index AMs, it seems like a good
time to relax this.

Hence, recast the decisions in match_pattern_prefix() so that rather
than decide which operators to generate by looking at what the index
opfamily contains, we decide which operators to generate a-priori
and then see if the opfamily supports them.  This is much more
defensible from a semantic standpoint anyway, since we know the
semantics of the chosen operators precisely, and we only need to
assume that the opfamily correctly implements operators it claims
to support.

The existing "pattern" opfamilies put a crimp in this approach, since
we need to select the pattern operators if we want those to work.
So we still have to special-case those opfamilies.  But that seems
all right, since in view of the addition of collations, the pattern
opfamilies seem like a legacy hack that nobody will be building on.

The only immediate effect of this change, so far as the core code is
concerned, is that anchored LIKE/regex patterns can be mapped onto
BRIN index searches, and exact-match patterns can be mapped onto hash
indexes, not only btree and spgist indexes as before.  That's not a
terribly exciting result, but it does fix an omission mentioned in
the ancient comments here.

Note: no catversion bump, even though this touches pg_operator.dat,
because it's only adding OID macros not changing the contents of
postgres.bki.

Per consideration of a report from Manuel Rigger.

Discussion: https://postgr.es/m/CA+u7OA7nnGYy8rY0vdTe811NuA+Frr9nbcBO9u2Z+JxqNaud+g@mail.gmail.com
2019-11-20 14:13:04 -05:00
Tom Lane b3c265d7be Fix corner-case failure in match_pattern_prefix().
The planner's optimization code for LIKE and regex operators could
error out with a complaint like "no = operator for opfamily NNN"
if someone created a binary-compatible index (for example, a
bpchar_ops index on a text column) on the LIKE's left argument.

This is a consequence of careless refactoring in commit 74dfe58a5.
The old code in match_special_index_operator only accepted specific
combinations of the pattern operator and the index opclass, thereby
indirectly guaranteeing that the opclass would have a comparison
operator with the same LHS input type as the pattern operator.
While moving the logic out to a planner support function, I simplified
that test in a way that no longer guarantees that.  Really though we'd
like an altogether weaker dependency on the opclass, so rather than
put back exactly the old code, just allow lookup failure.  I have in
mind now to rewrite this logic completely, but this is the minimum
change needed to fix the bug in v12.

Per report from Manuel Rigger.  Back-patch to v12 where the mistake
came in.

Discussion: https://postgr.es/m/CA+u7OA7nnGYy8rY0vdTe811NuA+Frr9nbcBO9u2Z+JxqNaud+g@mail.gmail.com
2019-11-19 17:03:34 -05:00
Tom Lane 03c811a483 Fix planner's test for case-foldable characters in ILIKE with ICU.
As coded, the ICU-collation path in pattern_char_isalpha() failed
to consider regular ASCII letters to be case-varying.  This led to
like_fixed_prefix treating too much of an ILIKE pattern as being a
fixed prefix, so that indexscans derived from an ILIKE clause might
miss entries that they should find.

Per bug #15892 from James Inform.  This is an oversight in the original
ICU patch (commit eccfef81e), so back-patch to v10 where that came in.

Discussion: https://postgr.es/m/15892-e5d2bea3e8a04a1b@postgresql.org
2019-08-12 13:15:47 -04:00
Tom Lane 8255c7a5ee Phase 2 pgindent run for v12.
Switch to 2.1 version of pg_bsd_indent.  This formats
multiline function declarations "correctly", that is with
additional lines of parameter declarations indented to match
where the first line's left parenthesis is.

Discussion: https://postgr.es/m/CAEepm=0P3FeTXRcU5B2W3jv3PgRVZ-kGUXLGfd42FFhUROO3ug@mail.gmail.com
2019-05-22 13:04:48 -04:00
Tom Lane be76af171c Initial pgindent run for v12.
This is still using the 2.0 version of pg_bsd_indent.
I thought it would be good to commit this separately,
so as to document the differences between 2.0 and 2.1 behavior.

Discussion: https://postgr.es/m/16296.1558103386@sss.pgh.pa.us
2019-05-22 12:55:34 -04:00
Peter Eisentraut abb9c63b2c Unbreak index optimization for LIKE on bytea
The same code is used to handle both text and bytea, but bytea is not
collation-aware, so we shouldn't call get_collation_isdeterministic()
in that case, since that will error out with an invalid collation.

Reported-by: Jeevan Chalke <jeevan.chalke@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/CAM2%2B6%3DWaf3qJ1%3DyVTUH8_yG-SC0xcBMY%2BSFLhvKKNnWNXSUDBw%40mail.gmail.com
2019-04-15 09:29:17 +02:00
Peter Eisentraut 5e1963fb76 Collations with nondeterministic comparison
This adds a flag "deterministic" to collations.  If that is false,
such a collation disables various optimizations that assume that
strings are equal only if they are byte-wise equal.  That then allows
use cases such as case-insensitive or accent-insensitive comparisons
or handling of strings with different Unicode normal forms.

This functionality is only supported with the ICU provider.  At least
glibc doesn't appear to have any locales that work in a
nondeterministic way, so it's not worth supporting this for the libc
provider.

The term "deterministic comparison" in this context is from Unicode
Technical Standard #10
(https://unicode.org/reports/tr10/#Deterministic_Comparison).

This patch makes changes in three areas:

- CREATE COLLATION DDL changes and system catalog changes to support
  this new flag.

- Many executor nodes and auxiliary code are extended to track
  collations.  Previously, this code would just throw away collation
  information, because the eventually-called user-defined functions
  didn't use it since they only cared about equality, which didn't
  need collation information.

- String data type functions that do equality comparisons and hashing
  are changed to take the (non-)deterministic flag into account.  For
  comparison, this just means skipping various shortcuts and tie
  breakers that use byte-wise comparison.  For hashing, we first need
  to convert the input string to a canonical "sort key" using the ICU
  analogue of strxfrm().

Reviewed-by: Daniel Verite <daniel@manitou-mail.org>
Reviewed-by: Peter Geoghegan <pg@bowt.ie>
Discussion: https://www.postgresql.org/message-id/flat/1ccc668f-4cbc-0bef-af67-450b47cdfee7@2ndquadrant.com
2019-03-22 12:12:43 +01:00
Tom Lane 49fa99e54e Move pattern selectivity code from selfuncs.c to like_support.c.
While at it, refactor patternsel() a bit so that it can be used from
the LIKE/regex planner support functions as well.  This makes the
planner able to deal equally well with either operator or function
syntax for these operations.  I'm not excited about that as a feature
in itself, but it provides a nice model for extensions to follow if
they want such behavior for their operations.

This change localizes the use of pattern_fixed_prefix() and
make_greater_string() so that they no longer need be exported.
(We might get pushback from extensions about that, perhaps,
in which case I'd be inclined to re-export them in a new header
file like_support.h.)

This reduces the bulk of selfuncs.c a fair amount, removing ~1370
lines or about one-sixth of that file; it's still too big, but this
is progress.

Discussion: https://postgr.es/m/24537.1550093915@sss.pgh.pa.us
2019-02-14 10:51:59 -05:00
Tom Lane 74dfe58a59 Allow extensions to generate lossy index conditions.
For a long time, indxpath.c has had the ability to extract derived (lossy)
index conditions from certain operators such as LIKE.  For just as long,
it's been obvious that we really ought to make that capability available
to extensions.  This commit finally accomplishes that, by adding another
API for planner support functions that lets them create derived index
conditions for their functions.  As proof of concept, the hardwired
"special index operator" code formerly present in indxpath.c is pushed
out to planner support functions attached to LIKE and other relevant
operators.

A weak spot in this design is that an extension needs to know OIDs for
the operators, datatypes, and opfamilies involved in the transformation
it wants to make.  The core-code prototypes use hard-wired OID references
but extensions don't have that option for their own operators etc.  It's
usually possible to look up the required info, but that may be slow and
inconvenient.  However, improving that situation is a separate task.

I want to do some additional refactorization around selfuncs.c, but
that also seems like a separate task.

Discussion: https://postgr.es/m/15193.1548028093@sss.pgh.pa.us
2019-02-11 21:26:14 -05:00