postgresql/contrib/pg_trgm
Tom Lane 80a5cf643a Improve contrib/pg_trgm's heuristics for regexp index searches.
When extracting trigrams from a regular expression for search of a GIN or
GiST trigram index, it's useful to penalize (preferentially discard)
trigrams that contain whitespace, since those are typically far more common
in the index than trigrams not containing whitespace.  Of course, this
should only be a preference, not a hard rule, since we might otherwise end
up with no trigrams to search for.  The previous coding tended to produce
fairly inefficient trigram search sets for anchored regexp patterns, as
reported by Erik Rijkers.  This patch penalizes whitespace-containing
trigrams, and also reduces the target number of extracted trigrams, since
experience suggests that the original coding tended to select too many
trigrams to search for.
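
The "preference, not a hard rule" idea can be illustrated with a small C sketch. This is a simplified illustration, not the algorithm in trgm_regexp.c. The names CandidateTrigram, whitespace_count, and select_trigrams are hypothetical, as is the rule of using the number of whitespace characters as the penalty. Candidates are ranked by penalty and only the first `target` are kept, so whitespace-containing trigrams are dropped first but still survive when nothing better is available.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define TRGM_LEN 3

/* Hypothetical candidate record; not a type from trgm.h. */
typedef struct
{
    char text[TRGM_LEN + 1];    /* trigram characters plus terminating NUL */
    int  penalty;               /* higher means "discard this one first" */
} CandidateTrigram;

/* Assumed penalty rule: one point per whitespace character in the trigram. */
int
whitespace_count(const char *trigram)
{
    int count = 0;

    for (int i = 0; i < TRGM_LEN; i++)
    {
        if (isspace((unsigned char) trigram[i]))
            count++;
    }
    return count;
}

/* Order by ascending penalty; break ties by text so the result is deterministic. */
int
compare_by_penalty(const void *a, const void *b)
{
    const CandidateTrigram *x = a;
    const CandidateTrigram *y = b;

    if (x->penalty != y->penalty)
        return (x->penalty > y->penalty) - (x->penalty < y->penalty);
    return memcmp(x->text, y->text, TRGM_LEN);
}

/*
 * Score and sort the candidates in place, then report how many to keep.
 * The kept trigrams are cands[0 .. result-1].  Nothing is rejected outright:
 * if every candidate contains whitespace, the least-penalized ones are kept.
 */
size_t
select_trigrams(CandidateTrigram *cands, size_t n, size_t target)
{
    assert(target > 0);

    for (size_t i = 0; i < n; i++)
        cands[i].penalty = whitespace_count(cands[i].text);

    qsort(cands, n, sizeof(CandidateTrigram), compare_by_penalty);

    return (n < target) ? n : target;
}
```

For example, pg_trgm pads a word with two leading spaces and one trailing space, so "foo" yields the trigrams "  f", " fo", "foo", and "oo ". Passing those four candidates to select_trigrams with a target of 2 keeps "foo" and " fo" under this sketch's ranking, and discards "  f", the candidate with the most whitespace, along with "oo ".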

Alexander Korotkov, reviewed by Tom Lane
2014-04-05 20:48:47 -04:00
data trgm - Trigram matching for PostgreSQL 2004-05-31 17:18:12 +00:00
expected Make contrib/pg_trgm also support regex searches with GiST indexes. 2013-04-10 13:31:02 -04:00
sql Make contrib/pg_trgm also support regex searches with GiST indexes. 2013-04-10 13:31:02 -04:00
.gitignore Support "make check" in contrib 2011-04-25 22:27:11 +03:00
Makefile Support indexing of regular-expression searches in contrib/pg_trgm. 2013-04-09 01:06:54 -04:00
pg_trgm--1.0--1.1.sql Fix typo in update scripts for some contrib modules. 2013-07-19 04:13:01 +09:00
pg_trgm--1.1.sql Make contrib/pg_trgm also support regex searches with GiST indexes. 2013-04-10 13:31:02 -04:00
pg_trgm--unpackaged--1.0.sql Throw a useful error message if an extension script file is fed to psql. 2011-10-12 15:45:03 -04:00
pg_trgm.control Support indexing of regular-expression searches in contrib/pg_trgm. 2013-04-09 01:06:54 -04:00
trgm.h Make contrib/pg_trgm also support regex searches with GiST indexes. 2013-04-10 13:31:02 -04:00
trgm_gin.c Make contrib/pg_trgm also support regex searches with GiST indexes. 2013-04-10 13:31:02 -04:00
trgm_gist.c Improve GiST index search performance for trigram regex queries. 2013-04-15 12:49:29 -04:00
trgm_op.c Fix possible buffer overrun in contrib/pg_trgm. 2014-01-13 13:07:10 -05:00
trgm_regexp.c Improve contrib/pg_trgm's heuristics for regexp index searches. 2014-04-05 20:48:47 -04:00