postgresql/contrib/pg_trgm
Tom Lane 9e43e8714c Fix contrib/pg_trgm's extraction of trigrams from regular expressions.
The logic for removing excess trigrams from the result was faulty.
It intends to avoid merging the initial and final states of the NFA,
which is necessary, but in testing whether removal of a specific trigram
would cause that, it failed to consider the combined effects of all the
state merges that that trigram's removal would cause.  This could result
in a broken final graph that would never match anything, leading to GIN
or GiST indexscans not finding anything.

To fix, add a "tentParent" field that is used only within this loop,
and set it to show state merges that we are tentatively going to do.
While examining a particular arc, we must chase up through tentParent
links as well as regular parent links (the former can only appear atop
the latter), and we must account for state init/fin flag merges that
haven't actually been done yet.
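
The link-chasing rule above can be illustrated with a small, self-contained
sketch. This is not the code from trgm_regexp.c: only the "tentParent" name
comes from the commit message. The SketchState type, the tentFlags field, the
SK_INIT/SK_FIN macros, and both function names are hypothetical, chosen for
illustration. The sketch assumes that tentative links are only ever set on
states that have no regular parent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SK_INIT 0x01            /* hypothetical: state is initial */
#define SK_FIN  0x02            /* hypothetical: state is final */

typedef struct SketchState
{
    int         flags;          /* flags from merges already performed */
    int         tentFlags;      /* flags from merges only planned so far */
    struct SketchState *parent;     /* regular (committed) merge link */
    struct SketchState *tentParent; /* tentative merge link */
} SketchState;

/*
 * Chase regular parent links first, then tentative ones, collecting the
 * init/fin flags that the resulting merged state would carry.
 */
static SketchState *
sketch_resolve(SketchState *s, int *flags_out)
{
    int         flags;

    assert(s != NULL);
    while (s->parent)
        s = s->parent;
    flags = s->flags | s->tentFlags;
    while (s->tentParent)
    {
        s = s->tentParent;
        flags |= s->flags | s->tentFlags;
    }
    *flags_out = flags;
    return s;
}

/*
 * Tentatively merge target into source.  Refuse (return false) if the
 * combined state would be both initial and final.
 */
static bool
sketch_try_merge(SketchState *source, SketchState *target)
{
    int         sflags;
    int         tflags;
    SketchState *s = sketch_resolve(source, &sflags);
    SketchState *t = sketch_resolve(target, &tflags);

    if (s == t)
        return true;            /* already in the same merged state */
    if (((sflags | tflags) & (SK_INIT | SK_FIN)) == (SK_INIT | SK_FIN))
        return false;           /* would join initial and final states */
    t->tentParent = s;
    s->tentFlags |= tflags;
    return true;
}
```

With three states a (initial), b (plain), and c (final), tentatively merging
b into a succeeds. A later attempt to merge c into b is then refused, because
b already resolves to a state carrying the initial flag. A check that looked
only at b's own flags would miss this.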

To simplify the latter, combine the separate init and fin bool fields
into a bitmap flags field.  I also chose to get rid of the "children"
state list, which seems entirely inessential.
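
As a rough illustration of why a bitmap simplifies this, the sketch below
shows two bool fields folded into one flags word, so that merging two states'
properties becomes a single bitwise OR. The macro and function names here
(FLAG_INIT, FLAG_FIN, flags_from_bools, merge_flags, joins_init_and_fin) are
assumptions made for this example and are not taken from the patch.

```c
#include <stdbool.h>

#define FLAG_INIT 0x01          /* hypothetical: state is initial */
#define FLAG_FIN  0x02          /* hypothetical: state is final */

/* Convert the old-style pair of booleans into one flags word. */
static int
flags_from_bools(bool init, bool fin)
{
    return (init ? FLAG_INIT : 0) | (fin ? FLAG_FIN : 0);
}

/* Flags of the state produced by merging two states. */
static int
merge_flags(int a, int b)
{
    return a | b;
}

/* True if a state with these flags is both initial and final. */
static bool
joins_init_and_fin(int flags)
{
    return (flags & (FLAG_INIT | FLAG_FIN)) == (FLAG_INIT | FLAG_FIN);
}
```

With separate bool fields, each merge has to OR the two fields one by one.
With a flags word, pending and completed merges can be accumulated in the
same way and tested with one mask comparison.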

Per bug #14563 from Alexey Isayko, which the added test cases are based on.
Back-patch to 9.3 where this code was added.

Report: https://postgr.es/m/20170222111446.1256.67547@wrigleys.postgresql.org
Discussion: https://postgr.es/m/8816.1487787594@sss.pgh.pa.us
2017-02-22 15:04:26 -05:00
data Add files forgotten in f576b17cd6 2016-03-16 19:23:41 +03:00
expected Fix contrib/pg_trgm's extraction of trigrams from regular expressions. 2017-02-22 15:04:26 -05:00
sql Fix contrib/pg_trgm's extraction of trigrams from regular expressions. 2017-02-22 15:04:26 -05:00
.gitignore Support "make check" in contrib 2011-04-25 22:27:11 +03:00
Makefile Handle contrib's GIN/GIST support function signature changes honestly. 2016-06-09 16:44:25 -04:00
pg_trgm--1.0--1.1.sql Fix typo in update scripts for some contrib modules. 2013-07-19 04:13:01 +09:00
pg_trgm--1.1--1.2.sql Add word_similarity to pg_trgm contrib module. 2016-03-16 18:59:21 +03:00
pg_trgm--1.2--1.3.sql pg_trgm's set_limit() function is parallel unsafe, not parallel restricted. 2016-06-20 11:29:54 -04:00
pg_trgm--1.3.sql pg_trgm's set_limit() function is parallel unsafe, not parallel restricted. 2016-06-20 11:29:54 -04:00
pg_trgm--unpackaged--1.0.sql Fix typos in some error messages thrown by extension scripts when fed to psql. 2014-08-25 18:30:37 +02:00
pg_trgm.control Handle contrib's GIN/GIST support function signature changes honestly. 2016-06-09 16:44:25 -04:00
trgm.h Add word_similarity to pg_trgm contrib module. 2016-03-16 18:59:21 +03:00
trgm_gin.c pgindent run for 9.6 2016-06-09 18:02:36 -04:00
trgm_gist.c Fix comparison of similarity to threshold in GIST trigram searches. 2016-06-20 10:49:19 -04:00
trgm_op.c Fix typos in comments. 2017-02-06 11:33:58 +02:00
trgm_regexp.c Fix contrib/pg_trgm's extraction of trigrams from regular expressions. 2017-02-22 15:04:26 -05:00