Commit Graph

19 Commits

Author SHA1 Message Date
Teodor Sigaev 3dbbd0f02a Do not fallback to AND for FTS phrase operator.
If there is no positional information of lexemes then phrase operator will not
fallback to AND operator. This change makes needing to modify TS_execute()
interface, because somewhere (in indexes, for example) positional information
is unaccesible and in this cases we need to force fallback to AND.

Per discussion c19fcfec308e6ccd952cdde9e648b505@mail.gmail.com
2016-06-27 20:47:32 +03:00
Teodor Sigaev a7ace3b6d9 Make testing of phraseto_tsquery independ from value of
default_text_search_config variable.

Per skink buldfarm member
2016-04-07 19:33:23 +03:00
Teodor Sigaev bb140506df Phrase full text search.
Patch introduces new text search operator (<-> or <DISTANCE>) into tsquery.
On-disk and binary in/out format of tsquery are backward compatible.
It has two side effect:
- change order for tsquery, so, users, who has a btree index over tsquery,
  should reindex it
- less number of parenthesis in tsquery output, and tsquery becomes more
  readable

Authors: Teodor Sigaev, Oleg Bartunov, Dmitry Ivanov
Reviewers: Alexander Korotkov, Artur Zakirov
2016-04-07 18:44:18 +03:00
Teodor Sigaev 61d66c44f1 Fix support of digits in email/hostnames.
When tsearch was implemented I did several mistakes in hostname/email
definition rules:
1) allow underscore in hostname what prohibited by RFC
2) forget to allow leading digits separated by hyphen (like 123-x.com)
   in hostname
3) do no allow underscore/hyphen after leading digits in localpart of email

Artur's patch resolves two last issues, but by the way allows hosts name like
123_x.com together with 123-x.com. RFC forbids underscore usage in hostname
but pg allows that since initial tsearch version in core, although only
for non-digits. Patch syncs support digits and nondigits in both hostname and
email.

Forbidding underscore in hostname may break existsing usage of tsearch and,
anyhow, it should be done by separate patch.

Author: Artur Zakirov
BUG: #13964
2016-03-29 18:28:49 +03:00
Bruce Momjian 1420f3a982 Fix ts_rank_cd() to ignore stripped lexemes
Previously, stripped lexemes got a default location and could be
considered if mixed with non-stripped lexemes.

BACKWARD INCOMPATIBILITY CHANGE
2014-03-24 14:37:16 -04:00
Tom Lane 1db5af2794 Fix gincostestimate to handle ScalarArrayOpExpr reasonably.
The original coding of this function overlooked the possibility that
it could be passed anything except simple OpExpr indexquals.  But
ScalarArrayOpExpr is possible too, and the code would probably crash
(and surely give ridiculous answers) in such a case.  Add logic to try
to estimate sanely for such cases.

In passing, fix the treatment of inner-indexscan cost estimation: it was
failing to scale up properly for multiple iterations of a nestloop.
(I think somebody might've thought that index_pages_fetched() is linear,
but of course it's not.)

Report, diagnosis, and preliminary patch by Marti Raudsepp; I refactored
it a bit and fixed the cost estimation.

Back-patch into 9.1 where the bogus code was introduced.
2011-12-20 19:57:34 -05:00
Peter Eisentraut fc946c39ae Remove useless whitespace at end of lines 2010-11-23 22:34:55 +02:00
Tom Lane 2c265adea3 Modify the built-in text search parser to handle URLs more nearly according
to RFC 3986.  In particular, these characters now terminate the path part
of a URL: '"', '<', '>', '\', '^', '`', '{', '|', '}'.  The previous behavior
was inconsistent and depended on whether a "?" was present in the path.
Per gripe from Donald Fraser and spec research by Kevin Grittner.

This is a pre-existing bug, but not back-patching since the risks of
breaking existing applications seem to outweigh the benefits.
2010-04-28 02:04:16 +00:00
Tom Lane 7280fab717 Fix bug #4814 (wrong subscript in consistent-function call), and add some
minimal regression test coverage for matchPartialInPendingList().
2009-05-19 02:48:26 +00:00
Teodor Sigaev 2a0083ede8 Improve headeline generation. Now headline can contain
several fragments a-la Google.

Sushant Sinha <sushant354@gmail.com>
2008-10-17 18:05:19 +00:00
Tom Lane e6dbcb72fa Extend GIN to support partial-match searches, and extend tsquery to support
prefix matching using this facility.

Teodor Sigaev and Oleg Bartunov
2008-05-16 16:31:02 +00:00
Tom Lane 689d02a2e9 Fix a regression test that fails if default_text_search_config isn't
'english'.
2008-01-13 21:17:46 +00:00
Tom Lane 82ca4d0210 Fix attribution for Rime of the Ancient Mariner (obviously it's been
too long since freshman English :-()
2007-12-10 00:12:31 +00:00
Tom Lane 71e90b0df2 The E. J. Pratt verse used as a tsearch test case is unfortunately still
under copyright in the US and many other places.  Substitute a little
something from a poet who's more safely dead.  Per gripe from Bjorn Munch.
2007-12-09 21:01:18 +00:00
Andrew Dunstan 3de1f0daac Fix XML tag namespace change inadvertantly missed from previous fix. Add
regression test for XML names and numeric entities.
2007-11-25 15:37:11 +00:00
Tom Lane 592c88a0d2 Remove the aggregate form of ts_rewrite(), since it doesn't work as desired
if there are zero rows to aggregate over, and the API seems both conceptually
and notationally ugly anyway.  We should look for something that improves
on the tsquery-and-text-SELECT version (which is also pretty ugly but at
least it works...), but it seems that will take query infrastructure that
doesn't exist today.  (Hm, I wonder if there's anything in or near SQL2003
window functions that would help?)  Per discussion.
2007-10-24 02:24:49 +00:00
Tom Lane 93eab9312f Rename built-in Snowball stemmer dictionaries to be english_stem,
russian_stem, etc.  Per discussion.
2007-08-25 01:06:25 +00:00
Bruce Momjian 1c36de33b0 Uppercase keywords in regression tsearch test scripts. 2007-08-21 15:41:13 +00:00
Tom Lane 140d4ebcb4 Tsearch2 functionality migrates to core. The bulk of this work is by
Oleg Bartunov and Teodor Sigaev, but I did a lot of editorializing,
so anything that's broken is probably my fault.

Documentation is nonexistent as yet, but let's land the patch so we can
get some portability testing done.
2007-08-21 01:11:32 +00:00