Commit Graph

37 Commits

Author SHA1 Message Date
Tom Lane d70d119151 Make contrib regression tests safe for Danish locale.
In btree_gin and citext, avoid some not-particularly-interesting
dependencies on the sorting of 'aa'.  In tsearch2, use COLLATE "C" to
remove an uninteresting dependency on locale sort order (and thereby
allow removal of a variant expected-file).

Also, in citext, avoid assuming that lower('I') = 'i'.  This isn't relevant
to Danish but it does fail in Turkish.
2016-07-21 16:52:35 -04:00
Tom Lane de33af8823 Update contrib/tsearch2/expected/tsearch2_1.out for phrase FTS.
Commits bb140506d and 38627f687 didn't bother with this.
Per buildfarm member magpie.
2016-06-04 00:49:42 -04:00
Teodor Sigaev 38627f6878 Fix output of regression test of contrib/tsearch2
Just forget to add in 1ec4c7c055
2016-04-08 20:37:12 +03:00
Teodor Sigaev bb140506df Phrase full text search.
Patch introduces new text search operator (<-> or <DISTANCE>) into tsquery.
On-disk and binary in/out format of tsquery are backward compatible.
It has two side effect:
- change order for tsquery, so, users, who has a btree index over tsquery,
  should reindex it
- less number of parenthesis in tsquery output, and tsquery becomes more
  readable

Authors: Teodor Sigaev, Oleg Bartunov, Dmitry Ivanov
Reviewers: Alexander Korotkov, Artur Zakirov
2016-04-07 18:44:18 +03:00
Tom Lane 629b3af27d Convert contrib modules to use the extension facility.
This isn't fully tested as yet, in particular I'm not sure that the
"foo--unpackaged--1.0.sql" scripts are OK.  But it's time to get some
buildfarm cycles on it.

sepgsql is not converted to an extension, mainly because it seems to
require a very nonstandard installation process.

Dimitri Fontaine and Tom Lane
2011-02-13 22:54:49 -05:00
Peter Eisentraut fc946c39ae Remove useless whitespace at end of lines 2010-11-23 22:34:55 +02:00
Tom Lane a2de4826e9 Fix contrib/tsearch2 expected results to match recent changes in URL parsing. 2010-04-28 15:07:59 +00:00
Tom Lane 1753337cf5 Improve psql's tabular display of wrapped-around data by inserting markers
in the formerly-always-blank columns just to left and right of the data.
Different marking is used for a line break caused by a newline in the data
than for a straight wraparound.  A newline break is signaled by a "+" in the
right margin column in ASCII mode, or a carriage return arrow in UNICODE mode.
Wraparound is signaled by a dot in the right margin as well as the following
left margin in ASCII mode, or an ellipsis symbol in the same places in UNICODE
mode.  "\pset linestyle old-ascii" is added to make the previous behavior
available if anyone really wants it.

In passing, this commit also cleans up a few regression test files that
had unintended spacing differences from the current actual output.

Roger Leigh, reviewed by Gabrielle Roth and other members of PDXPUG.
2009-11-22 05:20:41 +00:00
Heikki Linnakangas a41e9ec0db Add alternative expected output files for cs_CZ locale for btree_gist and
tsearch2 tests. This should make 'comet_moth' buildfarm member pass
contrib check. Zdenek Kotala.
2009-05-08 14:48:06 +00:00
Tom Lane 8461ab5ab1 Update contrib for tsearch changes. 2008-05-16 17:26:07 +00:00
Andrew Dunstan d6eaeb335b Adjust contrib/tsearch2 regression results to use XML tag and XML entity descriptions, as now used by core text search default parser. 2007-11-20 04:23:10 +00:00
Tom Lane 4394c1b09c Resurrect the code for the rewrite(ARRAY[...]) aggregate function,
and put it into contrib/tsearch2 compatibility module.
2007-11-13 22:14:50 +00:00
Tom Lane 90e3f2aca7 Replace the now-incompatible-with-core contrib/tsearch2 module with a
compatibility package.  This supports importing dumps from past versions
using tsearch2, and provides the old names and API for most functions
that were changed.  (rewrite(ARRAY[...]) is a glaring omission, though.)

Pavel Stehule and Tom Lane
2007-11-13 21:02:29 +00:00
Tom Lane 684ad6a92f Rename contrib contains/contained-by operators to @> and <@, per discussion. 2006-09-10 17:36:52 +00:00
Bruce Momjian 0c4f2894f9 Use '' rather than \' for literal single quotes in strings in
/contrib/tsearch2.

Teodor Sigaev
2006-09-02 22:03:30 +00:00
Teodor Sigaev 74dbba701f Fix regression tests: after changing comparing function
order is changed.
2006-08-25 07:39:08 +00:00
Teodor Sigaev 22505f4703 Add thesaurus dictionary which can replace N>0 lexemes by M>0 lexemes.
It required some changes in lexize algorithm, but interface with
dictionaries stays compatible with old dictionaries.

Funded by Georgia Public Library Service and LibLime, Inc.
2006-05-31 14:05:31 +00:00
Teodor Sigaev 8a3631f8d8 GIN: Generalized Inverted iNdex.
text[], int4[], Tsearch2 support for GIN.
2006-05-02 11:28:56 +00:00
Teodor Sigaev 38c4fe87ac Significantly improve ranking:
1) rank_cd now use weight of lexemes
2) rank_cd and rank can use any combination of normalization methods:
        no normalization
        normalization by log(length of document)
        -----/------- by length of document
        -----/------- by number of unique word in document
        -----/------- by log(number of unique word in document)
        -----/------- by number of covers (only rank_cd)

Improve cover's search.

TODO: changes in documentation
2006-03-02 19:07:19 +00:00
Teodor Sigaev 011c520cb6 renew output of regression test accordingly to
http://archives.postgresql.org/pgsql-committers/2006-02/msg00089.php
2006-02-10 11:18:40 +00:00
Teodor Sigaev 5e2707c45f Snowball multibyte. It's a pity, but snowball sources is very diferent for multibyte and
singlebyte encodings, so we should have snowball for every encodings.

I hope that finalize multibyte support work in tsearch2, but testing is needed...
2006-01-27 16:32:31 +00:00
Teodor Sigaev c52795d18a Text parser rewritten:
- supports multibyte encodings
        - more strict rules for lexemes
        - flex isn't used
Add:
        - tsquery plainto_tsquery(text)
          Function makes tsquery from plain text.
        - &&, ||, !! operation for tsquery for combining
          tsquery from it's parts:  'foo & bar' || 'asd' => 'foo & bar | asd'
2005-11-21 12:27:57 +00:00
Teodor Sigaev 134bed8089 Fix rwrite(ARRAY) on 64-bit boxes:
Instead of getting elements of array manually call deconstruct_array
2005-11-09 09:26:04 +00:00
Teodor Sigaev 0645663e6c New features for tsearch2:
1 Comparison operation for tsquery
2 Btree index on tsquery
3 numnode(tsquery) - returns 'length' of tsquery
4 tsquery @ tsquery, tsquery ~ tsquery - contains, contained for tsquery.
  Note: They don't gurantee exact result, only MAY BE, so it
  useful only for speed up rewrite functions
5 GiST index support for @,~
6 rewrite():
        select rewrite(orig, what, to);
        select rewrite(ARRAY[orig, what, to]) from tsquery_table;
        select rewrite(orig, 'select what, to from tsquery_table;');
7 significantly improve cover algorithm
2005-11-08 17:08:46 +00:00
Teodor Sigaev 21b748e76a 1 Fix problem with lost precision in rank with OR-ed lexemes
2 Allow tsquery_in to input void tsquery: resolve dump/restore problem with tsquery
2005-10-28 13:05:06 +00:00
Bruce Momjian bb3cce4ec9 Add E'' syntax so eventually normal strings can treat backslashes
literally.

Add GUC variables:

        "escape_string_warning" - warn about backslashes in non-E strings
        "escape_string_syntax" - supports E'' syntax?
        "standard_compliant_strings" - treats backslashes literally in ''

Update code to use E'' when escapes are used.
2005-06-26 03:04:37 +00:00
Tom Lane b04e70b11e Adjust tsearch2.sql to avoid use of COPY FROM STDIN, so as to
simplify life for the Win32 installer.  Per Dave Page.
2004-09-14 03:58:54 +00:00
Teodor Sigaev bb89237531 1 Eliminate duplicate field HLWORD->skip
2 Rework support for html tags in parser
3 add HighlightAll to headline function for generating highlighted
  whole text with saved html tags
2004-06-28 16:19:09 +00:00
Teodor Sigaev 09bc52fe73 Fix stupid bug in installcheck 2004-06-23 09:43:43 +00:00
Teodor Sigaev a6ea6457fa Stat function now can show statistics per weight of lexemes 2004-05-28 15:36:49 +00:00
Teodor Sigaev eebdfcdbe6 1 Minimize memory allocation for void (but not null) value.
2 Add silly ordering for ts_vector to aim grouping, union, except etc. Don't use BTree opclass (tsvector_ops).
2004-03-25 16:56:10 +00:00
Teodor Sigaev 0b1ee9b5a3 fix hlfinditem function. Thanks to "Stphane Bidoul" <stephane.bidoul@softwareag.com>.
The 'word' variable there is initialised from
the prs->words array, but immediately after,
that array may be reallocated, thus leaving
word pointing to unallocated memory.
2003-09-22 13:32:33 +00:00
Teodor Sigaev 61366a9503 More accuracy works with stopwords in queries 2003-08-28 12:23:24 +00:00
Teodor Sigaev dd2870f76f Add ts_debug function for debugging configurations 2003-08-06 09:19:21 +00:00
Tom Lane 6ed071bca5 Update contrib regression tests for recent error message editing. 2003-08-01 02:38:09 +00:00
Teodor Sigaev 8f146a9077 Fix output to psql:tsearch2.sql:13: NOTICE: ... "pg_ts_dict_pkey" 2003-07-21 15:15:19 +00:00
Teodor Sigaev b88605337e tsearch2 module 2003-07-21 10:27:44 +00:00