postgresql/src/test/modules/test_parser

README

test_parser is an example of a custom parser for full-text
search.  It doesn't do anything especially useful, but can serve as
a starting point for developing your own parser.

test_parser recognizes words separated by white space,
and returns just two token types:

mydb=# SELECT * FROM ts_token_type('testparser');
 tokid | alias |  description
-------+-------+---------------
     3 | word  | Word
    12 | blank | Space symbols
(2 rows)

These token numbers were chosen to match the default parser's
numbering.  This lets testparser reuse the default parser's headline
function (prsd_headline) instead of supplying its own, which keeps
the example simple.
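
The word/blank splitting described above can be sketched in standalone
C.  This is a simplified illustration under stated assumptions, not the
code in test_parser.c: ParserState, next_token and print_tokens are
hypothetical names, no PostgreSQL API is used, and treating every
isspace() character as white space is an assumption.  Only the token
ids 3 and 12 come from the ts_token_type output shown earlier.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Token ids as reported by ts_token_type('testparser'). */
enum { TOK_WORD = 3, TOK_BLANK = 12 };

/* Hypothetical cursor over the input text; not a PostgreSQL type. */
typedef struct
{
    const char *buf;    /* text being parsed */
    size_t      len;    /* length of buf in bytes */
    size_t      pos;    /* offset of the next unread byte */
} ParserState;

/*
 * Report the start and length of the next token and return its type,
 * or return 0 once the input is exhausted.  A token is a maximal run
 * of either white-space or non-white-space bytes.
 */
int
next_token(ParserState *st, const char **tok, size_t *toklen)
{
    size_t      start;
    int         blank;

    assert(st != NULL && tok != NULL && toklen != NULL);

    if (st->pos >= st->len)
        return 0;

    start = st->pos;
    /* Assumption: anything isspace() accepts counts as white space. */
    blank = isspace((unsigned char) st->buf[st->pos]) != 0;

    /* Advance while the bytes stay in the same class as the first one. */
    while (st->pos < st->len &&
           (isspace((unsigned char) st->buf[st->pos]) != 0) == blank)
        st->pos++;

    *tok = st->buf + start;
    *toklen = st->pos - start;
    return blank ? TOK_BLANK : TOK_WORD;
}

/* Print one "tokid | token" line per token, loosely imitating ts_parse. */
void
print_tokens(const char *text)
{
    ParserState st = {text, strlen(text), 0};
    const char *tok;
    size_t      len;
    int         type;

    while ((type = next_token(&st, &tok, &len)) != 0)
        printf("%5d | %.*s\n", type, (int) len, tok);
}
```

Calling print_tokens("That's my first own parser") from a main()
function prints nine lines whose token ids alternate between 3 and 12.
Returning 0 at end of input keeps the caller's loop short, and
reporting a pointer and a length avoids copying the token text.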

Usage
=====

Installing the test_parser extension (with CREATE EXTENSION test_parser)
creates a text search parser named testparser.  It has no
user-configurable parameters.

For example, you can test the parser with:

mydb=# SELECT * FROM ts_parse('testparser', 'That''s my first own parser');
 tokid | token
-------+--------
     3 | That's
    12 |
     3 | my
    12 |
     3 | first
    12 |
     3 | own
    12 |
     3 | parser
(9 rows)

Real-world use requires setting up a text search configuration
that uses the parser.  For example,

mydb=# CREATE TEXT SEARCH CONFIGURATION testcfg ( PARSER = testparser );
CREATE TEXT SEARCH CONFIGURATION

mydb=# ALTER TEXT SEARCH CONFIGURATION testcfg
mydb-#   ADD MAPPING FOR word WITH english_stem;
ALTER TEXT SEARCH CONFIGURATION

mydb=# SELECT to_tsvector('testcfg', 'That''s my first own parser');
          to_tsvector
-------------------------------
 'that':1 'first':3 'parser':5
(1 row)

mydb=# SELECT ts_headline('testcfg', 'Supernovae stars are the brightest phenomena in galaxies',
mydb(#                    to_tsquery('testcfg', 'star'));
                           ts_headline
-----------------------------------------------------------------
 Supernovae <b>stars</b> are the brightest phenomena in galaxies
(1 row)