mirror of
https://git.postgresql.org/git/postgresql.git
synced 2024-10-03 01:36:59 +02:00
78ee60ed84
This doesn't have any external effect at the moment, but it will allow adding useful link-discoverability features later. Brar Piening, reviewed by Karl Pinc. Discussion: https://postgr.es/m/CAB8KJ=jpuQU9QJe4+RgWENrK5g9jhoysMw2nvTN_esoOU0=a_w@mail.gmail.com
150 lines
4.3 KiB
Plaintext
150 lines
4.3 KiB
Plaintext
<!-- doc/src/sgml/dict-xsyn.sgml -->
|
|
|
|
<sect1 id="dict-xsyn" xreflabel="dict_xsyn">
|
|
<title>dict_xsyn</title>
|
|
|
|
<indexterm zone="dict-xsyn">
|
|
<primary>dict_xsyn</primary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
<filename>dict_xsyn</filename> (Extended Synonym Dictionary) is an example of an
|
|
add-on dictionary template for full-text search. This dictionary type
|
|
replaces words with groups of their synonyms, and so makes it possible to
|
|
search for a word using any of its synonyms.
|
|
</para>
|
|
|
|
<sect2 id="dict-xsyn-config">
|
|
<title>Configuration</title>
|
|
|
|
<para>
|
|
A <literal>dict_xsyn</literal> dictionary accepts the following options:
|
|
</para>
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>
|
|
<literal>matchorig</literal> controls whether the original word is accepted by
|
|
the dictionary. Default is <literal>true</literal>.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<literal>matchsynonyms</literal> controls whether the synonyms are
|
|
accepted by the dictionary. Default is <literal>false</literal>.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<literal>keeporig</literal> controls whether the original word is included in
|
|
the dictionary's output. Default is <literal>true</literal>.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<literal>keepsynonyms</literal> controls whether the synonyms are included in
|
|
the dictionary's output. Default is <literal>true</literal>.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<literal>rules</literal> is the base name of the file containing the list of
|
|
synonyms. This file must be stored in
|
|
<filename>$SHAREDIR/tsearch_data/</filename> (where <literal>$SHAREDIR</literal> means
|
|
the <productname>PostgreSQL</productname> installation's shared-data directory).
|
|
Its name must end in <literal>.rules</literal> (which is not to be included in
|
|
the <literal>rules</literal> parameter).
|
|
</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
<para>
|
|
The rules file has the following format:
|
|
</para>
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>
|
|
Each line represents a group of synonyms for a single word, which is
|
|
given first on the line. Synonyms are separated by whitespace, thus:
|
|
<programlisting>
|
|
word syn1 syn2 syn3
|
|
</programlisting>
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
The sharp (<literal>#</literal>) sign is a comment delimiter. It may appear at
|
|
any position in a line. The rest of the line will be skipped.
|
|
</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
|
|
<para>
|
|
Look at <filename>xsyn_sample.rules</filename>, which is installed in
|
|
<filename>$SHAREDIR/tsearch_data/</filename>, for an example.
|
|
</para>
|
|
</sect2>
|
|
|
|
<sect2 id="dict-xsyn-usage">
|
|
<title>Usage</title>
|
|
|
|
<para>
|
|
Installing the <literal>dict_xsyn</literal> extension creates a text search
|
|
template <literal>xsyn_template</literal> and a dictionary <literal>xsyn</literal>
|
|
based on it, with default parameters. You can alter the
|
|
parameters, for example
|
|
|
|
<programlisting>
|
|
mydb# ALTER TEXT SEARCH DICTIONARY xsyn (RULES='my_rules', KEEPORIG=false);
|
|
ALTER TEXT SEARCH DICTIONARY
|
|
</programlisting>
|
|
|
|
or create new dictionaries based on the template.
|
|
</para>
|
|
|
|
<para>
|
|
To test the dictionary, you can try
|
|
|
|
<programlisting>
|
|
mydb=# SELECT ts_lexize('xsyn', 'word');
|
|
ts_lexize
|
|
-----------------------
|
|
{syn1,syn2,syn3}
|
|
|
|
mydb# ALTER TEXT SEARCH DICTIONARY xsyn (RULES='my_rules', KEEPORIG=true);
|
|
ALTER TEXT SEARCH DICTIONARY
|
|
|
|
mydb=# SELECT ts_lexize('xsyn', 'word');
|
|
ts_lexize
|
|
-----------------------
|
|
{word,syn1,syn2,syn3}
|
|
|
|
mydb# ALTER TEXT SEARCH DICTIONARY xsyn (RULES='my_rules', KEEPORIG=false, MATCHSYNONYMS=true);
|
|
ALTER TEXT SEARCH DICTIONARY
|
|
|
|
mydb=# SELECT ts_lexize('xsyn', 'syn1');
|
|
ts_lexize
|
|
-----------------------
|
|
{syn1,syn2,syn3}
|
|
|
|
mydb# ALTER TEXT SEARCH DICTIONARY xsyn (RULES='my_rules', KEEPORIG=true, MATCHORIG=false, KEEPSYNONYMS=false);
|
|
ALTER TEXT SEARCH DICTIONARY
|
|
|
|
mydb=# SELECT ts_lexize('xsyn', 'syn1');
|
|
ts_lexize
|
|
-----------------------
|
|
{word}
|
|
</programlisting>
|
|
|
|
Real-world usage will involve including it in a text search
|
|
configuration as described in <xref linkend="textsearch"/>.
|
|
That might look like this:
|
|
|
|
<programlisting>
|
|
ALTER TEXT SEARCH CONFIGURATION english
|
|
ALTER MAPPING FOR word, asciiword WITH xsyn, english_stem;
|
|
</programlisting>
|
|
|
|
</para>
|
|
</sect2>
|
|
|
|
</sect1>
|