<!-- doc/src/sgml/dict-xsyn.sgml -->

<sect1 id="dict-xsyn" xreflabel="dict_xsyn">
 <title>dict_xsyn</title>

 <indexterm zone="dict-xsyn">
  <primary>dict_xsyn</primary>
 </indexterm>

 <para>
  <filename>dict_xsyn</> (Extended Synonym Dictionary) is an example of an
  add-on dictionary template for full-text search.  This dictionary type
  replaces words with groups of their synonyms, and so makes it possible to
  search for a word using any of its synonyms.
 </para>

 <sect2>
  <title>Configuration</title>

  <para>
   A <literal>dict_xsyn</> dictionary accepts the following options:
  </para>
  <itemizedlist>
   <listitem>
    <para>
     <literal>matchorig</> controls whether the original word is accepted by
     the dictionary. Default is <literal>true</>.
    </para>
   </listitem>
   <listitem>
    <para>
     <literal>matchsynonyms</> controls whether the synonyms are
     accepted by the dictionary. Default is <literal>false</>.
    </para>
   </listitem>
   <listitem>
    <para>
     <literal>keeporig</> controls whether the original word is included in
     the dictionary's output. Default is <literal>true</>.
    </para>
   </listitem>
   <listitem>
    <para>
     <literal>keepsynonyms</> controls whether the synonyms are included in
     the dictionary's output. Default is <literal>true</>.
    </para>
   </listitem>
   <listitem>
    <para>
     <literal>rules</> is the base name of the file containing the list of
     synonyms.  This file must be stored in
     <filename>$SHAREDIR/tsearch_data/</> (where <literal>$SHAREDIR</> means
     the <productname>PostgreSQL</> installation's shared-data directory).
     Its name must end in <literal>.rules</> (which is not to be included in
     the <literal>rules</> parameter).
    </para>
   </listitem>
  </itemizedlist>
  <para>
   The rules file has the following format:
  </para>
  <itemizedlist>
   <listitem>
    <para>
     Each line represents a group of synonyms for a single word, which is
     given first on the line. Synonyms are separated by whitespace, thus:
<programlisting>
word syn1 syn2 syn3
</programlisting>
    </para>
   </listitem>
   <listitem>
    <para>
     The sharp (<literal>#</>) sign is a comment delimiter. It may appear at
     any position in a line.  The rest of the line will be skipped.
    </para>
   </listitem>
  </itemizedlist>

  <para>
   Look at <filename>xsyn_sample.rules</>, which is installed in
   <filename>$SHAREDIR/tsearch_data/</>, for an example.
  </para>
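
  <para>
   As a rough illustration of the format described above, a rules file
   might look like the following sketch.  The file name
   <filename>my_rules.rules</> and the words in it are hypothetical; they
   are not taken from the installed sample file.

<programlisting>
# my_rules.rules (hypothetical example)
# Each line: a word followed by its synonyms, separated by whitespace
supernova sn sne      # text after the sharp sign is skipped
automobile car auto
</programlisting>

   A file like this would be placed in
   <filename>$SHAREDIR/tsearch_data/</> and referenced with
   <literal>RULES='my_rules'</>, that is, without the
   <literal>.rules</> suffix.
  </para>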
 </sect2>

 <sect2>
  <title>Usage</title>

  <para>
   Installing the <literal>dict_xsyn</> extension creates a text search
   template <literal>xsyn_template</> and a dictionary <literal>xsyn</>
   based on it, with default parameters.  You can alter the
   parameters, for example

<programlisting>
mydb=# ALTER TEXT SEARCH DICTIONARY xsyn (RULES='my_rules', KEEPORIG=false);
ALTER TEXT SEARCH DICTIONARY
</programlisting>

   or create new dictionaries based on the template.
  </para>
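
  <para>
   A minimal sketch of creating a new dictionary from the template
   follows.  The dictionary name <literal>my_xsyn</> is hypothetical, and
   a rules file <filename>my_rules.rules</> is assumed to exist in
   <filename>$SHAREDIR/tsearch_data/</>:

<programlisting>
-- Install the extension if it is not already present in this database
CREATE EXTENSION IF NOT EXISTS dict_xsyn;

-- Define a separate dictionary on the template, leaving xsyn untouched
-- (my_xsyn and my_rules are illustrative names)
CREATE TEXT SEARCH DICTIONARY my_xsyn (
    TEMPLATE = xsyn_template,
    RULES = 'my_rules',
    KEEPORIG = false
);
</programlisting>

   Creating a separate dictionary, rather than altering
   <literal>xsyn</>, lets several rule sets or option combinations
   coexist in one database.
  </para>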

  <para>
   To test the dictionary, you can try

<programlisting>
mydb=# SELECT ts_lexize('xsyn', 'word');
      ts_lexize
-----------------------
 {syn1,syn2,syn3}

mydb=# ALTER TEXT SEARCH DICTIONARY xsyn (RULES='my_rules', KEEPORIG=true);
ALTER TEXT SEARCH DICTIONARY

mydb=# SELECT ts_lexize('xsyn', 'word');
      ts_lexize
-----------------------
 {word,syn1,syn2,syn3}

mydb=# ALTER TEXT SEARCH DICTIONARY xsyn (RULES='my_rules', KEEPORIG=false, MATCHSYNONYMS=true);
ALTER TEXT SEARCH DICTIONARY

mydb=# SELECT ts_lexize('xsyn', 'syn1');
      ts_lexize
-----------------------
 {syn1,syn2,syn3}

mydb=# ALTER TEXT SEARCH DICTIONARY xsyn (RULES='my_rules', KEEPORIG=true, MATCHORIG=false, KEEPSYNONYMS=false);
ALTER TEXT SEARCH DICTIONARY

mydb=# SELECT ts_lexize('xsyn', 'syn1');
      ts_lexize
-----------------------
 {word}
</programlisting>

   Real-world usage will involve including it in a text search
   configuration as described in <xref linkend="textsearch">.
   That might look like this:

<programlisting>
ALTER TEXT SEARCH CONFIGURATION english
    ALTER MAPPING FOR word, asciiword WITH xsyn, english_stem;
</programlisting>

  </para>
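
  <para>
   To avoid modifying the built-in <literal>english</> configuration, one
   option is to work on a copy.  In this sketch the configuration name
   <literal>my_english</> and the sample text are hypothetical, and the
   <literal>xsyn</> dictionary is assumed to be set up with a valid rules
   file:

<programlisting>
-- Copy the built-in configuration and route word tokens through xsyn first
CREATE TEXT SEARCH CONFIGURATION my_english (COPY = english);
ALTER TEXT SEARCH CONFIGURATION my_english
    ALTER MAPPING FOR word, asciiword WITH xsyn, english_stem;

-- Show which dictionary handles each token and the lexemes it produces
SELECT alias, token, dictionary, lexemes
  FROM ts_debug('my_english', 'a word in a sentence');
</programlisting>

   Tokens that <literal>xsyn</> does not recognize fall through to
   <literal>english_stem</>, the next dictionary in the mapping.
  </para>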
 </sect2>

</sect1>