mirror of
https://git.postgresql.org/git/postgresql.git
synced 2024-09-29 00:31:53 +02:00
3c49c6facb
Since some preparation work had already been done, the only source changes left were changing empty-element tags like <xref linkend="foo"> to <xref linkend="foo"/>, and changing the DOCTYPE. The source files are still named *.sgml, but they are actually XML files now. Renaming could be considered later. In the build system, the intermediate step to convert from SGML to XML is removed. Everything is built straight from the source files again. The OpenSP (or the old SP) package is no longer needed. The documentation toolchain instructions are updated and are much simpler now. Peter Eisentraut, Alexander Lakhin, Jürgen Purtz
150 lines
4.2 KiB
Plaintext
<!-- doc/src/sgml/dict-xsyn.sgml -->

<sect1 id="dict-xsyn" xreflabel="dict_xsyn">
 <title>dict_xsyn</title>

 <indexterm zone="dict-xsyn">
  <primary>dict_xsyn</primary>
 </indexterm>

 <para>
  <filename>dict_xsyn</filename> (Extended Synonym Dictionary) is an example of an
  add-on dictionary template for full-text search.  This dictionary type
  replaces words with groups of their synonyms, and so makes it possible to
  search for a word using any of its synonyms.
 </para>

 <sect2>
  <title>Configuration</title>

  <para>
   A <literal>dict_xsyn</literal> dictionary accepts the following options:
  </para>
  <itemizedlist>
   <listitem>
    <para>
     <literal>matchorig</literal> controls whether the original word is accepted by
     the dictionary. The default is <literal>true</literal>.
    </para>
   </listitem>
   <listitem>
    <para>
     <literal>matchsynonyms</literal> controls whether the synonyms are
     accepted by the dictionary. The default is <literal>false</literal>.
    </para>
   </listitem>
   <listitem>
    <para>
     <literal>keeporig</literal> controls whether the original word is included in
     the dictionary's output. The default is <literal>true</literal>.
    </para>
   </listitem>
   <listitem>
    <para>
     <literal>keepsynonyms</literal> controls whether the synonyms are included in
     the dictionary's output. The default is <literal>true</literal>.
    </para>
   </listitem>
   <listitem>
    <para>
     <literal>rules</literal> is the base name of the file containing the list of
     synonyms.  This file must be stored in
     <filename>$SHAREDIR/tsearch_data/</filename> (where <literal>$SHAREDIR</literal> means
     the <productname>PostgreSQL</productname> installation's shared-data directory).
     Its name must end in <literal>.rules</literal> (which is not to be included in
     the <literal>rules</literal> parameter).
    </para>
   </listitem>
  </itemizedlist>
  <para>
   The rules file has the following format:
  </para>
  <itemizedlist>
   <listitem>
    <para>
     Each line represents a group of synonyms for a single word, which is
     given first on the line. Synonyms are separated by whitespace, thus:
<programlisting>
word syn1 syn2 syn3
</programlisting>
    </para>
   </listitem>
   <listitem>
    <para>
     The sharp (<literal>#</literal>) sign is a comment delimiter. It may appear at
     any position in a line.  Everything from the sign to the end of the line
     is skipped.
    </para>
   </listitem>
  </itemizedlist>

  <para>
   See <filename>xsyn_sample.rules</filename>, which is installed in
   <filename>$SHAREDIR/tsearch_data/</filename>, for an example.
  </para>
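
  <para>
   As a purely illustrative sketch, a rules file following the format
   described above might look like the listing below.  The file name
   <filename>my_rules.rules</filename> and the words in it are hypothetical;
   they are not the contents of <filename>xsyn_sample.rules</filename>.
<programlisting>
# synonym groups for a hypothetical astronomy corpus
supernova sn sne      # a comment may also follow a rule
telescope scope
</programlisting>
   If this file were stored as
   <filename>$SHAREDIR/tsearch_data/my_rules.rules</filename>, the
   <literal>rules</literal> parameter would be set to
   <literal>my_rules</literal>, without the <literal>.rules</literal> suffix.
  </para>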
 </sect2>

 <sect2>
  <title>Usage</title>

  <para>
   Installing the <literal>dict_xsyn</literal> extension creates a text search
   template <literal>xsyn_template</literal> and a dictionary <literal>xsyn</literal>
   based on it, with default parameters.  You can alter the
   parameters, for example

<programlisting>
mydb=# ALTER TEXT SEARCH DICTIONARY xsyn (RULES='my_rules', KEEPORIG=false);
ALTER TEXT SEARCH DICTIONARY
</programlisting>

   or create new dictionaries based on the template.
  </para>
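
  <para>
   A new dictionary can be defined from the template with
   <literal>CREATE TEXT SEARCH DICTIONARY</literal>.  The following is a
   minimal sketch rather than a recommended setup: the dictionary name
   <literal>my_xsyn</literal> is hypothetical, and a file named
   <filename>my_rules.rules</filename> is assumed to exist in
   <filename>$SHAREDIR/tsearch_data/</filename>.
<programlisting>
-- install the extension first if it is not already present
CREATE EXTENSION IF NOT EXISTS dict_xsyn;

-- my_xsyn is a hypothetical name; options not listed keep their defaults
CREATE TEXT SEARCH DICTIONARY my_xsyn (
    TEMPLATE = xsyn_template,
    RULES = 'my_rules',
    MATCHSYNONYMS = true
);
</programlisting>
   Defining a separate dictionary this way leaves the parameters of the
   stock <literal>xsyn</literal> dictionary untouched.
  </para>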

  <para>
   To test the dictionary, you can try

<programlisting>
mydb=# SELECT ts_lexize('xsyn', 'word');
      ts_lexize
-----------------------
 {syn1,syn2,syn3}

mydb=# ALTER TEXT SEARCH DICTIONARY xsyn (RULES='my_rules', KEEPORIG=true);
ALTER TEXT SEARCH DICTIONARY

mydb=# SELECT ts_lexize('xsyn', 'word');
      ts_lexize
-----------------------
 {word,syn1,syn2,syn3}

mydb=# ALTER TEXT SEARCH DICTIONARY xsyn (RULES='my_rules', KEEPORIG=false, MATCHSYNONYMS=true);
ALTER TEXT SEARCH DICTIONARY

mydb=# SELECT ts_lexize('xsyn', 'syn1');
      ts_lexize
-----------------------
 {syn1,syn2,syn3}

mydb=# ALTER TEXT SEARCH DICTIONARY xsyn (RULES='my_rules', KEEPORIG=true, MATCHORIG=false, KEEPSYNONYMS=false);
ALTER TEXT SEARCH DICTIONARY

mydb=# SELECT ts_lexize('xsyn', 'syn1');
      ts_lexize
-----------------------
 {word}
</programlisting>

   Real-world usage will involve including the dictionary in a text search
   configuration as described in <xref linkend="textsearch"/>.
   That might look like this:

<programlisting>
ALTER TEXT SEARCH CONFIGURATION english
    ALTER MAPPING FOR word, asciiword WITH xsyn, english_stem;
</programlisting>

  </para>
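
  <para>
   To check which dictionary handles a given token under such a
   configuration, the built-in <literal>ts_debug</literal> function can be
   used.  This is only a sketch: it assumes that the mapping shown above has
   been applied and that <literal>word</literal> appears in the rules file.
   The output is omitted because it depends on the rules and parameters in
   effect.
<programlisting>
-- show the dictionaries consulted for each token and the one that matched
SELECT alias, token, dictionaries, dictionary, lexemes
  FROM ts_debug('english', 'word');
</programlisting>
   Tokens that <literal>xsyn</literal> does not recognize are passed on to
   the next dictionary in the mapping, here <literal>english_stem</literal>.
  </para>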
 </sect2>

</sect1>