doc: Add more ICU rules examples

In particular, add an example EBCDIC collation.

Author: Daniel Verite <daniel@manitou-mail.org>
Discussion: https://www.postgresql.org/message-id/flat/35cc1684-e516-4a01-a256-351632d47066@manitou-mail.org
Peter Eisentraut 2023-08-23 11:23:42 +02:00
parent 27a36f79b6
commit 17ec2c5dfa
3 changed files with 62 additions and 13 deletions


@@ -1481,7 +1481,7 @@ SELECT 'x-y' = 'x_y' COLLATE level4; -- false
</sect3>
<sect3 id="icu-locale-examples">
-<title>Examples</title>
+<title>Collation Settings Examples</title>
<variablelist>
<varlistentry id="collation-managing-create-icu-de-u-co-phonebk-x-icu">
@@ -1530,6 +1530,62 @@ SELECT 'x-y' = 'x_y' COLLATE level4; -- false
</variablelist>
</sect3>
+<sect3 id="icu-tailoring-rules">
+<title>ICU Tailoring Rules</title>
+<para>
+If the options provided by the collation settings shown above are not
+sufficient, the order of collation elements can be changed with tailoring
+rules, whose syntax is detailed at <ulink
+url="https://unicode-org.github.io/icu/userguide/collation/customization/"></ulink>.
+</para>
+<para>
+This small example creates a collation based on the root locale with a
+tailoring rule:
+<programlisting>
+<![CDATA[CREATE COLLATION custom (provider = icu, locale = 'und', rules = '&V << w <<< W');]]>
+</programlisting>
+With this rule, the letter <quote>W</quote> is sorted after
+<quote>V</quote>, but is treated as a secondary difference similar to an
+accent. Rules like this are contained in the locale definitions of some
+languages. (Of course, if a locale definition already contains the
+desired rules, then they don't need to be specified again explicitly.)
+</para>
+<para>
+Here is a more complex example. The following statement sets up a
+collation named <literal>ebcdic</literal> with rules to sort US-ASCII
+characters in the order of the EBCDIC encoding.
+<programlisting>
+<![CDATA[CREATE COLLATION ebcdic (provider = icu, locale = 'und',
+rules = $$
+& ' ' < '.' < '<' < '(' < '+' < \|
+< '&' < '!' < '$' < '*' < ')' < ';'
+< '-' < '/' < ',' < '%' < '_' < '>' < '?'
+< '`' < ':' < '#' < '@' < \' < '=' < '"'
+<*a-r < '~' <*s-z < '^' < '[' < ']'
+< '{' <*A-I < '}' <*J-R < '\' <*S-Z <*0-9
+$$);]]>
+SELECT c
+FROM (VALUES ('a'), ('b'), ('A'), ('B'), ('1'), ('2'), ('!'), ('^')) AS x(c)
+ORDER BY c COLLATE ebcdic;
+c
+---
+!
+a
+b
+^
+A
+B
+1
+2
+</programlisting>
+</para>
+</sect3>
<sect3 id="icu-external-references">
<title>External References for ICU</title>


@@ -165,9 +165,8 @@ CREATE COLLATION [ IF NOT EXISTS ] <replaceable>name</replaceable> FROM <replace
<listitem>
<para>
Specifies additional collation rules to customize the behavior of the
-collation. This is supported for ICU only. See <ulink
-url="https://unicode-org.github.io/icu/userguide/collation/customization/"/>
-for details on the syntax.
+collation. This is supported for ICU only. See <xref
+linkend="icu-tailoring-rules"/> for details.
</para>
</listitem>
</varlistentry>
@@ -257,12 +256,8 @@ CREATE COLLATION german_phonebook (provider = icu, locale = 'de-u-co-phonebk');
<programlisting>
<![CDATA[CREATE COLLATION custom (provider = icu, locale = 'und', rules = '&V << w <<< W');]]>
</programlisting>
-With this rule, the letter <quote>W</quote> is sorted after
-<quote>V</quote>, but is treated as a secondary difference similar to an
-accent. Rules like this are contained in the locale definitions of some
-languages. (Of course, if a locale definition already contains the desired
-rules, then they don't need to be specified again explicitly.) See the ICU
-documentation for further details and examples on the rules syntax.
+See <xref linkend="icu-tailoring-rules"/> for further details and examples
+on the rules syntax.
</para>
<para>


@@ -232,9 +232,7 @@ CREATE DATABASE <replaceable class="parameter">name</replaceable>
<para>
Specifies additional collation rules to customize the behavior of the
default collation of this database. This is supported for ICU only.
-See <ulink
-url="https://unicode-org.github.io/icu/userguide/collation/customization/"/>
-for details on the syntax.
+See <xref linkend="icu-tailoring-rules"/> for details.
</para>
</listitem>
</varlistentry>
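
Note: the expected output of the EBCDIC example in the first file can be roughly cross-checked outside PostgreSQL by sorting on EBCDIC byte values. The following is only a sketch and does not involve ICU or PostgreSQL. It assumes IBM code page 037 (Python's built-in `cp037` codec) as the EBCDIC variant, which the commit does not specify, and the helper name `ebcdic_sort` is hypothetical.

```python
# Assumption: code page 037 stands in for "the EBCDIC encoding";
# the commit itself does not name a specific EBCDIC code page.
def ebcdic_sort(values, codec="cp037"):
    """Return the strings sorted by their byte values in the given codec."""
    return sorted(values, key=lambda s: s.encode(codec))


# Same sample values as the SELECT in the documentation example.
sample = ["a", "b", "A", "B", "1", "2", "!", "^"]
print(ebcdic_sort(sample))
```

For these eight values the script prints `['!', 'a', 'b', '^', 'A', 'B', '1', '2']`, the same order as the query result shown in the patch. A different EBCDIC code page could place some punctuation characters differently, so the match holds only under the code page assumption above.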