<!-- $PostgreSQL: pgsql/doc/src/sgml/charset.sgml,v 2.94 2009/05/06 16:15:20 tgl Exp $ -->

<chapter id="charset">
<title>Localization</>

<para>
This chapter describes the available localization features from the
point of view of the administrator.
<productname>PostgreSQL</productname> supports localization with
two approaches:

<itemizedlist>
<listitem>
<para>
Using the locale features of the operating system to provide
locale-specific collation order, number formatting, translated
messages, and other aspects.
</para>
</listitem>

<listitem>
<para>
Providing a number of different character sets to support storing text
in all kinds of languages, and providing character set translation
between client and server.
</para>
</listitem>
</itemizedlist>
</para>

<sect1 id="locale">
<title>Locale Support</title>

<indexterm zone="locale"><primary>locale</></>

<para>
<firstterm>Locale</> support refers to an application respecting
cultural preferences regarding alphabets, sorting, number
formatting, etc. <productname>PostgreSQL</> uses the standard ISO
C and <acronym>POSIX</acronym> locale facilities provided by the server operating
system. For additional information refer to the documentation of your
system.
</para>

<sect2>
<title>Overview</>

<para>
Locale support is automatically initialized when a database
cluster is created using <command>initdb</command>.
<command>initdb</command> will initialize the database cluster
with the locale setting of its execution environment by default,
so if your system is already set to use the locale that you want
in your database cluster then there is nothing else you need to
do. If you want to use a different locale (or you are not sure
which locale your system is set to), you can tell
<command>initdb</command> exactly which locale to use by
specifying the <option>--locale</option> option. For example:
<screen>
initdb --locale=sv_SE
</screen>
</para>

<para>
This example for Unix systems sets the locale to Swedish
(<literal>sv</>) as spoken
in Sweden (<literal>SE</>). Other possibilities might be
<literal>en_US</> (U.S. English) and <literal>fr_CA</> (French
Canadian). If more than one character set can be used for a
locale, the specification also names the character set, like this:
<literal>cs_CZ.ISO8859-2</>. Which locales are available, and under which
names, depends on what was provided by the operating
system vendor and what was installed. On most Unix systems, the command
<literal>locale -a</> will provide a list of available locales.
Windows uses more verbose locale names, such as <literal>German_Germany</>
or <literal>Swedish_Sweden.1252</>, but the principles are the same.
</para>

<para>
Occasionally it is useful to mix rules from several locales, e.g.,
use English collation rules but Spanish messages. To support that, a
set of locale subcategories exists, each controlling only a certain
aspect of the localization rules:

<informaltable>
<tgroup cols="2">
<tbody>
<row>
<entry><envar>LC_COLLATE</></>
<entry>String sort order</>
</row>
<row>
<entry><envar>LC_CTYPE</></>
<entry>Character classification (What is a letter? Its upper-case equivalent?)</>
</row>
<row>
<entry><envar>LC_MESSAGES</></>
<entry>Language of messages</>
</row>
<row>
<entry><envar>LC_MONETARY</></>
<entry>Formatting of currency amounts</>
</row>
<row>
<entry><envar>LC_NUMERIC</></>
<entry>Formatting of numbers</>
</row>
<row>
<entry><envar>LC_TIME</></>
<entry>Formatting of dates and times</>
</row>
</tbody>
</tgroup>
</informaltable>

The category names translate into names of
<command>initdb</command> options to override the locale choice
for a specific category. For instance, to set the locale to
French Canadian, but use U.S. rules for formatting currency, use
<literal>initdb --locale=fr_CA --lc-monetary=en_US</literal>.
</para>

<para>
If you want the system to behave as if it had no locale support,
use the special locale <literal>C</> or <literal>POSIX</>.
</para>
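
<para>
As a brief illustration, a cluster without locale-specific behavior
could be initialized as follows. The data directory path is a
placeholder chosen for this sketch, not something this chapter
prescribes:
<screen>
initdb --locale=C -D /path/to/data
</screen>
</para>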

<para>
Some locale categories must have their values
fixed when the database is created. You can use different settings
for different databases, but once a database is created, you cannot
change them for that database anymore. <literal>LC_COLLATE</literal>
and <literal>LC_CTYPE</literal> are these categories. They affect
the sort order of indexes, so they must be kept fixed, or indexes on
text columns would become corrupt. The default values for these
categories are determined when <command>initdb</command> is run, and
those values are used when new databases are created, unless
specified otherwise in the <command>CREATE DATABASE</command> command.
</para>
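
<para>
As a sketch of such an override, the following statement creates a
database whose collation and character classification differ from the
cluster defaults. The database name <literal>sales_sv</> and the locale
name <literal>sv_SE</> are assumptions chosen for illustration; the
locale must exist on the server's operating system, and its character
set must be compatible with the database encoding:
<programlisting>
-- template0 is used because the requested settings might not match
-- the locale of the default template database
CREATE DATABASE sales_sv
    TEMPLATE template0
    LC_COLLATE 'sv_SE'
    LC_CTYPE 'sv_SE';
</programlisting>
</para>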

<para>
The other locale categories can be changed whenever desired
by setting the server configuration parameters
that have the same name as the locale categories (see <xref
linkend="runtime-config-client-format"> for details). The values
that are chosen by <command>initdb</command> are actually only written
into the configuration file <filename>postgresql.conf</filename> to
serve as defaults when the server is started. If you delete these
assignments from <filename>postgresql.conf</filename> then the
server will inherit the settings from its execution environment.
</para>
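
<para>
For illustration, the assignments in
<filename>postgresql.conf</filename> might look something like the
following. The locale values shown are assumed for this sketch and are
not taken from any particular installation:
<programlisting>
lc_messages = 'en_US'        # language of messages
lc_monetary = 'en_US'        # formatting of currency amounts
lc_numeric = 'en_US'         # formatting of numbers
lc_time = 'en_US'            # formatting of dates and times
</programlisting>
A session can also change one of these settings for itself, for
example (again with an assumed locale name):
<programlisting>
SET lc_monetary TO 'fr_CA';
SHOW lc_monetary;
</programlisting>
</para>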

<para>
Note that the locale behavior of the server is determined by the
environment variables seen by the server, not by the environment
of any client. Therefore, be careful to configure the correct locale settings
before starting the server. A consequence of this is that if
client and server are set up in different locales, messages might
appear in different languages depending on where they originated.
</para>

<note>
<para>
When we speak of inheriting the locale from the execution
environment, this means the following on most operating systems:
For a given locale category, say the collation, the following
environment variables are consulted in this order until one is
found to be set: <envar>LC_ALL</envar>, <envar>LC_COLLATE</envar>
(or the variable corresponding to the respective category),
<envar>LANG</envar>. If none of these environment variables are
set then the locale defaults to <literal>C</literal>.
</para>

<para>
Some message localization libraries also look at the environment
variable <envar>LANGUAGE</envar> which overrides all other locale
settings for the purpose of setting the language of messages. If
in doubt, please refer to the documentation of your operating
system, in particular the documentation about
<application>gettext</>, for more information.
</para>
</note>
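
<para>
A rough way to observe this lookup order on a Unix system is sketched
below. It assumes that the <command>locale</command> utility is
available and that the named locales are installed; both are
assumptions about your platform, and the locale names are only
examples:
<screen>
# with LC_ALL unset, LC_COLLATE decides the collation and LANG
# supplies the categories that have no variable of their own
unset LC_ALL
LC_COLLATE=C LANG=sv_SE locale

# once LC_ALL is set, it takes precedence over LC_COLLATE and LANG
LC_ALL=en_US LC_COLLATE=C LANG=sv_SE locale
</screen>
</para>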

<para>
To enable messages to be translated to the user's preferred language,
<acronym>NLS</acronym> must have been selected at build time
(<literal>configure --enable-nls</>). All other locale support is
built in automatically.
</para>
</sect2>

<sect2>
<title>Behavior</>

<para>
The locale settings influence the following SQL features:

<itemizedlist>
<listitem>
<para>
Sort order in queries using <literal>ORDER BY</> or the standard
comparison operators on textual data
<indexterm><primary>ORDER BY</><secondary>and locales</></indexterm>
</para>
</listitem>

<listitem>
<para>
The ability to use indexes with <literal>LIKE</> clauses
<indexterm><primary>LIKE</><secondary>and locales</></indexterm>
</para>
</listitem>

<listitem>
<para>
The <function>upper</>, <function>lower</>, and <function>initcap</>
functions
<indexterm><primary>upper</><secondary>and locales</></indexterm>
<indexterm><primary>lower</><secondary>and locales</></indexterm>
</para>
</listitem>

<listitem>
<para>
The <function>to_char</> family of functions
<indexterm><primary>to_char</><secondary>and locales</></indexterm>
</para>
</listitem>
</itemizedlist>
</para>
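
<para>
The following queries touch each of these features. They are only a
sketch: the table <literal>people</> and its column <literal>name</>
are hypothetical, and the results depend on the locale settings of
your database and session, so no output is shown:
<programlisting>
-- sort order follows the database's collation
SELECT name FROM people ORDER BY name;

-- case conversion follows the character classification rules
SELECT upper(name), lower(name), initcap(name) FROM people;

-- L, G and D produce the locale's currency symbol,
-- group separator and decimal point
SELECT to_char(1234.5, 'L9G999D99');
</programlisting>
</para>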

<para>
The drawback of using locales other than <literal>C</> or
<literal>POSIX</> in <productname>PostgreSQL</> is its performance
impact. It slows character handling and prevents ordinary indexes
from being used by <literal>LIKE</>. For this reason use locales
only if you actually need them.
</para>

<para>
As a workaround to allow <productname>PostgreSQL</> to use indexes
with <literal>LIKE</> clauses under a non-C locale, several custom
operator classes exist. These allow the creation of an index that
performs a strict character-by-character comparison, ignoring
locale comparison rules. Refer to <xref linkend="indexes-opclass">
for more information.
</para>
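
<para>
For example, assuming a hypothetical table <literal>customers</> with
a <type>varchar</> column <literal>name</>, such an index could be
created as follows:
<programlisting>
CREATE INDEX customers_name_pattern_idx
    ON customers (name varchar_pattern_ops);

-- a left-anchored pattern that this index can support
SELECT * FROM customers WHERE name LIKE 'Ander%';
</programlisting>
An index of this kind serves pattern matching only; queries that rely
on ordinary ordering comparisons still need a regular index.
</para>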
</sect2>

<sect2>
<title>Problems</>

<para>
If locale support doesn't work according to the explanation above,
check that the locale support in your operating system is
correctly configured. To check what locales are installed on your
system, you can use the command <literal>locale -a</literal> if
your operating system provides it.
</para>

<para>
Check that <productname>PostgreSQL</> is actually using the locale
that you think it is. The <envar>LC_COLLATE</> and <envar>LC_CTYPE</>
settings are determined when a database is created, and cannot be
changed except by creating a new database. Other locale
settings including <envar>LC_MESSAGES</> and <envar>LC_MONETARY</>
are initially determined by the environment the server is started
in, but can be changed on the fly. You can check the active locale
settings using the <command>SHOW</> command.
</para>
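
<para>
For instance, the following commands report the settings in effect
for the current database and session:
<programlisting>
SHOW lc_collate;
SHOW lc_ctype;
SHOW lc_messages;
SHOW lc_monetary;
</programlisting>
</para>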

<para>
The directory <filename>src/test/locale</> in the source
distribution contains a test suite for
<productname>PostgreSQL</>'s locale support.
</para>

<para>
Client applications that handle server-side errors by parsing the
text of the error message will obviously have problems when the
server's messages are in a different language. Authors of such
applications are advised to make use of the error code scheme
instead.
</para>

<para>
Maintaining catalogs of message translations requires the ongoing
efforts of many volunteers who want to see
<productname>PostgreSQL</> speak their preferred language well.
If messages in your language are currently not available or not fully
translated, your assistance would be appreciated. If you want to
help, refer to <xref linkend="nls"> or write to the developers'
mailing list.
</para>
</sect2>
</sect1>


<sect1 id="multibyte">
<title>Character Set Support</title>

<indexterm zone="multibyte"><primary>character set</></>

<para>
The character set support in <productname>PostgreSQL</productname>
allows you to store text in a variety of character sets (also called
encodings), including
single-byte character sets such as the ISO 8859 series and
multiple-byte character sets such as <acronym>EUC</> (Extended Unix
Code), UTF-8, and Mule internal code. All supported character sets
can be used transparently by clients, but a few are not supported
for use within the server (that is, as a server-side encoding).
The default character set is selected while
initializing your <productname>PostgreSQL</productname> database
cluster using <command>initdb</>. It can be overridden when you
create a database, so you can have multiple
databases each with a different character set.
</para>

<para>
An important restriction, however, is that each database's character set
must be compatible with the database's <envar>LC_CTYPE</> and
<envar>LC_COLLATE</> locale settings. For <literal>C</> or
<literal>POSIX</> locale, any character set is allowed, but for other
locales there is only one character set that will work correctly.
(On Windows, however, UTF-8 encoding can be used with any locale.)
</para>
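
<para>
As a sketch of a matching pair, the following statement requests an
encoding together with locale settings that use the same character
set. The database name <literal>inventory</> and the locale name
<literal>sv_SE.UTF-8</> are assumptions made for illustration; locale
naming varies between operating systems:
<programlisting>
CREATE DATABASE inventory
    TEMPLATE template0
    ENCODING 'UTF8'
    LC_COLLATE 'sv_SE.UTF-8'
    LC_CTYPE 'sv_SE.UTF-8';
</programlisting>
</para>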

<sect2 id="multibyte-charset-supported">
<title>Supported Character Sets</title>

<para>
<xref linkend="charset-table"> shows the character sets available
for use in <productname>PostgreSQL</productname>.
</para>

<table id="charset-table">
<title><productname>PostgreSQL</productname> Character Sets</title>
<tgroup cols="6">
<thead>
<row>
<entry>Name</entry>
<entry>Description</entry>
<entry>Language</entry>
<entry>Server?</entry>
<!--
The Bytes/Char field is populated by looking at the values returned
by pg_wchar_table.mblen function for each encoding.
-->
<entry>Bytes/Char</entry>
<entry>Aliases</entry>
</row>
</thead>
<tbody>
<row>
<entry><literal>BIG5</literal></entry>
<entry>Big Five</entry>
<entry>Traditional Chinese</entry>
<entry>No</entry>
<entry>1-2</entry>
<entry><literal>WIN950</>, <literal>Windows950</></entry>
</row>
<row>
<entry><literal>EUC_CN</literal></entry>
<entry>Extended UNIX Code-CN</entry>
<entry>Simplified Chinese</entry>
<entry>Yes</entry>
<entry>1-3</entry>
<entry></entry>
</row>
<row>
<entry><literal>EUC_JP</literal></entry>
<entry>Extended UNIX Code-JP</entry>
<entry>Japanese</entry>
<entry>Yes</entry>
<entry>1-3</entry>
<entry></entry>
</row>
<row>
<entry><literal>EUC_JIS_2004</literal></entry>
<entry>Extended UNIX Code-JP, JIS X 0213</entry>
<entry>Japanese</entry>
<entry>Yes</entry>
<entry>1-3</entry>
<entry></entry>
</row>
<row>
<entry><literal>EUC_KR</literal></entry>
<entry>Extended UNIX Code-KR</entry>
<entry>Korean</entry>
<entry>Yes</entry>
<entry>1-3</entry>
<entry></entry>
</row>
<row>
<entry><literal>EUC_TW</literal></entry>
<entry>Extended UNIX Code-TW</entry>
<entry>Traditional Chinese, Taiwanese</entry>
<entry>Yes</entry>
<entry>1-3</entry>
<entry></entry>
</row>
<row>
<entry><literal>GB18030</literal></entry>
<entry>National Standard</entry>
<entry>Chinese</entry>
<entry>No</entry>
<entry>1-4</entry>
<entry></entry>
</row>
<row>
<entry><literal>GBK</literal></entry>
<entry>Extended National Standard</entry>
<entry>Simplified Chinese</entry>
<entry>No</entry>
<entry>1-2</entry>
<entry><literal>WIN936</>, <literal>Windows936</></entry>
</row>
<row>
<entry><literal>ISO_8859_5</literal></entry>
<entry>ISO 8859-5, <acronym>ECMA</> 113</entry>
<entry>Latin/Cyrillic</entry>
<entry>Yes</entry>
<entry>1</entry>
<entry></entry>
</row>
<row>
<entry><literal>ISO_8859_6</literal></entry>
<entry>ISO 8859-6, <acronym>ECMA</> 114</entry>
<entry>Latin/Arabic</entry>
<entry>Yes</entry>
<entry>1</entry>
<entry></entry>
</row>
<row>
<entry><literal>ISO_8859_7</literal></entry>
<entry>ISO 8859-7, <acronym>ECMA</> 118</entry>
<entry>Latin/Greek</entry>
<entry>Yes</entry>
<entry>1</entry>
<entry></entry>
</row>
<row>
<entry><literal>ISO_8859_8</literal></entry>
<entry>ISO 8859-8, <acronym>ECMA</> 121</entry>
<entry>Latin/Hebrew</entry>
<entry>Yes</entry>
<entry>1</entry>
<entry></entry>
</row>
<row>
<entry><literal>JOHAB</literal></entry>
<entry><acronym>JOHAB</></entry>
<entry>Korean (Hangul)</entry>
<entry>No</entry>
<entry>1-3</entry>
<entry></entry>
</row>
<row>
<entry><literal>KOI8R</literal></entry>
<entry><acronym>KOI</acronym>8-R</entry>
<entry>Cyrillic (Russian)</entry>
<entry>Yes</entry>
<entry>1</entry>
<entry><literal>KOI8</></entry>
</row>
<row>
<entry><literal>KOI8U</literal></entry>
<entry><acronym>KOI</acronym>8-U</entry>
<entry>Cyrillic (Ukrainian)</entry>
<entry>Yes</entry>
<entry>1</entry>
<entry></entry>
</row>
<row>
<entry><literal>LATIN1</literal></entry>
<entry>ISO 8859-1, <acronym>ECMA</> 94</entry>
<entry>Western European</entry>
<entry>Yes</entry>
<entry>1</entry>
<entry><literal>ISO88591</></entry>
</row>
<row>
<entry><literal>LATIN2</literal></entry>
<entry>ISO 8859-2, <acronym>ECMA</> 94</entry>
<entry>Central European</entry>
<entry>Yes</entry>
<entry>1</entry>
<entry><literal>ISO88592</></entry>
</row>
<row>
<entry><literal>LATIN3</literal></entry>
<entry>ISO 8859-3, <acronym>ECMA</> 94</entry>
<entry>South European</entry>
<entry>Yes</entry>
<entry>1</entry>
<entry><literal>ISO88593</></entry>
</row>
<row>
<entry><literal>LATIN4</literal></entry>
<entry>ISO 8859-4, <acronym>ECMA</> 94</entry>
<entry>North European</entry>
<entry>Yes</entry>
<entry>1</entry>
<entry><literal>ISO88594</></entry>
</row>
<row>
<entry><literal>LATIN5</literal></entry>
<entry>ISO 8859-9, <acronym>ECMA</> 128</entry>
<entry>Turkish</entry>
<entry>Yes</entry>
<entry>1</entry>
<entry><literal>ISO88599</></entry>
</row>
<row>
<entry><literal>LATIN6</literal></entry>
<entry>ISO 8859-10, <acronym>ECMA</> 144</entry>
<entry>Nordic</entry>
<entry>Yes</entry>
<entry>1</entry>
<entry><literal>ISO885910</></entry>
</row>
<row>
<entry><literal>LATIN7</literal></entry>
<entry>ISO 8859-13</entry>
<entry>Baltic</entry>
<entry>Yes</entry>
<entry>1</entry>
<entry><literal>ISO885913</></entry>
</row>
<row>
<entry><literal>LATIN8</literal></entry>
<entry>ISO 8859-14</entry>
<entry>Celtic</entry>
<entry>Yes</entry>
<entry>1</entry>
<entry><literal>ISO885914</></entry>
</row>
<row>
<entry><literal>LATIN9</literal></entry>
<entry>ISO 8859-15</entry>
<entry>LATIN1 with Euro and accents</entry>
<entry>Yes</entry>
<entry>1</entry>
<entry><literal>ISO885915</></entry>
</row>
<row>
<entry><literal>LATIN10</literal></entry>
<entry>ISO 8859-16, <acronym>ASRO</> SR 14111</entry>
<entry>Romanian</entry>
<entry>Yes</entry>
<entry>1</entry>
<entry><literal>ISO885916</></entry>
</row>
<row>
<entry><literal>MULE_INTERNAL</literal></entry>
<entry>Mule internal code</entry>
<entry>Multilingual Emacs</entry>
<entry>Yes</entry>
<entry>1-4</entry>
<entry></entry>
</row>
<row>
<entry><literal>SJIS</literal></entry>
<entry>Shift JIS</entry>
<entry>Japanese</entry>
<entry>No</entry>
<entry>1-2</entry>
<entry><literal>Mskanji</>, <literal>ShiftJIS</>, <literal>WIN932</>, <literal>Windows932</></entry>
</row>
<row>
<entry><literal>SHIFT_JIS_2004</literal></entry>
<entry>Shift JIS, JIS X 0213</entry>
<entry>Japanese</entry>
<entry>No</entry>
<entry>1-2</entry>
<entry></entry>
</row>
<row>
<entry><literal>SQL_ASCII</literal></entry>
<entry>unspecified (see text)</entry>
<entry><emphasis>any</></entry>
<entry>Yes</entry>
<entry>1</entry>
<entry></entry>
</row>
<row>
<entry><literal>UHC</literal></entry>
<entry>Unified Hangul Code</entry>
<entry>Korean</entry>
<entry>No</entry>
<entry>1-2</entry>
<entry><literal>WIN949</>, <literal>Windows949</></entry>
</row>
<row>
<entry><literal>UTF8</literal></entry>
<entry>Unicode, 8-bit</entry>
<entry><emphasis>all</></entry>
<entry>Yes</entry>
<entry>1-4</entry>
<entry><literal>Unicode</></entry>
</row>
|
2004-12-27 23:30:10 +01:00
|
|
|
<row>
|
2005-03-07 05:30:55 +01:00
|
|
|
<entry><literal>WIN866</literal></entry>
|
2005-03-13 03:54:34 +01:00
|
|
|
<entry>Windows CP866</entry>
|
|
|
|
<entry>Cyrillic</entry>
|
2006-07-28 17:33:17 +02:00
|
|
|
<entry>Yes</entry>
|
2005-03-14 03:14:42 +01:00
|
|
|
<entry>1</entry>
|
|
|
|
<entry><literal>ALT</></entry>
|
2004-12-27 23:30:10 +01:00
|
|
|
</row>
|
|
|
|
<row>
|
|
|
|
<entry><literal>WIN874</literal></entry>
|
2005-03-13 03:54:34 +01:00
|
|
|
<entry>Windows CP874</entry>
|
|
|
|
<entry>Thai</entry>
|
2006-07-28 17:33:17 +02:00
|
|
|
<entry>Yes</entry>
|
2005-03-14 03:14:42 +01:00
|
|
|
<entry>1</entry>
|
2005-03-13 03:33:03 +01:00
|
|
|
<entry></entry>
|
2004-12-27 23:30:10 +01:00
|
|
|
</row>
|
|
|
|
<row>
|
|
|
|
<entry><literal>WIN1250</literal></entry>
|
2005-03-13 03:54:34 +01:00
|
|
|
<entry>Windows CP1250</entry>
|
|
|
|
<entry>Central European</entry>
|
2006-07-28 17:33:17 +02:00
|
|
|
<entry>Yes</entry>
|
2005-03-14 03:14:42 +01:00
|
|
|
<entry>1</entry>
|
2005-03-13 03:33:03 +01:00
|
|
|
<entry></entry>
|
2004-12-27 23:30:10 +01:00
|
|
|
</row>
|
|
|
|
<row>
|
2005-03-07 05:30:55 +01:00
|
|
|
<entry><literal>WIN1251</literal></entry>
|
2005-03-13 03:54:34 +01:00
|
|
|
<entry>Windows CP1251</entry>
|
2005-03-15 03:30:33 +01:00
|
|
|
<entry>Cyrillic</entry>
|
2006-07-28 17:33:17 +02:00
|
|
|
<entry>Yes</entry>
|
2005-03-14 19:31:25 +01:00
|
|
|
<entry>1</entry>
|
2005-03-15 03:30:33 +01:00
|
|
|
<entry><literal>WIN</></entry>
|
2005-03-14 19:31:25 +01:00
|
|
|
</row>
|
|
|
|
<row>
|
|
|
|
<entry><literal>WIN1252</literal></entry>
|
|
|
|
<entry>Windows CP1252</entry>
|
2005-03-15 03:30:33 +01:00
|
|
|
<entry>Western European</entry>
|
2006-07-28 17:33:17 +02:00
|
|
|
<entry>Yes</entry>
|
2005-03-14 03:14:42 +01:00
|
|
|
<entry>1</entry>
|
2005-03-15 03:30:33 +01:00
|
|
|
<entry></entry>
|
2004-12-27 23:30:10 +01:00
|
|
|
</row>
|
2006-07-28 17:33:17 +02:00
|
|
|
<row>
|
|
|
|
<entry><literal>WIN1253</literal></entry>
|
|
|
|
<entry>Windows CP1253</entry>
|
|
|
|
<entry>Greek</entry>
|
|
|
|
<entry>Yes</entry>
|
|
|
|
<entry>1</entry>
|
|
|
|
<entry></entry>
|
|
|
|
</row>
|
|
|
|
<row>
|
|
|
|
<entry><literal>WIN1254</literal></entry>
|
|
|
|
<entry>Windows CP1254</entry>
|
|
|
|
<entry>Turkish</entry>
|
|
|
|
<entry>Yes</entry>
|
|
|
|
<entry>1</entry>
|
|
|
|
<entry></entry>
|
|
|
|
</row>
|
|
|
|
<row>
|
|
|
|
<entry><literal>WIN1255</literal></entry>
|
|
|
|
<entry>Windows CP1255</entry>
|
|
|
|
<entry>Hebrew</entry>
|
|
|
|
<entry>Yes</entry>
|
|
|
|
<entry>1</entry>
|
2006-02-18 17:15:23 +01:00
|
|
|
<entry></entry>
|
|
|
|
</row>
|
2004-12-27 23:30:10 +01:00
|
|
|
<row>
|
|
|
|
<entry><literal>WIN1256</literal></entry>
|
2005-03-13 03:54:34 +01:00
|
|
|
<entry>Windows CP1256</entry>
|
|
|
|
<entry>Arabic</entry>
|
2006-07-28 17:33:17 +02:00
|
|
|
<entry>Yes</entry>
|
|
|
|
<entry>1</entry>
|
|
|
|
<entry></entry>
|
|
|
|
</row>
|
|
|
|
<row>
|
|
|
|
<entry><literal>WIN1257</literal></entry>
|
|
|
|
<entry>Windows CP1257</entry>
|
|
|
|
<entry>Baltic</entry>
|
|
|
|
<entry>Yes</entry>
|
2005-03-14 03:14:42 +01:00
|
|
|
<entry>1</entry>
|
2005-03-13 03:33:03 +01:00
|
|
|
<entry></entry>
|
2004-12-27 23:30:10 +01:00
|
|
|
</row>
|
|
|
|
<row>
|
2005-03-07 05:30:55 +01:00
|
|
|
<entry><literal>WIN1258</literal></entry>
|
2005-03-13 05:35:06 +01:00
|
|
|
<entry>Windows CP1258</entry>
|
2005-03-13 03:54:34 +01:00
|
|
|
<entry>Vietnamese</entry>
|
2006-07-28 17:33:17 +02:00
|
|
|
<entry>Yes</entry>
|
2005-03-14 03:14:42 +01:00
|
|
|
<entry>1</entry>
|
|
|
|
<entry><literal>ABC</>, <literal>TCVN</>, <literal>TCVN5712</>, <literal>VSCII</></entry>
|
2004-12-27 23:30:10 +01:00
|
|
|
</row>
|
2000-09-12 07:37:09 +02:00
|
|
|
</tbody>
|
|
|
|
</tgroup>
|
|
|
|
</table>
|
|
|
|
|
2001-11-28 21:49:10 +01:00
|
|
|
<para>
|
2003-03-24 15:32:51 +01:00
|
|
|
Not all <acronym>API</>s support all the listed character sets. For example, the
|
2001-11-28 21:49:10 +01:00
|
|
|
<productname>PostgreSQL</>
|
|
|
|
JDBC driver does not support <literal>MULE_INTERNAL</>, <literal>LATIN6</>,
|
|
|
|
<literal>LATIN8</>, and <literal>LATIN10</>.
|
|
|
|
</para>
|
2005-10-13 23:43:43 +02:00
|
|
|
|
|
|
|
<para>
|
|
|
|
The <literal>SQL_ASCII</> setting behaves considerably differently
|
|
|
|
from the other settings. When the server character set is
|
|
|
|
<literal>SQL_ASCII</>, the server interprets byte values 0-127
|
|
|
|
according to the ASCII standard, while byte values 128-255 are taken
|
|
|
|
as uninterpreted characters. No encoding conversion will be done when
|
|
|
|
the setting is <literal>SQL_ASCII</>. Thus, this setting is not so
|
|
|
|
much a declaration that a specific encoding is in use, as a declaration
|
|
|
|
of ignorance about the encoding. In most cases, if you are
|
|
|
|
working with any non-ASCII data, it is unwise to use the
|
|
|
|
<literal>SQL_ASCII</> setting, because
|
|
|
|
<productname>PostgreSQL</productname> will be unable to help you by
|
|
|
|
converting or validating non-ASCII characters.
|
|
|
|
</para>
|
2002-07-24 07:51:56 +02:00
|
|
|
</sect2>
|
2009-03-26 21:55:49 +01:00
|
|
|
|
2000-09-12 07:37:09 +02:00
|
|
|
<sect2>
|
2003-03-24 15:32:51 +01:00
|
|
|
<title>Setting the Character Set</title>
|
2000-09-12 07:37:09 +02:00
|
|
|
|
|
|
|
<para>
|
2003-03-24 15:32:51 +01:00
|
|
|
<command>initdb</> defines the default character set
|
|
|
|
for a <productname>PostgreSQL</productname> cluster. For example,
|
2000-09-12 07:37:09 +02:00
|
|
|
|
2001-11-28 21:49:10 +01:00
|
|
|
<screen>
|
2003-03-13 02:30:29 +01:00
|
|
|
initdb -E EUC_JP
|
2001-11-28 21:49:10 +01:00
|
|
|
</screen>
|
2000-09-12 07:37:09 +02:00
|
|
|
|
2003-03-24 15:32:51 +01:00
|
|
|
sets the default character set (encoding) to
|
|
|
|
<literal>EUC_JP</literal> (Extended Unix Code for Japanese). You
|
|
|
|
can use the long option <option>--encoding</option> in place of
<option>-E</option>; the two are equivalent.
|
2002-07-24 07:51:56 +02:00
|
|
|
If no <option>-E</> or <option>--encoding</option> option is
|
2005-04-16 18:50:01 +02:00
|
|
|
given, <command>initdb</> attempts to determine the appropriate
|
|
|
|
encoding to use based on the specified or default locale.
|
2000-09-12 07:37:09 +02:00
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
2008-09-23 11:20:39 +02:00
|
|
|
You can specify a non-default encoding at database creation time,
|
|
|
|
provided that the encoding is compatible with the selected locale:
|
2000-09-12 07:37:09 +02:00
|
|
|
|
2001-11-28 21:49:10 +01:00
|
|
|
<screen>
|
2008-09-23 11:20:39 +02:00
|
|
|
createdb -E EUC_KR -T template0 --lc-collate=ko_KR.euckr --lc-ctype=ko_KR.euckr korean
|
2001-11-28 21:49:10 +01:00
|
|
|
</screen>
|
2000-09-12 07:37:09 +02:00
|
|
|
|
2003-03-24 15:32:51 +01:00
|
|
|
This will create a database named <literal>korean</literal> that
|
2008-09-23 11:20:39 +02:00
|
|
|
uses the character set <literal>EUC_KR</literal>, and locale <literal>ko_KR</literal>.
|
|
|
|
Another way to accomplish this is to use this SQL command:
|
2000-09-12 07:37:09 +02:00
|
|
|
|
2001-11-28 21:49:10 +01:00
|
|
|
<programlisting>
|
2009-04-06 10:42:53 +02:00
|
|
|
CREATE DATABASE korean WITH ENCODING 'EUC_KR' LC_COLLATE='ko_KR.euckr' LC_CTYPE='ko_KR.euckr' TEMPLATE=template0;
|
2001-11-28 21:49:10 +01:00
|
|
|
</programlisting>
|
2000-09-12 07:37:09 +02:00
|
|
|
|
2009-05-06 18:15:21 +02:00
|
|
|
Notice that the above commands specify copying the <literal>template0</>
|
|
|
|
database. When copying any other database, the encoding and locale
|
|
|
|
settings cannot be changed from those of the source database, because
|
|
|
|
that might result in corrupt data. For more information see
|
|
|
|
<xref linkend="manage-ag-templatedbs">.
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
2003-03-24 15:32:51 +01:00
|
|
|
The encoding for a database is stored in the system catalog
|
2007-09-29 00:25:49 +02:00
|
|
|
<literal>pg_database</literal>. You can see it by using the
|
2003-03-24 15:32:51 +01:00
|
|
|
<option>-l</option> option or the <command>\l</command> command
|
|
|
|
of <command>psql</command>.
|
2000-09-12 07:37:09 +02:00
|
|
|
|
2001-11-28 21:49:10 +01:00
|
|
|
<screen>
|
|
|
|
$ <userinput>psql -l</userinput>
|
2008-09-23 11:20:39 +02:00
|
|
|
List of databases
|
|
|
|
Name | Owner | Encoding | Collation | Ctype | Access Privileges
|
|
|
|
-----------+----------+-----------+-------------+-------------+-------------------------------------
|
|
|
|
clocaledb | hlinnaka | SQL_ASCII | C | C |
|
|
|
|
englishdb | hlinnaka | UTF8 | en_GB.UTF8 | en_GB.UTF8 |
|
|
|
|
japanese | hlinnaka | UTF8 | ja_JP.UTF8 | ja_JP.UTF8 |
|
|
|
|
korean | hlinnaka | EUC_KR | ko_KR.euckr | ko_KR.euckr |
|
|
|
|
postgres | hlinnaka | UTF8 | fi_FI.UTF8 | fi_FI.UTF8 |
|
|
|
|
template0 | hlinnaka | UTF8 | fi_FI.UTF8 | fi_FI.UTF8 | {=c/hlinnaka,hlinnaka=CTc/hlinnaka}
|
|
|
|
template1 | hlinnaka | UTF8 | fi_FI.UTF8 | fi_FI.UTF8 | {=c/hlinnaka,hlinnaka=CTc/hlinnaka}
|
|
|
|
(7 rows)
|
2001-11-28 21:49:10 +01:00
|
|
|
</screen>
|
2000-09-12 07:37:09 +02:00
|
|
|
</para>
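<para>
As a rough sketch, the same information can also be read directly from
<literal>pg_database</literal> with a query like the one below. It assumes
a server that has the per-database <structfield>datcollate</structfield>
and <structfield>datctype</structfield> columns;
<function>pg_encoding_to_char</function> translates the numeric encoding
identifier stored in the catalog into the encoding name.

<programlisting>
-- List each database with its encoding name and locale settings.
SELECT datname,
       pg_encoding_to_char(encoding) AS encoding,
       datcollate,
       datctype
FROM pg_database
ORDER BY datname;
</programlisting>
</para>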
|
2004-12-27 23:30:10 +01:00
|
|
|
|
|
|
|
<important>
|
|
|
|
<para>
|
2007-09-29 00:25:49 +02:00
|
|
|
On most modern operating systems, <productname>PostgreSQL</productname>
|
|
|
|
can determine which character set is implied by an <envar>LC_CTYPE</>
|
2009-05-06 18:15:21 +02:00
|
|
|
setting, and it will enforce that only the matching database encoding is
|
2007-09-29 00:25:49 +02:00
|
|
|
used. On older systems it is your responsibility to ensure that you use
|
|
|
|
the encoding expected by the locale you have selected. A mistake in
|
|
|
|
this area is likely to lead to strange misbehavior of locale-dependent
|
|
|
|
operations such as sorting.
|
2004-12-27 23:30:10 +01:00
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
2007-09-29 00:25:49 +02:00
|
|
|
<productname>PostgreSQL</productname> will allow superusers to create
|
|
|
|
databases with <literal>SQL_ASCII</> encoding even when
|
|
|
|
<envar>LC_CTYPE</> is not <literal>C</> or <literal>POSIX</>. As noted
|
|
|
|
above, <literal>SQL_ASCII</> does not enforce that the data stored in
|
|
|
|
the database has any particular encoding, and so this choice poses risks
|
|
|
|
of locale-dependent misbehavior. Using this combination of settings is
|
|
|
|
deprecated and may someday be forbidden altogether.
|
2004-12-27 23:30:10 +01:00
|
|
|
</para>
|
|
|
|
</important>
|
2000-09-12 07:37:09 +02:00
|
|
|
</sect2>
|
|
|
|
|
|
|
|
<sect2>
|
2003-03-24 15:32:51 +01:00
|
|
|
<title>Automatic Character Set Conversion Between Server and Client</title>
|
2000-09-12 07:37:09 +02:00
|
|
|
|
|
|
|
<para>
|
2003-03-24 15:32:51 +01:00
|
|
|
<productname>PostgreSQL</productname> supports automatic
|
|
|
|
character set conversion between server and client for certain
|
2006-07-28 17:33:17 +02:00
|
|
|
character set combinations. The conversion information is stored in the
|
|
|
|
<literal>pg_conversion</> system catalog. <productname>PostgreSQL</>
|
|
|
|
comes with some predefined conversions, as shown in <xref
|
|
|
|
linkend="multibyte-translation-table">. You can create a new
|
|
|
|
conversion using the SQL command <command>CREATE CONVERSION</command>.
|
2001-11-28 21:49:10 +01:00
|
|
|
</para>
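<para>
To see which conversions a particular installation provides, the catalog
can be queried directly. The following is a minimal sketch; the column
aliases are chosen here only for readability, and the query lists just the
conversions marked as the default for their encoding pair.

<programlisting>
-- Show the default conversion for each source/destination encoding pair.
SELECT conname,
       pg_encoding_to_char(conforencoding) AS source_encoding,
       pg_encoding_to_char(contoencoding) AS destination_encoding
FROM pg_conversion
WHERE condefault
ORDER BY conname;
</programlisting>
</para>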
|
2000-09-12 07:37:09 +02:00
|
|
|
|
2003-03-24 15:32:51 +01:00
|
|
|
<table id="multibyte-translation-table">
|
|
|
|
<title>Client/Server Character Set Conversions</title>
|
2000-09-12 07:37:09 +02:00
|
|
|
<tgroup cols="2">
|
|
|
|
<thead>
|
2004-12-27 23:30:10 +01:00
|
|
|
<row>
|
|
|
|
<entry>Server Character Set</entry>
|
|
|
|
<entry>Available Client Character Sets</entry>
|
|
|
|
</row>
|
2000-09-12 07:37:09 +02:00
|
|
|
</thead>
|
|
|
|
<tbody>
|
2004-12-27 23:30:10 +01:00
|
|
|
<row>
|
2005-03-13 03:02:44 +01:00
|
|
|
<entry><literal>BIG5</literal></entry>
|
2005-03-14 19:31:25 +01:00
|
|
|
<entry><emphasis>not supported as a server encoding</emphasis>
|
2005-03-13 03:02:44 +01:00
|
|
|
</entry>
|
|
|
|
</row>
|
|
|
|
<row>
|
|
|
|
<entry><literal>EUC_CN</literal></entry>
|
2005-03-13 23:04:29 +01:00
|
|
|
<entry><emphasis>EUC_CN</emphasis>,
|
2005-03-13 06:11:49 +01:00
|
|
|
<literal>MULE_INTERNAL</literal>,
|
|
|
|
<literal>UTF8</literal>
|
2004-12-27 23:30:10 +01:00
|
|
|
</entry>
|
|
|
|
</row>
|
|
|
|
<row>
|
|
|
|
<entry><literal>EUC_JP</literal></entry>
|
2005-03-13 23:04:29 +01:00
|
|
|
<entry><emphasis>EUC_JP</emphasis>,
|
2005-03-13 06:11:49 +01:00
|
|
|
<literal>MULE_INTERNAL</literal>,
|
|
|
|
<literal>SJIS</literal>,
|
|
|
|
<literal>UTF8</literal>
|
2004-12-27 23:30:10 +01:00
|
|
|
</entry>
|
|
|
|
</row>
|
|
|
|
<row>
|
2005-03-13 03:02:44 +01:00
|
|
|
<entry><literal>EUC_KR</literal></entry>
|
2005-03-13 23:04:29 +01:00
|
|
|
<entry><emphasis>EUC_KR</emphasis>,
|
2005-03-13 06:11:49 +01:00
|
|
|
<literal>MULE_INTERNAL</literal>,
|
|
|
|
<literal>UTF8</literal>
|
2004-12-27 23:30:10 +01:00
|
|
|
</entry>
|
|
|
|
</row>
|
|
|
|
<row>
|
2005-03-13 03:02:44 +01:00
|
|
|
<entry><literal>EUC_TW</literal></entry>
|
2005-03-13 23:04:29 +01:00
|
|
|
<entry><emphasis>EUC_TW</emphasis>,
|
2005-03-13 06:11:49 +01:00
|
|
|
<literal>BIG5</literal>,
|
|
|
|
<literal>MULE_INTERNAL</literal>,
|
|
|
|
<literal>UTF8</literal>
|
2005-03-13 03:02:44 +01:00
|
|
|
</entry>
|
|
|
|
</row>
|
|
|
|
<row>
|
|
|
|
<entry><literal>GB18030</literal></entry>
|
2005-03-14 19:31:25 +01:00
|
|
|
<entry><emphasis>not supported as a server encoding</emphasis>
|
2005-03-13 03:02:44 +01:00
|
|
|
</entry>
|
|
|
|
</row>
|
|
|
|
<row>
|
|
|
|
<entry><literal>GBK</literal></entry>
|
2005-03-14 19:31:25 +01:00
|
|
|
<entry><emphasis>not supported as a server encoding</emphasis>
|
2005-03-13 03:02:44 +01:00
|
|
|
</entry>
|
|
|
|
</row>
|
|
|
|
<row>
|
|
|
|
<entry><literal>ISO_8859_5</literal></entry>
|
2005-03-13 23:04:29 +01:00
|
|
|
<entry><emphasis>ISO_8859_5</emphasis>,
|
2005-03-13 06:11:49 +01:00
|
|
|
<literal>KOI8</literal>,
|
2005-03-13 03:02:44 +01:00
|
|
|
<literal>MULE_INTERNAL</literal>,
|
2005-03-13 06:11:49 +01:00
|
|
|
<literal>UTF8</literal>,
|
2005-03-13 03:02:44 +01:00
|
|
|
<literal>WIN866</literal>,
|
2005-03-13 06:11:49 +01:00
|
|
|
<literal>WIN1251</literal>
|
2005-03-13 03:02:44 +01:00
|
|
|
</entry>
|
|
|
|
</row>
|
|
|
|
<row>
|
|
|
|
<entry><literal>ISO_8859_6</literal></entry>
|
2005-03-13 23:04:29 +01:00
|
|
|
<entry><emphasis>ISO_8859_6</emphasis>,
|
2005-03-13 03:02:44 +01:00
|
|
|
<literal>UTF8</literal>
|
|
|
|
</entry>
|
|
|
|
</row>
|
|
|
|
<row>
|
|
|
|
<entry><literal>ISO_8859_7</literal></entry>
|
2005-03-13 23:04:29 +01:00
|
|
|
<entry><emphasis>ISO_8859_7</emphasis>,
|
2005-03-13 03:02:44 +01:00
|
|
|
<literal>UTF8</literal>
|
|
|
|
</entry>
|
|
|
|
</row>
|
|
|
|
<row>
|
|
|
|
<entry><literal>ISO_8859_8</literal></entry>
|
2005-03-13 23:04:29 +01:00
|
|
|
<entry><emphasis>ISO_8859_8</emphasis>,
|
2005-03-13 03:02:44 +01:00
|
|
|
<literal>UTF8</literal>
|
2004-12-27 23:30:10 +01:00
|
|
|
</entry>
|
|
|
|
</row>
|
|
|
|
<row>
|
|
|
|
<entry><literal>JOHAB</literal></entry>
|
2005-03-13 23:04:29 +01:00
|
|
|
<entry><emphasis>JOHAB</emphasis>,
|
2005-03-13 06:11:49 +01:00
|
|
|
<literal>UTF8</literal>
|
2004-12-27 23:30:10 +01:00
|
|
|
</entry>
|
|
|
|
</row>
|
|
|
|
<row>
|
2005-03-13 03:02:44 +01:00
|
|
|
<entry><literal>KOI8</literal></entry>
|
2005-03-13 23:04:29 +01:00
|
|
|
<entry><emphasis>KOI8</emphasis>,
|
2005-03-13 06:11:49 +01:00
|
|
|
<literal>ISO_8859_5</literal>,
|
|
|
|
<literal>MULE_INTERNAL</literal>,
|
|
|
|
<literal>UTF8</literal>,
|
|
|
|
<literal>WIN866</literal>,
|
|
|
|
<literal>WIN1251</literal>
|
2004-12-27 23:30:10 +01:00
|
|
|
</entry>
|
|
|
|
</row>
|
2009-03-26 21:55:49 +01:00
|
|
|
<row>
|
2004-12-27 23:30:10 +01:00
|
|
|
<entry><literal>LATIN1</literal></entry>
|
2005-03-13 23:04:29 +01:00
|
|
|
<entry><emphasis>LATIN1</emphasis>,
|
2005-03-13 06:11:49 +01:00
|
|
|
<literal>MULE_INTERNAL</literal>,
|
|
|
|
<literal>UTF8</literal>
|
2004-12-27 23:30:10 +01:00
|
|
|
</entry>
|
|
|
|
</row>
|
2009-03-26 21:55:49 +01:00
|
|
|
<row>
|
2004-12-27 23:30:10 +01:00
|
|
|
<entry><literal>LATIN2</literal></entry>
|
2005-03-13 23:04:29 +01:00
|
|
|
<entry><emphasis>LATIN2</emphasis>,
|
2005-03-13 06:11:49 +01:00
|
|
|
<literal>MULE_INTERNAL</literal>,
|
2005-03-07 05:30:55 +01:00
|
|
|
<literal>UTF8</literal>,
|
2005-03-13 06:11:49 +01:00
|
|
|
<literal>WIN1250</literal>
|
2004-12-27 23:30:10 +01:00
|
|
|
</entry>
|
|
|
|
</row>
|
2009-03-26 21:55:49 +01:00
|
|
|
<row>
|
2004-12-27 23:30:10 +01:00
|
|
|
<entry><literal>LATIN3</literal></entry>
|
2005-03-13 23:04:29 +01:00
|
|
|
<entry><emphasis>LATIN3</emphasis>,
|
2005-03-13 06:11:49 +01:00
|
|
|
<literal>MULE_INTERNAL</literal>,
|
|
|
|
<literal>UTF8</literal>
|
2004-12-27 23:30:10 +01:00
|
|
|
</entry>
|
|
|
|
</row>
|
2009-03-26 21:55:49 +01:00
|
|
|
<row>
|
2004-12-27 23:30:10 +01:00
|
|
|
<entry><literal>LATIN4</literal></entry>
|
2005-03-13 23:04:29 +01:00
|
|
|
<entry><emphasis>LATIN4</emphasis>,
|
2005-03-13 06:11:49 +01:00
|
|
|
<literal>MULE_INTERNAL</literal>,
|
|
|
|
<literal>UTF8</literal>
|
2004-12-27 23:30:10 +01:00
|
|
|
</entry>
|
|
|
|
</row>
|
2009-03-26 21:55:49 +01:00
|
|
|
<row>
|
2004-12-27 23:30:10 +01:00
|
|
|
<entry><literal>LATIN5</literal></entry>
|
2005-03-13 23:04:29 +01:00
|
|
|
<entry><emphasis>LATIN5</emphasis>,
|
2005-03-13 06:11:49 +01:00
|
|
|
<literal>UTF8</literal>
|
2004-12-27 23:30:10 +01:00
|
|
|
</entry>
|
|
|
|
</row>
|
2009-03-26 21:55:49 +01:00
|
|
|
<row>
|
2004-12-27 23:30:10 +01:00
|
|
|
<entry><literal>LATIN6</literal></entry>
|
2005-03-13 23:04:29 +01:00
|
|
|
<entry><emphasis>LATIN6</emphasis>,
|
2005-03-13 06:11:49 +01:00
|
|
|
<literal>UTF8</literal>
|
2004-12-27 23:30:10 +01:00
|
|
|
</entry>
|
|
|
|
</row>
|
2009-03-26 21:55:49 +01:00
|
|
|
<row>
|
2004-12-27 23:30:10 +01:00
|
|
|
<entry><literal>LATIN7</literal></entry>
|
2005-03-13 23:04:29 +01:00
|
|
|
<entry><emphasis>LATIN7</emphasis>,
|
2005-03-13 06:11:49 +01:00
|
|
|
<literal>UTF8</literal>
|
2004-12-27 23:30:10 +01:00
|
|
|
</entry>
|
|
|
|
</row>
|
2009-03-26 21:55:49 +01:00
|
|
|
<row>
|
2004-12-27 23:30:10 +01:00
|
|
|
<entry><literal>LATIN8</literal></entry>
|
2005-03-13 23:04:29 +01:00
|
|
|
<entry><emphasis>LATIN8</emphasis>,
|
2005-03-13 06:11:49 +01:00
|
|
|
<literal>UTF8</literal>
|
2004-12-27 23:30:10 +01:00
|
|
|
</entry>
|
|
|
|
</row>
|
2009-03-26 21:55:49 +01:00
|
|
|
<row>
|
2004-12-27 23:30:10 +01:00
|
|
|
<entry><literal>LATIN9</literal></entry>
|
2005-03-13 23:04:29 +01:00
|
|
|
<entry><emphasis>LATIN9</emphasis>,
|
2005-03-13 06:11:49 +01:00
|
|
|
<literal>UTF8</literal>
|
2004-12-27 23:30:10 +01:00
|
|
|
</entry>
|
|
|
|
</row>
|
2009-03-26 21:55:49 +01:00
|
|
|
<row>
|
2004-12-27 23:30:10 +01:00
|
|
|
<entry><literal>LATIN10</literal></entry>
|
2005-03-13 23:04:29 +01:00
|
|
|
<entry><emphasis>LATIN10</emphasis>,
|
2005-03-13 06:11:49 +01:00
|
|
|
<literal>UTF8</literal>
|
2004-12-27 23:30:10 +01:00
|
|
|
</entry>
|
|
|
|
</row>
|
|
|
|
<row>
|
2005-03-13 03:02:44 +01:00
|
|
|
<entry><literal>MULE_INTERNAL</literal></entry>
|
2005-03-13 23:04:29 +01:00
|
|
|
<entry><emphasis>MULE_INTERNAL</emphasis>,
|
2005-03-13 06:11:49 +01:00
|
|
|
<literal>BIG5</literal>,
|
|
|
|
<literal>EUC_CN</literal>,
|
|
|
|
<literal>EUC_JP</literal>,
|
|
|
|
<literal>EUC_KR</literal>,
|
|
|
|
<literal>EUC_TW</literal>,
|
|
|
|
<literal>ISO_8859_5</literal>,
|
2005-03-13 06:16:33 +01:00
|
|
|
<literal>KOI8</literal>,
|
2005-03-13 23:04:29 +01:00
|
|
|
<literal>LATIN1</literal> to <literal>LATIN4</literal>,
|
2005-03-13 06:11:49 +01:00
|
|
|
<literal>SJIS</literal>,
|
|
|
|
<literal>WIN866</literal>,
|
|
|
|
<literal>WIN1250</literal>,
|
|
|
|
<literal>WIN1251</literal>
|
2005-03-13 06:16:33 +01:00
|
|
|
</entry>
|
2005-03-13 06:11:49 +01:00
|
|
|
</row>
|
2009-03-26 21:55:49 +01:00
|
|
|
<row>
|
2005-03-13 03:02:44 +01:00
|
|
|
<entry><literal>SJIS</literal></entry>
|
2005-03-14 19:31:25 +01:00
|
|
|
<entry><emphasis>not supported as a server encoding</emphasis>
|
2004-12-27 23:30:10 +01:00
|
|
|
</entry>
|
|
|
|
</row>
|
|
|
|
<row>
|
2005-03-13 03:02:44 +01:00
|
|
|
<entry><literal>SQL_ASCII</literal></entry>
|
2005-10-13 23:43:43 +02:00
|
|
|
<entry><emphasis>any (no conversion will be performed)</emphasis>
|
2004-12-27 23:30:10 +01:00
|
|
|
</entry>
|
|
|
|
</row>
|
|
|
|
<row>
|
2005-03-13 03:02:44 +01:00
|
|
|
<entry><literal>UHC</literal></entry>
|
2005-03-14 19:31:25 +01:00
|
|
|
<entry><emphasis>not supported as a server encoding</emphasis>
|
2004-12-27 23:30:10 +01:00
|
|
|
</entry>
|
|
|
|
</row>
|
|
|
|
<row>
|
2005-03-07 05:30:55 +01:00
|
|
|
<entry><literal>UTF8</literal></entry>
|
2005-03-14 03:14:42 +01:00
|
|
|
<entry><emphasis>all supported encodings</emphasis>
|
2004-12-27 23:30:10 +01:00
|
|
|
</entry>
|
|
|
|
</row>
|
|
|
|
<row>
|
2005-03-07 05:30:55 +01:00
|
|
|
<entry><literal>WIN866</literal></entry>
|
2005-03-13 23:04:29 +01:00
|
|
|
<entry><emphasis>WIN866</emphasis>,
|
2005-03-13 06:31:04 +01:00
|
|
|
<literal>ISO_8859_5</literal>,
|
2005-03-13 06:11:49 +01:00
|
|
|
<literal>KOI8</literal>,
|
|
|
|
<literal>MULE_INTERNAL</literal>,
|
|
|
|
<literal>UTF8</literal>,
|
|
|
|
<literal>WIN1251</literal>
|
2004-12-27 23:30:10 +01:00
|
|
|
</entry>
|
|
|
|
</row>
|
|
|
|
<row>
|
|
|
|
<entry><literal>WIN874</literal></entry>
|
2005-03-13 23:04:29 +01:00
|
|
|
<entry><emphasis>WIN874</emphasis>,
|
2005-03-07 05:30:55 +01:00
|
|
|
<literal>UTF8</literal>
|
2004-12-27 23:30:10 +01:00
|
|
|
</entry>
|
|
|
|
</row>
|
|
|
|
<row>
|
|
|
|
<entry><literal>WIN1250</literal></entry>
|
2005-03-13 23:04:29 +01:00
|
|
|
<entry><emphasis>WIN1250</emphasis>,
|
2005-03-13 06:31:04 +01:00
|
|
|
<literal>LATIN2</literal>,
|
2005-03-13 06:11:49 +01:00
|
|
|
<literal>MULE_INTERNAL</literal>,
|
|
|
|
<literal>UTF8</literal>
|
2004-12-27 23:30:10 +01:00
|
|
|
</entry>
|
|
|
|
</row>
|
|
|
|
<row>
|
2005-03-07 05:30:55 +01:00
|
|
|
<entry><literal>WIN1251</literal></entry>
|
2005-03-13 23:04:29 +01:00
|
|
|
<entry><emphasis>WIN1251</emphasis>,
|
2005-03-13 06:31:04 +01:00
|
|
|
<literal>ISO_8859_5</literal>,
|
2005-03-13 06:11:49 +01:00
|
|
|
<literal>KOI8</literal>,
|
|
|
|
<literal>MULE_INTERNAL</literal>,
|
|
|
|
<literal>UTF8</literal>,
|
|
|
|
<literal>WIN866</literal>
|
2004-12-27 23:30:10 +01:00
|
|
|
</entry>
|
|
|
|
</row>
|
2005-03-14 19:31:25 +01:00
|
|
|
<row>
|
|
|
|
<entry><literal>WIN1252</literal></entry>
|
|
|
|
<entry><emphasis>WIN1252</emphasis>,
|
|
|
|
<literal>UTF8</literal>
|
|
|
|
</entry>
|
|
|
|
</row>
|
2006-02-18 17:15:23 +01:00
|
|
|
<row>
|
|
|
|
<entry><literal>WIN1253</literal></entry>
|
|
|
|
<entry><emphasis>WIN1253</emphasis>,
|
|
|
|
<literal>UTF8</literal>
|
|
|
|
</entry>
|
|
|
|
</row>
|
|
|
|
<row>
|
|
|
|
<entry><literal>WIN1254</literal></entry>
|
|
|
|
<entry><emphasis>WIN1254</emphasis>,
|
|
|
|
<literal>UTF8</literal>
|
|
|
|
</entry>
|
|
|
|
</row>
|
|
|
|
<row>
|
|
|
|
<entry><literal>WIN1255</literal></entry>
|
|
|
|
<entry><emphasis>WIN1255</emphasis>,
|
|
|
|
<literal>UTF8</literal>
|
|
|
|
</entry>
|
|
|
|
</row>
|
2004-12-27 23:30:10 +01:00
|
|
|
<row>
|
|
|
|
<entry><literal>WIN1256</literal></entry>
|
2005-03-13 23:04:29 +01:00
|
|
|
<entry><emphasis>WIN1256</emphasis>,
|
2005-03-07 05:30:55 +01:00
|
|
|
<literal>UTF8</literal>
|
2004-12-27 23:30:10 +01:00
|
|
|
</entry>
|
|
|
|
</row>
|
2006-02-18 17:15:23 +01:00
|
|
|
<row>
|
|
|
|
<entry><literal>WIN1257</literal></entry>
|
|
|
|
<entry><emphasis>WIN1257</emphasis>,
|
|
|
|
<literal>UTF8</literal>
|
|
|
|
</entry>
|
|
|
|
</row>
|
2004-12-27 23:30:10 +01:00
|
|
|
<row>
|
2005-03-07 05:30:55 +01:00
|
|
|
<entry><literal>WIN1258</literal></entry>
|
2005-03-13 23:04:29 +01:00
|
|
|
<entry><emphasis>WIN1258</emphasis>,
|
2005-03-07 05:30:55 +01:00
|
|
|
<literal>UTF8</literal>
|
2004-12-27 23:30:10 +01:00
|
|
|
</entry>
|
|
|
|
</row>
|
2000-09-12 07:37:09 +02:00
|
|
|
</tbody>
|
|
|
|
</tgroup>
|
|
|
|
</table>
|
|
|
|
|
|
|
|
<para>
|
2005-10-13 23:43:43 +02:00
|
|
|
To enable automatic character set conversion, you have to
|
2003-03-24 15:32:51 +01:00
|
|
|
tell <productname>PostgreSQL</productname> the character set
|
|
|
|
(encoding) you would like to use in the client. There are several
|
|
|
|
ways to accomplish this:
|
2000-09-12 07:37:09 +02:00
|
|
|
|
|
|
|
<itemizedlist>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
2004-12-27 23:30:10 +01:00
|
|
|
Using the <command>\encoding</command> command in
|
|
|
|
<application>psql</application>.
|
|
|
|
<command>\encoding</command> allows you to change client
|
|
|
|
encoding on the fly. For
|
|
|
|
example, to change the encoding to <literal>SJIS</literal>, type:
|
2000-09-12 07:37:09 +02:00
|
|
|
|
2001-11-28 21:49:10 +01:00
|
|
|
<programlisting>
|
2000-09-12 07:37:09 +02:00
|
|
|
\encoding SJIS
|
2001-11-28 21:49:10 +01:00
|
|
|
</programlisting>
|
2000-09-12 07:37:09 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
|
|
|
|
<listitem>
|
|
|
|
<para>
|
2008-03-06 16:37:56 +01:00
|
|
|
<application>libpq</> (<xref linkend="libpq-control">) has functions to control the client encoding.
|
2000-09-12 07:37:09 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
|
|
|
|
<listitem>
|
|
|
|
<para>
|
2004-12-27 23:30:10 +01:00
|
|
|
Using <command>SET client_encoding TO</command>.
|
2000-09-12 07:37:09 +02:00
|
|
|
|
2004-12-27 23:30:10 +01:00
|
|
|
Setting the client encoding can be done with this SQL command:
|
2000-09-12 07:37:09 +02:00
|
|
|
|
2001-11-28 21:49:10 +01:00
|
|
|
<programlisting>
|
2003-03-24 15:32:51 +01:00
|
|
|
SET CLIENT_ENCODING TO '<replaceable>value</>';
|
2001-11-28 21:49:10 +01:00
|
|
|
</programlisting>
|
2000-09-12 07:37:09 +02:00
|
|
|
|
2006-07-28 17:33:17 +02:00
|
|
|
You can also use the standard SQL syntax <literal>SET NAMES</literal>
|
|
|
|
for this purpose:
|
2000-09-12 07:37:09 +02:00
|
|
|
|
2001-11-28 21:49:10 +01:00
|
|
|
<programlisting>
|
2003-03-24 15:32:51 +01:00
|
|
|
SET NAMES '<replaceable>value</>';
|
2001-11-28 21:49:10 +01:00
|
|
|
</programlisting>
|
2000-09-12 07:37:09 +02:00
|
|
|
|
2004-12-27 23:30:10 +01:00
|
|
|
To query the current client encoding:
|
2000-09-12 07:37:09 +02:00
|
|
|
|
2001-11-28 21:49:10 +01:00
|
|
|
<programlisting>
|
2003-09-11 20:30:39 +02:00
|
|
|
SHOW client_encoding;
|
2001-11-28 21:49:10 +01:00
|
|
|
</programlisting>
|
2000-09-12 07:37:09 +02:00
|
|
|
|
2004-12-27 23:30:10 +01:00
|
|
|
To return to the default encoding:
|
2000-09-12 07:37:09 +02:00
|
|
|
|
2001-11-28 21:49:10 +01:00
|
|
|
<programlisting>
|
2003-09-11 20:30:39 +02:00
|
|
|
RESET client_encoding;
|
2001-11-28 21:49:10 +01:00
|
|
|
</programlisting>
|
2000-09-12 07:37:09 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
2001-01-19 05:47:50 +01:00
|
|
|
|
|
|
|
<listitem>
|
|
|
|
<para>
|
2004-12-27 23:30:10 +01:00
|
|
|
Using <envar>PGCLIENTENCODING</envar>. If the environment variable
|
2004-03-09 17:57:47 +01:00
|
|
|
<envar>PGCLIENTENCODING</envar> is defined in the client's
|
|
|
|
environment, that client encoding is automatically selected
|
|
|
|
when a connection to the server is made. (This can
|
|
|
|
subsequently be overridden using any of the other methods
|
|
|
|
mentioned above.)
|
2001-01-19 05:47:50 +01:00
|
|
|
</para>
|
|
|
|
</listitem>
|
2002-07-24 07:51:56 +02:00
|
|
|
|
|
|
|
<listitem>
|
|
|
|
<para>
|
2004-03-09 17:57:47 +01:00
|
|
|
Using the configuration variable <xref
|
|
|
|
linkend="guc-client-encoding">. If the
|
|
|
|
<varname>client_encoding</> variable is set, that client
|
|
|
|
encoding is automatically selected when a connection to the
|
|
|
|
server is made. (This can subsequently be overridden using any
|
|
|
|
of the other methods mentioned above.)
|
2002-07-24 07:51:56 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
|
2000-09-12 07:37:09 +02:00
|
|
|
</itemizedlist>
|
|
|
|
</para>
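<para>
As an illustration of the configuration-variable approach, the setting can
be attached to a database so that new sessions which do not otherwise
choose a client encoding start with it. This sketch assumes the
<literal>korean</literal> database from the earlier example exists and that
you have the privileges needed to alter it; <literal>UTF8</literal> is used
because it is listed as an available client character set for an
<literal>EUC_KR</literal> server.

<programlisting>
-- Make UTF8 the default client encoding for new sessions on korean.
ALTER DATABASE korean SET client_encoding TO 'UTF8';

-- In a newly started session, check which value is in effect.
SHOW client_encoding;

-- Remove the per-database default again.
ALTER DATABASE korean RESET client_encoding;
</programlisting>
</para>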
|
|
|
|
|
|
|
|
<para>
|
2004-11-15 07:32:15 +01:00
|
|
|
If a particular character cannot be converted, an error is reported.
For example, if you chose <literal>EUC_JP</literal> for the
server and <literal>LATIN1</literal> for the client, some
Japanese characters have no representation in
<literal>LATIN1</literal>, and attempting to return them fails.
|
2000-09-12 07:37:09 +02:00
|
|
|
</para>
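<para>
A conversion failure of this kind can be reproduced by hand with
<function>convert_to</function>. The sketch below assumes that the current
database uses the <literal>UTF8</literal> encoding, so that
<function>chr</function> interprets its argument as a Unicode code point;
the exact wording of the error message is not shown because it may differ
between versions.

<programlisting>
-- Assumes the current database encoding is UTF8.
-- chr(12354) is U+3042 (HIRAGANA LETTER A), which LATIN1 cannot represent,
-- so this statement is expected to fail with a conversion error.
SELECT convert_to(chr(12354), 'LATIN1');

-- The same character converts without error to EUC_JP.
SELECT convert_to(chr(12354), 'EUC_JP');
</programlisting>
</para>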
|
2005-10-13 23:43:43 +02:00
|
|
|
|
|
|
|
<para>
|
|
|
|
If the client character set is defined as <literal>SQL_ASCII</>,
|
|
|
|
encoding conversion is disabled, regardless of the server's character
|
|
|
|
set. Just as for the server, use of <literal>SQL_ASCII</> is unwise
|
|
|
|
unless you are working with all-ASCII data.
|
|
|
|
</para>
|
2000-09-12 07:37:09 +02:00
|
|
|
</sect2>
|
|
|
|
|
|
|
|
<sect2>
|
2003-03-24 15:32:51 +01:00
|
|
|
<title>Further Reading</title>
|
2000-09-12 07:37:09 +02:00
|
|
|
|
|
|
|
<para>
|
2001-01-19 05:47:50 +01:00
|
|
|
The following are good starting points for learning about the various
kinds of encoding systems.
|
|
|
|
|
2005-03-12 07:28:17 +01:00
|
|
|
<variablelist>
|
|
|
|
<varlistentry>
|
|
|
|
<term><ulink url="http://www.i18ngurus.com/docs/984813247.html"></ulink></term>
|
|
|
|
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
An extensive collection of documents about character sets, encodings,
|
|
|
|
and code pages.
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2001-10-09 20:46:00 +02:00
|
|
|
<varlistentry>
|
2001-10-31 21:35:02 +01:00
|
|
|
<term><ulink url="ftp://ftp.ora.com/pub/examples/nutshell/ujip/doc/cjk.inf"></ulink></term>
|
2001-10-09 20:46:00 +02:00
|
|
|
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Detailed explanations of <literal>EUC_JP</literal>,
<literal>EUC_CN</literal>, <literal>EUC_KR</literal>, and
<literal>EUC_TW</literal> appear in section 3.2.
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry>
|
2001-10-31 21:35:02 +01:00
|
|
|
<term><ulink url="http://www.unicode.org/"></ulink></term>
|
2001-10-09 20:46:00 +02:00
|
|
|
|
|
|
|
<listitem>
|
|
|
|
<para>
|
2009-05-06 18:15:21 +02:00
|
|
|
The web site of the Unicode Consortium.
|
2001-10-09 20:46:00 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry>
|
2007-01-09 23:22:55 +01:00
|
|
|
<term>RFC 3629</term>
|
2001-10-09 20:46:00 +02:00
|
|
|
|
|
|
|
<listitem>
|
|
|
|
<para>
|
2004-12-27 23:30:10 +01:00
|
|
|
<acronym>UTF</acronym>-8 is defined here.
|
2001-10-09 20:46:00 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
</variablelist>
|
2000-09-12 07:37:09 +02:00
|
|
|
</para>
|
|
|
|
</sect2>
|
|
|
|
|
|
|
|
</sect1>
|
2000-09-30 18:58:20 +02:00
|
|
|
|
|
|
|
</chapter>
|