postgresql/doc/README.locale

===========
1999 Jul 21
===========

   Josef Balatka <balatka@email.cz> asked us not to remove RECODE and sent me
a Czech ISO-8859-2 -> WIN-1250 translation table.
   RECODE is no longer just Cyrillic RECODE and will stay in PostgreSQL.

   He also created some bits of documentation, mostly concerning RECODE -
see README.Charsets.


===========
1999 Apr 14
===========

   Tatsuo Ishii <t-ishii@sra.co.jp> updated Multibyte support, extending it
to Cyrillic encodings. PostgreSQL now supports the KOI8-R, WIN-1251,
ISO8859-5 and CP866 (ALT) encodings.

   A short instruction on using this feature follows. A longer discussion
of Multibyte support is in README.mb.

   WARNING! With Multibyte support available, Cyrillic RECODE is declared
obsolete and will be removed from Postgres. If you are using RECODE,
consider switching to Multibyte support.

   Instructions on how to prepare Postgres for Cyrillic Multibyte support.
   ----------------------------------------------------------------------

   First, you need to back up all your databases. I recommend backing up
the entire Postgres directory, including binaries and libraries, so that
you can easily restore everything if something goes wrong.

   Dump your data: pg_dumpall > dump.db
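
   As a hedged sketch of the dump step, the wrapper below runs pg_dumpall
and refuses to report success if the command fails or the dump file comes
out empty. The function name dump_all and its file argument are
illustrative assumptions, not part of Postgres.

```shell
# dump_all FILE: hypothetical helper that writes a full dump to FILE
# (default: dump.db) and checks that the result is usable.
dump_all() {
    out=${1:-dump.db}
    # pg_dumpall writes SQL for all databases to standard output.
    if ! pg_dumpall > "$out"; then
        echo "pg_dumpall failed; do not continue" >&2
        return 1
    fi
    # An empty file means nothing was dumped.
    if [ ! -s "$out" ]; then
        echo "$out is empty; do not continue" >&2
        return 1
    fi
    echo "dump written to $out"
}
```

   Call it as "dump_all dump.db" while postmaster is still running, and
move on to the next step only if it returns successfully.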

   Stop postmaster.

   Configure, compile and install Postgres. (I'll mostly talk about the
KOI8-R encoding; this is just to make the examples a little clearer. You
can use any supported encoding.)

   cd src
   ./configure --enable-locale --with-mb=KOI8
   make
   make install

   Make sure you've backed up your databases. Double-check your backup. I
really mean it - make regular backups and test them from time to time with
a trial restore.

   Remove your data directory (better, rename or move it).
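
   A minimal sketch of the safer variant (rename instead of remove)
follows. The helper name set_aside_data and the ".old" suffix are
assumptions chosen for illustration; run it only after postmaster has been
stopped.

```shell
# set_aside_data DIR: hypothetical helper that renames DIR to DIR.old,
# so initdb can create a fresh directory while the old files survive.
set_aside_data() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "'$dir' is not a directory" >&2
        return 1
    fi
    # Never clobber an earlier saved copy.
    if [ -e "$dir.old" ]; then
        echo "$dir.old already exists; not overwriting" >&2
        return 1
    fi
    mv "$dir" "$dir.old"
}
```

   For example, assuming PGDATA points at your data directory:
set_aside_data "$PGDATA"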

   Run initdb, specifying your primary encoding: initdb -e KOI8. If you
omit the encoding, the primary encoding from configure will be used.

   Start postmaster.

   Create databases: createdb -e KOI8. Again, you can omit the encoding -
the default encoding will be used. You are not forced to use the same
encoding for all your databases - you can create different databases with
different encodings.

   Load your data from the dump you've created: psql < dump.db

   That's all! Now you are ready to enjoy the full power of Multibyte
support.

   To use Multibyte support you do not need to do anything special - just
execute your queries. If a client program does not set an encoding, it will
get the data in the database encoding. But a client may ask Postgres to do
automatic server-to-client and client-to-server conversions. There are two
ways for a client program to declare its encoding:
   1) the client explicitly executes the query SET CLIENT_ENCODING TO 'win';
   2) the client is started with the PGCLIENTENCODING environment variable
set. Examples -
using sh syntax:
   PGCLIENTENCODING='win'; export PGCLIENTENCODING
using csh syntax:
   setenv PGCLIENTENCODING 'win'

   Setting PGCLIENTENCODING even if you use the same client encoding as the
database avoids the overhead of asking for the database encoding while
initiating the connection, so it is a good idea to set it in any case.
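
   The environment variable can also be set for a single command rather
than for the whole session. The sketch below assumes psql as the client;
the helper name run_as_win and the database name in the usage line are
made up for illustration.

```shell
# run_as_win DB SQL: hypothetical helper that runs one query through psql
# with the client encoding declared as 'win' for this command only.
run_as_win() {
    PGCLIENTENCODING=win psql -c "$2" "$1"
}
```

   For example: run_as_win mydb "SELECT 1"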

   Now you may run the test suite and see Multibyte support in action. Go
to .../src/test/locale and run
   make clean all test-koi2win


===========
1998 Nov 20
===========

   I extended locale support, originally written by Oleg Bartunov
<oleg@sai.msu.su>. Now ORDER BY (if PostgreSQL is configured with
--enable-locale) uses strcoll() for all text fields: char(n), varchar(n),
text.
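
   To get a feel for what locale-aware ordering changes, the standard sort
utility can stand in for ORDER BY: in the C locale it compares raw bytes,
while under any other locale it collates according to LC_COLLATE. This is
only an analogy outside the database; the sample words are made up, and the
second result depends on which locales your system has installed.

```shell
# Byte-wise ordering: in the C locale every uppercase ASCII letter
# sorts before every lowercase one.
printf 'beta\nAlpha\nalpha\nBeta\n' | LC_ALL=C sort

# Locale-aware ordering: the same input sorted under whatever locale
# the environment selects (LC_ALL, LC_COLLATE, LANG); varies by system.
printf 'beta\nAlpha\nalpha\nBeta\n' | sort
```

   The first command prints Alpha, Beta, alpha, beta, one per line.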

   I included a test suite in .../src/test/locale. I didn't include it in
the regression tests because not many people require locale support. Read
.../src/test/locale/README for details on the test suite.

   Many thanks to Oleg Bartunov (oleg@sai.msu.su) and Thomas G. Lockhart
(lockhart@alumni.caltech.edu) for hints, tips, help and discussion.

Oleg.