Commit Graph

140 Commits

Author SHA1 Message Date
Tatsuo Ishii eb335a034b I have committed many support files for CREATE CONVERSION. Default
conversion procs and conversions are added in initdb. Currently
supported conversions are:

UTF-8(UNICODE) <--> SQL_ASCII, ISO-8859-1 to 16, EUC_JP, EUC_KR,
		    EUC_CN, EUC_TW, SJIS, BIG5, GBK, GB18030, UHC,
		    JOHAB, TCVN

EUC_JP <--> SJIS
EUC_TW <--> BIG5
MULE_INTERNAL <--> EUC_JP, SJIS, EUC_TW, BIG5

Note that initial contents of pg_conversion system catalog are created
in the initdb process. So doing initdb required is ideal, it's
possible to add them to your databases by hand, however. To accomplish
this:

psql -f your_postgresql_install_path/share/conversion_create.sql your_database

So I did not bump up the version in cataversion.h.

TODO:
Add more conversion procs
Add [CASCADE|RESTRICT] to DROP CONVERSION
Add tuples to pg_depend
Add regression tests
Write docs
Add SQL99 CONVERT command?
--
Tatsuo Ishii
2002-07-18 02:02:30 +00:00
Tatsuo Ishii 14f72b9a4d Add GB18030 support. Contributed by Bill Huang <bill_huanghb@ybb.ne.jp>
(ODBC support has not been committed yet. left for Hiroshi...)
2002-06-13 08:30:22 +00:00
Bruce Momjian a8bd7e1c6e > Tatsuo Ishii wrote:
> > > > It was made to cope with encoding such as an Asian bloc in 7.2Beta2.
> > > >
> > > > Added ServerEncoding
> > > >         Korean (JOHAB), Thai (WIN874),
> > > >         Vietnamese (TCVN), Arabic (WIN1256)
> > > >
> > > > Added ClientEncoding
> > > >         Simplified Chinese (GBK), Korean (UHC)
> > > >
> > > >
> > > >
> http://www.sankyo-unyu.co.jp/Pool/postgresql-7.2b2.newencoding.diff.tar.gz
> > > > (608K)
> > >
> > > Looks good.  I need some people to review this for me.
> >
> > For me they look good too. The only missing part is a
> > documentation. I will ask him to write it up. If he couldn't, I will
> > do it for him.
> > > The diff is 3mb
> > > but appears to address only additions to multibyte.  I have attached a
> > > list of files it modifies.  Also, look at the sizes of the mb/
> > > directory.  It is getting large:
> > >
> > >   4       ./CVS
> > >   6       ./Unicode/CVS
> > >   3433    ./Unicode
> > >   6197    .
> >
> > Yes. We definitely need the on-the-fly encoding addition capability:
> > i.e. CREATE CHRACTER SET in the future...
> > --
> > Tatsuo Ishii
> >
> >

Address chainge.

http://www.sankyo-unyu.co.jp/Pool/postgresql-7.2.newencoding.diff.gz

Add PsqlODBC and document ...etc patch.

Eiji Tokuya
2002-03-05 05:52:50 +00:00
Bruce Momjian ea08e6cd55 New pgindent run with fixes suggested by Tom. Patch manually reviewed,
initdb/regression tests pass.
2001-11-05 17:46:40 +00:00
Bruce Momjian 6783b2372e Another pgindent run. Fixes enum indenting, and improves #endif
spacing.  Also adds space for one-line comments.
2001-10-28 06:26:15 +00:00
Bruce Momjian b81844b173 pgindent run on all C files. Java run to follow. initdb/regression
tests pass.
2001-10-25 05:50:21 +00:00
Tatsuo Ishii cfe01796e6 Ok, here is the modified encoding table (column1 is the standard name,
2 is our "official" name, and 3 is alias). If there's no objection, I
will change them.

ASCII		SQL_ASCII
UTF-8		UNICODE		UTF_8
MULE-INTERNAL	MULE_INTERNAL
ISO-8859-1	LATIN1		ISO_8859_1
ISO-8859-2	LATIN2		ISO_8859_2
ISO-8859-3	LATIN3		ISO_8859_3
ISO-8859-4	LATIN4		ISO_8859_4
ISO-8859-5	ISO_8859_5
ISO-8859-6	ISO_8859_6
ISO-8859-7	ISO_8859_7
ISO-8859-8	ISO_8859_8
ISO-8859-9	LATIN5		ISO_8859_9
ISO-8859-10	LATIN6		ISO_8859_10
ISO-8859-13	LATIN7		ISO_8859_13
ISO-8859-14	LATIN8		ISO_8859_14
ISO-8859-15	LATIN9		ISO_8859_15
ISO-8859-16	LATIN10		ISO_8859_16
2001-10-16 10:09:17 +00:00
Tatsuo Ishii 51053d3216 Add support for ISO-8859-6 to 16 2001-10-11 14:20:35 +00:00
Tatsuo Ishii be629abfc8 Add pg_database_encoding_max_length() function. 2001-09-23 10:59:45 +00:00
Tom Lane e3f5bc3492 Fix type_maximum_size() to give the right answer in MULTIBYTE cases.
Avoid use of prototype-less function pointers in MB code.
2001-09-21 15:27:38 +00:00
Tatsuo Ishii e1de3e0833 Implement following item in TODO:
* Reject character sequences those are not valid in their charset
2001-09-11 04:50:36 +00:00
Tatsuo Ishii 227767112c Commit Karel's patch.
-------------------------------------------------------------------
Subject: Re: [PATCHES] encoding names
From: Karel Zak <zakkr@zf.jcu.cz>
To: Peter Eisentraut <peter_e@gmx.net>
Cc: pgsql-patches <pgsql-patches@postgresql.org>
Date: Fri, 31 Aug 2001 17:24:38 +0200

On Thu, Aug 30, 2001 at 01:30:40AM +0200, Peter Eisentraut wrote:
> > 		- convert encoding 'name' to 'id'
>
> I thought we decided not to add functions returning "new" names until we
> know exactly what the new names should be, and pending schema

 Ok, the patch not to add functions.

> better
>
>     ...(): encoding name too long

 Fixed.

 I found new bug in command/variable.c in parse_client_encoding(), nobody
probably never see this error:

if (pg_set_client_encoding(encoding))
{
	elog(ERROR, "Conversion between %s and %s is not supported",
                     value, GetDatabaseEncodingName());
}

because pg_set_client_encoding() returns -1 for error and 0 as true.
It's fixed too.

 IMHO it can be apply.

		Karel
PS:

    * following files are renamed:

src/utils/mb/Unicode/KOI8_to_utf8.map  -->
        src/utils/mb/Unicode/koi8r_to_utf8.map

src/utils/mb/Unicode/WIN_to_utf8.map  -->
        src/utils/mb/Unicode/win1251_to_utf8.map

src/utils/mb/Unicode/utf8_to_KOI8.map -->
        src/utils/mb/Unicode/utf8_to_koi8r.map

src/utils/mb/Unicode/utf8_to_WIN.map -->
        src/utils/mb/Unicode/utf8_to_win1251.map

   * new file:

src/utils/mb/encname.c

   * removed file:

src/utils/mb/common.c

--
 Karel Zak  <zakkr@zf.jcu.cz>
 http://home.zf.jcu.cz/~zakkr/

 C, PostgreSQL, PHP, WWW, http://docs.linux.cz, http://mape.jcu.cz
2001-09-06 04:57:30 +00:00
Tatsuo Ishii ab9b6c45cf Add conver/convert2 functions. They are similar to the SQL99's convert. 2001-08-15 07:07:40 +00:00
Tatsuo Ishii 1032445e5d TODO item:
* Make n of CHAR(n)/VARCHAR(n) the number of letters, not bytes
2001-07-15 11:07:37 +00:00
Bruce Momjian 0cec2bb0cd BTW it does not add encodign it just patches existing one (KOI8) to
support two - KOI8-R and KOI8-U (latter is superset of the former if
not to take to the account pseudographics)

Andy Rysin
2001-05-03 21:38:45 +00:00
Bruce Momjian 9e1552607a pgindent run. Make it all clean. 2001-03-22 04:01:46 +00:00
Tom Lane d08741eab5 Restructure the key include files per recent pghackers discussion: there
are now separate files "postgres.h" and "postgres_fe.h", which are meant
to be the primary include files for backend .c files and frontend .c files
respectively.  By default, only include files meant for frontend use are
installed into the installation include directory.  There is a new make
target 'make install-all-headers' that adds the whole content of the
src/include tree to the installed fileset, for use by people who want to
develop server-side code without keeping the complete source tree on hand.
Cleaned up a whole lot of crufty and inconsistent header inclusions.
2001-02-10 02:31:31 +00:00
Tom Lane 2cf48ca04b Extend CREATE DATABASE to allow selection of a template database to be
cloned, rather than always cloning template1.  Modify initdb to generate
two identical databases rather than one, template0 and template1.
Connections to template0 are disallowed, so that it will always remain
in its virgin as-initdb'd state.  pg_dumpall now dumps databases with
restore commands that say CREATE DATABASE foo WITH TEMPLATE = template0.
This allows proper behavior when there is user-added data in template1.
initdb forced!
2000-11-14 18:37:49 +00:00
Tatsuo Ishii 1acf6f9c8e Add support for code conversion between Unicode and other encodings.
Supported encodings are: EUC_JP, EUC_CN, EUC_KR, EUC_TW, Shift JIS,
Big5, ISO8859-[1-5].
TODO: testings! and documentations...
2000-10-30 10:41:05 +00:00
Tatsuo Ishii 2969c01d55 Remove gcc-only macro definition 2000-10-27 02:23:51 +00:00
Tom Lane 0a63b6d066 Support SET/SHOW/RESET client_encoding and server_encoding even when
MULTIBYTE support is not compiled (you just can't set them to anything
but SQL_ASCII).  This should reduce interoperability problems between
MB-enabled clients and non-MB-enabled servers.
2000-10-25 19:44:44 +00:00
Tatsuo Ishii 40184eecd5 Disable elog when linked with frontend. 2000-10-12 07:36:51 +00:00
Tatsuo Ishii f48b9f9ec7 Support for automatic code conversion between UNICODE and other
encodings
2000-10-12 06:08:28 +00:00
Tatsuo Ishii 8cca25728b Change return type of:
pg_mb2wchar(const unsigned char *, pg_wchar *);
       pg_mb2wchar_with_len(const unsigned char *, pg_wchar *, int);
from void to int. Now they return the number of
wide chars.
2000-08-25 14:24:09 +00:00
Bruce Momjian e362d4e1ea #include cleanups 2000-06-15 00:52:26 +00:00
Tom Lane f2d1205322 Another batch of fmgr updates. I think I have gotten all old-style
functions that take pass-by-value datatypes.  Should be ready for
port testing ...
2000-06-13 07:35:40 +00:00
Tom Lane 87e701b8d5 Clean up const-vs-not-const compiler warning in MULTIBYTE code.
'Twas my fault, I think.
2000-04-20 22:40:18 +00:00
Tatsuo Ishii 5eb1d0deb1 Add builtin functions:
pg_char_to_encoding()
pg_encoding_to_char()
2000-01-18 05:10:29 +00:00
Peter Eisentraut 2a1bfbce24 - Allow array on int8
- Prevent permissions on indexes
- Instituted --enable-multibyte option and tweaked the MB build process where necessary
- initdb prompts for superuser password
2000-01-15 18:30:35 +00:00
Bruce Momjian 33e826d167 Fix for multi-byte includes. 1999-07-17 16:25:28 +00:00
Bruce Momjian 0cf1b79528 Cleanup of /include #include's, for 6.6 only. 1999-07-14 01:20:30 +00:00
Tatsuo Ishii 8f02f2252d Fix some compiler warnings (Tomoaki Nishiyama), add WIN1250 support (Pavel Behal) 1999-07-11 22:47:21 +00:00
Bruce Momjian 07842084fe pgindent run over code. 1999-05-25 16:15:34 +00:00
Tatsuo Ishii 2c775870bc Add KOI8/WIN/ALT to the multi-byte encoding selections 1999-03-24 06:53:28 +00:00
Bruce Momjian a7ad43cd18 Included patches make some enhancements to the multi-byte support.
o allow to use Big5 (a Chinese encoding used in Taiwan) as a client
  encoding. In this case the server side encoding should be EUC_TW

o add EUC_TW and Big5 test cases to the regression and the mb test
  (contributed by Jonah Kuo)

o fix mistake in include/mb/pg_wchar.h. An encoding id for EUC_TW was
  not correct (was 3 and now is 4)

o update documents (doc/README.mb and README.mb.jp)

o update psql helpfile (bin/psql/psqlHelp.h)

--
Tatsuo Ishii
t-ishii@sra.co.jp
1999-02-02 18:51:40 +00:00
Bruce Momjian f52e7346ea MB patches from Tatsuo Ishii 1998-09-25 01:46:25 +00:00
Bruce Momjian fa1a8d6a97 OK, folks, here is the pgindent output. 1998-09-01 04:40:42 +00:00
Bruce Momjian c0b01461db o note that now pg_database has a new attribuite "encoding" even
if MULTIBYTE is not enabled. So be sure to run initdb.

o these patches are made against the latest source tree (after
Bruce's massive patch, I think) BTW, I noticed that after running
regression, the oid field of pg_type seems disappeared.

	regression=> select oid from pg_type; ERROR:  attribute
	'oid' not found

this happens after the constraints test. This occures with/without
my patches. strange...

o pg_database_mb.h, pg_class_mb.h, pg_attribute_mb.h are no longer
used, and shoud be removed.

o GetDatabaseInfo() in utils/misc/database.c removed (actually in
#ifdef 0). seems nobody uses.

t-ishii@sra.co.jp
1998-08-24 01:14:24 +00:00
Marc G. Fournier 5979d73841 From: t-ishii@sra.co.jp
As Bruce mentioned, this is due to the conflict among changes we made.
Included patches should fix the problem(I changed all MB to
MULTIBYTE). Please let me know if you have further problem.

P.S. I did not include pathces to configure and gram.c to save the
file size(configure.in and gram.y modified).
1998-07-26 04:31:41 +00:00
Marc G. Fournier bf00bbb0c4 I really hope that I haven't missed anything in this one...
From: t-ishii@sra.co.jp

Attached are patches to enhance the multi-byte support.  (patches are
against 7/18 snapshot)

* determine encoding at initdb/createdb rather than compile time

Now initdb/createdb has an option to specify the encoding. Also, I
modified the syntax of CREATE DATABASE to accept encoding option. See
README.mb for more details.

For this purpose I have added new column "encoding" to pg_database.
Also pg_attribute and pg_class are changed to catch up the
modification to pg_database.  Actually I haved added pg_database_mb.h,
pg_attribute_mb.h and pg_class_mb.h. These are used only when MB is
enabled. The reason having separate files is I couldn't find a way to
use ifdef or whatever in those files. I have to admit it looks
ugly. No way.

* support for PGCLIENTENCODING when issuing COPY command

commands/copy.c modified.

* support for SQL92 syntax "SET NAMES"

See gram.y.

* support for LATIN2-5
* add UNICODE regression test case
* new test suite for MB

New directory test/mb added.

* clean up source files

Basic idea is to have MB's own subdirectory for easier maintenance.
These are include/mb and backend/utils/mb.
1998-07-24 03:32:46 +00:00