Hi, here are patches I promised (against 6.3.2):
* character_length(), position(), substring() are now aware of
multi-byte characters
* add octet_length()
* add --with-mb option to configure
* new regression tests for EUC_KR
(contributed by "Soonmyung. Hong" <hong@lunaris.hanmesoft.co.kr>)
* add some test cases to the EUC_JP regression test
* fix problem in regress/regress.sh in case of System V
* fix toupper(), tolower() to handle 8bit chars
note that:
o patches for both configure.in and configure are
included. maybe the one for configure is not necessary.
o pg_proc.h was modified to add octet_length(). I used OIDs
(1374-1379) for that. Please let me know if these numbers are not
appropriate.
1. Remove the char2, char4, char8 and char16 types from postgresql
2. Change references of char16 to name in the regression tests.
3. Rename the char16.sql regression test to name.sql. 4. Modify
the regression test scripts and outputs to match up.
Might require new regression.{SYSTEM} files...
Darren King
Included are patches intended for allowing PostgreSQL to handle
multi-byte charachter sets such as EUC(Extende Unix Code), Unicode and
Mule internal code. With the MB patch you can use multi-byte character
sets in regexp and LIKE. The encoding system chosen is determined at
the compile time.
To enable the MB extension, you need to define a variable "MB" in
Makefile.global or in Makefile.custom. For further information please
take a look at README.mb under doc directory.
(Note that unlike "jp patch" I do not use modified GNU regexp any
more. I changed Henry Spencer's regexp coming with PostgreSQL.)
I thought it would be a good idea to ensure that the new view
permission model will not get broken by subsequent
fixes/changes. So I wrote a little regression test for it.
There is an ugly thing in this regression test. It creates
temporary a test user that is required for the tests. The
user is removed at the end of the test, but if sometimes the
regression suite is aborted or crashes exactly here, the test
user will lay around in the pg_shadow. Don't have a clue how
to get around.
if an operating specific expected file exists, use that for the comparison.
This allows for "legit" differences between results, like the "Result too
large" message vs "Math result not representable" ...
Also, have the failed diffs get output to regression.diffs so that its easy to
view those tests that failed
Subject: [PATCHES] Re: [PORTS] AIX 6.1 fixes...
Here are the patches for the two things that wouldn't make it thru the AIX
compiler. The geo_ops.c change is harmless I believe. The nbtcompare.c patch
fixes me, but I don't know about any other ports. Maybe wait on that one
until Vadim decides what to do about the unsigned vs signed chars varlena
issue.
Subject: [PATCHES] Patches for boolean, timespan and reltime regression tests.
Hi All,
Here are a couple of patches to the regression tests to introduce
some specific ordering to the results.
I've only made changes to the queries that were exhibiting differences
on my regression runs.
This will also have the side effect of testing the ordering code for
the boolean and some of the time types.
the DROP TABLE calls from the destroy.sql file to the 'types' .sql files,
so that they are self-contained
btree_index, hash_index and misc all fail as there seems to be missing
a 'misc.out' expected file...have asked Thomas for one...
OK, here are a passel of patches for the geometric data types.
These add a "circle" data type, new operators and functions
for the existing data types, and change the default formats
for some of the existing types to make them consistant with
each other. Current formatting conventions (e.g. compatible
with v6.0 to allow dump/reload) are supported, but the new
conventions should be an improvement and we can eventually
drop the old conventions entirely.
For example, there are two kinds of paths (connected line segments),
open and closed, and the old format was
'(1,2,1,2,3,4)' for a closed path with two points (1,2) and (3,4)
'(0,2,1,2,3,4)' for an open path with two points (1,2) and (3,4)
Pretty arcane, huh? The new format for paths is
'((1,2),(3,4))' for a closed path with two points (1,2) and (3,4)
'[(1,2),(3,4)]' for an open path with two points (1,2) and (3,4)
For polygons, the old convention is
'(0,4,2,0,4,3)' for a triangle with points at (0,0),(4,4), and (2,3)
and the new convention is
'((0,0),(4,4),(2,3))' for a triangle with points at (0,0),(4,4), and (2,3)
Other data types which are also represented as lists of points
(e.g. boxes, line segments, and polygons) have similar representations
(they surround each point with parens).
For v6.1, any format which can be interpreted as the old style format
is decoded as such; we can remove that backwards compatibility but ugly
convention for v7.0. This will allow dump/reloads from v6.0.
These include some updates to the regression test files to change the test
for creating a data type from "circle" to "widget" to keep the test from
trashing the new builtin circle type.