Commit Graph

454 Commits

Author SHA1 Message Date
Tom Lane 4571185111 Suppress gcc warning about possibly-uninitialized variable. It's not
clear to me why I'd not seen this message before --- on F-9 it seems to
only happen if Asserts are disabled, which ought to be irrelevant.
Maybe that affects a decision whether to inline get_ten(), which would
be needed to expose the warning condition to the compiler?  Anyway,
the fix is clear.
2008-08-29 16:34:14 +00:00
Bruce Momjian 31ad4e5396 Add missing descriptions for aggregates, functions and conversions.
Bernd Helmle
2008-08-23 20:31:37 +00:00
Tom Lane d1da215d32 Fix compiler warning introduced by recent patch. Tsk tsk. 2008-06-18 23:08:47 +00:00
Bruce Momjian 9de09c087d Move wchar2char() and char2wchar() from tsearch into /mb to be easier to
use for other modules;  also move pnstrdup().

Clean up code slightly.
2008-06-18 18:42:54 +00:00
Bruce Momjian 4274726d42 Add URL for introduction to multibyte programming in C. 2008-06-17 18:22:43 +00:00
Magnus Hagander ea7f9648fe Explicitly bind gettext() to the UTF8 locale when in use.
This is required on Windows due to the special locale
handling for UTF8 that doesn't change the full environment.

Fixes crash with translated error messages per bugs 4180
and 4196.

Tom Lane
2008-05-27 12:24:42 +00:00
Andrew Dunstan 53972b460c Add $PostgreSQL$ markers to a lot of files that were missing them.
This particular batch was just for *.c and *.h file.

The changes were made with the following 2 commands:

find . \( \( -name 'libstemmer' -o -name 'expected' -o -name 'ppport.h' \) -prune \) -o  \( -name '*.[ch]'  \) \( -exec grep -q '\$PostgreSQL' {} \; -o -print \) | while read file ; do head -n 1 < $file | grep -q '^/\*' && echo $file; done | xargs -l sed -i -e '1s/^\// /' -e '1i/*\n * $PostgreSQL:$ \n *'

find . \( \( -name 'libstemmer' -o -name 'expected' -o -name 'ppport.h' \) -prune \) -o  \( -name '*.[ch]'  \) \( -exec grep -q '\$PostgreSQL' {} \; -o -print \) | xargs -l sed -i -e '1i/*\n * $PostgreSQL:$ \n */'
2008-05-17 01:28:26 +00:00
Tom Lane ba1c463096 Clean up a few places where Datums were being treated as pointers without
going through DatumGetPointer or some other "official" conversion macro.
Not actually a bug, since Datum the same size as pointer is the only
supported case at the moment, but good cleanup for the future.

Gavin Sherry
2008-04-12 23:21:04 +00:00
Peter Eisentraut 46e76373ec Implement a few changes to how shared libraries and dynamically loadable
modules are built.  Foremost, it creates a solid distinction between these two
types of targets based on what had already been implemented and duplicated in
ad hoc ways before.  Specifically,

- Dynamically loadable modules no longer get a soname.  The numbers previously
set in the makefiles were dummy numbers anyway, and the presence of a soname
upset a few packaging tools, so it is nicer not to have one.

- The cumbersome detour taken on installation (build a libfoo.so.0.0.0 and
then override the rule to install foo.so instead) is removed.

- Lots of duplicated code simplified.
2008-04-07 14:15:58 +00:00
Bruce Momjian fca9fff41b More README src cleanups. 2008-03-21 13:23:29 +00:00
Bruce Momjian 4e228447aa Make source code READMEs more consistent. Add CVS tags to all README files. 2008-03-20 17:55:15 +00:00
Heikki Linnakangas f4b7624eb0 Add the missing cyrillic "Yo" characters ('e' and 'E' with two dots) to the
ISO_8859-5 <-> MULE_INTERNAL conversion tables.

This was discovered when trying to convert a string containing those characters
from ISO_8859-5 to Windows-1251, because we use MULE_INTERNAL/KOI8R as an
intermediate encoding between those two.

While the missing "Yo" was just an omission in the conversion tables, there are
a few other characters like the "Numero" sign ("No" as a single character) that
exists in all the other cyrillic encodings (win1251, ISO_8859-5 and cp866), but
not in KOI8R. Added comments about that.

Patch by Sergey Burladyan. Back-patch to 7.4.
2008-03-20 10:30:04 +00:00
Peter Eisentraut 8c87cc370f Catch all errors in for and while loops in makefiles. Don't ignore any
errors in any commands, including in various clean targets that have so far
been handled inconsistently.  make -i is available to ignore all errors in
a consistent and official way.
2008-03-18 16:24:50 +00:00
Peter Eisentraut 0474dcb608 Refactor backend makefiles to remove lots of duplicate code 2008-02-19 10:30:09 +00:00
Tom Lane a9742f123c Remove incorrect (and ill-advised anyway) pfree's in pg_convert_from and
pg_convert_to.  Per bug #3866 from Andrew Gilligan.
2008-01-09 23:43:54 +00:00
Tom Lane ce9baa06f0 Fix some missed copyright updates. 2008-01-01 20:31:21 +00:00
Bruce Momjian 9098ab9e32 Update copyrights in source tree to 2008. 2008-01-01 19:46:01 +00:00
Bruce Momjian 0c2c061eb0 Cleanup for new else/comment handling. 2007-11-16 01:11:04 +00:00
Bruce Momjian 7d4c99b414 Fix pgindent to properly handle 'else' and single-line comments on the
same line;  previous fix was only partial.  Re-run pgindent on files
that need it.
2007-11-15 23:23:44 +00:00
Bruce Momjian fdf5a5efb7 pgindent run for 8.3. 2007-11-15 21:14:46 +00:00
Tom Lane febd60bf5d Fix pg_wchar_table[] to match revised ordering of the encoding ID enum.
Add some comments so hopefully the next poor sod doesn't fall into the
same trap.  (Wrong comments are worse than none at all...)
2007-10-15 22:46:27 +00:00
Tom Lane 8468146b03 Fix the inadvertent libpq ABI breakage discovered by Martin Pitt: the
renumbering of encoding IDs done between 8.2 and 8.3 turns out to break 8.2
initdb and psql if they are run with an 8.3beta1 libpq.so.  For the moment
we can rearrange the order of enum pg_enc to keep the same number for
everything except PG_JOHAB, which isn't a problem since there are no direct
references to it in the 8.2 programs anyway.  (This does force initdb
unfortunately.)

Going forward, we want to fix things so that encoding IDs can be changed
without an ABI break, and this commit includes the changes needed to allow
libpq's encoding IDs to be treated as fully independent of the backend's.
The main issue is that libpq clients should not include pg_wchar.h or
otherwise assume they know the specific values of libpq's encoding IDs,
since they might encounter version skew between pg_wchar.h and the libpq.so
they are using.  To fix, have libpq officially export functions needed for
encoding name<=>ID conversion and validity checking; it was doing this
anyway unofficially.

It's still the case that we can't renumber backend encoding IDs until the
next bump in libpq's major version number, since doing so will break the
8.2-era client programs.  However the code is now prepared to avoid this
type of problem in future.

Note that initdb is no longer a libpq client: we just pull in the two
source files we need directly.  The patch also fixes a few places that
were being sloppy about checking for an unrecognized encoding name.
2007-10-13 20:18:42 +00:00
Andrew Dunstan a1b14ae1dd Add comments re text <-> bytea internal equivalence in convert routines. 2007-09-24 16:38:24 +00:00
Andrew Dunstan 82467e4e70 Use correct PG_GETARG macro in pg_convert 2007-09-24 14:59:37 +00:00
Andrew Dunstan 55613bf9cd Close previously open holes for invalidly encoded data to enter the
database via builtin functions, as recently discussed on -hackers.

chr() now returns a character in the database encoding. For UTF8 encoded databases
the argument is treated as a Unicode code point. For other multi-byte encodings
the argument must designate a strict ascii character, or an error is raised,
as is also the case if the argument is 0.

ascii() is adjusted so that it remains the inverse of chr().

The two argument form of convert() is gone, and the three argument form now
takes a bytea first argument and returns a bytea. To cover this loss three new
functions are introduced:
. convert_from(bytea, name) returns text - converts the first argument from the
  named encoding to the database encoding
. convert_to(text, name) returns bytea - converts the first argument from the
  database encoding to the named encoding
. length(bytea, name) returns int - gives the length of the first argument in
  characters in the named encoding
2007-09-18 17:41:17 +00:00
Tom Lane 4dbbef2845 Suppress an integer-overflow warning. 2007-07-12 21:17:09 +00:00
Tom Lane fa98a86f65 Tweak the code in a couple of places to try to deliver more user-friendly
error messages when a single COPY line is too long for us to handle.  Per
example from Johann Spies.
2007-05-28 16:43:24 +00:00
Tom Lane 274dfdb513 Tweak clean_encoding_name() API to avoid need to cast away const.
Kris Jurka
2007-04-16 18:50:49 +00:00
Tatsuo Ishii 6041b92238 Make JOHAB client only encoding per discussions in pgsql-hackers
"Server-side support of all encodings" around 2007/3/26.
initdb required.
2007-04-15 10:56:30 +00:00
Tatsuo Ishii bf47e3e419 Fix description how to create conversion function. 2007-04-15 10:49:26 +00:00
Bruce Momjian b8f856512e Fix typo in Makefile.
Marko Kreen
2007-03-27 14:29:51 +00:00
Bruce Momjian 9dd3ec6c3b Remove advertising clause from Berkeley BSD-licensed files, per
instructions from Berkeley.
2007-03-26 21:44:11 +00:00
Tatsuo Ishii a6fbd2f12a Fix pg_wchar_table's maxmblen field of EUC_CN, EUC_TW, MULE_INTERNAL
and GB18030. patches from ITAGAKI Takahiro.
2007-03-26 11:15:13 +00:00
Tatsuo Ishii 75c6519ff6 Add new encoding EUC_JIS_2004 and SHIFT_JIS_2004,
along with new conversions among EUC_JIS_2004, SHIFT_JIS_2004 and UTF-8.
catalog version has been bump up.
2007-03-25 11:56:04 +00:00
Tatsuo Ishii 4c35ec53a9 Allow 4 bytes UTF-8 (UCS-4 range 00010000-001FFFFF)
This is necessary to support JIS X 0213 <--> UTF-8 conversion.
2007-03-23 13:51:30 +00:00
Tom Lane 234a02b2a8 Replace direct assignments to VARATT_SIZEP(x) with SET_VARSIZE(x, len).
Get rid of VARATT_SIZE and VARATT_DATA, which were simply redundant with
VARSIZE and VARDATA, and as a consequence almost no code was using the
longer names.  Rename the length fields of struct varlena and various
derived structures to catch anyplace that was accessing them directly;
and clean up various places so caught.  In itself this patch doesn't
change any behavior at all, but it is necessary infrastructure if we hope
to play any games with the representation of varlena headers.
Greg Stark and Tom Lane
2007-02-27 23:48:10 +00:00
Peter Eisentraut c138b966d4 Replace useless uses of := by = in makefiles. 2007-02-09 15:56:00 +00:00
Tom Lane 0887fa1117 Get pg_utf_mblen(), pg_utf2wchar_with_len(), and utf2ucs() all on the same
page about the maximum UTF8 sequence length we support (4 bytes since 8.1,
3 before that).  pg_utf2wchar_with_len never got updated to support 4-byte
characters at all, and in any case had a buffer-overrun risk in that it
could produce multiple pg_wchars from what mblen claims to be just one UTF8
character.  The only reason we don't have a major security hole is that most
callers allocate worst-case output buffers; the sole exception in released
versions appears to be pre-8.2 iwchareq() (ie, ILIKE), which can be crashed
due to zeroing out its return address --- but AFAICS that can't be exploited
for anything more than a crash, due to inability to control what gets written
there.  Per report from James Russell and Michael Fuhr.

Pre-8.1 the risk is much less, but I still think pg_utf2wchar_with_len's
behavior given an incomplete final character risks buffer overrun, so
back-patch that logic change anyway.

This patch also makes sure that UTF8 sequences exceeding the supported
length (whichever it is) are consistently treated as error cases, rather
than being treated like a valid shorter sequence in some places.
2007-01-24 17:12:17 +00:00
Peter Eisentraut 2cc01004c6 Remove remains of old depend target. 2007-01-20 17:16:17 +00:00
Bruce Momjian 29dccf5fe0 Update CVS HEAD for 2007 copyright. Back branches are typically not
back-stamped for this.
2007-01-05 22:20:05 +00:00
Tom Lane e9da20ab4d Fix machine-dependent crash in sqlchar_to_unicode(). Get rid of
bletcherous and unsafe manipulation of global encoding setting.
Clean up libxml reporting mechanism a bit (it still looks like a
dangling-pointer crash waiting to happen, though, not to mention
being far less than sane from a localization standpoint).
2006-12-24 00:57:48 +00:00
Peter Eisentraut 8c1de5fb00 Initial SQL/XML support: xml data type and initial set of functions. 2006-12-21 16:05:16 +00:00
Peter Eisentraut 3cd318a8d1 Fix gratuitous message spelling differences 2006-11-27 15:50:55 +00:00
Peter Eisentraut b9b4f10b5b Message style improvements 2006-10-06 17:14:01 +00:00
Bruce Momjian f99a569a2e pgindent run for 8.2. 2006-10-04 00:30:14 +00:00
Bruce Momjian a3132359fd In new "invalid byte sequence" error hint, call it "error", not
"failure".
2006-08-22 12:11:28 +00:00
Bruce Momjian e11cab650c Add hint for "invalid byte sequence for encoding" error message,
suggesting review of client_encoding.
2006-08-22 03:30:20 +00:00
Bruce Momjian e0522505bd Remove 576 references of include files that were not needed. 2006-07-14 14:52:27 +00:00
Bruce Momjian ac230e7431 Alphabetically order reference to include files, "S"-"Z". 2006-07-11 18:26:11 +00:00
Bruce Momjian 3a534ade39 Alphabetically order reference to include files, "G" - "M". 2006-07-11 17:04:13 +00:00
Bruce Momjian 399a36a75d Prepare code to be built by MSVC:
o  remove many WIN32_CLIENT_ONLY defines
	o  add WIN32_ONLY_COMPILER define
	o  add 3rd argument to open() for portability
	o  add include/port/win32_msvc directory for
	   system includes

Magnus Hagander
2006-06-07 22:24:46 +00:00
Tom Lane a0ffab351e Magic blocks don't do us any good unless we use 'em ... so install one
in every shared library.
2006-05-30 22:12:16 +00:00
Tom Lane c61a2f5841 Change the backend to reject strings containing invalidly-encoded multibyte
characters in all cases.  Formerly we mostly just threw warnings for invalid
input, and failed to detect it at all if no encoding conversion was required.
The tighter check is needed to defend against SQL-injection attacks as per
CVE-2006-2313 (further details will be published after release).  Embedded
zero (null) bytes will be rejected as well.  The checks are applied during
input to the backend (receipt from client or COPY IN), so it no longer seems
necessary to check in textin() and related routines; any string arriving at
those functions will already have been validated.  Conversion failure
reporting (for characters with no equivalent in the destination encoding)
has been cleaned up and made consistent while at it.

Also, fix a few longstanding errors in little-used encoding conversion
routines: win1251_to_iso, win866_to_iso, euc_tw_to_big5, euc_tw_to_mic,
mic_to_euc_tw were all broken to varying extents.

Patches by Tatsuo Ishii and Tom Lane.  Thanks to Akio Ishida and Yasuo Ohgaki
for identifying the security issues.
2006-05-21 20:05:21 +00:00
Bruce Momjian f3d99d160d Add CVS tag lines to files that were lacking them. 2006-03-11 04:38:42 +00:00
Bruce Momjian f2f5b05655 Update copyright for 2006. Update scripts. 2006-03-05 15:59:11 +00:00
Tatsuo Ishii b3d0442ab3 Tighten up SJIS byte sequence check. Now we reject invalid SJIS byte
sequence such as "0x95 0x27". Patches from Akio Ishida.
Also update copyright notice.
2006-03-04 10:57:35 +00:00
Peter Eisentraut 7f4f42fa10 Clean up CREATE FUNCTION syntax usage in contrib and elsewhere, in
particular get rid of single quotes around language names and old WITH ()
construct.
2006-02-27 16:09:50 +00:00
Peter Eisentraut 268c1b6077 The Makefile was invoking perl scripts as ./script.pl. This fails when
the script is not executable as UCS_to_most.pl is in CVS.  It also won't
pick up any custom setting of the perl version/location to use.  This
patch calls perl scripts like $(PERL) $(srcdir)/script.pl.

Kris Jurka
2006-02-24 13:25:44 +00:00
Peter Eisentraut 1b658473ea Add support for Windows codepages 1253, 1254, 1255, and 1257 and clean
up a bunch of the support utilities.

In src/backend/utils/mb/Unicode remove nearly duplicate copies of the
UCS_to_XXX perl script and replace with one version to handle all generic
files.  Update the Makefile so that it knows about all the map files.
This produces a slight difference in some of the map files, using a
uniform naming convention and not mapping the null character.

In src/backend/utils/mb/conversion_procs create a master utf8<->win
codepage function like the ISO 8859 versions instead of having a separate
handler for each conversion.

There is an externally visible change in the name of the win1258 to utf8
conversion.  According to the documentation notes, it was named
incorrectly and this changes it to a standard name.

Running the Unicode mapping perl scripts has shown some additional mapping
changes in koi8r and iso8859-7.
2006-02-18 16:15:23 +00:00
Tom Lane 226a980bb0 Fix bug that allowed any logged-in user to SET ROLE to any other database user
id (CVE-2006-0553).  Also fix related bug in SET SESSION AUTHORIZATION that
allows unprivileged users to crash the server, if it has been compiled with
Asserts enabled.  The escalation-of-privilege risk exists only in 8.1.0-8.1.2.
However, the Assert-crash risk exists in all releases back to 7.3.
Thanks to Akio Ishida for reporting this problem.
2006-02-12 22:32:43 +00:00
Bruce Momjian 2a5180c26e Throw a warning rather than an error on invalid character from UTF8 to
Latin1, like we do for other Latin encodings.
2006-02-12 21:15:19 +00:00
Bruce Momjian c01999a557 Allow psql multi-line column values to align in the proper columns
If the second output column value is 'a\nb', the 'b' should appear
  in the second display column, rather than the first column as it
  does now.

Change libpq's PQdsplen() to return more useful values.

> Note: this changes the PQdsplen function, it can now return zero or
> minus one which was not possible before. It doesn't appear anyone is
> actually using the functions other than psql but it is a change. The
> functions are not actually documentated anywhere so it's not like we're
> breaking a defined interface. The new semantics follow the Unicode
> standard.

BACKWARD COMPATIBLE CHANGE.

The only user-visible change I saw in the regression tests is that a
SELECT * on a table where all the columns have been dropped doesn't
return a blank line like before.  This seems like a step forward.

Martijn van Oosterhout
2006-02-10 00:39:04 +00:00
Neil Conway d3a4d63387 mbutils was previously doing some allocations, including invoking
fmgr_info(), in the TopMemoryContext. I couldn't see that the code
actually leaked, but in general I think it's fragile to assume that
pfree'ing an FmgrInfo along with its fn_extra field is enough to
reclaim all the resources allocated by fmgr_info().  I changed the
code to do its allocations in a new child context of
TopMemoryContext, MbProcContext. When we want to release the
allocations we can just reset the context, which is cleaner.
2006-01-12 22:04:02 +00:00
Neil Conway fb627b76cc Cosmetic code cleanup: fix a bunch of places that used "return (expr);"
rather than "return expr;" -- the latter style is used in most of the
tree. I kept the parentheses when they were necessary or useful because
the return expression was complex.
2006-01-11 08:43:13 +00:00
Neil Conway 762bcbdba2 Remove a confusing pair of parentheses. 2006-01-11 06:59:22 +00:00
Bruce Momjian a2384d008a More uses of IS_HIGHBIT_SET() macro. 2005-12-26 19:30:45 +00:00
Bruce Momjian 261114a23f I have added these macros to c.h:
#define HIGHBIT                 (0x80)
        #define IS_HIGHBIT_SET(ch)      ((unsigned char)(ch) & HIGHBIT)

and removed CSIGNBIT and mapped it uses to HIGHBIT.  I have also added
uses for IS_HIGHBIT_SET where appropriate.  This change is
purely for code clarity.
2005-12-25 02:14:19 +00:00
Bruce Momjian d8a8183456 Formatting cleanups. 2005-12-24 17:19:40 +00:00
Bruce Momjian 0658a6a634 Formatting cleanup. 2005-12-24 16:49:48 +00:00
Tatsuo Ishii 804f6b8fc9 Fix long standing Asian multibyte charsets bug.
See:

Subject: [HACKERS] bugs with certain Asian multibyte charsets
From: Tatsuo Ishii <ishii@sraoss.co.jp>
To: pgsql-hackers@postgresql.org
Date: Sat, 24 Dec 2005 18:25:33 +0900 (JST)

for more details/
2005-12-24 09:35:36 +00:00
Tatsuo Ishii dcc7da8d5e Fix for rearranging encoding id ISO-8859-5 to ISO-8859-8.
Also make the code more robust by searching for target encoding
in the internal charset map.

Problem reported by Sagi Bashari on 2005/12/21.
See "[BUGS] BUG #2120: Crash when doing UTF8<->ISO_8859_8 encoding conversion"
on pgsql-bugs list for more details.
2005-12-23 02:11:02 +00:00
Peter Eisentraut a29c04a541 Allow installation into directories containing spaces in the name. 2005-12-09 21:19:36 +00:00
Bruce Momjian 436a2956d8 Re-run pgindent, fixing a problem where comment lines after a blank
comment line where output as too long, and update typedefs for /lib
directory.  Also fix case where identifiers were used as variable names
in the backend, but as typedefs in ecpg (favor the backend for
indenting).

Backpatch to 8.1.X.
2005-11-22 18:17:34 +00:00
Peter Eisentraut 07bb9f086b Message corrections 2005-10-29 00:31:52 +00:00
Bruce Momjian 1dc3498251 Standard pgindent run for 8.1. 2005-10-15 02:49:52 +00:00
Tom Lane 8889685555 Suppress signed-vs-unsigned-char warnings. 2005-09-24 17:53:28 +00:00
Tom Lane d78397d301 Change typreceive function API so that receive functions get the same
optional arguments as text input functions, ie, typioparam OID and
atttypmod.  Make all the datatypes that use typmod enforce it the same
way in typreceive as they do in typinput.  This fixes a problem with
failure to enforce length restrictions during COPY FROM BINARY.
2005-07-10 21:14:00 +00:00
Tatsuo Ishii e2d088de03 Allow direct conversion between EUC_JP and SJIS to improve
performance. patches submitted by Atsushi Ogawa.
2005-06-24 13:56:39 +00:00
Bruce Momjian 5955945828 Support 3 and 4-byte unicode characters.
John Hansen
2005-06-15 00:15:08 +00:00
Tatsuo Ishii b4cbd60fcf Fix bug in MIC -> EUC_JP conversion. Per Atsushi Ogawa. 2005-06-10 16:43:56 +00:00
Tom Lane 893b57c871 Alter the signature for encoding conversion functions to declare the
output area as INTERNAL not CSTRING.  This is to prevent people from
calling the functions by hand.  This is a permanent solution for the
back branches but I hope it is just a stopgap for HEAD.
2005-05-03 19:17:59 +00:00
Bruce Momjian e7fb9f18bf Add support for Win1252 encoding.
Roland Volkmann
2005-03-14 18:31:25 +00:00
Bruce Momjian 41e2a80f57 Update comments for new encoding names. 2005-03-14 00:19:13 +00:00
Bruce Momjian ee1bd33dd0 Document aliases for our supported encodings.
Add a few encodings that were not documented.
2005-03-13 01:26:30 +00:00
Neil Conway 4cd2fd66f8 Unbreak out-of-tree builds, by fixing a typo. 2005-03-07 23:18:06 +00:00
Bruce Momjian e3d7de6b99 Rename canonical encodings, per Peter:
UNICODE => UTF8
	ALT => WIN866
	WIN => WIN1251
	TCVN => WIN1258

The old codes continue to work.
2005-03-07 04:30:55 +00:00
Tom Lane 7e1c8ef4fc Some more missed copyright notices. Many of these look like they
should have been caught by the src/tools/copyright script ... why
weren't they?
2005-01-01 20:44:34 +00:00
PostgreSQL Daemon 2ff501590b Tag appropriate files for rc3
Also performed an initial run through of upgrading our Copyright date to
extend to 2005 ... first run here was very simple ... change everything
where: grep 1996-2004 && the word 'Copyright' ... scanned through the
generated list with 'less' first, and after, to make sure that I only
picked up the right entries ...
2004-12-31 22:04:05 +00:00
Bruce Momjian e09567d850 Back out addition of Win1252 encoding. 2004-12-04 18:19:33 +00:00
Bruce Momjian 08e0b34bad Back out fix for Unicode characters above 0x10000 2004-12-03 01:20:33 +00:00
Bruce Momjian 4ea4f8bd06 Fix for Unicode characters above 0x10000.
John Hansen
2004-12-02 22:37:14 +00:00
Bruce Momjian 7af770d005 Add Charset WIN1252 support.
Roland Volkmann
2004-12-02 22:14:38 +00:00
Neil Conway 7069dbcc31 More minor cosmetic improvements:
- remove another senseless "extern" keyword that was applied to a
function definition
- change a foo more function signatures from "some_type foo()" to
"some_type foo(void)"
- rewrite another K&R style function definition
- make the type of the "action" function pointer in the KeyWord struct
in src/backend/utils/adt/formatting.c more precise
2004-10-13 01:25:13 +00:00
Neil Conway 0e72b9d440 Cosmetic improvements/code cleanup:
- replace some function signatures of the form "some_type foo()" with
"some_type foo(void)"
- replace a few instances of a literal 0 being used as a NULL pointer;
there are more instances of this in the code, but I just fixed a few
- in src/backend/utils/mb/wstrncmp.c, replace K&R style function
declarations with ANSI style, remove use of 'register' keyword
- remove an "extern" modifier that was applied to a function definition
(rather than a declaration)
2004-10-10 23:37:45 +00:00
Bruce Momjian e1c8b37afb Add new macro as shorthand for MS VC and Borland C++:
+ #if   defined(_MSC_VER) || defined(__BORLANDC__)
+ #define       WIN32_CLIENT_ONLY
+ #endif
2004-09-27 23:24:45 +00:00
Peter Eisentraut 152a101f2b Allow WIN1250 as server encoding. 2004-09-17 21:59:57 +00:00
Bruce Momjian 15d3f9f6b7 Another pgindent run with lib typedefs added. 2004-08-30 02:54:42 +00:00
Bruce Momjian b6b71b85bc Pgindent run for 8.0. 2004-08-29 05:07:03 +00:00
Bruce Momjian da9a8649d8 Update copyright to 2004. 2004-08-29 04:13:13 +00:00
Tatsuo Ishii e8c3205037 Add PQmbdsplen() which returns the "display length" of a character.
Still some works needed:
- UTF-8, MULE_INTERNAL always returns 1
2004-03-15 10:41:26 +00:00
Tom Lane ecb156d484 If we don't have shared libraries, we don't have conversions. Make
conversion_create.sql be empty (except for a helpful comment) in this
case.  Allows initdb to succeed with --disable-shared.
2004-01-21 19:22:19 +00:00
Tom Lane a4f8f124b7 Fix bit-rot in support for building with --disable-shared. This patch
gets us past 'make install', but initdb still fails for lack of conversion
libraries ...
2004-01-21 19:04:11 +00:00
PostgreSQL Daemon 55b113257c make sure the $Id tags are converted to $PostgreSQL as well ... 2003-11-29 22:41:33 +00:00
PostgreSQL Daemon 969685ad44 $Header: -> $PostgreSQL Changes ... 2003-11-29 19:52:15 +00:00
Peter Eisentraut feb4f44d29 Message editing: remove gratuitous variations in message wording, standardize
terms, add some clarifications, fix some untranslatable attempts at dynamic
message building.
2003-09-25 06:58:07 +00:00
Tatsuo Ishii 0c9f978c0c Fix GB18030 to UTF-8 mapping table 2003-08-25 01:46:16 +00:00
Tatsuo Ishii b4ab39ff05 Fix GB18030 to UTF-8 mapping table 2003-08-24 05:18:04 +00:00
Peter Eisentraut 200b7d11af Fix uninstall target. 2003-08-23 04:22:34 +00:00
Tom Lane f65643771b Conversion functions must be STRICT to prevent them from getting null inputs. 2003-08-08 14:31:12 +00:00
Tom Lane 2f9c859ea1 Fix some copyright notices that weren't updated. Improve copyright tool
so it won't miss 'em again.
2003-08-04 23:59:41 +00:00
Bruce Momjian f3c3deb7d0 Update copyrights to 2003. 2003-08-04 02:40:20 +00:00
Bruce Momjian 089003fb46 pgindent run. 2003-08-04 00:43:34 +00:00
Tom Lane b6a1d25b0a Error message editing in utils/adt. Again thanks to Joe Conway for doing
the bulk of the heavy lifting ...
2003-07-27 04:53:12 +00:00
Tom Lane 689eb53e47 Error message editing in backend/utils (except /adt). 2003-07-25 20:18:01 +00:00
Bruce Momjian b14295cfe4 Attached is the complete diff against current CVS.
Compiles on BCC 5.5 and VC++ 6.0 (with warnings).

Karl Waclawek
2003-06-12 08:15:29 +00:00
Bruce Momjian dc4ee8a833 Back out patch that got bundled into another patch. 2003-06-12 08:11:07 +00:00
Bruce Momjian a647e30ba3 New patch with corrected README attached.
Also quickly added mention that it may be a qualified schema name.

Rod Taylor
2003-06-12 08:02:57 +00:00
Bruce Momjian 12c9423832 Allow Win32 to compile under MinGW. Major changes are:
Win32 port is now called 'win32' rather than 'win'
        add -lwsock32 on Win32
        make gethostname() be only used when kerberos4 is enabled
        use /port/getopt.c
        new /port/opendir.c routines
        disable GUC unix_socket_group on Win32
        convert some keywords.c symbols to KEYWORD_P to prevent conflict
        create new FCNTL_NONBLOCK macro to turn off socket blocking
        create new /include/port.h file that has /port prototypes, move
          out of c.h
        new /include/port/win32_include dir to hold missing include files
        work around ERROR being defined in Win32 includes
2003-05-15 16:35:30 +00:00
Tom Lane 351372e585 Department of second thoughts: probably still need an IsTransactionState
test in there...
2003-04-27 18:01:46 +00:00
Tom Lane 5f15fa8d06 Clean up some problems in SetClientEncoding: failed to honor doit flag
in all cases, leaked TopMemoryContext memory in others.  Make the
interaction between SetClientEncoding and InitializeClientEncoding
cleaner and better documented.  I suspect these changes should be
back-patched into 7.3, but will wait on Tatsuo's verification.
2003-04-27 17:31:25 +00:00
Tatsuo Ishii 35a0995992 Fix encoding conversion function bug.
See following posting for more details.

Subject: Re: [HACKERS] [BUGS] Bug #943: Server-Encoding from EUC_TW to UTF-8 doesn't
From: Tatsuo Ishii <t-ishii@sra.co.jp>
To: michael.enke@wincor-nixdorf.com, pgsql-bugs@postgresql.org
Cc: pgsql-hackers@postgresql.org
Date: Sat, 12 Apr 2003 10:51:45 +0900 (JST)
2003-04-12 07:53:57 +00:00
Tom Lane 1d650da2e5 This is a derived file and should never have been added to CVS. 2003-04-02 00:58:08 +00:00
Bruce Momjian 4b0b8dadd2 Add new files. 2003-03-27 16:53:15 +00:00
Tom Lane e4704001ea This patch fixes a bunch of spelling mistakes in comments throughout the
PostgreSQL source code.

Neil Conway
2003-03-10 22:28:22 +00:00
Tatsuo Ishii e2a618fe25 Fix for GUC client_encoding variable not being handled
correctly. See following thread for more details.

Subject: [HACKERS] client_encoding directive is ignored in postgresql.conf
From: Tatsuo Ishii <t-ishii@sra.co.jp>
Date: Wed, 29 Jan 2003 22:24:04 +0900 (JST)
2003-02-19 14:31:26 +00:00
Tom Lane b8add56ed0 Fix array subscript overruns identified by Yichen Xie. 2003-01-29 01:01:05 +00:00
Tatsuo Ishii 38535f8e32 Fix typo in an error message 2003-01-11 06:55:11 +00:00
Peter Eisentraut 4ed6be54e2 Fix Latin9/Unicode conversion by selecting the right table. 2002-12-09 19:47:21 +00:00
Bruce Momjian ceab6f7283 As far as I figured from the source code this function only deals with
cleaning up locale names and nothing else. Since all the locale names
are in plain  ASCII I think it will be safe to use ASCII-only lower-case
conversion.

Nicolai Tufar
2002-12-05 23:21:07 +00:00
Tatsuo Ishii ac47950238 Guard against 0 length string encoding conversion case. 2002-11-26 02:22:29 +00:00
Tatsuo Ishii 90a06dba16 Fix broken GB18030 <--> UTF-8 conversion map 2002-11-12 11:33:40 +00:00
Tom Lane 5123139210 Remove encoding lookups from grammar stage, push them back to places
where it's safe to do database access.  Along the way, fix core dump
for 'DEFAULT' parameters to CREATE DATABASE.  initdb forced due to
change in pg_proc entry.
2002-11-02 18:41:22 +00:00
Tom Lane 3518fbe86f Add missing semicolons to a few PG_FUNCTION_INFO_V1 calls. 2002-10-26 15:01:01 +00:00
Peter Eisentraut 8c3ab663ab Tweak conversion names to follow the established naming scheme, and
document that scheme.
2002-09-24 20:14:59 +00:00
Tatsuo Ishii 4b23f05c4f Fix bug in encoding conversion map. 2002-09-18 02:10:10 +00:00
Tatsuo Ishii 4c0bdd1ba8 Update Japanese README so that it reflects the changes made to the
conversion function interface.
2002-09-18 01:21:28 +00:00
Tatsuo Ishii 3357577247 Change Assert(len > 0) to Assert(len >= 0)
Change PG_RETURN_INT32(0) to PG_RETURN_VOID()
2002-09-13 06:41:18 +00:00
Peter Eisentraut 337da0678a Assorted fixes for Cygwin:
Eliminate the mysterious games that the Cygwin build plays with the linker
flag variables.  DLLLIBS is gone, use SHLIB_LINK like everyone else.
Detect cygipc in configure, after the linker flags are set up, otherwise
configure might not work at all.

Make sure everything is covered by make clean.

Fix the build of the new conversion procedure modules.

Add new DLLIMPORT markers where required.

Finally, the compiler complains if we use an explicit
-I/usr/local/include, so don't do that.  Curiously, -L/usr/local/lib is
still necessary.
2002-09-05 18:28:46 +00:00
Bruce Momjian e50f52a074 pgindent run. 2002-09-04 20:31:48 +00:00
Tom Lane 07c67187bf Avoid multiple scans of utils/mb/conversion_procs/ subdirectories during
'make install'; there are enough of 'em that this slowed down the make
noticeably.  Ensure that 'all' is the default make target in all these
directories (defaulting to 'make install' is surprising and dangerous
IMHO).  Fix a couple small typos.
2002-09-04 15:45:50 +00:00
Tatsuo Ishii 97592e6a6c Refrect the changes to src/test/regress/sql/conversion.sql By Tom. 2002-09-04 02:42:34 +00:00
Peter Eisentraut 77f7763b55 Remove all traces of multibyte and locale options. Clean up comments
referring to "multibyte" where it really means character encoding.
2002-09-03 21:45:44 +00:00
Tatsuo Ishii ed7baeaf4d Remove #ifdef MULTIBYTE per hackers list discussion. 2002-08-29 07:22:30 +00:00
Bruce Momjian ff1793f036 Remove erroneous character from Makefile due to editor error. 2002-08-22 02:18:45 +00:00
Tom Lane b663f3443b Add a bunch of pseudo-types to replace the behavior formerly associated
with OPAQUE, as per recent pghackers discussion.  I still want to do some
more work on the 'cstring' pseudo-type, but I'm going to commit the bulk
of the changes now before the tree starts shifting under me ...
2002-08-22 00:01:51 +00:00
Bruce Momjian d46e3dc00f Changes made so new conversion Makefiles will build out of the source tree. 2002-08-21 21:33:55 +00:00
Tatsuo Ishii 10b374aecf Fix bug in pg_convert() per report from MaC.Yui.
It pfree() wrong pointer.
2002-08-19 04:08:08 +00:00
Tatsuo Ishii 538b101595 Fix memory leak in SetClientEncoding(). 2002-08-14 05:33:34 +00:00
Tatsuo Ishii 969e0246ed Add Cyrillic and other encodings for encoding conversion.
Patches submitted by Kaori Inaba (i-kaori@sra.co.jp).
2002-08-14 02:45:10 +00:00
Tatsuo Ishii 697b472099 Address build problems on cygwin and (hopefully) AIX. 2002-08-08 07:47:43 +00:00
Tatsuo Ishii 3c63578a7e Load and keep conversion function info when SET CLIENT_ENCODING TO is
executed to prevent database access while performing encoding
conversion.
2002-08-08 06:35:26 +00:00
Tatsuo Ishii 6206a880cf Add SQL99 CONVERT() function. 2002-08-06 05:40:47 +00:00
Tatsuo Ishii 0345f58496 Implement DROP CONVERSION
Add regression test
2002-07-25 10:07:13 +00:00
Tatsuo Ishii 19a20e04bd Add Japanese README explaining how to add new conversion.
English README will come soon...
2002-07-24 07:05:41 +00:00
Tatsuo Ishii 86270024ff Oops. Too much ifdef out. 2002-07-19 11:09:25 +00:00
Tatsuo Ishii 248cbb5796 Temporary ifdef out migrating functions to avoid compiler warnings. 2002-07-19 00:22:24 +00:00
Peter Eisentraut 85d2a629c6 Create directory before installing files. 2002-07-18 22:58:08 +00:00
Tatsuo Ishii eb335a034b I have committed many support files for CREATE CONVERSION. Default
conversion procs and conversions are added in initdb. Currently
supported conversions are:

UTF-8(UNICODE) <--> SQL_ASCII, ISO-8859-1 to 16, EUC_JP, EUC_KR,
		    EUC_CN, EUC_TW, SJIS, BIG5, GBK, GB18030, UHC,
		    JOHAB, TCVN

EUC_JP <--> SJIS
EUC_TW <--> BIG5
MULE_INTERNAL <--> EUC_JP, SJIS, EUC_TW, BIG5

Note that initial contents of pg_conversion system catalog are created
in the initdb process. So doing initdb required is ideal, it's
possible to add them to your databases by hand, however. To accomplish
this:

psql -f your_postgresql_install_path/share/conversion_create.sql your_database

So I did not bump up the version in cataversion.h.

TODO:
Add more conversion procs
Add [CASCADE|RESTRICT] to DROP CONVERSION
Add tuples to pg_depend
Add regression tests
Write docs
Add SQL99 CONVERT command?
--
Tatsuo Ishii
2002-07-18 02:02:30 +00:00
Tatsuo Ishii 3c7798f068 Add conversion procs for CREATE CONVERSION 2002-07-16 09:25:06 +00:00
Tatsuo Ishii 15378a53f8 Add support for GB18030 2002-06-14 03:30:56 +00:00
Tatsuo Ishii 14f72b9a4d Add GB18030 support. Contributed by Bill Huang <bill_huanghb@ybb.ne.jp>
(ODBC support has not been committed yet. left for Hiroshi...)
2002-06-13 08:30:22 +00:00
Bruce Momjian e6227fd0ec Add missing Unicode multibyte files. 2002-03-06 06:12:59 +00:00
Bruce Momjian 92288a1cf9 Change made to elog:
o  Change all current CVS messages of NOTICE to WARNING.  We were going
to do this just before 7.3 beta but it has to be done now, as you will
see below.

o Change current INFO messages that should be controlled by
client_min_messages to NOTICE.

o Force remaining INFO messages, like from EXPLAIN, VACUUM VERBOSE, etc.
to always go to the client.

o Remove INFO from the client_min_messages options and add NOTICE.

Seems we do need three non-ERROR elog levels to handle the various
behaviors we need for these messages.

Regression passed.
2002-03-06 06:10:59 +00:00
Bruce Momjian a8bd7e1c6e > Tatsuo Ishii wrote:
> > > > It was made to cope with encoding such as an Asian bloc in 7.2Beta2.
> > > >
> > > > Added ServerEncoding
> > > >         Korean (JOHAB), Thai (WIN874),
> > > >         Vietnamese (TCVN), Arabic (WIN1256)
> > > >
> > > > Added ClientEncoding
> > > >         Simplified Chinese (GBK), Korean (UHC)
> > > >
> > > >
> > > >
> http://www.sankyo-unyu.co.jp/Pool/postgresql-7.2b2.newencoding.diff.tar.gz
> > > > (608K)
> > >
> > > Looks good.  I need some people to review this for me.
> >
> > For me they look good too. The only missing part is a
> > documentation. I will ask him to write it up. If he couldn't, I will
> > do it for him.
> > > The diff is 3mb
> > > but appears to address only additions to multibyte.  I have attached a
> > > list of files it modifies.  Also, look at the sizes of the mb/
> > > directory.  It is getting large:
> > >
> > >   4       ./CVS
> > >   6       ./Unicode/CVS
> > >   3433    ./Unicode
> > >   6197    .
> >
> > Yes. We definitely need the on-the-fly encoding addition capability:
> > i.e. CREATE CHRACTER SET in the future...
> > --
> > Tatsuo Ishii
> >
> >

Address chainge.

http://www.sankyo-unyu.co.jp/Pool/postgresql-7.2.newencoding.diff.gz

Add PsqlODBC and document ...etc patch.

Eiji Tokuya
2002-03-05 05:52:50 +00:00
Tatsuo Ishii 933761e7b1 Simplify pg_convert() in that it calls pg_convert2 using new fmgr interface. 2001-11-20 01:32:29 +00:00
Tatsuo Ishii 5590d5fe99 Fix nasty bugs in pg_convert() and pg_convert2().
o they sometimes returns a result garbage string appended.
    o they do not work if client encoding is different from server
      encoding
2001-11-19 06:48:39 +00:00
Bruce Momjian ea08e6cd55 New pgindent run with fixes suggested by Tom. Patch manually reviewed,
initdb/regression tests pass.
2001-11-05 17:46:40 +00:00
Bruce Momjian 6783b2372e Another pgindent run. Fixes enum indenting, and improves #endif
spacing.  Also adds space for one-line comments.
2001-10-28 06:26:15 +00:00
Bruce Momjian b81844b173 pgindent run on all C files. Java run to follow. initdb/regression
tests pass.
2001-10-25 05:50:21 +00:00
Tatsuo Ishii cfe01796e6 Ok, here is the modified encoding table (column1 is the standard name,
2 is our "official" name, and 3 is alias). If there's no objection, I
will change them.

ASCII		SQL_ASCII
UTF-8		UNICODE		UTF_8
MULE-INTERNAL	MULE_INTERNAL
ISO-8859-1	LATIN1		ISO_8859_1
ISO-8859-2	LATIN2		ISO_8859_2
ISO-8859-3	LATIN3		ISO_8859_3
ISO-8859-4	LATIN4		ISO_8859_4
ISO-8859-5	ISO_8859_5
ISO-8859-6	ISO_8859_6
ISO-8859-7	ISO_8859_7
ISO-8859-8	ISO_8859_8
ISO-8859-9	LATIN5		ISO_8859_9
ISO-8859-10	LATIN6		ISO_8859_10
ISO-8859-13	LATIN7		ISO_8859_13
ISO-8859-14	LATIN8		ISO_8859_14
ISO-8859-15	LATIN9		ISO_8859_15
ISO-8859-16	LATIN10		ISO_8859_16
2001-10-16 10:09:17 +00:00
Tatsuo Ishii d07bacd54a Add UTF-8 char >= 0x10000 check 2001-10-15 01:19:15 +00:00
Tatsuo Ishii f426465ba9 Add a new function "pg_client_encoding" which returns the current client
side encoding name. This is necessary for client API's such as JDBC
to perform correct encoding conversions. See my email "[HACKERS]
pg_client_encoding" 10 Sep 2001.
2001-10-12 02:08:34 +00:00
Tatsuo Ishii 51053d3216 Add support for ISO-8859-6 to 16 2001-10-11 14:20:35 +00:00
Tatsuo Ishii 1b20315008 Fix bug in mic2ascii(). It does not handle correctly if none ASCII
chars are in the input.
2001-09-25 01:27:03 +00:00
Tatsuo Ishii be629abfc8 Add pg_database_encoding_max_length() function. 2001-09-23 10:59:45 +00:00
Tatsuo Ishii 8ebdac0ed5 Remove test drivers
Also fix comment in conv.c.
2001-09-22 08:44:49 +00:00
Tom Lane e3f5bc3492 Fix type_maximum_size() to give the right answer in MULTIBYTE cases.
Avoid use of prototype-less function pointers in MB code.
2001-09-21 15:27:38 +00:00
Peter Eisentraut fd5e95971e Remove old file. 2001-09-19 21:28:55 +00:00
Tatsuo Ishii e1de3e0833 Implement following item in TODO:
* Reject character sequences those are not valid in their charset
2001-09-11 04:50:36 +00:00
Tatsuo Ishii d330f09a56 Backout Karel's patch 2001-09-09 01:15:11 +00:00
Bruce Momjian fdbf796f36 > > A simple and robus solution is in the begin of mbutils.c set default
> > ClientEncoding to SQL_ASCII (like default DatabaseEncoding). Bruce, can
> > you change it? It's one line change. Again thanks.

 Forget it! A default client encoding must be set by actual database encoding...
Please apply the small attached patch that solve it better.

Karel Zak
2001-09-08 14:30:15 +00:00
Bruce Momjian d9044b5637 Remove file, per Karel. 2001-09-07 15:14:16 +00:00
Bruce Momjian 4ea26bf354 Remove variable length macros used in debugging, per Karel. 2001-09-07 15:01:45 +00:00
Bruce Momjian 7bfc83f673 Remove unused files for Karel's patch. 2001-09-07 14:17:17 +00:00
Bruce Momjian 9f5185cf63 Remove common.c, removed in Karal's patch. 2001-09-07 14:00:25 +00:00
Tatsuo Ishii 3bdd67a203 Add missing files. 2001-09-07 03:32:11 +00:00
Tatsuo Ishii 227767112c Commit Karel's patch.
-------------------------------------------------------------------
Subject: Re: [PATCHES] encoding names
From: Karel Zak <zakkr@zf.jcu.cz>
To: Peter Eisentraut <peter_e@gmx.net>
Cc: pgsql-patches <pgsql-patches@postgresql.org>
Date: Fri, 31 Aug 2001 17:24:38 +0200

On Thu, Aug 30, 2001 at 01:30:40AM +0200, Peter Eisentraut wrote:
> > 		- convert encoding 'name' to 'id'
>
> I thought we decided not to add functions returning "new" names until we
> know exactly what the new names should be, and pending schema

 Ok, the patch not to add functions.

> better
>
>     ...(): encoding name too long

 Fixed.

 I found new bug in command/variable.c in parse_client_encoding(), nobody
probably never see this error:

if (pg_set_client_encoding(encoding))
{
	elog(ERROR, "Conversion between %s and %s is not supported",
                     value, GetDatabaseEncodingName());
}

because pg_set_client_encoding() returns -1 for error and 0 as true.
It's fixed too.

 IMHO it can be apply.

		Karel
PS:

    * following files are renamed:

src/utils/mb/Unicode/KOI8_to_utf8.map  -->
        src/utils/mb/Unicode/koi8r_to_utf8.map

src/utils/mb/Unicode/WIN_to_utf8.map  -->
        src/utils/mb/Unicode/win1251_to_utf8.map

src/utils/mb/Unicode/utf8_to_KOI8.map -->
        src/utils/mb/Unicode/utf8_to_koi8r.map

src/utils/mb/Unicode/utf8_to_WIN.map -->
        src/utils/mb/Unicode/utf8_to_win1251.map

   * new file:

src/utils/mb/encname.c

   * removed file:

src/utils/mb/common.c

--
 Karel Zak  <zakkr@zf.jcu.cz>
 http://home.zf.jcu.cz/~zakkr/

 C, PostgreSQL, PHP, WWW, http://docs.linux.cz, http://mape.jcu.cz
2001-09-06 04:57:30 +00:00
Tatsuo Ishii ab9b6c45cf Add conver/convert2 functions. They are similar to the SQL99's convert. 2001-08-15 07:07:40 +00:00
Tatsuo Ishii 1032445e5d TODO item:
* Make n of CHAR(n)/VARCHAR(n) the number of letters, not bytes
2001-07-15 11:07:37 +00:00
Tatsuo Ishii e23f8c4557 Fix a message error in utf_to_local 2001-05-28 01:00:25 +00:00
Bruce Momjian 0cec2bb0cd BTW it does not add encodign it just patches existing one (KOI8) to
support two - KOI8-R and KOI8-U (latter is superset of the former if
not to take to the account pseudographics)

Andy Rysin
2001-05-03 21:38:45 +00:00
Tatsuo Ishii c527366b60 Add missing Unicode support for Cyrillic encodings.
Patches contributed by Victor Wagner.
2001-04-29 07:27:38 +00:00
Tatsuo Ishii b9be04e63d Add a crash gurard to pg_encoding_mblen in case of an invalid encoding
given.
2001-04-19 02:34:35 +00:00
Tatsuo Ishii 722f7efdd9 Correction for mathematical properties in Unicode converison maps.
Patches contributed by Eiji Tokuya (e-tokuya@sankyo-unyu.co.jp)
2001-04-16 06:10:19 +00:00
Tom Lane fbee97664e getdatabaseencoding() and PG_encoding_to_char() were being sloppy about
converting char* strings to type 'name'.  Imagine my surprise when 7.1
release coredumped upon start when compiled --enable-multibyte ...
2001-04-16 02:42:01 +00:00
Tom Lane ccd415c63f Fix unportable assumptions about alignment of local char[n] variables. 2001-03-25 23:23:59 +00:00
Bruce Momjian 9e1552607a pgindent run. Make it all clean. 2001-03-22 04:01:46 +00:00
Tom Lane 572fda2711 Modify wchar conversion routines to not fetch the next byte past the end
of a counted input string.  Marinos Yannikos' recent crash report turns
out to be due to applying pg_ascii2wchar_with_len to a TEXT object that
is smack up against the end of memory.  This is the second just-barely-
reproducible bug report I have seen that traces to some bit of code
fetching one more byte than it is allowed to.  Let's be more careful
out there, boys and girls.
While at it, I changed the code to not risk a similar crash when there
is a truncated multibyte character at the end of an input string.  The
output in this case might not be the most reasonable output possible;
if anyone wants to improve it further, step right up...
2001-03-08 00:24:34 +00:00
Tatsuo Ishii 5735c4cf3d Enhanced UTF-8/SJIS mapping generator, contributed by
Eiji Tokuya" <e-tokuya@Mail.Sankyo-Unyu.co.jp>
2001-02-23 08:44:33 +00:00
Tatsuo Ishii 5c90733558 Unicode <-> SJIS new mapping tables (based on CP932.TXT) contributed by
Eiji Tokuya" <e-tokuya@Mail.Sankyo-Unyu.co.jp>
2001-02-15 01:56:29 +00:00
Tatsuo Ishii 8f17e53f0e Move pg_encoding_mblen() from common.c to wchar.c. 2001-02-11 01:59:22 +00:00
Tatsuo Ishii f54c02d2eb conv.c did not compile anymore. Fix wrong header file inclusion. 2001-02-11 01:56:58 +00:00
Tom Lane d08741eab5 Restructure the key include files per recent pghackers discussion: there
are now separate files "postgres.h" and "postgres_fe.h", which are meant
to be the primary include files for backend .c files and frontend .c files
respectively.  By default, only include files meant for frontend use are
installed into the installation include directory.  There is a new make
target 'make install-all-headers' that adds the whole content of the
src/include tree to the installed fileset, for use by people who want to
develop server-side code without keeping the complete source tree on hand.
Cleaned up a whole lot of crufty and inconsistent header inclusions.
2001-02-10 02:31:31 +00:00
Tatsuo Ishii cfe26c0fb1 Fix a bug in conversion from big5 to EUC_TW (CNS 11643-1992 Plane 3)
Thanks Chih-Chang Hsieh <cch@cc.kmu.edu.tw> for finding the bug.
2000-12-09 04:27:36 +00:00
Peter Eisentraut e5ba2fc5b5 Make all commands that link a program look like
$(CC) $(CFLAGS) $(LDFLAGS) <object files> <extra-libraries> $(LIBS) -o $@

This form seemed to be the most portable, readable, and logical, but in any
case it's better than having a dozen different ones in the tree.
2000-11-30 20:36:13 +00:00
Tatsuo Ishii 188065cb5c Unicode conversion fix suggested by Jan Varga...
--------------------------------------------------
Subject: Bug in unicode conversion ...
From: Jan Varga <varga@utcru.sk>
To: t-ishii@sra.co.jp
Date: Sat, 18 Nov 2000 17:41:20 +0100 (CET)


Hi,

I tried this new feature in PostgreSQL. I found one bug.
Script UCS_to_8859.pl skips input lines which
1. code <0x80 or
2. ucs <0x100

I think second one is not good idea because some codes in ISO8859-2
have ucs <0x100 (e.g. 0xE9 - 0x00E9)
--------------------------------------------------
2000-11-26 10:40:43 +00:00
Tatsuo Ishii 8a35ac24f8 Fix bugs in EUC_TW support. This fix includes patches contributed
by Chih-Chang Hsi. See "A Patch for MIC to EUC_TW code converting in
mb support" posting in pgsql-patches list dated 09 Nov 2000.
2000-11-17 04:42:10 +00:00
Tom Lane 2cf48ca04b Extend CREATE DATABASE to allow selection of a template database to be
cloned, rather than always cloning template1.  Modify initdb to generate
two identical databases rather than one, template0 and template1.
Connections to template0 are disallowed, so that it will always remain
in its virgin as-initdb'd state.  pg_dumpall now dumps databases with
restore commands that say CREATE DATABASE foo WITH TEMPLATE = template0.
This allows proper behavior when there is user-added data in template1.
initdb forced!
2000-11-14 18:37:49 +00:00
Tatsuo Ishii 1acf6f9c8e Add support for code conversion between Unicode and other encodings.
Supported encodings are: EUC_JP, EUC_CN, EUC_KR, EUC_TW, Shift JIS,
Big5, ISO8859-[1-5].
TODO: testings! and documentations...
2000-10-30 10:41:05 +00:00
Tatsuo Ishii 2969c01d55 Remove gcc-only macro definition 2000-10-27 02:23:51 +00:00
Tom Lane 0a63b6d066 Support SET/SHOW/RESET client_encoding and server_encoding even when
MULTIBYTE support is not compiled (you just can't set them to anything
but SQL_ASCII).  This should reduce interoperability problems between
MB-enabled clients and non-MB-enabled servers.
2000-10-25 19:44:44 +00:00
Peter Eisentraut 805e431a38 Add support for VPATH builds, that is, building somewhere else than in the
source directory.  This involves mostly makefiles using $(srcdir) when they
might have used ".".  (Regression tests don't work with this, yet.)

Sort out usage of CPPFLAGS, CFLAGS (and CXXFLAGS).  Add "override" keyword
in most places, to preserve necessary flags even when the user overrode the
flags.
2000-10-20 21:04:27 +00:00
Tatsuo Ishii de53ce8131 Support for conversion between UNICODE and other encodings
currently ISO8859-[1-5] and EUC_JP are supported.
support for other encodings will be coming soon.
2000-10-12 06:06:50 +00:00
Peter Eisentraut 424f0edcb8 Fix relative path references so that make knowns which dependencies refer
to one another. Sort out builddir vs srcdir variable namings. Remove some
now obsoleted make variables.
2000-08-31 16:12:35 +00:00
Tatsuo Ishii bfdd6a716d Change pg_mblen and pg_encoding_mblen return types from void
to int so that they return the number of whcars.
2000-08-27 10:40:48 +00:00
Tom Lane 1aebc3618a First phase of memory management rewrite (see backend/utils/mmgr/README
for details).  It doesn't really do that much yet, since there are no
short-term memory contexts in the executor, but the infrastructure is
in place and long-term contexts are handled reasonably.  A few long-
standing bugs have been fixed, such as 'VACUUM; anything' in a single
query string crashing.  Also, out-of-memory is now considered a
recoverable ERROR, not FATAL.
Eliminate a large amount of crufty, now-dead code in and around
memory management.
Fix problem with holding off SIGTRAP, SIGSEGV, etc in postmaster and
backend startup.
2000-06-28 03:33:33 +00:00
Tom Lane f2d1205322 Another batch of fmgr updates. I think I have gotten all old-style
functions that take pass-by-value datatypes.  Should be ready for
port testing ...
2000-06-13 07:35:40 +00:00
Tom Lane 091126fa28 Generated header files parse.h and fmgroids.h are now copied into
the src/include tree, so that -I backend is no longer necessary anywhere.
Also, clean up some bit rot in contrib tree.
2000-05-29 05:45:56 +00:00
Tatsuo Ishii 1a6daef70d Enhance multibyte support.
SJIS UDC (NEC selection IBM kanji) support contributed by Eiji Tokuya
2000-05-20 13:12:26 +00:00
Tom Lane 87e701b8d5 Clean up const-vs-not-const compiler warning in MULTIBYTE code.
'Twas my fault, I think.
2000-04-20 22:40:18 +00:00
Bruce Momjian 52f77df613 Ye-old pgindent run. Same 4-space tabs. 2000-04-12 17:17:23 +00:00
Tatsuo Ishii 6f843e8dd8 Fix pg_euccn_mblen() so that it always returns 2 if data is not ascii.
(EUC_CN does have only code set 0 and 1)
2000-01-25 02:12:27 +00:00
Peter Eisentraut 533d516629 Removed MBFLAGS from makefiles since it's now done in include/config.h. 2000-01-19 02:59:03 +00:00
Tatsuo Ishii 716fb90bf6 Fix minor comping errors 2000-01-18 13:44:48 +00:00
Tatsuo Ishii b1e891dbd4 Remove compiler warnings 2000-01-18 05:14:24 +00:00
Tatsuo Ishii 1f9d535aca Add UDC (User Defined Characters) support to SJIS/EUC_JP conversion
Update README so that it reflects all source file names
Add an entry to make sjistest (testing between SJIS/EUC_JP conversion)
2000-01-13 01:08:14 +00:00
Bruce Momjian a82f9ffde6 New LDOUT makefile variable for QNX os. 1999-12-13 22:35:27 +00:00
Bruce Momjian 3ffd3d82db Make LD -r as macros that can be changed for QNX. 1999-12-09 19:15:45 +00:00
Tom Lane 4644fc8071 Eliminate query length limitation imposed by pg_client_to_server
and pg_server_to_client.  Eliminate copy.c's restriction on the length
of a single attribute.
1999-09-11 22:28:11 +00:00
Bruce Momjian e259780b13 Enable WIN32 compilation of libpq. 1999-07-19 06:25:40 +00:00
Bruce Momjian e44c931801 Re-add getopt.h check, remove NT-specific tests for it. 1999-07-19 02:27:16 +00:00
Bruce Momjian 3406901a29 Move some system includes into c.h, and remove duplicates. 1999-07-17 20:18:55 +00:00
Bruce Momjian 33e826d167 Fix for multi-byte includes. 1999-07-17 16:25:28 +00:00
Bruce Momjian a9591ce66a Change #include's to use <> and "" as appropriate. 1999-07-15 23:04:24 +00:00
Tatsuo Ishii 8f02f2252d Fix some compiler warnings (Tomoaki Nishiyama), add WIN1250 support (Pavel Behal) 1999-07-11 22:47:21 +00:00
Bruce Momjian 9c56b408c4 Add fix for 0x7fU constants to pgindent 1999-05-26 15:20:04 +00:00
Bruce Momjian fcff1cdf4e Another pgindent run. Sorry folks. 1999-05-25 22:43:53 +00:00
Bruce Momjian 4eadfe8754 Make 0x007f -> (unsigned)0x7f to make pgindent happy. 1999-05-25 22:04:56 +00:00
Bruce Momjian 07842084fe pgindent run over code. 1999-05-25 16:15:34 +00:00
Tatsuo Ishii 0c1e2e493d set client_encoding to <nothing> crashes backend. 1999-05-13 10:28:26 +00:00
Tom Lane 0d99c95388 Correct potential infinite loop in pg_utf2wchar_with_len;
it failed to cover the case where high bits of char are 100 or 101.
Not sure if fix is right, but it agrees with pg_utf_mblen ... and it
doesn't lock up ...
1999-04-25 20:35:51 +00:00
Tom Lane 6f668ee108 ifdef out some unused routines to suppress gcc warnings. 1999-04-25 18:09:54 +00:00
Tatsuo Ishii 5ae9d85f77 Add KOI8/WIN/ALT support 1999-03-24 07:02:17 +00:00
Tatsuo Ishii eb42c1c762 These small utilities are for generating internal tables from
rcode encoding tables.
1999-03-24 07:01:37 +00:00
Bruce Momjian a7ad43cd18 Included patches make some enhancements to the multi-byte support.
o allow to use Big5 (a Chinese encoding used in Taiwan) as a client
  encoding. In this case the server side encoding should be EUC_TW

o add EUC_TW and Big5 test cases to the regression and the mb test
  (contributed by Jonah Kuo)

o fix mistake in include/mb/pg_wchar.h. An encoding id for EUC_TW was
  not correct (was 3 and now is 4)

o update documents (doc/README.mb and README.mb.jp)

o update psql helpfile (bin/psql/psqlHelp.h)

--
Tatsuo Ishii
t-ishii@sra.co.jp
1999-02-02 18:51:40 +00:00
Bruce Momjian ffb90a01fd Current multi-byte related codes have a bug with SQL_ASCII
support. Included patches will solve it and should be applied to
both trees.  Also, it fix the problem with \c command of psql when
switching different encoding databases.

Regression tests passed.
--
Tatsuo Ishii
t-ishii@sra.co.jp
1998-12-14 04:59:58 +00:00
Bruce Momjian e1ebac319d Here are the patches against the current source tree. I have run the
regression test on a FreeBSD box with both non-MULTIBYTE and
MULTIBYTE-enabled, and confirmed that the results are same.

However I do not tested on PCs(I don't have access to win). Please let
me know if the patches break anything on PCs.

Also please note that the patch for varchar.c is a fix for a nasty bug
of char(n) types that I introduced and I believe at least this should
be applied.

Tatsuo Ishii
1998-10-06 03:02:29 +00:00
Bruce Momjian f52e7346ea MB patches from Tatsuo Ishii 1998-09-25 01:46:25 +00:00
Bruce Momjian fa1a8d6a97 OK, folks, here is the pgindent output. 1998-09-01 04:40:42 +00:00
Bruce Momjian af74855a60 Renaming cleanup, no pgindent yet. 1998-09-01 03:29:17 +00:00