Commit Graph

46 Commits

Author SHA1 Message Date
Noah Misch ec79978dd0 Encoding PG_UHC is code page 949.
This fixes presentation of non-ASCII messages to the Windows event log
and console in rare cases involving Korean locale.  Processes like the
postmaster and checkpointer, but not processes attached to databases,
were affected.  Back-patch to 9.4, where MessageEncoding was introduced.
The problem exists in all supported versions, but this change has no
effect in the absence of the code recognizing PG_UHC MessageEncoding.

Noticed while investigating bug #13427 from Dmitri Bourlatchkov.
2015-08-14 20:23:13 -04:00
Noah Misch d098b236f3 Fix typos in comments. 2014-06-11 19:50:29 -04:00
Tom Lane 0d79c0a8cc Make various variables const (read-only).
These changes should generally improve correctness/maintainability.
A nice side benefit is that several kilobytes move from initialized
data to text segment, allowing them to be shared across processes and
probably reducing copy-on-write overhead while forking a new backend.
Unfortunately this doesn't seem to help libpq in the same way (at least
not when it's compiled with -fpic on x86_64), but we can hope the linker
at least collects all nominally-const data together even if it's not
actually part of the text segment.

Also, make pg_encname_tbl[] static in encnames.c, since there seems
no very good reason for any other code to use it; per a suggestion
from Wim Lewis, who independently submitted a patch that was mostly
a subset of this one.

Oskari Saarenmaa, with some editorialization by me
2014-01-18 16:04:32 -05:00
Noah Misch 5f538ad004 Renovate display of non-ASCII messages on Windows.
GNU gettext selects a default encoding for the messages it emits in a
platform-specific manner; it uses the Windows ANSI code page on Windows
and follows LC_CTYPE on other platforms.  This is inconvenient for
PostgreSQL server processes, so realize consistent cross-platform
behavior by calling bind_textdomain_codeset() on Windows each time we
permanently change LC_CTYPE.  This primarily affects SQL_ASCII databases
and processes like the postmaster that do not attach to a database,
making their behavior consistent with PostgreSQL on non-Windows
platforms.  Messages from SQL_ASCII databases use the encoding implied
by the database LC_CTYPE, and messages from non-database processes use
LC_CTYPE from the postmaster system environment.  PlatformEncoding
becomes unused, so remove it.

Make write_console() prefer WriteConsoleW() to write() regardless of the
encodings in use.  In this situation, write() will invariably mishandle
non-ASCII characters.

elog.c has assumed that messages conform to the database encoding.
While usually true, this does not hold for SQL_ASCII and MULE_INTERNAL.
Introduce MessageEncoding to track the actual encoding of message text.
The present consumers are Windows-specific code for converting messages
to UTF16 for use in system interfaces.  This fixes the appearance in
Windows event logs and consoles of translated messages from SQL_ASCII
processes like the postmaster.  Note that SQL_ASCII inherently disclaims
a strong notion of encoding, so non-ASCII byte sequences interpolated
into messages by %s may yet yield a nonsensical message.  MULE_INTERNAL
has similar problems at present, albeit for a different reason: its lack
of libiconv support or a conversion to UTF8.

Consequently, one need no longer restart Windows with a different
Windows ANSI code page to broadly test backend logging under a given
language.  Changing the user's locale ("Format") is enough.  Several
accounts can simultaneously run postmasters under different locales, all
correctly logging localized messages to Windows event logs and consoles.

Alexander Law and Noah Misch
2013-06-26 11:17:33 -04:00
Andrew Dunstan 3717f0837b Tidy up from frontend Assert change.
Quiet compiler warnings noted by Peter Eisentraut.
2012-12-16 12:22:57 -05:00
Magnus Hagander 9f2e211386 Remove cvs keywords from all files. 2010-09-20 22:08:53 +02:00
Magnus Hagander 748771379b Write to the Windows eventlog in UTF16, converting the message encoding
as necessary.

Itagaki Takahiro with some changes from me
2009-10-17 00:24:51 +00:00
Magnus Hagander 420ea68817 Move gettext encoding names into encnames.c, so we only have one place to update.
Per discussion.
2009-04-24 08:43:51 +00:00
Peter Eisentraut 8b9dd6b5fd Support for KOI8U encoding 2009-02-10 19:29:39 +00:00
Bruce Momjian fdf5a5efb7 pgindent run for 8.3. 2007-11-15 21:14:46 +00:00
Tom Lane febd60bf5d Fix pg_wchar_table[] to match revised ordering of the encoding ID enum.
Add some comments so hopefully the next poor sod doesn't fall into the
same trap.  (Wrong comments are worse than none at all...)
2007-10-15 22:46:27 +00:00
Tom Lane 8468146b03 Fix the inadvertent libpq ABI breakage discovered by Martin Pitt: the
renumbering of encoding IDs done between 8.2 and 8.3 turns out to break 8.2
initdb and psql if they are run with an 8.3beta1 libpq.so.  For the moment
we can rearrange the order of enum pg_enc to keep the same number for
everything except PG_JOHAB, which isn't a problem since there are no direct
references to it in the 8.2 programs anyway.  (This does force initdb
unfortunately.)

Going forward, we want to fix things so that encoding IDs can be changed
without an ABI break, and this commit includes the changes needed to allow
libpq's encoding IDs to be treated as fully independent of the backend's.
The main issue is that libpq clients should not include pg_wchar.h or
otherwise assume they know the specific values of libpq's encoding IDs,
since they might encounter version skew between pg_wchar.h and the libpq.so
they are using.  To fix, have libpq officially export functions needed for
encoding name<=>ID conversion and validity checking; it was doing this
anyway unofficially.

It's still the case that we can't renumber backend encoding IDs until the
next bump in libpq's major version number, since doing so will break the
8.2-era client programs.  However the code is now prepared to avoid this
type of problem in future.

Note that initdb is no longer a libpq client: we just pull in the two
source files we need directly.  The patch also fixes a few places that
were being sloppy about checking for an unrecognized encoding name.
2007-10-13 20:18:42 +00:00
Tom Lane 274dfdb513 Tweak clean_encoding_name() API to avoid need to cast away const.
Kris Jurka
2007-04-16 18:50:49 +00:00
Tatsuo Ishii 6041b92238 Make JOHAB client only encoding per discussions in pgsql-hackers
"Server-side support of all encodings" around 2007/3/26.
initdb required.
2007-04-15 10:56:30 +00:00
Tatsuo Ishii 75c6519ff6 Add new encoding EUC_JIS_2004 and SHIFT_JIS_2004,
along with new conversions among EUC_JIS_2004, SHIFT_JIS_2004 and UTF-8.
catalog version has been bump up.
2007-03-25 11:56:04 +00:00
Bruce Momjian e0522505bd Remove 576 references of include files that were not needed. 2006-07-14 14:52:27 +00:00
Bruce Momjian 399a36a75d Prepare code to be built by MSVC:
o  remove many WIN32_CLIENT_ONLY defines
	o  add WIN32_ONLY_COMPILER define
	o  add 3rd argument to open() for portability
	o  add include/port/win32_msvc directory for
	   system includes

Magnus Hagander
2006-06-07 22:24:46 +00:00
Peter Eisentraut 1b658473ea Add support for Windows codepages 1253, 1254, 1255, and 1257 and clean
up a bunch of the support utilities.

In src/backend/utils/mb/Unicode remove nearly duplicate copies of the
UCS_to_XXX perl script and replace with one version to handle all generic
files.  Update the Makefile so that it knows about all the map files.
This produces a slight difference in some of the map files, using a
uniform naming convention and not mapping the null character.

In src/backend/utils/mb/conversion_procs create a master utf8<->win
codepage function like the ISO 8859 versions instead of having a separate
handler for each conversion.

There is an externally visible change in the name of the win1258 to utf8
conversion.  According to the documentation notes, it was named
incorrectly and this changes it to a standard name.

Running the Unicode mapping perl scripts has shown some additional mapping
changes in koi8r and iso8859-7.
2006-02-18 16:15:23 +00:00
Tom Lane 226a980bb0 Fix bug that allowed any logged-in user to SET ROLE to any other database user
id (CVE-2006-0553).  Also fix related bug in SET SESSION AUTHORIZATION that
allows unprivileged users to crash the server, if it has been compiled with
Asserts enabled.  The escalation-of-privilege risk exists only in 8.1.0-8.1.2.
However, the Assert-crash risk exists in all releases back to 7.3.
Thanks to Akio Ishida for reporting this problem.
2006-02-12 22:32:43 +00:00
Neil Conway fb627b76cc Cosmetic code cleanup: fix a bunch of places that used "return (expr);"
rather than "return expr;" -- the latter style is used in most of the
tree. I kept the parentheses when they were necessary or useful because
the return expression was complex.
2006-01-11 08:43:13 +00:00
Bruce Momjian 1dc3498251 Standard pgindent run for 8.1. 2005-10-15 02:49:52 +00:00
Bruce Momjian e7fb9f18bf Add support for Win1252 encoding.
Roland Volkmann
2005-03-14 18:31:25 +00:00
Bruce Momjian ee1bd33dd0 Document aliases for our supported encodings.
Add a few encodings that were not documented.
2005-03-13 01:26:30 +00:00
Bruce Momjian e3d7de6b99 Rename canonical encodings, per Peter:
UNICODE => UTF8
	ALT => WIN866
	WIN => WIN1251
	TCVN => WIN1258

The old codes continue to work.
2005-03-07 04:30:55 +00:00
Bruce Momjian e09567d850 Back out addition of Win1252 encoding. 2004-12-04 18:19:33 +00:00
Bruce Momjian 7af770d005 Add Charset WIN1252 support.
Roland Volkmann
2004-12-02 22:14:38 +00:00
Bruce Momjian e1c8b37afb Add new macro as shorthand for MS VC and Borland C++:
+ #if   defined(_MSC_VER) || defined(__BORLANDC__)
+ #define       WIN32_CLIENT_ONLY
+ #endif
2004-09-27 23:24:45 +00:00
Peter Eisentraut 152a101f2b Allow WIN1250 as server encoding. 2004-09-17 21:59:57 +00:00
PostgreSQL Daemon 55b113257c make sure the $Id tags are converted to $PostgreSQL as well ... 2003-11-29 22:41:33 +00:00
Tom Lane 689eb53e47 Error message editing in backend/utils (except /adt). 2003-07-25 20:18:01 +00:00
Bruce Momjian b14295cfe4 Attached is the complete diff against current CVS.
Compiles on BCC 5.5 and VC++ 6.0 (with warnings).

Karl Waclawek
2003-06-12 08:15:29 +00:00
Bruce Momjian dc4ee8a833 Back out patch that got bundled into another patch. 2003-06-12 08:11:07 +00:00
Bruce Momjian a647e30ba3 New patch with corrected README attached.
Also quickly added mention that it may be a qualified schema name.

Rod Taylor
2003-06-12 08:02:57 +00:00
Bruce Momjian 12c9423832 Allow Win32 to compile under MinGW. Major changes are:
Win32 port is now called 'win32' rather than 'win'
        add -lwsock32 on Win32
        make gethostname() be only used when kerberos4 is enabled
        use /port/getopt.c
        new /port/opendir.c routines
        disable GUC unix_socket_group on Win32
        convert some keywords.c symbols to KEYWORD_P to prevent conflict
        create new FCNTL_NONBLOCK macro to turn off socket blocking
        create new /include/port.h file that has /port prototypes, move
          out of c.h
        new /include/port/win32_include dir to hold missing include files
        work around ERROR being defined in Win32 includes
2003-05-15 16:35:30 +00:00
Tom Lane e4704001ea This patch fixes a bunch of spelling mistakes in comments throughout the
PostgreSQL source code.

Neil Conway
2003-03-10 22:28:22 +00:00
Bruce Momjian ceab6f7283 As far as I figured from the source code this function only deals with
cleaning up locale names and nothing else. Since all the locale names
are in plain  ASCII I think it will be safe to use ASCII-only lower-case
conversion.

Nicolai Tufar
2002-12-05 23:21:07 +00:00
Bruce Momjian e50f52a074 pgindent run. 2002-09-04 20:31:48 +00:00
Peter Eisentraut 77f7763b55 Remove all traces of multibyte and locale options. Clean up comments
referring to "multibyte" where it really means character encoding.
2002-09-03 21:45:44 +00:00
Tatsuo Ishii 14f72b9a4d Add GB18030 support. Contributed by Bill Huang <bill_huanghb@ybb.ne.jp>
(ODBC support has not been committed yet. left for Hiroshi...)
2002-06-13 08:30:22 +00:00
Bruce Momjian a8bd7e1c6e > Tatsuo Ishii wrote:
> > > > It was made to cope with encoding such as an Asian bloc in 7.2Beta2.
> > > >
> > > > Added ServerEncoding
> > > >         Korean (JOHAB), Thai (WIN874),
> > > >         Vietnamese (TCVN), Arabic (WIN1256)
> > > >
> > > > Added ClientEncoding
> > > >         Simplified Chinese (GBK), Korean (UHC)
> > > >
> > > >
> > > >
> http://www.sankyo-unyu.co.jp/Pool/postgresql-7.2b2.newencoding.diff.tar.gz
> > > > (608K)
> > >
> > > Looks good.  I need some people to review this for me.
> >
> > For me they look good too. The only missing part is a
> > documentation. I will ask him to write it up. If he couldn't, I will
> > do it for him.
> > > The diff is 3mb
> > > but appears to address only additions to multibyte.  I have attached a
> > > list of files it modifies.  Also, look at the sizes of the mb/
> > > directory.  It is getting large:
> > >
> > >   4       ./CVS
> > >   6       ./Unicode/CVS
> > >   3433    ./Unicode
> > >   6197    .
> >
> > Yes. We definitely need the on-the-fly encoding addition capability:
> > i.e. CREATE CHRACTER SET in the future...
> > --
> > Tatsuo Ishii
> >
> >

Address chainge.

http://www.sankyo-unyu.co.jp/Pool/postgresql-7.2.newencoding.diff.gz

Add PsqlODBC and document ...etc patch.

Eiji Tokuya
2002-03-05 05:52:50 +00:00
Bruce Momjian 6783b2372e Another pgindent run. Fixes enum indenting, and improves #endif
spacing.  Also adds space for one-line comments.
2001-10-28 06:26:15 +00:00
Bruce Momjian b81844b173 pgindent run on all C files. Java run to follow. initdb/regression
tests pass.
2001-10-25 05:50:21 +00:00
Tatsuo Ishii cfe01796e6 Ok, here is the modified encoding table (column1 is the standard name,
2 is our "official" name, and 3 is alias). If there's no objection, I
will change them.

ASCII		SQL_ASCII
UTF-8		UNICODE		UTF_8
MULE-INTERNAL	MULE_INTERNAL
ISO-8859-1	LATIN1		ISO_8859_1
ISO-8859-2	LATIN2		ISO_8859_2
ISO-8859-3	LATIN3		ISO_8859_3
ISO-8859-4	LATIN4		ISO_8859_4
ISO-8859-5	ISO_8859_5
ISO-8859-6	ISO_8859_6
ISO-8859-7	ISO_8859_7
ISO-8859-8	ISO_8859_8
ISO-8859-9	LATIN5		ISO_8859_9
ISO-8859-10	LATIN6		ISO_8859_10
ISO-8859-13	LATIN7		ISO_8859_13
ISO-8859-14	LATIN8		ISO_8859_14
ISO-8859-15	LATIN9		ISO_8859_15
ISO-8859-16	LATIN10		ISO_8859_16
2001-10-16 10:09:17 +00:00
Tatsuo Ishii 51053d3216 Add support for ISO-8859-6 to 16 2001-10-11 14:20:35 +00:00
Bruce Momjian 4ea26bf354 Remove variable length macros used in debugging, per Karel. 2001-09-07 15:01:45 +00:00
Tatsuo Ishii 3bdd67a203 Add missing files. 2001-09-07 03:32:11 +00:00