Commit Graph

159 Commits

Author SHA1 Message Date
Bruce Momjian 1d96cf9404 In documentation, change "recommendable" to "recommended", per
consultation with word definitions.

Backpatch to 9.2.
2012-08-14 12:36:48 -04:00
Tom Lane 1e16a8107d Teach regular expression operators to honor collations.
This involves getting the character classification and case-folding
functions in the regex library to use the collations infrastructure.
Most of this work had been done already in connection with the upper/lower
and LIKE logic, so it was a simple matter of transposition.

While at it, split out these functions into a separate source file
regc_pg_locale.c, so that they can be correctly labeled with the Postgres
project's license rather than the Scriptics license.  These functions are
100% Postgres-written code whereas what remains in regc_locale.c is still
mostly not ours, so lumping them both under the same copyright notice was
getting more and more misleading.
2011-04-10 18:03:09 -04:00
Tom Lane 9b19c12e1d Document collation handling in SQL and plpgsql functions.
This is pretty minimal but covers the bare facts.
2011-03-25 18:21:25 -04:00
Tom Lane 176d5bae1d Fix up handling of C/POSIX collations.
Install just one instance of the "C" and "POSIX" collations into
pg_collation, rather than one per encoding.  Make these instances exist
and do something useful even in machines without locale_t support: to wit,
it's now possible to force comparisons and case-folding functions to use C
locale in an otherwise non-C database, whether or not the platform has
support for using any additional collations.

Fix up severely broken upper/lower/initcap functions, too: the C/POSIX
fastpath now does what it is supposed to, and non-default collations are
handled correctly in single-byte database encodings.

Merge the two separate collation hashtables that were being maintained in
pg_locale.c, and be more wary of the possibility that we fail partway
through filling a cache entry.
2011-03-20 12:44:13 -04:00
Tom Lane a612b17120 Assorted editing for collation documentation.
I made a pass over this to familiarize myself with the feature, and found
some things that could be improved.
2011-03-08 17:10:59 -05:00
Peter Eisentraut b313bca0af DDL support for collations
- collowner field
- CREATE COLLATION
- ALTER COLLATION
- DROP COLLATION
- COMMENT ON COLLATION
- integration with extensions
- pg_dump support for the above
- dependency management
- psql tab completion
- psql \dO command
2011-02-12 15:55:18 +02:00
Peter Eisentraut 414c5a2ea6 Per-column collation support
This adds collation support for columns and domains, a COLLATE clause
to override it per expression, and B-tree index support.

Peter Eisentraut
reviewed by Pavel Stehule, Itagaki Takahiro, Robert Haas, Noah Misch
2011-02-08 23:04:18 +02:00
Bruce Momjian 5d5678d7c3 Properly capitalize documentation headings; some only had initial-word
capitalization.
2011-01-29 13:01:48 -05:00
Peter Eisentraut fc946c39ae Remove useless whitespace at end of lines 2010-11-23 22:34:55 +02:00
Magnus Hagander 9f2e211386 Remove cvs keywords from all files. 2010-09-20 22:08:53 +02:00
Peter Eisentraut a95e51962d Update broken and permanently moved links 2010-03-17 17:12:31 +00:00
Bruce Momjian bd7246f65e Update complex locale example in the documentation. 2010-02-28 02:20:40 +00:00
Bruce Momjian 0ff1c3e547 *** empty log message *** 2010-02-28 02:19:47 +00:00
Bruce Momjian bf62b1a078 Proofreading improvements for the Administration documentation book. 2010-02-03 17:25:06 +00:00
Peter Eisentraut 263144140f Some documentation cleanup for the addition of the KOI8U encoding. Change
all (remaining) mentions of KOI8 to the new canonical form KOI8R.  Add
information about the available conversions for KOI8U.
2009-05-18 08:59:29 +00:00
Tom Lane 421c66b76c Modify CREATE DATABASE to enforce that the source database's encoding setting
must be used for the new database, except when copying from template0.
This is the same rule that we now enforce for locale settings, and it has
the same motivation: databases other than template0 might contain data that
would be invalid according to a different setting.  This represents another
step in a continuing process of locking down ways in which encoding violations
could occur inside the backend.  Per discussion of a few days ago.

In passing, fix pre-existing breakage of mbregress.sh, and fix up a couple
of ereport() calls in dbcommands.c that failed to specify sqlstate codes.
2009-05-06 16:15:21 +00:00
Heikki Linnakangas 1eef90d0a2 Rename the new CREATE DATABASE options to set collation and ctype into
LC_COLLATE and LC_CTYPE, per discussion on pgsql-hackers.
2009-04-06 08:42:53 +00:00
Tom Lane 845693f70f Fix a couple of places that still claimed LC_COLLATE and LC_CTYPE can't
be changed after initdb.
2009-03-26 20:55:49 +00:00
Peter Eisentraut 8b9dd6b5fd Support for KOI8U encoding 2009-02-10 19:29:39 +00:00
Bruce Momjian 93eab311e4 Fix markup tag error, envvar -> envar. 2008-09-24 16:30:26 +00:00
Heikki Linnakangas c2d4526495 Tighten the check in initdb and CREATE DATABASE that the chosen encoding
matches the encoding of the locale. LC_COLLATE is now checked in addition
to LC_CTYPE.
2008-09-23 10:58:03 +00:00
Heikki Linnakangas 61d9674988 Make LC_COLLATE and LC_CTYPE database-level settings. Collation and
ctype are now more like encoding, stored in new datcollate and datctype
columns in pg_database.

This is a stripped-down version of Radek Strnad's patch, with further
changes by me.
2008-09-23 09:20:39 +00:00
Bruce Momjian bd53eb4b05 Add Swedish_Sweden.1252 Windows locale example to docs. 2008-07-15 17:45:03 +00:00
Bruce Momjian da0a9f1d5a Clarify that locale names on Windows are more verbose.
Report from Martin Saschek
2008-07-15 01:35:23 +00:00
Bruce Momjian 51c3727903 Move client encoding libpq function docs into libpq doc section, and
just reference them from the localization doc section.

Backpatch to 8.3.X.
2008-03-06 15:37:56 +00:00
Tom Lane 70b9b9b788 Change initdb and CREATE DATABASE to actively reject attempts to create
databases with encodings that are incompatible with the server's LC_CTYPE
locale, when we can determine that (which we can on most modern platforms,
I believe).  C/POSIX locale is compatible with all encodings, of course,
so there is still some usefulness to CREATE DATABASE's ENCODING option,
but this will insulate us against all sorts of recurring complaints
caused by mismatched settings.

I moved initdb's existing LC_CTYPE-to-encoding mapping knowledge into
a new src/port/ file so it could be shared by CREATE DATABASE.
2007-09-28 22:25:49 +00:00
Tatsuo Ishii 6041b92238 Make JOHAB client only encoding per discussions in pgsql-hackers
"Server-side support of all encodings" around 2007/3/26.
initdb required.
2007-04-15 10:56:30 +00:00
Tatsuo Ishii 75c6519ff6 Add new encoding EUC_JIS_2004 and SHIFT_JIS_2004,
along with new conversions among EUC_JIS_2004, SHIFT_JIS_2004 and UTF-8.
catalog version has been bump up.
2007-03-25 11:56:04 +00:00
Bruce Momjian a134ee3379 Update documentation on may/can/might:
Standard English uses "may", "can", and "might" in different ways:

        may - permission, "You may borrow my rake."

        can - ability, "I can lift that log."

        might - possibility, "It might rain today."

Unfortunately, in conversational English, their use is often mixed, as
in, "You may use this variable to do X", when in fact, "can" is a better
choice.  Similarly, "It may crash" is better stated, "It might crash".

Also update two error messages mentioned in the documenation to match.
2007-01-31 20:56:20 +00:00
Bruce Momjian 5b7be582e2 Update the UTF-8 RFC reference. RFC 2044 was obsoleted by RFC 2279,
which was obsoleted by RFC 3629.

Michael Fuhr
2007-01-09 22:22:55 +00:00
Bruce Momjian 32cebaecff Remove emacs info from footer of SGML files. 2006-09-16 00:30:20 +00:00
Tom Lane 0fd087af83 Fix table title. 2006-07-28 16:21:57 +00:00
Tom Lane b8cd6b4f25 Make it clearer that not every Postgres character set can be used as a
server-side character set.
2006-07-28 15:33:17 +00:00
Peter Eisentraut 1b658473ea Add support for Windows codepages 1253, 1254, 1255, and 1257 and clean
up a bunch of the support utilities.

In src/backend/utils/mb/Unicode remove nearly duplicate copies of the
UCS_to_XXX perl script and replace with one version to handle all generic
files.  Update the Makefile so that it knows about all the map files.
This produces a slight difference in some of the map files, using a
uniform naming convention and not mapping the null character.

In src/backend/utils/mb/conversion_procs create a master utf8<->win
codepage function like the ISO 8859 versions instead of having a separate
handler for each conversion.

There is an externally visible change in the name of the win1258 to utf8
conversion.  According to the documentation notes, it was named
incorrectly and this changes it to a standard name.

Running the Unicode mapping perl scripts has shown some additional mapping
changes in koi8r and iso8859-7.
2006-02-18 16:15:23 +00:00
Peter Eisentraut 39dfbe5791 Spellchecking run, final cleanups 2005-11-04 23:14:02 +00:00
Tom Lane a9980ec37b Describe the behavior of the SQL_ASCII encoding more accurately. 2005-10-13 21:43:43 +00:00
Tom Lane 6f7fc0bade Cause initdb to create a third standard database "postgres", which
unlike template0 and template1 does not have any special status in
terms of backend functionality.  However, all external utilities such
as createuser and createdb now connect to "postgres" instead of
template1, and the documentation is changed to encourage people to use
"postgres" instead of template1 as a play area.  This should fix some
longstanding gotchas involving unexpected propagation of database
objects by createdb (when you used template1 without understanding
the implications), as well as ameliorating the problem that CREATE
DATABASE is unhappy if anyone else is connected to template1.
Patch by Dave Page, minor editing by Tom Lane.  All per recent
pghackers discussions.
2005-06-21 04:02:34 +00:00
Tom Lane 85eee28cec Minor improvements to locale documentation. 2005-04-16 16:50:01 +00:00
Neil Conway 957f51ea6b Add a reference to the documentation on alternate index operator classes in
the locale docs. Patch from Chris KL, editorialization by Neil Conway.
2005-03-17 00:22:24 +00:00
Bruce Momjian 17c8276d24 Clean up win1252 documentation. Mention how we determine the number of
bytes/character for each encoding.
2005-03-15 02:30:33 +00:00
Bruce Momjian e7fb9f18bf Add support for Win1252 encoding.
Roland Volkmann
2005-03-14 18:31:25 +00:00
Neil Conway 9abced035d Fix mistakes in SGML markup. From David Fetter. 2005-03-14 06:49:48 +00:00
Bruce Momjian d1022ce3a1 Document client-only encodings. 2005-03-14 03:59:22 +00:00
Bruce Momjian a03bb609b3 Finalize character set documentation changes. 2005-03-14 02:14:42 +00:00
Bruce Momjian cbc100af66 Increment all major version numbers in 8.0.X to force recompile of
client aplications so 7.4.X releases can be installed on the same
machine as 8.0.X.
2005-03-13 22:04:29 +00:00
Bruce Momjian 0edc2f14e0 More ordering adjustments. 2005-03-13 05:31:04 +00:00
Bruce Momjian c151e6374c Fix markup. 2005-03-13 05:16:33 +00:00
Bruce Momjian 119807e397 More markup changes. 2005-03-13 05:11:49 +00:00
Bruce Momjian 1c0aeec65b More cleanups. 2005-03-13 04:35:06 +00:00
Bruce Momjian cbe4b4163e More improvements. 2005-03-13 04:10:23 +00:00
Bruce Momjian a717ab6fa6 More additions to the table. 2005-03-13 03:44:51 +00:00
Bruce Momjian 1fa8445233 Keep changing the markup until I like it. :-) 2005-03-13 03:02:08 +00:00
Bruce Momjian 382f24b187 More table markup improvements. 2005-03-13 02:54:34 +00:00
Bruce Momjian 7b7abb7ccb More table markup fixes. 2005-03-13 02:33:03 +00:00
Bruce Momjian 6109a1ce18 Rework "aliases" column for encodings. 2005-03-13 02:20:50 +00:00
Bruce Momjian 074ba31e41 Fix markup typo. 2005-03-13 02:07:04 +00:00
Bruce Momjian f949baf9a2 Add missing conversion documentation for certain encodings. 2005-03-13 02:02:44 +00:00
Bruce Momjian e42e3b6c56 Reorder documented encodings to be alphabetical.
Remove warning about pre-7.2 LATIN5 usage.
2005-03-13 01:30:59 +00:00
Bruce Momjian ee1bd33dd0 Document aliases for our supported encodings.
Add a few encodings that were not documented.
2005-03-13 01:26:30 +00:00
Bruce Momjian 852ef58da9 Documention all our supported encodings. 2005-03-12 06:28:17 +00:00
Bruce Momjian e3d7de6b99 Rename canonical encodings, per Peter:
UNICODE => UTF8
	ALT => WIN866
	WIN => WIN1251
	TCVN => WIN1258

The old codes continue to work.
2005-03-07 04:30:55 +00:00
Bruce Momjian 246be304a5 Add mention of performance impact on LIKE of non-C locales. 2005-01-04 00:05:45 +00:00
Tom Lane 008e9e452f More minor updates and copy-editing. 2004-12-27 22:30:10 +00:00
Neil Conway ec7a6bd9a2 Replace "--" and "---" with "&mdash;" as appropriate, for better-looking
output.
2004-11-15 06:32:15 +00:00
Peter Eisentraut 152a101f2b Allow WIN1250 as server encoding. 2004-09-17 21:59:57 +00:00
Neil Conway fd4f3b3b62 Improve the locale and character set docs, add some <xref>s pointing
to the character set docs where appropriate, and improve the postmaster
reference page. Character set cross-refs suggested by Gavin Kistner.
2004-03-23 02:47:35 +00:00
Neil Conway 80ec228389 Refer to GUC variables using <xref> tags rather than <varname> tags,
where appropriate. Add "id" and "xreflabel" tags to the descriptions
of the GUC variables to facilitate this. Also make a few minor docs
cleanups.
2004-03-09 16:57:47 +00:00
Tom Lane 5e54515167 Recommend SHOW, instead of pg_controldata, for checking LC_COLLATE and
LC_CTYPE settings of a database.
2003-12-30 23:36:19 +00:00
PostgreSQL Daemon 969685ad44 $Header: -> $PostgreSQL Changes ... 2003-11-29 19:52:15 +00:00
Peter Eisentraut 1d27de4cf4 Random copy-editing. 2003-11-04 09:55:39 +00:00
Bruce Momjian 188eda0df2 Consistenly lowercase GUC variable names, in docs and error messages. 2003-09-11 18:30:39 +00:00
Peter Eisentraut c326d8f4f2 Add/edit index entries. 2003-08-31 17:32:24 +00:00
Tom Lane 4c3c8c048d Remove --enable-recode feature, since it's been broken by IPv6 changes,
and seems to have too few users to justify maintaining.
2003-08-04 04:03:10 +00:00
Peter Eisentraut 2c0556068f Indexing support for pattern matching operations via separate operator
class when lc_collate is not C.
2003-05-15 15:50:21 +00:00
Peter Eisentraut 35e60ea967 Change names of ISO-8859-x encodings to ISO_8859_x, to match reality. 2003-04-15 13:26:54 +00:00
Peter Eisentraut 5e5c5cd31a Merge documentation into one book. (Build with "make html".) Replace
vague cross-references with real links.
2003-03-25 16:15:44 +00:00
Peter Eisentraut d258ba01ec Another big editing pass for consistent content and presentation. 2003-03-24 14:32:51 +00:00
Peter Eisentraut 706a32cdf6 Big editing for consistent content and presentation. 2003-03-13 01:30:29 +00:00
Bruce Momjian be2b660ecd This patch includes a lot of minor cleanups to the SGML documentation,
including:

- replacing all the appropriate usages of <citetitle>PostgreSQL
...</citetitle> with &cite-user;, &cite-admin;, and so on

- fix an omission in the EXECUTE documentation

- add some more text to the EXPLAIN documentation

- improve the PL/PgSQL RETURN NEXT documentation (more work to do here)

- minor markup fixes


Neil Conway
2003-01-19 00:13:31 +00:00
Bruce Momjian da8149032a SGML improvements.
Neil Conway
2002-11-15 03:11:18 +00:00
Peter Eisentraut bc49968764 Add more appropriate markup. 2002-09-21 18:32:54 +00:00
Peter Eisentraut da123b7c58 Update installation instructions and put mostly everything in one place.
Also, some editing in PL/Perl and PL/Python chapters.
2002-09-18 20:09:32 +00:00
Tatsuo Ishii 969e0246ed Add Cyrillic and other encodings for encoding conversion.
Patches submitted by Kaori Inaba (i-kaori@sra.co.jp).
2002-08-14 02:45:10 +00:00
Tatsuo Ishii f84002176f Fix bug in encoding conversion table 2002-08-08 08:21:52 +00:00
Tatsuo Ishii 88b74dcddf Add pg_conversion system catalog. Update description for multibyte support. 2002-07-24 05:51:56 +00:00
Peter Eisentraut 867901db9e Locale support is on by default. The choice of locale is done in initdb
and/or with GUC variables.
2002-04-03 05:39:33 +00:00
Peter Eisentraut b6ea172ace Spell checking and markup additions 2002-03-22 19:20:45 +00:00
Bruce Momjian a8bd7e1c6e > Tatsuo Ishii wrote:
> > > > It was made to cope with encoding such as an Asian bloc in 7.2Beta2.
> > > >
> > > > Added ServerEncoding
> > > >         Korean (JOHAB), Thai (WIN874),
> > > >         Vietnamese (TCVN), Arabic (WIN1256)
> > > >
> > > > Added ClientEncoding
> > > >         Simplified Chinese (GBK), Korean (UHC)
> > > >
> > > >
> > > >
> http://www.sankyo-unyu.co.jp/Pool/postgresql-7.2b2.newencoding.diff.tar.gz
> > > > (608K)
> > >
> > > Looks good.  I need some people to review this for me.
> >
> > For me they look good too. The only missing part is a
> > documentation. I will ask him to write it up. If he couldn't, I will
> > do it for him.
> > > The diff is 3mb
> > > but appears to address only additions to multibyte.  I have attached a
> > > list of files it modifies.  Also, look at the sizes of the mb/
> > > directory.  It is getting large:
> > >
> > >   4       ./CVS
> > >   6       ./Unicode/CVS
> > >   3433    ./Unicode
> > >   6197    .
> >
> > Yes. We definitely need the on-the-fly encoding addition capability:
> > i.e. CREATE CHRACTER SET in the future...
> > --
> > Tatsuo Ishii
> >
> >

Address chainge.

http://www.sankyo-unyu.co.jp/Pool/postgresql-7.2.newencoding.diff.gz

Add PsqlODBC and document ...etc patch.

Eiji Tokuya
2002-03-05 05:52:50 +00:00
Peter Eisentraut bf43bed848 Spell-check and markup police 2002-01-20 22:19:57 +00:00
Bruce Momjian 055d4f9208 Email address no longer valid. 2002-01-08 05:45:19 +00:00
Peter Eisentraut 651a639b8b proof-reading 2001-11-28 20:49:10 +00:00
Thomas G. Lockhart 2475e87481 Deprecate 'current' for date/time input.
Fix up references to "PostgreSQL" rather than "Postgres". Was roughly
 evenly split between the two before. ref/ files not yet done.
2001-11-21 05:53:41 +00:00
Tom Lane 9b03776ff2 A bunch of small doco updates motivated by scanning the comments on
the interactive docs.
2001-11-19 03:58:25 +00:00
Peter Eisentraut 31578cdeac Updates about NLS 2001-11-18 20:33:32 +00:00
Tatsuo Ishii 7a546eb985 Add changes for multibyte support in 7.2. 2001-11-15 06:15:34 +00:00
Peter Eisentraut 3c879e3738 Add some more index entries. 2001-11-12 19:19:39 +00:00
Peter Eisentraut d6b2d6b640 Empty ulinks show the url string as hot text; no need to repeat the url as
element content.
2001-10-31 20:35:02 +00:00
Peter Eisentraut ffb8f73890 Bunch of copy fitting and style sheet tweakage to get decent looking print
output (from pdfjadetex).  Also updated instructions to install documentation
processing toolchain.
2001-10-09 18:46:00 +00:00
Peter Eisentraut 536394e73a Fix spacing to get proper URL formatting in print output 2001-10-04 22:26:27 +00:00
Peter Eisentraut 351a0c1736 Replace ASCII-quotes with proper markup. 2001-09-13 15:55:24 +00:00