Commit Graph

177 Commits

Author SHA1 Message Date
Tom Lane 6e197cb2e5 Improve reporting of run-time-detected indeterminate-collation errors.
pg_newlocale_from_collation does not have enough context to give an error
message that's even a little bit useful, so move the responsibility for
complaining up to its callers.  Also, reword ERRCODE_INDETERMINATE_COLLATION
error messages in a less jargony, more message-style-guide-compliant
fashion.
2011-03-22 16:55:32 -04:00
Tom Lane 176d5bae1d Fix up handling of C/POSIX collations.
Install just one instance of the "C" and "POSIX" collations into
pg_collation, rather than one per encoding.  Make these instances exist
and do something useful even in machines without locale_t support: to wit,
it's now possible to force comparisons and case-folding functions to use C
locale in an otherwise non-C database, whether or not the platform has
support for using any additional collations.

Fix up severely broken upper/lower/initcap functions, too: the C/POSIX
fastpath now does what it is supposed to, and non-default collations are
handled correctly in single-byte database encodings.

Merge the two separate collation hashtables that were being maintained in
pg_locale.c, and be more wary of the possibility that we fail partway
through filling a cache entry.
2011-03-20 12:44:13 -04:00
Bruce Momjian 3a3f39fdc0 Use macros for time-based constants, rather than constants. 2011-03-12 09:35:56 -05:00
Peter Eisentraut 414c5a2ea6 Per-column collation support
This adds collation support for columns and domains, a COLLATE clause
to override it per expression, and B-tree index support.

Peter Eisentraut
reviewed by Pavel Stehule, Itagaki Takahiro, Robert Haas, Noah Misch
2011-02-08 23:04:18 +02:00
Bruce Momjian 5d950e3b0c Stamp copyrights for year 2011. 2011-01-01 13:18:15 -05:00
Magnus Hagander 9f2e211386 Remove cvs keywords from all files. 2010-09-20 22:08:53 +02:00
Bruce Momjian 239d769e7e pgindent run for 9.0, second run 2010-07-06 19:19:02 +00:00
Tom Lane 2c0870ff7a Fix to_char YYY, YY, Y format codes so that FM zero-suppression really works,
rather than only sort-of working as the previous attempt had left it.
Clean up some unnecessary differences between the way these were coded and
the way the YYYY case was coded.  Update the regression test cases that
proved that it wasn't working.
2010-04-07 21:41:53 +00:00
Bruce Momjian ea066f87c3 Document that "Q" is ignored by to_date and to_timestamp. Add C comment
about the behavior.

Document that quotes in to_date, to_timestamp, to_number skip input
characters.
2010-03-03 22:28:42 +00:00
Bruce Momjian 65e806cba1 pgindent run for 9.0 2010-02-26 02:01:40 +00:00
Bruce Momjian 89ce2bfc13 Add C comment that do_to_timestamp() lacks error checking. 2010-02-25 18:36:14 +00:00
Bruce Momjian a54803149a Revert recent change of to_char('HH12') handling for intervals; instead
improve documentation, and add C comment.
2010-02-23 16:14:26 +00:00
Bruce Momjian 4f56dc3fb4 Secondary patch to fix interval to_char() for "HH" where hours >= 12. 2010-02-23 06:29:01 +00:00
Bruce Momjian 7cdadc62ea Supress convertion of zero hours to '12' for intervals when using
to_char with HH, e.g.

	to_char(interval '0d 0h 12m 44s', 'DD HH24 MI SS');

now returns:

	 00 00 12 44

not:

	 00 12 12 44
2010-02-23 01:42:19 +00:00
Bruce Momjian 70d8a2c29e Honor to_char() "FM" specification in YYY, YY, and Y; it was already
honored by YYYY.  Also document Oracle "toggle" FM behavior.

Per report from Guy Rouillier
2010-02-16 21:18:02 +00:00
Bruce Momjian 0239800893 Update copyright for the year 2010. 2010-01-02 16:58:17 +00:00
Alvaro Herrera 55f927a46e Refactor NUM_cache_remove calls in error report path to a PG_TRY block.
The code in the new block was not reindented; it will be fixed by pgindent
eventually.
2009-08-10 20:16:05 +00:00
Tom Lane e61fd4ac74 Support EEEE (scientific notation) in to_char().
Pavel Stehule, Brendan Jurd
2009-08-10 18:29:27 +00:00
Heikki Linnakangas 44886bd878 Fix ancient bug in handling of to_char modifier 'TH', when used with HH.
In what seems like an oversight, we used to treat 'TH' the same as lowercase
'th', but only with HH/HH12.
2009-07-06 19:11:39 +00:00
Tom Lane 3f1e529e78 Make to_timestamp and friends skip leading spaces before an integer field,
even when not in FM mode.  This improves compatibility with Oracle and with
our pre-8.4 behavior, as per bug #4862.

Brendan Jurd

Add a couple of regression test cases for this.  In passing, get rid of the
labeling of the individual test cases; doesn't seem to be good for anything
except causing extra work when inserting a test...

Tom Lane
2009-06-22 17:54:30 +00:00
Bruce Momjian d747140279 8.4 pgindent run, with new combined Linux/FreeBSD/MinGW typedef list
provided by Andrew.
2009-06-11 14:49:15 +00:00
Tom Lane 7a52a8f829 Clean up the code for to_timestamp's conversion of year plus ISO day number
to date, as per bug #4702 and subsequent discussion.  In particular, make it
work for years specified using AD/BC or CC fields, and fix the test for "no
year specified" so that it doesn't trigger inappropriately for 1 BC (which it
was doing even in code paths that had nothing to do with to_timestamp).  I
also did some minor code beautification in the non-ISO-day-number code path.

This area has been busted all along, but because the code has been rewritten
repeatedly, it would be considerable trouble to back-patch.  It's such a
corner case that it doesn't seem worth the effort.
2009-03-15 20:31:19 +00:00
Tom Lane 2cdec8b308 Fix core dump due to null-pointer dereference in to_char() when datetime
format codes are misapplied to a numeric argument.  (The code still produces
a pretty bogus error message in such cases, but I'll settle for stopping the
crash for now.)  Per bug #4700 from Sergey Burladyan.

Problem exists in all supported branches, so patch all the way back.
In HEAD, also clean up some ugly coding in the nearby cache management
code.
2009-03-12 00:53:25 +00:00
Bruce Momjian 65b731bd9d Fix to_timestamp() to not require upper/lower case matching for meridian
designations (AM/PM).  Also separate out matching of a meridian with
periods (e.g. A.M.) and with those without.

Do the same for AD/BC.

Brendan Jurd
2009-02-07 14:16:46 +00:00
Bruce Momjian 511db38ace Update copyright for 2009. 2009-01-01 17:24:05 +00:00
Tom Lane b4d64a6d48 Remove our dependencies on MB_CUR_MAX in favor of believing that
pg_database_encoding_max_length() predicts the maximum character length
returned by wchar2char().  Per Hiroshi Inoue, MB_CUR_MAX isn't usable on
Windows because we allow encoding = UTF8 when the locale says differently;
and getting rid of it seems a good idea on general principles because it
narrows our dependence on libc's locale API just a little bit more.

Also install a check for overflow of the buffer size computation.
2008-12-15 14:55:50 +00:00
Heikki Linnakangas 7fb27531e8 Modify the new to_timestamp implementation so that end-of-format-string
is treated like a non-digit separator. This fixes the inconsistency in
examples like:

to_timestamp('2008-01-2', 'YYYY-MM-DD') -- didn't work

and

to_timestamp('2008-1-02', 'YYYY-MM-DD') -- did work
2008-12-01 17:11:18 +00:00
Heikki Linnakangas 45d146a6db Fix 'Q' format char parsing in the new to_timestamp() code. Used to crash. 2008-11-10 17:36:53 +00:00
Tom Lane 557faa4fb3 Random speculation about the reason for PPC64 buildfarm failures:
maybe isalnum is returning a value with the low-order byte all zero?
2008-10-06 05:03:27 +00:00
Tom Lane b1e929f295 Fix pointer-advancement bugs in MS and US cases of new to_timestamp() code.
Alex Hunsaker
2008-09-26 15:35:28 +00:00
Tom Lane 06edce4c3f Tighten up to_date/to_timestamp so that they are more likely to reject
erroneous input, rather than silently producing bizarre results as formerly
happened.

Brendan Jurd
2008-09-11 17:32:34 +00:00
Bruce Momjian 6152de97d3 Minor patch on pgbench
1. -i option should run vacuum analyze only on pgbench tables, not *all*
tables in database.

2. pre-run cleanup step was DELETE FROM HISTORY then VACUUM HISTORY.
This is just a slow version of TRUNCATE HISTORY.

Simon Riggs
2008-08-22 17:57:34 +00:00
Tom Lane 960af47efd Const-ify the arguments of str_tolower() and friends to suppress compile
warnings.  Clean up various unneeded cruft that was left behind after
creating those routines.  Introduce some convenience functions str_tolower_z
etc to eliminate tedious and error-prone double arguments in formatting.c.
(Currently there seems no need to export the latter, but maybe reconsider
this later.)
2008-07-12 00:44:38 +00:00
Teodor Sigaev 5ff9899933 Fix bug "select lower('asd') = 'asd'" returns false with multibyte encoding
and non-C locale. Fix is just to use correct source's length for char2wchar
call.
2008-06-26 16:06:37 +00:00
Bruce Momjian f6ec7430f9 Merge duplicate upper/lower/initcap() routines in oracle_compat.c and
formatting.c to use common code;  remove duplicate functions and support
routines that are no longer needed.
2008-06-23 19:27:19 +00:00
Bruce Momjian dc69c0362f Move USE_WIDE_UPPER_LOWER define to c.h, and remove TS_USE_WIDE and use
USE_WIDE_UPPER_LOWER instead.
2008-06-17 16:09:06 +00:00
Bruce Momjian 9f19470966 Simplify code in formatting.c now that to upper/lower/initcase do not
modify the passed string.
2008-05-20 01:41:02 +00:00
Tom Lane 07a5606735 Make to_char()'s localized month/day names depend on LC_TIME, not LC_MESSAGES.
Euler Taveira de Oliveira
2008-05-19 18:08:16 +00:00
Tom Lane 220db7ccd8 Simplify and standardize conversions between TEXT datums and ordinary C
strings.  This patch introduces four support functions cstring_to_text,
cstring_to_text_with_len, text_to_cstring, and text_to_cstring_buffer, and
two macros CStringGetTextDatum and TextDatumGetCString.  A number of
existing macros that provided variants on these themes were removed.

Most of the places that need to make such conversions now require just one
function or macro call, in place of the multiple notational layers that used
to be needed.  There are no longer any direct calls of textout or textin,
and we got most of the places that were using handmade conversions via
memcpy (there may be a few still lurking, though).

This commit doesn't make any serious effort to eliminate transient memory
leaks caused by detoasting toasted text objects before they reach
text_to_cstring.  We changed PG_GETARG_TEXT_P to PG_GETARG_TEXT_PP in a few
places where it was easy, but much more could be done.

Brendan Jurd and Tom Lane
2008-03-25 22:42:46 +00:00
Tom Lane 19595835c3 Refactor to_char/to_date formatting code; primarily, replace DCH_processor
with two new functions DCH_to_char and DCH_from_char that have less confusing
APIs.  Brendan Jurd
2008-03-22 22:32:19 +00:00
Bruce Momjian 9098ab9e32 Update copyrights in source tree to 2008. 2008-01-01 19:46:01 +00:00
Bruce Momjian b85cf684f7 Add more comments about thousands separator handling. 2007-11-22 17:51:39 +00:00
Bruce Momjian d9bc7a3946 Add comments about thousands separator logic. 2007-11-22 15:10:05 +00:00
Bruce Momjian 3894e7cc55 When setting default thousands separator when locale has "", use logic
so new thousands separator doesn't match decimal symbol.
2007-11-21 22:28:18 +00:00
Bruce Momjian 6f3149e464 Fix typo in comment. 2007-11-21 21:49:22 +00:00
Bruce Momjian fdf5a5efb7 pgindent run for 8.3. 2007-11-15 21:14:46 +00:00
Tom Lane bdd6b62245 Switch over to using the src/timezone functions for formatting timestamps
displayed in the postmaster log.  This avoids Windows-specific problems with
localized time zone names that are in the wrong encoding, and generally seems
like a good idea to forestall other potential platform-dependent issues.
To preserve the existing behavior that all backends will log in the same time
zone, create a new GUC variable log_timezone that can only be changed on a
system-wide basis, and reference log-related calculations to that zone instead
of the TimeZone variable.

This fixes the issue reported by Hiroshi Saito that timestamps printed by
xlog.c startup could be improperly localized on Windows.  We still need a
simpler patch for that problem in the back branches, however.
2007-08-04 01:26:54 +00:00
Tom Lane 6faf795662 Fix a passel of ancient bugs in to_char(), including two distinct buffer
overruns (neither of which seem likely to be exploitable as security holes,
fortunately, since the provoker can't control the data written).  One of
these is due to choosing to stomp on the output of a called function, which
is bad news in any case; make it treat the called functions' results as
read-only.  Avoid some unnecessary palloc/pfree traffic too; it's not
really helpful to free small temporary objects, and again this is presuming
more than it ought to about the nature of the results of called functions.
Per report from Patrick Welche and additional code-reading by Imad.
2007-06-29 01:51:35 +00:00
Tom Lane 234a02b2a8 Replace direct assignments to VARATT_SIZEP(x) with SET_VARSIZE(x, len).
Get rid of VARATT_SIZE and VARATT_DATA, which were simply redundant with
VARSIZE and VARDATA, and as a consequence almost no code was using the
longer names.  Rename the length fields of struct varlena and various
derived structures to catch anyplace that was accessing them directly;
and clean up various places so caught.  In itself this patch doesn't
change any behavior at all, but it is necessary infrastructure if we hope
to play any games with the representation of varlena headers.
Greg Stark and Tom Lane
2007-02-27 23:48:10 +00:00
Bruce Momjian 4fe1a12c54 Remove rint() for to_char MS and US output. We can't us rint() because
we can't overflow to the next higher units, and we might print the lower
units for MS.
2007-02-17 03:11:32 +00:00