Commit Graph

196 Commits

Author SHA1 Message Date
Tom Lane 80b011ef0a Fix to_char() to use ASCII-only case-folding rules where appropriate.
formatting.c used locale-dependent case folding rules in some code paths
where the result isn't supposed to be locale-dependent, for example
to_char(timestamp, 'DAY').  Since the source data is always just ASCII
in these cases, that usually didn't matter ... but it does matter in
Turkish locales, which have unusual treatment of "i" and "I".  To confuse
matters even more, the misbehavior was only visible in UTF8 encoding,
because in single-byte encodings we used pg_toupper/pg_tolower which
don't have locale-specific behavior for ASCII characters.  Fix by providing
intentionally ASCII-only case-folding functions and using these where
appropriate.  Per bug #7913 from Adnan Dursun.  Back-patch to all active
branches, since it's been like this for a long time.
2013-03-05 13:02:30 -05:00
Tom Lane 5c4eb9166e Reject out-of-range dates in to_date().
Dates outside the supported range could be entered, but would not print
reasonably, and operations such as conversion to timestamp wouldn't behave
sanely either.  Since this has the potential to result in undumpable table
data, it seems worth back-patching.

Hitoshi Harada
2013-01-14 15:19:48 -05:00
Bruce Momjian bd61a623ac Update copyrights for 2013
Fully update git head, and update back branches in ./COPYRIGHT and
legal.sgml files.
2013-01-01 17:15:01 -05:00
Bruce Momjian 015722fb36 Fix to_date() and to_timestamp() to allow specification of the day of
the week via ISO or Gregorian designations.  The fix is to store the
day-of-week consistently as 1-7, Sunday = 1.

Fixes bug reported by Marc Munro
2012-09-03 22:52:44 -04:00
Bruce Momjian 2751740ab5 Add additional C comments for to_date/to_char() fixes. 2012-08-08 13:27:01 -04:00
Bruce Momjian ac78c4178b Fix to_char(), to_date(), and to_timestamp() to handle negative/BC
century specifications just like positive/AD centuries.  Previously the
behavior was either wrong or inconsistent with positive/AD handling.

Centuries without years now always assume the first year of the century,
which is now documented.
2012-08-07 13:34:44 -04:00
Peter Eisentraut dd16f9480a Remove unreachable code
The Solaris Studio compiler warns about these instances, unlike more
mainstream compilers such as gcc.  But manual inspection showed that
the code is clearly not reachable, and we hope no worthy compiler will
complain about removing this code.
2012-07-16 22:15:03 +03:00
Tom Lane 41f4a0ab78 Fix to_date's handling of year 519.
A thinko in commit 029dfdf115 caused the year
519 to be handled differently from either adjacent year, which was not the
intention AFAICS.  Report and diagnosis by Marc Cousin.

In passing, remove redundant re-tests of year value.
2012-07-02 11:35:35 -04:00
Bruce Momjian 927d61eeff Run pgindent on 9.2 source tree in preparation for first 9.3
commit-fest.
2012-06-10 15:20:04 -04:00
Robert Haas 5d4b60f2f2 Lots of doc corrections.
Josh Kupershmidt
2012-04-23 22:43:09 -04:00
Peter Eisentraut eb990a2b9e Add const qualifier to tzn returned by timestamp2tm()
The tzn value might come from tm->tm_zone, which libc declares as
const, so it's prudent that the upper layers know about this as well.
2012-03-15 21:17:19 +02:00
Peter Eisentraut c9f310d377 Add comment for missing break in switch
For clarity, following other sites, and to silence Coverity.
2012-03-12 20:55:09 +02:00
Bruce Momjian e126958c2e Update copyright notices for year 2012. 2012-01-01 18:01:58 -05:00
Peter Eisentraut 037a82704c Standardize treatment of strcmp() return value
Always compare the return value to 0, don't use cute tricks like
if (!strcmp(...)).
2011-12-27 21:19:09 +02:00
Andrew Dunstan 0f44335122 Miscellaneous cleanup to silence compiler warnings seen on Mingw.
Remove some dead code, conditionally declare some items or call
some code, and fix one or two declarations.
2011-12-10 18:15:15 -05:00
Tom Lane f0bedf3e45 Fix corner case bug in numeric to_char().
Trailing-zero stripping applied by the FM specifier could strip zeroes
to the left of the decimal point, for a format with no digit positions
after the decimal point (such as "FM999.").

Reported and diagnosed by Marti Raudsepp, though I didn't use his patch.
2011-09-07 17:07:20 -04:00
Bruce Momjian 029dfdf115 Fix to_date() and to_timestamp() to handle year masks of length < 4 so
they wrap toward year 2020, rather than the inconsistent behavior we had
before.
2011-09-07 09:47:51 -04:00
Tom Lane 2ab0796d7a Fix char2wchar/wchar2char to support collations properly.
These functions should take a pg_locale_t, not a collation OID, and should
call mbstowcs_l/wcstombs_l where available.  Where those functions are not
available, temporarily select the correct locale with uselocale().

This change removes the bogus assumption that all locales selectable in
a given database have the same wide-character conversion method; in
particular, the collate.linux.utf8 regression test now passes with
LC_CTYPE=C, so long as the database encoding is UTF8.

I decided to move the char2wchar/wchar2char functions out of mbutils.c and
into pg_locale.c, because they work on wchar_t not pg_wchar_t and thus
don't really belong with the mbutils.c functions.  Keeping them where they
were would have required importing pg_locale_t into pg_wchar.h somehow,
which did not seem like a good plan.
2011-04-23 12:35:41 -04:00
Bruce Momjian bf50caf105 pgindent run before PG 9.1 beta 1. 2011-04-10 11:42:00 -04:00
Tom Lane 6e197cb2e5 Improve reporting of run-time-detected indeterminate-collation errors.
pg_newlocale_from_collation does not have enough context to give an error
message that's even a little bit useful, so move the responsibility for
complaining up to its callers.  Also, reword ERRCODE_INDETERMINATE_COLLATION
error messages in a less jargony, more message-style-guide-compliant
fashion.
2011-03-22 16:55:32 -04:00
Tom Lane 176d5bae1d Fix up handling of C/POSIX collations.
Install just one instance of the "C" and "POSIX" collations into
pg_collation, rather than one per encoding.  Make these instances exist
and do something useful even in machines without locale_t support: to wit,
it's now possible to force comparisons and case-folding functions to use C
locale in an otherwise non-C database, whether or not the platform has
support for using any additional collations.

Fix up severely broken upper/lower/initcap functions, too: the C/POSIX
fastpath now does what it is supposed to, and non-default collations are
handled correctly in single-byte database encodings.

Merge the two separate collation hashtables that were being maintained in
pg_locale.c, and be more wary of the possibility that we fail partway
through filling a cache entry.
2011-03-20 12:44:13 -04:00
Bruce Momjian 3a3f39fdc0 Use macros for time-based constants, rather than constants. 2011-03-12 09:35:56 -05:00
Peter Eisentraut 414c5a2ea6 Per-column collation support
This adds collation support for columns and domains, a COLLATE clause
to override it per expression, and B-tree index support.

Peter Eisentraut
reviewed by Pavel Stehule, Itagaki Takahiro, Robert Haas, Noah Misch
2011-02-08 23:04:18 +02:00
Bruce Momjian 5d950e3b0c Stamp copyrights for year 2011. 2011-01-01 13:18:15 -05:00
Magnus Hagander 9f2e211386 Remove cvs keywords from all files. 2010-09-20 22:08:53 +02:00
Bruce Momjian 239d769e7e pgindent run for 9.0, second run 2010-07-06 19:19:02 +00:00
Tom Lane 2c0870ff7a Fix to_char YYY, YY, Y format codes so that FM zero-suppression really works,
rather than only sort-of working as the previous attempt had left it.
Clean up some unnecessary differences between the way these were coded and
the way the YYYY case was coded.  Update the regression test cases that
proved that it wasn't working.
2010-04-07 21:41:53 +00:00
Bruce Momjian ea066f87c3 Document that "Q" is ignored by to_date and to_timestamp. Add C comment
about the behavior.

Document that quotes in to_date, to_timestamp, to_number skip input
characters.
2010-03-03 22:28:42 +00:00
Bruce Momjian 65e806cba1 pgindent run for 9.0 2010-02-26 02:01:40 +00:00
Bruce Momjian 89ce2bfc13 Add C comment that do_to_timestamp() lacks error checking. 2010-02-25 18:36:14 +00:00
Bruce Momjian a54803149a Revert recent change of to_char('HH12') handling for intervals; instead
improve documentation, and add C comment.
2010-02-23 16:14:26 +00:00
Bruce Momjian 4f56dc3fb4 Secondary patch to fix interval to_char() for "HH" where hours >= 12. 2010-02-23 06:29:01 +00:00
Bruce Momjian 7cdadc62ea Supress convertion of zero hours to '12' for intervals when using
to_char with HH, e.g.

	to_char(interval '0d 0h 12m 44s', 'DD HH24 MI SS');

now returns:

	 00 00 12 44

not:

	 00 12 12 44
2010-02-23 01:42:19 +00:00
Bruce Momjian 70d8a2c29e Honor to_char() "FM" specification in YYY, YY, and Y; it was already
honored by YYYY.  Also document Oracle "toggle" FM behavior.

Per report from Guy Rouillier
2010-02-16 21:18:02 +00:00
Bruce Momjian 0239800893 Update copyright for the year 2010. 2010-01-02 16:58:17 +00:00
Alvaro Herrera 55f927a46e Refactor NUM_cache_remove calls in error report path to a PG_TRY block.
The code in the new block was not reindented; it will be fixed by pgindent
eventually.
2009-08-10 20:16:05 +00:00
Tom Lane e61fd4ac74 Support EEEE (scientific notation) in to_char().
Pavel Stehule, Brendan Jurd
2009-08-10 18:29:27 +00:00
Heikki Linnakangas 44886bd878 Fix ancient bug in handling of to_char modifier 'TH', when used with HH.
In what seems like an oversight, we used to treat 'TH' the same as lowercase
'th', but only with HH/HH12.
2009-07-06 19:11:39 +00:00
Tom Lane 3f1e529e78 Make to_timestamp and friends skip leading spaces before an integer field,
even when not in FM mode.  This improves compatibility with Oracle and with
our pre-8.4 behavior, as per bug #4862.

Brendan Jurd

Add a couple of regression test cases for this.  In passing, get rid of the
labeling of the individual test cases; doesn't seem to be good for anything
except causing extra work when inserting a test...

Tom Lane
2009-06-22 17:54:30 +00:00
Bruce Momjian d747140279 8.4 pgindent run, with new combined Linux/FreeBSD/MinGW typedef list
provided by Andrew.
2009-06-11 14:49:15 +00:00
Tom Lane 7a52a8f829 Clean up the code for to_timestamp's conversion of year plus ISO day number
to date, as per bug #4702 and subsequent discussion.  In particular, make it
work for years specified using AD/BC or CC fields, and fix the test for "no
year specified" so that it doesn't trigger inappropriately for 1 BC (which it
was doing even in code paths that had nothing to do with to_timestamp).  I
also did some minor code beautification in the non-ISO-day-number code path.

This area has been busted all along, but because the code has been rewritten
repeatedly, it would be considerable trouble to back-patch.  It's such a
corner case that it doesn't seem worth the effort.
2009-03-15 20:31:19 +00:00
Tom Lane 2cdec8b308 Fix core dump due to null-pointer dereference in to_char() when datetime
format codes are misapplied to a numeric argument.  (The code still produces
a pretty bogus error message in such cases, but I'll settle for stopping the
crash for now.)  Per bug #4700 from Sergey Burladyan.

Problem exists in all supported branches, so patch all the way back.
In HEAD, also clean up some ugly coding in the nearby cache management
code.
2009-03-12 00:53:25 +00:00
Bruce Momjian 65b731bd9d Fix to_timestamp() to not require upper/lower case matching for meridian
designations (AM/PM).  Also separate out matching of a meridian with
periods (e.g. A.M.) and with those without.

Do the same for AD/BC.

Brendan Jurd
2009-02-07 14:16:46 +00:00
Bruce Momjian 511db38ace Update copyright for 2009. 2009-01-01 17:24:05 +00:00
Tom Lane b4d64a6d48 Remove our dependencies on MB_CUR_MAX in favor of believing that
pg_database_encoding_max_length() predicts the maximum character length
returned by wchar2char().  Per Hiroshi Inoue, MB_CUR_MAX isn't usable on
Windows because we allow encoding = UTF8 when the locale says differently;
and getting rid of it seems a good idea on general principles because it
narrows our dependence on libc's locale API just a little bit more.

Also install a check for overflow of the buffer size computation.
2008-12-15 14:55:50 +00:00
Heikki Linnakangas 7fb27531e8 Modify the new to_timestamp implementation so that end-of-format-string
is treated like a non-digit separator. This fixes the inconsistency in
examples like:

to_timestamp('2008-01-2', 'YYYY-MM-DD') -- didn't work

and

to_timestamp('2008-1-02', 'YYYY-MM-DD') -- did work
2008-12-01 17:11:18 +00:00
Heikki Linnakangas 45d146a6db Fix 'Q' format char parsing in the new to_timestamp() code. Used to crash. 2008-11-10 17:36:53 +00:00
Tom Lane 557faa4fb3 Random speculation about the reason for PPC64 buildfarm failures:
maybe isalnum is returning a value with the low-order byte all zero?
2008-10-06 05:03:27 +00:00
Tom Lane b1e929f295 Fix pointer-advancement bugs in MS and US cases of new to_timestamp() code.
Alex Hunsaker
2008-09-26 15:35:28 +00:00
Tom Lane 06edce4c3f Tighten up to_date/to_timestamp so that they are more likely to reject
erroneous input, rather than silently producing bizarre results as formerly
happened.

Brendan Jurd
2008-09-11 17:32:34 +00:00