postgresql

Commit Graph

Author	SHA1	Message	Date
Peter Eisentraut	eccfef81e1	ICU support Add a column collprovider to pg_collation that determines which library provides the collation data. The existing choices are default and libc, and this adds an icu choice, which uses the ICU4C library. The pg_locale_t type is changed to a union that contains the provider-specific locale handles. Users of locale information are changed to look into that struct for the appropriate handle to use. Also add a collversion column that records the version of the collation when it is created, and check at run time whether it is still the same. This detects potentially incompatible library upgrades that can corrupt indexes and other structures. This is currently only supported by ICU-provided collations. initdb initializes the default collation set as before from the `locale -a` output but also adds all available ICU locales with a "-x-icu" appended. Currently, ICU-provided collations can only be explicitly named collations. The global database locales are still always libc-provided. ICU support is enabled by configure --with-icu. Reviewed-by: Thomas Munro <thomas.munro@enterprisedb.com> Reviewed-by: Andreas Karlsson <andreas@proxel.se>	2017-03-23 15:28:48 -04:00
Noah Misch	3a0d473192	Use wrappers of PG_DETOAST_DATUM_PACKED() more. This makes almost all core code follow the policy introduced in the previous commit. Specific decisions: - Text search support functions with char* and length arguments, such as prsstart and lexize, may receive unaligned strings. I doubt maintainers of non-core text search code will notice. - Use plain VARDATA() on values detoasted or synthesized earlier in the same function. Use VARDATA_ANY() on varlenas sourced outside the function, even if they happen to always have four-byte headers. As an exception, retain the universal practice of using VARDATA() on return values of SendFunctionCall(). - Retain PG_GETARG_BYTEA_P() in pageinspect. (Page images are too large for a one-byte header, so this misses no optimization.) Sites that do not call get_page_from_raw() typically need the four-byte alignment. - For now, do not change btree_gist. Its use of four-byte headers in memory is partly entangled with storage of 4-byte headers inside GBT_VARKEY, on disk. - For now, do not change gtrgm_consistent() or gtrgm_distance(). They incorporate the varlena header into a cache, and there are multiple credible implementation strategies to consider.	2017-03-12 19:35:34 -04:00
Tom Lane	b9d092c962	Remove now-dead code for !HAVE_INT64_TIMESTAMP. This is a basically mechanical removal of #ifdef HAVE_INT64_TIMESTAMP tests and the negative-case controlled code. Discussion: https://postgr.es/m/26788.1487455319@sss.pgh.pa.us	2017-02-23 14:04:43 -05:00
Tom Lane	1c073505e8	Improve error message for misuse of TZ, tz, OF formatting patterns. Be specific about which pattern is being complained of, and avoid saying "it's not supported in to_date", which is just confusing if the error is actually coming out of to_timestamp. We can phrase it as "is only supported in to_char", instead. Also, use the term "formatting field" not "format pattern", because other error messages in the same file prefer that terminology. (This isn't terribly consistent with the documentation, so maybe we should change all these error messages?)	2017-02-20 10:27:48 -05:00
Heikki Linnakangas	181bdb90ba	Fix typos in comments. Backpatch to all supported versions, where applicable, to make backpatching of future fixes go more smoothly. Josh Soref Discussion: https://www.postgresql.org/message-id/CACZqfqCf+5qRztLPgmmosr-B0Ye4srWzzw_mo4c_8_B_mtjmJQ@mail.gmail.com	2017-02-06 11:33:58 +02:00
Bruce Momjian	1d25779284	Update copyright via script for 2017	2017-01-03 13:48:53 -05:00
Tom Lane	83bed06be4	Rationalize format-picture caching logic in formatting.c. Add a validity flag to DCHCacheEntry and NUMCacheEntry entries, and do not set it true until after we've parsed the supplied format string. This allows dealing with possible errors while parsing the format without the baroque hack that was there before (which only covered errors within NUMDesc_prepare, anyway). We can get rid of the PG_TRY in NUMDesc_prepare, as well as last_NUMCacheEntry and NUM_cache_remove. (Essentially, this reverts commit `ff783fbae` in favor of a less fragile solution; the problems with that approach are well illustrated by later hacking such as 55f927a46.) In passing, define the size of these caches as DCH_CACHE_ENTRIES not DCH_CACHE_FIELDS + 1 (whoever thought that was a good definition?) and likewise for the NUM cache. Also const-ify format string parameters where convenient, and merge duplicated cache lookup logic. This is primarily driven by a proposed patch from Artur Zakirov, which introduced some ereport's into format string parsing for the datetime case. He proposed preventing the creation of invalid cache entries by parsing the format string first into a local-variable array, and then copying that to a cache entry. That seemed a bit ugly to me, and anyway randomly different from the way the identical problem had been solved for the numeric case. Let's make the two sets of code more similar not less so. I'm not sure whether we'll adopt the new error conditions Artur proposes, but this patch seems like good code cleanup and future-proofing in any case. The existing code is critically (and undocumented-ly) dependent on no elog being thrown out of several nontrivial functions, which is trouble waiting to happen, though it doesn't seem to be actively broken today. Discussion: <b2a39359-3282-b402-f4a3-057aae500ee7@postgrespro.ru>	2016-09-28 17:08:40 -04:00
Tom Lane	d3cd36a133	Make to_timestamp() and to_date() range-check fields of their input. Historically, something like to_date('2009-06-40','YYYY-MM-DD') would return '2009-07-10' because there was no prohibition on out-of-range month or day numbers. This has been widely panned, and it also turns out that Oracle throws an error in such cases. Since these functions are nominally Oracle-compatibility features, let's change that. There's no particular restriction on year (modulo the fact that the scanner may not believe that more than 4 digits are year digits, a matter to be addressed separately if at all). But we now check month, day, hour, minute, second, and fractional-second fields, as well as day-of-year and second-of-day fields if those are used. Currently, no checks are made on ISO-8601-style week numbers or day numbers; it's not very clear what the appropriate rules would be there, and they're probably so little used that it's not worth sweating over. Artur Zakirov, reviewed by Amul Sul, further adjustments by me Discussion: <1873520224.1784572.1465833145330.JavaMail.yahoo@mail.yahoo.com> See-Also: <57786490.9010201@wars-nicht.de>	2016-09-28 14:36:17 -04:00
Peter Eisentraut	9a46324fd4	Fix several one-byte buffer over-reads in to_number Several places in NUM_numpart_from_char(), which is called from the SQL function to_number(text, text), could accidentally read one byte past the end of the input buffer (which comes from the input text datum and is not null-terminated). 1. One leading space character would be skipped, but there was no check that the input was at least one byte long. This does not happen in practice, but for defensiveness, add a check anyway. 2. Commit `4a3a1e2cf` apparently accidentally doubled that code that skips one space character (so that two spaces might be skipped), but there was no overflow check before skipping the second byte. Fix by removing that duplicate code. 3. A logic error would allow a one-byte over-read when looking for a trailing sign (S) placeholder. In each case, the extra byte cannot be read out directly, but looking at it might cause a crash. The third item was discovered by Piotr Stefaniak, the first two were found and analyzed by Tom Lane and Peter Eisentraut.	2016-08-08 11:12:59 -04:00
Robert Haas	4bc424b968	pgindent run for 9.6	2016-06-09 18:02:36 -04:00
Greg Stark	e1623c3959	Fix various common mispellings. Mostly these are just comments but there are a few in documentation and a handful in code and tests. Hopefully this doesn't cause too much unnecessary pain for backpatching. I relented from some of the most common like "thru" for that reason. The rest don't seem numerous enough to cause problems. Thanks to Kevin Lyda's tool https://pypi.python.org/pypi/misspellings	2016-06-03 16:08:45 +01:00
Tom Lane	d136d600f9	Fix possible read past end of string in to_timestamp(). to_timestamp() handles the TH/th format codes by advancing over two input characters, whatever those are. It failed to notice whether there were two characters available to be skipped, making it possible to advance the pointer past the end of the input string and keep on parsing. A similar risk existed in the handling of "Y,YYY" format: it would advance over three characters after the "," whether or not three characters were available. In principle this might be exploitable to disclose contents of server memory. But the security team concluded that it would be very hard to use that way, because the parsing loop would stop upon hitting any zero byte, and TH/th format codes can't be consecutive --- they have to follow some other format code, which would have to match whatever data is there. So it seems impractical to examine memory very much beyond the end of the input string via this bug; and the input string will always be in local memory not in disk buffers, making it unlikely that anything very interesting is close to it in a predictable way. So this doesn't quite rise to the level of needing a CVE. Thanks to Wolf Roediger for reporting this bug.	2016-05-06 12:09:20 -04:00
Tom Lane	55c3a04d60	Fix assorted breakage in to_char()'s OF format option. In HEAD, fix incorrect field width for hours part of OF when tm_gmtoff is negative. This was introduced by commit `2d87eedc1d` as a result of falsely applying a pattern that's correct when + signs are omitted, which is not the case for OF. In 9.4, fix missing abs() call that allowed a sign to be attached to the minutes part of OF. This was fixed in 9.5 by `9b43d73b3f`, but for inscrutable reasons not back-patched. In all three versions, ensure that the sign of tm_gmtoff is correctly reported even when the GMT offset is less than 1 hour. Add regression tests, which evidently we desperately need here. Thomas Munro and Tom Lane, per report from David Fetter	2016-03-17 15:50:33 -04:00
Tom Lane	a70e13a39e	Be more careful about out-of-range dates and timestamps. Tighten the semantics of boundary-case timestamptz so that we allow timestamps >= '4714-11-24 00:00+00 BC' and < 'ENDYEAR-01-01 00:00+00 AD' exactly, no more and no less, but it is allowed to enter timestamps within that range using non-GMT timezone offsets (which could make the nominal date 4714-11-23 BC or ENDYEAR-01-01 AD). This eliminates dump/reload failure conditions for timestamps near the endpoints. To do this, separate checking of the inputs for date2j() from the final range check, and allow the Julian date code to handle a range slightly wider than the nominal range of the datatypes. Also add a bunch of checks to detect out-of-range dates and timestamps that formerly could be returned by operations such as date-plus-integer. All C-level functions that return date, timestamp, or timestamptz should now be proof against returning a value that doesn't pass IS_VALID_DATE() or IS_VALID_TIMESTAMP(). Vitaly Burovoy, reviewed by Anastasia Lubennikova, and substantially whacked around by me	2016-03-16 19:09:28 -04:00
Bruce Momjian	ee94300446	Update copyright for 2016 Backpatch certain files through 9.1	2016-01-02 13:33:40 -05:00
Bruce Momjian	28b3a3d41a	to_number(): allow 'V' to divide by 10^(the number of digits) to_char('V') already multiplied in a similar manner. Report by Jeremy Lowery	2015-10-05 21:03:38 -04:00
Bruce Momjian	2d87eedc1d	to_char(): Do not count negative sign as a digit for time values For time masks, like HH24, MI, SS, CC, MM, do not count the negative sign as part of the zero-padding length specified by the mask, e.g. have to_char('-4 years'::interval, 'YY') return '-04', not '-4'. Report by Craig Ringer	2015-10-05 20:51:46 -04:00
Bruce Momjian	807b9e0dff	pgindent run for 9.5	2015-05-23 21:35:49 -04:00
Tom Lane	c90b85e4d9	Second try at fixing warnings caused by commit `9b43d73b3f`. Commit `ef3f9e642d` suppressed one cause of warnings here, but recent clang on OS X is still unhappy because we're passing a "long" to abs(). The fact that tm_gmtoff is declared as long is no doubt a hangover from days when int might be only 16 bits; but Postgres has never been able to run on such machines, so we can just cast it to int with no worries. For consistency, also cast to int in the other uses of tm_gmtoff in this stanza. Note: this code is still broken on machines that don't follow C99 integer-division-truncates-towards-zero rules. Given the lack of complaints about it, I don't feel a large desire to complicate things enough to cope with the pre-C99 rules.	2015-05-03 23:44:52 -04:00
Robert Haas	ef3f9e642d	Attempt to fix some compiler warnings.	2015-04-29 14:02:27 -04:00
Bruce Momjian	9b43d73b3f	to_char(): have format 'OF' only show the leading negative sign Previously both hours and minutes displayed as negative. Report by David Pozsar	2015-04-28 21:02:57 -04:00
Bruce Momjian	33a2c5ecd6	to_char: revert `cc0d90b73b` Revert "to_char(float4/8): zero pad to specified length". There are too many platform-specific problems, and the proper rounding is missing. Also revert companion patch `9d61b9953c`.	2015-03-22 22:56:56 -04:00
Bruce Momjian	cc0d90b73b	to_char(float4/8): zero pad to specified length Previously, zero padding was limited to the internal length, rather than the specified length. This allows it to match to_char(int/numeric), which always padded to the specified length. Regression tests added. BACKWARD INCOMPATIBILITY	2015-03-21 21:43:36 -04:00
Noah Misch	237795a7b4	Check DCH_MAX_ITEM_SIZ limits with <=, not <. We reserve space for the full amount, not one less. The affected checks deal with localized month and day names. Today's DCH_MAX_ITEM_SIZ value would suffice for a 60-byte day name, while the longest known is the 49-byte mn_CN.utf-8 word for "Saturday." Thus, the upshot of this change is merely to avoid misdirecting future readers of the code; users are not expected to see errors either way.	2015-02-06 23:39:52 -05:00
Bruce Momjian	9241c84cbc	to_char(): prevent writing beyond the allocated buffer Previously very long localized month and weekday strings could overflow the allocated buffers, causing a server crash. Reported and patch reviewed by Noah Misch. Backpatch to all supported versions. Security: CVE-2015-0241	2015-02-02 10:00:45 -05:00
Bruce Momjian	0150ab567b	to_char(): prevent accesses beyond the allocated buffer Previously very long field masks for floats could access memory beyond the existing buffer allocated to hold the result. Reported by Andres Freund and Peter Geoghegan. Backpatch to all supported versions. Security: CVE-2015-0241	2015-02-02 10:00:44 -05:00
Bruce Momjian	4baaf863ec	Update copyright for 2015 Backpatch certain files through 9.0	2015-01-06 11:43:47 -05:00
Bruce Momjian	95c38a9895	Revert `f68dc5d86b` Renaming will have to be more comprehensive, so I need approval.	2014-09-12 20:42:19 -04:00
Bruce Momjian	f68dc5d86b	More formatting.c variable renaming, for clarity	2014-09-12 20:35:07 -04:00
Bruce Momjian	a9c22d1480	Rename C variables in formatting.c, for clarity Also add C comments. This should help future debugging of this notorious file.	2014-09-05 09:52:31 -04:00
Heikki Linnakangas	51f41e8c0a	Fix typos in comments.	2014-05-21 23:19:01 -04:00
Bruce Momjian	0a78320057	pgindent run for 9.4 This includes removing tabs after periods in C comments, which was applied to back branches, so this change should not effect backpatching.	2014-05-06 12:12:18 -04:00
Tom Lane	9a8f5729b4	Fix to_timestamp/to_date's handling of consecutive spaces in format string. When there are consecutive spaces (or other non-format-code characters) in the format, we should advance over exactly that many characters of input. The previous coding mistakenly did a "skip whitespace" action between such characters, possibly allowing more input to be skipped than the user intended. We only need to skip whitespace just before an actual field. This is really a bug fix, but given the minimal number of field complaints and the risk of breaking applications coded to expect the old behavior, let's not back-patch it. Jeevan Chalke	2014-01-20 13:45:51 -05:00
Tom Lane	0d79c0a8cc	Make various variables const (read-only). These changes should generally improve correctness/maintainability. A nice side benefit is that several kilobytes move from initialized data to text segment, allowing them to be shared across processes and probably reducing copy-on-write overhead while forking a new backend. Unfortunately this doesn't seem to help libpq in the same way (at least not when it's compiled with -fpic on x86_64), but we can hope the linker at least collects all nominally-const data together even if it's not actually part of the text segment. Also, make pg_encname_tbl[] static in encnames.c, since there seems no very good reason for any other code to use it; per a suggestion from Wim Lewis, who independently submitted a patch that was mostly a subset of this one. Oskari Saarenmaa, with some editorialization by me	2014-01-18 16:04:32 -05:00
Bruce Momjian	7e04792a1c	Update copyright for 2014 Update all files in head, and files COPYRIGHT and legal.sgml in all back branches.	2014-01-07 16:05:30 -05:00
Robert Haas	a8656a3ab0	Make NUM_TOCHAR_prepare and NUM_TOCHAR_finish macros declare "len". Remove the variable from the enclosing scopes so that nothing can be relying on it. The net result of this refactoring is that we get rid of a few unnecessary strlen() calls. Original patch from Greg Jaskiewicz, substantially expanded by me.	2013-12-02 10:51:06 -05:00
Bruce Momjian	7408c5d29b	Add timezone offset output option to to_char() Add ability for to_char() to output the timezone's UTC offset (OF). We already have the ability to return the timezone abbeviation (TZ/tz). Per request from Andrew Dunstan	2013-07-01 13:40:32 -04:00
Bruce Momjian	9af4159fce	pgindent run for release 9.3 This is the first run of the Perl-based pgindent script. Also update pgindent instructions.	2013-05-29 16:58:43 -04:00
Tom Lane	35d50b527a	Fix to_number() to correctly ignore thousands separator when it's '.'. The existing code in NUM_numpart_from_char has hard-wired logic to treat '.' as decimal point, even when we're using a locale-aware format string and the locale says that '.' is the thousands separator. This results in clearly wrong answers in FM mode (where we must be able to identify the decimal point location), as per bug report from Patryk Kordylewski. Since the initialization code in NUM_prepare_locale already sets up Np->decimal as either the locale decimal-point string or "." depending on which decimal-point format code was used, there's really no need to have any extra logic at all in NUM_numpart_from_char: we only need to test for a match to Np->decimal. (Note: AFAICS there's nothing in here that explicitly checks for thousands separators --- rather, any unmatched character is silently skipped over. That's pretty bogus IMO but it's not the issue being complained of.) This is a longstanding bug, but it's possible that some existing apps are depending on '.' being recognized as decimal point even when using a D format code. Hence, no back-patch. We should probably list this as a potential incompatibility in the 9.3 release notes.	2013-05-11 16:35:03 -04:00
Tom Lane	80b011ef0a	Fix to_char() to use ASCII-only case-folding rules where appropriate. formatting.c used locale-dependent case folding rules in some code paths where the result isn't supposed to be locale-dependent, for example to_char(timestamp, 'DAY'). Since the source data is always just ASCII in these cases, that usually didn't matter ... but it does matter in Turkish locales, which have unusual treatment of "i" and "I". To confuse matters even more, the misbehavior was only visible in UTF8 encoding, because in single-byte encodings we used pg_toupper/pg_tolower which don't have locale-specific behavior for ASCII characters. Fix by providing intentionally ASCII-only case-folding functions and using these where appropriate. Per bug #7913 from Adnan Dursun. Back-patch to all active branches, since it's been like this for a long time.	2013-03-05 13:02:30 -05:00
Tom Lane	5c4eb9166e	Reject out-of-range dates in to_date(). Dates outside the supported range could be entered, but would not print reasonably, and operations such as conversion to timestamp wouldn't behave sanely either. Since this has the potential to result in undumpable table data, it seems worth back-patching. Hitoshi Harada	2013-01-14 15:19:48 -05:00
Bruce Momjian	bd61a623ac	Update copyrights for 2013 Fully update git head, and update back branches in ./COPYRIGHT and legal.sgml files.	2013-01-01 17:15:01 -05:00
Bruce Momjian	015722fb36	Fix to_date() and to_timestamp() to allow specification of the day of the week via ISO or Gregorian designations. The fix is to store the day-of-week consistently as 1-7, Sunday = 1. Fixes bug reported by Marc Munro	2012-09-03 22:52:44 -04:00
Bruce Momjian	2751740ab5	Add additional C comments for to_date/to_char() fixes.	2012-08-08 13:27:01 -04:00
Bruce Momjian	ac78c4178b	Fix to_char(), to_date(), and to_timestamp() to handle negative/BC century specifications just like positive/AD centuries. Previously the behavior was either wrong or inconsistent with positive/AD handling. Centuries without years now always assume the first year of the century, which is now documented.	2012-08-07 13:34:44 -04:00
Peter Eisentraut	dd16f9480a	Remove unreachable code The Solaris Studio compiler warns about these instances, unlike more mainstream compilers such as gcc. But manual inspection showed that the code is clearly not reachable, and we hope no worthy compiler will complain about removing this code.	2012-07-16 22:15:03 +03:00
Tom Lane	41f4a0ab78	Fix to_date's handling of year 519. A thinko in commit `029dfdf115` caused the year 519 to be handled differently from either adjacent year, which was not the intention AFAICS. Report and diagnosis by Marc Cousin. In passing, remove redundant re-tests of year value.	2012-07-02 11:35:35 -04:00
Bruce Momjian	927d61eeff	Run pgindent on 9.2 source tree in preparation for first 9.3 commit-fest.	2012-06-10 15:20:04 -04:00
Robert Haas	5d4b60f2f2	Lots of doc corrections. Josh Kupershmidt	2012-04-23 22:43:09 -04:00
Peter Eisentraut	eb990a2b9e	Add const qualifier to tzn returned by timestamp2tm() The tzn value might come from tm->tm_zone, which libc declares as const, so it's prudent that the upper layers know about this as well.	2012-03-15 21:17:19 +02:00

1 2 3 4 5

235 Commits