postgresql/src/timezone/localtime.c

Ignoring revisions in .git-blame-ignore-revs. Click here to bypass and see the normal blame view.

1907 lines
44 KiB
C
Raw Normal View History

/* Convert timestamp from pg_time_t to struct pg_tm. */
2004-04-30 06:44:06 +02:00
/*
* This file is in the public domain, so clarified as of
* 1996-06-05 by Arthur David Olson.
*
* IDENTIFICATION
2010-09-20 22:08:53 +02:00
* src/timezone/localtime.c
*/
2004-04-30 06:44:06 +02:00
/*
* Leap second handling from Bradley White.
* POSIX-style TZ environment variable handling from Guy Harris.
*/
2004-04-30 06:44:06 +02:00
/* this file needs to build in both frontend and backend contexts */
#include "c.h"
2004-04-30 06:44:06 +02:00
#include <fcntl.h>
#include "datatype/timestamp.h"
#include "pgtz.h"
#include "private.h"
2004-04-30 06:44:06 +02:00
#include "tzfile.h"
#ifndef WILDABBR
/*
* Someone might make incorrect use of a time zone abbreviation:
* 1. They might reference tzname[0] before calling tzset (explicitly
* or implicitly).
* 2. They might reference tzname[1] before calling tzset (explicitly
* or implicitly).
* 3. They might reference tzname[1] after setting to a time zone
* in which Daylight Saving Time is never observed.
* 4. They might reference tzname[0] after setting to a time zone
* in which Standard Time is never observed.
* 5. They might reference tm.tm_zone after calling offtime.
* What's best to do in the above cases is open to debate;
* for now, we just set things up so that in any of the five cases
* WILDABBR is used. Another possibility: initialize tzname[0] to the
* string "tzname[0] used before set", and similarly for the other cases.
* And another: initialize tzname[0] to "ERA", with an explanation in the
* manual page of what this "time zone abbreviation" means (doing this so
* that tzname[0] has the "normal" length of three characters).
*/
2004-04-30 06:44:06 +02:00
#define WILDABBR " "
#endif /* !defined WILDABBR */
2004-04-30 06:44:06 +02:00
static const char wildabbr[] = WILDABBR;
2004-04-30 06:44:06 +02:00
static const char gmt[] = "GMT";
2004-04-30 06:44:06 +02:00
Improve performance of timezone loading, especially pg_timezone_names view. tzparse() would attempt to load the "posixrules" timezone database file on each call. That might seem like it would only be an issue when selecting a POSIX-style zone name rather than a zone defined in the timezone database, but it turns out that each zone definition file contains a POSIX-style zone string and tzload() will call tzparse() to parse that. Thus, when scanning the whole timezone file tree as we do in the pg_timezone_names view, "posixrules" was read repetitively for each zone definition file. Fix that by caching the file on first use within any given process. (We cache other zone definitions for the life of the process, so there seems little reason not to cache this one as well.) This probably won't help much in processes that never run pg_timezone_names, but even one additional SET of the timezone GUC would come out ahead. An even worse problem for pg_timezone_names is that pg_open_tzfile() has an inefficient way of identifying the canonical case of a zone name: it basically re-descends the directory tree to the zone file. That's not awful for an individual "SET timezone" operation, but it's pretty horrid when we're inspecting every zone in the database. And it's pointless too because we already know the canonical spelling, having just read it from the filesystem. Fix by teaching pg_open_tzfile() to avoid the directory search if it's not asked for the canonical name, and backfilling the proper result in pg_tzenumerate_next(). In combination these changes seem to make the pg_timezone_names view about 3x faster to read, for me. Since a scan of pg_timezone_names has up to now been one of the slowest queries in the regression tests, this should help some little bit for buildfarm cycle times. Back-patch to all supported branches, not so much because it's likely that users will care much about the view's performance as because tracking changes in the upstream IANA timezone code is really painful if we don't keep all the branches in sync. Discussion: https://postgr.es/m/27962.1493671706@sss.pgh.pa.us
2017-05-03 03:50:35 +02:00
/*
Remove support for timezone "posixrules" file. The IANA tzcode library has a feature to read a time zone file named "posixrules" and apply the daylight-savings transition dates and times therein, when it is given a POSIX-style time zone specification that lacks an explicit transition rule. However, there's a problem with that code: it doesn't work for dates past the Y2038 time_t rollover. (Effectively, all times beyond that point are treated as standard time.) The IANA crew regard this feature as legacy, so their plan is to remove it not fix it. The time frame in which that will happen is unclear, but presumably it'll happen well before 2038. Moreover, effective with the next IANA data update (probably this fall), the recommended default will be to not install a "posixrules" file in the first place. The time frame in which tzdata packagers might adopt that suggestion is likewise unclear, but at least some platforms will probably do it in the next year or so. While we could ignore that recommendation so far as PG-supplied tzdata trees are concerned, builds using --with-system-tzdata will be subject to whatever the platform's tzdata packager decides to do. Thus, whether or not we do anything, some increasing fraction of Postgres users will be exposed to the behavior observed when there is no "posixrules" file; and if we do nothing, we'll have essentially no control over the timing of that change. The best thing to do to ameliorate the uncertainty seems to be to proactively remove the posixrules-reading feature. If we do that in a scheduled release then at least we can release-note the behavioral change, rather than having users be surprised by it after a routine tzdata update. The change in question is fairly minor anyway: to be affected, you have to be using a POSIX-style timezone spec, it has to not have an explicit rule, and it has to not be one of the four traditional continental-USA zone names (EST5EDT, CST6CDT, MST7MDT, or PST8PDT), as those are special-cased. Since the default "posixrules" file provides USA DST rules, the number of people who are likely to find such a zone spec useful is probably quite small. Moreover, the fallback behavior with no explicit rule and no "posixrules" file is to apply current USA rules, so the only thing that really breaks is the DST transitions in years before 2007 (and you get the countervailing fix that transitions after 2038 will be applied). Now, some installations might have replaced the "posixrules" file, allowing e.g. EU rules to be applied to a POSIX-style timezone spec. That won't work anymore. But it's not exactly clear why this solution would be preferable to using a regular named zone. In any case, given the Y2038 issue, we need to be pushing users to stop depending on this. Back-patch into v13; it hasn't been released yet, so it seems OK to change its behavior. (Personally I think we ought to back-patch further, but I've been outvoted.) Discussion: https://postgr.es/m/1390.1562258309@sss.pgh.pa.us Discussion: https://postgr.es/m/20200621211855.6211-1-eggert@cs.ucla.edu
2020-06-30 00:55:01 +02:00
* The DST rules to use if a POSIX TZ string has no rules.
* Default to US rules as of 2017-05-07.
* POSIX does not specify the default DST rules;
* for historical reasons, US rules are a common default.
*/
#define TZDEFRULESTRING ",M3.2.0,M11.1.0"
2004-04-30 06:44:06 +02:00
/* structs ttinfo, lsinfo, state have been moved to pgtz.h */
enum r_type
{
JULIAN_DAY, /* Jn = Julian day */
DAY_OF_YEAR, /* n = day of year */
MONTH_NTH_DAY_OF_WEEK, /* Mm.n.d = month, week, day of week */
};
struct rule
{
enum r_type r_type; /* type of rule */
int r_day; /* day number of rule */
int r_week; /* week number of rule */
int r_mon; /* month number of rule */
int32 r_time; /* transition time of rule */
2004-04-30 06:44:06 +02:00
};
/*
* Prototypes for static functions.
*/
2004-04-30 06:44:06 +02:00
static struct pg_tm *gmtsub(pg_time_t const *timep, int32 offset,
struct pg_tm *tmp);
static bool increment_overflow(int *ip, int j);
static bool increment_overflow_time(pg_time_t *tp, int32 j);
static int64 leapcorr(struct state const *sp, pg_time_t t);
static struct pg_tm *timesub(pg_time_t const *timep,
int32 offset, struct state const *sp,
struct pg_tm *tmp);
static bool typesequiv(struct state const *sp, int a, int b);
2004-04-30 06:44:06 +02:00
/*
* Section 4.12.3 of X3.159-1989 requires that
* Except for the strftime function, these functions [asctime,
* ctime, gmtime, localtime] return values in one of two static
* objects: a broken-down time structure and an array of char.
* Thanks to Paul Eggert for noting this.
*/
2004-04-30 06:44:06 +02:00
static struct pg_tm tm;
2004-04-30 06:44:06 +02:00
/* Initialize *S to a value based on UTOFF, ISDST, and DESIGIDX. */
static void
init_ttinfo(struct ttinfo *s, int32 utoff, bool isdst, int desigidx)
{
s->tt_utoff = utoff;
s->tt_isdst = isdst;
s->tt_desigidx = desigidx;
s->tt_ttisstd = false;
s->tt_ttisut = false;
}
2004-04-30 06:44:06 +02:00
static int32
detzcode(const char *const codep)
2004-04-30 06:44:06 +02:00
{
int32 result;
int i;
int32 one = 1;
int32 halfmaxval = one << (32 - 2);
int32 maxval = halfmaxval - 1 + halfmaxval;
int32 minval = -1 - maxval;
2004-04-30 06:44:06 +02:00
result = codep[0] & 0x7f;
for (i = 1; i < 4; ++i)
2004-04-30 06:44:06 +02:00
result = (result << 8) | (codep[i] & 0xff);
if (codep[0] & 0x80)
{
/*
* Do two's-complement negation even on non-two's-complement machines.
* If the result would be minval - 1, return minval.
*/
result -= !TWOS_COMPLEMENT(int32) && result != 0;
result += minval;
}
2004-04-30 06:44:06 +02:00
return result;
}
static int64
detzcode64(const char *const codep)
{
uint64 result;
int i;
int64 one = 1;
int64 halfmaxval = one << (64 - 2);
int64 maxval = halfmaxval - 1 + halfmaxval;
int64 minval = -TWOS_COMPLEMENT(int64) - maxval;
result = codep[0] & 0x7f;
for (i = 1; i < 8; ++i)
result = (result << 8) | (codep[i] & 0xff);
if (codep[0] & 0x80)
{
/*
* Do two's-complement negation even on non-two's-complement machines.
* If the result would be minval - 1, return minval.
*/
result -= !TWOS_COMPLEMENT(int64) && result != 0;
result += minval;
}
return result;
}
static bool
differ_by_repeat(const pg_time_t t1, const pg_time_t t0)
{
if (TYPE_BIT(pg_time_t) - TYPE_SIGNED(pg_time_t) < SECSPERREPEAT_BITS)
return 0;
return t1 - t0 == SECSPERREPEAT;
}
/* Input buffer for data read from a compiled tz file. */
union input_buffer
{
/* The first part of the buffer, interpreted as a header. */
struct tzhead tzhead;
/* The entire buffer. */
char buf[2 * sizeof(struct tzhead) + 2 * sizeof(struct state)
+ 4 * TZ_MAX_TIMES];
};
/* Local storage needed for 'tzloadbody'. */
union local_storage
{
/* The results of analyzing the file's contents after it is opened. */
struct file_analysis
{
/* The input buffer. */
union input_buffer u;
/* A temporary state used for parsing a TZ string in the file. */
struct state st;
} u;
/* We don't need the "fullname" member */
};
/* Load tz data from the file named NAME into *SP. Read extended
* format if DOEXTEND. Use *LSP for temporary storage. Return 0 on
* success, an errno value on failure.
* PG: If "canonname" is not NULL, then on success the canonical spelling of
* given name is stored there (the buffer must be > TZ_STRLEN_MAX bytes!).
*/
static int
tzloadbody(char const *name, char *canonname, struct state *sp, bool doextend,
union local_storage *lsp)
2004-04-30 06:44:06 +02:00
{
int i;
int fid;
int stored;
ssize_t nread;
union input_buffer *up = &lsp->u.u;
int tzheadsize = sizeof(struct tzhead);
sp->goback = sp->goahead = false;
if (!name)
{
name = TZDEFAULT;
if (!name)
return EINVAL;
}
2004-04-30 06:44:06 +02:00
if (name[0] == ':')
++name;
fid = pg_open_tzfile(name, canonname);
if (fid < 0)
return ENOENT; /* pg_open_tzfile may not set errno */
nread = read(fid, up->buf, sizeof up->buf);
if (nread < tzheadsize)
{
int err = nread < 0 ? errno : EINVAL;
close(fid);
return err;
}
if (close(fid) < 0)
return errno;
for (stored = 4; stored <= 8; stored *= 2)
2004-04-30 06:44:06 +02:00
{
int32 ttisstdcnt = detzcode(up->tzhead.tzh_ttisstdcnt);
int32 ttisutcnt = detzcode(up->tzhead.tzh_ttisutcnt);
int64 prevtr = 0;
int32 prevcorr = 0;
int32 leapcnt = detzcode(up->tzhead.tzh_leapcnt);
int32 timecnt = detzcode(up->tzhead.tzh_timecnt);
int32 typecnt = detzcode(up->tzhead.tzh_typecnt);
int32 charcnt = detzcode(up->tzhead.tzh_charcnt);
char const *p = up->buf + tzheadsize;
/*
* Although tzfile(5) currently requires typecnt to be nonzero,
* support future formats that may allow zero typecnt in files that
* have a TZ string and no transitions.
*/
if (!(0 <= leapcnt && leapcnt < TZ_MAX_LEAPS
&& 0 <= typecnt && typecnt < TZ_MAX_TYPES
&& 0 <= timecnt && timecnt < TZ_MAX_TIMES
&& 0 <= charcnt && charcnt < TZ_MAX_CHARS
&& (ttisstdcnt == typecnt || ttisstdcnt == 0)
&& (ttisutcnt == typecnt || ttisutcnt == 0)))
return EINVAL;
if (nread
< (tzheadsize /* struct tzhead */
+ timecnt * stored /* ats */
+ timecnt /* types */
+ typecnt * 6 /* ttinfos */
+ charcnt /* chars */
+ leapcnt * (stored + 4) /* lsinfos */
+ ttisstdcnt /* ttisstds */
+ ttisutcnt)) /* ttisuts */
return EINVAL;
sp->leapcnt = leapcnt;
sp->timecnt = timecnt;
sp->typecnt = typecnt;
sp->charcnt = charcnt;
/*
* Read transitions, discarding those out of pg_time_t range. But
* pretend the last transition before TIME_T_MIN occurred at
* TIME_T_MIN.
*/
timecnt = 0;
for (i = 0; i < sp->timecnt; ++i)
{
int64 at
= stored == 4 ? detzcode(p) : detzcode64(p);
sp->types[i] = at <= TIME_T_MAX;
if (sp->types[i])
{
pg_time_t attime
= ((TYPE_SIGNED(pg_time_t) ? at < TIME_T_MIN : at < 0)
? TIME_T_MIN : at);
if (timecnt && attime <= sp->ats[timecnt - 1])
{
if (attime < sp->ats[timecnt - 1])
return EINVAL;
sp->types[i - 1] = 0;
timecnt--;
}
sp->ats[timecnt++] = attime;
}
p += stored;
2004-04-30 06:44:06 +02:00
}
timecnt = 0;
for (i = 0; i < sp->timecnt; ++i)
{
unsigned char typ = *p++;
if (sp->typecnt <= typ)
return EINVAL;
if (sp->types[i])
sp->types[timecnt++] = typ;
2004-04-30 06:44:06 +02:00
}
sp->timecnt = timecnt;
for (i = 0; i < sp->typecnt; ++i)
{
struct ttinfo *ttisp;
unsigned char isdst,
desigidx;
2004-04-30 06:44:06 +02:00
ttisp = &sp->ttis[i];
ttisp->tt_utoff = detzcode(p);
2004-04-30 06:44:06 +02:00
p += 4;
isdst = *p++;
if (!(isdst < 2))
return EINVAL;
ttisp->tt_isdst = isdst;
desigidx = *p++;
if (!(desigidx < sp->charcnt))
return EINVAL;
ttisp->tt_desigidx = desigidx;
2004-04-30 06:44:06 +02:00
}
for (i = 0; i < sp->charcnt; ++i)
sp->chars[i] = *p++;
sp->chars[i] = '\0'; /* ensure '\0' at end */
/* Read leap seconds, discarding those out of pg_time_t range. */
leapcnt = 0;
for (i = 0; i < sp->leapcnt; ++i)
{
int64 tr = stored == 4 ? detzcode(p) : detzcode64(p);
int32 corr = detzcode(p + stored);
2004-04-30 06:44:06 +02:00
p += stored + 4;
/* Leap seconds cannot occur before the Epoch. */
if (tr < 0)
return EINVAL;
if (tr <= TIME_T_MAX)
{
/*
* Leap seconds cannot occur more than once per UTC month, and
* UTC months are at least 28 days long (minus 1 second for a
* negative leap second). Each leap second's correction must
* differ from the previous one's by 1 second.
*/
if (tr - prevtr < 28 * SECSPERDAY - 1
|| (corr != prevcorr - 1 && corr != prevcorr + 1))
return EINVAL;
sp->lsis[leapcnt].ls_trans = prevtr = tr;
sp->lsis[leapcnt].ls_corr = prevcorr = corr;
leapcnt++;
}
2004-04-30 06:44:06 +02:00
}
sp->leapcnt = leapcnt;
for (i = 0; i < sp->typecnt; ++i)
{
struct ttinfo *ttisp;
2004-04-30 06:44:06 +02:00
ttisp = &sp->ttis[i];
if (ttisstdcnt == 0)
ttisp->tt_ttisstd = false;
else
{
if (*p != true && *p != false)
return EINVAL;
2004-04-30 06:44:06 +02:00
ttisp->tt_ttisstd = *p++;
}
}
for (i = 0; i < sp->typecnt; ++i)
{
struct ttinfo *ttisp;
2004-04-30 06:44:06 +02:00
ttisp = &sp->ttis[i];
if (ttisutcnt == 0)
ttisp->tt_ttisut = false;
else
{
if (*p != true && *p != false)
return EINVAL;
ttisp->tt_ttisut = *p++;
2004-04-30 06:44:06 +02:00
}
}
/*
* If this is an old file, we're done.
*/
if (up->tzhead.tzh_version[0] == '\0')
break;
nread -= p - up->buf;
memmove(up->buf, p, nread);
2004-04-30 06:44:06 +02:00
}
if (doextend && nread > 2 &&
up->buf[0] == '\n' && up->buf[nread - 1] == '\n' &&
sp->typecnt + 2 <= TZ_MAX_TYPES)
{
struct state *ts = &lsp->u.st;
up->buf[nread - 1] = '\0';
if (tzparse(&up->buf[1], ts, false))
{
/*
* Attempt to reuse existing abbreviations. Without this,
* America/Anchorage would be right on the edge after 2037 when
* TZ_MAX_CHARS is 50, as sp->charcnt equals 40 (for LMT AST AWT
* APT AHST AHDT YST AKDT AKST) and ts->charcnt equals 10 (for
* AKST AKDT). Reusing means sp->charcnt can stay 40 in this
* example.
*/
int gotabbr = 0;
int charcnt = sp->charcnt;
for (i = 0; i < ts->typecnt; i++)
{
char *tsabbr = ts->chars + ts->ttis[i].tt_desigidx;
int j;
for (j = 0; j < charcnt; j++)
if (strcmp(sp->chars + j, tsabbr) == 0)
{
ts->ttis[i].tt_desigidx = j;
gotabbr++;
break;
}
if (!(j < charcnt))
{
int tsabbrlen = strlen(tsabbr);
if (j + tsabbrlen < TZ_MAX_CHARS)
{
strcpy(sp->chars + j, tsabbr);
charcnt = j + tsabbrlen + 1;
ts->ttis[i].tt_desigidx = j;
gotabbr++;
}
}
}
if (gotabbr == ts->typecnt)
{
sp->charcnt = charcnt;
/*
* Ignore any trailing, no-op transitions generated by zic as
* they don't help here and can run afoul of bugs in zic 2016j
* or earlier.
*/
while (1 < sp->timecnt
&& (sp->types[sp->timecnt - 1]
== sp->types[sp->timecnt - 2]))
sp->timecnt--;
for (i = 0; i < ts->timecnt; i++)
if (sp->timecnt == 0
|| (sp->ats[sp->timecnt - 1]
< ts->ats[i] + leapcorr(sp, ts->ats[i])))
break;
while (i < ts->timecnt
&& sp->timecnt < TZ_MAX_TIMES)
{
sp->ats[sp->timecnt]
= ts->ats[i] + leapcorr(sp, ts->ats[i]);
sp->types[sp->timecnt] = (sp->typecnt
+ ts->types[i]);
sp->timecnt++;
i++;
}
for (i = 0; i < ts->typecnt; i++)
sp->ttis[sp->typecnt++] = ts->ttis[i];
}
}
}
if (sp->typecnt == 0)
return EINVAL;
if (sp->timecnt > 1)
{
for (i = 1; i < sp->timecnt; ++i)
if (typesequiv(sp, sp->types[i], sp->types[0]) &&
differ_by_repeat(sp->ats[i], sp->ats[0]))
{
sp->goback = true;
break;
}
for (i = sp->timecnt - 2; i >= 0; --i)
if (typesequiv(sp, sp->types[sp->timecnt - 1],
sp->types[i]) &&
differ_by_repeat(sp->ats[sp->timecnt - 1],
sp->ats[i]))
{
sp->goahead = true;
break;
}
}
/*
* Infer sp->defaulttype from the data. Although this default type is
* always zero for data from recent tzdb releases, things are trickier for
* data from tzdb 2018e or earlier.
*
* The first set of heuristics work around bugs in 32-bit data generated
* by tzdb 2013c or earlier. The workaround is for zones like
* Australia/Macquarie where timestamps before the first transition have a
* time type that is not the earliest standard-time type. See:
* https://mm.icann.org/pipermail/tz/2013-May/019368.html
*/
/*
2016-08-29 09:06:40 +02:00
* If type 0 is unused in transitions, it's the type to use for early
* times.
*/
for (i = 0; i < sp->timecnt; ++i)
if (sp->types[i] == 0)
break;
i = i < sp->timecnt ? -1 : 0;
/*
* Absent the above, if there are transition times and the first
* transition is to a daylight time find the standard type less than and
* closest to the type of the first transition.
*/
if (i < 0 && sp->timecnt > 0 && sp->ttis[sp->types[0]].tt_isdst)
{
i = sp->types[0];
while (--i >= 0)
if (!sp->ttis[i].tt_isdst)
break;
}
/*
* The next heuristics are for data generated by tzdb 2018e or earlier,
* for zones like EST5EDT where the first transition is to DST.
*/
/*
* If no result yet, find the first standard type. If there is none, punt
* to type zero.
*/
if (i < 0)
{
i = 0;
while (sp->ttis[i].tt_isdst)
if (++i >= sp->typecnt)
{
i = 0;
break;
}
}
/*
* A simple 'sp->defaulttype = 0;' would suffice here if we didn't have to
* worry about 2018e-or-earlier data. Even simpler would be to remove the
* defaulttype member and just use 0 in its place.
*/
sp->defaulttype = i;
2004-04-30 06:44:06 +02:00
return 0;
}
/* Load tz data from the file named NAME into *SP. Read extended
* format if DOEXTEND. Return 0 on success, an errno value on failure.
* PG: If "canonname" is not NULL, then on success the canonical spelling of
* given name is stored there (the buffer must be > TZ_STRLEN_MAX bytes!).
*/
int
tzload(const char *name, char *canonname, struct state *sp, bool doextend)
{
Reduce stack space consumption in tzload(). While syncing our timezone code with IANA's updates in commit 1c1a7cbd6, I'd chosen not to adopt the code they conditionally compile under #ifdef ALL_STATE. The main thing that that drives is that the space for gmtime and localtime timezone definitions isn't statically allocated, but is malloc'd on first use. I reasoned we didn't need that logic: we don't have localtime() at all, and we always initialize TimeZone to GMT so we always need that one. But there is one other thing ALL_STATE does, which is to make tzload() malloc its transient workspace instead of just declaring it as a local variable. It turns out that that local variable occupies 78K. Even worse is that, at least for common US timezone settings, there's a recursive call to parse the "posixrules" zone name, making peak stack consumption to select a time zone upwards of 150K. That's an uncomfortably large fraction of our STACK_DEPTH_SLOP safety margin, and could result in outright crashes if we try to reduce STACK_DEPTH_SLOP as has been discussed recently. Furthermore, this means that the postmaster's peak stack consumption is several times that of a backend running typical queries (since, except on Windows, backends inherit the timezone GUC values and don't ever run this code themselves unless you do SET TIMEZONE). That's completely backwards from a safety perspective. Hence, adopt the ALL_STATE rather than non-ALL_STATE variant of tzload(), while not changing the other code aspects that symbol controls. The risk of an ENOMEM error from malloc() seems less than that of a SIGSEGV from stack overrun. This should probably get back-patched along with 1c1a7cbd6 and followon fixes, whenever we decide we have enough confidence in the updates to do that.
2016-07-07 17:28:17 +02:00
union local_storage *lsp = malloc(sizeof *lsp);
Reduce stack space consumption in tzload(). While syncing our timezone code with IANA's updates in commit 1c1a7cbd6, I'd chosen not to adopt the code they conditionally compile under #ifdef ALL_STATE. The main thing that that drives is that the space for gmtime and localtime timezone definitions isn't statically allocated, but is malloc'd on first use. I reasoned we didn't need that logic: we don't have localtime() at all, and we always initialize TimeZone to GMT so we always need that one. But there is one other thing ALL_STATE does, which is to make tzload() malloc its transient workspace instead of just declaring it as a local variable. It turns out that that local variable occupies 78K. Even worse is that, at least for common US timezone settings, there's a recursive call to parse the "posixrules" zone name, making peak stack consumption to select a time zone upwards of 150K. That's an uncomfortably large fraction of our STACK_DEPTH_SLOP safety margin, and could result in outright crashes if we try to reduce STACK_DEPTH_SLOP as has been discussed recently. Furthermore, this means that the postmaster's peak stack consumption is several times that of a backend running typical queries (since, except on Windows, backends inherit the timezone GUC values and don't ever run this code themselves unless you do SET TIMEZONE). That's completely backwards from a safety perspective. Hence, adopt the ALL_STATE rather than non-ALL_STATE variant of tzload(), while not changing the other code aspects that symbol controls. The risk of an ENOMEM error from malloc() seems less than that of a SIGSEGV from stack overrun. This should probably get back-patched along with 1c1a7cbd6 and followon fixes, whenever we decide we have enough confidence in the updates to do that.
2016-07-07 17:28:17 +02:00
if (!lsp)
return errno;
else
{
int err = tzloadbody(name, canonname, sp, doextend, lsp);
free(lsp);
return err;
}
}
static bool
typesequiv(const struct state *sp, int a, int b)
{
bool result;
if (sp == NULL ||
a < 0 || a >= sp->typecnt ||
b < 0 || b >= sp->typecnt)
result = false;
else
{
const struct ttinfo *ap = &sp->ttis[a];
const struct ttinfo *bp = &sp->ttis[b];
result = (ap->tt_utoff == bp->tt_utoff
&& ap->tt_isdst == bp->tt_isdst
&& ap->tt_ttisstd == bp->tt_ttisstd
&& ap->tt_ttisut == bp->tt_ttisut
&& (strcmp(&sp->chars[ap->tt_desigidx],
&sp->chars[bp->tt_desigidx])
== 0));
}
return result;
}
static const int mon_lengths[2][MONSPERYEAR] = {
{31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31},
{31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31}
2004-04-30 06:44:06 +02:00
};
static const int year_lengths[2] = {
2004-04-30 06:44:06 +02:00
DAYSPERNYEAR, DAYSPERLYEAR
};
/*
* Given a pointer into a timezone string, scan until a character that is not
* a valid character in a time zone abbreviation is found.
* Return a pointer to that character.
*/
static const char *
getzname(const char *strp)
2004-04-30 06:44:06 +02:00
{
char c;
2004-04-30 06:44:06 +02:00
while ((c = *strp) != '\0' && !is_digit(c) && c != ',' && c != '-' &&
c != '+')
++strp;
2004-04-30 06:44:06 +02:00
return strp;
}
/*
* Given a pointer into an extended timezone string, scan until the ending
* delimiter of the time zone abbreviation is located.
* Return a pointer to the delimiter.
*
* As with getzname above, the legal character set is actually quite
* restricted, with other characters producing undefined results.
* We don't do any checking here; checking is done later in common-case code.
*/
static const char *
getqzname(const char *strp, const int delim)
{
int c;
while ((c = *strp) != '\0' && c != delim)
++strp;
return strp;
}
2004-04-30 06:44:06 +02:00
/*
* Given a pointer into a timezone string, extract a number from that string.
* Check that the number is within a specified range; if it is not, return
* NULL.
* Otherwise, return a pointer to the first character not part of the number.
*/
static const char *
getnum(const char *strp, int *const nump, const int min, const int max)
2004-04-30 06:44:06 +02:00
{
char c;
int num;
2004-04-30 06:44:06 +02:00
if (strp == NULL || !is_digit(c = *strp))
return NULL;
num = 0;
do
{
2004-04-30 06:44:06 +02:00
num = num * 10 + (c - '0');
if (num > max)
return NULL; /* illegal value */
2004-04-30 06:44:06 +02:00
c = *++strp;
} while (is_digit(c));
if (num < min)
return NULL; /* illegal value */
2004-04-30 06:44:06 +02:00
*nump = num;
return strp;
}
/*
* Given a pointer into a timezone string, extract a number of seconds,
* in hh[:mm[:ss]] form, from the string.
* If any error occurs, return NULL.
* Otherwise, return a pointer to the first character not part of the number
* of seconds.
*/
static const char *
getsecs(const char *strp, int32 *const secsp)
2004-04-30 06:44:06 +02:00
{
int num;
2004-04-30 06:44:06 +02:00
/*
* 'HOURSPERDAY * DAYSPERWEEK - 1' allows quasi-Posix rules like
* "M10.4.6/26", which does not conform to Posix, but which specifies the
* equivalent of "02:00 on the first Sunday on or after 23 Oct".
*/
2004-04-30 06:44:06 +02:00
strp = getnum(strp, &num, 0, HOURSPERDAY * DAYSPERWEEK - 1);
if (strp == NULL)
return NULL;
*secsp = num * (int32) SECSPERHOUR;
if (*strp == ':')
{
2004-04-30 06:44:06 +02:00
++strp;
strp = getnum(strp, &num, 0, MINSPERHOUR - 1);
if (strp == NULL)
return NULL;
*secsp += num * SECSPERMIN;
if (*strp == ':')
{
2004-04-30 06:44:06 +02:00
++strp;
/* 'SECSPERMIN' allows for leap seconds. */
2004-04-30 06:44:06 +02:00
strp = getnum(strp, &num, 0, SECSPERMIN);
if (strp == NULL)
return NULL;
*secsp += num;
}
}
return strp;
}
/*
* Given a pointer into a timezone string, extract an offset, in
* [+-]hh[:mm[:ss]] form, from the string.
* If any error occurs, return NULL.
* Otherwise, return a pointer to the first character not part of the time.
*/
static const char *
getoffset(const char *strp, int32 *const offsetp)
2004-04-30 06:44:06 +02:00
{
bool neg = false;
2004-04-30 06:44:06 +02:00
if (*strp == '-')
{
neg = true;
2004-04-30 06:44:06 +02:00
++strp;
}
else if (*strp == '+')
2004-04-30 06:44:06 +02:00
++strp;
strp = getsecs(strp, offsetp);
if (strp == NULL)
return NULL; /* illegal time */
2004-04-30 06:44:06 +02:00
if (neg)
*offsetp = -*offsetp;
return strp;
}
/*
* Given a pointer into a timezone string, extract a rule in the form
* date[/time]. See POSIX section 8 for the format of "date" and "time".
* If a valid rule is not found, return NULL.
* Otherwise, return a pointer to the first character not part of the rule.
*/
static const char *
getrule(const char *strp, struct rule *const rulep)
2004-04-30 06:44:06 +02:00
{
if (*strp == 'J')
{
2004-04-30 06:44:06 +02:00
/*
* Julian day.
*/
2004-04-30 06:44:06 +02:00
rulep->r_type = JULIAN_DAY;
++strp;
strp = getnum(strp, &rulep->r_day, 1, DAYSPERNYEAR);
}
else if (*strp == 'M')
{
2004-04-30 06:44:06 +02:00
/*
* Month, week, day.
*/
2004-04-30 06:44:06 +02:00
rulep->r_type = MONTH_NTH_DAY_OF_WEEK;
++strp;
strp = getnum(strp, &rulep->r_mon, 1, MONSPERYEAR);
if (strp == NULL)
return NULL;
if (*strp++ != '.')
return NULL;
strp = getnum(strp, &rulep->r_week, 1, 5);
if (strp == NULL)
return NULL;
if (*strp++ != '.')
return NULL;
strp = getnum(strp, &rulep->r_day, 0, DAYSPERWEEK - 1);
}
else if (is_digit(*strp))
{
2004-04-30 06:44:06 +02:00
/*
* Day of year.
*/
2004-04-30 06:44:06 +02:00
rulep->r_type = DAY_OF_YEAR;
strp = getnum(strp, &rulep->r_day, 0, DAYSPERLYEAR - 1);
}
else
return NULL; /* invalid format */
2004-04-30 06:44:06 +02:00
if (strp == NULL)
return NULL;
if (*strp == '/')
{
2004-04-30 06:44:06 +02:00
/*
* Time specified.
*/
2004-04-30 06:44:06 +02:00
++strp;
strp = getoffset(strp, &rulep->r_time);
}
else
rulep->r_time = 2 * SECSPERHOUR; /* default = 2:00:00 */
2004-04-30 06:44:06 +02:00
return strp;
}
/*
* Given a year, a rule, and the offset from UT at the time that rule takes
* effect, calculate the year-relative time that rule takes effect.
*/
static int32
transtime(const int year, const struct rule *const rulep,
const int32 offset)
2004-04-30 06:44:06 +02:00
{
bool leapyear;
int32 value;
int i;
int d,
m1,
yy0,
yy1,
yy2,
dow;
2004-04-30 06:44:06 +02:00
INITIALIZE(value);
2004-04-30 06:44:06 +02:00
leapyear = isleap(year);
switch (rulep->r_type)
{
2004-04-30 06:44:06 +02:00
case JULIAN_DAY:
2004-04-30 06:44:06 +02:00
/*
* Jn - Julian day, 1 == January 1, 60 == March 1 even in leap
* years. In non-leap years, or if the day number is 59 or less,
* just add SECSPERDAY times the day number-1 to the time of
* January 1, midnight, to get the day.
*/
value = (rulep->r_day - 1) * SECSPERDAY;
if (leapyear && rulep->r_day >= 60)
value += SECSPERDAY;
break;
2004-04-30 06:44:06 +02:00
case DAY_OF_YEAR:
/*
* n - day of year. Just add SECSPERDAY times the day number to
* the time of January 1, midnight, to get the day.
*/
value = rulep->r_day * SECSPERDAY;
break;
case MONTH_NTH_DAY_OF_WEEK:
/*
* Mm.n.d - nth "dth day" of month m.
*/
/*
* Use Zeller's Congruence to get day-of-week of first day of
* month.
*/
m1 = (rulep->r_mon + 9) % 12 + 1;
yy0 = (rulep->r_mon <= 2) ? (year - 1) : year;
yy1 = yy0 / 100;
yy2 = yy0 % 100;
dow = ((26 * m1 - 2) / 10 +
1 + yy2 + yy2 / 4 + yy1 / 4 - 2 * yy1) % 7;
if (dow < 0)
dow += DAYSPERWEEK;
/*
* "dow" is the day-of-week of the first day of the month. Get the
* day-of-month (zero-origin) of the first "dow" day of the month.
*/
d = rulep->r_day - dow;
if (d < 0)
d += DAYSPERWEEK;
for (i = 1; i < rulep->r_week; ++i)
{
if (d + DAYSPERWEEK >=
mon_lengths[(int) leapyear][rulep->r_mon - 1])
2004-04-30 06:44:06 +02:00
break;
d += DAYSPERWEEK;
}
2004-04-30 06:44:06 +02:00
/*
* "d" is the day-of-month (zero-origin) of the day we want.
*/
value = d * SECSPERDAY;
for (i = 0; i < rulep->r_mon - 1; ++i)
value += mon_lengths[(int) leapyear][i] * SECSPERDAY;
break;
2004-04-30 06:44:06 +02:00
}
/*
* "value" is the year-relative time of 00:00:00 UT on the day in
* question. To get the year-relative time of the specified local time on
* that day, add the transition time and the current offset from UT.
*/
2004-04-30 06:44:06 +02:00
return value + rulep->r_time + offset;
}
/*
* Given a POSIX section 8-style TZ string, fill in the rule tables as
* appropriate.
* Returns true on success, false on failure.
*/
bool
tzparse(const char *name, struct state *sp, bool lastditch)
2004-04-30 06:44:06 +02:00
{
const char *stdname;
const char *dstname = NULL;
size_t stdlen;
size_t dstlen;
size_t charcnt;
int32 stdoffset;
int32 dstoffset;
char *cp;
bool load_ok;
2004-04-30 06:44:06 +02:00
stdname = name;
if (lastditch)
{
Improve tzparse's handling of TZDEFRULES ("posixrules") zone data. In the IANA timezone code, tzparse() always tries to load the zone file named by TZDEFRULES ("posixrules"). Previously, we'd hacked that logic to skip the load in the "lastditch" code path, which we use only to initialize the default "GMT" zone during GUC initialization. That's critical for a couple of reasons: since we do not support leap seconds, we *must not* allow "GMT" to have leap seconds, and since this case runs before the GUC subsystem is fully alive, we'd really rather not take the risk of pg_open_tzfile throwing any errors. However, that still left the code reading TZDEFRULES on every other call, something we'd noticed to the extent of having added code to cache the result so it was only done once per process not a lot of times. Andres Freund complained about the static data space used up for the cache; but as long as the logic was like this, there was no point in trying to get rid of that space. We can improve matters by looking a bit more closely at what the IANA code actually needs the TZDEFRULES data for. One thing it does is that if "posixrules" is a leap-second-aware zone, the leap-second behavior will be absorbed into every POSIX-style zone specification. However, that's a behavior we'd really prefer to do without, since for our purposes the end effect is to render every POSIX-style zone name unsupported. Otherwise, the TZDEFRULES data is used only if the POSIX zone name specifies DST but doesn't include a transition date rule (e.g., "EST5EDT" rather than "EST5EDT,M3.2.0,M11.1.0"). That is a minority case for our purposes --- in particular, it never happens when tzload() invokes tzparse() to interpret a transition date rule string found in a tzdata zone file. Hence, if we legislate that we're going to ignore leap-second data from "posixrules", we can postpone the TZDEFRULES load into the path where we actually need to substitute for a missing date rule string. That means it will never happen at all in common scenarios, making it reasonable to dynamically allocate the cache space when it does happen. Even when the data is already loaded, this saves some cycles in the common code path since we avoid a memcpy of 23KB or so. And, IMO at least, this is a less ugly hack on the IANA logic than what we had before, since it's not messing with the lastditch-vs-regular code paths. Back-patch to all supported branches, not so much because this is a critical change as that I want to keep all our copies of the IANA timezone code in sync. Discussion: https://postgr.es/m/20181015200754.7y7zfuzsoux2c4ya@alap3.anarazel.de
2018-10-17 18:26:48 +02:00
/* Unlike IANA, don't assume name is exactly "GMT" */
2004-04-30 06:44:06 +02:00
stdlen = strlen(name); /* length of standard zone name */
name += stdlen;
stdoffset = 0;
}
else
{
if (*name == '<')
{
name++;
stdname = name;
name = getqzname(name, '>');
if (*name != '>')
return false;
stdlen = name - stdname;
name++;
}
else
{
name = getzname(name);
stdlen = name - stdname;
}
if (*name == '\0') /* we allow empty STD abbrev, unlike IANA */
return false;
2004-04-30 06:44:06 +02:00
name = getoffset(name, &stdoffset);
if (name == NULL)
return false;
2004-04-30 06:44:06 +02:00
}
Improve tzparse's handling of TZDEFRULES ("posixrules") zone data. In the IANA timezone code, tzparse() always tries to load the zone file named by TZDEFRULES ("posixrules"). Previously, we'd hacked that logic to skip the load in the "lastditch" code path, which we use only to initialize the default "GMT" zone during GUC initialization. That's critical for a couple of reasons: since we do not support leap seconds, we *must not* allow "GMT" to have leap seconds, and since this case runs before the GUC subsystem is fully alive, we'd really rather not take the risk of pg_open_tzfile throwing any errors. However, that still left the code reading TZDEFRULES on every other call, something we'd noticed to the extent of having added code to cache the result so it was only done once per process not a lot of times. Andres Freund complained about the static data space used up for the cache; but as long as the logic was like this, there was no point in trying to get rid of that space. We can improve matters by looking a bit more closely at what the IANA code actually needs the TZDEFRULES data for. One thing it does is that if "posixrules" is a leap-second-aware zone, the leap-second behavior will be absorbed into every POSIX-style zone specification. However, that's a behavior we'd really prefer to do without, since for our purposes the end effect is to render every POSIX-style zone name unsupported. Otherwise, the TZDEFRULES data is used only if the POSIX zone name specifies DST but doesn't include a transition date rule (e.g., "EST5EDT" rather than "EST5EDT,M3.2.0,M11.1.0"). That is a minority case for our purposes --- in particular, it never happens when tzload() invokes tzparse() to interpret a transition date rule string found in a tzdata zone file. Hence, if we legislate that we're going to ignore leap-second data from "posixrules", we can postpone the TZDEFRULES load into the path where we actually need to substitute for a missing date rule string. That means it will never happen at all in common scenarios, making it reasonable to dynamically allocate the cache space when it does happen. Even when the data is already loaded, this saves some cycles in the common code path since we avoid a memcpy of 23KB or so. And, IMO at least, this is a less ugly hack on the IANA logic than what we had before, since it's not messing with the lastditch-vs-regular code paths. Back-patch to all supported branches, not so much because this is a critical change as that I want to keep all our copies of the IANA timezone code in sync. Discussion: https://postgr.es/m/20181015200754.7y7zfuzsoux2c4ya@alap3.anarazel.de
2018-10-17 18:26:48 +02:00
charcnt = stdlen + 1;
if (sizeof sp->chars < charcnt)
return false;
/*
Remove support for timezone "posixrules" file. The IANA tzcode library has a feature to read a time zone file named "posixrules" and apply the daylight-savings transition dates and times therein, when it is given a POSIX-style time zone specification that lacks an explicit transition rule. However, there's a problem with that code: it doesn't work for dates past the Y2038 time_t rollover. (Effectively, all times beyond that point are treated as standard time.) The IANA crew regard this feature as legacy, so their plan is to remove it not fix it. The time frame in which that will happen is unclear, but presumably it'll happen well before 2038. Moreover, effective with the next IANA data update (probably this fall), the recommended default will be to not install a "posixrules" file in the first place. The time frame in which tzdata packagers might adopt that suggestion is likewise unclear, but at least some platforms will probably do it in the next year or so. While we could ignore that recommendation so far as PG-supplied tzdata trees are concerned, builds using --with-system-tzdata will be subject to whatever the platform's tzdata packager decides to do. Thus, whether or not we do anything, some increasing fraction of Postgres users will be exposed to the behavior observed when there is no "posixrules" file; and if we do nothing, we'll have essentially no control over the timing of that change. The best thing to do to ameliorate the uncertainty seems to be to proactively remove the posixrules-reading feature. If we do that in a scheduled release then at least we can release-note the behavioral change, rather than having users be surprised by it after a routine tzdata update. The change in question is fairly minor anyway: to be affected, you have to be using a POSIX-style timezone spec, it has to not have an explicit rule, and it has to not be one of the four traditional continental-USA zone names (EST5EDT, CST6CDT, MST7MDT, or PST8PDT), as those are special-cased. Since the default "posixrules" file provides USA DST rules, the number of people who are likely to find such a zone spec useful is probably quite small. Moreover, the fallback behavior with no explicit rule and no "posixrules" file is to apply current USA rules, so the only thing that really breaks is the DST transitions in years before 2007 (and you get the countervailing fix that transitions after 2038 will be applied). Now, some installations might have replaced the "posixrules" file, allowing e.g. EU rules to be applied to a POSIX-style timezone spec. That won't work anymore. But it's not exactly clear why this solution would be preferable to using a regular named zone. In any case, given the Y2038 issue, we need to be pushing users to stop depending on this. Back-patch into v13; it hasn't been released yet, so it seems OK to change its behavior. (Personally I think we ought to back-patch further, but I've been outvoted.) Discussion: https://postgr.es/m/1390.1562258309@sss.pgh.pa.us Discussion: https://postgr.es/m/20200621211855.6211-1-eggert@cs.ucla.edu
2020-06-30 00:55:01 +02:00
* The IANA code always tries to tzload(TZDEFRULES) here. We do not want
* to do that; it would be bad news in the lastditch case, where we can't
* assume pg_open_tzfile() is sane yet. Moreover, if we did load it and
* it contains leap-second-dependent info, that would cause problems too.
* Finally, IANA has deprecated the TZDEFRULES feature, so it presumably
* will die at some point. Desupporting it now seems like good
* future-proofing.
Improve tzparse's handling of TZDEFRULES ("posixrules") zone data. In the IANA timezone code, tzparse() always tries to load the zone file named by TZDEFRULES ("posixrules"). Previously, we'd hacked that logic to skip the load in the "lastditch" code path, which we use only to initialize the default "GMT" zone during GUC initialization. That's critical for a couple of reasons: since we do not support leap seconds, we *must not* allow "GMT" to have leap seconds, and since this case runs before the GUC subsystem is fully alive, we'd really rather not take the risk of pg_open_tzfile throwing any errors. However, that still left the code reading TZDEFRULES on every other call, something we'd noticed to the extent of having added code to cache the result so it was only done once per process not a lot of times. Andres Freund complained about the static data space used up for the cache; but as long as the logic was like this, there was no point in trying to get rid of that space. We can improve matters by looking a bit more closely at what the IANA code actually needs the TZDEFRULES data for. One thing it does is that if "posixrules" is a leap-second-aware zone, the leap-second behavior will be absorbed into every POSIX-style zone specification. However, that's a behavior we'd really prefer to do without, since for our purposes the end effect is to render every POSIX-style zone name unsupported. Otherwise, the TZDEFRULES data is used only if the POSIX zone name specifies DST but doesn't include a transition date rule (e.g., "EST5EDT" rather than "EST5EDT,M3.2.0,M11.1.0"). That is a minority case for our purposes --- in particular, it never happens when tzload() invokes tzparse() to interpret a transition date rule string found in a tzdata zone file. Hence, if we legislate that we're going to ignore leap-second data from "posixrules", we can postpone the TZDEFRULES load into the path where we actually need to substitute for a missing date rule string. That means it will never happen at all in common scenarios, making it reasonable to dynamically allocate the cache space when it does happen. Even when the data is already loaded, this saves some cycles in the common code path since we avoid a memcpy of 23KB or so. And, IMO at least, this is a less ugly hack on the IANA logic than what we had before, since it's not messing with the lastditch-vs-regular code paths. Back-patch to all supported branches, not so much because this is a critical change as that I want to keep all our copies of the IANA timezone code in sync. Discussion: https://postgr.es/m/20181015200754.7y7zfuzsoux2c4ya@alap3.anarazel.de
2018-10-17 18:26:48 +02:00
*/
Remove support for timezone "posixrules" file. The IANA tzcode library has a feature to read a time zone file named "posixrules" and apply the daylight-savings transition dates and times therein, when it is given a POSIX-style time zone specification that lacks an explicit transition rule. However, there's a problem with that code: it doesn't work for dates past the Y2038 time_t rollover. (Effectively, all times beyond that point are treated as standard time.) The IANA crew regard this feature as legacy, so their plan is to remove it not fix it. The time frame in which that will happen is unclear, but presumably it'll happen well before 2038. Moreover, effective with the next IANA data update (probably this fall), the recommended default will be to not install a "posixrules" file in the first place. The time frame in which tzdata packagers might adopt that suggestion is likewise unclear, but at least some platforms will probably do it in the next year or so. While we could ignore that recommendation so far as PG-supplied tzdata trees are concerned, builds using --with-system-tzdata will be subject to whatever the platform's tzdata packager decides to do. Thus, whether or not we do anything, some increasing fraction of Postgres users will be exposed to the behavior observed when there is no "posixrules" file; and if we do nothing, we'll have essentially no control over the timing of that change. The best thing to do to ameliorate the uncertainty seems to be to proactively remove the posixrules-reading feature. If we do that in a scheduled release then at least we can release-note the behavioral change, rather than having users be surprised by it after a routine tzdata update. The change in question is fairly minor anyway: to be affected, you have to be using a POSIX-style timezone spec, it has to not have an explicit rule, and it has to not be one of the four traditional continental-USA zone names (EST5EDT, CST6CDT, MST7MDT, or PST8PDT), as those are special-cased. Since the default "posixrules" file provides USA DST rules, the number of people who are likely to find such a zone spec useful is probably quite small. Moreover, the fallback behavior with no explicit rule and no "posixrules" file is to apply current USA rules, so the only thing that really breaks is the DST transitions in years before 2007 (and you get the countervailing fix that transitions after 2038 will be applied). Now, some installations might have replaced the "posixrules" file, allowing e.g. EU rules to be applied to a POSIX-style timezone spec. That won't work anymore. But it's not exactly clear why this solution would be preferable to using a regular named zone. In any case, given the Y2038 issue, we need to be pushing users to stop depending on this. Back-patch into v13; it hasn't been released yet, so it seems OK to change its behavior. (Personally I think we ought to back-patch further, but I've been outvoted.) Discussion: https://postgr.es/m/1390.1562258309@sss.pgh.pa.us Discussion: https://postgr.es/m/20200621211855.6211-1-eggert@cs.ucla.edu
2020-06-30 00:55:01 +02:00
load_ok = false;
Improve tzparse's handling of TZDEFRULES ("posixrules") zone data. In the IANA timezone code, tzparse() always tries to load the zone file named by TZDEFRULES ("posixrules"). Previously, we'd hacked that logic to skip the load in the "lastditch" code path, which we use only to initialize the default "GMT" zone during GUC initialization. That's critical for a couple of reasons: since we do not support leap seconds, we *must not* allow "GMT" to have leap seconds, and since this case runs before the GUC subsystem is fully alive, we'd really rather not take the risk of pg_open_tzfile throwing any errors. However, that still left the code reading TZDEFRULES on every other call, something we'd noticed to the extent of having added code to cache the result so it was only done once per process not a lot of times. Andres Freund complained about the static data space used up for the cache; but as long as the logic was like this, there was no point in trying to get rid of that space. We can improve matters by looking a bit more closely at what the IANA code actually needs the TZDEFRULES data for. One thing it does is that if "posixrules" is a leap-second-aware zone, the leap-second behavior will be absorbed into every POSIX-style zone specification. However, that's a behavior we'd really prefer to do without, since for our purposes the end effect is to render every POSIX-style zone name unsupported. Otherwise, the TZDEFRULES data is used only if the POSIX zone name specifies DST but doesn't include a transition date rule (e.g., "EST5EDT" rather than "EST5EDT,M3.2.0,M11.1.0"). That is a minority case for our purposes --- in particular, it never happens when tzload() invokes tzparse() to interpret a transition date rule string found in a tzdata zone file. Hence, if we legislate that we're going to ignore leap-second data from "posixrules", we can postpone the TZDEFRULES load into the path where we actually need to substitute for a missing date rule string. That means it will never happen at all in common scenarios, making it reasonable to dynamically allocate the cache space when it does happen. Even when the data is already loaded, this saves some cycles in the common code path since we avoid a memcpy of 23KB or so. And, IMO at least, this is a less ugly hack on the IANA logic than what we had before, since it's not messing with the lastditch-vs-regular code paths. Back-patch to all supported branches, not so much because this is a critical change as that I want to keep all our copies of the IANA timezone code in sync. Discussion: https://postgr.es/m/20181015200754.7y7zfuzsoux2c4ya@alap3.anarazel.de
2018-10-17 18:26:48 +02:00
sp->goback = sp->goahead = false; /* simulate failed tzload() */
sp->leapcnt = 0; /* intentionally assume no leap seconds */
if (*name != '\0')
{
if (*name == '<')
{
dstname = ++name;
name = getqzname(name, '>');
if (*name != '>')
return false;
dstlen = name - dstname;
name++;
}
else
{
dstname = name;
name = getzname(name);
dstlen = name - dstname; /* length of DST abbr. */
}
if (!dstlen)
return false;
charcnt += dstlen + 1;
if (sizeof sp->chars < charcnt)
return false;
if (*name != '\0' && *name != ',' && *name != ';')
{
2004-04-30 06:44:06 +02:00
name = getoffset(name, &dstoffset);
if (name == NULL)
return false;
}
else
dstoffset = stdoffset - SECSPERHOUR;
Remove support for timezone "posixrules" file. The IANA tzcode library has a feature to read a time zone file named "posixrules" and apply the daylight-savings transition dates and times therein, when it is given a POSIX-style time zone specification that lacks an explicit transition rule. However, there's a problem with that code: it doesn't work for dates past the Y2038 time_t rollover. (Effectively, all times beyond that point are treated as standard time.) The IANA crew regard this feature as legacy, so their plan is to remove it not fix it. The time frame in which that will happen is unclear, but presumably it'll happen well before 2038. Moreover, effective with the next IANA data update (probably this fall), the recommended default will be to not install a "posixrules" file in the first place. The time frame in which tzdata packagers might adopt that suggestion is likewise unclear, but at least some platforms will probably do it in the next year or so. While we could ignore that recommendation so far as PG-supplied tzdata trees are concerned, builds using --with-system-tzdata will be subject to whatever the platform's tzdata packager decides to do. Thus, whether or not we do anything, some increasing fraction of Postgres users will be exposed to the behavior observed when there is no "posixrules" file; and if we do nothing, we'll have essentially no control over the timing of that change. The best thing to do to ameliorate the uncertainty seems to be to proactively remove the posixrules-reading feature. If we do that in a scheduled release then at least we can release-note the behavioral change, rather than having users be surprised by it after a routine tzdata update. The change in question is fairly minor anyway: to be affected, you have to be using a POSIX-style timezone spec, it has to not have an explicit rule, and it has to not be one of the four traditional continental-USA zone names (EST5EDT, CST6CDT, MST7MDT, or PST8PDT), as those are special-cased. Since the default "posixrules" file provides USA DST rules, the number of people who are likely to find such a zone spec useful is probably quite small. Moreover, the fallback behavior with no explicit rule and no "posixrules" file is to apply current USA rules, so the only thing that really breaks is the DST transitions in years before 2007 (and you get the countervailing fix that transitions after 2038 will be applied). Now, some installations might have replaced the "posixrules" file, allowing e.g. EU rules to be applied to a POSIX-style timezone spec. That won't work anymore. But it's not exactly clear why this solution would be preferable to using a regular named zone. In any case, given the Y2038 issue, we need to be pushing users to stop depending on this. Back-patch into v13; it hasn't been released yet, so it seems OK to change its behavior. (Personally I think we ought to back-patch further, but I've been outvoted.) Discussion: https://postgr.es/m/1390.1562258309@sss.pgh.pa.us Discussion: https://postgr.es/m/20200621211855.6211-1-eggert@cs.ucla.edu
2020-06-30 00:55:01 +02:00
if (*name == '\0' && !load_ok)
name = TZDEFRULESTRING;
if (*name == ',' || *name == ';')
{
struct rule start;
struct rule end;
int year;
int yearlim;
int timecnt;
pg_time_t janfirst;
int32 janoffset = 0;
int yearbeg;
2004-04-30 06:44:06 +02:00
++name;
if ((name = getrule(name, &start)) == NULL)
return false;
2004-04-30 06:44:06 +02:00
if (*name++ != ',')
return false;
2004-04-30 06:44:06 +02:00
if ((name = getrule(name, &end)) == NULL)
return false;
2004-04-30 06:44:06 +02:00
if (*name != '\0')
return false;
2004-04-30 06:44:06 +02:00
sp->typecnt = 2; /* standard time and DST */
2004-04-30 06:44:06 +02:00
/*
* Two transitions per year, from EPOCH_YEAR forward.
*/
init_ttinfo(&sp->ttis[0], -stdoffset, false, 0);
init_ttinfo(&sp->ttis[1], -dstoffset, true, stdlen + 1);
sp->defaulttype = 0;
timecnt = 0;
2004-04-30 06:44:06 +02:00
janfirst = 0;
yearbeg = EPOCH_YEAR;
do
{
int32 yearsecs
= year_lengths[isleap(yearbeg - 1)] * SECSPERDAY;
yearbeg--;
if (increment_overflow_time(&janfirst, -yearsecs))
{
janoffset = -yearsecs;
break;
}
} while (EPOCH_YEAR - YEARSPERREPEAT / 2 < yearbeg);
yearlim = yearbeg + YEARSPERREPEAT + 1;
for (year = yearbeg; year < yearlim; year++)
{
int32
starttime = transtime(year, &start, stdoffset),
endtime = transtime(year, &end, dstoffset);
int32
yearsecs = (year_lengths[isleap(year)]
* SECSPERDAY);
bool reversed = endtime < starttime;
if (reversed)
{
int32 swap = starttime;
starttime = endtime;
endtime = swap;
}
if (reversed
|| (starttime < endtime
&& (endtime - starttime
< (yearsecs
+ (stdoffset - dstoffset)))))
{
if (TZ_MAX_TIMES - 2 < timecnt)
break;
sp->ats[timecnt] = janfirst;
if (!increment_overflow_time
(&sp->ats[timecnt],
janoffset + starttime))
sp->types[timecnt++] = !reversed;
sp->ats[timecnt] = janfirst;
if (!increment_overflow_time
(&sp->ats[timecnt],
janoffset + endtime))
{
sp->types[timecnt++] = reversed;
yearlim = year + YEARSPERREPEAT + 1;
}
2004-04-30 06:44:06 +02:00
}
if (increment_overflow_time
(&janfirst, janoffset + yearsecs))
break;
janoffset = 0;
2004-04-30 06:44:06 +02:00
}
sp->timecnt = timecnt;
if (!timecnt)
{
sp->ttis[0] = sp->ttis[1];
sp->typecnt = 1; /* Perpetual DST. */
}
else if (YEARSPERREPEAT < year - yearbeg)
sp->goback = sp->goahead = true;
}
else
{
int32 theirstdoffset;
int32 theirdstoffset;
int32 theiroffset;
bool isdst;
int i;
int j;
2004-04-30 06:44:06 +02:00
if (*name != '\0')
return false;
2004-04-30 06:44:06 +02:00
/*
* Initial values of theirstdoffset and theirdstoffset.
*/
2004-04-30 06:44:06 +02:00
theirstdoffset = 0;
for (i = 0; i < sp->timecnt; ++i)
{
2004-04-30 06:44:06 +02:00
j = sp->types[i];
if (!sp->ttis[j].tt_isdst)
{
2004-04-30 06:44:06 +02:00
theirstdoffset =
-sp->ttis[j].tt_utoff;
2004-04-30 06:44:06 +02:00
break;
}
}
theirdstoffset = 0;
for (i = 0; i < sp->timecnt; ++i)
{
2004-04-30 06:44:06 +02:00
j = sp->types[i];
if (sp->ttis[j].tt_isdst)
{
2004-04-30 06:44:06 +02:00
theirdstoffset =
-sp->ttis[j].tt_utoff;
2004-04-30 06:44:06 +02:00
break;
}
}
2004-04-30 06:44:06 +02:00
/*
* Initially we're assumed to be in standard time.
*/
isdst = false;
2004-04-30 06:44:06 +02:00
theiroffset = theirstdoffset;
2004-04-30 06:44:06 +02:00
/*
* Now juggle transition times and types tracking offsets as you
* do.
*/
for (i = 0; i < sp->timecnt; ++i)
{
2004-04-30 06:44:06 +02:00
j = sp->types[i];
sp->types[i] = sp->ttis[j].tt_isdst;
if (sp->ttis[j].tt_ttisut)
{
2004-04-30 06:44:06 +02:00
/* No adjustment to transition time */
}
else
{
2004-04-30 06:44:06 +02:00
/*
* If daylight saving time is in effect, and the
* transition time was not specified as standard time, add
* the daylight saving time offset to the transition time;
* otherwise, add the standard time offset to the
* transition time.
*/
2004-04-30 06:44:06 +02:00
/*
* Transitions from DST to DDST will effectively disappear
* since POSIX provides for only one DST offset.
*/
if (isdst && !sp->ttis[j].tt_ttisstd)
{
2004-04-30 06:44:06 +02:00
sp->ats[i] += dstoffset -
theirdstoffset;
}
else
{
2004-04-30 06:44:06 +02:00
sp->ats[i] += stdoffset -
theirstdoffset;
}
}
theiroffset = -sp->ttis[j].tt_utoff;
2004-04-30 06:44:06 +02:00
if (sp->ttis[j].tt_isdst)
theirdstoffset = theiroffset;
else
theirstdoffset = theiroffset;
2004-04-30 06:44:06 +02:00
}
2004-04-30 06:44:06 +02:00
/*
* Finally, fill in ttis.
*/
init_ttinfo(&sp->ttis[0], -stdoffset, false, 0);
init_ttinfo(&sp->ttis[1], -dstoffset, true, stdlen + 1);
2004-04-30 06:44:06 +02:00
sp->typecnt = 2;
sp->defaulttype = 0;
2004-04-30 06:44:06 +02:00
}
}
else
{
2004-04-30 06:44:06 +02:00
dstlen = 0;
sp->typecnt = 1; /* only standard time */
sp->timecnt = 0;
init_ttinfo(&sp->ttis[0], -stdoffset, false, 0);
sp->defaulttype = 0;
2004-04-30 06:44:06 +02:00
}
sp->charcnt = charcnt;
2004-04-30 06:44:06 +02:00
cp = sp->chars;
memcpy(cp, stdname, stdlen);
2004-04-30 06:44:06 +02:00
cp += stdlen;
*cp++ = '\0';
if (dstlen != 0)
{
memcpy(cp, dstname, dstlen);
2004-04-30 06:44:06 +02:00
*(cp + dstlen) = '\0';
}
return true;
2004-04-30 06:44:06 +02:00
}
static void
gmtload(struct state *const sp)
2004-04-30 06:44:06 +02:00
{
if (tzload(gmt, NULL, sp, true) != 0)
tzparse(gmt, sp, true);
2004-04-30 06:44:06 +02:00
}
/*
* The easy way to behave "as if no library function calls" localtime
* is to not call it, so we drop its guts into "localsub", which can be
* freely called. (And no, the PANS doesn't require the above behavior,
* but it *is* desirable.)
*/
static struct pg_tm *
localsub(struct state const *sp, pg_time_t const *timep,
struct pg_tm *const tmp)
2004-04-30 06:44:06 +02:00
{
const struct ttinfo *ttisp;
int i;
struct pg_tm *result;
const pg_time_t t = *timep;
2004-04-30 06:44:06 +02:00
if (sp == NULL)
return gmtsub(timep, 0, tmp);
if ((sp->goback && t < sp->ats[0]) ||
(sp->goahead && t > sp->ats[sp->timecnt - 1]))
{
pg_time_t newt = t;
pg_time_t seconds;
pg_time_t years;
if (t < sp->ats[0])
seconds = sp->ats[0] - t;
else
seconds = t - sp->ats[sp->timecnt - 1];
--seconds;
years = (seconds / SECSPERREPEAT + 1) * YEARSPERREPEAT;
seconds = years * AVGSECSPERYEAR;
if (t < sp->ats[0])
newt += seconds;
else
newt -= seconds;
if (newt < sp->ats[0] ||
newt > sp->ats[sp->timecnt - 1])
return NULL; /* "cannot happen" */
result = localsub(sp, &newt, tmp);
if (result)
{
int64 newy;
newy = result->tm_year;
if (t < sp->ats[0])
newy -= years;
else
newy += years;
if (!(INT_MIN <= newy && newy <= INT_MAX))
return NULL;
result->tm_year = newy;
}
return result;
}
if (sp->timecnt == 0 || t < sp->ats[0])
{
i = sp->defaulttype;
}
else
{
int lo = 1;
int hi = sp->timecnt;
while (lo < hi)
{
int mid = (lo + hi) >> 1;
if (t < sp->ats[mid])
hi = mid;
else
lo = mid + 1;
}
i = (int) sp->types[lo - 1];
2004-04-30 06:44:06 +02:00
}
ttisp = &sp->ttis[i];
/*
* To get (wrong) behavior that's compatible with System V Release 2.0
* you'd replace the statement below with t += ttisp->tt_utoff;
* timesub(&t, 0L, sp, tmp);
*/
result = timesub(&t, ttisp->tt_utoff, sp, tmp);
if (result)
{
result->tm_isdst = ttisp->tt_isdst;
result->tm_zone = unconstify(char *, &sp->chars[ttisp->tt_desigidx]);
}
return result;
2004-04-30 06:44:06 +02:00
}
struct pg_tm *
pg_localtime(const pg_time_t *timep, const pg_tz *tz)
2004-04-30 06:44:06 +02:00
{
return localsub(&tz->state, timep, &tm);
2004-04-30 06:44:06 +02:00
}
/*
* gmtsub is to gmtime as localsub is to localtime.
*
* Except we have a private "struct state" for GMT, so no sp is passed in.
*/
static struct pg_tm *
gmtsub(pg_time_t const *timep, int32 offset,
struct pg_tm *tmp)
2004-04-30 06:44:06 +02:00
{
struct pg_tm *result;
/* GMT timezone state data is kept here */
static struct state *gmtptr = NULL;
if (gmtptr == NULL)
{
/* Allocate on first use */
gmtptr = (struct state *) malloc(sizeof(struct state));
if (gmtptr == NULL)
return NULL; /* errno should be set by malloc */
gmtload(gmtptr);
2004-04-30 06:44:06 +02:00
}
result = timesub(timep, offset, gmtptr, tmp);
2004-04-30 06:44:06 +02:00
/*
* Could get fancy here and deliver something such as "+xx" or "-xx" if
* offset is non-zero, but this is no time for a treasure hunt.
*/
2004-04-30 06:44:06 +02:00
if (offset != 0)
tmp->tm_zone = wildabbr;
else
tmp->tm_zone = gmtptr->chars;
return result;
2004-04-30 06:44:06 +02:00
}
struct pg_tm *
pg_gmtime(const pg_time_t *timep)
2004-04-30 06:44:06 +02:00
{
return gmtsub(timep, 0, &tm);
2004-04-30 06:44:06 +02:00
}
/*
* Return the number of leap years through the end of the given year
* where, to make the math easy, the answer for year zero is defined as zero.
*/
static int
leaps_thru_end_of_nonneg(int y)
{
return y / 4 - y / 100 + y / 400;
}
static int
leaps_thru_end_of(const int y)
{
return (y < 0
? -1 - leaps_thru_end_of_nonneg(-1 - y)
: leaps_thru_end_of_nonneg(y));
}
2004-04-30 06:44:06 +02:00
static struct pg_tm *
timesub(const pg_time_t *timep, int32 offset,
const struct state *sp, struct pg_tm *tmp)
2004-04-30 06:44:06 +02:00
{
const struct lsinfo *lp;
pg_time_t tdays;
int idays; /* unsigned would be so 2003 */
int64 rem;
int y;
const int *ip;
int64 corr;
bool hit;
int i;
2004-04-30 06:44:06 +02:00
corr = 0;
hit = false;
i = (sp == NULL) ? 0 : sp->leapcnt;
while (--i >= 0)
{
2004-04-30 06:44:06 +02:00
lp = &sp->lsis[i];
if (*timep >= lp->ls_trans)
{
2004-04-30 06:44:06 +02:00
corr = lp->ls_corr;
hit = (*timep == lp->ls_trans
&& (i == 0 ? 0 : lp[-1].ls_corr) < corr);
2004-04-30 06:44:06 +02:00
break;
}
}
y = EPOCH_YEAR;
tdays = *timep / SECSPERDAY;
rem = *timep % SECSPERDAY;
while (tdays < 0 || tdays >= year_lengths[isleap(y)])
{
int newy;
pg_time_t tdelta;
int idelta;
int leapdays;
tdelta = tdays / DAYSPERLYEAR;
if (!((!TYPE_SIGNED(pg_time_t) || INT_MIN <= tdelta)
&& tdelta <= INT_MAX))
goto out_of_range;
idelta = tdelta;
if (idelta == 0)
idelta = (tdays < 0) ? -1 : 1;
newy = y;
if (increment_overflow(&newy, idelta))
goto out_of_range;
leapdays = leaps_thru_end_of(newy - 1) -
leaps_thru_end_of(y - 1);
tdays -= ((pg_time_t) newy - y) * DAYSPERNYEAR;
tdays -= leapdays;
y = newy;
}
/*
* Given the range, we can now fearlessly cast...
*/
idays = tdays;
rem += offset - corr;
while (rem < 0)
{
2004-04-30 06:44:06 +02:00
rem += SECSPERDAY;
--idays;
2004-04-30 06:44:06 +02:00
}
while (rem >= SECSPERDAY)
{
2004-04-30 06:44:06 +02:00
rem -= SECSPERDAY;
++idays;
2004-04-30 06:44:06 +02:00
}
while (idays < 0)
{
if (increment_overflow(&y, -1))
goto out_of_range;
idays += year_lengths[isleap(y)];
}
while (idays >= year_lengths[isleap(y)])
{
idays -= year_lengths[isleap(y)];
if (increment_overflow(&y, 1))
goto out_of_range;
}
tmp->tm_year = y;
if (increment_overflow(&tmp->tm_year, -TM_YEAR_BASE))
goto out_of_range;
tmp->tm_yday = idays;
2004-04-30 06:44:06 +02:00
/*
* The "extra" mods below avoid overflow problems.
*/
tmp->tm_wday = EPOCH_WDAY +
((y - EPOCH_YEAR) % DAYSPERWEEK) *
(DAYSPERNYEAR % DAYSPERWEEK) +
leaps_thru_end_of(y - 1) -
leaps_thru_end_of(EPOCH_YEAR - 1) +
idays;
tmp->tm_wday %= DAYSPERWEEK;
2004-04-30 06:44:06 +02:00
if (tmp->tm_wday < 0)
tmp->tm_wday += DAYSPERWEEK;
tmp->tm_hour = (int) (rem / SECSPERHOUR);
rem %= SECSPERHOUR;
tmp->tm_min = (int) (rem / SECSPERMIN);
2004-08-29 07:07:03 +02:00
/*
* A positive leap second requires a special representation. This uses
* "... ??:59:60" et seq.
*/
tmp->tm_sec = (int) (rem % SECSPERMIN) + hit;
ip = mon_lengths[isleap(y)];
for (tmp->tm_mon = 0; idays >= ip[tmp->tm_mon]; ++(tmp->tm_mon))
idays -= ip[tmp->tm_mon];
tmp->tm_mday = (int) (idays + 1);
2004-04-30 06:44:06 +02:00
tmp->tm_isdst = 0;
tmp->tm_gmtoff = offset;
return tmp;
out_of_range:
errno = EOVERFLOW;
return NULL;
}
/*
* Normalize logic courtesy Paul Eggert.
*/
static bool
increment_overflow(int *ip, int j)
{
int const i = *ip;
/*----------
* If i >= 0 there can only be overflow if i + j > INT_MAX
* or if j > INT_MAX - i; given i >= 0, INT_MAX - i cannot overflow.
* If i < 0 there can only be overflow if i + j < INT_MIN
* or if j < INT_MIN - i; given i < 0, INT_MIN - i cannot overflow.
*----------
*/
if ((i >= 0) ? (j > INT_MAX - i) : (j < INT_MIN - i))
return true;
*ip += j;
return false;
}
static bool
increment_overflow_time(pg_time_t *tp, int32 j)
{
/*----------
* This is like
* 'if (! (TIME_T_MIN <= *tp + j && *tp + j <= TIME_T_MAX)) ...',
* except that it does the right thing even if *tp + j would overflow.
*----------
*/
if (!(j < 0
? (TYPE_SIGNED(pg_time_t) ? TIME_T_MIN - j <= *tp : -1 - j < *tp)
: *tp <= TIME_T_MAX - j))
return true;
*tp += j;
return false;
2004-04-30 06:44:06 +02:00
}
static int64
leapcorr(struct state const *sp, pg_time_t t)
{
struct lsinfo const *lp;
int i;
i = sp->leapcnt;
while (--i >= 0)
{
lp = &sp->lsis[i];
if (t >= lp->ls_trans)
return lp->ls_corr;
}
return 0;
}
/*
Support timezone abbreviations that sometimes change. Up to now, PG has assumed that any given timezone abbreviation (such as "EDT") represents a constant GMT offset in the usage of any particular region; we had a way to configure what that offset was, but not for it to be changeable over time. But, as with most things horological, this view of the world is too simplistic: there are numerous regions that have at one time or another switched to a different GMT offset but kept using the same timezone abbreviation. Almost the entire Russian Federation did that a few years ago, and later this month they're going to do it again. And there are similar examples all over the world. To cope with this, invent the notion of a "dynamic timezone abbreviation", which is one that is referenced to a particular underlying timezone (as defined in the IANA timezone database) and means whatever it currently means in that zone. For zones that use or have used daylight-savings time, the standard and DST abbreviations continue to have the property that you can specify standard or DST time and get that time offset whether or not DST was theoretically in effect at the time. However, the abbreviations mean what they meant at the time in question (or most recently before that time) rather than being absolutely fixed. The standard abbreviation-list files have been changed to use this behavior for abbreviations that have actually varied in meaning since 1970. The old simple-numeric definitions are kept for abbreviations that have not changed, since they are a bit faster to resolve. While this is clearly a new feature, it seems necessary to back-patch it into all active branches, because otherwise use of Russian zone abbreviations is going to become even more problematic than it already was. This change supersedes the changes in commit 513d06ded et al to modify the fixed meanings of the Russian abbreviations; since we've not shipped that yet, this will avoid an undesirably incompatible (not to mention incorrect) change in behavior for timestamps between 2011 and 2014. This patch makes some cosmetic changes in ecpglib to keep its usage of datetime lookup tables as similar as possible to the backend code, but doesn't do anything about the increasingly obsolete set of timezone abbreviation definitions that are hard-wired into ecpglib. Whatever we do about that will likely not be appropriate material for back-patching. Also, a potential free() of a garbage pointer after an out-of-memory failure in ecpglib has been fixed. This patch also fixes pre-existing bugs in DetermineTimeZoneOffset() that caused it to produce unexpected results near a timezone transition, if both the "before" and "after" states are marked as standard time. We'd only ever thought about or tested transitions between standard and DST time, but that's not what's happening when a zone simply redefines their base GMT offset. In passing, update the SGML documentation to refer to the Olson/zoneinfo/ zic timezone database as the "IANA" database, since it's now being maintained under the auspices of IANA.
2014-10-16 21:22:10 +02:00
* Find the next DST transition time in the given zone after the given time
*
Support timezone abbreviations that sometimes change. Up to now, PG has assumed that any given timezone abbreviation (such as "EDT") represents a constant GMT offset in the usage of any particular region; we had a way to configure what that offset was, but not for it to be changeable over time. But, as with most things horological, this view of the world is too simplistic: there are numerous regions that have at one time or another switched to a different GMT offset but kept using the same timezone abbreviation. Almost the entire Russian Federation did that a few years ago, and later this month they're going to do it again. And there are similar examples all over the world. To cope with this, invent the notion of a "dynamic timezone abbreviation", which is one that is referenced to a particular underlying timezone (as defined in the IANA timezone database) and means whatever it currently means in that zone. For zones that use or have used daylight-savings time, the standard and DST abbreviations continue to have the property that you can specify standard or DST time and get that time offset whether or not DST was theoretically in effect at the time. However, the abbreviations mean what they meant at the time in question (or most recently before that time) rather than being absolutely fixed. The standard abbreviation-list files have been changed to use this behavior for abbreviations that have actually varied in meaning since 1970. The old simple-numeric definitions are kept for abbreviations that have not changed, since they are a bit faster to resolve. While this is clearly a new feature, it seems necessary to back-patch it into all active branches, because otherwise use of Russian zone abbreviations is going to become even more problematic than it already was. This change supersedes the changes in commit 513d06ded et al to modify the fixed meanings of the Russian abbreviations; since we've not shipped that yet, this will avoid an undesirably incompatible (not to mention incorrect) change in behavior for timestamps between 2011 and 2014. This patch makes some cosmetic changes in ecpglib to keep its usage of datetime lookup tables as similar as possible to the backend code, but doesn't do anything about the increasingly obsolete set of timezone abbreviation definitions that are hard-wired into ecpglib. Whatever we do about that will likely not be appropriate material for back-patching. Also, a potential free() of a garbage pointer after an out-of-memory failure in ecpglib has been fixed. This patch also fixes pre-existing bugs in DetermineTimeZoneOffset() that caused it to produce unexpected results near a timezone transition, if both the "before" and "after" states are marked as standard time. We'd only ever thought about or tested transitions between standard and DST time, but that's not what's happening when a zone simply redefines their base GMT offset. In passing, update the SGML documentation to refer to the Olson/zoneinfo/ zic timezone database as the "IANA" database, since it's now being maintained under the auspices of IANA.
2014-10-16 21:22:10 +02:00
* *timep and *tz are input arguments, the other parameters are output values.
*
* When the function result is 1, *boundary is set to the pg_time_t
* representation of the next DST transition time after *timep,
* *before_gmtoff and *before_isdst are set to the GMT offset and isdst
* state prevailing just before that boundary (in particular, the state
* prevailing at *timep), and *after_gmtoff and *after_isdst are set to
* the state prevailing just after that boundary.
*
* When the function result is 0, there is no known DST transition
* after *timep, but *before_gmtoff and *before_isdst indicate the GMT
* offset and isdst state prevailing at *timep. (This would occur in
* DST-less time zones, or if a zone has permanently ceased using DST.)
*
* A function result of -1 indicates failure (this case does not actually
* occur in our current implementation).
*/
int
pg_next_dst_boundary(const pg_time_t *timep,
long int *before_gmtoff,
int *before_isdst,
pg_time_t *boundary,
long int *after_gmtoff,
int *after_isdst,
const pg_tz *tz)
{
const struct state *sp;
const struct ttinfo *ttisp;
int i;
int j;
const pg_time_t t = *timep;
sp = &tz->state;
if (sp->timecnt == 0)
{
/* non-DST zone, use lowest-numbered standard type */
i = 0;
while (sp->ttis[i].tt_isdst)
if (++i >= sp->typecnt)
{
i = 0;
break;
}
ttisp = &sp->ttis[i];
*before_gmtoff = ttisp->tt_utoff;
*before_isdst = ttisp->tt_isdst;
return 0;
}
if ((sp->goback && t < sp->ats[0]) ||
(sp->goahead && t > sp->ats[sp->timecnt - 1]))
{
/* For values outside the transition table, extrapolate */
pg_time_t newt = t;
pg_time_t seconds;
pg_time_t tcycles;
int64 icycles;
int result;
if (t < sp->ats[0])
seconds = sp->ats[0] - t;
else
seconds = t - sp->ats[sp->timecnt - 1];
--seconds;
tcycles = seconds / YEARSPERREPEAT / AVGSECSPERYEAR;
++tcycles;
icycles = tcycles;
if (tcycles - icycles >= 1 || icycles - tcycles >= 1)
return -1;
seconds = icycles;
seconds *= YEARSPERREPEAT;
seconds *= AVGSECSPERYEAR;
if (t < sp->ats[0])
newt += seconds;
else
newt -= seconds;
if (newt < sp->ats[0] ||
newt > sp->ats[sp->timecnt - 1])
return -1; /* "cannot happen" */
result = pg_next_dst_boundary(&newt, before_gmtoff,
before_isdst,
boundary,
after_gmtoff,
after_isdst,
tz);
if (t < sp->ats[0])
*boundary -= seconds;
else
*boundary += seconds;
return result;
}
if (t >= sp->ats[sp->timecnt - 1])
{
/* No known transition > t, so use last known segment's type */
i = sp->types[sp->timecnt - 1];
ttisp = &sp->ttis[i];
*before_gmtoff = ttisp->tt_utoff;
*before_isdst = ttisp->tt_isdst;
return 0;
}
if (t < sp->ats[0])
{
/* For "before", use lowest-numbered standard type */
i = 0;
while (sp->ttis[i].tt_isdst)
if (++i >= sp->typecnt)
{
i = 0;
break;
}
ttisp = &sp->ttis[i];
*before_gmtoff = ttisp->tt_utoff;
*before_isdst = ttisp->tt_isdst;
*boundary = sp->ats[0];
/* And for "after", use the first segment's type */
i = sp->types[0];
ttisp = &sp->ttis[i];
*after_gmtoff = ttisp->tt_utoff;
*after_isdst = ttisp->tt_isdst;
return 1;
}
/* Else search to find the boundary following t */
{
int lo = 1;
int hi = sp->timecnt - 1;
while (lo < hi)
{
int mid = (lo + hi) >> 1;
if (t < sp->ats[mid])
hi = mid;
else
lo = mid + 1;
}
i = lo;
}
j = sp->types[i - 1];
ttisp = &sp->ttis[j];
*before_gmtoff = ttisp->tt_utoff;
*before_isdst = ttisp->tt_isdst;
*boundary = sp->ats[i];
j = sp->types[i];
ttisp = &sp->ttis[j];
*after_gmtoff = ttisp->tt_utoff;
*after_isdst = ttisp->tt_isdst;
return 1;
}
2004-04-30 06:44:06 +02:00
Support timezone abbreviations that sometimes change. Up to now, PG has assumed that any given timezone abbreviation (such as "EDT") represents a constant GMT offset in the usage of any particular region; we had a way to configure what that offset was, but not for it to be changeable over time. But, as with most things horological, this view of the world is too simplistic: there are numerous regions that have at one time or another switched to a different GMT offset but kept using the same timezone abbreviation. Almost the entire Russian Federation did that a few years ago, and later this month they're going to do it again. And there are similar examples all over the world. To cope with this, invent the notion of a "dynamic timezone abbreviation", which is one that is referenced to a particular underlying timezone (as defined in the IANA timezone database) and means whatever it currently means in that zone. For zones that use or have used daylight-savings time, the standard and DST abbreviations continue to have the property that you can specify standard or DST time and get that time offset whether or not DST was theoretically in effect at the time. However, the abbreviations mean what they meant at the time in question (or most recently before that time) rather than being absolutely fixed. The standard abbreviation-list files have been changed to use this behavior for abbreviations that have actually varied in meaning since 1970. The old simple-numeric definitions are kept for abbreviations that have not changed, since they are a bit faster to resolve. While this is clearly a new feature, it seems necessary to back-patch it into all active branches, because otherwise use of Russian zone abbreviations is going to become even more problematic than it already was. This change supersedes the changes in commit 513d06ded et al to modify the fixed meanings of the Russian abbreviations; since we've not shipped that yet, this will avoid an undesirably incompatible (not to mention incorrect) change in behavior for timestamps between 2011 and 2014. This patch makes some cosmetic changes in ecpglib to keep its usage of datetime lookup tables as similar as possible to the backend code, but doesn't do anything about the increasingly obsolete set of timezone abbreviation definitions that are hard-wired into ecpglib. Whatever we do about that will likely not be appropriate material for back-patching. Also, a potential free() of a garbage pointer after an out-of-memory failure in ecpglib has been fixed. This patch also fixes pre-existing bugs in DetermineTimeZoneOffset() that caused it to produce unexpected results near a timezone transition, if both the "before" and "after" states are marked as standard time. We'd only ever thought about or tested transitions between standard and DST time, but that's not what's happening when a zone simply redefines their base GMT offset. In passing, update the SGML documentation to refer to the Olson/zoneinfo/ zic timezone database as the "IANA" database, since it's now being maintained under the auspices of IANA.
2014-10-16 21:22:10 +02:00
/*
* Identify a timezone abbreviation's meaning in the given zone
*
* Determine the GMT offset and DST flag associated with the abbreviation.
* This is generally used only when the abbreviation has actually changed
* meaning over time; therefore, we also take a UTC cutoff time, and return
* the meaning in use at or most recently before that time, or the meaning
* in first use after that time if the abbrev was never used before that.
*
* On success, returns true and sets *gmtoff and *isdst. If the abbreviation
* was never used at all in this zone, returns false.
Support timezone abbreviations that sometimes change. Up to now, PG has assumed that any given timezone abbreviation (such as "EDT") represents a constant GMT offset in the usage of any particular region; we had a way to configure what that offset was, but not for it to be changeable over time. But, as with most things horological, this view of the world is too simplistic: there are numerous regions that have at one time or another switched to a different GMT offset but kept using the same timezone abbreviation. Almost the entire Russian Federation did that a few years ago, and later this month they're going to do it again. And there are similar examples all over the world. To cope with this, invent the notion of a "dynamic timezone abbreviation", which is one that is referenced to a particular underlying timezone (as defined in the IANA timezone database) and means whatever it currently means in that zone. For zones that use or have used daylight-savings time, the standard and DST abbreviations continue to have the property that you can specify standard or DST time and get that time offset whether or not DST was theoretically in effect at the time. However, the abbreviations mean what they meant at the time in question (or most recently before that time) rather than being absolutely fixed. The standard abbreviation-list files have been changed to use this behavior for abbreviations that have actually varied in meaning since 1970. The old simple-numeric definitions are kept for abbreviations that have not changed, since they are a bit faster to resolve. While this is clearly a new feature, it seems necessary to back-patch it into all active branches, because otherwise use of Russian zone abbreviations is going to become even more problematic than it already was. This change supersedes the changes in commit 513d06ded et al to modify the fixed meanings of the Russian abbreviations; since we've not shipped that yet, this will avoid an undesirably incompatible (not to mention incorrect) change in behavior for timestamps between 2011 and 2014. This patch makes some cosmetic changes in ecpglib to keep its usage of datetime lookup tables as similar as possible to the backend code, but doesn't do anything about the increasingly obsolete set of timezone abbreviation definitions that are hard-wired into ecpglib. Whatever we do about that will likely not be appropriate material for back-patching. Also, a potential free() of a garbage pointer after an out-of-memory failure in ecpglib has been fixed. This patch also fixes pre-existing bugs in DetermineTimeZoneOffset() that caused it to produce unexpected results near a timezone transition, if both the "before" and "after" states are marked as standard time. We'd only ever thought about or tested transitions between standard and DST time, but that's not what's happening when a zone simply redefines their base GMT offset. In passing, update the SGML documentation to refer to the Olson/zoneinfo/ zic timezone database as the "IANA" database, since it's now being maintained under the auspices of IANA.
2014-10-16 21:22:10 +02:00
*
* Note: abbrev is matched case-sensitively; it should be all-upper-case.
*/
bool
pg_interpret_timezone_abbrev(const char *abbrev,
const pg_time_t *timep,
long int *gmtoff,
int *isdst,
const pg_tz *tz)
{
const struct state *sp;
const char *abbrs;
const struct ttinfo *ttisp;
int abbrind;
int cutoff;
int i;
const pg_time_t t = *timep;
sp = &tz->state;
/*
* Locate the abbreviation in the zone's abbreviation list. We assume
* there are not duplicates in the list.
*/
abbrs = sp->chars;
abbrind = 0;
while (abbrind < sp->charcnt)
{
if (strcmp(abbrev, abbrs + abbrind) == 0)
break;
while (abbrs[abbrind] != '\0')
abbrind++;
abbrind++;
}
if (abbrind >= sp->charcnt)
return false; /* not there! */
Support timezone abbreviations that sometimes change. Up to now, PG has assumed that any given timezone abbreviation (such as "EDT") represents a constant GMT offset in the usage of any particular region; we had a way to configure what that offset was, but not for it to be changeable over time. But, as with most things horological, this view of the world is too simplistic: there are numerous regions that have at one time or another switched to a different GMT offset but kept using the same timezone abbreviation. Almost the entire Russian Federation did that a few years ago, and later this month they're going to do it again. And there are similar examples all over the world. To cope with this, invent the notion of a "dynamic timezone abbreviation", which is one that is referenced to a particular underlying timezone (as defined in the IANA timezone database) and means whatever it currently means in that zone. For zones that use or have used daylight-savings time, the standard and DST abbreviations continue to have the property that you can specify standard or DST time and get that time offset whether or not DST was theoretically in effect at the time. However, the abbreviations mean what they meant at the time in question (or most recently before that time) rather than being absolutely fixed. The standard abbreviation-list files have been changed to use this behavior for abbreviations that have actually varied in meaning since 1970. The old simple-numeric definitions are kept for abbreviations that have not changed, since they are a bit faster to resolve. While this is clearly a new feature, it seems necessary to back-patch it into all active branches, because otherwise use of Russian zone abbreviations is going to become even more problematic than it already was. This change supersedes the changes in commit 513d06ded et al to modify the fixed meanings of the Russian abbreviations; since we've not shipped that yet, this will avoid an undesirably incompatible (not to mention incorrect) change in behavior for timestamps between 2011 and 2014. This patch makes some cosmetic changes in ecpglib to keep its usage of datetime lookup tables as similar as possible to the backend code, but doesn't do anything about the increasingly obsolete set of timezone abbreviation definitions that are hard-wired into ecpglib. Whatever we do about that will likely not be appropriate material for back-patching. Also, a potential free() of a garbage pointer after an out-of-memory failure in ecpglib has been fixed. This patch also fixes pre-existing bugs in DetermineTimeZoneOffset() that caused it to produce unexpected results near a timezone transition, if both the "before" and "after" states are marked as standard time. We'd only ever thought about or tested transitions between standard and DST time, but that's not what's happening when a zone simply redefines their base GMT offset. In passing, update the SGML documentation to refer to the Olson/zoneinfo/ zic timezone database as the "IANA" database, since it's now being maintained under the auspices of IANA.
2014-10-16 21:22:10 +02:00
/*
* Unlike pg_next_dst_boundary, we needn't sweat about extrapolation
* (goback/goahead zones). Finding the newest or oldest meaning of the
* abbreviation should get us what we want, since extrapolation would just
* be repeating the newest or oldest meanings.
*
* Use binary search to locate the first transition > cutoff time.
*/
{
int lo = 0;
int hi = sp->timecnt;
while (lo < hi)
{
int mid = (lo + hi) >> 1;
if (t < sp->ats[mid])
hi = mid;
else
lo = mid + 1;
}
cutoff = lo;
}
/*
* Scan backwards to find the latest interval using the given abbrev
* before the cutoff time.
*/
for (i = cutoff - 1; i >= 0; i--)
{
ttisp = &sp->ttis[sp->types[i]];
if (ttisp->tt_desigidx == abbrind)
Support timezone abbreviations that sometimes change. Up to now, PG has assumed that any given timezone abbreviation (such as "EDT") represents a constant GMT offset in the usage of any particular region; we had a way to configure what that offset was, but not for it to be changeable over time. But, as with most things horological, this view of the world is too simplistic: there are numerous regions that have at one time or another switched to a different GMT offset but kept using the same timezone abbreviation. Almost the entire Russian Federation did that a few years ago, and later this month they're going to do it again. And there are similar examples all over the world. To cope with this, invent the notion of a "dynamic timezone abbreviation", which is one that is referenced to a particular underlying timezone (as defined in the IANA timezone database) and means whatever it currently means in that zone. For zones that use or have used daylight-savings time, the standard and DST abbreviations continue to have the property that you can specify standard or DST time and get that time offset whether or not DST was theoretically in effect at the time. However, the abbreviations mean what they meant at the time in question (or most recently before that time) rather than being absolutely fixed. The standard abbreviation-list files have been changed to use this behavior for abbreviations that have actually varied in meaning since 1970. The old simple-numeric definitions are kept for abbreviations that have not changed, since they are a bit faster to resolve. While this is clearly a new feature, it seems necessary to back-patch it into all active branches, because otherwise use of Russian zone abbreviations is going to become even more problematic than it already was. This change supersedes the changes in commit 513d06ded et al to modify the fixed meanings of the Russian abbreviations; since we've not shipped that yet, this will avoid an undesirably incompatible (not to mention incorrect) change in behavior for timestamps between 2011 and 2014. This patch makes some cosmetic changes in ecpglib to keep its usage of datetime lookup tables as similar as possible to the backend code, but doesn't do anything about the increasingly obsolete set of timezone abbreviation definitions that are hard-wired into ecpglib. Whatever we do about that will likely not be appropriate material for back-patching. Also, a potential free() of a garbage pointer after an out-of-memory failure in ecpglib has been fixed. This patch also fixes pre-existing bugs in DetermineTimeZoneOffset() that caused it to produce unexpected results near a timezone transition, if both the "before" and "after" states are marked as standard time. We'd only ever thought about or tested transitions between standard and DST time, but that's not what's happening when a zone simply redefines their base GMT offset. In passing, update the SGML documentation to refer to the Olson/zoneinfo/ zic timezone database as the "IANA" database, since it's now being maintained under the auspices of IANA.
2014-10-16 21:22:10 +02:00
{
*gmtoff = ttisp->tt_utoff;
Support timezone abbreviations that sometimes change. Up to now, PG has assumed that any given timezone abbreviation (such as "EDT") represents a constant GMT offset in the usage of any particular region; we had a way to configure what that offset was, but not for it to be changeable over time. But, as with most things horological, this view of the world is too simplistic: there are numerous regions that have at one time or another switched to a different GMT offset but kept using the same timezone abbreviation. Almost the entire Russian Federation did that a few years ago, and later this month they're going to do it again. And there are similar examples all over the world. To cope with this, invent the notion of a "dynamic timezone abbreviation", which is one that is referenced to a particular underlying timezone (as defined in the IANA timezone database) and means whatever it currently means in that zone. For zones that use or have used daylight-savings time, the standard and DST abbreviations continue to have the property that you can specify standard or DST time and get that time offset whether or not DST was theoretically in effect at the time. However, the abbreviations mean what they meant at the time in question (or most recently before that time) rather than being absolutely fixed. The standard abbreviation-list files have been changed to use this behavior for abbreviations that have actually varied in meaning since 1970. The old simple-numeric definitions are kept for abbreviations that have not changed, since they are a bit faster to resolve. While this is clearly a new feature, it seems necessary to back-patch it into all active branches, because otherwise use of Russian zone abbreviations is going to become even more problematic than it already was. This change supersedes the changes in commit 513d06ded et al to modify the fixed meanings of the Russian abbreviations; since we've not shipped that yet, this will avoid an undesirably incompatible (not to mention incorrect) change in behavior for timestamps between 2011 and 2014. This patch makes some cosmetic changes in ecpglib to keep its usage of datetime lookup tables as similar as possible to the backend code, but doesn't do anything about the increasingly obsolete set of timezone abbreviation definitions that are hard-wired into ecpglib. Whatever we do about that will likely not be appropriate material for back-patching. Also, a potential free() of a garbage pointer after an out-of-memory failure in ecpglib has been fixed. This patch also fixes pre-existing bugs in DetermineTimeZoneOffset() that caused it to produce unexpected results near a timezone transition, if both the "before" and "after" states are marked as standard time. We'd only ever thought about or tested transitions between standard and DST time, but that's not what's happening when a zone simply redefines their base GMT offset. In passing, update the SGML documentation to refer to the Olson/zoneinfo/ zic timezone database as the "IANA" database, since it's now being maintained under the auspices of IANA.
2014-10-16 21:22:10 +02:00
*isdst = ttisp->tt_isdst;
return true;
Support timezone abbreviations that sometimes change. Up to now, PG has assumed that any given timezone abbreviation (such as "EDT") represents a constant GMT offset in the usage of any particular region; we had a way to configure what that offset was, but not for it to be changeable over time. But, as with most things horological, this view of the world is too simplistic: there are numerous regions that have at one time or another switched to a different GMT offset but kept using the same timezone abbreviation. Almost the entire Russian Federation did that a few years ago, and later this month they're going to do it again. And there are similar examples all over the world. To cope with this, invent the notion of a "dynamic timezone abbreviation", which is one that is referenced to a particular underlying timezone (as defined in the IANA timezone database) and means whatever it currently means in that zone. For zones that use or have used daylight-savings time, the standard and DST abbreviations continue to have the property that you can specify standard or DST time and get that time offset whether or not DST was theoretically in effect at the time. However, the abbreviations mean what they meant at the time in question (or most recently before that time) rather than being absolutely fixed. The standard abbreviation-list files have been changed to use this behavior for abbreviations that have actually varied in meaning since 1970. The old simple-numeric definitions are kept for abbreviations that have not changed, since they are a bit faster to resolve. While this is clearly a new feature, it seems necessary to back-patch it into all active branches, because otherwise use of Russian zone abbreviations is going to become even more problematic than it already was. This change supersedes the changes in commit 513d06ded et al to modify the fixed meanings of the Russian abbreviations; since we've not shipped that yet, this will avoid an undesirably incompatible (not to mention incorrect) change in behavior for timestamps between 2011 and 2014. This patch makes some cosmetic changes in ecpglib to keep its usage of datetime lookup tables as similar as possible to the backend code, but doesn't do anything about the increasingly obsolete set of timezone abbreviation definitions that are hard-wired into ecpglib. Whatever we do about that will likely not be appropriate material for back-patching. Also, a potential free() of a garbage pointer after an out-of-memory failure in ecpglib has been fixed. This patch also fixes pre-existing bugs in DetermineTimeZoneOffset() that caused it to produce unexpected results near a timezone transition, if both the "before" and "after" states are marked as standard time. We'd only ever thought about or tested transitions between standard and DST time, but that's not what's happening when a zone simply redefines their base GMT offset. In passing, update the SGML documentation to refer to the Olson/zoneinfo/ zic timezone database as the "IANA" database, since it's now being maintained under the auspices of IANA.
2014-10-16 21:22:10 +02:00
}
}
/*
* Not there, so scan forwards to find the first one after.
*/
for (i = cutoff; i < sp->timecnt; i++)
{
ttisp = &sp->ttis[sp->types[i]];
if (ttisp->tt_desigidx == abbrind)
Support timezone abbreviations that sometimes change. Up to now, PG has assumed that any given timezone abbreviation (such as "EDT") represents a constant GMT offset in the usage of any particular region; we had a way to configure what that offset was, but not for it to be changeable over time. But, as with most things horological, this view of the world is too simplistic: there are numerous regions that have at one time or another switched to a different GMT offset but kept using the same timezone abbreviation. Almost the entire Russian Federation did that a few years ago, and later this month they're going to do it again. And there are similar examples all over the world. To cope with this, invent the notion of a "dynamic timezone abbreviation", which is one that is referenced to a particular underlying timezone (as defined in the IANA timezone database) and means whatever it currently means in that zone. For zones that use or have used daylight-savings time, the standard and DST abbreviations continue to have the property that you can specify standard or DST time and get that time offset whether or not DST was theoretically in effect at the time. However, the abbreviations mean what they meant at the time in question (or most recently before that time) rather than being absolutely fixed. The standard abbreviation-list files have been changed to use this behavior for abbreviations that have actually varied in meaning since 1970. The old simple-numeric definitions are kept for abbreviations that have not changed, since they are a bit faster to resolve. While this is clearly a new feature, it seems necessary to back-patch it into all active branches, because otherwise use of Russian zone abbreviations is going to become even more problematic than it already was. This change supersedes the changes in commit 513d06ded et al to modify the fixed meanings of the Russian abbreviations; since we've not shipped that yet, this will avoid an undesirably incompatible (not to mention incorrect) change in behavior for timestamps between 2011 and 2014. This patch makes some cosmetic changes in ecpglib to keep its usage of datetime lookup tables as similar as possible to the backend code, but doesn't do anything about the increasingly obsolete set of timezone abbreviation definitions that are hard-wired into ecpglib. Whatever we do about that will likely not be appropriate material for back-patching. Also, a potential free() of a garbage pointer after an out-of-memory failure in ecpglib has been fixed. This patch also fixes pre-existing bugs in DetermineTimeZoneOffset() that caused it to produce unexpected results near a timezone transition, if both the "before" and "after" states are marked as standard time. We'd only ever thought about or tested transitions between standard and DST time, but that's not what's happening when a zone simply redefines their base GMT offset. In passing, update the SGML documentation to refer to the Olson/zoneinfo/ zic timezone database as the "IANA" database, since it's now being maintained under the auspices of IANA.
2014-10-16 21:22:10 +02:00
{
*gmtoff = ttisp->tt_utoff;
Support timezone abbreviations that sometimes change. Up to now, PG has assumed that any given timezone abbreviation (such as "EDT") represents a constant GMT offset in the usage of any particular region; we had a way to configure what that offset was, but not for it to be changeable over time. But, as with most things horological, this view of the world is too simplistic: there are numerous regions that have at one time or another switched to a different GMT offset but kept using the same timezone abbreviation. Almost the entire Russian Federation did that a few years ago, and later this month they're going to do it again. And there are similar examples all over the world. To cope with this, invent the notion of a "dynamic timezone abbreviation", which is one that is referenced to a particular underlying timezone (as defined in the IANA timezone database) and means whatever it currently means in that zone. For zones that use or have used daylight-savings time, the standard and DST abbreviations continue to have the property that you can specify standard or DST time and get that time offset whether or not DST was theoretically in effect at the time. However, the abbreviations mean what they meant at the time in question (or most recently before that time) rather than being absolutely fixed. The standard abbreviation-list files have been changed to use this behavior for abbreviations that have actually varied in meaning since 1970. The old simple-numeric definitions are kept for abbreviations that have not changed, since they are a bit faster to resolve. While this is clearly a new feature, it seems necessary to back-patch it into all active branches, because otherwise use of Russian zone abbreviations is going to become even more problematic than it already was. This change supersedes the changes in commit 513d06ded et al to modify the fixed meanings of the Russian abbreviations; since we've not shipped that yet, this will avoid an undesirably incompatible (not to mention incorrect) change in behavior for timestamps between 2011 and 2014. This patch makes some cosmetic changes in ecpglib to keep its usage of datetime lookup tables as similar as possible to the backend code, but doesn't do anything about the increasingly obsolete set of timezone abbreviation definitions that are hard-wired into ecpglib. Whatever we do about that will likely not be appropriate material for back-patching. Also, a potential free() of a garbage pointer after an out-of-memory failure in ecpglib has been fixed. This patch also fixes pre-existing bugs in DetermineTimeZoneOffset() that caused it to produce unexpected results near a timezone transition, if both the "before" and "after" states are marked as standard time. We'd only ever thought about or tested transitions between standard and DST time, but that's not what's happening when a zone simply redefines their base GMT offset. In passing, update the SGML documentation to refer to the Olson/zoneinfo/ zic timezone database as the "IANA" database, since it's now being maintained under the auspices of IANA.
2014-10-16 21:22:10 +02:00
*isdst = ttisp->tt_isdst;
return true;
Support timezone abbreviations that sometimes change. Up to now, PG has assumed that any given timezone abbreviation (such as "EDT") represents a constant GMT offset in the usage of any particular region; we had a way to configure what that offset was, but not for it to be changeable over time. But, as with most things horological, this view of the world is too simplistic: there are numerous regions that have at one time or another switched to a different GMT offset but kept using the same timezone abbreviation. Almost the entire Russian Federation did that a few years ago, and later this month they're going to do it again. And there are similar examples all over the world. To cope with this, invent the notion of a "dynamic timezone abbreviation", which is one that is referenced to a particular underlying timezone (as defined in the IANA timezone database) and means whatever it currently means in that zone. For zones that use or have used daylight-savings time, the standard and DST abbreviations continue to have the property that you can specify standard or DST time and get that time offset whether or not DST was theoretically in effect at the time. However, the abbreviations mean what they meant at the time in question (or most recently before that time) rather than being absolutely fixed. The standard abbreviation-list files have been changed to use this behavior for abbreviations that have actually varied in meaning since 1970. The old simple-numeric definitions are kept for abbreviations that have not changed, since they are a bit faster to resolve. While this is clearly a new feature, it seems necessary to back-patch it into all active branches, because otherwise use of Russian zone abbreviations is going to become even more problematic than it already was. This change supersedes the changes in commit 513d06ded et al to modify the fixed meanings of the Russian abbreviations; since we've not shipped that yet, this will avoid an undesirably incompatible (not to mention incorrect) change in behavior for timestamps between 2011 and 2014. This patch makes some cosmetic changes in ecpglib to keep its usage of datetime lookup tables as similar as possible to the backend code, but doesn't do anything about the increasingly obsolete set of timezone abbreviation definitions that are hard-wired into ecpglib. Whatever we do about that will likely not be appropriate material for back-patching. Also, a potential free() of a garbage pointer after an out-of-memory failure in ecpglib has been fixed. This patch also fixes pre-existing bugs in DetermineTimeZoneOffset() that caused it to produce unexpected results near a timezone transition, if both the "before" and "after" states are marked as standard time. We'd only ever thought about or tested transitions between standard and DST time, but that's not what's happening when a zone simply redefines their base GMT offset. In passing, update the SGML documentation to refer to the Olson/zoneinfo/ zic timezone database as the "IANA" database, since it's now being maintained under the auspices of IANA.
2014-10-16 21:22:10 +02:00
}
}
return false; /* hm, not actually used in any interval? */
Support timezone abbreviations that sometimes change. Up to now, PG has assumed that any given timezone abbreviation (such as "EDT") represents a constant GMT offset in the usage of any particular region; we had a way to configure what that offset was, but not for it to be changeable over time. But, as with most things horological, this view of the world is too simplistic: there are numerous regions that have at one time or another switched to a different GMT offset but kept using the same timezone abbreviation. Almost the entire Russian Federation did that a few years ago, and later this month they're going to do it again. And there are similar examples all over the world. To cope with this, invent the notion of a "dynamic timezone abbreviation", which is one that is referenced to a particular underlying timezone (as defined in the IANA timezone database) and means whatever it currently means in that zone. For zones that use or have used daylight-savings time, the standard and DST abbreviations continue to have the property that you can specify standard or DST time and get that time offset whether or not DST was theoretically in effect at the time. However, the abbreviations mean what they meant at the time in question (or most recently before that time) rather than being absolutely fixed. The standard abbreviation-list files have been changed to use this behavior for abbreviations that have actually varied in meaning since 1970. The old simple-numeric definitions are kept for abbreviations that have not changed, since they are a bit faster to resolve. While this is clearly a new feature, it seems necessary to back-patch it into all active branches, because otherwise use of Russian zone abbreviations is going to become even more problematic than it already was. This change supersedes the changes in commit 513d06ded et al to modify the fixed meanings of the Russian abbreviations; since we've not shipped that yet, this will avoid an undesirably incompatible (not to mention incorrect) change in behavior for timestamps between 2011 and 2014. This patch makes some cosmetic changes in ecpglib to keep its usage of datetime lookup tables as similar as possible to the backend code, but doesn't do anything about the increasingly obsolete set of timezone abbreviation definitions that are hard-wired into ecpglib. Whatever we do about that will likely not be appropriate material for back-patching. Also, a potential free() of a garbage pointer after an out-of-memory failure in ecpglib has been fixed. This patch also fixes pre-existing bugs in DetermineTimeZoneOffset() that caused it to produce unexpected results near a timezone transition, if both the "before" and "after" states are marked as standard time. We'd only ever thought about or tested transitions between standard and DST time, but that's not what's happening when a zone simply redefines their base GMT offset. In passing, update the SGML documentation to refer to the Olson/zoneinfo/ zic timezone database as the "IANA" database, since it's now being maintained under the auspices of IANA.
2014-10-16 21:22:10 +02:00
}
/*
* If the given timezone uses only one GMT offset, store that offset
* into *gmtoff and return true, else return false.
*/
bool
pg_get_timezone_offset(const pg_tz *tz, long int *gmtoff)
{
/*
* The zone could have more than one ttinfo, if it's historically used
* more than one abbreviation. We return true as long as they all have
* the same gmtoff.
*/
const struct state *sp;
int i;
sp = &tz->state;
for (i = 1; i < sp->typecnt; i++)
{
if (sp->ttis[i].tt_utoff != sp->ttis[0].tt_utoff)
return false;
}
*gmtoff = sp->ttis[0].tt_utoff;
return true;
}
2004-04-30 06:44:06 +02:00
/*
* Return the name of the current timezone
*/
const char *
pg_get_timezone_name(pg_tz *tz)
{
if (tz)
return tz->TZname;
return NULL;
2004-04-30 06:44:06 +02:00
}
/*
* Check whether timezone is acceptable.
*
* What we are doing here is checking for leap-second-aware timekeeping.
* We need to reject such TZ settings because they'll wreak havoc with our
* date/time arithmetic.
*/
bool
pg_tz_acceptable(pg_tz *tz)
{
struct pg_tm *tt;
pg_time_t time2000;
/*
* To detect leap-second timekeeping, run pg_localtime for what should be
* GMT midnight, 2000-01-01. Insist that the tm_sec value be zero; any
* other result has to be due to leap seconds.
*/
time2000 = (POSTGRES_EPOCH_JDATE - UNIX_EPOCH_JDATE) * SECS_PER_DAY;
tt = pg_localtime(&time2000, tz);
if (!tt || tt->tm_sec != 0)
return false;
return true;
}