src/timezone/README

This is a PostgreSQL adapted version of the IANA timezone library from

    https://www.iana.org/time-zones

The latest version of the timezone data and library source code is
available right from that page. It's best to get the merged file
tzdb-NNNNX.tar.lz, since the other archive formats omit tzdata.zi.
Historical versions, as well as release announcements, can be found
elsewhere on the site.

Since time zone rules change frequently in some parts of the world,
we should endeavor to update the data files before each PostgreSQL
release. The code need not be updated as often, but we must track
changes that might affect interpretation of the data files.


Time Zone data
==============

We distribute the time zone source data as-is under src/timezone/data/.
Currently, we distribute just the abbreviated single-file format
"tzdata.zi", to reduce the size of our tarballs as well as churn
in our git repo. Feeding that file to zic produces the same compiled
output as feeding the bulkier individual data files would do.

While data/tzdata.zi can just be duplicated when updating, manual effort
is needed to update the time zone abbreviation lists under tznames/.
These need to be changed whenever new abbreviations are invented or the
UTC offset associated with an existing abbreviation changes. To detect
if this has happened, after installing new files under data/ do
    make abbrevs.txt
which will produce a file showing all abbreviations that are in current
use according to the data/ files. Compare this to known_abbrevs.txt,
which is the list that existed last time the tznames/ files were updated.
Update tznames/ as seems appropriate, then replace known_abbrevs.txt
in the same commit. Usually, if a known abbreviation has changed meaning,
the appropriate fix is to make it refer to a long-form zone name instead
of a fixed GMT offset.

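The comparison step above can be sketched as a small shell function.
This is only an illustration, not something that exists in the tree: the
function name check_abbrevs is hypothetical, and it assumes that
abbrevs.txt has already been generated by "make abbrevs.txt" and that
both files list abbreviations in the same order.

```shell
#!/bin/sh
# Hypothetical helper (not part of the PostgreSQL tree): show what changed
# between the committed abbreviation list and a freshly generated one.
# Usage: check_abbrevs known_abbrevs.txt abbrevs.txt
check_abbrevs() {
    # diff exits 0 if the files match, 1 if they differ, >1 on error.
    diff -u "$1" "$2"
    status=$?
    case $status in
        0) echo "no abbreviation changes" ;;
        1) echo "abbreviations changed; review tznames/" >&2 ;;
        *) echo "diff failed" >&2 ;;
    esac
    return $status
}
```

After "make abbrevs.txt", one might run
"check_abbrevs known_abbrevs.txt abbrevs.txt" in this directory. Lines
marked "+" in the output are abbreviations or offsets that are new in the
data files, and lines marked "-" are ones no longer in use.
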
The core regression test suite does some simple validation of the zone
data and abbreviations data (notably by checking that the pg_timezone_names
and pg_timezone_abbrevs views don't throw errors). It's worth running it
as a cross-check on proposed updates.

When there has been a new release of Windows (probably including Service
Packs), the list of matching timezones needs to be updated. Run the
script src/tools/win32tzlist.pl on a Windows machine running this new
release and apply any new timezones that it detects. Never remove any
mappings, even if they have been removed in Windows, since we still need
to match properly on older versions.


Time Zone code
==============

The code in this directory is currently synced with tzcode release 2020d.
There are many cosmetic (and not so cosmetic) differences from the
original tzcode library, but diffs in the upstream version should usually
be propagated to our version. Here are some notes about that.

For the most part we want to use the upstream code as-is, but there are
several considerations preventing an exact match:

* For readability/maintainability we reformat the code to match our own
conventions; this includes pgindent'ing it and getting rid of upstream's
overuse of "register" declarations. (It used to include conversion of
old-style function declarations to C89 style, but thank goodness they
fixed that.)

* We need the code to follow Postgres' portability conventions; this
includes relying on configure's results rather than hand-hacked
#defines (see private.h in particular).

* Similarly, avoid relying on <stdint.h> features that may not exist on old
systems. In particular this means using Postgres' definitions of the int32
and int64 typedefs, not int_fast32_t/int_fast64_t. Likewise we use
PG_INT32_MIN/MAX not INT32_MIN/MAX. (Once we desupport all PG versions
that don't require C99, it'd be practical to rely on <stdint.h> and remove
this set of diffs; but that day is not yet.)

* Since Postgres is typically built on a system that has its own copy
of the <time.h> functions, we must avoid conflicting with those. This
mandates renaming typedef time_t to pg_time_t, and similarly for most
other exposed names.

* zic.c's typedef "lineno" is renamed to "lineno_t", because having
"lineno" in our typedefs list would cause unfortunate pgindent behavior
in some other files where we have variables named that.

* We have exposed the tzload() and tzparse() internal functions, and
slightly modified the API of the former, in part because it now relies
on our own pg_open_tzfile() rather than opening files for itself.

* tzparse() is adjusted to never try to load the TZDEFRULES zone.

* There's a fair amount of code we don't need and have removed,
including all the nonstandard optional APIs. We have also added
a few functions of our own at the bottom of localtime.c.

* In zic.c, we have added support for a -P (print_abbrevs) switch, which
is used to create the "abbrevs.txt" summary of currently-in-use zone
abbreviations that was described above.


The most convenient way to compare a new tzcode release to our code is
to first run the tzcode source files through a sed filter like this:

    sed -r \
        -e 's/^([ \t]*)\*\*([ \t])/\1 *\2/' \
        -e 's/^([ \t]*)\*\*$/\1 */' \
        -e 's|^\*/| */|' \
        -e 's/\bregister[ \t]//g' \
        -e 's/\bATTRIBUTE_PURE[ \t]//g' \
        -e 's/int_fast32_t/int32/g' \
        -e 's/int_fast64_t/int64/g' \
        -e 's/intmax_t/int64/g' \
        -e 's/INT32_MIN/PG_INT32_MIN/g' \
        -e 's/INT32_MAX/PG_INT32_MAX/g' \
        -e 's/INTMAX_MIN/PG_INT64_MIN/g' \
        -e 's/INTMAX_MAX/PG_INT64_MAX/g' \
        -e 's/struct[ \t]+tm\b/struct pg_tm/g' \
        -e 's/\btime_t\b/pg_time_t/g' \
        -e 's/lineno/lineno_t/g' \

and then run them through pgindent. (The first three sed patterns deal
with conversion of their block comment style to something pgindent
won't make a hash of; the remainder address other points noted above.)
After that, the files can be diff'd directly against our corresponding
files. Also, it's typically helpful to diff against the previous tzcode
release (after processing that the same way), and then try to apply the
diff to our files. This will take care of most of the changes
mechanically.
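
The workflow just described can be sketched as a few shell functions.
This is a rough illustration under stated assumptions, not a script that
exists in the tree: the function names tz_filter, prep_release and
release_diff are hypothetical, GNU sed is assumed (for -r and \b), and
the pgindent step is only marked by a comment because it depends on the
local setup.

```shell
#!/bin/sh
# Sketch only: all function names here are hypothetical.

# The sed filter from this README, reading stdin and writing stdout.
# Assumes GNU sed.
tz_filter() {
    sed -r \
        -e 's/^([ \t]*)\*\*([ \t])/\1 *\2/' \
        -e 's/^([ \t]*)\*\*$/\1 */' \
        -e 's|^\*/| */|' \
        -e 's/\bregister[ \t]//g' \
        -e 's/\bATTRIBUTE_PURE[ \t]//g' \
        -e 's/int_fast32_t/int32/g' \
        -e 's/int_fast64_t/int64/g' \
        -e 's/intmax_t/int64/g' \
        -e 's/INT32_MIN/PG_INT32_MIN/g' \
        -e 's/INT32_MAX/PG_INT32_MAX/g' \
        -e 's/INTMAX_MIN/PG_INT64_MIN/g' \
        -e 's/INTMAX_MAX/PG_INT64_MAX/g' \
        -e 's/struct[ \t]+tm\b/struct pg_tm/g' \
        -e 's/\btime_t\b/pg_time_t/g' \
        -e 's/lineno/lineno_t/g'
}

# Filter every .c and .h file of one unpacked tzcode release into a
# scratch directory.
# Usage: prep_release SRCDIR DSTDIR
prep_release() {
    src=$1
    dst=$2
    mkdir -p "$dst" || return 1
    for f in "$src"/*.c "$src"/*.h; do
        [ -f "$f" ] || continue
        tz_filter < "$f" > "$dst/$(basename "$f")" || return 1
    done
    # pgindent would be run on the files in "$dst" at this point.
}

# Write a unified diff between two prepared releases to stdout.
# Usage: release_diff OLDDIR NEWDIR > tzcode.patch
release_diff() {
    diff -u -r "$1" "$2"
    # diff exits 1 when the trees differ, which is the expected case;
    # only a status above 1 indicates trouble.
    [ $? -le 1 ]
}
```

One would run prep_release once on the previous tzcode release and once
on the new one, then release_diff on the two scratch directories. The
resulting patch can then be tried against our files with patch(1); the
right -p level depends on how the scratch directories were named, and
rejected hunks still need to be resolved by hand.
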