mirror of
https://git.postgresql.org/git/postgresql.git
synced 2024-10-06 16:37:05 +02:00
022fd99668
input routines. Remove the former "DecodePosixTimezone" function in favor of letting the zic code handle POSIX-style zone specs (see tzparse()). In particular this means that "PST+3" now means the same as "-03", whereas it used to mean "-11" --- the zone abbreviation is effectively just a noise word in this syntax. Make sure that all named and POSIX-style zone names will be parsed as a single token. Fix long-standing bogosities in printing and input of fractional-hour timezone offsets (since the tzparse() code will accept these, we'd better make 'em work). Also correct an error in the original coding of the zic-zone-name patch: in "timestamp without time zone" input, zone names are supposed to be allowed but ignored, but the coding was such that the zone changed the interpretation anyway.
579 lines
17 KiB
Plaintext
579 lines
17 KiB
Plaintext
<!-- $PostgreSQL: pgsql/doc/src/sgml/datetime.sgml,v 2.55 2006/10/17 21:03:20 tgl Exp $ -->
|
|
|
|
<appendix id="datetime-appendix">
|
|
<title>Date/Time Support</title>
|
|
|
|
<para>
|
|
<productname>PostgreSQL</productname> uses an internal heuristic
|
|
parser for all date/time input support. Dates and times are input as
|
|
strings, and are broken up into distinct fields with a preliminary
|
|
determination of what kind of information may be in the
|
|
field. Each field is interpreted and either assigned a numeric
|
|
value, ignored, or rejected.
|
|
The parser contains internal lookup tables for all textual fields,
|
|
including months, days of the week, and time zones.
|
|
</para>
|
|
|
|
<para>
|
|
This appendix includes information on the content of these
|
|
lookup tables and describes the steps used by the parser to decode
|
|
dates and times.
|
|
</para>
|
|
|
|
<sect1>
|
|
<title>Date/Time Input Interpretation</title>
|
|
|
|
<para>
|
|
The date/time type inputs are all decoded using the following procedure.
|
|
</para>
|
|
|
|
<procedure>
|
|
<step>
|
|
<para>
|
|
Break the input string into tokens and categorize each token as
|
|
a string, time, time zone, or number.
|
|
</para>
|
|
|
|
<substeps>
|
|
<step>
|
|
<para>
|
|
If the numeric token contains a colon (<literal>:</>), this is
|
|
a time string. Include all subsequent digits and colons.
|
|
</para>
|
|
</step>
|
|
|
|
<step>
|
|
<para>
|
|
If the numeric token contains a dash (<literal>-</>), slash
|
|
(<literal>/</>), or two or more dots (<literal>.</>), this is
|
|
a date string which may have a text month. If a date token has
|
|
already been seen, it is instead interpreted as a time zone
|
|
name (e.g., <literal>America/New_York</>).
|
|
</para>
|
|
</step>
|
|
|
|
<step>
|
|
<para>
|
|
If the token is numeric only, then it is either a single field
|
|
or an ISO 8601 concatenated date (e.g.,
|
|
<literal>19990113</literal> for January 13, 1999) or time
|
|
(e.g., <literal>141516</literal> for 14:15:16).
|
|
</para>
|
|
</step>
|
|
|
|
<step>
|
|
<para>
|
|
If the token starts with a plus (<literal>+</>) or minus
|
|
(<literal>-</>), then it is either a numeric time zone or a special
|
|
field.
|
|
</para>
|
|
</step>
|
|
</substeps>
|
|
</step>
|
|
|
|
<step>
|
|
<para>
|
|
If the token is a text string, match up with possible strings:
|
|
</para>
|
|
|
|
<substeps>
|
|
<step>
|
|
<para>
|
|
Do a binary-search table lookup for the token as a time zone
|
|
abbreviation.
|
|
</para>
|
|
</step>
|
|
|
|
<step>
|
|
<para>
|
|
If not found, do a similar binary-search table lookup to match
|
|
the token as either a special string (e.g., <literal>today</literal>),
|
|
day (e.g., <literal>Thursday</literal>),
|
|
month (e.g., <literal>January</literal>),
|
|
or noise word (e.g., <literal>at</literal>, <literal>on</literal>).
|
|
</para>
|
|
</step>
|
|
|
|
<step>
|
|
<para>
|
|
If still not found, throw an error.
|
|
</para>
|
|
</step>
|
|
</substeps>
|
|
</step>
|
|
|
|
<step>
|
|
<para>
|
|
When the token is a number or number field:
|
|
</para>
|
|
|
|
<substeps>
|
|
<step>
|
|
<para>
|
|
If there are eight or six digits,
|
|
and if no other date fields have been previously read, then interpret
|
|
as a <quote>concatenated date</quote> (e.g.,
|
|
<literal>19990118</literal> or <literal>990118</literal>).
|
|
The interpretation is <literal>YYYYMMDD</> or <literal>YYMMDD</>.
|
|
</para>
|
|
</step>
|
|
|
|
<step>
|
|
<para>
|
|
If the token is three digits
|
|
and a year has already been read, then interpret as day of year.
|
|
</para>
|
|
</step>
|
|
|
|
<step>
|
|
<para>
|
|
If four or six digits and a year has already been read, then
|
|
interpret as a time (<literal>HHMM</> or <literal>HHMMSS</>).
|
|
</para>
|
|
</step>
|
|
|
|
<step>
|
|
<para>
|
|
If three or more digits and no date fields have yet been found,
|
|
interpret as a year (this forces yy-mm-dd ordering of the remaining
|
|
date fields).
|
|
</para>
|
|
</step>
|
|
|
|
<step>
|
|
<para>
|
|
Otherwise the date field ordering is assumed to follow the
|
|
<varname>DateStyle</> setting: mm-dd-yy, dd-mm-yy, or yy-mm-dd.
|
|
Throw an error if a month or day field is found to be out of range.
|
|
</para>
|
|
</step>
|
|
</substeps>
|
|
</step>
|
|
|
|
<step>
|
|
<para>
|
|
If BC has been specified, negate the year and add one for
|
|
internal storage. (There is no year zero in the Gregorian
|
|
calendar, so numerically 1 BC becomes year zero.)
|
|
</para>
|
|
</step>
|
|
|
|
<step>
|
|
<para>
|
|
If BC was not specified, and if the year field was two digits in length,
|
|
then adjust the year to four digits. If the field is less than 70, then
|
|
add 2000, otherwise add 1900.
|
|
|
|
<tip>
|
|
<para>
|
|
Gregorian years AD 1-99 may be entered by using 4 digits with leading
|
|
zeros (e.g., <literal>0099</> is AD 99).
|
|
</para>
|
|
</tip>
|
|
</para>
|
|
</step>
|
|
</procedure>
|
|
</sect1>
|
|
|
|
|
|
<sect1 id="datetime-keywords">
|
|
<title>Date/Time Key Words</title>
|
|
|
|
<para>
|
|
<xref linkend="datetime-month-table"> shows the tokens that are
|
|
recognized as names of months.
|
|
</para>
|
|
|
|
<table id="datetime-month-table">
|
|
<title>Month Names</title>
|
|
<tgroup cols="2">
|
|
<thead>
|
|
<row>
|
|
<entry>Month</entry>
|
|
<entry>Abbreviations</entry>
|
|
</row>
|
|
</thead>
|
|
<tbody>
|
|
<row>
|
|
<entry>January</entry>
|
|
<entry>Jan</entry>
|
|
</row>
|
|
<row>
|
|
<entry>February</entry>
|
|
<entry>Feb</entry>
|
|
</row>
|
|
<row>
|
|
<entry>March</entry>
|
|
<entry>Mar</entry>
|
|
</row>
|
|
<row>
|
|
<entry>April</entry>
|
|
<entry>Apr</entry>
|
|
</row>
|
|
<row>
|
|
<entry>May</entry>
|
|
<entry></entry>
|
|
</row>
|
|
<row>
|
|
<entry>June</entry>
|
|
<entry>Jun</entry>
|
|
</row>
|
|
<row>
|
|
<entry>July</entry>
|
|
<entry>Jul</entry>
|
|
</row>
|
|
<row>
|
|
<entry>August</entry>
|
|
<entry>Aug</entry>
|
|
</row>
|
|
<row>
|
|
<entry>September</entry>
|
|
<entry>Sep, Sept</entry>
|
|
</row>
|
|
<row>
|
|
<entry>October</entry>
|
|
<entry>Oct</entry>
|
|
</row>
|
|
<row>
|
|
<entry>November</entry>
|
|
<entry>Nov</entry>
|
|
</row>
|
|
<row>
|
|
<entry>December</entry>
|
|
<entry>Dec</entry>
|
|
</row>
|
|
</tbody>
|
|
</tgroup>
|
|
</table>
|
|
|
|
<para>
|
|
<xref linkend="datetime-dow-table"> shows the tokens that are
|
|
recognized as names of days of the week.
|
|
</para>
|
|
|
|
<table id="datetime-dow-table">
|
|
<title>Day of the Week Names</title>
|
|
<tgroup cols="2">
|
|
<thead>
|
|
<row>
|
|
<entry>Day</entry>
|
|
<entry>Abbreviations</entry>
|
|
</row>
|
|
</thead>
|
|
<tbody>
|
|
<row>
|
|
<entry>Sunday</entry>
|
|
<entry>Sun</entry>
|
|
</row>
|
|
<row>
|
|
<entry>Monday</entry>
|
|
<entry>Mon</entry>
|
|
</row>
|
|
<row>
|
|
<entry>Tuesday</entry>
|
|
<entry>Tue, Tues</entry>
|
|
</row>
|
|
<row>
|
|
<entry>Wednesday</entry>
|
|
<entry>Wed, Weds</entry>
|
|
</row>
|
|
<row>
|
|
<entry>Thursday</entry>
|
|
<entry>Thu, Thur, Thurs</entry>
|
|
</row>
|
|
<row>
|
|
<entry>Friday</entry>
|
|
<entry>Fri</entry>
|
|
</row>
|
|
<row>
|
|
<entry>Saturday</entry>
|
|
<entry>Sat</entry>
|
|
</row>
|
|
</tbody>
|
|
</tgroup>
|
|
</table>
|
|
|
|
<para>
|
|
<xref linkend="datetime-mod-table"> shows the tokens that serve
|
|
various modifier purposes.
|
|
</para>
|
|
|
|
<table id="datetime-mod-table">
|
|
<title>Date/Time Field Modifiers</title>
|
|
<tgroup cols="2">
|
|
<thead>
|
|
<row>
|
|
<entry>Identifier</entry>
|
|
<entry>Description</entry>
|
|
</row>
|
|
</thead>
|
|
<tbody>
|
|
<row>
|
|
<entry><literal>ABSTIME</literal></entry>
|
|
<entry>Ignored</entry>
|
|
</row>
|
|
<row>
|
|
<entry><literal>AM</literal></entry>
|
|
<entry>Time is before 12:00</entry>
|
|
</row>
|
|
<row>
|
|
<entry><literal>AT</literal></entry>
|
|
<entry>Ignored</entry>
|
|
</row>
|
|
<row>
|
|
<entry><literal>JULIAN</>, <literal>JD</>, <literal>J</></entry>
|
|
<entry>Next field is Julian Day</entry>
|
|
</row>
|
|
<row>
|
|
<entry><literal>ON</literal></entry>
|
|
<entry>Ignored</entry>
|
|
</row>
|
|
<row>
|
|
<entry><literal>PM</literal></entry>
|
|
<entry>Time is on or after 12:00</entry>
|
|
</row>
|
|
<row>
|
|
<entry><literal>T</literal></entry>
|
|
<entry>Next field is time</entry>
|
|
</row>
|
|
</tbody>
|
|
</tgroup>
|
|
</table>
|
|
|
|
<para>
|
|
The key word <literal>ABSTIME</literal> is ignored for historical
|
|
reasons: In very old releases of
|
|
<productname>PostgreSQL</productname>, invalid values of type <type>abstime</type>
|
|
were emitted as <literal>Invalid Abstime</literal>. This is no
|
|
longer the case however and this key word will likely be dropped in
|
|
a future release.
|
|
</para>
|
|
</sect1>
|
|
|
|
<sect1 id="datetime-config-files">
|
|
<title>Date/Time Configuration Files</title>
|
|
|
|
<indexterm>
|
|
<primary>time zone</primary>
|
|
<secondary>input abbreviations</secondary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
Since timezone abbreviations are not well standardized,
|
|
<productname>PostgreSQL</productname> provides a means to customize
|
|
the set of abbreviations accepted by the server. The
|
|
<xref linkend="guc-timezone-abbreviations"> run-time parameter
|
|
determines the active set of abbreviations. While this parameter
|
|
can be altered by any database user, the possible values for it
|
|
are under the control of the database administrator — they
|
|
are in fact names of configuration files stored in
|
|
<filename>.../share/timezonesets/</> of the installation directory.
|
|
By adding or altering files in that directory, the administrator
|
|
can set local policy for timezone abbreviations.
|
|
</para>
|
|
|
|
<para>
|
|
<literal>timezone_abbreviations</> can be set to any file name
|
|
found in <filename>.../share/timezonesets/</>, if the file's name
|
|
is entirely alphabetic. (The prohibition against non-alphabetic
|
|
characters in <literal>timezone_abbreviations</> prevents reading
|
|
files outside the intended directory, as well as reading editor
|
|
backup files and other extraneous files.)
|
|
</para>
|
|
|
|
<para>
|
|
A timezone abbreviation file may contain blank lines and comments
|
|
beginning with <literal>#</>. Non-comment lines must have one of
|
|
these formats:
|
|
|
|
<synopsis>
|
|
<replaceable>time_zone_name</replaceable> <replaceable>offset</replaceable>
|
|
<replaceable>time_zone_name</replaceable> <replaceable>offset</replaceable> D
|
|
@INCLUDE <replaceable>file_name</replaceable>
|
|
@OVERRIDE
|
|
</synopsis>
|
|
</para>
|
|
|
|
<para>
|
|
A <replaceable>time_zone_name</replaceable> is just the abbreviation
|
|
being defined. The <replaceable>offset</replaceable> is the zone's
|
|
offset in seconds from UTC, positive being east from Greenwich and
|
|
negative being west. For example, -18000 would be five hours west
|
|
of Greenwich, or North American east coast standard time. <literal>D</>
|
|
indicates that the zone name represents local daylight-savings time
|
|
rather than standard time. Since all known time zone offsets are on
|
|
15 minute boundaries, the number of seconds has to be a multiple of 900.
|
|
</para>
|
|
|
|
<para>
|
|
The <literal>@INCLUDE</> syntax allows inclusion of another file in the
|
|
<filename>.../share/timezonesets/</> directory. Inclusion can be nested,
|
|
to a limited depth.
|
|
</para>
|
|
|
|
<para>
|
|
The <literal>@OVERRIDE</> syntax indicates that subsequent entries in the
|
|
file may override previous entries (i.e., entries obtained from included
|
|
files). Without this, conflicting definitions of the same timezone
|
|
abbreviation are considered an error.
|
|
</para>
|
|
|
|
<para>
|
|
In an unmodified installation, the file <filename>Default</> contains
|
|
all the non-conflicting time zone abbreviations for most of the world.
|
|
Additional files <filename>Australia</> and <filename>India</> are
|
|
provided for those regions: these files first include the
|
|
<literal>Default</> file and then add or modify timezones as needed.
|
|
</para>
|
|
|
|
<para>
|
|
For reference purposes, a standard installation also contains files
|
|
<filename>Africa.txt</>, <filename>America.txt</>, etc, containing
|
|
information about every time zone abbreviation known to be in use
|
|
according to the <literal>zic</> timezone database. The zone name
|
|
definitions found in these files can be copied and pasted into a custom
|
|
configuration file as needed. Note that these files cannot be directly
|
|
referenced as <literal>timezone_abbreviations</> settings, because of
|
|
the dot embedded in their names.
|
|
</para>
|
|
|
|
<note>
|
|
<para>
|
|
If an error occurs while reading the time zone data sets, no new value is
|
|
applied but the old set is kept. If the error occurs while starting the
|
|
database, startup fails.
|
|
</para>
|
|
</note>
|
|
|
|
<caution>
|
|
<para>
|
|
Time zone abbreviations defined in the configuration file override
|
|
non-timezone meanings built into <productname>PostgreSQL</productname>.
|
|
For example, the <filename>Australia</> configuration file defines
|
|
<literal>SAT</> (for South Australian Standard Time). When this
|
|
file is active, <literal>SAT</> will not be recognized as an abbreviation
|
|
for Saturday.
|
|
</para>
|
|
</caution>
|
|
|
|
<caution>
|
|
<para>
|
|
If you modify files in <filename>.../share/timezonesets/</>,
|
|
it is up to you to make backups — a normal database dump
|
|
will not include this directory.
|
|
</para>
|
|
</caution>
|
|
|
|
</sect1>
|
|
|
|
<sect1 id="datetime-units-history">
|
|
<title>History of Units</title>
|
|
|
|
<para>
|
|
The Julian Date was invented by the French scholar
|
|
Joseph Justus Scaliger (1540-1609)
|
|
and probably takes its name from Scaliger's father,
|
|
the Italian scholar Julius Caesar Scaliger (1484-1558).
|
|
Astronomers have used the Julian period to assign a unique number to
|
|
every day since 1 January 4713 BC. This is the so-called Julian Date
|
|
(JD). JD 0 designates the 24 hours from noon UTC on 1 January 4713 BC
|
|
to noon UTC on 2 January 4713 BC.
|
|
</para>
|
|
|
|
<para>
|
|
The <quote>Julian Date</quote> is different from the <quote>Julian
|
|
Calendar</quote>. The Julian calendar
|
|
was introduced by Julius Caesar in 45 BC. It was in common use
|
|
until the year 1582, when countries started changing to the Gregorian
|
|
calendar. In the Julian calendar, the tropical year is
|
|
approximated as 365 1/4 days = 365.25 days. This gives an error of
|
|
about 1 day in 128 years.
|
|
</para>
|
|
|
|
<para>
|
|
The accumulating calendar error prompted
|
|
Pope Gregory XIII to reform the calendar in accordance with
|
|
instructions from the Council of Trent.
|
|
In the Gregorian calendar, the tropical year is approximated as
|
|
365 + 97 / 400 days = 365.2425 days. Thus it takes approximately 3300
|
|
years for the tropical year to shift one day with respect to the
|
|
Gregorian calendar.
|
|
</para>
|
|
|
|
<para>
|
|
The approximation 365+97/400 is achieved by having 97 leap years
|
|
every 400 years, using the following rules:
|
|
|
|
<simplelist>
|
|
<member>
|
|
Every year divisible by 4 is a leap year.
|
|
</member>
|
|
<member>
|
|
However, every year divisible by 100 is not a leap year.
|
|
</member>
|
|
<member>
|
|
However, every year divisible by 400 is a leap year after all.
|
|
</member>
|
|
</simplelist>
|
|
|
|
So, 1700, 1800, 1900, 2100, and 2200 are not leap years. But 1600,
|
|
2000, and 2400 are leap years.
|
|
|
|
By contrast, in the older Julian calendar all years divisible by 4 are leap
|
|
years.
|
|
</para>
|
|
|
|
<para>
|
|
The papal bull of February 1582 decreed that 10 days should be dropped
|
|
from October 1582 so that 15 October should follow immediately after
|
|
4 October.
|
|
This was observed in Italy, Poland, Portugal, and Spain. Other Catholic
|
|
countries followed shortly after, but Protestant countries were
|
|
reluctant to change, and the Greek orthodox countries didn't change
|
|
until the start of the 20th century.
|
|
|
|
The reform was observed by Great Britain and Dominions (including what is
|
|
now the USA) in 1752.
|
|
Thus 2 September 1752 was followed by 14 September 1752.
|
|
|
|
This is why Unix systems have the <command>cal</command> program
|
|
produce the following:
|
|
|
|
<screen>
|
|
$ <userinput>cal 9 1752</userinput>
|
|
September 1752
|
|
S M Tu W Th F S
|
|
1 2 14 15 16
|
|
17 18 19 20 21 22 23
|
|
24 25 26 27 28 29 30
|
|
</screen>
|
|
</para>
|
|
|
|
<note>
|
|
<para>
|
|
The SQL standard states that <quote>Within the definition of a
|
|
<quote>datetime literal</quote>, the <quote>datetime
|
|
value</quote>s are constrained by the natural rules for dates and
|
|
times according to the Gregorian calendar</quote>. Dates between
|
|
1752-09-03 and 1752-09-13, although eliminated in some countries
|
|
by Papal fiat, conform to <quote>natural rules</quote> and are
|
|
hence valid dates.
|
|
</para>
|
|
</note>
|
|
|
|
<para>
|
|
Different calendars have been developed in various parts of the
|
|
world, many predating the Gregorian system.
|
|
|
|
For example,
|
|
the beginnings of the Chinese calendar can be traced back to the 14th
|
|
century BC. Legend has it that the Emperor Huangdi invented the
|
|
calendar in 2637 BC.
|
|
|
|
The People's Republic of China uses the Gregorian calendar
|
|
for civil purposes. The Chinese calendar is used for determining
|
|
festivals.
|
|
</para>
|
|
</sect1>
|
|
</appendix>
|