mirror of
https://git.postgresql.org/git/postgresql.git
synced 2024-09-10 08:49:25 +02:00
a2a8c7a662
Both hex format and the traditional "escape" format are automatically handled on input. The output format is selected by the new GUC variable bytea_output. As committed, bytea_output defaults to HEX, which is an *incompatible change*. We will keep it this way for awhile for testing purposes, but should consider whether to switch to the more backwards-compatible default of ESCAPE before 8.5 is released. Peter Eisentraut
4502 lines
154 KiB
Plaintext
<!-- $PostgreSQL: pgsql/doc/src/sgml/datatype.sgml,v 1.241 2009/08/04 16:08:35 tgl Exp $ -->
|
|
|
|
<chapter id="datatype">
|
|
<title id="datatype-title">Data Types</title>
|
|
|
|
<indexterm zone="datatype">
|
|
<primary>data type</primary>
|
|
</indexterm>
|
|
|
|
<indexterm>
|
|
<primary>type</primary>
|
|
<see>data type</see>
|
|
</indexterm>
|
|
|
|
<para>
|
|
<productname>PostgreSQL</productname> has a rich set of native data
|
|
types available to users. Users can add new types to
|
|
<productname>PostgreSQL</productname> using the <xref
|
|
linkend="sql-createtype" endterm="sql-createtype-title"> command.
|
|
</para>
|
|
|
|
<para>
|
|
<xref linkend="datatype-table"> shows all the built-in general-purpose data
|
|
types. Most of the alternative names listed in the
|
|
<quote>Aliases</quote> column are the names used internally by
|
|
<productname>PostgreSQL</productname> for historical reasons. In
|
|
addition, some internally used or deprecated types are available,
|
|
but are not listed here.
|
|
</para>
|
|
|
|
<table id="datatype-table">
|
|
<title>Data Types</title>
|
|
<tgroup cols="3">
|
|
<thead>
|
|
<row>
|
|
<entry>Name</entry>
|
|
<entry>Aliases</entry>
|
|
<entry>Description</entry>
|
|
</row>
|
|
</thead>
|
|
|
|
<tbody>
|
|
<row>
|
|
<entry><type>bigint</type></entry>
|
|
<entry><type>int8</type></entry>
|
|
<entry>signed eight-byte integer</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>bigserial</type></entry>
|
|
<entry><type>serial8</type></entry>
|
|
<entry>autoincrementing eight-byte integer</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>bit [ (<replaceable>n</replaceable>) ]</type></entry>
|
|
<entry></entry>
|
|
<entry>fixed-length bit string</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>bit varying [ (<replaceable>n</replaceable>) ]</type></entry>
|
|
<entry><type>varbit</type></entry>
|
|
<entry>variable-length bit string</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>boolean</type></entry>
|
|
<entry><type>bool</type></entry>
|
|
<entry>logical Boolean (true/false)</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>box</type></entry>
|
|
<entry></entry>
|
|
<entry>rectangular box on a plane</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>bytea</type></entry>
|
|
<entry></entry>
|
|
<entry>binary data (<quote>byte array</>)</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>character varying [ (<replaceable>n</replaceable>) ]</type></entry>
|
|
<entry><type>varchar [ (<replaceable>n</replaceable>) ]</type></entry>
|
|
<entry>variable-length character string</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>character [ (<replaceable>n</replaceable>) ]</type></entry>
|
|
<entry><type>char [ (<replaceable>n</replaceable>) ]</type></entry>
|
|
<entry>fixed-length character string</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>cidr</type></entry>
|
|
<entry></entry>
|
|
<entry>IPv4 or IPv6 network address</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>circle</type></entry>
|
|
<entry></entry>
|
|
<entry>circle on a plane</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>date</type></entry>
|
|
<entry></entry>
|
|
<entry>calendar date (year, month, day)</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>double precision</type></entry>
|
|
<entry><type>float8</type></entry>
|
|
<entry>double precision floating-point number (8 bytes)</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>inet</type></entry>
|
|
<entry></entry>
|
|
<entry>IPv4 or IPv6 host address</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>integer</type></entry>
|
|
<entry><type>int</type>, <type>int4</type></entry>
|
|
<entry>signed four-byte integer</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>interval [ <replaceable>fields</replaceable> ] [ (<replaceable>p</replaceable>) ]</type></entry>
|
|
<entry></entry>
|
|
<entry>time span</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>line</type></entry>
|
|
<entry></entry>
|
|
<entry>infinite line on a plane</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>lseg</type></entry>
|
|
<entry></entry>
|
|
<entry>line segment on a plane</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>macaddr</type></entry>
|
|
<entry></entry>
|
|
<entry>MAC (Media Access Control) address</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>money</type></entry>
|
|
<entry></entry>
|
|
<entry>currency amount</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>numeric [ (<replaceable>p</replaceable>,
|
|
<replaceable>s</replaceable>) ]</type></entry>
|
|
<entry><type>decimal [ (<replaceable>p</replaceable>,
|
|
<replaceable>s</replaceable>) ]</type></entry>
|
|
<entry>exact numeric of selectable precision</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>path</type></entry>
|
|
<entry></entry>
|
|
<entry>geometric path on a plane</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>point</type></entry>
|
|
<entry></entry>
|
|
<entry>geometric point on a plane</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>polygon</type></entry>
|
|
<entry></entry>
|
|
<entry>closed geometric path on a plane</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>real</type></entry>
|
|
<entry><type>float4</type></entry>
|
|
<entry>single precision floating-point number (4 bytes)</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>smallint</type></entry>
|
|
<entry><type>int2</type></entry>
|
|
<entry>signed two-byte integer</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>serial</type></entry>
|
|
<entry><type>serial4</type></entry>
|
|
<entry>autoincrementing four-byte integer</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>text</type></entry>
|
|
<entry></entry>
|
|
<entry>variable-length character string</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>time [ (<replaceable>p</replaceable>) ] [ without time zone ]</type></entry>
|
|
<entry></entry>
|
|
<entry>time of day (no time zone)</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>time [ (<replaceable>p</replaceable>) ] with time zone</type></entry>
|
|
<entry><type>timetz</type></entry>
|
|
<entry>time of day, including time zone</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>timestamp [ (<replaceable>p</replaceable>) ] [ without time zone ]</type></entry>
|
|
<entry></entry>
|
|
<entry>date and time (no time zone)</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>timestamp [ (<replaceable>p</replaceable>) ] with time zone</type></entry>
|
|
<entry><type>timestamptz</type></entry>
|
|
<entry>date and time, including time zone</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>tsquery</type></entry>
|
|
<entry></entry>
|
|
<entry>text search query</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>tsvector</type></entry>
|
|
<entry></entry>
|
|
<entry>text search document</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>txid_snapshot</type></entry>
|
|
<entry></entry>
|
|
<entry>user-level transaction ID snapshot</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>uuid</type></entry>
|
|
<entry></entry>
|
|
<entry>universally unique identifier</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>xml</type></entry>
|
|
<entry></entry>
|
|
<entry>XML data</entry>
|
|
</row>
|
|
</tbody>
|
|
</tgroup>
|
|
</table>
|
|
|
|
<note>
|
|
<title>Compatibility</title>
|
|
<para>
|
|
The following types (or spellings thereof) are specified by
|
|
<acronym>SQL</acronym>: <type>bigint</type>, <type>bit</type>, <type>bit
|
|
varying</type>, <type>boolean</type>, <type>char</type>,
|
|
<type>character varying</type>, <type>character</type>,
|
|
<type>varchar</type>, <type>date</type>, <type>double
|
|
precision</type>, <type>integer</type>, <type>interval</type>,
|
|
<type>numeric</type>, <type>decimal</type>, <type>real</type>,
|
|
<type>smallint</type>, <type>time</type> (with or without time zone),
|
|
<type>timestamp</type> (with or without time zone),
|
|
<type>xml</type>.
|
|
</para>
|
|
</note>
|
|
|
|
<para>
|
|
Each data type has an external representation determined by its input
|
|
and output functions. Many of the built-in types have
|
|
obvious external formats. However, several types are either unique
|
|
to <productname>PostgreSQL</productname>, such as geometric
|
|
paths, or have several possible formats, such as the date
|
|
and time types.
|
|
Some of the input and output functions are not invertible, i.e.,
|
|
the result of an output function might lose accuracy when compared to
|
|
the original input.
|
|
</para>
|
|
|
|
<sect1 id="datatype-numeric">
|
|
<title>Numeric Types</title>
|
|
|
|
<indexterm zone="datatype-numeric">
|
|
<primary>data type</primary>
|
|
<secondary>numeric</secondary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
Numeric types consist of two-, four-, and eight-byte integers,
|
|
four- and eight-byte floating-point numbers, and selectable-precision
|
|
decimals. <xref linkend="datatype-numeric-table"> lists the
|
|
available types.
|
|
</para>
|
|
|
|
<table id="datatype-numeric-table">
|
|
<title>Numeric Types</title>
|
|
<tgroup cols="4">
|
|
<thead>
|
|
<row>
|
|
<entry>Name</entry>
|
|
<entry>Storage Size</entry>
|
|
<entry>Description</entry>
|
|
<entry>Range</entry>
|
|
</row>
|
|
</thead>
|
|
|
|
<tbody>
|
|
<row>
|
|
<entry><type>smallint</></entry>
|
|
<entry>2 bytes</entry>
|
|
<entry>small-range integer</entry>
|
|
<entry>-32768 to +32767</entry>
|
|
</row>
|
|
<row>
|
|
<entry><type>integer</></entry>
|
|
<entry>4 bytes</entry>
|
|
<entry>typical choice for integer</entry>
|
|
<entry>-2147483648 to +2147483647</entry>
|
|
</row>
|
|
<row>
|
|
<entry><type>bigint</></entry>
|
|
<entry>8 bytes</entry>
|
|
<entry>large-range integer</entry>
|
|
<entry>-9223372036854775808 to +9223372036854775807</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>decimal</></entry>
|
|
<entry>variable</entry>
|
|
<entry>user-specified precision, exact</entry>
|
|
<entry>no limit</entry>
|
|
</row>
|
|
<row>
|
|
<entry><type>numeric</></entry>
|
|
<entry>variable</entry>
|
|
<entry>user-specified precision, exact</entry>
|
|
<entry>no limit</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>real</></entry>
|
|
<entry>4 bytes</entry>
|
|
<entry>variable-precision, inexact</entry>
|
|
<entry>6 decimal digits precision</entry>
|
|
</row>
|
|
<row>
|
|
<entry><type>double precision</></entry>
|
|
<entry>8 bytes</entry>
|
|
<entry>variable-precision, inexact</entry>
|
|
<entry>15 decimal digits precision</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>serial</></entry>
|
|
<entry>4 bytes</entry>
|
|
<entry>autoincrementing integer</entry>
|
|
<entry>1 to 2147483647</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>bigserial</type></entry>
|
|
<entry>8 bytes</entry>
|
|
<entry>large autoincrementing integer</entry>
|
|
<entry>1 to 9223372036854775807</entry>
|
|
</row>
|
|
</tbody>
|
|
</tgroup>
|
|
</table>
|
|
|
|
<para>
|
|
The syntax of constants for the numeric types is described in
|
|
<xref linkend="sql-syntax-constants">. The numeric types have a
|
|
full set of corresponding arithmetic operators and
|
|
functions. Refer to <xref linkend="functions"> for more
|
|
information. The following sections describe the types in detail.
|
|
</para>
|
|
|
|
<sect2 id="datatype-int">
|
|
<title>Integer Types</title>
|
|
|
|
<indexterm zone="datatype-int">
|
|
<primary>integer</primary>
|
|
</indexterm>
|
|
|
|
<indexterm zone="datatype-int">
|
|
<primary>smallint</primary>
|
|
</indexterm>
|
|
|
|
<indexterm zone="datatype-int">
|
|
<primary>bigint</primary>
|
|
</indexterm>
|
|
|
|
<indexterm>
|
|
<primary>int4</primary>
|
|
<see>integer</see>
|
|
</indexterm>
|
|
|
|
<indexterm>
|
|
<primary>int2</primary>
|
|
<see>smallint</see>
|
|
</indexterm>
|
|
|
|
<indexterm>
|
|
<primary>int8</primary>
|
|
<see>bigint</see>
|
|
</indexterm>
|
|
|
|
<para>
|
|
The types <type>smallint</type>, <type>integer</type>, and
|
|
<type>bigint</type> store whole numbers, that is, numbers without
|
|
fractional components, of various ranges. Attempts to store
|
|
values outside of the allowed range will result in an error.
|
|
</para>
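<para>
As a brief illustration of the range check (a sketch only; the exact
wording of the error messages is not shown here and may differ
between releases):
<programlisting>
SELECT 32767::smallint;          -- accepted: the largest smallint value
SELECT 32768::smallint;          -- fails: the value is out of range for smallint
SELECT 2147483647::integer + 1;  -- fails: the sum does not fit in integer
SELECT 2147483647::bigint + 1;   -- succeeds and yields 2147483648
</programlisting>
Casting one operand to <type>bigint</type> is enough for the addition
to be carried out with the wider type.
</para>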
|
|
|
|
<para>
|
|
The type <type>integer</type> is the common choice, as it offers
|
|
the best balance between range, storage size, and performance.
|
|
The <type>smallint</type> type is generally only used if disk
|
|
space is at a premium. The <type>bigint</type> type should only
|
|
be used if the <type>integer</type> range is insufficient,
|
|
because operations on <type>integer</type> are definitely faster.
|
|
</para>
|
|
|
|
<para>
|
|
On very minimal operating systems the <type>bigint</type> type
|
|
might not function correctly, because it relies on compiler support
|
|
for eight-byte integers. On such machines, <type>bigint</type>
|
|
acts the same as <type>integer</type>, but still takes up eight
|
|
bytes of storage. (We are not aware of any modern
|
|
platform where this is the case.)
|
|
</para>
|
|
|
|
<para>
|
|
<acronym>SQL</acronym> only specifies the integer types
|
|
<type>integer</type> (or <type>int</type>),
|
|
<type>smallint</type>, and <type>bigint</type>. The
|
|
type names <type>int2</type>, <type>int4</type>, and
|
|
<type>int8</type> are extensions, which are also used by some
|
|
other <acronym>SQL</acronym> database systems.
|
|
</para>
|
|
|
|
</sect2>
|
|
|
|
<sect2 id="datatype-numeric-decimal">
|
|
<title>Arbitrary Precision Numbers</title>
|
|
|
|
<indexterm>
|
|
<primary>numeric (data type)</primary>
|
|
</indexterm>
|
|
|
|
<indexterm>
|
|
<primary>arbitrary precision numbers</primary>
|
|
</indexterm>
|
|
|
|
<indexterm>
|
|
<primary>decimal</primary>
|
|
<see>numeric</see>
|
|
</indexterm>
|
|
|
|
<para>
|
|
The type <type>numeric</type> can store numbers with up to 1000
|
|
digits of precision and perform calculations exactly. It is
|
|
especially recommended for storing monetary amounts and other
|
|
quantities where exactness is required. However, arithmetic on
|
|
<type>numeric</type> values is very slow compared to the integer
|
|
types, or to the floating-point types described in the next section.
|
|
</para>
|
|
|
|
<para>
|
|
We use the following terms below: The
|
|
<firstterm>scale</firstterm> of a <type>numeric</type> is the
|
|
count of decimal digits in the fractional part, to the right of
|
|
the decimal point. The <firstterm>precision</firstterm> of a
|
|
<type>numeric</type> is the total count of significant digits in
|
|
the whole number, that is, the number of digits to both sides of
|
|
the decimal point. So the number 23.5141 has a precision of 6
|
|
and a scale of 4. Integers can be considered to have a scale of
|
|
zero.
|
|
</para>
|
|
|
|
<para>
|
|
Both the maximum precision and the maximum scale of a
|
|
<type>numeric</type> column can be
|
|
configured. To declare a column of type <type>numeric</type> use
|
|
the syntax:
|
|
<programlisting>
|
|
NUMERIC(<replaceable>precision</replaceable>, <replaceable>scale</replaceable>)
|
|
</programlisting>
|
|
The precision must be positive; the scale must be zero or positive.
|
|
Alternatively:
|
|
<programlisting>
|
|
NUMERIC(<replaceable>precision</replaceable>)
|
|
</programlisting>
|
|
selects a scale of 0. Specifying:
|
|
<programlisting>
|
|
NUMERIC
|
|
</programlisting>
|
|
without any precision or scale creates a column in which numeric
|
|
values of any precision and scale can be stored, up to the
|
|
implementation limit on precision. A column of this kind will
|
|
not coerce input values to any particular scale, whereas
|
|
<type>numeric</type> columns with a declared scale will coerce
|
|
input values to that scale. (The <acronym>SQL</acronym> standard
|
|
requires a default scale of 0, i.e., coercion to integer
|
|
precision. We find this a bit useless. If you're concerned
|
|
about portability, always specify the precision and scale
|
|
explicitly.)
|
|
</para>
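<para>
The three declaration forms can be compared with a single query. This
is a minimal sketch; the column aliases are invented for the
illustration:
<programlisting>
SELECT 1.5::numeric      AS unconstrained,  -- 1.5, no scale coercion
       1.5::numeric(5)   AS scale_zero,     -- 2, coerced to scale 0
       1.5::numeric(5,3) AS scale_three;    -- 1.500, coerced to scale 3
</programlisting>
</para>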
|
|
|
|
<para>
|
|
If the scale of a value to be stored is greater than the declared
|
|
scale of the column, the system will round the value to the specified
|
|
number of fractional digits. Then, if the number of digits to the
|
|
left of the decimal point exceeds the declared precision minus the
|
|
declared scale, an error is raised.
|
|
</para>
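<para>
The order of these two steps matters, as the following sketch tries to
show. The table name <literal>price_demo</literal> is hypothetical, and
the error text is omitted because it may vary between releases:
<programlisting>
CREATE TABLE price_demo (amount numeric(5,2));

INSERT INTO price_demo VALUES (123.456);  -- rounded to scale 2, stored as 123.46
INSERT INTO price_demo VALUES (999.994);  -- rounded down, stored as 999.99
INSERT INTO price_demo VALUES (999.995);  -- rounds to 1000.00 first, then fails:
                                          -- four digits left of the decimal point
INSERT INTO price_demo VALUES (1234.5);   -- fails: only 5 - 2 = 3 digits are allowed
                                          -- to the left of the decimal point
</programlisting>
</para>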
|
|
|
|
<para>
|
|
Numeric values are physically stored without any extra leading or
|
|
trailing zeroes. Thus, the declared precision and scale of a column
|
|
are maximums, not fixed allocations. (In this sense the <type>numeric</>
|
|
type is more akin to <type>varchar(<replaceable>n</>)</type>
|
|
than to <type>char(<replaceable>n</>)</type>.) The actual storage
|
|
requirement is two bytes for each group of four decimal digits,
|
|
plus five to eight bytes overhead.
|
|
</para>
|
|
|
|
<indexterm>
|
|
<primary>NaN</primary>
|
|
<see>not a number</see>
|
|
</indexterm>
|
|
|
|
<indexterm>
|
|
<primary>not a number</primary>
|
|
<secondary>numeric (data type)</secondary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
In addition to ordinary numeric values, the <type>numeric</type>
|
|
type allows the special value <literal>NaN</>, meaning
|
|
<quote>not-a-number</quote>. Any operation on <literal>NaN</>
|
|
yields another <literal>NaN</>. When writing this value
|
|
as a constant in an SQL command, you must put quotes around it,
|
|
for example <literal>UPDATE table SET x = 'NaN'</>. On input,
|
|
the string <literal>NaN</> is recognized in a case-insensitive manner.
|
|
</para>
|
|
|
|
<note>
|
|
<para>
|
|
In most implementations of the <quote>not-a-number</> concept,
|
|
<literal>NaN</> is not considered equal to any other numeric
|
|
value (including <literal>NaN</>). In order to allow
|
|
<type>numeric</> values to be sorted and used in tree-based
|
|
indexes, <productname>PostgreSQL</> treats <literal>NaN</>
|
|
values as equal, and greater than all non-<literal>NaN</>
|
|
values.
|
|
</para>
|
|
</note>
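<para>
A short sketch of this behavior, assuming nothing beyond the
<type>numeric</type> type itself (the column aliases are made up for
the example):
<programlisting>
SELECT 'NaN'::numeric = 'nan'::numeric   AS nan_equals_nan;   -- true
SELECT 'NaN'::numeric > 1000000::numeric AS nan_is_greatest;  -- true

-- NaN sorts after every ordinary value: the rows come back as -5, 1, NaN
SELECT x FROM (VALUES ('NaN'::numeric), (1), (-5)) AS v(x) ORDER BY x;
</programlisting>
</para>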
|
|
|
|
<para>
|
|
The types <type>decimal</type> and <type>numeric</type> are
|
|
equivalent. Both types are part of the <acronym>SQL</acronym>
|
|
standard.
|
|
</para>
|
|
</sect2>
|
|
|
|
|
|
<sect2 id="datatype-float">
|
|
<title>Floating-Point Types</title>
|
|
|
|
<indexterm zone="datatype-float">
|
|
<primary>real</primary>
|
|
</indexterm>
|
|
|
|
<indexterm zone="datatype-float">
|
|
<primary>double precision</primary>
|
|
</indexterm>
|
|
|
|
<indexterm>
|
|
<primary>float4</primary>
|
|
<see>real</see>
|
|
</indexterm>
|
|
|
|
<indexterm>
|
|
<primary>float8</primary>
|
|
<see>double precision</see>
|
|
</indexterm>
|
|
|
|
<indexterm zone="datatype-float">
|
|
<primary>floating point</primary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
The data types <type>real</type> and <type>double
|
|
precision</type> are inexact, variable-precision numeric types.
|
|
In practice, these types are usually implementations of
|
|
<acronym>IEEE</acronym> Standard 754 for Binary Floating-Point
|
|
Arithmetic (single and double precision, respectively), to the
|
|
extent that the underlying processor, operating system, and
|
|
compiler support it.
|
|
</para>
|
|
|
|
<para>
|
|
Inexact means that some values cannot be converted exactly to the
|
|
internal format and are stored as approximations, so that storing
|
|
and retrieving a value might show slight discrepancies.
|
|
Managing these errors and how they propagate through calculations
|
|
is the subject of an entire branch of mathematics and computer
|
|
science and will not be discussed here, except for the
|
|
following points:
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>
|
|
If you require exact storage and calculations (such as for
|
|
monetary amounts), use the <type>numeric</type> type instead.
|
|
</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>
|
|
If you want to do complicated calculations with these types
|
|
for anything important, especially if you rely on certain
|
|
behavior in boundary cases (infinity, underflow), you should
|
|
evaluate the implementation carefully.
|
|
</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>
|
|
Comparing two floating-point values for equality might not
|
|
always work as expected.
|
|
</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
</para>
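<para>
The last point can be demonstrated with a few queries. This is only a
sketch: the results shown assume IEEE 754 double precision arithmetic,
and the tolerance of <literal>1e-9</literal> is an arbitrary value
chosen for the illustration, not a recommendation:
<programlisting>
-- false: the binary approximations of 0.1 and 0.2 do not sum to that of 0.3
SELECT 0.1::float8 + 0.2::float8 = 0.3::float8 AS floats_equal;

-- true: numeric stores decimal digits exactly
SELECT 0.1::numeric + 0.2::numeric = 0.3::numeric AS numerics_equal;

-- true: comparing against a small tolerance instead of testing equality
SELECT 1e-9 > abs((0.1::float8 + 0.2::float8) - 0.3::float8) AS close_enough;
</programlisting>
</para>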
|
|
|
|
<para>
|
|
On most platforms, the <type>real</type> type has a range of at least
|
|
1E-37 to 1E+37 with a precision of at least 6 decimal digits. The
|
|
<type>double precision</type> type typically has a range of around
|
|
1E-307 to 1E+308 with a precision of at least 15 digits. Values that
|
|
are too large or too small will cause an error. Rounding might
|
|
take place if the precision of an input number is too high.
|
|
Numbers too close to zero that are not representable as distinct
|
|
from zero will cause an underflow error.
|
|
</para>
|
|
|
|
<indexterm>
|
|
<primary>not a number</primary>
|
|
<secondary>double precision</secondary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
In addition to ordinary numeric values, the floating-point types
|
|
have several special values:
|
|
<literallayout>
|
|
<literal>Infinity</literal>
|
|
<literal>-Infinity</literal>
|
|
<literal>NaN</literal>
|
|
</literallayout>
|
|
These represent the IEEE 754 special values
|
|
<quote>infinity</quote>, <quote>negative infinity</quote>, and
|
|
<quote>not-a-number</quote>, respectively. (On a machine whose
|
|
floating-point arithmetic does not follow IEEE 754, these values
|
|
will probably not work as expected.) When writing these values
|
|
as constants in an SQL command, you must put quotes around them,
|
|
for example <literal>UPDATE table SET x = 'Infinity'</>. On input,
|
|
these strings are recognized in a case-insensitive manner.
|
|
</para>
|
|
|
|
<note>
|
|
<para>
|
|
IEEE 754 specifies that <literal>NaN</> should not compare equal
|
|
to any other floating-point value (including <literal>NaN</>).
|
|
In order to allow floating-point values to be sorted and used
|
|
in tree-based indexes, <productname>PostgreSQL</> treats
|
|
<literal>NaN</> values as equal, and greater than all
|
|
non-<literal>NaN</> values.
|
|
</para>
|
|
</note>
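<para>
As a small sketch of the special values, assuming a platform with
IEEE 754 floating-point arithmetic (the column aliases are invented
for the example):
<programlisting>
-- The quoted strings are accepted regardless of letter case
SELECT 'Infinity'::float8, '-infinity'::float8, 'nan'::float8;

SELECT 'NaN'::float8 = 'NaN'::float8      AS nan_equals_nan;      -- true
SELECT 'NaN'::float8 > 'Infinity'::float8 AS nan_above_infinity;  -- true
</programlisting>
</para>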
|
|
|
|
<para>
|
|
<productname>PostgreSQL</productname> also supports the SQL-standard
|
|
notations <type>float</type> and
|
|
<type>float(<replaceable>p</replaceable>)</type> for specifying
|
|
inexact numeric types. Here, <replaceable>p</replaceable> specifies
|
|
the minimum acceptable precision in <emphasis>binary</> digits.
|
|
<productname>PostgreSQL</productname> accepts
|
|
<type>float(1)</type> to <type>float(24)</type> as selecting the
|
|
<type>real</type> type, while
|
|
<type>float(25)</type> to <type>float(53)</type> select
|
|
<type>double precision</type>. Values of <replaceable>p</replaceable>
|
|
outside the allowed range draw an error.
|
|
<type>float</type> with no precision specified is taken to mean
|
|
<type>double precision</type>.
|
|
</para>
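<para>
One way to check which type a given <replaceable>p</replaceable>
selects is the <function>pg_typeof</function> function. This sketch
assumes a server release that provides that function (8.4 or later):
<programlisting>
SELECT pg_typeof(1::float(24)) AS p24,   -- real
       pg_typeof(1::float(25)) AS p25,   -- double precision
       pg_typeof(1::float)     AS no_p;  -- double precision
</programlisting>
</para>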
|
|
|
|
<note>
|
|
<para>
|
|
Prior to <productname>PostgreSQL</productname> 7.4, the precision in
|
|
<type>float(<replaceable>p</replaceable>)</type> was taken to mean
|
|
that many <emphasis>decimal</> digits. This has been corrected to match the SQL
|
|
standard, which specifies that the precision is measured in binary
|
|
digits. The assumption that <type>real</type> and
|
|
<type>double precision</type> have exactly 24 and 53 bits in the
|
|
mantissa respectively is correct for IEEE-standard floating point
|
|
implementations. On non-IEEE platforms it might be off a little, but
|
|
for simplicity the same ranges of <replaceable>p</replaceable> are used
|
|
on all platforms.
|
|
</para>
|
|
</note>
|
|
|
|
</sect2>
|
|
|
|
<sect2 id="datatype-serial">
|
|
<title>Serial Types</title>
|
|
|
|
<indexterm zone="datatype-serial">
|
|
<primary>serial</primary>
|
|
</indexterm>
|
|
|
|
<indexterm zone="datatype-serial">
|
|
<primary>bigserial</primary>
|
|
</indexterm>
|
|
|
|
<indexterm zone="datatype-serial">
|
|
<primary>serial4</primary>
|
|
</indexterm>
|
|
|
|
<indexterm zone="datatype-serial">
|
|
<primary>serial8</primary>
|
|
</indexterm>
|
|
|
|
<indexterm>
|
|
<primary>auto-increment</primary>
|
|
<see>serial</see>
|
|
</indexterm>
|
|
|
|
<indexterm>
|
|
<primary>sequence</primary>
|
|
<secondary>and serial type</secondary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
The data types <type>serial</type> and <type>bigserial</type>
|
|
are not true types, but merely
|
|
a notational convenience for creating unique identifier columns
|
|
(similar to the <literal>AUTO_INCREMENT</literal> property
|
|
supported by some other databases). In the current
|
|
implementation, specifying:
|
|
|
|
<programlisting>
|
|
CREATE TABLE <replaceable class="parameter">tablename</replaceable> (
|
|
<replaceable class="parameter">colname</replaceable> SERIAL
|
|
);
|
|
</programlisting>
|
|
|
|
is equivalent to specifying:
|
|
|
|
<programlisting>
|
|
CREATE SEQUENCE <replaceable class="parameter">tablename</replaceable>_<replaceable class="parameter">colname</replaceable>_seq;
|
|
CREATE TABLE <replaceable class="parameter">tablename</replaceable> (
|
|
<replaceable class="parameter">colname</replaceable> integer NOT NULL DEFAULT nextval('<replaceable class="parameter">tablename</replaceable>_<replaceable class="parameter">colname</replaceable>_seq')
|
|
);
|
|
ALTER SEQUENCE <replaceable class="parameter">tablename</replaceable>_<replaceable class="parameter">colname</replaceable>_seq OWNED BY <replaceable class="parameter">tablename</replaceable>.<replaceable class="parameter">colname</replaceable>;
|
|
</programlisting>
|
|
|
|
Thus, we have created an integer column and arranged for its default
|
|
values to be assigned from a sequence generator. A <literal>NOT NULL</>
|
|
constraint is applied to ensure that a null value cannot be
|
|
inserted. (In most cases you would also want to attach a
|
|
<literal>UNIQUE</> or <literal>PRIMARY KEY</> constraint to prevent
|
|
duplicate values from being inserted by accident, but this is
|
|
not automatic.) Lastly, the sequence is marked as <quote>owned by</>
|
|
the column, so that it will be dropped if the column or table is dropped.
|
|
</para>
|
|
|
|
<note>
|
|
<para>
|
|
Prior to <productname>PostgreSQL</productname> 7.3, <type>serial</type>
|
|
implied <literal>UNIQUE</literal>. This is no longer automatic. If
|
|
you wish a serial column to have a unique constraint or be a
|
|
primary key, it must now be specified, just like
|
|
any other data type.
|
|
</para>
|
|
</note>
|
|
|
|
<para>
|
|
To insert the next value of the sequence into the <type>serial</type>
|
|
column, specify that the <type>serial</type>
|
|
column should be assigned its default value. This can be done
|
|
either by excluding the column from the list of columns in
|
|
the <command>INSERT</command> statement, or through the use of
|
|
the <literal>DEFAULT</literal> key word.
|
|
</para>
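<para>
Both approaches are sketched below. The table
<literal>users_demo</literal> and its columns are hypothetical names
chosen for the illustration:
<programlisting>
CREATE TABLE users_demo (
    id   serial PRIMARY KEY,
    name text
);

-- Omit the serial column from the column list ...
INSERT INTO users_demo (name) VALUES ('alice');

-- ... or name it and supply the DEFAULT key word
INSERT INTO users_demo (id, name) VALUES (DEFAULT, 'bob');

-- On a freshly created table the rows receive the ids 1 and 2
SELECT id, name FROM users_demo ORDER BY id;
</programlisting>
</para>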
|
|
|
|
<para>
|
|
The type names <type>serial</type> and <type>serial4</type> are
|
|
equivalent: both create <type>integer</type> columns. The type
|
|
names <type>bigserial</type> and <type>serial8</type> work
|
|
the same way, except that they create a <type>bigint</type>
|
|
column. <type>bigserial</type> should be used if you anticipate
|
|
the use of more than 2<superscript>31</> identifiers over the
|
|
lifetime of the table.
|
|
</para>
|
|
|
|
<para>
|
|
The sequence created for a <type>serial</type> column is
|
|
automatically dropped when the owning column is dropped.
|
|
You can drop the sequence without dropping the column, but this
|
|
will force removal of the column default expression.
|
|
</para>
|
|
</sect2>
|
|
</sect1>
|
|
|
|
<sect1 id="datatype-money">
|
|
<title>Monetary Types</title>
|
|
|
|
<para>
|
|
The <type>money</type> type stores a currency amount with a fixed
|
|
fractional precision; see <xref
|
|
linkend="datatype-money-table">. The fractional precision is
|
|
determined by the database's <xref linkend="guc-lc-monetary"> setting.
|
|
Input is accepted in a variety of formats, including integer and
|
|
floating-point literals, as well as typical
|
|
currency formatting, such as <literal>'$1,000.00'</literal>.
|
|
Output is generally in the latter form but depends on the locale.
|
|
Non-quoted numeric values can be converted to <type>money</type> by
|
|
casting the numeric value to <type>text</type> and then
|
|
<type>money</type>, for example:
|
|
<programlisting>
|
|
SELECT 1234::text::money;
|
|
</programlisting>
|
|
There is no simple way of doing the reverse in a locale-independent
|
|
manner, namely casting a <type>money</type> value to a numeric type.
|
|
If you know the currency symbol and thousands separator you can use
|
|
<function>regexp_replace()</>:
|
|
<programlisting>
|
|
SELECT regexp_replace('52093.89'::money::text, '[$,]', '', 'g')::numeric;
|
|
</programlisting>
|
|
|
|
</para>
|
|
|
|
<para>
|
|
Since the output of this data type is locale-sensitive, it might not
|
|
work to load <type>money</> data into a database that has a different
|
|
setting of <varname>lc_monetary</>. To avoid problems, before
|
|
restoring a dump into a new database make sure <varname>lc_monetary</> has the same or
|
|
equivalent value as in the database that was dumped.
|
|
</para>
|
|
|
|
<table id="datatype-money-table">
|
|
<title>Monetary Types</title>
|
|
<tgroup cols="4">
|
|
<thead>
|
|
<row>
|
|
<entry>Name</entry>
|
|
<entry>Storage Size</entry>
|
|
<entry>Description</entry>
|
|
<entry>Range</entry>
|
|
</row>
|
|
</thead>
|
|
<tbody>
|
|
<row>
|
|
<entry><type>money</type></entry>
|
|
<entry>8 bytes</entry>
|
|
<entry>currency amount</entry>
|
|
<entry>-92233720368547758.08 to +92233720368547758.07</entry>
|
|
</row>
|
|
</tbody>
|
|
</tgroup>
|
|
</table>
|
|
</sect1>
|
|
|
|
|
|
<sect1 id="datatype-character">
|
|
<title>Character Types</title>
|
|
|
|
<indexterm zone="datatype-character">
|
|
<primary>character string</primary>
|
|
<secondary>data types</secondary>
|
|
</indexterm>
|
|
|
|
<indexterm>
|
|
<primary>string</primary>
|
|
<see>character string</see>
|
|
</indexterm>
|
|
|
|
<indexterm zone="datatype-character">
|
|
<primary>character</primary>
|
|
</indexterm>
|
|
|
|
<indexterm zone="datatype-character">
|
|
<primary>character varying</primary>
|
|
</indexterm>
|
|
|
|
<indexterm zone="datatype-character">
|
|
<primary>text</primary>
|
|
</indexterm>
|
|
|
|
<indexterm zone="datatype-character">
|
|
<primary>char</primary>
|
|
</indexterm>
|
|
|
|
<indexterm zone="datatype-character">
|
|
<primary>varchar</primary>
|
|
</indexterm>
|
|
|
|
<table id="datatype-character-table">
|
|
<title>Character Types</title>
|
|
<tgroup cols="2">
|
|
<thead>
|
|
<row>
|
|
<entry>Name</entry>
|
|
<entry>Description</entry>
|
|
</row>
|
|
</thead>
|
|
<tbody>
|
|
<row>
|
|
<entry><type>character varying(<replaceable>n</>)</type>, <type>varchar(<replaceable>n</>)</type></entry>
|
|
<entry>variable-length with limit</entry>
|
|
</row>
|
|
<row>
|
|
<entry><type>character(<replaceable>n</>)</type>, <type>char(<replaceable>n</>)</type></entry>
|
|
<entry>fixed-length, blank padded</entry>
|
|
</row>
|
|
<row>
|
|
<entry><type>text</type></entry>
|
|
<entry>variable unlimited length</entry>
|
|
</row>
|
|
</tbody>
|
|
</tgroup>
|
|
</table>
|
|
|
|
<para>
|
|
<xref linkend="datatype-character-table"> shows the
|
|
general-purpose character types available in
|
|
<productname>PostgreSQL</productname>.
|
|
</para>
|
|
|
|
<para>
|
|
<acronym>SQL</acronym> defines two primary character types:
|
|
<type>character varying(<replaceable>n</>)</type> and
|
|
<type>character(<replaceable>n</>)</type>, where <replaceable>n</>
|
|
is a positive integer. Both of these types can store strings up to
|
|
<replaceable>n</> characters (not bytes) in length. An attempt to store a
|
|
longer string into a column of these types will result in an
|
|
error, unless the excess characters are all spaces, in which case
|
|
the string will be truncated to the maximum length. (This somewhat
|
|
bizarre exception is required by the <acronym>SQL</acronym>
|
|
standard.) If the string to be stored is shorter than the declared
|
|
length, values of type <type>character</type> will be space-padded;
|
|
values of type <type>character varying</type> will simply store the
|
|
shorter
|
|
string.
|
|
</para>
|
|
|
|
<para>
|
|
If one explicitly casts a value to <type>character
|
|
varying(<replaceable>n</>)</type> or
|
|
<type>character(<replaceable>n</>)</type>, then an over-length
|
|
value will be truncated to <replaceable>n</> characters without
|
|
raising an error. (This too is required by the
|
|
<acronym>SQL</acronym> standard.)
|
|
</para>
|
|
|
|
<para>
|
|
The notations <type>varchar(<replaceable>n</>)</type> and
|
|
<type>char(<replaceable>n</>)</type> are aliases for <type>character
|
|
varying(<replaceable>n</>)</type> and
|
|
<type>character(<replaceable>n</>)</type>, respectively.
|
|
<type>character</type> without length specifier is equivalent to
|
|
<type>character(1)</type>. If <type>character varying</type> is used
|
|
without length specifier, the type accepts strings of any size. The
|
|
latter is a <productname>PostgreSQL</> extension.
|
|
</para>
|
|
|
|
<para>
|
|
In addition, <productname>PostgreSQL</productname> provides the
|
|
<type>text</type> type, which stores strings of any length.
|
|
Although the type <type>text</type> is not in the
|
|
<acronym>SQL</acronym> standard, several other SQL database
|
|
management systems have it as well.
|
|
</para>
|
|
|
|
<para>
|
|
Values of type <type>character</type> are physically padded
|
|
with spaces to the specified width <replaceable>n</>, and are
|
|
stored and displayed that way. However, the padding spaces are
|
|
treated as semantically insignificant. Trailing spaces are
|
|
disregarded when comparing two values of type <type>character</type>,
|
|
and they will be removed when converting a <type>character</type> value
|
|
to one of the other string types. Note that trailing spaces
|
|
<emphasis>are</> semantically significant in
|
|
<type>character varying</type> and <type>text</type> values.
|
|
</para>
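<para>
As a brief illustration of these padding rules (a sketch only; the
literal values are arbitrary and chosen just for this example):
<programlisting>
SELECT 'a '::character(2) = 'a'::character(2);   -- trailing spaces are disregarded
<computeroutput>
 ?column?
----------
 t
</computeroutput>

SELECT 'a '::character varying(2) = 'a'::character varying(2);   -- trailing space is significant
<computeroutput>
 ?column?
----------
 f
</computeroutput>

SELECT 'ab'::character(4)::text || '|';   -- padding is removed by the conversion to text
<computeroutput>
 ?column?
----------
 ab|
</computeroutput>
</programlisting>
</para>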
|
|
|
|
<para>
|
|
The storage requirement for a short string (up to 126 bytes) is 1 byte
|
|
plus the actual string, which includes the space padding in the case of
|
|
<type>character</type>. Longer strings have 4 bytes of overhead instead
|
|
of 1. Long strings are compressed by the system automatically, so
|
|
the physical requirement on disk might be less. Very long values are also
|
|
stored in background tables so that they do not interfere with rapid
|
|
access to shorter column values. In any case, the longest
|
|
possible character string that can be stored is about 1 GB. (The
|
|
maximum value that will be allowed for <replaceable>n</> in the data
|
|
type declaration is less than that. It wouldn't be useful to
|
|
change this because with multibyte character encodings the number of
|
|
characters and bytes can be quite different. If you desire to
|
|
store long strings with no specific upper limit, use
|
|
<type>text</type> or <type>character varying</type> without a length
|
|
specifier, rather than making up an arbitrary length limit.)
|
|
</para>
|
|
|
|
<tip>
|
|
<para>
|
|
There is no performance difference among these three types,
|
|
apart from increased storage space when using the blank-padded
|
|
type, and a few extra CPU cycles to check the length when storing into
|
|
a length-constrained column. While
|
|
<type>character(<replaceable>n</>)</type> has performance
|
|
advantages in some other database systems, there is no such advantage in
|
|
<productname>PostgreSQL</productname>; in fact
|
|
<type>character(<replaceable>n</>)</type> is usually the slowest of
|
|
the three because of its additional storage costs. In most situations
|
|
<type>text</type> or <type>character varying</type> should be used
|
|
instead.
|
|
</para>
|
|
</tip>
|
|
|
|
<para>
|
|
Refer to <xref linkend="sql-syntax-strings"> for information about
|
|
the syntax of string literals, and to <xref linkend="functions">
|
|
for information about available operators and functions. The
|
|
database character set determines the character set used to store
|
|
textual values; for more information on character set support,
|
|
refer to <xref linkend="multibyte">.
|
|
</para>
|
|
|
|
<example>
|
|
<title>Using the character types</title>
|
|
|
|
<programlisting>
|
|
CREATE TABLE test1 (a character(4));
|
|
INSERT INTO test1 VALUES ('ok');
|
|
SELECT a, char_length(a) FROM test1; -- <co id="co.datatype-char">
|
|
<computeroutput>
|
|
a | char_length
|
|
------+-------------
|
|
ok | 2
|
|
</computeroutput>
|
|
|
|
CREATE TABLE test2 (b varchar(5));
|
|
INSERT INTO test2 VALUES ('ok');
|
|
INSERT INTO test2 VALUES ('good ');
|
|
INSERT INTO test2 VALUES ('too long');
|
|
<computeroutput>ERROR: value too long for type character varying(5)</computeroutput>
|
|
INSERT INTO test2 VALUES ('too long'::varchar(5)); -- explicit truncation
|
|
SELECT b, char_length(b) FROM test2;
|
|
<computeroutput>
|
|
b | char_length
|
|
-------+-------------
|
|
ok | 2
|
|
good | 5
|
|
too l | 5
|
|
</computeroutput>
|
|
</programlisting>
|
|
<calloutlist>
|
|
<callout arearefs="co.datatype-char">
|
|
<para>
|
|
The <function>char_length</function> function is discussed in
|
|
<xref linkend="functions-string">.
|
|
</para>
|
|
</callout>
|
|
</calloutlist>
|
|
</example>
|
|
|
|
<para>
|
|
There are two other fixed-length character types in
|
|
<productname>PostgreSQL</productname>, shown in <xref
|
|
linkend="datatype-character-special-table">. The <type>name</type>
|
|
type exists <emphasis>only</emphasis> for the storage of identifiers
|
|
in the internal system catalogs and is not intended for use by the general user. Its
|
|
length is currently defined as 64 bytes (63 usable characters plus
|
|
terminator) but should be referenced using the constant
|
|
<symbol>NAMEDATALEN</symbol> in <literal>C</> source code.
|
|
The length is set at compile time (and
|
|
is therefore adjustable for special uses); the default maximum
|
|
length might change in a future release. The type <type>"char"</type>
|
|
(note the quotes) is different from <type>char(1)</type> in that it
|
|
only uses one byte of storage. It is internally used in the system
|
|
catalogs as a simplistic enumeration type.
|
|
</para>
|
|
|
|
<table id="datatype-character-special-table">
|
|
<title>Special Character Types</title>
|
|
<tgroup cols="3">
|
|
<thead>
|
|
<row>
|
|
<entry>Name</entry>
|
|
<entry>Storage Size</entry>
|
|
<entry>Description</entry>
|
|
</row>
|
|
</thead>
|
|
<tbody>
|
|
<row>
|
|
<entry><type>"char"</type></entry>
|
|
<entry>1 byte</entry>
|
|
<entry>single-byte internal type</entry>
|
|
</row>
|
|
<row>
|
|
<entry><type>name</type></entry>
|
|
<entry>64 bytes</entry>
|
|
<entry>internal type for object names</entry>
|
|
</row>
|
|
</tbody>
|
|
</tgroup>
|
|
</table>
|
|
|
|
</sect1>
|
|
|
|
<sect1 id="datatype-binary">
|
|
<title>Binary Data Types</title>
|
|
|
|
<indexterm zone="datatype-binary">
|
|
<primary>binary data</primary>
|
|
</indexterm>
|
|
|
|
<indexterm zone="datatype-binary">
|
|
<primary>bytea</primary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
The <type>bytea</type> data type allows storage of binary strings;
|
|
see <xref linkend="datatype-binary-table">.
|
|
</para>
|
|
|
|
<table id="datatype-binary-table">
|
|
<title>Binary Data Types</title>
|
|
<tgroup cols="3">
|
|
<thead>
|
|
<row>
|
|
<entry>Name</entry>
|
|
<entry>Storage Size</entry>
|
|
<entry>Description</entry>
|
|
</row>
|
|
</thead>
|
|
<tbody>
|
|
<row>
|
|
<entry><type>bytea</type></entry>
|
|
<entry>1 or 4 bytes plus the actual binary string</entry>
|
|
<entry>variable-length binary string</entry>
|
|
</row>
|
|
</tbody>
|
|
</tgroup>
|
|
</table>
|
|
|
|
<para>
|
|
A binary string is a sequence of octets (or bytes). Binary
|
|
strings are distinguished from character strings in two
|
|
ways. First, binary strings specifically allow storing
|
|
octets of value zero and other <quote>non-printable</quote>
|
|
octets (usually, octets outside the range 32 to 126).
|
|
Character strings disallow zero octets, and also disallow any
|
|
other octet values and sequences of octet values that are invalid
|
|
according to the database's selected character set encoding.
|
|
Second, operations on binary strings process the actual bytes,
|
|
whereas the processing of character strings depends on locale settings.
|
|
In short, binary strings are appropriate for storing data that the
|
|
programmer thinks of as <quote>raw bytes</>, whereas character
|
|
strings are appropriate for storing text.
|
|
</para>
|
|
|
|
<para>
|
|
The <type>bytea</type> type supports two external formats for
|
|
input and output: <productname>PostgreSQL</productname>'s historical
|
|
<quote>escape</quote> format, and <quote>hex</quote> format. Both
|
|
of these are always accepted on input. The output format depends
|
|
on the configuration parameter <xref linkend="guc-bytea-output">;
|
|
the default is hex. (Note that the hex format was introduced in
|
|
<productname>PostgreSQL</productname> 8.5; earlier versions and some
|
|
tools don't understand it.)
|
|
</para>
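<para>
For illustration, the following sketch displays the same five-octet
value under each output setting (the sample value is arbitrary):
<programlisting>
SET bytea_output = 'hex';
SELECT E'\\000\\001abc'::bytea;
<computeroutput>
    bytea
--------------
 \x0001616263
</computeroutput>

SET bytea_output = 'escape';
SELECT E'\\000\\001abc'::bytea;
<computeroutput>
    bytea
-------------
 \000\001abc
</computeroutput>
</programlisting>
The stored value is identical in both cases; only its textual
representation differs.
</para>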
|
|
|
|
<para>
|
|
The <acronym>SQL</acronym> standard defines a different binary
|
|
string type, called <type>BLOB</type> or <type>BINARY LARGE
|
|
OBJECT</type>. The input format is different from
|
|
<type>bytea</type>, but the provided functions and operators are
|
|
mostly the same.
|
|
</para>
|
|
|
|
<sect2>
|
|
<title><type>bytea</> hex format</title>
|
|
|
|
<para>
|
|
The <quote>hex</> format encodes binary data as 2 hexadecimal digits
|
|
per byte, most significant nibble first. The entire string is
|
|
preceded by the sequence <literal>\x</literal> (to distinguish it
|
|
from the escape format). In some contexts, the initial backslash may
|
|
need to be escaped by doubling it, in the same cases in which backslashes
|
|
have to be doubled in escape format; details appear below.
|
|
The hexadecimal digits can
|
|
be either upper or lower case, and whitespace is permitted between
|
|
digit pairs (but not within a digit pair nor in the starting
|
|
<literal>\x</literal> sequence).
|
|
The hex format is compatible with a wide
|
|
range of external applications and protocols, and it tends to be
|
|
faster to convert than the escape format, so its use is preferred.
|
|
</para>
|
|
|
|
<para>
|
|
Example:
|
|
<programlisting>
|
|
SELECT E'\\xDEADBEEF';
|
|
</programlisting>
|
|
</para>
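<para>
As a further sketch of the input rules (the value is arbitrary),
digits of either case and whitespace between digit pairs are accepted
on input, while hex output is written with lower-case digits:
<programlisting>
SET bytea_output = 'hex';
SELECT E'\\xde AD be EF'::bytea;
<computeroutput>
   bytea
------------
 \xdeadbeef
</computeroutput>
</programlisting>
</para>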
|
|
</sect2>
|
|
|
|
<sect2>
|
|
<title><type>bytea</> escape format</title>
|
|
|
|
<para>
|
|
The <quote>escape</quote> format is the traditional
|
|
<productname>PostgreSQL</productname> format for the <type>bytea</type>
|
|
type. It
|
|
takes the approach of representing a binary string as a sequence
|
|
of ASCII characters, while converting those bytes that cannot be
|
|
represented as an ASCII character into special escape sequences.
|
|
If, from the point of view of the application, representing bytes
|
|
as characters makes sense, then this representation can be
|
|
convenient. But in practice it is usually confusing because it
|
|
blurs the distinction between binary strings and character
|
|
strings, and also the particular escape mechanism that was chosen is
|
|
somewhat unwieldy. So this format should probably be avoided
|
|
for most new applications.
|
|
</para>
|
|
|
|
<para>
|
|
When entering <type>bytea</type> values in escape format,
|
|
octets of certain
|
|
values <emphasis>must</emphasis> be escaped, while all octet
|
|
values <emphasis>can</emphasis> be escaped. In
|
|
general, to escape an octet, convert it into its three-digit
|
|
octal value and precede it
|
|
by a backslash (or two backslashes, if writing the value as a
|
|
literal using escape string syntax).
|
|
Backslash itself (octet value 92) can alternatively be represented by
|
|
double backslashes.
|
|
<xref linkend="datatype-binary-sqlesc">
|
|
shows the characters that must be escaped, and gives the alternative
|
|
escape sequences where applicable.
|
|
</para>
|
|
|
|
<table id="datatype-binary-sqlesc">
|
|
<title><type>bytea</> Literal Escaped Octets</title>
|
|
<tgroup cols="5">
|
|
<thead>
|
|
<row>
|
|
<entry>Decimal Octet Value</entry>
|
|
<entry>Description</entry>
|
|
<entry>Escaped Input Representation</entry>
|
|
<entry>Example</entry>
|
|
<entry>Output Representation</entry>
|
|
</row>
|
|
</thead>
|
|
|
|
<tbody>
|
|
<row>
|
|
<entry>0</entry>
|
|
<entry>zero octet</entry>
|
|
<entry><literal>E'\\000'</literal></entry>
|
|
<entry><literal>SELECT E'\\000'::bytea;</literal></entry>
|
|
<entry><literal>\000</literal></entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>39</entry>
|
|
<entry>single quote</entry>
|
|
<entry><literal>''''</literal> or <literal>E'\\047'</literal></entry>
|
|
<entry><literal>SELECT E'\''::bytea;</literal></entry>
|
|
<entry><literal>'</literal></entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>92</entry>
|
|
<entry>backslash</entry>
|
|
<entry><literal>E'\\\\'</literal> or <literal>E'\\134'</literal></entry>
|
|
<entry><literal>SELECT E'\\\\'::bytea;</literal></entry>
|
|
<entry><literal>\\</literal></entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0 to 31 and 127 to 255</entry>
|
|
<entry><quote>non-printable</quote> octets</entry>
|
|
<entry><literal>E'\\<replaceable>xxx</>'</literal> (octal value)</entry>
|
|
<entry><literal>SELECT E'\\001'::bytea;</literal></entry>
|
|
<entry><literal>\001</literal></entry>
|
|
</row>
|
|
|
|
</tbody>
|
|
</tgroup>
|
|
</table>
|
|
|
|
<para>
|
|
The requirement to escape <emphasis>non-printable</emphasis> octets
|
|
varies depending on locale settings. In some instances you can get away
|
|
with leaving them unescaped. Note that the result in each of the examples
|
|
in <xref linkend="datatype-binary-sqlesc"> was exactly one octet in
|
|
length, even though the output representation is sometimes
|
|
more than one character.
|
|
</para>
|
|
|
|
<para>
|
|
The reason multiple backslashes are required, as shown
|
|
in <xref linkend="datatype-binary-sqlesc">, is that an input
|
|
string written as a string literal must pass through two parse
|
|
phases in the <productname>PostgreSQL</productname> server.
|
|
The first backslash of each pair is interpreted as an escape
|
|
character by the string-literal parser (assuming escape string
|
|
syntax is used) and is therefore consumed, leaving the second backslash of the
|
|
pair. (Dollar-quoted strings can be used to avoid this level
|
|
of escaping.) The remaining backslash is then recognized by the
|
|
<type>bytea</type> input function as either starting a three-digit
|
|
octal value or escaping another backslash. For example,
|
|
a string literal passed to the server as <literal>E'\\001'</literal>
|
|
becomes <literal>\001</literal> after passing through the
|
|
escape string parser. The <literal>\001</literal> is then sent
|
|
to the <type>bytea</type> input function, where it is converted
|
|
to a single octet with a decimal value of 1. Note that the
|
|
single-quote character is not treated specially by <type>bytea</type>,
|
|
so it follows the normal rules for string literals. (See also
|
|
<xref linkend="sql-syntax-strings">.)
|
|
</para>
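<para>
The following sketch writes the same octets with escape string syntax
and with dollar quoting; <function>octet_length</function> confirms
that each result is exactly one octet long (the column aliases are
purely illustrative):
<programlisting>
SELECT octet_length(E'\\001'::bytea)  AS escape_string,   -- two parse phases
       octet_length($$\001$$::bytea)  AS dollar_quoted,   -- only the bytea input function
       octet_length(E'\\\\'::bytea)   AS backslash;       -- one backslash octet
<computeroutput>
 escape_string | dollar_quoted | backslash
---------------+---------------+-----------
             1 |             1 |         1
</computeroutput>
</programlisting>
</para>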
|
|
|
|
<para>
|
|
Some <type>bytea</type> octets are escaped on output. In general, each
|
|
<quote>non-printable</quote> octet is converted into
|
|
its equivalent three-digit octal value and preceded by one backslash.
|
|
Most <quote>printable</quote> octets are represented by their standard
|
|
representation in the client character set. The octet with decimal
|
|
value 92 (backslash) is doubled in the output.
|
|
Details are in <xref linkend="datatype-binary-resesc">.
|
|
</para>
|
|
|
|
<table id="datatype-binary-resesc">
|
|
<title><type>bytea</> Output Escaped Octets</title>
|
|
<tgroup cols="5">
|
|
<thead>
|
|
<row>
|
|
<entry>Decimal Octet Value</entry>
|
|
<entry>Description</entry>
|
|
<entry>Escaped Output Representation</entry>
|
|
<entry>Example</entry>
|
|
<entry>Output Result</entry>
|
|
</row>
|
|
</thead>
|
|
|
|
<tbody>
|
|
|
|
<row>
|
|
<entry>92</entry>
|
|
<entry>backslash</entry>
|
|
<entry><literal>\\</literal></entry>
|
|
<entry><literal>SELECT E'\\134'::bytea;</literal></entry>
|
|
<entry><literal>\\</literal></entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0 to 31 and 127 to 255</entry>
|
|
<entry><quote>non-printable</quote> octets</entry>
|
|
<entry><literal>\<replaceable>xxx</></literal> (octal value)</entry>
|
|
<entry><literal>SELECT E'\\001'::bytea;</literal></entry>
|
|
<entry><literal>\001</literal></entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>32 to 126</entry>
|
|
<entry><quote>printable</quote> octets</entry>
|
|
<entry>client character set representation</entry>
|
|
<entry><literal>SELECT E'\\176'::bytea;</literal></entry>
|
|
<entry><literal>~</literal></entry>
|
|
</row>
|
|
|
|
</tbody>
|
|
</tgroup>
|
|
</table>
|
|
|
|
<para>
|
|
Depending on the front end to <productname>PostgreSQL</> you use,
|
|
you might have additional work to do in terms of escaping and
|
|
unescaping <type>bytea</type> strings. For example, you might also
|
|
have to escape line feeds and carriage returns if your interface
|
|
automatically translates these.
|
|
</para>
|
|
</sect2>
|
|
</sect1>
|
|
|
|
|
|
<sect1 id="datatype-datetime">
|
|
<title>Date/Time Types</title>
|
|
|
|
<indexterm zone="datatype-datetime">
|
|
<primary>date</primary>
|
|
</indexterm>
|
|
<indexterm zone="datatype-datetime">
|
|
<primary>time</primary>
|
|
</indexterm>
|
|
<indexterm zone="datatype-datetime">
|
|
<primary>time without time zone</primary>
|
|
</indexterm>
|
|
<indexterm zone="datatype-datetime">
|
|
<primary>time with time zone</primary>
|
|
</indexterm>
|
|
<indexterm zone="datatype-datetime">
|
|
<primary>timestamp</primary>
|
|
</indexterm>
|
|
<indexterm zone="datatype-datetime">
|
|
<primary>timestamp with time zone</primary>
|
|
</indexterm>
|
|
<indexterm zone="datatype-datetime">
|
|
<primary>timestamp without time zone</primary>
|
|
</indexterm>
|
|
<indexterm zone="datatype-datetime">
|
|
<primary>interval</primary>
|
|
</indexterm>
|
|
<indexterm zone="datatype-datetime">
|
|
<primary>time span</primary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
<productname>PostgreSQL</productname> supports the full set of
|
|
<acronym>SQL</acronym> date and time types, shown in <xref
|
|
linkend="datatype-datetime-table">. The operations available
|
|
on these data types are described in
|
|
<xref linkend="functions-datetime">.
|
|
</para>
|
|
|
|
<table id="datatype-datetime-table">
|
|
<title>Date/Time Types</title>
|
|
<tgroup cols="6">
|
|
<thead>
|
|
<row>
|
|
<entry>Name</entry>
|
|
<entry>Storage Size</entry>
|
|
<entry>Description</entry>
|
|
<entry>Low Value</entry>
|
|
<entry>High Value</entry>
|
|
<entry>Resolution</entry>
|
|
</row>
|
|
</thead>
|
|
<tbody>
|
|
<row>
|
|
<entry><type>timestamp [ (<replaceable>p</replaceable>) ] [ without time zone ]</type></entry>
|
|
<entry>8 bytes</entry>
|
|
<entry>both date and time (no time zone)</entry>
|
|
<entry>4713 BC</entry>
|
|
<entry>294276 AD</entry>
|
|
<entry>1 microsecond / 14 digits</entry>
|
|
</row>
|
|
<row>
|
|
<entry><type>timestamp [ (<replaceable>p</replaceable>) ] with time zone</type></entry>
|
|
<entry>8 bytes</entry>
|
|
<entry>both date and time, with time zone</entry>
|
|
<entry>4713 BC</entry>
|
|
<entry>294276 AD</entry>
|
|
<entry>1 microsecond / 14 digits</entry>
|
|
</row>
|
|
<row>
|
|
<entry><type>date</type></entry>
|
|
<entry>4 bytes</entry>
|
|
<entry>date (no time of day)</entry>
|
|
<entry>4713 BC</entry>
|
|
<entry>5874897 AD</entry>
|
|
<entry>1 day</entry>
|
|
</row>
|
|
<row>
|
|
<entry><type>time [ (<replaceable>p</replaceable>) ] [ without time zone ]</type></entry>
|
|
<entry>8 bytes</entry>
|
|
<entry>time of day (no date)</entry>
|
|
<entry>00:00:00</entry>
|
|
<entry>24:00:00</entry>
|
|
<entry>1 microsecond / 14 digits</entry>
|
|
</row>
|
|
<row>
|
|
<entry><type>time [ (<replaceable>p</replaceable>) ] with time zone</type></entry>
|
|
<entry>12 bytes</entry>
|
|
<entry>time of day only, with time zone</entry>
|
|
<entry>00:00:00+1459</entry>
|
|
<entry>24:00:00-1459</entry>
|
|
<entry>1 microsecond / 14 digits</entry>
|
|
</row>
|
|
<row>
|
|
<entry><type>interval [ <replaceable>fields</replaceable> ] [ (<replaceable>p</replaceable>) ]</type></entry>
|
|
<entry>12 bytes</entry>
|
|
<entry>time interval</entry>
|
|
<entry>-178000000 years</entry>
|
|
<entry>178000000 years</entry>
|
|
<entry>1 microsecond / 14 digits</entry>
|
|
</row>
|
|
</tbody>
|
|
</tgroup>
|
|
</table>
|
|
|
|
<note>
|
|
<para>
|
|
Prior to <productname>PostgreSQL</productname> 7.3, writing just
|
|
<type>timestamp</type> was equivalent to <type>timestamp with
|
|
time zone</type>. This was changed for SQL compliance.
|
|
</para>
|
|
</note>
|
|
|
|
<para>
|
|
<type>time</type>, <type>timestamp</type>, and
|
|
<type>interval</type> accept an optional precision value
|
|
<replaceable>p</replaceable> which specifies the number of
|
|
fractional digits retained in the seconds field. By default, there
|
|
is no explicit bound on precision. The allowed range of
|
|
<replaceable>p</replaceable> is from 0 to 6 for the
|
|
<type>timestamp</type> and <type>interval</type> types.
|
|
</para>
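<para>
For example, when the precision is smaller than the number of
fractional digits supplied, the value is rounded (a sketch; the output
shown assumes the <literal>ISO</> date style):
<programlisting>
SELECT '2004-10-19 10:23:54.123456'::timestamp(2);
<computeroutput>
       timestamp
------------------------
 2004-10-19 10:23:54.12
</computeroutput>

SELECT '04:05:06.789'::time(0);
<computeroutput>
   time
----------
 04:05:07
</computeroutput>
</programlisting>
</para>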
|
|
|
|
<note>
|
|
<para>
|
|
When <type>timestamp</> values are stored as eight-byte integers
|
|
(currently the default), microsecond precision is available over
|
|
the full range of values. When <type>timestamp</> values are
|
|
stored as double precision floating-point numbers instead (a
|
|
deprecated compile-time option), the effective limit of precision
|
|
might be less than 6. <type>timestamp</type> values are stored as
|
|
seconds before or after midnight 2000-01-01. When
|
|
<type>timestamp</type> values are implemented using floating-point
|
|
numbers, microsecond precision is achieved for dates within a few
|
|
years of 2000-01-01, but the precision degrades for dates further
|
|
away. Note that using floating-point datetimes allows a larger
|
|
range of <type>timestamp</type> values to be represented than
|
|
shown above: from 4713 BC up to 5874897 AD.
|
|
</para>
|
|
|
|
<para>
|
|
The same compile-time option also determines whether
|
|
<type>time</type> and <type>interval</type> values are stored as
|
|
floating-point numbers or eight-byte integers. In the
|
|
floating-point case, large <type>interval</type> values degrade in
|
|
precision as the size of the interval increases.
|
|
</para>
|
|
</note>
|
|
|
|
<para>
|
|
For the <type>time</type> types, the allowed range of
|
|
<replaceable>p</replaceable> is from 0 to 6 when eight-byte integer
|
|
storage is used, or from 0 to 10 when floating-point storage is used.
|
|
</para>
|
|
|
|
<para>
|
|
The <type>interval</type> type has an additional option, which is
|
|
to restrict the set of stored fields by writing one of these phrases:
|
|
<programlisting>
|
|
YEAR
|
|
MONTH
|
|
DAY
|
|
HOUR
|
|
MINUTE
|
|
SECOND
|
|
YEAR TO MONTH
|
|
DAY TO HOUR
|
|
DAY TO MINUTE
|
|
DAY TO SECOND
|
|
HOUR TO MINUTE
|
|
HOUR TO SECOND
|
|
MINUTE TO SECOND
|
|
</programlisting>
|
|
Note that if both <replaceable>fields</replaceable> and
|
|
<replaceable>p</replaceable> are specified, the
|
|
<replaceable>fields</replaceable> must include <literal>SECOND</>,
|
|
since the precision applies only to the seconds.
|
|
</para>
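<para>
As a sketch of field restriction (the output shown assumes the
<literal>postgres</> interval output style), fields to the right of
the least significant field named are discarded, while fields to the
left are kept:
<programlisting>
SELECT INTERVAL '1 day 2:03:04' HOUR TO MINUTE;   -- seconds dropped, day retained
<computeroutput>
    interval
----------------
 1 day 02:03:00
</computeroutput>
</programlisting>
</para>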
|
|
|
|
<para>
|
|
The type <type>time with time zone</type> is defined by the SQL
|
|
standard, but the definition exhibits properties which lead to
|
|
questionable usefulness. In most cases, a combination of
|
|
<type>date</type>, <type>time</type>, <type>timestamp without time
|
|
zone</type>, and <type>timestamp with time zone</type> should
|
|
provide a complete range of date/time functionality required by
|
|
any application.
|
|
</para>
|
|
|
|
<para>
|
|
The types <type>abstime</type>
|
|
and <type>reltime</type> are lower precision types which are used internally.
|
|
You are discouraged from using these types in
|
|
applications; these internal types
|
|
might disappear in a future release.
|
|
</para>
|
|
|
|
<sect2 id="datatype-datetime-input">
|
|
<title>Date/Time Input</title>
|
|
|
|
<para>
|
|
Date and time input is accepted in almost any reasonable format, including
|
|
ISO 8601, <acronym>SQL</acronym>-compatible,
|
|
traditional <productname>POSTGRES</productname>, and others.
|
|
For some formats, ordering of day, month, and year in date input is
|
|
ambiguous and there is support for specifying the expected
|
|
ordering of these fields. Set the <xref linkend="guc-datestyle"> parameter
|
|
to <literal>MDY</> to select month-day-year interpretation,
|
|
<literal>DMY</> to select day-month-year interpretation, or
|
|
<literal>YMD</> to select year-month-day interpretation.
|
|
</para>
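<para>
For instance, the same ambiguous input is read differently under the
two settings below (a sketch; <literal>ISO</> is included in the
setting only to fix the output format):
<programlisting>
SET datestyle = 'ISO, MDY';
SELECT '1/8/1999'::date;    -- month-day-year
<computeroutput>
    date
------------
 1999-01-08
</computeroutput>

SET datestyle = 'ISO, DMY';
SELECT '1/8/1999'::date;    -- day-month-year
<computeroutput>
    date
------------
 1999-08-01
</computeroutput>
</programlisting>
</para>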
|
|
|
|
<para>
|
|
<productname>PostgreSQL</productname> is more flexible in
|
|
handling date/time input than the
|
|
<acronym>SQL</acronym> standard requires.
|
|
See <xref linkend="datetime-appendix">
|
|
for the exact parsing rules of date/time input and for the
|
|
recognized text fields including months, days of the week, and
|
|
time zones.
|
|
</para>
|
|
|
|
<para>
|
|
Remember that any date or time literal input needs to be enclosed
|
|
in single quotes, like text strings. Refer to
|
|
<xref linkend="sql-syntax-constants-generic"> for more
|
|
information.
|
|
<acronym>SQL</acronym> requires the following syntax
|
|
<synopsis>
|
|
<replaceable>type</replaceable> [ (<replaceable>p</replaceable>) ] '<replaceable>value</replaceable>'
|
|
</synopsis>
|
|
where <replaceable>p</replaceable> is an optional precision
|
|
specification giving the number of
|
|
fractional digits in the seconds field. Precision can be
|
|
specified for <type>time</type>, <type>timestamp</type>, and
|
|
<type>interval</type> types. The allowed values are mentioned
|
|
above. If no precision is specified in a constant specification,
|
|
it defaults to the precision of the literal value.
|
|
</para>
|
|
|
|
<sect3>
|
|
<title>Dates</title>
|
|
|
|
<indexterm>
|
|
<primary>date</primary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
<xref linkend="datatype-datetime-date-table"> shows some possible
|
|
inputs for the <type>date</type> type.
|
|
</para>
|
|
|
|
<table id="datatype-datetime-date-table">
|
|
<title>Date Input</title>
|
|
<tgroup cols="2">
|
|
<thead>
|
|
<row>
|
|
<entry>Example</entry>
|
|
<entry>Description</entry>
|
|
</row>
|
|
</thead>
|
|
<tbody>
|
|
<row>
|
|
<entry>1999-01-08</entry>
|
|
<entry>ISO 8601; January 8 in any mode
|
|
(recommended format)</entry>
|
|
</row>
|
|
<row>
|
|
<entry>January 8, 1999</entry>
|
|
<entry>unambiguous in any <varname>datestyle</varname> input mode</entry>
|
|
</row>
|
|
<row>
|
|
<entry>1/8/1999</entry>
|
|
<entry>January 8 in <literal>MDY</> mode;
|
|
August 1 in <literal>DMY</> mode</entry>
|
|
</row>
|
|
<row>
|
|
<entry>1/18/1999</entry>
|
|
<entry>January 18 in <literal>MDY</> mode;
|
|
rejected in other modes</entry>
|
|
</row>
|
|
<row>
|
|
<entry>01/02/03</entry>
|
|
<entry>January 2, 2003 in <literal>MDY</> mode;
|
|
February 1, 2003 in <literal>DMY</> mode;
|
|
February 3, 2001 in <literal>YMD</> mode
|
|
</entry>
|
|
</row>
|
|
<row>
|
|
<entry>1999-Jan-08</entry>
|
|
<entry>January 8 in any mode</entry>
|
|
</row>
|
|
<row>
|
|
<entry>Jan-08-1999</entry>
|
|
<entry>January 8 in any mode</entry>
|
|
</row>
|
|
<row>
|
|
<entry>08-Jan-1999</entry>
|
|
<entry>January 8 in any mode</entry>
|
|
</row>
|
|
<row>
|
|
<entry>99-Jan-08</entry>
|
|
<entry>January 8 in <literal>YMD</> mode, else error</entry>
|
|
</row>
|
|
<row>
|
|
<entry>08-Jan-99</entry>
|
|
<entry>January 8, except error in <literal>YMD</> mode</entry>
|
|
</row>
|
|
<row>
|
|
<entry>Jan-08-99</entry>
|
|
<entry>January 8, except error in <literal>YMD</> mode</entry>
|
|
</row>
|
|
<row>
|
|
<entry>19990108</entry>
|
|
<entry>ISO 8601; January 8, 1999 in any mode</entry>
|
|
</row>
|
|
<row>
|
|
<entry>990108</entry>
|
|
<entry>ISO 8601; January 8, 1999 in any mode</entry>
|
|
</row>
|
|
<row>
|
|
<entry>1999.008</entry>
|
|
<entry>year and day of year</entry>
|
|
</row>
|
|
<row>
|
|
<entry>J2451187</entry>
|
|
<entry>Julian day</entry>
|
|
</row>
|
|
<row>
|
|
<entry>January 8, 99 BC</entry>
|
|
<entry>year 99 BC</entry>
|
|
</row>
|
|
</tbody>
|
|
</tgroup>
|
|
</table>
|
|
</sect3>
|
|
|
|
<sect3>
|
|
<title>Times</title>
|
|
|
|
<indexterm>
|
|
<primary>time</primary>
|
|
</indexterm>
|
|
<indexterm>
|
|
<primary>time without time zone</primary>
|
|
</indexterm>
|
|
<indexterm>
|
|
<primary>time with time zone</primary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
The time-of-day types are <type>time [
|
|
(<replaceable>p</replaceable>) ] without time zone</type> and
|
|
<type>time [ (<replaceable>p</replaceable>) ] with time
|
|
zone</type>. <type>time</type> alone is equivalent to
|
|
<type>time without time zone</type>.
|
|
</para>
|
|
|
|
<para>
|
|
Valid input for these types consists of a time of day followed
|
|
by an optional time zone. (See <xref
|
|
linkend="datatype-datetime-time-table">
|
|
and <xref linkend="datatype-timezone-table">.) If a time zone is
|
|
specified in the input for <type>time without time zone</type>,
|
|
it is silently ignored. You can also specify a date but it will
|
|
be ignored, except when you use a time zone name that involves a
|
|
daylight-savings rule, such as
|
|
<literal>America/New_York</literal>. In this case specifying the date
|
|
is required in order to determine whether standard or daylight-savings
|
|
time applies. The appropriate time zone offset is recorded in the
|
|
<type>time with time zone</type> value.
|
|
</para>
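<para>
A sketch of the daylight-savings case: the two dates below are chosen
only to fall on either side of the 2003 spring change in the
<literal>America/New_York</literal> zone, so the recorded offsets differ:
<programlisting>
SELECT '2003-01-12 04:05:06 America/New_York'::time with time zone;   -- standard time
<computeroutput>
   timetz
-------------
 04:05:06-05
</computeroutput>

SELECT '2003-04-12 04:05:06 America/New_York'::time with time zone;   -- daylight-savings time
<computeroutput>
   timetz
-------------
 04:05:06-04
</computeroutput>
</programlisting>
</para>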
|
|
|
|
<table id="datatype-datetime-time-table">
|
|
<title>Time Input</title>
|
|
<tgroup cols="2">
|
|
<thead>
|
|
<row>
|
|
<entry>Example</entry>
|
|
<entry>Description</entry>
|
|
</row>
|
|
</thead>
|
|
<tbody>
|
|
<row>
|
|
<entry><literal>04:05:06.789</literal></entry>
|
|
<entry>ISO 8601</entry>
|
|
</row>
|
|
<row>
|
|
<entry><literal>04:05:06</literal></entry>
|
|
<entry>ISO 8601</entry>
|
|
</row>
|
|
<row>
|
|
<entry><literal>04:05</literal></entry>
|
|
<entry>ISO 8601</entry>
|
|
</row>
|
|
<row>
|
|
<entry><literal>040506</literal></entry>
|
|
<entry>ISO 8601</entry>
|
|
</row>
|
|
<row>
|
|
<entry><literal>04:05 AM</literal></entry>
|
|
<entry>same as 04:05; AM does not affect value</entry>
|
|
</row>
|
|
<row>
|
|
<entry><literal>04:05 PM</literal></entry>
|
|
<entry>same as 16:05; input hour must be &lt;= 12</entry>
|
|
</row>
|
|
<row>
|
|
<entry><literal>04:05:06.789-8</literal></entry>
|
|
<entry>ISO 8601</entry>
|
|
</row>
|
|
<row>
|
|
<entry><literal>04:05:06-08:00</literal></entry>
|
|
<entry>ISO 8601</entry>
|
|
</row>
|
|
<row>
|
|
<entry><literal>04:05-08:00</literal></entry>
|
|
<entry>ISO 8601</entry>
|
|
</row>
|
|
<row>
|
|
<entry><literal>040506-08</literal></entry>
|
|
<entry>ISO 8601</entry>
|
|
</row>
|
|
<row>
|
|
<entry><literal>04:05:06 PST</literal></entry>
|
|
<entry>time zone specified by abbreviation</entry>
|
|
</row>
|
|
<row>
|
|
<entry><literal>2003-04-12 04:05:06 America/New_York</literal></entry>
|
|
<entry>time zone specified by full name</entry>
|
|
</row>
|
|
</tbody>
|
|
</tgroup>
|
|
</table>
|
|
|
|
<table tocentry="1" id="datatype-timezone-table">
|
|
<title>Time Zone Input</title>
|
|
<tgroup cols="2">
|
|
<thead>
|
|
<row>
|
|
<entry>Example</entry>
|
|
<entry>Description</entry>
|
|
</row>
|
|
</thead>
|
|
<tbody>
|
|
<row>
|
|
<entry><literal>PST</literal></entry>
|
|
<entry>Abbreviation (for Pacific Standard Time)</entry>
|
|
</row>
|
|
<row>
|
|
<entry><literal>America/New_York</literal></entry>
|
|
<entry>Full time zone name</entry>
|
|
</row>
|
|
<row>
|
|
<entry><literal>PST8PDT</literal></entry>
|
|
<entry>POSIX-style time zone specification</entry>
|
|
</row>
|
|
<row>
|
|
<entry><literal>-8:00</literal></entry>
|
|
<entry>ISO-8601 offset for PST</entry>
|
|
</row>
|
|
<row>
|
|
<entry><literal>-800</literal></entry>
|
|
<entry>ISO-8601 offset for PST</entry>
|
|
</row>
|
|
<row>
|
|
<entry><literal>-8</literal></entry>
|
|
<entry>ISO-8601 offset for PST</entry>
|
|
</row>
|
|
<row>
|
|
<entry><literal>zulu</literal></entry>
|
|
<entry>Military abbreviation for UTC</entry>
|
|
</row>
|
|
<row>
|
|
<entry><literal>z</literal></entry>
|
|
<entry>Short form of <literal>zulu</literal></entry>
|
|
</row>
|
|
</tbody>
|
|
</tgroup>
|
|
</table>
|
|
|
|
<para>
|
|
Refer to <xref linkend="datatype-timezones"> for more information on how
|
|
to specify time zones.
|
|
</para>
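<para>
As an illustrative sketch, a few of the inputs from the tables above
can be tried directly. The results in the comments assume the default
<literal>ISO</literal> output style:
<programlisting>
SELECT TIME '04:05 PM';                       -- 16:05:00
SELECT TIME '040506';                         -- 04:05:06
SELECT TIME WITH TIME ZONE '04:05:06 PST';    -- 04:05:06-08
</programlisting>
</para>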
|
|
</sect3>
|
|
|
|
<sect3>
|
|
<title>Time Stamps</title>
|
|
|
|
<indexterm>
|
|
<primary>timestamp</primary>
|
|
</indexterm>
|
|
|
|
<indexterm>
|
|
<primary>timestamp with time zone</primary>
|
|
</indexterm>
|
|
|
|
<indexterm>
|
|
<primary>timestamp without time zone</primary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
Valid input for the time stamp types consists of the concatenation
|
|
of a date and a time, followed by an optional time zone,
|
|
followed by an optional <literal>AD</literal> or <literal>BC</literal>.
|
|
(Alternatively, <literal>AD</literal>/<literal>BC</literal> can appear
|
|
before the time zone, but this is not the preferred ordering.)
|
|
Thus:
|
|
|
|
<programlisting>
|
|
1999-01-08 04:05:06
|
|
</programlisting>
|
|
and:
|
|
<programlisting>
|
|
1999-01-08 04:05:06 -8:00
|
|
</programlisting>
|
|
|
|
are valid values, which follow the <acronym>ISO</acronym> 8601
|
|
standard. In addition, the common format:
|
|
<programlisting>
|
|
January 8 04:05:06 1999 PST
|
|
</programlisting>
|
|
is supported.
|
|
</para>
|
|
|
|
<para>
|
|
The <acronym>SQL</acronym> standard differentiates
|
|
<type>timestamp without time zone</type>
|
|
and <type>timestamp with time zone</type> literals by the presence of a
|
|
<quote>+</quote> or <quote>-</quote> symbol and time zone offset after
|
|
the time. Hence, according to the standard,
|
|
|
|
<programlisting>TIMESTAMP '2004-10-19 10:23:54'</programlisting>
|
|
|
|
is a <type>timestamp without time zone</type>, while
|
|
|
|
<programlisting>TIMESTAMP '2004-10-19 10:23:54+02'</programlisting>
|
|
|
|
is a <type>timestamp with time zone</type>.
|
|
<productname>PostgreSQL</productname> never examines the content of a
|
|
literal string before determining its type, and therefore will treat
|
|
both of the above as <type>timestamp without time zone</type>. To
|
|
ensure that a literal is treated as <type>timestamp with time
|
|
zone</type>, give it the correct explicit type:
|
|
|
|
<programlisting>TIMESTAMP WITH TIME ZONE '2004-10-19 10:23:54+02'</programlisting>
|
|
|
|
In a literal that has been determined to be <type>timestamp without time
|
|
zone</type>, <productname>PostgreSQL</productname> will silently ignore
|
|
any time zone indication.
|
|
That is, the resulting value is derived from the date/time
|
|
fields in the input value, and is not adjusted for time zone.
|
|
</para>
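<para>
The silent discarding of a time zone indication can be observed with a
query such as the following. This is a minimal sketch; the result shown
assumes the default <literal>ISO</literal> output style:
<programlisting>
-- The +02 is ignored because the literal is typed as
-- timestamp without time zone.
SELECT TIMESTAMP '2004-10-19 10:23:54+02';
-- 2004-10-19 10:23:54
</programlisting>
</para>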
|
|
|
|
<para>
|
|
For <type>timestamp with time zone</type>, the internally stored
|
|
value is always in UTC (Coordinated
Universal Time, traditionally known as Greenwich Mean Time,
|
|
<acronym>GMT</>). An input value that has an explicit
|
|
time zone specified is converted to UTC using the appropriate offset
|
|
for that time zone. If no time zone is stated in the input string,
|
|
then it is assumed to be in the time zone indicated by the system's
|
|
<xref linkend="guc-timezone"> parameter, and is converted to UTC using the
|
|
offset for the <varname>timezone</> zone.
|
|
</para>
|
|
|
|
<para>
|
|
When a <type>timestamp with time
|
|
zone</type> value is output, it is always converted from UTC to the
|
|
current <varname>timezone</> zone, and displayed as local time in that
|
|
zone. To see the time in another time zone, either change
|
|
<varname>timezone</> or use the <literal>AT TIME ZONE</> construct
|
|
(see <xref linkend="functions-datetime-zoneconvert">).
|
|
</para>
|
|
|
|
<para>
|
|
Conversions between <type>timestamp without time zone</type> and
|
|
<type>timestamp with time zone</type> normally assume that the
|
|
<type>timestamp without time zone</type> value should be taken or given
|
|
as <varname>timezone</> local time. A different time zone can
|
|
be specified for the conversion using <literal>AT TIME ZONE</>.
|
|
</para>
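<para>
The input and output conversions described above might be exercised as
follows. This is only a sketch: the session time zone is set to
<literal>UTC</literal> here purely so that the results in the comments
are reproducible, and the <literal>ISO</literal> output style is assumed:
<programlisting>
SET timezone TO 'UTC';

-- Input with an explicit offset is converted to UTC for storage,
-- then displayed in the current timezone setting.
SELECT TIMESTAMP WITH TIME ZONE '2004-10-19 10:23:54+02';
-- 2004-10-19 08:23:54+00

-- View the same instant as local time in another zone.
SELECT TIMESTAMP WITH TIME ZONE '2004-10-19 10:23:54+02'
       AT TIME ZONE 'America/New_York';
-- 2004-10-19 04:23:54

-- Interpret a zone-less value as local time in a specific zone.
SELECT TIMESTAMP '2004-10-19 10:23:54' AT TIME ZONE 'America/New_York';
-- 2004-10-19 14:23:54+00
</programlisting>
</para>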
|
|
</sect3>
|
|
|
|
<sect3>
|
|
<title>Special Values</title>
|
|
|
|
<indexterm>
|
|
<primary>time</primary>
|
|
<secondary>constants</secondary>
|
|
</indexterm>
|
|
|
|
<indexterm>
|
|
<primary>date</primary>
|
|
<secondary>constants</secondary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
<productname>PostgreSQL</productname> supports several
|
|
special date/time input values for convenience, as shown in <xref
|
|
linkend="datatype-datetime-special-table">. The values
|
|
<literal>infinity</literal> and <literal>-infinity</literal>
|
|
are specially represented inside the system and will be displayed
|
|
unchanged; but the others are simply notational shorthands
|
|
that will be converted to ordinary date/time values when read.
|
|
(In particular, <literal>now</> and related strings are converted
|
|
to a specific time value as soon as they are read.)
|
|
All of these values need to be enclosed in single quotes when used
|
|
as constants in SQL commands.
|
|
</para>
|
|
|
|
<table id="datatype-datetime-special-table">
|
|
<title>Special Date/Time Inputs</title>
|
|
<tgroup cols="3">
|
|
<thead>
|
|
<row>
|
|
<entry>Input String</entry>
|
|
<entry>Valid Types</entry>
|
|
<entry>Description</entry>
|
|
</row>
|
|
</thead>
|
|
<tbody>
|
|
<row>
|
|
<entry><literal>epoch</literal></entry>
|
|
<entry><type>date</type>, <type>timestamp</type></entry>
|
|
<entry>1970-01-01 00:00:00+00 (Unix system time zero)</entry>
|
|
</row>
|
|
<row>
|
|
<entry><literal>infinity</literal></entry>
|
|
<entry><type>date</type>, <type>timestamp</type></entry>
|
|
<entry>later than all other time stamps</entry>
|
|
</row>
|
|
<row>
|
|
<entry><literal>-infinity</literal></entry>
|
|
<entry><type>date</type>, <type>timestamp</type></entry>
|
|
<entry>earlier than all other time stamps</entry>
|
|
</row>
|
|
<row>
|
|
<entry><literal>now</literal></entry>
|
|
<entry><type>date</type>, <type>time</type>, <type>timestamp</type></entry>
|
|
<entry>current transaction's start time</entry>
|
|
</row>
|
|
<row>
|
|
<entry><literal>today</literal></entry>
|
|
<entry><type>date</type>, <type>timestamp</type></entry>
|
|
<entry>midnight today</entry>
|
|
</row>
|
|
<row>
|
|
<entry><literal>tomorrow</literal></entry>
|
|
<entry><type>date</type>, <type>timestamp</type></entry>
|
|
<entry>midnight tomorrow</entry>
|
|
</row>
|
|
<row>
|
|
<entry><literal>yesterday</literal></entry>
|
|
<entry><type>date</type>, <type>timestamp</type></entry>
|
|
<entry>midnight yesterday</entry>
|
|
</row>
|
|
<row>
|
|
<entry><literal>allballs</literal></entry>
|
|
<entry><type>time</type></entry>
|
|
<entry>00:00:00.00 UTC</entry>
|
|
</row>
|
|
</tbody>
|
|
</tgroup>
|
|
</table>
|
|
|
|
<para>
|
|
The following <acronym>SQL</acronym>-compatible functions can also
|
|
be used to obtain the current time value for the corresponding data
|
|
type:
|
|
<literal>CURRENT_DATE</literal>, <literal>CURRENT_TIME</literal>,
|
|
<literal>CURRENT_TIMESTAMP</literal>, <literal>LOCALTIME</literal>,
|
|
<literal>LOCALTIMESTAMP</literal>. The latter four accept an
|
|
optional subsecond precision specification. (See <xref
|
|
linkend="functions-datetime-current">.) Note that these are
|
|
SQL functions and are <emphasis>not</> recognized in data input strings.
|
|
</para>
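<para>
A short sketch of the special input strings, written as quoted
constants. The results in the comments assume the default
<literal>ISO</literal> output style:
<programlisting>
SELECT TIMESTAMP 'epoch';              -- 1970-01-01 00:00:00
SELECT TIME 'allballs';                -- 00:00:00
SELECT TIMESTAMP 'infinity';           -- infinity

-- 'today' is an input string; CURRENT_DATE is an SQL function.
SELECT DATE 'today' = CURRENT_DATE;    -- t
</programlisting>
</para>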
|
|
|
|
</sect3>
|
|
</sect2>
|
|
|
|
<sect2 id="datatype-datetime-output">
|
|
<title>Date/Time Output</title>
|
|
|
|
<indexterm>
|
|
<primary>date</primary>
|
|
<secondary>output format</secondary>
|
|
<seealso>formatting</seealso>
|
|
</indexterm>
|
|
|
|
<indexterm>
|
|
<primary>time</primary>
|
|
<secondary>output format</secondary>
|
|
<seealso>formatting</seealso>
|
|
</indexterm>
|
|
|
|
<para>
|
|
The output format of the date/time types can be set to one of the four
|
|
styles ISO 8601,
|
|
<acronym>SQL</acronym> (Ingres), traditional <productname>POSTGRES</>
|
|
(Unix <application>date</> format), or
|
|
German. The default
|
|
is the <acronym>ISO</acronym> format. (The
|
|
<acronym>SQL</acronym> standard requires the use of the ISO 8601
|
|
format. The name of the <quote>SQL</quote> output format is a
|
|
historical accident.) <xref
|
|
linkend="datatype-datetime-output-table"> shows examples of each
|
|
output style. The output of the <type>date</type> and
|
|
<type>time</type> types is of course only the date or time part
|
|
in accordance with the given examples.
|
|
</para>
|
|
|
|
<table id="datatype-datetime-output-table">
|
|
<title>Date/Time Output Styles</title>
|
|
<tgroup cols="3">
|
|
<thead>
|
|
<row>
|
|
<entry>Style Specification</entry>
|
|
<entry>Description</entry>
|
|
<entry>Example</entry>
|
|
</row>
|
|
</thead>
|
|
<tbody>
|
|
<row>
|
|
<entry>ISO</entry>
|
|
<entry>ISO 8601/SQL standard</entry>
|
|
<entry>1997-12-17 07:37:16-08</entry>
|
|
</row>
|
|
<row>
|
|
<entry>SQL</entry>
|
|
<entry>traditional style</entry>
|
|
<entry>12/17/1997 07:37:16.00 PST</entry>
|
|
</row>
|
|
<row>
|
|
<entry>POSTGRES</entry>
|
|
<entry>original style</entry>
|
|
<entry>Wed Dec 17 07:37:16 1997 PST</entry>
|
|
</row>
|
|
<row>
|
|
<entry>German</entry>
|
|
<entry>regional style</entry>
|
|
<entry>17.12.1997 07:37:16.00 PST</entry>
|
|
</row>
|
|
</tbody>
|
|
</tgroup>
|
|
</table>
|
|
|
|
<para>
|
|
In the <acronym>SQL</acronym> and POSTGRES styles, day appears before
|
|
month if DMY field ordering has been specified, otherwise month appears
|
|
before day.
|
|
(See <xref linkend="datatype-datetime-input">
|
|
for how this setting also affects interpretation of input values.)
|
|
<xref linkend="datatype-datetime-output2-table"> shows an
|
|
example.
|
|
</para>
|
|
|
|
<table id="datatype-datetime-output2-table">
|
|
<title>Date Order Conventions</title>
|
|
<tgroup cols="3">
|
|
<thead>
|
|
<row>
|
|
<entry><varname>datestyle</varname> Setting</entry>
|
|
<entry>Input Ordering</entry>
|
|
<entry>Example Output</entry>
|
|
</row>
|
|
</thead>
|
|
<tbody>
|
|
<row>
|
|
<entry><literal>SQL, DMY</></entry>
|
|
<entry><replaceable>day</replaceable>/<replaceable>month</replaceable>/<replaceable>year</replaceable></entry>
|
|
<entry>17/12/1997 15:37:16.00 CET</entry>
|
|
</row>
|
|
<row>
|
|
<entry><literal>SQL, MDY</></entry>
|
|
<entry><replaceable>month</replaceable>/<replaceable>day</replaceable>/<replaceable>year</replaceable></entry>
|
|
<entry>12/17/1997 07:37:16.00 PST</entry>
|
|
</row>
|
|
<row>
|
|
<entry><literal>Postgres, DMY</></entry>
|
|
<entry><replaceable>day</replaceable>/<replaceable>month</replaceable>/<replaceable>year</replaceable></entry>
|
|
<entry>Wed 17 Dec 07:37:16 1997 PST</entry>
|
|
</row>
|
|
</tbody>
|
|
</tgroup>
|
|
</table>
|
|
|
|
<para>
|
|
The date/time styles can be selected by the user using the
|
|
<command>SET datestyle</command> command, the <xref
|
|
linkend="guc-datestyle"> parameter in the
|
|
<filename>postgresql.conf</filename> configuration file, or the
|
|
<envar>PGDATESTYLE</envar> environment variable on the server or
|
|
client. The formatting function <function>to_char</function>
|
|
(see <xref linkend="functions-formatting">) is also available as
|
|
a more flexible way to format date/time output.
|
|
</para>
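<para>
The effect of the style setting can be sketched as follows. A
<type>date</type> value is used to keep the output short; the
<function>to_char</function> template in the last query is just one
possible choice:
<programlisting>
SET datestyle TO 'ISO';
SELECT DATE '1997-12-17';       -- 1997-12-17

SET datestyle TO 'SQL, DMY';
SELECT DATE '1997-12-17';       -- 17/12/1997

SET datestyle TO 'German';
SELECT DATE '1997-12-17';       -- 17.12.1997

-- to_char formats explicitly, independent of datestyle.
SELECT to_char(TIMESTAMP '1997-12-17 07:37:16', 'YYYY-MM-DD HH24:MI');
-- 1997-12-17 07:37
</programlisting>
</para>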
|
|
</sect2>
|
|
|
|
<sect2 id="datatype-timezones">
|
|
<title>Time Zones</title>
|
|
|
|
<indexterm zone="datatype-timezones">
|
|
<primary>time zone</primary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
Time zones, and time-zone conventions, are influenced by
|
|
political decisions, not just earth geometry. Time zones around the
|
|
world became somewhat standardized during the 1900s,
|
|
but continue to be prone to arbitrary changes, particularly with
|
|
respect to daylight-savings rules.
|
|
<productname>PostgreSQL</productname> uses the widely used
|
|
<literal>zoneinfo</> time zone database for information about
|
|
historical time zone rules. For times in the future, the assumption
|
|
is that the latest known rules for a given time zone will
|
|
continue to be observed indefinitely far into the future.
|
|
</para>
|
|
|
|
<para>
|
|
<productname>PostgreSQL</productname> endeavors to be compatible with
|
|
the <acronym>SQL</acronym> standard definitions for typical usage.
|
|
However, the <acronym>SQL</acronym> standard has an odd mix of date and
|
|
time types and capabilities. Two obvious problems are:
|
|
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>
|
|
Although the <type>date</type> type
|
|
cannot have an associated time zone, the
|
|
<type>time</type> type can.
|
|
Time zones in the real world have little meaning unless
|
|
associated with a date as well as a time,
|
|
since the offset can vary through the year with daylight-saving
|
|
time boundaries.
|
|
</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>
|
|
The default time zone is specified as a constant numeric offset
|
|
from <acronym>UTC</>. It is therefore impossible to adapt to
|
|
daylight-saving time when doing date/time arithmetic across
|
|
<acronym>DST</acronym> boundaries.
|
|
</para>
|
|
</listitem>
|
|
|
|
</itemizedlist>
|
|
</para>
|
|
|
|
<para>
|
|
To address these difficulties, we recommend using date/time types
|
|
that contain both date and time when using time zones. We
|
|
do <emphasis>not</> recommend using the type <type>time with
|
|
time zone</type> (though it is supported by
|
|
<productname>PostgreSQL</productname> for legacy applications and
|
|
for compliance with the <acronym>SQL</acronym> standard).
|
|
<productname>PostgreSQL</productname> assumes
|
|
your local time zone for any type containing only date or time.
|
|
</para>
|
|
|
|
<para>
|
|
All timezone-aware dates and times are stored internally in
|
|
<acronym>UTC</acronym>. They are converted to local time
|
|
in the zone specified by the <xref linkend="guc-timezone"> configuration
|
|
parameter before being displayed to the client.
|
|
</para>
|
|
|
|
<para>
|
|
<productname>PostgreSQL</productname> allows you to specify time zones in
|
|
three different forms:
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>
|
|
A full time zone name, for example <literal>America/New_York</>.
|
|
The recognized time zone names are listed in the
|
|
<literal>pg_timezone_names</literal> view (see <xref
|
|
linkend="view-pg-timezone-names">).
|
|
<productname>PostgreSQL</productname> uses the widely used
|
|
<literal>zoneinfo</> time zone data for this purpose, so the same
|
|
names are also recognized by much other software.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
A time zone abbreviation, for example <literal>PST</>. Such a
|
|
specification merely defines a particular offset from UTC, in
|
|
contrast to full time zone names which can imply a set of daylight
|
|
savings transition-date rules as well. The recognized abbreviations
|
|
are listed in the <literal>pg_timezone_abbrevs</> view (see <xref
|
|
linkend="view-pg-timezone-abbrevs">). You cannot set the
|
|
configuration parameters <xref linkend="guc-timezone"> or
|
|
<xref linkend="guc-log-timezone"> to a time
|
|
zone abbreviation, but you can use abbreviations in
|
|
date/time input values and with the <literal>AT TIME ZONE</>
|
|
operator.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
In addition to the time zone names and abbreviations,
|
|
<productname>PostgreSQL</productname> will accept POSIX-style time zone
|
|
specifications of the form <replaceable>STD</><replaceable>offset</> or
|
|
<replaceable>STD</><replaceable>offset</><replaceable>DST</>, where
|
|
<replaceable>STD</> is a zone abbreviation, <replaceable>offset</> is a
|
|
numeric offset in hours west from UTC, and <replaceable>DST</> is an
|
|
optional daylight-savings zone abbreviation, assumed to stand for one
|
|
hour ahead of the given offset. For example, if <literal>EST5EDT</>
|
|
were not already a recognized zone name, it would be accepted and would
|
|
be functionally equivalent to United States East Coast time. When a
|
|
daylight-savings zone name is present, it is assumed to be used
|
|
according to the same daylight-savings transition rules used in the
|
|
<literal>zoneinfo</> time zone database's <filename>posixrules</> entry.
|
|
In a standard <productname>PostgreSQL</productname> installation,
|
|
<filename>posixrules</> is the same as <literal>US/Eastern</>, so
|
|
that POSIX-style time zone specifications follow USA daylight-savings
|
|
rules. If needed, you can adjust this behavior by replacing the
|
|
<filename>posixrules</> file.
|
|
</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
|
|
In short, this is the difference between abbreviations
|
|
and full names: abbreviations always represent a fixed offset from
|
|
UTC, whereas most of the full names imply a local daylight-savings time
|
|
rule, and so have two possible UTC offsets.
|
|
</para>
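<para>
The difference between a full name and an abbreviation can be seen by
converting a summer and a winter time to UTC. This is a sketch; the
chosen dates are arbitrary, and the first two queries simply show where
the recognized names and abbreviations can be looked up:
<programlisting>
SELECT name, abbrev, utc_offset, is_dst
  FROM pg_timezone_names WHERE name = 'America/New_York';
SELECT abbrev, utc_offset, is_dst
  FROM pg_timezone_abbrevs WHERE abbrev = 'EST';

-- A full name follows the zone's daylight-savings rules:
SELECT TIMESTAMP WITH TIME ZONE '2004-07-01 12:00:00 America/New_York'
       AT TIME ZONE 'UTC';      -- 2004-07-01 16:00:00
SELECT TIMESTAMP WITH TIME ZONE '2004-01-01 12:00:00 America/New_York'
       AT TIME ZONE 'UTC';      -- 2004-01-01 17:00:00

-- An abbreviation is a fixed offset, even in July:
SELECT TIMESTAMP WITH TIME ZONE '2004-07-01 12:00:00 EST'
       AT TIME ZONE 'UTC';      -- 2004-07-01 17:00:00
</programlisting>
</para>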
|
|
|
|
<para>
|
|
One should be wary that the POSIX-style time zone feature can
|
|
lead to silently accepting bogus input, since there is no check on the
|
|
reasonableness of the zone abbreviations. For example, <literal>SET
|
|
TIMEZONE TO FOOBAR0</> will work, leaving the system effectively using
|
|
a rather peculiar abbreviation for UTC.
|
|
Another issue to keep in mind is that in POSIX time zone names,
|
|
positive offsets are used for locations <emphasis>west</> of Greenwich.
|
|
Everywhere else, <productname>PostgreSQL</productname> follows the
|
|
ISO-8601 convention that positive timezone offsets are <emphasis>east</>
|
|
of Greenwich.
|
|
</para>
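<para>
The sign convention can be illustrated with a made-up POSIX-style
specification. <literal>XYZ8</literal> is a hypothetical name chosen
only for this sketch; it is accepted without complaint and means eight
hours <emphasis>west</> of Greenwich, which is displayed with the
ISO-style offset <literal>-08</literal>:
<programlisting>
SET timezone TO 'XYZ8';
SELECT TIMESTAMP WITH TIME ZONE '2004-10-19 10:00:00+00';
-- 2004-10-19 02:00:00-08
</programlisting>
</para>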
|
|
|
|
<para>
|
|
In all cases, time zone names are recognized case-insensitively.
|
|
(This is a change from <productname>PostgreSQL</productname> versions
|
|
prior to 8.2, which were case-sensitive in some contexts but not others.)
|
|
</para>
|
|
|
|
<para>
|
|
Neither full names nor abbreviations are hard-wired into the server;
|
|
they are obtained from configuration files stored under
|
|
<filename>.../share/timezone/</> and <filename>.../share/timezonesets/</>
|
|
of the installation directory
|
|
(see <xref linkend="datetime-config-files">).
|
|
</para>
|
|
|
|
<para>
|
|
The <xref linkend="guc-timezone"> configuration parameter can
|
|
be set in the file <filename>postgresql.conf</>, or in any of the
|
|
other standard ways described in <xref linkend="runtime-config">.
|
|
There are also several special ways to set it:
|
|
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>
|
|
If <varname>timezone</> is not specified in
|
|
<filename>postgresql.conf</> or as a server command-line option,
|
|
the server attempts to use the value of the <envar>TZ</envar>
|
|
environment variable as the default time zone. If <envar>TZ</envar>
|
|
is not defined or is not any of the time zone names known to
|
|
<productname>PostgreSQL</productname>, the server attempts to
|
|
determine the operating system's default time zone by checking the
|
|
behavior of the C library function <literal>localtime()</>. The
|
|
default time zone is selected as the closest match among
|
|
<productname>PostgreSQL</productname>'s known time zones.
|
|
(These rules are also used to choose the default value of
|
|
<xref linkend="guc-log-timezone">, if not specified.)
|
|
</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>
|
|
The <acronym>SQL</acronym> command <command>SET TIME ZONE</command>
|
|
sets the time zone for the session. This is an alternative spelling
|
|
of <command>SET TIMEZONE TO</> with a more SQL-spec-compatible syntax.
|
|
</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>
|
|
The <envar>PGTZ</envar> environment variable is used by
|
|
<application>libpq</application> clients
|
|
to send a <command>SET TIME ZONE</command>
|
|
command to the server upon connection.
|
|
</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
</para>
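<para>
For example, the two spellings of the session-level command can be
compared as follows. This is a sketch; <literal>Europe/Rome</literal>
is an arbitrary choice of zone:
<programlisting>
SET timezone TO 'Europe/Rome';    -- PostgreSQL parameter syntax
SET TIME ZONE 'Europe/Rome';      -- SQL-standard spelling, same effect
SHOW timezone;                    -- Europe/Rome
</programlisting>
</para>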
|
|
</sect2>
|
|
|
|
<sect2 id="datatype-interval-input">
|
|
<title>Interval Input</title>
|
|
|
|
<indexterm>
|
|
<primary>interval</primary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
<type>interval</type> values can be written using the following
|
|
verbose syntax:
|
|
|
|
<synopsis>
|
|
<optional>@</> <replaceable>quantity</> <replaceable>unit</> <optional><replaceable>quantity</> <replaceable>unit</>...</> <optional><replaceable>direction</></optional>
|
|
</synopsis>
|
|
|
|
where <replaceable>quantity</> is a number (possibly signed);
|
|
<replaceable>unit</> is <literal>microsecond</literal>,
|
|
<literal>millisecond</literal>, <literal>second</literal>,
|
|
<literal>minute</literal>, <literal>hour</literal>, <literal>day</literal>,
|
|
<literal>week</literal>, <literal>month</literal>, <literal>year</literal>,
|
|
<literal>decade</literal>, <literal>century</literal>, <literal>millennium</literal>,
|
|
or abbreviations or plurals of these units;
|
|
<replaceable>direction</> can be <literal>ago</literal> or
|
|
empty. The at sign (<literal>@</>) is optional noise. The amounts
|
|
of the different units are implicitly added with appropriate
|
|
sign accounting. <literal>ago</literal> negates all the fields.
|
|
This syntax is also used for interval output, if
|
|
<xref linkend="guc-intervalstyle"> is set to
|
|
<literal>postgres_verbose</>.
|
|
</para>
|
|
|
|
<para>
|
|
Quantities of days, hours, minutes, and seconds can be specified without
|
|
explicit unit markings. For example, <literal>'1 12:59:10'</> is read
|
|
the same as <literal>'1 day 12 hours 59 min 10 sec'</>. Also,
|
|
a combination of years and months can be specified with a dash;
|
|
for example <literal>'200-10'</> is read the same as <literal>'200 years
|
|
10 months'</>. (These shorter forms are in fact the only ones allowed
|
|
by the <acronym>SQL</acronym> standard, and are used for output when
|
|
<varname>IntervalStyle</> is set to <literal>sql_standard</literal>.)
|
|
</para>
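<para>
The verbose and shorthand syntaxes might be tried as follows. This is a
sketch; the results in the comments assume the default
<literal>postgres</literal> interval output style:
<programlisting>
SELECT INTERVAL '1 12:59:10';              -- 1 day 12:59:10
SELECT INTERVAL '200-10';                  -- 200 years 10 mons
SELECT INTERVAL '@ 1 day 2 hours ago';     -- -1 days -02:00:00
</programlisting>
</para>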
|
|
|
|
<para>
|
|
Interval values can also be written as ISO 8601 time intervals, using
|
|
either the <quote>format with designators</> of the standard's section
|
|
4.4.3.2 or the <quote>alternative format</> of section 4.4.3.3. The
|
|
format with designators looks like this:
|
|
<synopsis>
|
|
P <replaceable>quantity</> <replaceable>unit</> <optional> <replaceable>quantity</> <replaceable>unit</> ...</optional> <optional> T <optional> <replaceable>quantity</> <replaceable>unit</> ...</optional></optional>
|
|
</synopsis>
|
|
The string must start with a <literal>P</>, and may include a
|
|
<literal>T</> that introduces the time-of-day units. The
|
|
available unit abbreviations are given in <xref
|
|
linkend="datatype-interval-iso8601-units">. Units may be
|
|
omitted, and may be specified in any order, but units smaller than
|
|
a day must appear after <literal>T</>. In particular, the meaning of
|
|
<literal>M</> depends on whether it is before or after
|
|
<literal>T</>.
|
|
</para>
|
|
|
|
<table id="datatype-interval-iso8601-units">
|
|
<title>ISO 8601 Interval Unit Abbreviations</title>
|
|
<tgroup cols="2">
|
|
<thead>
|
|
<row>
|
|
<entry>Abbreviation</entry>
|
|
<entry>Meaning</entry>
|
|
</row>
|
|
</thead>
|
|
<tbody>
|
|
<row>
|
|
<entry>Y</entry>
|
|
<entry>Years</entry>
|
|
</row>
|
|
<row>
|
|
<entry>M</entry>
|
|
<entry>Months (in the date part)</entry>
|
|
</row>
|
|
<row>
|
|
<entry>W</entry>
|
|
<entry>Weeks</entry>
|
|
</row>
|
|
<row>
|
|
<entry>D</entry>
|
|
<entry>Days</entry>
|
|
</row>
|
|
<row>
|
|
<entry>H</entry>
|
|
<entry>Hours</entry>
|
|
</row>
|
|
<row>
|
|
<entry>M</entry>
|
|
<entry>Minutes (in the time part)</entry>
|
|
</row>
|
|
<row>
|
|
<entry>S</entry>
|
|
<entry>Seconds</entry>
|
|
</row>
|
|
</tbody>
|
|
</tgroup>
|
|
</table>
|
|
|
|
<para>
|
|
In the alternative format:
|
|
<synopsis>
|
|
P <optional> <replaceable>years</>-<replaceable>months</>-<replaceable>days</> </optional> <optional> T <replaceable>hours</>:<replaceable>minutes</>:<replaceable>seconds</> </optional>
|
|
</synopsis>
|
|
the string must begin with <literal>P</literal>, and a
|
|
<literal>T</> separates the date and time parts of the interval.
|
|
The values are given as numbers similar to ISO 8601 dates.
|
|
</para>
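<para>
Both ISO 8601 formats can be sketched with the following queries. The
last two show how the meaning of <literal>M</> depends on its position
relative to <literal>T</>. Results assume the default
<literal>postgres</literal> interval output style:
<programlisting>
SELECT INTERVAL 'P1Y2M3DT4H5M6S';          -- 1 year 2 mons 3 days 04:05:06
SELECT INTERVAL 'P0001-02-03T04:05:06';    -- 1 year 2 mons 3 days 04:05:06
SELECT INTERVAL 'P5M';                     -- 5 mons
SELECT INTERVAL 'PT5M';                    -- 00:05:00
</programlisting>
</para>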
|
|
|
|
<para>
|
|
When writing an interval constant with a <replaceable>fields</>
|
|
specification, or when assigning a string to an interval column that was
|
|
defined with a <replaceable>fields</> specification, the interpretation of
|
|
unmarked quantities depends on the <replaceable>fields</>. For
|
|
example <literal>INTERVAL '1' YEAR</> is read as 1 year, whereas
|
|
<literal>INTERVAL '1'</> means 1 second. Also, field values
|
|
<quote>to the right</> of the least significant field allowed by the
|
|
<replaceable>fields</> specification are silently discarded. For
|
|
example, writing <literal>INTERVAL '1 day 2:03:04' HOUR TO MINUTE</>
|
|
results in dropping the seconds field, but not the day field.
|
|
</para>
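<para>
The cases just described can be run as written. This is a sketch;
results assume the default <literal>postgres</literal> interval output
style:
<programlisting>
SELECT INTERVAL '1' YEAR;                              -- 1 year
SELECT INTERVAL '1';                                   -- 00:00:01
SELECT INTERVAL '1 day 2:03:04' HOUR TO MINUTE;        -- 1 day 02:03:00
</programlisting>
</para>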
|
|
|
|
<para>
|
|
According to the <acronym>SQL</> standard all fields of an interval
|
|
value must have the same sign, so a leading negative sign applies to all
|
|
fields; for example the negative sign in the interval literal
|
|
<literal>'-1 2:03:04'</> applies to both the days and hour/minute/second
|
|
parts. <productname>PostgreSQL</> allows the fields to have different
|
|
signs, and traditionally treats each field in the textual representation
|
|
as independently signed, so that the hour/minute/second part is
|
|
considered positive in this example. If <varname>IntervalStyle</> is
|
|
set to <literal>sql_standard</literal> then a leading sign is considered
|
|
to apply to all fields (but only if no additional signs appear).
|
|
Otherwise the traditional <productname>PostgreSQL</> interpretation is
|
|
used. To avoid ambiguity, it's recommended to attach an explicit sign
|
|
to each field if any field is negative.
|
|
</para>
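<para>
The two interpretations of a leading sign can be compared by testing
the ambiguous literal against fully signed equivalents, as in this
sketch:
<programlisting>
SET intervalstyle TO 'postgres';
-- Traditional reading: only the days field is negative.
SELECT INTERVAL '-1 2:03:04' = INTERVAL '-1 day +2:03:04';    -- t

SET intervalstyle TO 'sql_standard';
-- SQL-standard reading: the leading sign applies to every field.
SELECT INTERVAL '-1 2:03:04' = INTERVAL '-1 day -2:03:04';    -- t
</programlisting>
</para>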
|
|
|
|
<para>
|
|
Internally <type>interval</> values are stored as months, days,
|
|
and seconds. This is done because the number of days in a month
|
|
varies, and a day can have 23 or 25 hours if a daylight savings
|
|
time adjustment is involved. The months and days fields are integers
|
|
while the seconds field can store fractions. Because intervals are
|
|
usually created from constant strings or <type>timestamp</> subtraction,
|
|
this storage method works well in most cases. Functions
|
|
<function>justify_days</> and <function>justify_hours</> are
|
|
available for adjusting days and hours that overflow their normal
|
|
ranges.
|
|
</para>
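<para>
A brief sketch of the two adjustment functions, with results shown in
the default <literal>postgres</literal> interval output style:
<programlisting>
SELECT justify_days(INTERVAL '35 days');      -- 1 mon 5 days
SELECT justify_hours(INTERVAL '27 hours');    -- 1 day 03:00:00
</programlisting>
</para>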
|
|
|
|
<para>
|
|
In the verbose input format, and in some fields of the more compact
|
|
input formats, field values can have fractional parts; for example
|
|
<literal>'1.5 week'</> or <literal>'01:02:03.45'</>. Such input is
|
|
converted to the appropriate number of months, days, and seconds
|
|
for storage. When this would result in a fractional number of
|
|
months or days, the fraction is added to the lower-order fields
|
|
using the conversion factors 1 month = 30 days and 1 day = 24 hours.
|
|
For example, <literal>'1.5 month'</> becomes 1 month and 15 days.
|
|
Only seconds will ever be shown as fractional on output.
|
|
</para>
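<para>
The spill-over of fractional fields into lower-order fields can be
sketched as follows (default <literal>postgres</literal> interval
output style assumed):
<programlisting>
SELECT INTERVAL '1.5 month';    -- 1 mon 15 days
SELECT INTERVAL '1.5 week';     -- 10 days 12:00:00
</programlisting>
</para>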
|
|
|
|
<para>
|
|
<xref linkend="datatype-interval-input-examples"> shows some examples
|
|
of valid <type>interval</> input.
|
|
</para>
|
|
|
|
<table id="datatype-interval-input-examples">
|
|
<title>Interval Input</title>
|
|
<tgroup cols="2">
|
|
<thead>
|
|
<row>
|
|
<entry>Example</entry>
|
|
<entry>Description</entry>
|
|
</row>
|
|
</thead>
|
|
<tbody>
|
|
<row>
|
|
<entry>1-2</entry>
|
|
<entry>SQL standard format: 1 year 2 months</entry>
|
|
</row>
|
|
<row>
|
|
<entry>3 4:05:06</entry>
|
|
<entry>SQL standard format: 3 days 4 hours 5 minutes 6 seconds</entry>
|
|
</row>
|
|
<row>
|
|
<entry>1 year 2 months 3 days 4 hours 5 minutes 6 seconds</entry>
|
|
<entry>Traditional Postgres format: 1 year 2 months 3 days 4 hours 5 minutes 6 seconds</entry>
|
|
</row>
|
|
<row>
|
|
<entry>P1Y2M3DT4H5M6S</entry>
|
|
<entry>ISO 8601 <quote>format with designators</>: same meaning as above</entry>
|
|
</row>
|
|
<row>
|
|
<entry>P0001-02-03T04:05:06</entry>
|
|
<entry>ISO 8601 <quote>alternative format</>: same meaning as above</entry>
|
|
</row>
|
|
</tbody>
|
|
</tgroup>
|
|
</table>
|
|
|
|
</sect2>
|
|
|
|
<sect2 id="datatype-interval-output">
|
|
<title>Interval Output</title>
|
|
|
|
<indexterm>
|
|
<primary>interval</primary>
|
|
<secondary>output format</secondary>
|
|
<seealso>formatting</seealso>
|
|
</indexterm>
|
|
|
|
<para>
|
|
The output format of the interval type can be set to one of the
|
|
four styles <literal>sql_standard</>, <literal>postgres</>,
|
|
<literal>postgres_verbose</>, or <literal>iso_8601</>,
|
|
using the command <literal>SET intervalstyle</literal>.
|
|
The default is the <literal>postgres</> format.
|
|
<xref linkend="interval-style-output-table"> shows examples of each
|
|
output style.
|
|
</para>
|
|
|
|
<para>
|
|
The <literal>sql_standard</> style produces output that conforms to
|
|
the SQL standard's specification for interval literal strings, if
|
|
the interval value meets the standard's restrictions (either year-month
|
|
only or day-time only, with no mixing of positive
|
|
and negative components). Otherwise the output looks like a standard
|
|
year-month literal string followed by a day-time literal string,
|
|
with explicit signs added to disambiguate mixed-sign intervals.
|
|
</para>
|
|
|
|
<para>
|
|
The output of the <literal>postgres</> style matches the output of
|
|
<productname>PostgreSQL</> releases prior to 8.4 when the
|
|
<xref linkend="guc-datestyle"> parameter was set to <literal>ISO</>.
|
|
</para>
|
|
|
|
<para>
|
|
The output of the <literal>postgres_verbose</> style matches the output of
|
|
<productname>PostgreSQL</> releases prior to 8.4 when the
|
|
<varname>DateStyle</> parameter was set to non-<literal>ISO</> output.
|
|
</para>
|
|
|
|
<para>
|
|
The output of the <literal>iso_8601</> style matches the <quote>format
|
|
with designators</> described in section 4.4.3.2 of the
|
|
ISO 8601 standard.
|
|
</para>
|
|
|
|
<table id="interval-style-output-table">
|
|
<title>Interval Output Style Examples</title>
|
|
<tgroup cols="4">
|
|
<thead>
|
|
<row>
|
|
<entry>Style Specification</entry>
|
|
<entry>Year-Month Interval</entry>
|
|
<entry>Day-Time Interval</entry>
|
|
<entry>Mixed Interval</entry>
|
|
</row>
|
|
</thead>
|
|
<tbody>
|
|
<row>
|
|
<entry><literal>sql_standard</></entry>
|
|
<entry>1-2</entry>
|
|
<entry>3 4:05:06</entry>
|
|
<entry>-1-2 +3 -4:05:06</entry>
|
|
</row>
|
|
<row>
|
|
<entry><literal>postgres</></entry>
|
|
<entry>1 year 2 mons</entry>
|
|
<entry>3 days 04:05:06</entry>
|
|
<entry>-1 year -2 mons +3 days -04:05:06</entry>
|
|
</row>
|
|
<row>
|
|
<entry><literal>postgres_verbose</></entry>
|
|
<entry>@ 1 year 2 mons</entry>
|
|
<entry>@ 3 days 4 hours 5 mins 6 secs</entry>
|
|
<entry>@ 1 year 2 mons -3 days 4 hours 5 mins 6 secs ago</entry>
|
|
</row>
|
|
<row>
|
|
<entry><literal>iso_8601</></entry>
|
|
<entry>P1Y2M</entry>
|
|
<entry>P3DT4H5M6S</entry>
|
|
<entry>P-1Y-2M3DT-4H-5M-6S</entry>
|
|
</row>
|
|
</tbody>
|
|
</tgroup>
|
|
</table>
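<para>
As a sketch, some of the entries in the table can be reproduced by
switching the style within a session:
<programlisting>
SET intervalstyle TO 'sql_standard';
SELECT INTERVAL '1 year 2 months';    -- 1-2

SET intervalstyle TO 'iso_8601';
SELECT INTERVAL '1 year 2 months';    -- P1Y2M

SET intervalstyle TO 'postgres_verbose';
SELECT INTERVAL '3 days 4 hours 5 minutes 6 seconds';
-- @ 3 days 4 hours 5 mins 6 secs
</programlisting>
</para>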
|
|
|
|
</sect2>
|
|
|
|
<sect2 id="datatype-datetime-internals">
|
|
<title>Internals</title>
|
|
|
|
<para>
|
|
<productname>PostgreSQL</productname> uses Julian dates
|
|
for all date/time calculations. This has the useful property of correctly
|
|
calculating dates from 4713 BC
|
|
to far into the future, using the assumption that the length of the
|
|
year is 365.2425 days.
|
|
</para>
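<para>
The Julian day number of a date can be inspected with the
<literal>J</literal> template pattern of the formatting functions, as
in this small sketch:
<programlisting>
SELECT to_char(DATE '2000-01-01', 'J');    -- 2451545
SELECT to_date('2451545', 'J');            -- 2000-01-01
</programlisting>
</para>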
|
|
|
|
<para>
|
|
Date conventions before the 19th century make for interesting reading,
|
|
but are not consistent enough to warrant coding into a date/time handler.
|
|
</para>
|
|
</sect2>
|
|
|
|
</sect1>
|
|
|
|
<sect1 id="datatype-boolean">
|
|
<title>Boolean Type</title>
|
|
|
|
<indexterm zone="datatype-boolean">
|
|
<primary>Boolean</primary>
|
|
<secondary>data type</secondary>
|
|
</indexterm>
|
|
|
|
<indexterm zone="datatype-boolean">
|
|
<primary>true</primary>
|
|
</indexterm>
|
|
|
|
<indexterm zone="datatype-boolean">
|
|
<primary>false</primary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
<productname>PostgreSQL</productname> provides the
|
|
standard <acronym>SQL</acronym> type <type>boolean</type>.
|
|
<type>boolean</type> can have one of only two states:
|
|
<quote>true</quote> or <quote>false</quote>. A third state,
|
|
<quote>unknown</quote>, is represented by the
|
|
<acronym>SQL</acronym> null value.
|
|
</para>
|
|
|
|
<para>
|
|
Valid literal values for the <quote>true</quote> state are:
|
|
<simplelist>
|
|
<member><literal>TRUE</literal></member>
|
|
<member><literal>'t'</literal></member>
|
|
<member><literal>'true'</literal></member>
|
|
<member><literal>'y'</literal></member>
|
|
<member><literal>'yes'</literal></member>
|
|
<member><literal>'on'</literal></member>
|
|
<member><literal>'1'</literal></member>
|
|
</simplelist>
|
|
For the <quote>false</quote> state, the following values can be
|
|
used:
|
|
<simplelist>
|
|
<member><literal>FALSE</literal></member>
|
|
<member><literal>'f'</literal></member>
|
|
<member><literal>'false'</literal></member>
|
|
<member><literal>'n'</literal></member>
|
|
<member><literal>'no'</literal></member>
|
|
<member><literal>'off'</literal></member>
|
|
<member><literal>'0'</literal></member>
|
|
</simplelist>
|
|
Leading or trailing whitespace is ignored, and case does not matter.
|
|
The key words
|
|
<literal>TRUE</literal> and <literal>FALSE</literal> are the preferred
|
|
(<acronym>SQL</acronym>-compliant) usage.
|
|
</para>
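
<para>
The string forms listed above can be tried directly with a cast. This is
only a sketch; the column aliases are arbitrary:

<programlisting>
-- String literals are converted on input; surrounding whitespace
-- and letter case are ignored.
SELECT 'yes'::boolean AS a, ' OFF '::boolean AS b, '1'::boolean AS c;
 a | b | c
---+---+---
 t | f | t
</programlisting>
</para>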
|
|
|
|
<example id="datatype-boolean-example">
|
|
<title>Using the <type>boolean</type> type</title>
|
|
|
|
<programlisting>
|
|
CREATE TABLE test1 (a boolean, b text);
|
|
INSERT INTO test1 VALUES (TRUE, 'sic est');
|
|
INSERT INTO test1 VALUES (FALSE, 'non est');
|
|
SELECT * FROM test1;
|
|
a | b
|
|
---+---------
|
|
t | sic est
|
|
f | non est
|
|
|
|
SELECT * FROM test1 WHERE a;
|
|
a | b
|
|
---+---------
|
|
t | sic est
|
|
</programlisting>
|
|
</example>
|
|
|
|
<para>
|
|
<xref linkend="datatype-boolean-example"> shows that
|
|
<type>boolean</type> values are output using the letters
|
|
<literal>t</literal> and <literal>f</literal>.
|
|
</para>
|
|
|
|
<para>
|
|
<type>boolean</type> uses 1 byte of storage.
|
|
</para>
|
|
</sect1>
|
|
|
|
<sect1 id="datatype-enum">
|
|
<title>Enumerated Types</title>
|
|
|
|
<indexterm zone="datatype-enum">
|
|
<primary>data type</primary>
|
|
<secondary>enumerated (enum)</secondary>
|
|
</indexterm>
|
|
|
|
<indexterm zone="datatype-enum">
|
|
<primary>enumerated types</primary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
Enumerated (enum) types are data types that
|
|
comprise a static, ordered set of values.
|
|
They are equivalent to the <type>enum</type>
|
|
types supported in a number of programming languages. An example of an enum
|
|
type might be the days of the week, or a set of status values for
|
|
a piece of data.
|
|
</para>
|
|
|
|
<sect2>
|
|
<title>Declaration of Enumerated Types</title>
|
|
|
|
<para>
|
|
Enum types are created using the <xref
|
|
linkend="sql-createtype" endterm="sql-createtype-title"> command,
|
|
for example:
|
|
|
|
<programlisting>
|
|
CREATE TYPE mood AS ENUM ('sad', 'ok', 'happy');
|
|
</programlisting>
|
|
|
|
Once created, the enum type can be used in table and function
|
|
definitions much like any other type:
|
|
</para>
|
|
|
|
<example>
|
|
<title>Basic Enum Usage</title>
|
|
<programlisting>
|
|
CREATE TYPE mood AS ENUM ('sad', 'ok', 'happy');
|
|
CREATE TABLE person (
|
|
name text,
|
|
current_mood mood
|
|
);
|
|
INSERT INTO person VALUES ('Moe', 'happy');
|
|
SELECT * FROM person WHERE current_mood = 'happy';
|
|
name | current_mood
|
|
------+--------------
|
|
Moe | happy
|
|
(1 row)
|
|
</programlisting>
|
|
</example>
|
|
</sect2>
|
|
|
|
<sect2>
|
|
<title>Ordering</title>
|
|
|
|
<para>
|
|
The ordering of the values in an enum type is the
|
|
order in which the values were listed when the type was created.
|
|
All standard comparison operators and related
|
|
aggregate functions are supported for enums. For example:
|
|
</para>
|
|
|
|
<example>
|
|
<title>Enum Ordering</title>
|
|
<programlisting>
|
|
INSERT INTO person VALUES ('Larry', 'sad');
|
|
INSERT INTO person VALUES ('Curly', 'ok');
|
|
SELECT * FROM person WHERE current_mood > 'sad';
|
|
name | current_mood
|
|
-------+--------------
|
|
Moe | happy
|
|
Curly | ok
|
|
(2 rows)
|
|
|
|
SELECT * FROM person WHERE current_mood > 'sad' ORDER BY current_mood;
|
|
name | current_mood
|
|
-------+--------------
|
|
Curly | ok
|
|
Moe | happy
|
|
(2 rows)
|
|
|
|
SELECT name
|
|
FROM person
|
|
WHERE current_mood = (SELECT MIN(current_mood) FROM person);
|
|
name
|
|
-------
|
|
Larry
|
|
(1 row)
|
|
</programlisting>
|
|
</example>
|
|
</sect2>
|
|
|
|
<sect2>
|
|
<title>Type Safety</title>
|
|
|
|
<para>
|
|
Each enumerated data type is separate and cannot
|
|
be compared with other enumerated types.
|
|
</para>
|
|
|
|
<example>
|
|
<title>Lack of Casting</title>
|
|
<programlisting>
|
|
CREATE TYPE happiness AS ENUM ('happy', 'very happy', 'ecstatic');
|
|
CREATE TABLE holidays (
|
|
num_weeks integer,
|
|
happiness happiness
|
|
);
|
|
INSERT INTO holidays(num_weeks,happiness) VALUES (4, 'happy');
|
|
INSERT INTO holidays(num_weeks,happiness) VALUES (6, 'very happy');
|
|
INSERT INTO holidays(num_weeks,happiness) VALUES (8, 'ecstatic');
|
|
INSERT INTO holidays(num_weeks,happiness) VALUES (2, 'sad');
|
|
ERROR: invalid input value for enum happiness: "sad"
|
|
SELECT person.name, holidays.num_weeks FROM person, holidays
|
|
WHERE person.current_mood = holidays.happiness;
|
|
ERROR: operator does not exist: mood = happiness
|
|
</programlisting>
|
|
</example>
|
|
|
|
<para>
|
|
If you really need to compare values of two different enum types, you can either
|
|
write a custom operator or add explicit casts to your query:
|
|
</para>
|
|
|
|
<example>
|
|
<title>Comparing Different Enums by Casting to Text</title>
|
|
<programlisting>
|
|
SELECT person.name, holidays.num_weeks FROM person, holidays
|
|
WHERE person.current_mood::text = holidays.happiness::text;
|
|
name | num_weeks
|
|
------+-----------
|
|
Moe | 4
|
|
(1 row)
|
|
|
|
</programlisting>
|
|
</example>
|
|
</sect2>
|
|
|
|
<sect2>
|
|
<title>Implementation Details</title>
|
|
|
|
<para>
|
|
An enum value occupies four bytes on disk. The length of an enum
|
|
value's textual label is limited by the <symbol>NAMEDATALEN</symbol>
|
|
setting compiled into <productname>PostgreSQL</productname>; in standard
|
|
builds this means at most 63 bytes.
|
|
</para>
|
|
|
|
<para>
|
|
Enum labels are case sensitive, so
|
|
<type>'happy'</type> is not the same as <type>'HAPPY'</type>.
|
|
White space in the labels is significant too.
|
|
</para>
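
<para>
For instance, assuming the <type>mood</type> type created earlier in this
section, a label that differs only in case is rejected on input:

<programlisting>
-- 'HAPPY' is not one of the labels of mood ('sad', 'ok', 'happy').
SELECT 'HAPPY'::mood;
ERROR: invalid input value for enum mood: "HAPPY"
</programlisting>
</para>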
|
|
|
|
<para>
|
|
The translations from internal enum values to textual labels are
|
|
kept in the system catalog
|
|
<link linkend="catalog-pg-enum"><structname>pg_enum</structname></link>.
|
|
Querying this catalog directly can be useful, for example to list the
labels defined for a given enum type.
|
|
</para>
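
<para>
A minimal sketch of such a query, assuming the <type>mood</type> type
created earlier in this section, is shown below. No ordering is requested
here, so the rows may come back in any order:

<programlisting>
-- List the textual labels stored in pg_enum for the mood type.
SELECT enumlabel
    FROM pg_enum
    WHERE enumtypid = 'mood'::regtype;
</programlisting>
</para>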
|
|
|
|
</sect2>
|
|
</sect1>
|
|
|
|
<sect1 id="datatype-geometric">
|
|
<title>Geometric Types</title>
|
|
|
|
<para>
|
|
Geometric data types represent two-dimensional spatial
|
|
objects. <xref linkend="datatype-geo-table"> shows the geometric
|
|
types available in <productname>PostgreSQL</productname>. The
|
|
most fundamental type, the point, forms the basis for all of the
|
|
other types.
|
|
</para>
|
|
|
|
<table id="datatype-geo-table">
|
|
<title>Geometric Types</title>
|
|
<tgroup cols="4">
|
|
<thead>
|
|
<row>
|
|
<entry>Name</entry>
|
|
<entry>Storage Size</entry>
|
|
<entry>Description</entry>
<entry>Representation</entry>
|
|
</row>
|
|
</thead>
|
|
<tbody>
|
|
<row>
|
|
<entry><type>point</type></entry>
|
|
<entry>16 bytes</entry>
|
|
<entry>Point on a plane</entry>
|
|
<entry>(x,y)</entry>
|
|
</row>
|
|
<row>
|
|
<entry><type>line</type></entry>
|
|
<entry>32 bytes</entry>
|
|
<entry>Infinite line (not fully implemented)</entry>
|
|
<entry>((x1,y1),(x2,y2))</entry>
|
|
</row>
|
|
<row>
|
|
<entry><type>lseg</type></entry>
|
|
<entry>32 bytes</entry>
|
|
<entry>Finite line segment</entry>
|
|
<entry>((x1,y1),(x2,y2))</entry>
|
|
</row>
|
|
<row>
|
|
<entry><type>box</type></entry>
|
|
<entry>32 bytes</entry>
|
|
<entry>Rectangular box</entry>
|
|
<entry>((x1,y1),(x2,y2))</entry>
|
|
</row>
|
|
<row>
|
|
<entry><type>path</type></entry>
|
|
<entry>16+16n bytes</entry>
|
|
<entry>Closed path (similar to polygon)</entry>
|
|
<entry>((x1,y1),...)</entry>
|
|
</row>
|
|
<row>
|
|
<entry><type>path</type></entry>
|
|
<entry>16+16n bytes</entry>
|
|
<entry>Open path</entry>
|
|
<entry>[(x1,y1),...]</entry>
|
|
</row>
|
|
<row>
|
|
<entry><type>polygon</type></entry>
|
|
<entry>40+16n bytes</entry>
|
|
<entry>Polygon (similar to closed path)</entry>
|
|
<entry>((x1,y1),...)</entry>
|
|
</row>
|
|
<row>
|
|
<entry><type>circle</type></entry>
|
|
<entry>24 bytes</entry>
|
|
<entry>Circle</entry>
|
|
<entry><(x,y),r> (center point and radius)</entry>
|
|
</row>
|
|
</tbody>
|
|
</tgroup>
|
|
</table>
|
|
|
|
<para>
|
|
A rich set of functions and operators is available to perform various geometric
|
|
operations such as scaling, translation, rotation, and determining
|
|
intersections. They are explained in <xref linkend="functions-geometry">.
|
|
</para>
|
|
|
|
<sect2>
|
|
<title>Points</title>
|
|
|
|
<indexterm>
|
|
<primary>point</primary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
Points are the fundamental two-dimensional building block for geometric types.
|
|
Values of type <type>point</type> are specified using the following syntax:
|
|
|
|
<synopsis>
|
|
( <replaceable>x</replaceable> , <replaceable>y</replaceable> )
|
|
<replaceable>x</replaceable> , <replaceable>y</replaceable>
|
|
</synopsis>
|
|
|
|
where <replaceable>x</> and <replaceable>y</> are the respective
|
|
coordinates, as floating-point numbers.
|
|
</para>
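
<para>
As a brief illustration, both input syntaxes describe the same point.
The coordinates are arbitrary:

<programlisting>
-- With and without the enclosing parentheses.
SELECT point '(1,2)' AS a, point '1,2' AS b;
   a   |   b
-------+-------
 (1,2) | (1,2)
</programlisting>
</para>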
|
|
</sect2>
|
|
|
|
<sect2>
|
|
<title>Line Segments</title>
|
|
|
|
<indexterm>
|
|
<primary>lseg</primary>
|
|
</indexterm>
|
|
|
|
<indexterm>
|
|
<primary>line segment</primary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
Line segments (<type>lseg</type>) are represented by pairs of points.
|
|
Values of type <type>lseg</type> are specified using the following syntax:
|
|
|
|
<synopsis>
|
|
( ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ( <replaceable>x2</replaceable> , <replaceable>y2</replaceable> ) )
|
|
( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ( <replaceable>x2</replaceable> , <replaceable>y2</replaceable> )
|
|
<replaceable>x1</replaceable> , <replaceable>y1</replaceable> , <replaceable>x2</replaceable> , <replaceable>y2</replaceable>
|
|
</synopsis>
|
|
|
|
where
|
|
<literal>(<replaceable>x1</replaceable>,<replaceable>y1</replaceable>)</literal>
|
|
and
|
|
<literal>(<replaceable>x2</replaceable>,<replaceable>y2</replaceable>)</literal>
|
|
are the end points of the line segment.
|
|
</para>
|
|
</sect2>
|
|
|
|
<sect2>
|
|
<title>Boxes</title>
|
|
|
|
<indexterm>
|
|
<primary>box (data type)</primary>
|
|
</indexterm>
|
|
|
|
<indexterm>
|
|
<primary>rectangle</primary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
Boxes are represented by pairs of points that are opposite
|
|
corners of the box.
|
|
Values of type <type>box</type> are specified using the following syntax:
|
|
|
|
<synopsis>
|
|
( ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ( <replaceable>x2</replaceable> , <replaceable>y2</replaceable> ) )
|
|
( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ( <replaceable>x2</replaceable> , <replaceable>y2</replaceable> )
|
|
<replaceable>x1</replaceable> , <replaceable>y1</replaceable> , <replaceable>x2</replaceable> , <replaceable>y2</replaceable>
|
|
</synopsis>
|
|
|
|
where
|
|
<literal>(<replaceable>x1</replaceable>,<replaceable>y1</replaceable>)</literal>
|
|
and
|
|
<literal>(<replaceable>x2</replaceable>,<replaceable>y2</replaceable>)</literal>
|
|
are any two opposite corners of the box.
|
|
</para>
|
|
|
|
<para>
|
|
Boxes are output using the second syntax.
|
|
Any two opposite corners can be supplied on input, but the values
|
|
will be reordered as needed to store the
|
|
upper right and lower left corners.
|
|
</para>
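
<para>
The reordering can be observed with the <quote>same as</quote> operator
<literal>~=</literal> (described with the geometric operators, not in this
section). The sketch below gives the corners of one box in two different
orders:

<programlisting>
-- Either pair of opposite corners yields the same stored box.
SELECT box '(0,0),(2,1)' ~= box '(2,1),(0,0)' AS same_box;
 same_box
----------
 t
</programlisting>
</para>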
|
|
</sect2>
|
|
|
|
<sect2>
|
|
<title>Paths</title>
|
|
|
|
<indexterm>
|
|
<primary>path (data type)</primary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
Paths are represented by lists of connected points. Paths can be
|
|
<firstterm>open</firstterm>, where
|
|
the first and last points in the list are considered not connected, or
|
|
<firstterm>closed</firstterm>,
|
|
where the first and last points are considered connected.
|
|
</para>
|
|
|
|
<para>
|
|
Values of type <type>path</type> are specified using the following syntax:
|
|
|
|
<synopsis>
|
|
( ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ... , ( <replaceable>xn</replaceable> , <replaceable>yn</replaceable> ) )
|
|
[ ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ... , ( <replaceable>xn</replaceable> , <replaceable>yn</replaceable> ) ]
|
|
( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ... , ( <replaceable>xn</replaceable> , <replaceable>yn</replaceable> )
|
|
( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> , ... , <replaceable>xn</replaceable> , <replaceable>yn</replaceable> )
|
|
<replaceable>x1</replaceable> , <replaceable>y1</replaceable> , ... , <replaceable>xn</replaceable> , <replaceable>yn</replaceable>
|
|
</synopsis>
|
|
|
|
where the points are the end points of the line segments
|
|
comprising the path. Square brackets (<literal>[]</>) indicate
|
|
an open path, while parentheses (<literal>()</>) indicate a
|
|
closed path.
|
|
</para>
|
|
|
|
<para>
|
|
Paths are output using the first or second syntax, as appropriate.
|
|
</para>
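
<para>
As a short sketch, the functions <function>isopen</function> and
<function>isclosed</function> (part of the geometric functions, not
described in this section) report which kind of path a literal produced.
The points are arbitrary:

<programlisting>
-- Square brackets give an open path, parentheses a closed one.
SELECT isopen(path '[(0,0),(1,1),(2,0)]') AS open_path,
       isclosed(path '((0,0),(1,1),(2,0))') AS closed_path;
 open_path | closed_path
-----------+-------------
 t         | t
</programlisting>
</para>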
|
|
</sect2>
|
|
|
|
<sect2>
|
|
<title>Polygons</title>
|
|
|
|
<indexterm>
|
|
<primary>polygon</primary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
Polygons are represented by lists of points (the vertexes of the
|
|
polygon). Polygons are very similar to closed paths, but are
|
|
stored differently
|
|
and have their own set of support routines.
|
|
</para>
|
|
|
|
<para>
|
|
Values of type <type>polygon</type> are specified using the following syntax:
|
|
|
|
<synopsis>
|
|
( ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ... , ( <replaceable>xn</replaceable> , <replaceable>yn</replaceable> ) )
|
|
( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ... , ( <replaceable>xn</replaceable> , <replaceable>yn</replaceable> )
|
|
( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> , ... , <replaceable>xn</replaceable> , <replaceable>yn</replaceable> )
|
|
<replaceable>x1</replaceable> , <replaceable>y1</replaceable> , ... , <replaceable>xn</replaceable> , <replaceable>yn</replaceable>
|
|
</synopsis>
|
|
|
|
where the points are the end points of the line segments
|
|
comprising the boundary of the polygon.
|
|
</para>
|
|
|
|
<para>
|
|
Polygons are output using the first syntax.
|
|
</para>
|
|
</sect2>
|
|
|
|
<sect2>
|
|
<title>Circles</title>
|
|
|
|
<indexterm>
|
|
<primary>circle</primary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
Circles are represented by a center point and radius.
|
|
Values of type <type>circle</type> are specified using the following syntax:
|
|
|
|
<synopsis>
|
|
< ( <replaceable>x</replaceable> , <replaceable>y</replaceable> ) , <replaceable>r</replaceable> >
|
|
( ( <replaceable>x</replaceable> , <replaceable>y</replaceable> ) , <replaceable>r</replaceable> )
|
|
( <replaceable>x</replaceable> , <replaceable>y</replaceable> ) , <replaceable>r</replaceable>
|
|
<replaceable>x</replaceable> , <replaceable>y</replaceable> , <replaceable>r</replaceable>
|
|
</synopsis>
|
|
|
|
where
|
|
<literal>(<replaceable>x</replaceable>,<replaceable>y</replaceable>)</literal>
|
|
is the center point and <replaceable>r</replaceable> is the radius of the circle.
|
|
</para>
|
|
|
|
<para>
|
|
Circles are output using the first syntax.
|
|
</para>
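
<para>
For example, a circle entered in one of the alternative syntaxes is
displayed with angle brackets. The center and radius are arbitrary:

<programlisting>
-- Input uses the parenthesized form; output uses the first syntax.
SELECT circle '((0,0),2)' AS c;
     c
-----------
 <(0,0),2>
</programlisting>
</para>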
|
|
</sect2>
|
|
|
|
</sect1>
|
|
|
|
<sect1 id="datatype-net-types">
|
|
<title>Network Address Types</title>
|
|
|
|
<indexterm zone="datatype-net-types">
|
|
<primary>network</primary>
|
|
<secondary>data types</secondary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
<productname>PostgreSQL</> offers data types to store IPv4, IPv6, and MAC
|
|
addresses, as shown in <xref linkend="datatype-net-types-table">. It
|
|
is better to use these types instead of plain text types to store
|
|
network addresses, because
|
|
these types offer input error checking and specialized
|
|
operators and functions (see <xref linkend="functions-net">).
|
|
</para>
|
|
|
|
<table tocentry="1" id="datatype-net-types-table">
|
|
<title>Network Address Types</title>
|
|
<tgroup cols="3">
|
|
<thead>
|
|
<row>
|
|
<entry>Name</entry>
|
|
<entry>Storage Size</entry>
|
|
<entry>Description</entry>
|
|
</row>
|
|
</thead>
|
|
<tbody>
|
|
|
|
<row>
|
|
<entry><type>cidr</type></entry>
|
|
<entry>7 or 19 bytes</entry>
|
|
<entry>IPv4 and IPv6 networks</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>inet</type></entry>
|
|
<entry>7 or 19 bytes</entry>
|
|
<entry>IPv4 and IPv6 hosts and networks</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>macaddr</type></entry>
|
|
<entry>6 bytes</entry>
|
|
<entry>MAC addresses</entry>
|
|
</row>
|
|
|
|
</tbody>
|
|
</tgroup>
|
|
</table>
|
|
|
|
<para>
|
|
When sorting <type>inet</type> or <type>cidr</type> data types,
|
|
IPv4 addresses will always sort before IPv6 addresses, including
|
|
IPv4 addresses encapsulated or mapped to IPv6 addresses, such as
|
|
::10.2.3.4 or ::ffff:10.4.3.2.
|
|
</para>
|
|
|
|
|
|
<sect2 id="datatype-inet">
|
|
<title><type>inet</type></title>
|
|
|
|
<indexterm>
|
|
<primary>inet (data type)</primary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
The <type>inet</type> type holds an IPv4 or IPv6 host address, and
|
|
optionally its subnet, all in one field.
|
|
The subnet is represented by the number of network address bits
|
|
present in the host address (the
|
|
<quote>netmask</quote>). If the netmask is 32 and the address is IPv4,
|
|
then the value does not indicate a subnet, only a single host.
|
|
In IPv6, the address length is 128 bits, so 128 bits specify a
|
|
unique host address. Note that if you
|
|
want to accept only networks, you should use the
|
|
<type>cidr</type> type rather than <type>inet</type>.
|
|
</para>
|
|
|
|
<para>
|
|
The input format for this type is
|
|
<replaceable class="parameter">address/y</replaceable>
|
|
where
|
|
<replaceable class="parameter">address</replaceable>
|
|
is an IPv4 or IPv6 address and
|
|
<replaceable class="parameter">y</replaceable>
|
|
is the number of bits in the netmask. If the
|
|
<replaceable class="parameter">/y</replaceable>
|
|
portion is missing, the
|
|
netmask is 32 for IPv4 and 128 for IPv6, so the value represents
|
|
just a single host. On display, the
|
|
<replaceable class="parameter">/y</replaceable>
|
|
portion is suppressed if the netmask specifies a single host.
|
|
</para>
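
<para>
The following sketch, using arbitrary addresses, shows when the netmask
is displayed and when it is suppressed:

<programlisting>
-- A /24 netmask is shown; a single host (/32, or no netmask) is not.
SELECT inet '192.168.1.5/24' AS with_subnet,
       inet '192.168.1.5/32' AS explicit_host,
       inet '192.168.1.5'    AS implicit_host;
  with_subnet   | explicit_host | implicit_host
----------------+---------------+---------------
 192.168.1.5/24 | 192.168.1.5   | 192.168.1.5
</programlisting>
</para>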
|
|
</sect2>
|
|
|
|
<sect2 id="datatype-cidr">
|
|
<title><type>cidr</></title>
|
|
|
|
<indexterm>
|
|
<primary>cidr</primary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
The <type>cidr</type> type holds an IPv4 or IPv6 network specification.
|
|
Input and output formats follow Classless Inter-Domain Routing
|
|
conventions.
|
|
The format for specifying networks is <replaceable
|
|
class="parameter">address/y</> where <replaceable
|
|
class="parameter">address</> is the network represented as an
|
|
IPv4 or IPv6 address, and <replaceable
|
|
class="parameter">y</> is the number of bits in the netmask. If
|
|
<replaceable class="parameter">y</> is omitted, it is calculated
|
|
using assumptions from the older classful network numbering system, except
|
|
it will be at least large enough to include all of the octets
|
|
written in the input. It is an error to specify a network address
|
|
that has bits set to the right of the specified netmask.
|
|
</para>
|
|
|
|
<para>
|
|
<xref linkend="datatype-net-cidr-table"> shows some examples.
|
|
</para>
|
|
|
|
<table id="datatype-net-cidr-table">
|
|
<title><type>cidr</> Type Input Examples</title>
|
|
<tgroup cols="3">
|
|
<thead>
|
|
<row>
|
|
<entry><type>cidr</type> Input</entry>
|
|
<entry><type>cidr</type> Output</entry>
|
|
<entry><literal><function>abbrev</function>(<type>cidr</type>)</literal></entry>
|
|
</row>
|
|
</thead>
|
|
<tbody>
|
|
<row>
|
|
<entry>192.168.100.128/25</entry>
|
|
<entry>192.168.100.128/25</entry>
|
|
<entry>192.168.100.128/25</entry>
|
|
</row>
|
|
<row>
|
|
<entry>192.168/24</entry>
|
|
<entry>192.168.0.0/24</entry>
|
|
<entry>192.168.0/24</entry>
|
|
</row>
|
|
<row>
|
|
<entry>192.168/25</entry>
|
|
<entry>192.168.0.0/25</entry>
|
|
<entry>192.168.0.0/25</entry>
|
|
</row>
|
|
<row>
|
|
<entry>192.168.1</entry>
|
|
<entry>192.168.1.0/24</entry>
|
|
<entry>192.168.1/24</entry>
|
|
</row>
|
|
<row>
|
|
<entry>192.168</entry>
|
|
<entry>192.168.0.0/24</entry>
|
|
<entry>192.168.0/24</entry>
|
|
</row>
|
|
<row>
|
|
<entry>128.1</entry>
|
|
<entry>128.1.0.0/16</entry>
|
|
<entry>128.1/16</entry>
|
|
</row>
|
|
<row>
|
|
<entry>128</entry>
|
|
<entry>128.0.0.0/16</entry>
|
|
<entry>128.0/16</entry>
|
|
</row>
|
|
<row>
|
|
<entry>128.1.2</entry>
|
|
<entry>128.1.2.0/24</entry>
|
|
<entry>128.1.2/24</entry>
|
|
</row>
|
|
<row>
|
|
<entry>10.1.2</entry>
|
|
<entry>10.1.2.0/24</entry>
|
|
<entry>10.1.2/24</entry>
|
|
</row>
|
|
<row>
|
|
<entry>10.1</entry>
|
|
<entry>10.1.0.0/16</entry>
|
|
<entry>10.1/16</entry>
|
|
</row>
|
|
<row>
|
|
<entry>10</entry>
|
|
<entry>10.0.0.0/8</entry>
|
|
<entry>10/8</entry>
|
|
</row>
|
|
<row>
|
|
<entry>10.1.2.3/32</entry>
|
|
<entry>10.1.2.3/32</entry>
|
|
<entry>10.1.2.3/32</entry>
|
|
</row>
|
|
<row>
|
|
<entry>2001:4f8:3:ba::/64</entry>
|
|
<entry>2001:4f8:3:ba::/64</entry>
|
|
<entry>2001:4f8:3:ba::/64</entry>
|
|
</row>
|
|
<row>
|
|
<entry>2001:4f8:3:ba:2e0:81ff:fe22:d1f1/128</entry>
|
|
<entry>2001:4f8:3:ba:2e0:81ff:fe22:d1f1/128</entry>
|
|
<entry>2001:4f8:3:ba:2e0:81ff:fe22:d1f1</entry>
|
|
</row>
|
|
<row>
|
|
<entry>::ffff:1.2.3.0/120</entry>
|
|
<entry>::ffff:1.2.3.0/120</entry>
|
|
<entry>::ffff:1.2.3/120</entry>
|
|
</row>
|
|
<row>
|
|
<entry>::ffff:1.2.3.0/128</entry>
|
|
<entry>::ffff:1.2.3.0/128</entry>
|
|
<entry>::ffff:1.2.3.0/128</entry>
|
|
</row>
|
|
</tbody>
|
|
</tgroup>
|
|
</table>
|
|
</sect2>
|
|
|
|
<sect2 id="datatype-inet-vs-cidr">
|
|
<title><type>inet</type> vs. <type>cidr</type></title>
|
|
|
|
<para>
|
|
The essential difference between <type>inet</type> and <type>cidr</type>
|
|
data types is that <type>inet</type> accepts values with nonzero bits to
|
|
the right of the netmask, whereas <type>cidr</type> does not.
|
|
</para>
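
<para>
As a sketch of this difference, the same input (an arbitrary address with
host bits set) is accepted by <type>inet</type> but rejected by
<type>cidr</type>:

<programlisting>
-- inet keeps the host bits to the right of the netmask.
SELECT inet '192.168.1.5/24';
      inet
----------------
 192.168.1.5/24

-- cidr requires those bits to be zero.
SELECT cidr '192.168.1.5/24';
ERROR: invalid cidr value: "192.168.1.5/24"
DETAIL: Value has bits set to right of mask.
</programlisting>
</para>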
|
|
|
|
<tip>
|
|
<para>
|
|
If you do not like the output format for <type>inet</type> or
|
|
<type>cidr</type> values, try the functions <function>host</>,
|
|
<function>text</>, and <function>abbrev</>.
|
|
</para>
|
|
</tip>
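
<para>
A brief sketch of these three functions, applied to arbitrary values:

<programlisting>
-- host() drops the netmask, text() always shows it,
-- and abbrev() omits trailing zero octets of a network.
SELECT host(inet '192.168.1.5/24') AS host,
       text(inet '192.168.1.5')    AS text,
       abbrev(cidr '10.1.0.0/16')  AS abbrev;
    host     |      text      | abbrev
-------------+----------------+---------
 192.168.1.5 | 192.168.1.5/32 | 10.1/16
</programlisting>
</para>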
|
|
</sect2>
|
|
|
|
<sect2 id="datatype-macaddr">
|
|
<title><type>macaddr</></>
|
|
|
|
<indexterm>
|
|
<primary>macaddr (data type)</primary>
|
|
</indexterm>
|
|
|
|
<indexterm>
|
|
<primary>MAC address</primary>
|
|
<see>macaddr</see>
|
|
</indexterm>
|
|
|
|
<para>
|
|
The <type>macaddr</> type stores MAC addresses, known for example
|
|
from Ethernet card hardware addresses (although MAC addresses are
|
|
used for other purposes as well). Input is accepted in the
|
|
following formats:
|
|
|
|
<simplelist>
|
|
<member><literal>'08:00:2b:01:02:03'</></member>
|
|
<member><literal>'08-00-2b-01-02-03'</></member>
|
|
<member><literal>'08002b:010203'</></member>
|
|
<member><literal>'08002b-010203'</></member>
|
|
<member><literal>'0800.2b01.0203'</></member>
|
|
<member><literal>'08002b010203'</></member>
|
|
</simplelist>
|
|
|
|
These examples would all specify the same address. Both upper and
lower case are accepted for the digits
|
|
<literal>a</> through <literal>f</>. Output is always in the
|
|
first of the forms shown.
|
|
</para>
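
<para>
For instance, two of the accepted input forms, one of them written in
upper case, produce the same output:

<programlisting>
-- Different input notations for the same address.
SELECT macaddr '08-00-2B-01-02-03' AS a, macaddr '0800.2b01.0203' AS b;
         a         |         b
-------------------+-------------------
 08:00:2b:01:02:03 | 08:00:2b:01:02:03
</programlisting>
</para>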
|
|
|
|
<para>
|
|
IEEE Std 802-2001 specifies the second shown form (with hyphens)
|
|
as the canonical form for MAC addresses, and specifies the first
|
|
form (with colons) as the bit-reversed notation, so that
|
|
08-00-2b-01-02-03 = 10:00:D4:80:40:C0. This convention is widely
|
|
ignored nowadays, and it is only relevant for obsolete network
|
|
protocols (such as Token Ring). PostgreSQL makes no provisions
|
|
for bit reversal, and all accepted formats use the canonical LSB
|
|
order.
|
|
</para>
|
|
|
|
<para>
|
|
The remaining four input formats are not part of any standard.
|
|
</para>
|
|
</sect2>
|
|
|
|
</sect1>
|
|
|
|
<sect1 id="datatype-bit">
|
|
<title>Bit String Types</title>
|
|
|
|
<indexterm zone="datatype-bit">
|
|
<primary>bit string</primary>
|
|
<secondary>data type</secondary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
Bit strings are strings of 1's and 0's. They can be used to store
|
|
or visualize bit masks. There are two SQL bit types:
|
|
<type>bit(<replaceable>n</replaceable>)</type> and <type>bit
|
|
varying(<replaceable>n</replaceable>)</type>, where
|
|
<replaceable>n</replaceable> is a positive integer.
|
|
</para>
|
|
|
|
<para>
|
|
<type>bit</type> type data must match the length
|
|
<replaceable>n</replaceable> exactly; it is an error to attempt to
|
|
store shorter or longer bit strings. <type>bit varying</type> data is
|
|
of variable length up to the maximum length
|
|
<replaceable>n</replaceable>; longer strings will be rejected.
|
|
Writing <type>bit</type> without a length is equivalent to
|
|
<literal>bit(1)</literal>, while <type>bit varying</type> without a length
|
|
specification means unlimited length.
|
|
</para>
|
|
|
|
<note>
|
|
<para>
|
|
If one explicitly casts a bit-string value to
|
|
<type>bit(<replaceable>n</>)</type>, it will be truncated or
|
|
zero-padded on the right to be exactly <replaceable>n</> bits,
|
|
without raising an error. Similarly,
|
|
if one explicitly casts a bit-string value to
|
|
<type>bit varying(<replaceable>n</>)</type>, it will be truncated
|
|
on the right if it is more than <replaceable>n</> bits.
|
|
</para>
|
|
</note>
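
<para>
The casting behavior described in the note can be sketched as follows.
The bit strings and column aliases are arbitrary:

<programlisting>
-- Explicit casts pad or truncate on the right instead of failing.
SELECT B'10'::bit(3)            AS padded,
       B'10101'::bit(3)         AS truncated,
       B'10101'::bit varying(3) AS vtruncated;
 padded | truncated | vtruncated
--------+-----------+------------
 100    | 101       | 101
</programlisting>
</para>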
|
|
|
|
<para>
|
|
Refer to <xref
|
|
linkend="sql-syntax-bit-strings"> for information about the syntax
|
|
of bit string constants. Bit-logical operators and string
|
|
manipulation functions are available; see <xref
|
|
linkend="functions-bitstring">.
|
|
</para>
|
|
|
|
<example>
|
|
<title>Using the bit string types</title>
|
|
|
|
<programlisting>
|
|
CREATE TABLE test (a BIT(3), b BIT VARYING(5));
|
|
INSERT INTO test VALUES (B'101', B'00');
|
|
INSERT INTO test VALUES (B'10', B'101');
|
|
<computeroutput>
|
|
ERROR: bit string length 2 does not match type bit(3)
|
|
</computeroutput>
|
|
INSERT INTO test VALUES (B'10'::bit(3), B'101');
|
|
SELECT * FROM test;
|
|
<computeroutput>
|
|
a | b
|
|
-----+-----
|
|
101 | 00
|
|
100 | 101
|
|
</computeroutput>
|
|
</programlisting>
|
|
</example>
|
|
|
|
<para>
|
|
A bit string value requires 1 byte for each group of 8 bits, plus
|
|
5 or 8 bytes overhead depending on the length of the string
|
|
(but long values may be compressed or moved out-of-line, as explained
|
|
in <xref linkend="datatype-character"> for character strings).
|
|
</para>
|
|
</sect1>
|
|
|
|
<sect1 id="datatype-textsearch">
|
|
<title>Text Search Types</title>
|
|
|
|
<indexterm zone="datatype-textsearch">
|
|
<primary>full text search</primary>
|
|
<secondary>data types</secondary>
|
|
</indexterm>
|
|
|
|
<indexterm zone="datatype-textsearch">
|
|
<primary>text search</primary>
|
|
<secondary>data types</secondary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
<productname>PostgreSQL</productname> provides two data types that
|
|
are designed to support full text search, which is the activity of
|
|
searching through a collection of natural-language <firstterm>documents</>
|
|
to locate those that best match a <firstterm>query</>.
|
|
The <type>tsvector</type> type represents a document in a form optimized
|
|
for text search; the <type>tsquery</type> type similarly represents
|
|
a text query.
|
|
<xref linkend="textsearch"> provides a detailed explanation of this
|
|
facility, and <xref linkend="functions-textsearch"> summarizes the
|
|
related functions and operators.
|
|
</para>
|
|
|
|
<sect2 id="datatype-tsvector">
|
|
<title><type>tsvector</type></title>
|
|
|
|
<indexterm>
|
|
<primary>tsvector (data type)</primary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
A <type>tsvector</type> value is a sorted list of distinct
|
|
<firstterm>lexemes</>, which are words that have been
|
|
<firstterm>normalized</> to merge different variants of the same word
|
|
(see <xref linkend="textsearch"> for details). Sorting and
|
|
duplicate-elimination are done automatically during input, as shown in
|
|
this example:
|
|
|
|
<programlisting>
|
|
SELECT 'a fat cat sat on a mat and ate a fat rat'::tsvector;
|
|
tsvector
|
|
----------------------------------------------------
|
|
'a' 'and' 'ate' 'cat' 'fat' 'mat' 'on' 'rat' 'sat'
|
|
</programlisting>
|
|
|
|
To represent
|
|
lexemes containing whitespace or punctuation, surround them with quotes:
|
|
|
|
<programlisting>
|
|
SELECT $$the lexeme ' ' contains spaces$$::tsvector;
|
|
tsvector
|
|
-------------------------------------------
|
|
' ' 'contains' 'lexeme' 'spaces' 'the'
|
|
</programlisting>
|
|
|
|
(We use dollar-quoted string literals in this example and the next one
|
|
to avoid the confusion of having to double quote marks within the
|
|
literals.) Embedded quotes and backslashes must be doubled:
|
|
|
|
<programlisting>
|
|
SELECT $$the lexeme 'Joe''s' contains a quote$$::tsvector;
|
|
tsvector
|
|
------------------------------------------------
|
|
'Joe''s' 'a' 'contains' 'lexeme' 'quote' 'the'
|
|
</programlisting>
|
|
|
|
Optionally, integer <firstterm>positions</>
|
|
can be attached to lexemes:
|
|
|
|
<programlisting>
|
|
SELECT 'a:1 fat:2 cat:3 sat:4 on:5 a:6 mat:7 and:8 ate:9 a:10 fat:11 rat:12'::tsvector;
|
|
tsvector
|
|
-------------------------------------------------------------------------------
|
|
'a':1,6,10 'and':8 'ate':9 'cat':3 'fat':2,11 'mat':7 'on':5 'rat':12 'sat':4
|
|
</programlisting>
|
|
|
|
A position normally indicates the source word's location in the
|
|
document. Positional information can be used for
|
|
<firstterm>proximity ranking</firstterm>. Position values can
|
|
range from 1 to 16383; larger numbers are silently set to 16383.
|
|
Duplicate positions for the same lexeme are discarded.
|
|
</para>
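
<para>
As a small sketch of the last two rules, a repeated position is stored
once and an oversized position is clamped. The lexemes are arbitrary:

<programlisting>
-- The duplicate fat:2 is discarded; 20000 becomes 16383.
SELECT 'fat:2 fat:2 rat:20000'::tsvector;
      tsvector
---------------------
 'fat':2 'rat':16383
</programlisting>
</para>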
|
|
|
|
<para>
|
|
Lexemes that have positions can further be labeled with a
|
|
<firstterm>weight</>, which can be <literal>A</literal>,
|
|
<literal>B</literal>, <literal>C</literal>, or <literal>D</literal>.
|
|
<literal>D</literal> is the default and hence is not shown on output:
|
|
|
|
<programlisting>
|
|
SELECT 'a:1A fat:2B,4C cat:5D'::tsvector;
|
|
tsvector
|
|
----------------------------
|
|
'a':1A 'cat':5 'fat':2B,4C
|
|
</programlisting>
|
|
|
|
Weights are typically used to reflect document structure, for example
|
|
by marking title words differently from body words. Text search
|
|
ranking functions can assign different priorities to the different
|
|
weight markers.
|
|
</para>
|
|
|
|
<para>
|
|
It is important to understand that the
|
|
<type>tsvector</type> type itself does not perform any normalization;
|
|
it assumes the words it is given are normalized appropriately
|
|
for the application. For example,
|
|
|
|
<programlisting>
|
|
select 'The Fat Rats'::tsvector;
|
|
tsvector
|
|
--------------------
|
|
'Fat' 'Rats' 'The'
|
|
</programlisting>
|
|
|
|
For most English-text-searching applications the above words would
|
|
be considered non-normalized, but <type>tsvector</type> doesn't care.
|
|
Raw document text should usually be passed through
|
|
<function>to_tsvector</> to normalize the words appropriately
|
|
for searching:
|
|
|
|
<programlisting>
|
|
SELECT to_tsvector('english', 'The Fat Rats');
|
|
to_tsvector
|
|
-----------------
|
|
'fat':2 'rat':3
|
|
</programlisting>
|
|
|
|
Again, see <xref linkend="textsearch"> for more detail.
|
|
</para>
|
|
|
|
</sect2>
|
|
|
|
<sect2 id="datatype-tsquery">
|
|
<title><type>tsquery</type></title>
|
|
|
|
<indexterm>
|
|
<primary>tsquery (data type)</primary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
A <type>tsquery</type> value stores lexemes that are to be
|
|
searched for, and combines them using the Boolean operators
|
|
<literal>&</literal> (AND), <literal>|</literal> (OR), and
|
|
<literal>!</> (NOT). Parentheses can be used to enforce grouping
|
|
of the operators:
|
|
|
|
<programlisting>
|
|
SELECT 'fat & rat'::tsquery;
|
|
tsquery
|
|
---------------
|
|
'fat' & 'rat'
|
|
|
|
SELECT 'fat & (rat | cat)'::tsquery;
|
|
tsquery
|
|
---------------------------
|
|
'fat' & ( 'rat' | 'cat' )
|
|
|
|
SELECT 'fat & rat & ! cat'::tsquery;
|
|
tsquery
|
|
------------------------
|
|
'fat' & 'rat' & !'cat'
|
|
</programlisting>
|
|
|
|
In the absence of parentheses, <literal>!</> (NOT) binds most tightly,
|
|
and <literal>&</literal> (AND) binds more tightly than
|
|
<literal>|</literal> (OR).
|
|
</para>
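<para>
The effect of these precedence rules can be checked with the
<literal>@@</literal> match operator. The following is a minimal sketch
using a one-word <type>tsvector</type> chosen only for illustration:

<programlisting><![CDATA[
-- parsed as fat | (rat & cat), so a document containing only "fat" matches
SELECT 'fat'::tsvector @@ 'fat | rat & cat'::tsquery;     -- true

-- explicit grouping changes the result
SELECT 'fat'::tsvector @@ '(fat | rat) & cat'::tsquery;   -- false
]]></programlisting>
</para>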
|
|
|
|
<para>
|
|
Optionally, lexemes in a <type>tsquery</type> can be labeled with
|
|
one or more weight letters, which restricts them to match only
|
|
<type>tsvector</> lexemes with matching weights:
|
|
|
|
<programlisting>
|
|
SELECT 'fat:ab & cat'::tsquery;
|
|
tsquery
|
|
------------------
|
|
'fat':AB & 'cat'
|
|
</programlisting>
|
|
</para>
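<para>
A small sketch of how the weight restriction affects matching, using an
invented <type>tsvector</type> in which only <literal>fat</literal>
carries weight <literal>A</literal>:

<programlisting>
SELECT 'fat:2A cat:3'::tsvector @@ 'fat:a'::tsquery;   -- true: fat has weight A
SELECT 'fat:2A cat:3'::tsvector @@ 'cat:a'::tsquery;   -- false: cat has the default weight D
</programlisting>
</para>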
|
|
|
|
<para>
|
|
Also, lexemes in a <type>tsquery</type> can be labeled with <literal>*</>
|
|
to specify prefix matching:
|
|
<programlisting>
|
|
SELECT 'super:*'::tsquery;
|
|
tsquery
|
|
-----------
|
|
'super':*
|
|
</programlisting>
|
|
This query will match any word in a <type>tsvector</> that begins
|
|
with <quote>super</>.
|
|
</para>
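<para>
For instance, with two illustrative <type>tsvector</type> values:

<programlisting>
SELECT 'supernova star'::tsvector @@ 'super:*'::tsquery;   -- true: supernova begins with super
SELECT 'star'::tsvector @@ 'super:*'::tsquery;             -- false
</programlisting>
</para>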
|
|
|
|
<para>
|
|
Quoting rules for lexemes are the same as described previously for
|
|
lexemes in <type>tsvector</>; and, as with <type>tsvector</>,
|
|
any required normalization of words must be done before converting
|
|
to the <type>tsquery</> type. The <function>to_tsquery</>
|
|
function is convenient for performing such normalization:
|
|
|
|
<programlisting>
|
|
SELECT to_tsquery('Fat:ab & Cats');
|
|
to_tsquery
|
|
------------------
|
|
'fat':AB & 'cat'
|
|
</programlisting>
|
|
</para>
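<para>
The practical consequence of normalizing both sides can be sketched as
follows (the sample text is the one used earlier in this section; the
<literal>english</literal> configuration is assumed to be available):

<programlisting>
-- both sides normalized: the stemmed lexeme rat is found
SELECT to_tsvector('english', 'The Fat Rats') @@ to_tsquery('english', 'rat');   -- true

-- plain casts perform no normalization: 'Rats' is not the same lexeme as 'rat'
SELECT 'The Fat Rats'::tsvector @@ 'rat'::tsquery;                               -- false
</programlisting>
</para>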
|
|
|
|
</sect2>
|
|
|
|
</sect1>
|
|
|
|
<sect1 id="datatype-uuid">
|
|
<title><acronym>UUID</acronym> Type</title>
|
|
|
|
<indexterm zone="datatype-uuid">
|
|
<primary>UUID</primary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
The data type <type>uuid</type> stores Universally Unique Identifiers
|
|
(UUID) as defined by RFC 4122, ISO/IEC 9834-8:2005, and related standards.
|
|
(Some systems refer to this data type as a globally unique identifier, or
|
|
GUID,<indexterm><primary>GUID</primary></indexterm> instead.) This
|
|
identifier is a 128-bit quantity that is generated by an algorithm chosen
|
|
to make it very unlikely that the same identifier will be generated by
|
|
anyone else in the known universe using the same algorithm. Therefore,
|
|
for distributed systems, these identifiers provide a better uniqueness
|
|
guarantee than sequence generators, which
|
|
are unique only within a single database.
|
|
</para>
|
|
|
|
<para>
|
|
A UUID is written as a sequence of lower-case hexadecimal digits,
|
|
in several groups separated by hyphens, specifically a group of 8
|
|
digits followed by three groups of 4 digits followed by a group of
|
|
12 digits, for a total of 32 digits representing the 128 bits. An
|
|
example of a UUID in this standard form is:
|
|
<programlisting>
|
|
a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11
|
|
</programlisting>
|
|
<productname>PostgreSQL</productname> also accepts the following
|
|
alternative forms for input:
|
|
use of upper-case digits, the standard format surrounded by
|
|
braces, omitting some or all hyphens, and adding a hyphen after any
|
|
group of four digits. Examples are:
|
|
<programlisting>
|
|
A0EEBC99-9C0B-4EF8-BB6D-6BB9BD380A11
|
|
{a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11}
|
|
a0eebc999c0b4ef8bb6d6bb9bd380a11
|
|
a0ee-bc99-9c0b-4ef8-bb6d-6bb9-bd38-0a11
|
|
{a0eebc99-9c0b4ef8-bb6d6bb9-bd380a11}
|
|
</programlisting>
|
|
Output is always in the standard form.
|
|
</para>
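<para>
As a short illustration, the alternative input forms above all denote
the same value and are displayed in the standard form:

<programlisting>
SELECT 'A0EEBC99-9C0B-4EF8-BB6D-6BB9BD380A11'::uuid;
uuid
--------------------------------------
a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11

SELECT '{a0eebc99-9c0b4ef8-bb6d6bb9-bd380a11}'::uuid
     = 'a0eebc999c0b4ef8bb6d6bb9bd380a11'::uuid;   -- true
</programlisting>
</para>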
|
|
|
|
<para>
|
|
<productname>PostgreSQL</productname> provides storage and comparison
|
|
functions for UUIDs, but the core database does not include any
|
|
function for generating UUIDs, because no single algorithm is well
|
|
suited for every application. The contrib module
|
|
<filename>contrib/uuid-ossp</filename> provides functions that implement
|
|
several standard algorithms.
|
|
Alternatively, UUIDs could be generated by client applications or
|
|
other libraries invoked through a server-side function.
|
|
</para>
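<para>
A minimal sketch of server-side generation, assuming that
<filename>contrib/uuid-ossp</filename> has been installed in the current
database so that its <function>uuid_generate_v4</function> function is
available; the table <literal>items</literal> is hypothetical:

<programlisting>
CREATE TABLE items (
    id    uuid PRIMARY KEY DEFAULT uuid_generate_v4(),  -- random (version 4) UUID
    name  text
);

INSERT INTO items (name) VALUES ('example');
SELECT id, name FROM items;
</programlisting>

Whether a random or a time-based algorithm is the better choice depends
on the application.
</para>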
|
|
</sect1>
|
|
|
|
<sect1 id="datatype-xml">
|
|
<title><acronym>XML</> Type</title>
|
|
|
|
<indexterm zone="datatype-xml">
|
|
<primary>XML</primary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
The <type>xml</type> data type can be used to store XML data. Its
|
|
advantage over storing XML data in a <type>text</type> field is that it
|
|
checks the input values for well-formedness, and there are support
|
|
functions to perform type-safe operations on it; see <xref
|
|
linkend="functions-xml">. Use of this data type requires the
|
|
installation to have been built with <command>configure
|
|
--with-libxml</>.
|
|
</para>
|
|
|
|
<para>
|
|
The <type>xml</type> type can store well-formed
|
|
<quote>documents</quote>, as defined by the XML standard, as well
|
|
as <quote>content</quote> fragments, which are defined by the
|
|
production <literal>XMLDecl? content</literal> in the XML
|
|
standard. Roughly, this means that content fragments can have
|
|
more than one top-level element or character node. The expression
|
|
<literal><replaceable>xmlvalue</replaceable> IS DOCUMENT</literal>
|
|
can be used to evaluate whether a particular <type>xml</type>
|
|
value is a full document or only a content fragment.
|
|
</para>
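<para>
For example, assuming the default <literal>CONTENT</literal> setting of
the XML option so that both literals are accepted on input:

<programlisting><![CDATA[
SELECT xml '<foo>bar</foo>' IS DOCUMENT;                    -- true: a single root element
SELECT xml 'abc<foo>bar</foo><bar>foo</bar>' IS DOCUMENT;   -- false: a content fragment
]]></programlisting>
</para>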
|
|
|
|
<sect2>
|
|
<title>Creating XML Values</title>
|
|
<para>
|
|
To produce a value of type <type>xml</type> from character data,
|
|
use the function
|
|
<function>xmlparse</function>:<indexterm><primary>xmlparse</primary></indexterm>
|
|
<synopsis>
|
|
XMLPARSE ( { DOCUMENT | CONTENT } <replaceable>value</replaceable>)
|
|
</synopsis>
|
|
Examples:
|
|
<programlisting><![CDATA[
|
|
XMLPARSE (DOCUMENT '<?xml version="1.0"?><book><title>Manual</title><chapter>...</chapter></book>')
|
|
XMLPARSE (CONTENT 'abc<foo>bar</foo><bar>foo</bar>')
|
|
]]></programlisting>
|
|
While this is the only way to convert character strings into XML
|
|
values according to the SQL standard, the PostgreSQL-specific
|
|
syntaxes:
|
|
<programlisting><![CDATA[
|
|
xml '<foo>bar</foo>'
|
|
'<foo>bar</foo>'::xml
|
|
]]></programlisting>
|
|
can also be used.
|
|
</para>
|
|
|
|
<para>
|
|
The <type>xml</type> type does not validate input values
|
|
against a document type declaration
|
|
(DTD),<indexterm><primary>DTD</primary></indexterm>
|
|
even when the input value specifies a DTD.
|
|
</para>
|
|
|
|
<para>
|
|
The inverse operation, producing a character string value from
|
|
<type>xml</type>, uses the function
|
|
<function>xmlserialize</function>:<indexterm><primary>xmlserialize</primary></indexterm>
|
|
<synopsis>
|
|
XMLSERIALIZE ( { DOCUMENT | CONTENT } <replaceable>value</replaceable> AS <replaceable>type</replaceable> )
|
|
</synopsis>
|
|
<replaceable>type</replaceable> can be
|
|
<type>character</type>, <type>character varying</type>, or
|
|
<type>text</type> (or an alias for one of those). Again, according
|
|
to the SQL standard, this is the only way to convert between type
|
|
<type>xml</type> and character types, but PostgreSQL also allows
|
|
you to simply cast the value.
|
|
</para>
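<para>
A brief sketch of both routes, using an arbitrary sample value:

<programlisting><![CDATA[
-- SQL-standard form
SELECT XMLSERIALIZE(CONTENT xml '<foo>bar</foo>' AS text);

-- PostgreSQL-specific cast
SELECT CAST(xml '<foo>bar</foo>' AS text);
]]></programlisting>
</para>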
|
|
|
|
<para>
|
|
When a character string value is cast to or from type
|
|
<type>xml</type> without going through <function>XMLPARSE</function> or
<function>XMLSERIALIZE</function>, respectively, the choice of
|
|
<literal>DOCUMENT</literal> versus <literal>CONTENT</literal> is
|
|
determined by the <quote>XML option</quote>
|
|
<indexterm><primary>XML option</primary></indexterm>
|
|
session configuration parameter, which can be set using the
|
|
standard command:
|
|
<synopsis>
|
|
SET XML OPTION { DOCUMENT | CONTENT };
|
|
</synopsis>
|
|
or the more PostgreSQL-like syntax
|
|
<synopsis>
|
|
SET xmloption TO { DOCUMENT | CONTENT };
|
|
</synopsis>
|
|
The default is <literal>CONTENT</literal>, so all forms of XML
|
|
data are allowed.
|
|
</para>
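<para>
The effect of the setting on a plain cast can be sketched as follows;
the sample string is a content fragment rather than a document:

<programlisting><![CDATA[
SET xmloption TO DOCUMENT;
SELECT 'abc<foo>bar</foo>'::xml;    -- fails: the value is not a well-formed document

SET xmloption TO CONTENT;
SELECT 'abc<foo>bar</foo>'::xml;    -- accepted as a content fragment
]]></programlisting>
</para>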
|
|
</sect2>
|
|
|
|
<sect2>
|
|
<title>Encoding Handling</title>
|
|
<para>
|
|
Care must be taken when dealing with multiple character encodings
|
|
on the client, server, and in the XML data passed through them.
|
|
When using the text mode to pass queries to the server and query
|
|
results to the client (which is the normal mode), PostgreSQL
|
|
converts all character data passed between the client and the
|
|
server and vice versa to the character encoding of the respective
|
|
end; see <xref linkend="multibyte">. This includes string
|
|
representations of XML values, such as in the above examples.
|
|
This would ordinarily mean that encoding declarations contained in
|
|
XML data can become invalid as the character data is converted
|
|
to other encodings while travelling between client and server,
|
|
because the embedded encoding declaration is not changed. To cope
|
|
with this behavior, encoding declarations contained in
|
|
character strings presented for input to the <type>xml</type> type
|
|
are <emphasis>ignored</emphasis>, and content is assumed
|
|
to be in the current server encoding. Consequently, for correct
|
|
processing, character strings of XML data must be sent
|
|
from the client in the current client encoding. It is the
|
|
responsibility of the client to either convert documents to the
|
|
current client encoding before sending them to the server, or to
|
|
adjust the client encoding appropriately. On output, values of
|
|
type <type>xml</type> will not have an encoding declaration, and
|
|
clients should assume all data is in the current client
|
|
encoding.
|
|
</para>
|
|
|
|
<para>
|
|
When using binary mode to pass query parameters to the server
|
|
and query results back to the client, no character set conversion
|
|
is performed, so the situation is different. In this case, an
|
|
encoding declaration in the XML data will be observed, and if it
|
|
is absent, the data will be assumed to be in UTF-8 (as required by
|
|
the XML standard; note that PostgreSQL does not support UTF-16).
|
|
On output, data will have an encoding declaration
|
|
specifying the client encoding, unless the client encoding is
|
|
UTF-8, in which case it will be omitted.
|
|
</para>
|
|
|
|
<para>
|
|
Processing XML data with PostgreSQL will be less
|
|
error-prone and more efficient if the XML data encoding, client encoding,
|
|
and server encoding are the same. Since XML data is internally
|
|
processed in UTF-8, computations will be most efficient if the
|
|
server encoding is also UTF-8.
|
|
</para>
|
|
|
|
<caution>
|
|
<para>
|
|
Some XML-related functions may not work at all on non-ASCII data
|
|
when the server encoding is not UTF-8. This is known to be an
|
|
issue for <function>xpath()</> in particular.
|
|
</para>
|
|
</caution>
|
|
</sect2>
|
|
|
|
<sect2>
|
|
<title>Accessing XML Values</title>
|
|
|
|
<para>
|
|
The <type>xml</type> data type is unusual in that it does not
|
|
provide any comparison operators. This is because there is no
|
|
well-defined and universally useful comparison algorithm for XML
|
|
data. One consequence of this is that you cannot retrieve rows by
|
|
comparing an <type>xml</type> column against a search value. XML
|
|
values should therefore typically be accompanied by a separate key
|
|
field such as an ID. An alternative solution for comparing XML
|
|
values is to convert them to character strings first, but note
|
|
that character string comparison has little to do with a useful
|
|
XML comparison method.
|
|
</para>
|
|
|
|
<para>
|
|
Since there are no comparison operators for the <type>xml</type>
|
|
data type, it is not possible to create an index directly on a
|
|
column of this type. If speedy searches in XML data are desired,
|
|
possible workarounds include casting the expression to a
|
|
character string type and indexing that, or indexing an XPath
|
|
expression. Of course, the actual query would have to be adjusted
|
|
to search by the indexed expression.
|
|
</para>
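<para>
The XPath workaround might look like the sketch below. The table
<literal>docs</literal>, its columns, and the path expression are
hypothetical, and the example assumes that the
<function>xpath</function> expression is acceptable in an index
definition on the installation in use:

<programlisting><![CDATA[
CREATE TABLE docs (id integer PRIMARY KEY, body xml);

-- index the text of the first /book/title node of each document
CREATE INDEX docs_title_idx
    ON docs (((xpath('/book/title/text()', body))[1]::text));

-- the query must repeat the indexed expression exactly
SELECT id
  FROM docs
 WHERE (xpath('/book/title/text()', body))[1]::text = 'Manual';
]]></programlisting>
</para>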
|
|
|
|
<para>
|
|
The text-search functionality in PostgreSQL can also be used to speed
|
|
up full-document searches of XML data. The necessary
|
|
preprocessing support is, however, not yet available in the PostgreSQL
|
|
distribution.
|
|
</para>
|
|
</sect2>
|
|
</sect1>
|
|
|
|
&array;
|
|
|
|
&rowtypes;
|
|
|
|
<sect1 id="datatype-oid">
|
|
<title>Object Identifier Types</title>
|
|
|
|
<indexterm zone="datatype-oid">
|
|
<primary>object identifier</primary>
|
|
<secondary>data type</secondary>
|
|
</indexterm>
|
|
|
|
<indexterm zone="datatype-oid">
|
|
<primary>oid</primary>
|
|
</indexterm>
|
|
|
|
<indexterm zone="datatype-oid">
|
|
<primary>regproc</primary>
|
|
</indexterm>
|
|
|
|
<indexterm zone="datatype-oid">
|
|
<primary>regprocedure</primary>
|
|
</indexterm>
|
|
|
|
<indexterm zone="datatype-oid">
|
|
<primary>regoper</primary>
|
|
</indexterm>
|
|
|
|
<indexterm zone="datatype-oid">
|
|
<primary>regoperator</primary>
|
|
</indexterm>
|
|
|
|
<indexterm zone="datatype-oid">
|
|
<primary>regclass</primary>
|
|
</indexterm>
|
|
|
|
<indexterm zone="datatype-oid">
|
|
<primary>regtype</primary>
|
|
</indexterm>
|
|
|
|
<indexterm zone="datatype-oid">
|
|
<primary>regconfig</primary>
|
|
</indexterm>
|
|
|
|
<indexterm zone="datatype-oid">
|
|
<primary>regdictionary</primary>
|
|
</indexterm>
|
|
|
|
<indexterm zone="datatype-oid">
|
|
<primary>xid</primary>
|
|
</indexterm>
|
|
|
|
<indexterm zone="datatype-oid">
|
|
<primary>cid</primary>
|
|
</indexterm>
|
|
|
|
<indexterm zone="datatype-oid">
|
|
<primary>tid</primary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
Object identifiers (OIDs) are used internally by
|
|
<productname>PostgreSQL</productname> as primary keys for various
|
|
system tables. OIDs are not added to user-created tables, unless
|
|
<literal>WITH OIDS</literal> is specified when the table is
|
|
created, or the <xref linkend="guc-default-with-oids">
|
|
configuration variable is enabled. Type <type>oid</> represents
|
|
an object identifier. There are also several alias types for
|
|
<type>oid</>: <type>regproc</>, <type>regprocedure</>,
|
|
<type>regoper</>, <type>regoperator</>, <type>regclass</>,
|
|
<type>regtype</>, <type>regconfig</>, and <type>regdictionary</>.
|
|
<xref linkend="datatype-oid-table"> shows an overview.
|
|
</para>
|
|
|
|
<para>
|
|
The <type>oid</> type is currently implemented as an unsigned
|
|
four-byte integer. Therefore, it is not large enough to provide
|
|
database-wide uniqueness in large databases, or even in large
|
|
individual tables. So, using a user-created table's OID column as
|
|
a primary key is discouraged. OIDs are best used only for
|
|
references to system tables.
|
|
</para>
|
|
|
|
<para>
|
|
The <type>oid</> type itself has few operations beyond comparison.
|
|
It can be cast to integer, however, and then manipulated using the
|
|
standard integer operators. (Beware of possible
|
|
signed-versus-unsigned confusion if you do this.)
|
|
</para>
|
|
|
|
<para>
|
|
The OID alias types have no operations of their own except
|
|
for specialized input and output routines. These routines are able
|
|
to accept and display symbolic names for system objects, rather than
|
|
the raw numeric value that type <type>oid</> would use. The alias
|
|
types allow simplified lookup of OID values for objects. For example,
|
|
to examine the <structname>pg_attribute</> rows related to a table
|
|
<literal>mytable</>, one could write:
|
|
<programlisting>
|
|
SELECT * FROM pg_attribute WHERE attrelid = 'mytable'::regclass;
|
|
</programlisting>
|
|
rather than:
|
|
<programlisting>
|
|
SELECT * FROM pg_attribute
|
|
WHERE attrelid = (SELECT oid FROM pg_class WHERE relname = 'mytable');
|
|
</programlisting>
|
|
While that doesn't look all that bad by itself, it's still oversimplified.
|
|
A far more complicated sub-select would be needed to
|
|
select the right OID if there are multiple tables named
|
|
<literal>mytable</> in different schemas.
|
|
The <type>regclass</> input converter handles the table lookup according
|
|
to the schema path setting, and so it does the <quote>right thing</>
|
|
automatically. Similarly, casting a table's OID to
|
|
<type>regclass</> is handy for symbolic display of a numeric OID.
|
|
</para>
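<para>
For instance, the following sketch uses <type>regclass</type> in both
directions, on input to find the table and on output to display the
stored OID as a name:

<programlisting>
SELECT attrelid::regclass AS table_name, attname
  FROM pg_attribute
 WHERE attrelid = 'pg_type'::regclass
   AND attnum > 0;
</programlisting>
</para>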
|
|
|
|
<table id="datatype-oid-table">
|
|
<title>Object Identifier Types</title>
|
|
<tgroup cols="4">
|
|
<thead>
|
|
<row>
|
|
<entry>Name</entry>
|
|
<entry>References</entry>
|
|
<entry>Description</entry>
|
|
<entry>Value Example</entry>
|
|
</row>
|
|
</thead>
|
|
|
|
<tbody>
|
|
|
|
<row>
|
|
<entry><type>oid</></entry>
|
|
<entry>any</entry>
|
|
<entry>numeric object identifier</entry>
|
|
<entry><literal>564182</></entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>regproc</></entry>
|
|
<entry><structname>pg_proc</></entry>
|
|
<entry>function name</entry>
|
|
<entry><literal>sum</></entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>regprocedure</></entry>
|
|
<entry><structname>pg_proc</></entry>
|
|
<entry>function with argument types</entry>
|
|
<entry><literal>sum(int4)</></entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>regoper</></entry>
|
|
<entry><structname>pg_operator</></entry>
|
|
<entry>operator name</entry>
|
|
<entry><literal>+</></entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>regoperator</></entry>
|
|
<entry><structname>pg_operator</></entry>
|
|
<entry>operator with argument types</entry>
|
|
<entry><literal>*(integer,integer)</> or <literal>-(NONE,integer)</></entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>regclass</></entry>
|
|
<entry><structname>pg_class</></entry>
|
|
<entry>relation name</entry>
|
|
<entry><literal>pg_type</></entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>regtype</></entry>
|
|
<entry><structname>pg_type</></entry>
|
|
<entry>data type name</entry>
|
|
<entry><literal>integer</></entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>regconfig</></entry>
|
|
<entry><structname>pg_ts_config</></entry>
|
|
<entry>text search configuration</entry>
|
|
<entry><literal>english</></entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>regdictionary</></entry>
|
|
<entry><structname>pg_ts_dict</></entry>
|
|
<entry>text search dictionary</entry>
|
|
<entry><literal>simple</></entry>
|
|
</row>
|
|
</tbody>
|
|
</tgroup>
|
|
</table>
|
|
|
|
<para>
|
|
All of the OID alias types accept schema-qualified names, and will
|
|
display schema-qualified names on output if the object would not
|
|
be found in the current search path without being qualified.
|
|
The <type>regproc</> and <type>regoper</> alias types accept only
input names that are unique (not overloaded), so they are
|
|
of limited use; for most uses <type>regprocedure</> or
|
|
<type>regoperator</> are more appropriate. For <type>regoperator</>,
|
|
unary operators are identified by writing <literal>NONE</> for the unused
|
|
operand.
|
|
</para>
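<para>
A few illustrative casts, using built-in objects only:

<programlisting>
SELECT 'sum(int4)'::regprocedure;        -- displayed as sum(integer)
SELECT '-(NONE,integer)'::regoperator;   -- the unary minus operator on integer
SELECT 'sum'::regproc;                   -- fails: the name sum is overloaded
</programlisting>
</para>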
|
|
|
|
<para>
|
|
An additional property of the OID alias types is the creation of
|
|
dependencies. If a
|
|
constant of one of these types appears in a stored expression
|
|
(such as a column default expression or view), it creates a dependency
|
|
on the referenced object. For example, if a column has a default
|
|
expression <literal>nextval('my_seq'::regclass)</>,
|
|
<productname>PostgreSQL</productname>
|
|
understands that the default expression depends on the sequence
|
|
<literal>my_seq</>; the system will not let the sequence be dropped
|
|
without first removing the default expression.
|
|
</para>
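<para>
The dependency can be observed with a sketch such as the following; the
table <literal>my_table</literal> is hypothetical:

<programlisting>
CREATE SEQUENCE my_seq;
CREATE TABLE my_table (id integer DEFAULT nextval('my_seq'::regclass));

DROP SEQUENCE my_seq;                               -- fails: the default depends on my_seq

ALTER TABLE my_table ALTER COLUMN id DROP DEFAULT;  -- remove the dependent expression
DROP SEQUENCE my_seq;                               -- now succeeds
</programlisting>
</para>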
|
|
|
|
<para>
|
|
Another identifier type used by the system is <type>xid</>, or transaction
|
|
(abbreviated <abbrev>xact</>) identifier. This is the data type of the system columns
|
|
<structfield>xmin</> and <structfield>xmax</>. Transaction identifiers are 32-bit quantities.
|
|
</para>
|
|
|
|
<para>
|
|
A third identifier type used by the system is <type>cid</>, or
|
|
command identifier. This is the data type of the system columns
|
|
<structfield>cmin</> and <structfield>cmax</>. Command identifiers are also 32-bit quantities.
|
|
</para>
|
|
|
|
<para>
|
|
A final identifier type used by the system is <type>tid</>, or tuple
|
|
identifier (row identifier). This is the data type of the system column
|
|
<structfield>ctid</>. A tuple ID is a pair
|
|
(block number, tuple index within block) that identifies the
|
|
physical location of the row within its table.
|
|
</para>
|
|
|
|
<para>
|
|
(The system columns are further explained in <xref
|
|
linkend="ddl-system-columns">.)
|
|
</para>
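<para>
The system columns are not included in <literal>*</literal> and must be
named explicitly. A minimal sketch, assuming an ordinary table called
<literal>mytable</literal>:

<programlisting>
-- ctid is of type tid, xmin and xmax of type xid, cmin and cmax of type cid
SELECT ctid, xmin, xmax, cmin, cmax, * FROM mytable;
</programlisting>
</para>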
|
|
</sect1>
|
|
|
|
<sect1 id="datatype-pseudo">
|
|
<title>Pseudo-Types</title>
|
|
|
|
<indexterm zone="datatype-pseudo">
|
|
<primary>record</primary>
|
|
</indexterm>
|
|
|
|
<indexterm zone="datatype-pseudo">
|
|
<primary>any</primary>
|
|
</indexterm>
|
|
|
|
<indexterm zone="datatype-pseudo">
|
|
<primary>anyelement</primary>
|
|
</indexterm>
|
|
|
|
<indexterm zone="datatype-pseudo">
|
|
<primary>anyarray</primary>
|
|
</indexterm>
|
|
|
|
<indexterm zone="datatype-pseudo">
|
|
<primary>anynonarray</primary>
|
|
</indexterm>
|
|
|
|
<indexterm zone="datatype-pseudo">
|
|
<primary>anyenum</primary>
|
|
</indexterm>
|
|
|
|
<indexterm zone="datatype-pseudo">
|
|
<primary>void</primary>
|
|
</indexterm>
|
|
|
|
<indexterm zone="datatype-pseudo">
|
|
<primary>trigger</primary>
|
|
</indexterm>
|
|
|
|
<indexterm zone="datatype-pseudo">
|
|
<primary>language_handler</primary>
|
|
</indexterm>
|
|
|
|
<indexterm zone="datatype-pseudo">
|
|
<primary>cstring</primary>
|
|
</indexterm>
|
|
|
|
<indexterm zone="datatype-pseudo">
|
|
<primary>internal</primary>
|
|
</indexterm>
|
|
|
|
<indexterm zone="datatype-pseudo">
|
|
<primary>opaque</primary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
The <productname>PostgreSQL</productname> type system contains a
|
|
number of special-purpose entries that are collectively called
|
|
<firstterm>pseudo-types</>. A pseudo-type cannot be used as a
|
|
column data type, but it can be used to declare a function's
|
|
argument or result type. Each of the available pseudo-types is
|
|
useful in situations where a function's behavior does not
|
|
correspond to simply taking or returning a value of a specific
|
|
<acronym>SQL</acronym> data type. <xref
|
|
linkend="datatype-pseudotypes-table"> lists the existing
|
|
pseudo-types.
|
|
</para>
|
|
|
|
<table id="datatype-pseudotypes-table">
|
|
<title>Pseudo-Types</title>
|
|
<tgroup cols="2">
|
|
<thead>
|
|
<row>
|
|
<entry>Name</entry>
|
|
<entry>Description</entry>
|
|
</row>
|
|
</thead>
|
|
|
|
<tbody>
|
|
<row>
|
|
<entry><type>any</></entry>
|
|
<entry>Indicates that a function accepts any input data type.</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>anyarray</></entry>
|
|
<entry>Indicates that a function accepts any array data type
|
|
(see <xref linkend="extend-types-polymorphic">).</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>anyelement</></entry>
|
|
<entry>Indicates that a function accepts any data type
|
|
(see <xref linkend="extend-types-polymorphic">).</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>anyenum</></entry>
|
|
<entry>Indicates that a function accepts any enum data type
|
|
(see <xref linkend="extend-types-polymorphic"> and
|
|
<xref linkend="datatype-enum">).</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>anynonarray</></entry>
|
|
<entry>Indicates that a function accepts any non-array data type
|
|
(see <xref linkend="extend-types-polymorphic">).</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>cstring</></entry>
|
|
<entry>Indicates that a function accepts or returns a null-terminated C string.</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>internal</></entry>
|
|
<entry>Indicates that a function accepts or returns a server-internal
|
|
data type.</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>language_handler</></entry>
|
|
<entry>A procedural language call handler is declared to return <type>language_handler</>.</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>record</></entry>
|
|
<entry>Identifies a function returning an unspecified row type.</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>trigger</></entry>
|
|
<entry>A trigger function is declared to return <type>trigger</>.</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>void</></entry>
|
|
<entry>Indicates that a function returns no value.</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry><type>opaque</></entry>
|
|
<entry>An obsolete type name that formerly served all the above purposes.</entry>
|
|
</row>
|
|
</tbody>
|
|
</tgroup>
|
|
</table>
|
|
|
|
<para>
|
|
Functions coded in C (whether built-in or dynamically loaded) can be
|
|
declared to accept or return any of these pseudo data types. It is up to
|
|
the function author to ensure that the function will behave safely
|
|
when a pseudo-type is used as an argument type.
|
|
</para>
|
|
|
|
<para>
|
|
Functions coded in procedural languages can use pseudo-types only as
|
|
allowed by their implementation languages. At present the procedural
|
|
languages all forbid use of a pseudo-type as an argument type, and allow
|
|
only <type>void</> and <type>record</> as a result type (plus
|
|
<type>trigger</> when the function is used as a trigger). Some also
|
|
support polymorphic functions using the types <type>anyarray</>,
|
|
<type>anyelement</>, <type>anyenum</>, and <type>anynonarray</>.
|
|
</para>
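<para>
As an illustration of the polymorphic case, the sketch below assumes
that PL/pgSQL has been installed in the database; the function name
<function>first_elem</function> is invented for this example:

<programlisting>
-- returns the first element of any array; the result type follows the argument
CREATE FUNCTION first_elem(arr anyarray) RETURNS anyelement AS $$
BEGIN
    RETURN arr[1];
END;
$$ LANGUAGE plpgsql;

SELECT first_elem(ARRAY[10, 20, 30]);   -- result is of type integer
SELECT first_elem(ARRAY['a', 'b']);     -- result is of type text
</programlisting>
</para>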
|
|
|
|
<para>
|
|
The <type>internal</> pseudo-type is used to declare functions
|
|
that are meant only to be called internally by the database
|
|
system, and not by direct invocation in an <acronym>SQL</acronym>
|
|
query. If a function has at least one <type>internal</>-type
|
|
argument then it cannot be called from <acronym>SQL</acronym>. To
|
|
preserve the type safety of this restriction it is important to
|
|
follow this coding rule: do not create any function that is
|
|
declared to return <type>internal</> unless it has at least one
|
|
<type>internal</> argument.
|
|
</para>
|
|
|
|
</sect1>
|
|
|
|
</chapter>
|