<!-- $PostgreSQL: pgsql/doc/src/sgml/datatype.sgml,v 1.240 2009/07/08 17:21:55 tgl Exp $ -->

<chapter id="datatype">
<title id="datatype-title">Data Types</title>

<indexterm zone="datatype">
<primary>data type</primary>
</indexterm>

<indexterm>
<primary>type</primary>
<see>data type</see>
</indexterm>

<para>
<productname>PostgreSQL</productname> has a rich set of native data
types available to users. Users can add new types to
<productname>PostgreSQL</productname> using the <xref
linkend="sql-createtype" endterm="sql-createtype-title"> command.
</para>

<para>
<xref linkend="datatype-table"> shows all the built-in general-purpose data
types. Most of the alternative names listed in the
<quote>Aliases</quote> column are the names used internally by
<productname>PostgreSQL</productname> for historical reasons. In
addition, some internally used or deprecated types are available,
but are not listed here.
</para>

<table id="datatype-table">
<title>Data Types</title>
<tgroup cols="3">
<thead>
<row>
<entry>Name</entry>
<entry>Aliases</entry>
<entry>Description</entry>
</row>
</thead>

<tbody>
<row>
<entry><type>bigint</type></entry>
<entry><type>int8</type></entry>
<entry>signed eight-byte integer</entry>
</row>

<row>
<entry><type>bigserial</type></entry>
<entry><type>serial8</type></entry>
<entry>autoincrementing eight-byte integer</entry>
</row>

<row>
<entry><type>bit [ (<replaceable>n</replaceable>) ]</type></entry>
<entry></entry>
<entry>fixed-length bit string</entry>
</row>

<row>
<entry><type>bit varying [ (<replaceable>n</replaceable>) ]</type></entry>
<entry><type>varbit</type></entry>
<entry>variable-length bit string</entry>
</row>

<row>
<entry><type>boolean</type></entry>
<entry><type>bool</type></entry>
<entry>logical Boolean (true/false)</entry>
</row>

<row>
<entry><type>box</type></entry>
<entry></entry>
<entry>rectangular box on a plane</entry>
</row>

<row>
<entry><type>bytea</type></entry>
<entry></entry>
<entry>binary data (<quote>byte array</>)</entry>
</row>

<row>
<entry><type>character varying [ (<replaceable>n</replaceable>) ]</type></entry>
<entry><type>varchar [ (<replaceable>n</replaceable>) ]</type></entry>
<entry>variable-length character string</entry>
</row>

<row>
<entry><type>character [ (<replaceable>n</replaceable>) ]</type></entry>
<entry><type>char [ (<replaceable>n</replaceable>) ]</type></entry>
<entry>fixed-length character string</entry>
</row>

<row>
<entry><type>cidr</type></entry>
<entry></entry>
<entry>IPv4 or IPv6 network address</entry>
</row>

<row>
<entry><type>circle</type></entry>
<entry></entry>
<entry>circle on a plane</entry>
</row>

<row>
<entry><type>date</type></entry>
<entry></entry>
<entry>calendar date (year, month, day)</entry>
</row>

<row>
<entry><type>double precision</type></entry>
<entry><type>float8</type></entry>
<entry>double precision floating-point number (8 bytes)</entry>
</row>

<row>
<entry><type>inet</type></entry>
<entry></entry>
<entry>IPv4 or IPv6 host address</entry>
</row>

<row>
<entry><type>integer</type></entry>
<entry><type>int</type>, <type>int4</type></entry>
<entry>signed four-byte integer</entry>
</row>

<row>
<entry><type>interval [ <replaceable>fields</replaceable> ] [ (<replaceable>p</replaceable>) ]</type></entry>
<entry></entry>
<entry>time span</entry>
</row>

<row>
<entry><type>line</type></entry>
<entry></entry>
<entry>infinite line on a plane</entry>
</row>

<row>
<entry><type>lseg</type></entry>
<entry></entry>
<entry>line segment on a plane</entry>
</row>

<row>
<entry><type>macaddr</type></entry>
<entry></entry>
<entry>MAC (Media Access Control) address</entry>
</row>

<row>
<entry><type>money</type></entry>
<entry></entry>
<entry>currency amount</entry>
</row>

<row>
<entry><type>numeric [ (<replaceable>p</replaceable>,
<replaceable>s</replaceable>) ]</type></entry>
<entry><type>decimal [ (<replaceable>p</replaceable>,
<replaceable>s</replaceable>) ]</type></entry>
<entry>exact numeric of selectable precision</entry>
</row>

<row>
<entry><type>path</type></entry>
<entry></entry>
<entry>geometric path on a plane</entry>
</row>

<row>
<entry><type>point</type></entry>
<entry></entry>
<entry>geometric point on a plane</entry>
</row>

<row>
<entry><type>polygon</type></entry>
<entry></entry>
<entry>closed geometric path on a plane</entry>
</row>

<row>
<entry><type>real</type></entry>
<entry><type>float4</type></entry>
<entry>single precision floating-point number (4 bytes)</entry>
</row>

<row>
<entry><type>smallint</type></entry>
<entry><type>int2</type></entry>
<entry>signed two-byte integer</entry>
</row>

<row>
<entry><type>serial</type></entry>
<entry><type>serial4</type></entry>
<entry>autoincrementing four-byte integer</entry>
</row>

<row>
<entry><type>text</type></entry>
<entry></entry>
<entry>variable-length character string</entry>
</row>

<row>
<entry><type>time [ (<replaceable>p</replaceable>) ] [ without time zone ]</type></entry>
<entry></entry>
<entry>time of day (no time zone)</entry>
</row>

<row>
<entry><type>time [ (<replaceable>p</replaceable>) ] with time zone</type></entry>
<entry><type>timetz</type></entry>
<entry>time of day, including time zone</entry>
</row>

<row>
<entry><type>timestamp [ (<replaceable>p</replaceable>) ] [ without time zone ]</type></entry>
<entry></entry>
<entry>date and time (no time zone)</entry>
</row>

<row>
<entry><type>timestamp [ (<replaceable>p</replaceable>) ] with time zone</type></entry>
<entry><type>timestamptz</type></entry>
<entry>date and time, including time zone</entry>
</row>

<row>
<entry><type>tsquery</type></entry>
<entry></entry>
<entry>text search query</entry>
</row>

<row>
<entry><type>tsvector</type></entry>
<entry></entry>
<entry>text search document</entry>
</row>

<row>
<entry><type>txid_snapshot</type></entry>
<entry></entry>
<entry>user-level transaction ID snapshot</entry>
</row>

<row>
<entry><type>uuid</type></entry>
<entry></entry>
<entry>universally unique identifier</entry>
</row>

<row>
<entry><type>xml</type></entry>
<entry></entry>
<entry>XML data</entry>
</row>
</tbody>
</tgroup>
</table>

<note>
<title>Compatibility</title>
<para>
The following types (or spellings thereof) are specified by
<acronym>SQL</acronym>: <type>bigint</type>, <type>bit</type>, <type>bit
varying</type>, <type>boolean</type>, <type>char</type>,
<type>character varying</type>, <type>character</type>,
<type>varchar</type>, <type>date</type>, <type>double
precision</type>, <type>integer</type>, <type>interval</type>,
<type>numeric</type>, <type>decimal</type>, <type>real</type>,
<type>smallint</type>, <type>time</type> (with or without time zone),
<type>timestamp</type> (with or without time zone),
<type>xml</type>.
</para>
</note>

<para>
Each data type has an external representation determined by its input
and output functions. Many of the built-in types have
obvious external formats. However, several types are either unique
to <productname>PostgreSQL</productname>, such as geometric
paths, or have several possible formats, such as the date
and time types.
Some of the input and output functions are not invertible, i.e.,
the result of an output function might lose accuracy when compared to
the original input.
</para>

<sect1 id="datatype-numeric">
<title>Numeric Types</title>

<indexterm zone="datatype-numeric">
<primary>data type</primary>
<secondary>numeric</secondary>
</indexterm>

<para>
Numeric types consist of two-, four-, and eight-byte integers,
four- and eight-byte floating-point numbers, and selectable-precision
decimals. <xref linkend="datatype-numeric-table"> lists the
available types.
</para>

<table id="datatype-numeric-table">
<title>Numeric Types</title>
<tgroup cols="4">
<thead>
<row>
<entry>Name</entry>
<entry>Storage Size</entry>
<entry>Description</entry>
<entry>Range</entry>
</row>
</thead>

<tbody>
<row>
<entry><type>smallint</></entry>
<entry>2 bytes</entry>
<entry>small-range integer</entry>
<entry>-32768 to +32767</entry>
</row>
<row>
<entry><type>integer</></entry>
<entry>4 bytes</entry>
<entry>typical choice for integer</entry>
<entry>-2147483648 to +2147483647</entry>
</row>
<row>
<entry><type>bigint</></entry>
<entry>8 bytes</entry>
<entry>large-range integer</entry>
<entry>-9223372036854775808 to +9223372036854775807</entry>
</row>
<row>
<entry><type>decimal</></entry>
<entry>variable</entry>
<entry>user-specified precision, exact</entry>
<entry>no limit</entry>
</row>
<row>
<entry><type>numeric</></entry>
<entry>variable</entry>
<entry>user-specified precision, exact</entry>
<entry>no limit</entry>
</row>
<row>
<entry><type>real</></entry>
<entry>4 bytes</entry>
<entry>variable-precision, inexact</entry>
<entry>6 decimal digits precision</entry>
</row>
<row>
<entry><type>double precision</></entry>
<entry>8 bytes</entry>
<entry>variable-precision, inexact</entry>
<entry>15 decimal digits precision</entry>
</row>
<row>
<entry><type>serial</></entry>
<entry>4 bytes</entry>
<entry>autoincrementing integer</entry>
<entry>1 to 2147483647</entry>
</row>
<row>
<entry><type>bigserial</type></entry>
<entry>8 bytes</entry>
<entry>large autoincrementing integer</entry>
<entry>1 to 9223372036854775807</entry>
</row>
</tbody>
</tgroup>
</table>

<para>
The syntax of constants for the numeric types is described in
<xref linkend="sql-syntax-constants">. The numeric types have a
full set of corresponding arithmetic operators and
functions. Refer to <xref linkend="functions"> for more
information. The following sections describe the types in detail.
</para>

<sect2 id="datatype-int">
<title>Integer Types</title>

<indexterm zone="datatype-int">
<primary>integer</primary>
</indexterm>
<indexterm zone="datatype-int">
<primary>smallint</primary>
</indexterm>
<indexterm zone="datatype-int">
<primary>bigint</primary>
</indexterm>
<indexterm>
<primary>int4</primary>
<see>integer</see>
</indexterm>
<indexterm>
<primary>int2</primary>
<see>smallint</see>
</indexterm>
<indexterm>
<primary>int8</primary>
<see>bigint</see>
</indexterm>

<para>
The types <type>smallint</type>, <type>integer</type>, and
<type>bigint</type> store whole numbers, that is, numbers without
fractional components, of various ranges. Attempts to store
values outside of the allowed range will result in an error.
</para>
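
<para>
As a brief illustration of the range check, consider the following
casts. This is only a sketch; the exact wording of the error message
might differ between versions:
<programlisting>
SELECT 32767::smallint;   -- the largest value that fits in smallint
SELECT 32768::smallint;   -- fails with an error (smallint out of range)
</programlisting>
</para>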

<para>
The type <type>integer</type> is the common choice, as it offers
the best balance between range, storage size, and performance.
The <type>smallint</type> type is generally only used if disk
space is at a premium. The <type>bigint</type> type should only
be used if the <type>integer</type> range is insufficient,
because operations on <type>integer</type> are faster.
</para>

<para>
On very minimal operating systems, the <type>bigint</type> type
might not function correctly, because it relies on compiler support
for eight-byte integers. On such machines, <type>bigint</type>
acts the same as <type>integer</type>, but still takes up eight
bytes of storage. (We are not aware of any modern
platform where this is the case.)
</para>

<para>
<acronym>SQL</acronym> only specifies the integer types
<type>integer</type> (or <type>int</type>),
<type>smallint</type>, and <type>bigint</type>. The
type names <type>int2</type>, <type>int4</type>, and
<type>int8</type> are extensions, which are also used by some
other <acronym>SQL</acronym> database systems.
</para>

</sect2>
<sect2 id="datatype-numeric-decimal">
<title>Arbitrary Precision Numbers</title>

<indexterm>
<primary>numeric (data type)</primary>
</indexterm>

<indexterm>
<primary>arbitrary precision numbers</primary>
</indexterm>

<indexterm>
<primary>decimal</primary>
<see>numeric</see>
</indexterm>

<para>
The type <type>numeric</type> can store numbers with up to 1000
digits of precision and perform calculations exactly. It is
especially recommended for storing monetary amounts and other
quantities where exactness is required. However, arithmetic on
<type>numeric</type> values is very slow compared to the integer
types, or to the floating-point types described in the next section.
</para>

<para>
We use the following terms below: The
<firstterm>scale</firstterm> of a <type>numeric</type> is the
count of decimal digits in the fractional part, to the right of
the decimal point. The <firstterm>precision</firstterm> of a
<type>numeric</type> is the total count of significant digits in
the whole number, that is, the number of digits to both sides of
the decimal point. So the number 23.5141 has a precision of 6
and a scale of 4. Integers can be considered to have a scale of
zero.
</para>

<para>
Both the maximum precision and the maximum scale of a
<type>numeric</type> column can be
configured. To declare a column of type <type>numeric</type> use
the syntax:
<programlisting>
NUMERIC(<replaceable>precision</replaceable>, <replaceable>scale</replaceable>)
</programlisting>
The precision must be positive; the scale must be zero or positive.
Alternatively:
<programlisting>
NUMERIC(<replaceable>precision</replaceable>)
</programlisting>
selects a scale of 0. Specifying:
<programlisting>
NUMERIC
</programlisting>
without any precision or scale creates a column in which numeric
values of any precision and scale can be stored, up to the
implementation limit on precision. A column of this kind will
not coerce input values to any particular scale, whereas
<type>numeric</type> columns with a declared scale will coerce
input values to that scale. (The <acronym>SQL</acronym> standard
requires a default scale of 0, i.e., coercion to integer
precision. We find this a bit useless. If you're concerned
about portability, always specify the precision and scale
explicitly.)
</para>
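
<para>
The three forms of declaration described above can be compared side by
side. The following is a minimal sketch; the table name
<literal>items</literal> and its column names are hypothetical and
chosen only for illustration:
<programlisting>
CREATE TABLE items (
    price    numeric(10,2),  -- at most 10 digits in total, 2 of them fractional
    quantity numeric(5),     -- at most 5 digits, scale 0
    ratio    numeric         -- no declared precision or scale
);

INSERT INTO items VALUES (19.999, 2.5, 0.333333333333);

-- price is coerced to the declared scale and stored as 20.00,
-- quantity is coerced to scale 0 and stored as 3,
-- ratio is stored as entered, since the column declares no scale.
SELECT * FROM items;
</programlisting>
</para>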

<para>
If the scale of a value to be stored is greater than the declared
scale of the column, the system will round the value to the specified
number of fractional digits. Then, if the number of digits to the
left of the decimal point exceeds the declared precision minus the
declared scale, an error is raised.
</para>
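
<para>
Because rounding happens before the range check, a value can be
rejected even though it has few enough digits to the left of the
decimal point as written. A short sketch using casts to
<type>numeric(4,2)</type>, which allows at most two digits to the left
of the decimal point:
<programlisting>
SELECT 12.345::numeric(4,2);   -- rounded to 12.35
SELECT 99.995::numeric(4,2);   -- rounds to 100.00, which needs three digits
                               -- to the left of the decimal point: error
</programlisting>
</para>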

<para>
Numeric values are physically stored without any extra leading or
trailing zeroes. Thus, the declared precision and scale of a column
are maximums, not fixed allocations. (In this sense the <type>numeric</>
type is more akin to <type>varchar(<replaceable>n</>)</type>
than to <type>char(<replaceable>n</>)</type>.) The actual storage
requirement is two bytes for each group of four decimal digits,
plus five to eight bytes overhead.
</para>

<indexterm>
<primary>NaN</primary>
<see>not a number</see>
</indexterm>
<indexterm>
<primary>not a number</primary>
<secondary>numeric (data type)</secondary>
</indexterm>
<para>
In addition to ordinary numeric values, the <type>numeric</type>
type allows the special value <literal>NaN</>, meaning
<quote>not-a-number</quote>. Any operation on <literal>NaN</>
yields another <literal>NaN</>. When writing this value
as a constant in an SQL command, you must put quotes around it,
for example <literal>UPDATE table SET x = 'NaN'</>. On input,
the string <literal>NaN</> is recognized in a case-insensitive manner.
</para>

<note>
<para>
In most implementations of the <quote>not-a-number</> concept,
<literal>NaN</> is not considered equal to any other numeric
value (including <literal>NaN</>). In order to allow
<type>numeric</> values to be sorted and used in tree-based
indexes, <productname>PostgreSQL</> treats <literal>NaN</>
values as equal, and greater than all non-<literal>NaN</>
values.
</para>
</note>
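
<para>
The behavior of <literal>NaN</literal> in <type>numeric</type>
arithmetic and comparisons can be sketched as follows. The results
noted in the comments follow from the rules stated above:
<programlisting>
SELECT 'NaN'::numeric + 1;                -- NaN
SELECT 'nan'::numeric = 'NaN'::numeric;   -- true (input is case-insensitive)
SELECT 'NaN'::numeric > 1000000;          -- true
</programlisting>
</para>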

<para>
The types <type>decimal</type> and <type>numeric</type> are
equivalent. Both types are part of the <acronym>SQL</acronym>
standard.
</para>
</sect2>
<sect2 id="datatype-float">
<title>Floating-Point Types</title>

<indexterm zone="datatype-float">
<primary>real</primary>
</indexterm>
<indexterm zone="datatype-float">
<primary>double precision</primary>
</indexterm>
<indexterm>
<primary>float4</primary>
<see>real</see>
</indexterm>
<indexterm>
<primary>float8</primary>
<see>double precision</see>
</indexterm>
<indexterm zone="datatype-float">
<primary>floating point</primary>
</indexterm>

<para>
The data types <type>real</type> and <type>double
precision</type> are inexact, variable-precision numeric types.
In practice, these types are usually implementations of
<acronym>IEEE</acronym> Standard 754 for Binary Floating-Point
Arithmetic (single and double precision, respectively), to the
extent that the underlying processor, operating system, and
compiler support it.
</para>

<para>
Inexact means that some values cannot be converted exactly to the
internal format and are stored as approximations, so that storing
and retrieving a value might show slight discrepancies.
Managing these errors and how they propagate through calculations
is the subject of an entire branch of mathematics and computer
science and will not be discussed here, except for the
following points:
<itemizedlist>
<listitem>
<para>
If you require exact storage and calculations (such as for
monetary amounts), use the <type>numeric</type> type instead.
</para>
</listitem>
<listitem>
<para>
If you want to do complicated calculations with these types
for anything important, especially if you rely on certain
behavior in boundary cases (infinity, underflow), you should
evaluate the implementation carefully.
</para>
</listitem>
<listitem>
<para>
Comparing two floating-point values for equality might not
always work as expected.
</para>
</listitem>
</itemizedlist>
</para>
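
<para>
The equality pitfall mentioned above can be sketched with a pair of
queries. The first result assumes a platform with IEEE 754 double
precision arithmetic:
<programlisting>
-- Binary floating point cannot represent 0.1, 0.2, or 0.3 exactly,
-- so the sum differs slightly from the constant 0.3.
SELECT 0.1::float8 + 0.2::float8 = 0.3::float8;      -- false

-- The same comparison on numeric values is exact.
SELECT 0.1::numeric + 0.2::numeric = 0.3::numeric;   -- true
</programlisting>
</para>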

<para>
On most platforms, the <type>real</type> type has a range of at least
1E-37 to 1E+37 with a precision of at least 6 decimal digits. The
<type>double precision</type> type typically has a range of around
1E-307 to 1E+308 with a precision of at least 15 digits. Values that
are too large or too small will cause an error. Rounding might
take place if the precision of an input number is too high.
Numbers too close to zero that are not representable as distinct
from zero will cause an underflow error.
</para>

<indexterm>
<primary>not a number</primary>
<secondary>double precision</secondary>
</indexterm>

<para>
In addition to ordinary numeric values, the floating-point types
have several special values:
<literallayout>
<literal>Infinity</literal>
<literal>-Infinity</literal>
<literal>NaN</literal>
</literallayout>
These represent the IEEE 754 special values
<quote>infinity</quote>, <quote>negative infinity</quote>, and
<quote>not-a-number</quote>, respectively. (On a machine whose
floating-point arithmetic does not follow IEEE 754, these values
will probably not work as expected.) When writing these values
as constants in an SQL command, you must put quotes around them,
for example <literal>UPDATE table SET x = 'Infinity'</>. On input,
these strings are recognized in a case-insensitive manner.
</para>

<note>
<para>
IEEE 754 specifies that <literal>NaN</> should not compare equal
to any other floating-point value (including <literal>NaN</>).
In order to allow floating-point values to be sorted and used
in tree-based indexes, <productname>PostgreSQL</> treats
<literal>NaN</> values as equal, and greater than all
non-<literal>NaN</> values.
</para>
</note>
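
<para>
A short sketch of how the special floating-point values compare,
assuming a platform that follows IEEE 754:
<programlisting>
SELECT 'Infinity'::float8 > 1e308::float8;    -- true
SELECT 'NaN'::float8 = 'nan'::float8;         -- true (input is case-insensitive)
SELECT 'NaN'::float8 > 'Infinity'::float8;    -- true: NaN sorts above all other values
</programlisting>
</para>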

<para>
<productname>PostgreSQL</productname> also supports the SQL-standard
notations <type>float</type> and
<type>float(<replaceable>p</replaceable>)</type> for specifying
inexact numeric types. Here, <replaceable>p</replaceable> specifies
the minimum acceptable precision in <emphasis>binary</> digits.
<productname>PostgreSQL</productname> accepts
<type>float(1)</type> to <type>float(24)</type> as selecting the
<type>real</type> type, while
<type>float(25)</type> to <type>float(53)</type> select
<type>double precision</type>. Values of <replaceable>p</replaceable>
outside the allowed range draw an error.
<type>float</type> with no precision specified is taken to mean
<type>double precision</type>.
</para>
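
<para>
The mapping from <type>float(<replaceable>p</replaceable>)</type> to
the underlying type can be checked directly. This sketch assumes that
the <function>pg_typeof</function> function is available in your
server version:
<programlisting>
SELECT pg_typeof(1::float(24));   -- real
SELECT pg_typeof(1::float(25));   -- double precision
SELECT pg_typeof(1::float);       -- double precision
SELECT 1::float(54);              -- error: precision outside the allowed range
</programlisting>
</para>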

<note>
<para>
Prior to <productname>PostgreSQL</productname> 7.4, the precision in
<type>float(<replaceable>p</replaceable>)</type> was taken to mean
so many <emphasis>decimal</> digits. This has been corrected to match the SQL
standard, which specifies that the precision is measured in binary
digits. The assumption that <type>real</type> and
<type>double precision</type> have exactly 24 and 53 bits in the
mantissa respectively is correct for IEEE-standard floating point
implementations. On non-IEEE platforms it might be off a little, but
for simplicity the same ranges of <replaceable>p</replaceable> are used
on all platforms.
</para>
</note>
</sect2>

<sect2 id="datatype-serial">
<title>Serial Types</title>

<indexterm zone="datatype-serial">
<primary>serial</primary>
</indexterm>
<indexterm zone="datatype-serial">
<primary>bigserial</primary>
</indexterm>
<indexterm zone="datatype-serial">
<primary>serial4</primary>
</indexterm>
<indexterm zone="datatype-serial">
<primary>serial8</primary>
</indexterm>
<indexterm>
<primary>auto-increment</primary>
<see>serial</see>
</indexterm>
<indexterm>
<primary>sequence</primary>
<secondary>and serial type</secondary>
</indexterm>

<para>
The data types <type>serial</type> and <type>bigserial</type>
are not true types, but merely
a notational convenience for creating unique identifier columns
(similar to the <literal>AUTO_INCREMENT</literal> property
supported by some other databases). In the current
implementation, specifying:

<programlisting>
CREATE TABLE <replaceable class="parameter">tablename</replaceable> (
    <replaceable class="parameter">colname</replaceable> SERIAL
);
</programlisting>

is equivalent to specifying:

<programlisting>
CREATE SEQUENCE <replaceable class="parameter">tablename</replaceable>_<replaceable class="parameter">colname</replaceable>_seq;
CREATE TABLE <replaceable class="parameter">tablename</replaceable> (
    <replaceable class="parameter">colname</replaceable> integer NOT NULL DEFAULT nextval('<replaceable class="parameter">tablename</replaceable>_<replaceable class="parameter">colname</replaceable>_seq')
);
ALTER SEQUENCE <replaceable class="parameter">tablename</replaceable>_<replaceable class="parameter">colname</replaceable>_seq OWNED BY <replaceable class="parameter">tablename</replaceable>.<replaceable class="parameter">colname</replaceable>;
</programlisting>

Thus, we have created an integer column and arranged for its default
values to be assigned from a sequence generator. A <literal>NOT NULL</>
constraint is applied to ensure that a null value cannot be
inserted. (In most cases you would also want to attach a
<literal>UNIQUE</> or <literal>PRIMARY KEY</> constraint to prevent
duplicate values from being inserted by accident, but this is
not automatic.) Lastly, the sequence is marked as <quote>owned by</>
the column, so that it will be dropped if the column or table is dropped.
</para>

<note>
<para>
Prior to <productname>PostgreSQL</productname> 7.3, <type>serial</type>
implied <literal>UNIQUE</literal>. This is no longer automatic. If
you wish a serial column to have a unique constraint or be a
primary key, it must now be specified, just as for
any other data type.
</para>
</note>

<para>
To insert the next value of the sequence into the <type>serial</type>
column, specify that the <type>serial</type>
column should be assigned its default value. This can be done
either by excluding the column from the list of columns in
the <command>INSERT</command> statement, or through the use of
the <literal>DEFAULT</literal> key word.
</para>
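<para>
The two approaches might look as follows. This is only a minimal sketch;
the table <literal>items</literal> and its columns are hypothetical names
chosen for illustration:
<programlisting>
CREATE TABLE items (
    id    serial PRIMARY KEY,   -- uniqueness must be requested explicitly
    label text
);

-- Omit the serial column from the column list ...
INSERT INTO items (label) VALUES ('first');

-- ... or name it and ask for its default value.
INSERT INTO items (id, label) VALUES (DEFAULT, 'second');

SELECT id, label FROM items ORDER BY id;
</programlisting>
In a freshly created table the two rows would be expected to receive the
values 1 and 2, since a new sequence starts at 1 by default.
</para>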
<para>
The type names <type>serial</type> and <type>serial4</type> are
equivalent: both create <type>integer</type> columns. The type
names <type>bigserial</type> and <type>serial8</type> work
the same way, except that they create a <type>bigint</type>
column. <type>bigserial</type> should be used if you anticipate
the use of more than 2<superscript>31</> identifiers over the
lifetime of the table.
</para>
<para>
The sequence created for a <type>serial</type> column is
automatically dropped when the owning column is dropped.
You can drop the sequence without dropping the column, but this
will force removal of the column default expression.
</para>
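<para>
Dropping only the sequence might be sketched as follows, assuming a
hypothetical table <literal>items</literal> whose <type>serial</type>
column <literal>id</literal> uses the default sequence name
<literal>items_id_seq</literal>. Because the column default depends on
the sequence, <literal>CASCADE</literal> is normally needed:
<programlisting>
-- Look up the sequence attached to the column.
SELECT pg_get_serial_sequence('items', 'id');

-- Drop the sequence; the column remains but loses its default expression.
DROP SEQUENCE items_id_seq CASCADE;
</programlisting>
Afterwards the column keeps its <literal>NOT NULL</literal> constraint,
so an <command>INSERT</command> has to supply a value for it explicitly.
</para>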
</sect2>
</sect1>

<sect1 id="datatype-money">
<title>Monetary Types</title>

<para>
The <type>money</type> type stores a currency amount with a fixed
fractional precision; see <xref
linkend="datatype-money-table">. The fractional precision is
determined by the database's <xref linkend="guc-lc-monetary"> setting.
Input is accepted in a variety of formats, including integer and
floating-point literals, as well as typical
currency formatting, such as <literal>'$1,000.00'</literal>.
Output is generally in the latter form but depends on the locale.
Non-quoted numeric values can be converted to <type>money</type> by
casting the numeric value to <type>text</type> and then
<type>money</type>, for example:
<programlisting>
SELECT 1234::text::money;
</programlisting>
There is no simple way of doing the reverse in a locale-independent
manner, namely casting a <type>money</type> value to a numeric type.
If you know the currency symbol and thousands separator you can use
<function>regexp_replace()</>:
<programlisting>
SELECT regexp_replace('52093.89'::money::text, '[$,]', '', 'g')::numeric;
</programlisting>
</para>

<para>
Since the output of this data type is locale-sensitive, it might not
work to load <type>money</> data into a database that has a different
setting of <varname>lc_monetary</>. To avoid problems, before
restoring a dump into a new database, make sure <varname>lc_monetary</> has the same or
an equivalent value as in the database that was dumped.
</para>

<table id="datatype-money-table">
<title>Monetary Types</title>
<tgroup cols="4">
<thead>
<row>
<entry>Name</entry>
<entry>Storage Size</entry>
<entry>Description</entry>
<entry>Range</entry>
</row>
</thead>
<tbody>
<row>
<entry>money</entry>
<entry>8 bytes</entry>
<entry>currency amount</entry>
<entry>-92233720368547758.08 to +92233720368547758.07</entry>
</row>
</tbody>
</tgroup>
</table>
</sect1>

<sect1 id="datatype-character">
<title>Character Types</title>

<indexterm zone="datatype-character">
<primary>character string</primary>
<secondary>data types</secondary>
</indexterm>
<indexterm>
<primary>string</primary>
<see>character string</see>
</indexterm>
<indexterm zone="datatype-character">
<primary>character</primary>
</indexterm>
<indexterm zone="datatype-character">
<primary>character varying</primary>
</indexterm>
<indexterm zone="datatype-character">
<primary>text</primary>
</indexterm>
<indexterm zone="datatype-character">
<primary>char</primary>
</indexterm>
<indexterm zone="datatype-character">
<primary>varchar</primary>
</indexterm>

<table id="datatype-character-table">
<title>Character Types</title>
<tgroup cols="2">
<thead>
<row>
<entry>Name</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry><type>character varying(<replaceable>n</>)</type>, <type>varchar(<replaceable>n</>)</type></entry>
<entry>variable-length with limit</entry>
</row>
<row>
<entry><type>character(<replaceable>n</>)</type>, <type>char(<replaceable>n</>)</type></entry>
<entry>fixed-length, blank padded</entry>
</row>
<row>
<entry><type>text</type></entry>
<entry>variable unlimited length</entry>
</row>
</tbody>
</tgroup>
</table>

<para>
<xref linkend="datatype-character-table"> shows the
general-purpose character types available in
<productname>PostgreSQL</productname>.
</para>

<para>
<acronym>SQL</acronym> defines two primary character types:
<type>character varying(<replaceable>n</>)</type> and
<type>character(<replaceable>n</>)</type>, where <replaceable>n</>
is a positive integer. Both of these types can store strings up to
<replaceable>n</> characters (not bytes) in length. An attempt to store a
longer string into a column of these types will result in an
error, unless the excess characters are all spaces, in which case
the string will be truncated to the maximum length. (This somewhat
bizarre exception is required by the <acronym>SQL</acronym>
standard.) If the string to be stored is shorter than the declared
length, values of type <type>character</type> will be space-padded;
values of type <type>character varying</type> will simply store the
shorter string.
</para>

<para>
If one explicitly casts a value to <type>character
varying(<replaceable>n</>)</type> or
<type>character(<replaceable>n</>)</type>, then an over-length
value will be truncated to <replaceable>n</> characters without
raising an error. (This too is required by the
<acronym>SQL</acronym> standard.)
</para>
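<para>
The difference between storing into a column and casting explicitly can
be sketched as follows; the table name <literal>cast_demo</literal> is
hypothetical:
<programlisting>
CREATE TABLE cast_demo (v varchar(3));

-- Accepted: the excess characters are all spaces, so 'abc' is stored.
INSERT INTO cast_demo VALUES ('abc   ');

-- Rejected with an error: the excess character is not a space.
INSERT INTO cast_demo VALUES ('abcd');

-- Accepted: the explicit cast silently truncates the value to 'abc'.
INSERT INTO cast_demo VALUES ('abcd'::varchar(3));
</programlisting>
</para>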
<para>
The notations <type>varchar(<replaceable>n</>)</type> and
<type>char(<replaceable>n</>)</type> are aliases for <type>character
varying(<replaceable>n</>)</type> and
<type>character(<replaceable>n</>)</type>, respectively.
<type>character</type> without a length specifier is equivalent to
<type>character(1)</type>. If <type>character varying</type> is used
without a length specifier, the type accepts strings of any size. The
latter is a <productname>PostgreSQL</> extension.
</para>
<para>
In addition, <productname>PostgreSQL</productname> provides the
<type>text</type> type, which stores strings of any length.
Although the type <type>text</type> is not in the
<acronym>SQL</acronym> standard, several other SQL database
management systems have it as well.
</para>
<para>
Values of type <type>character</type> are physically padded
with spaces to the specified width <replaceable>n</>, and are
stored and displayed that way. However, the padding spaces are
treated as semantically insignificant. Trailing spaces are
disregarded when comparing two values of type <type>character</type>,
and they will be removed when converting a <type>character</type> value
to one of the other string types. Note that trailing spaces
<emphasis>are</> semantically significant in
<type>character varying</type> and <type>text</type> values.
</para>
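<para>
The following queries sketch these rules. The results noted in the
comments assume single-byte (ASCII) characters:
<programlisting>
-- Padding spaces are ignored when comparing character values.
SELECT 'a  '::character(3) = 'a'::character(3);                  -- true

-- Trailing spaces are significant in the other string types.
SELECT 'a  '::character varying(3) = 'a'::character varying(3);  -- false
SELECT 'a  '::text = 'a'::text;                                  -- false

-- The value is physically padded, but the padding is removed
-- when it is converted to another string type.
SELECT octet_length('a'::character(3));                          -- 3
SELECT octet_length('a'::character(3)::text);                    -- 1
</programlisting>
</para>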
<para>
The storage requirement for a short string (up to 126 bytes) is 1 byte
plus the actual string, which includes the space padding in the case of
<type>character</type>. Longer strings have 4 bytes of overhead instead
of 1. Long strings are compressed by the system automatically, so
the physical requirement on disk might be less. Very long values are also
stored in background tables so that they do not interfere with rapid
access to shorter column values. In any case, the longest
possible character string that can be stored is about 1 GB. (The
maximum value that will be allowed for <replaceable>n</> in the data
type declaration is less than that. It wouldn't be useful to
change this because with multibyte character encodings the number of
characters and bytes can be quite different. If you desire to
store long strings with no specific upper limit, use
<type>text</type> or <type>character varying</type> without a length
specifier, rather than making up an arbitrary length limit.)
</para>
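<para>
The effect of blank padding on storage can be observed with
<function>pg_column_size</function>. This is only a sketch; the table
name <literal>storage_demo</literal> is hypothetical and single-byte
characters are assumed:
<programlisting>
CREATE TABLE storage_demo (
    c character(10),
    v character varying(10)
);
INSERT INTO storage_demo VALUES ('ok', 'ok');

-- Both values are short strings with the same per-value overhead,
-- but c also stores its eight padding spaces.
SELECT pg_column_size(c) AS c_bytes,
       pg_column_size(v) AS v_bytes
FROM storage_demo;
</programlisting>
The first figure would be expected to exceed the second by the eight
bytes of padding.
</para>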
<tip>
<para>
There is no performance difference among these three types,
apart from increased storage space when using the blank-padded
type, and a few extra CPU cycles to check the length when storing into
a length-constrained column. While
<type>character(<replaceable>n</>)</type> has performance
advantages in some other database systems, there is no such advantage in
<productname>PostgreSQL</productname>; in fact
<type>character(<replaceable>n</>)</type> is usually the slowest of
the three because of its additional storage costs. In most situations
<type>text</type> or <type>character varying</type> should be used
instead.
</para>
</tip>

<para>
Refer to <xref linkend="sql-syntax-strings"> for information about
the syntax of string literals, and to <xref linkend="functions">
for information about available operators and functions. The
database character set determines the character set used to store
textual values; for more information on character set support,
refer to <xref linkend="multibyte">.
</para>

<example>
<title>Using the character types</title>
<programlisting>
CREATE TABLE test1 (a character(4));
INSERT INTO test1 VALUES ('ok');
SELECT a, char_length(a) FROM test1; -- <co id="co.datatype-char">
<computeroutput>
  a   | char_length
------+-------------
 ok   |           2
</computeroutput>

CREATE TABLE test2 (b varchar(5));
INSERT INTO test2 VALUES ('ok');
INSERT INTO test2 VALUES ('good ');
INSERT INTO test2 VALUES ('too long');
<computeroutput>ERROR: value too long for type character varying(5)</computeroutput>
INSERT INTO test2 VALUES ('too long'::varchar(5)); -- explicit truncation
SELECT b, char_length(b) FROM test2;
<computeroutput>
   b   | char_length
-------+-------------
 ok    |           2
 good  |           5
 too l |           5
</computeroutput>
</programlisting>
<calloutlist>
<callout arearefs="co.datatype-char">
<para>
The <function>char_length</function> function is discussed in
<xref linkend="functions-string">.
</para>
</callout>
</calloutlist>
</example>
<para>
There are two other fixed-length character types in
<productname>PostgreSQL</productname>, shown in <xref
linkend="datatype-character-special-table">. The <type>name</type>
type exists <emphasis>only</emphasis> for the storage of identifiers
in the internal system catalogs and is not intended for use by the general user. Its
length is currently defined as 64 bytes (63 usable characters plus
terminator) but should be referenced using the constant
<symbol>NAMEDATALEN</symbol> in <literal>C</> source code.
The length is set at compile time (and
is therefore adjustable for special uses); the default maximum
length might change in a future release. The type <type>"char"</type>
(note the quotes) is different from <type>char(1)</type> in that it
only uses one byte of storage. It is internally used in the system
catalogs as a simplistic enumeration type.
</para>
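<para>
Both types can be seen at work in the system catalogs. As an
illustration, the <structname>pg_class</structname> catalog stores each
relation's name in a <type>name</type> column and its kind in a
<type>"char"</type> column. The query below assumes that
<function>pg_typeof</function> is available
(<productname>PostgreSQL</productname> 8.4 or later):
<programlisting>
-- relname is of type name; relkind is of type "char".
SELECT pg_typeof(relname), pg_typeof(relkind) FROM pg_class LIMIT 1;

-- relkind acts as a simple enumeration of relation kinds.
SELECT relkind, count(*) FROM pg_class GROUP BY relkind ORDER BY relkind;
</programlisting>
</para>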

<table id="datatype-character-special-table">
<title>Special Character Types</title>
<tgroup cols="3">
<thead>
<row>
<entry>Name</entry>
<entry>Storage Size</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry><type>"char"</type></entry>
<entry>1 byte</entry>
<entry>single-byte internal type</entry>
</row>
<row>
<entry><type>name</type></entry>
<entry>64 bytes</entry>
<entry>internal type for object names</entry>
</row>
</tbody>
</tgroup>
</table>
</sect1>

<sect1 id="datatype-binary">
<title>Binary Data Types</title>

<indexterm zone="datatype-binary">
<primary>binary data</primary>
</indexterm>
<indexterm zone="datatype-binary">
<primary>bytea</primary>
</indexterm>

<para>
The <type>bytea</type> data type allows storage of binary strings;
see <xref linkend="datatype-binary-table">.
</para>

<table id="datatype-binary-table">
<title>Binary Data Types</title>
<tgroup cols="3">
<thead>
<row>
<entry>Name</entry>
<entry>Storage Size</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry><type>bytea</type></entry>
<entry>1 or 4 bytes plus the actual binary string</entry>
<entry>variable-length binary string</entry>
</row>
</tbody>
</tgroup>
</table>
<para>
A binary string is a sequence of octets (or bytes). Binary
strings are distinguished from character strings in two
ways: First, binary strings specifically allow storing
octets of value zero and other <quote>non-printable</quote>
octets (usually, octets outside the range 32 to 126).
Character strings disallow zero octets, and also disallow any
other octet values and sequences of octet values that are invalid
according to the database's selected character set encoding.
Second, operations on binary strings process the actual bytes,
whereas the processing of character strings depends on locale settings.
In short, binary strings are appropriate for storing data that the
programmer thinks of as <quote>raw bytes</>, whereas character
strings are appropriate for storing text.
</para>

<para>
When entering <type>bytea</type> values, octets of certain
values <emphasis>must</emphasis> be escaped (but all octet
values <emphasis>can</emphasis> be escaped) when used as part
of a string literal in an <acronym>SQL</acronym> statement. In
general, to escape an octet, convert it into its three-digit
octal value and precede it
by two backslashes. <xref linkend="datatype-binary-sqlesc">
shows the characters that must be escaped, and gives the alternative
escape sequences where applicable.
</para>

<table id="datatype-binary-sqlesc">
<title><type>bytea</> Literal Escaped Octets</title>
<tgroup cols="5">
<thead>
<row>
<entry>Decimal Octet Value</entry>
<entry>Description</entry>
<entry>Escaped Input Representation</entry>
<entry>Example</entry>
<entry>Output Representation</entry>
</row>
</thead>
<tbody>
<row>
<entry>0</entry>
<entry>zero octet</entry>
<entry><literal>E'\\000'</literal></entry>
<entry><literal>SELECT E'\\000'::bytea;</literal></entry>
<entry><literal>\000</literal></entry>
</row>
<row>
<entry>39</entry>
<entry>single quote</entry>
<entry><literal>''''</literal> or <literal>E'\\047'</literal></entry>
<entry><literal>SELECT E'\''::bytea;</literal></entry>
<entry><literal>'</literal></entry>
</row>
<row>
<entry>92</entry>
<entry>backslash</entry>
<entry><literal>E'\\\\'</literal> or <literal>E'\\134'</literal></entry>
<entry><literal>SELECT E'\\\\'::bytea;</literal></entry>
<entry><literal>\\</literal></entry>
</row>
<row>
<entry>0 to 31 and 127 to 255</entry>
<entry><quote>non-printable</quote> octets</entry>
<entry><literal>E'\\<replaceable>xxx</>'</literal> (octal value)</entry>
<entry><literal>SELECT E'\\001'::bytea;</literal></entry>
<entry><literal>\001</literal></entry>
</row>
</tbody>
</tgroup>
</table>
<para>
The requirement to escape <emphasis>non-printable</emphasis> octets
varies depending on locale settings. In some instances you can get away
with leaving them unescaped. Note that the result in each of the examples
in <xref linkend="datatype-binary-sqlesc"> was exactly one octet in
length, even though the output representation is sometimes
more than one character.
</para>
<para>
The reason multiple backslashes are required, as shown
in <xref linkend="datatype-binary-sqlesc">, is that an input
string written as a string literal must pass through two parse
phases in the <productname>PostgreSQL</productname> server.
The first backslash of each pair is interpreted as an escape
character by the string-literal parser (assuming escape string
syntax is used) and is therefore consumed, leaving the second backslash of the
pair. (Dollar-quoted strings can be used to avoid this level
of escaping.) The remaining backslash is then recognized by the
<type>bytea</type> input function as either starting a three-digit
octal value or escaping another backslash. For example,
a string literal passed to the server as <literal>E'\\001'</literal>
becomes <literal>\001</literal> after passing through the
escape string parser. The <literal>\001</literal> is then sent
to the <type>bytea</type> input function, where it is converted
to a single octet with a decimal value of 1. Note that the
single-quote character is not treated specially by <type>bytea</type>,
so it follows the normal rules for string literals. (See also
<xref linkend="sql-syntax-strings">.)
</para>
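<para>
The two parse phases can be checked by measuring the length of the
resulting value with <function>octet_length</function>. The following
queries are a brief sketch of the cases described above:
<programlisting>
-- Escape string syntax: the backslash is doubled.
SELECT octet_length(E'\\001'::bytea);     -- 1

-- Dollar quoting skips the string-literal escaping, so one backslash suffices.
SELECT octet_length($$\001$$::bytea);     -- 1

-- 'a', a zero octet, and 'b' make three octets.
SELECT octet_length(E'a\\000b'::bytea);   -- 3

-- Four backslashes in the literal yield a single backslash octet.
SELECT octet_length(E'\\\\'::bytea);      -- 1
</programlisting>
</para>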
<para>
<type>bytea</type> octets are sometimes escaped when output. In general, each
<quote>non-printable</quote> octet is converted into
its equivalent three-digit octal value and preceded by one backslash.
Most <quote>printable</quote> octets are represented by their standard
representation in the client character set. The octet with decimal
value 92 (backslash) is doubled in the output.
Details are in <xref linkend="datatype-binary-resesc">.
</para>
<table id="datatype-binary-resesc">
<title><type>bytea</> Output Escaped Octets</title>
<tgroup cols="5">
<thead>
<row>
<entry>Decimal Octet Value</entry>
<entry>Description</entry>
<entry>Escaped Output Representation</entry>
<entry>Example</entry>
<entry>Output Result</entry>
</row>
</thead>
<tbody>
<row>
<entry>92</entry>
<entry>backslash</entry>
<entry><literal>\\</literal></entry>
<entry><literal>SELECT E'\\134'::bytea;</literal></entry>
<entry><literal>\\</literal></entry>
</row>
<row>
<entry>0 to 31 and 127 to 255</entry>
<entry><quote>non-printable</quote> octets</entry>
<entry><literal>\<replaceable>xxx</></literal> (octal value)</entry>
<entry><literal>SELECT E'\\001'::bytea;</literal></entry>
<entry><literal>\001</literal></entry>
</row>
<row>
<entry>32 to 126</entry>
<entry><quote>printable</quote> octets</entry>
<entry>client character set representation</entry>
<entry><literal>SELECT E'\\176'::bytea;</literal></entry>
<entry><literal>~</literal></entry>
</row>
</tbody>
</tgroup>
</table>
<para>
Depending on the front end to <productname>PostgreSQL</> you use,
you might have additional work to do in terms of escaping and
unescaping <type>bytea</type> strings. For example, you might also
have to escape line feeds and carriage returns if your interface
automatically translates these.
</para>
2002-11-15 04:11:18 +01:00
<para>
The <acronym>SQL</acronym> standard defines a different binary
string type, called <type>BLOB</type> or <type>BINARY LARGE
OBJECT</type>. The input format is different from
<type>bytea</type>, but the provided functions and operators are
mostly the same.
</para>
2002-11-11 21:14:04 +01:00
</sect1>
2001-09-09 19:21:59 +02:00
2001-11-20 16:42:44 +01:00
2001-01-13 19:34:51 +01:00
<sect1 id="datatype-datetime">
1999-05-04 04:22:13 +02:00
<title>Date/Time Types</title>
2003-08-31 19:32:24 +02:00
<indexterm zone="datatype-datetime">
<primary>date</primary>
</indexterm>
<indexterm zone="datatype-datetime">
<primary>time</primary>
</indexterm>
<indexterm zone="datatype-datetime">
<primary>time without time zone</primary>
</indexterm>
<indexterm zone="datatype-datetime">
<primary>time with time zone</primary>
</indexterm>
<indexterm zone="datatype-datetime">
<primary>timestamp</primary>
</indexterm>
<indexterm zone="datatype-datetime">
<primary>timestamp with time zone</primary>
</indexterm>
<indexterm zone="datatype-datetime">
<primary>timestamp without time zone</primary>
</indexterm>
<indexterm zone="datatype-datetime">
<primary>interval</primary>
</indexterm>
<indexterm zone="datatype-datetime">
<primary>time span</primary>
</indexterm>
1999-05-04 04:22:13 +02:00
<para>
<productname>PostgreSQL</productname> supports the full set of
<acronym>SQL</acronym> date and time types, shown in <xref
linkend="datatype-datetime-table">. The operations available
on these data types are described in
<xref linkend="functions-datetime">.
</para>
1998-10-27 07:14:41 +01:00
2002-11-11 21:14:04 +01:00
<table id="datatype-datetime-table">
2001-02-14 20:37:26 +01:00
<title>Date/Time Types</title>
2001-10-09 20:46:00 +02:00
<tgroup cols="6">
1999-05-04 04:22:13 +02:00
<thead>
<row>
2003-03-13 02:30:29 +01:00
<entry>Name</entry>
<entry>Storage Size</entry>
2000-01-23 02:27:39 +01:00
<entry>Description</entry>
2003-03-13 02:30:29 +01:00
<entry>Low Value</entry>
<entry>High Value</entry>
2000-01-23 02:27:39 +01:00
<entry>Resolution</entry>
1999-05-04 04:22:13 +02:00
</row>
</thead>
<tbody>
<row>
2002-10-31 23:18:42 +01:00
<entry><type>timestamp [ (<replaceable>p</replaceable>) ] [ without time zone ]</type></entry>
2000-01-23 02:27:39 +01:00
<entry>8 bytes</entry>
2009-04-27 18:27:36 +02:00
<entry>both date and time (no time zone)</entry>
2000-01-23 02:27:39 +01:00
<entry>4713 BC</entry>
2008-03-30 06:08:15 +02:00
<entry>294276 AD</entry>
2001-02-14 20:37:26 +01:00
<entry>1 microsecond / 14 digits</entry>
1999-05-04 04:22:13 +02:00
</row>
2000-03-14 23:52:53 +01:00
<row>
2002-10-31 23:18:42 +01:00
<entry><type>timestamp [ (<replaceable>p</replaceable>) ] with time zone</type></entry>
2000-03-14 23:52:53 +01:00
<entry>8 bytes</entry>
2003-03-13 02:30:29 +01:00
<entry>both date and time, with time zone</entry>
2001-09-28 10:15:35 +02:00
<entry>4713 BC</entry>
2008-03-30 06:08:15 +02:00
<entry>294276 AD</entry>
2001-02-14 20:37:26 +01:00
<entry>1 microsecond / 14 digits</entry>
2000-03-14 23:52:53 +01:00
</row>
1999-05-04 04:22:13 +02:00
<row>
2000-01-23 02:27:39 +01:00
<entry><type>date</type></entry>
<entry>4 bytes</entry>
2009-04-27 18:27:36 +02:00
<entry>date (no time of day)</entry>
2000-01-23 02:27:39 +01:00
<entry>4713 BC</entry>
2006-02-09 04:39:17 +01:00
<entry>5874897 AD</entry>
2000-01-23 02:27:39 +01:00
<entry>1 day</entry>
1999-05-04 04:22:13 +02:00
</row>
<row>
2001-12-29 19:35:54 +01:00
<entry><type>time [ (<replaceable>p</replaceable>) ] [ without time zone ]</type></entry>
2001-08-31 03:55:25 +02:00
<entry>8 bytes</entry>
2009-04-27 18:27:36 +02:00
<entry>time of day (no date)</entry>
2005-10-22 21:33:57 +02:00
<entry>00:00:00</entry>
2005-10-14 13:47:57 +02:00
<entry>24:00:00</entry>
2005-01-17 19:47:15 +01:00
<entry>1 microsecond / 14 digits</entry>
1999-05-04 04:22:13 +02:00
</row>
2000-03-14 23:52:53 +01:00
<row>
2001-12-29 19:35:54 +01:00
<entry><type>time [ (<replaceable>p</replaceable>) ] with time zone</type></entry>
2001-08-31 03:55:25 +02:00
<entry>12 bytes</entry>
2003-03-13 02:30:29 +01:00
<entry>time of day (no date), with time zone</entry>
2006-10-18 18:43:14 +02:00
<entry>00:00:00+1459</entry>
<entry>24:00:00-1459</entry>
2005-01-17 19:47:15 +01:00
<entry>1 microsecond / 14 digits</entry>
2000-03-14 23:52:53 +01:00
</row>
2008-11-09 01:28:35 +01:00
<row>
<entry><type>interval [ <replaceable>fields</replaceable> ] [ (<replaceable>p</replaceable>) ]</type></entry>
<entry>12 bytes</entry>
2009-04-27 18:27:36 +02:00
<entry>time interval</entry>
2008-11-09 01:28:35 +01:00
<entry>-178000000 years</entry>
<entry>178000000 years</entry>
<entry>1 microsecond / 14 digits</entry>
</row>
1999-05-04 04:22:13 +02:00
</tbody>
</tgroup>
</table>
2003-12-01 21:34:53 +01:00
<note>
<para>
Prior to <productname>PostgreSQL</productname> 7.3, writing just
<type>timestamp</type> was equivalent to <type>timestamp with
time zone</type>. This was changed for SQL compliance.
</para>
</note>
2001-12-08 04:24:23 +01:00
<para>
<type>time</type>, <type>timestamp</type>, and
<type>interval</type> accept an optional precision value
<replaceable>p</replaceable> which specifies the number of
fractional digits retained in the seconds field. By default, there
is no explicit bound on precision. The allowed range of
<replaceable>p</replaceable> is from 0 to 6 for the
<type>timestamp</type> and <type>interval</type> types.
</para>
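<para>
As a brief illustration of the precision modifier, the following
sketch applies different precisions to the same kind of literal. It
assumes the default eight-byte integer storage, in which case the
fractional seconds are expected to be rounded to
<replaceable>p</replaceable> digits. The literal values are arbitrary:
<programlisting>
-- keep two fractional digits in the seconds field
SELECT '2004-10-19 10:23:54.123456'::timestamp(2);

-- keep no fractional digits at all
SELECT '04:05:06.789'::time(0);

-- no precision given: the fractional digits of the input are retained
SELECT '2004-10-19 10:23:54.123456'::timestamp;
</programlisting>
</para>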
2002-10-31 23:18:42 +01:00
<note>
<para>
When <type>timestamp</> values are stored as eight-byte integers
(currently the default), microsecond precision is available over
the full range of values. When <type>timestamp</> values are
stored as double precision floating-point numbers instead (a
deprecated compile-time option), the effective limit of precision
might be less than 6. <type>timestamp</type> values are stored as
seconds before or after midnight 2000-01-01. When
<type>timestamp</type> values are implemented using floating-point
numbers, microsecond precision is achieved for dates within a few
years of 2000-01-01, but the precision degrades for dates further
away. Note that using floating-point datetimes allows a larger
range of <type>timestamp</type> values to be represented than
shown above: from 4713 BC up to 5874897 AD.
</para>
<para>
The same compile-time option also determines whether
<type>time</type> and <type>interval</type> values are stored as
floating-point numbers or eight-byte integers. In the
floating-point case, large <type>interval</type> values degrade in
precision as the size of the interval increases.
</para>
</note>
2003-01-29 02:08:42 +01:00
<para>
For the <type>time</type> types, the allowed range of
<replaceable>p</replaceable> is from 0 to 6 when eight-byte integer
storage is used, or from 0 to 10 when floating-point storage is used.
</para>
2008-09-11 17:27:30 +02:00
<para>
The <type>interval</type> type has an additional option, which is
to restrict the set of stored fields by writing one of these phrases:
<programlisting>
YEAR
MONTH
DAY
HOUR
MINUTE
SECOND
YEAR TO MONTH
DAY TO HOUR
DAY TO MINUTE
DAY TO SECOND
HOUR TO MINUTE
HOUR TO SECOND
MINUTE TO SECOND
</programlisting>
Note that if both <replaceable>fields</replaceable> and
<replaceable>p</replaceable> are specified, the
<replaceable>fields</replaceable> must include <literal>SECOND</>,
since the precision applies only to the seconds.
</para>
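<para>
The field restriction is perhaps easiest to see in a column
definition. The following sketch uses a hypothetical table
<literal>shift_log</literal>, whose name and columns are invented
here purely for illustration:
<programlisting>
-- shift_log and its columns are illustrative names, not part of PostgreSQL
CREATE TEMPORARY TABLE shift_log (
    tenure   interval YEAR TO MONTH,       -- stores years and months only
    elapsed  interval HOUR TO SECOND (2),  -- precision is allowed: fields include SECOND
    pause    interval (0)                  -- precision without a fields restriction
);
</programlisting>
Writing a precision together with a restriction such as
<literal>YEAR TO MONTH</literal> would not be meaningful, because
that restriction has no seconds field for the precision to apply to.
</para>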
2002-11-11 21:14:04 +01:00
<para>
The type <type>time with time zone</type> is defined by the SQL
standard, but the definition exhibits properties that make its
usefulness questionable. In most cases, a combination of
<type>date</type>, <type>time</type>, <type>timestamp without time
zone</type>, and <type>timestamp with time zone</type> should
provide the complete range of date/time functionality required by
any application.
</para>
2002-01-04 18:02:25 +01:00
<para>
The types <type>abstime</type>
and <type>reltime</type> are lower-precision types that are used internally.
You are discouraged from using these types in
applications; these internal types
might disappear in a future release.
</para>
1998-03-01 09:16:16 +01:00
2001-01-13 19:34:51 +01:00
<sect2 id="datatype-datetime-input">
1999-01-19 17:08:26 +01:00
<title>Date/Time Input</title>
1998-03-01 09:16:16 +01:00
1999-01-19 17:08:26 +01:00
<para>
Date and time input is accepted in almost any reasonable format, including
ISO 8601, <acronym>SQL</acronym>-compatible,
traditional <productname>POSTGRES</productname>, and others.
For some formats, the ordering of day, month, and year in date input is
ambiguous, so there is support for specifying the expected
ordering of these fields. Set the <xref linkend="guc-datestyle"> parameter
to <literal>MDY</> to select month-day-year interpretation,
<literal>DMY</> to select day-month-year interpretation, or
<literal>YMD</> to select year-month-day interpretation.
</para>
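<para>
For instance, the following sketch shows how the same ambiguous input
is read under two settings. The results in the comments assume the
ISO output style:
<programlisting>
SET datestyle TO 'ISO, MDY';
SELECT '1/8/1999'::date;    -- read as month/day/year: 1999-01-08

SET datestyle TO 'ISO, DMY';
SELECT '1/8/1999'::date;    -- read as day/month/year: 1999-08-01
</programlisting>
</para>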
<para>
<productname>PostgreSQL</productname> is more flexible in
handling date/time input than the
<acronym>SQL</acronym> standard requires.
See <xref linkend="datetime-appendix">
for the exact parsing rules of date/time input and for the
recognized text fields including months, days of the week, and
time zones.
</para>
1998-12-18 17:11:12 +01:00
1999-01-19 17:08:26 +01:00
<para>
Remember that any date or time literal input needs to be enclosed
in single quotes, like text strings. Refer to
<xref linkend="sql-syntax-constants-generic"> for more
information.
<acronym>SQL</acronym> requires the following syntax:
<synopsis>
<replaceable>type</replaceable> [ (<replaceable>p</replaceable>) ] '<replaceable>value</replaceable>'
</synopsis>
where <replaceable>p</replaceable> is an optional precision
specification giving the number of
fractional digits in the seconds field. Precision can be
specified for <type>time</type>, <type>timestamp</type>, and
<type>interval</type> types. The allowed values are mentioned
above. If no precision is specified in a constant specification,
it defaults to the precision of the literal value.
</para>
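<para>
A few constants written in this syntax are sketched below. The
values are arbitrary examples:
<programlisting>
SELECT DATE '1999-01-08';
SELECT TIME '04:05:06.789';                      -- no precision: the literal's digits are kept
SELECT TIME (0) '04:05:06.789';                  -- precision 0: no fractional seconds
SELECT TIMESTAMP (2) '1999-01-08 04:05:06.789';  -- two fractional digits
</programlisting>
</para>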
2000-01-23 02:27:39 +01:00
<sect3>
2002-11-11 21:14:04 +01:00
<title>Dates</title>
2001-05-13 00:51:36 +02:00
<indexterm>
<primary>date</primary>
</indexterm>
1999-01-19 17:08:26 +01:00
<para>
2002-11-11 21:14:04 +01:00
<xref linkend="datatype-datetime-date-table"> shows some possible
inputs for the <type>date</type> type.
</para>
2000-03-14 23:52:53 +01:00
2002-11-11 21:14:04 +01:00
<table id="datatype-datetime-date-table">
2001-02-14 20:37:26 +01:00
<title>Date Input</title>
1999-01-19 17:08:26 +01:00
<tgroup cols="2">
<thead>
2003-11-01 02:56:29 +01:00
<row>
<entry>Example</entry>
<entry>Description</entry>
</row>
1999-01-19 17:08:26 +01:00
</thead>
<tbody>
2003-11-01 02:56:29 +01:00
<row>
<entry>1999-01-08</entry>
<entry>ISO 8601; January 8 in any mode
(recommended format)</entry>
</row>
2009-04-27 18:27:36 +02:00
<row>
<entry>January 8, 1999</entry>
<entry>unambiguous in any <varname>datestyle</varname> input mode</entry>
</row>
2003-11-01 02:56:29 +01:00
<row>
<entry>1/8/1999</entry>
<entry>January 8 in <literal>MDY</> mode;
August 1 in <literal>DMY</> mode</entry>
</row>
<row>
<entry>1/18/1999</entry>
<entry>January 18 in <literal>MDY</> mode;
rejected in other modes</entry>
</row>
<row>
<entry>01/02/03</entry>
<entry>January 2, 2003 in <literal>MDY</> mode;
February 1, 2003 in <literal>DMY</> mode;
February 3, 2001 in <literal>YMD</> mode
</entry>
</row>
2003-11-16 21:29:16 +01:00
<row>
<entry>1999-Jan-08</entry>
<entry>January 8 in any mode</entry>
</row>
<row>
<entry>Jan-08-1999</entry>
<entry>January 8 in any mode</entry>
</row>
<row>
<entry>08-Jan-1999</entry>
<entry>January 8 in any mode</entry>
</row>
<row>
<entry>99-Jan-08</entry>
<entry>January 8 in <literal>YMD</> mode, else error</entry>
</row>
<row>
<entry>08-Jan-99</entry>
<entry>January 8, except error in <literal>YMD</> mode</entry>
</row>
<row>
<entry>Jan-08-99</entry>
<entry>January 8, except error in <literal>YMD</> mode</entry>
</row>
2003-11-01 02:56:29 +01:00
<row>
<entry>19990108</entry>
2003-11-04 10:55:39 +01:00
<entry>ISO 8601; January 8, 1999 in any mode</entry>
2003-11-01 02:56:29 +01:00
</row>
<row>
<entry>990108</entry>
2003-11-04 10:55:39 +01:00
<entry>ISO 8601; January 8, 1999 in any mode</entry>
2003-11-01 02:56:29 +01:00
</row>
<row>
<entry>1999.008</entry>
<entry>year and day of year</entry>
</row>
<row>
<entry>J2451187</entry>
<entry>Julian day</entry>
</row>
<row>
<entry>January 8, 99 BC</entry>
2009-04-27 18:27:36 +02:00
<entry>year 99 BC</entry>
2003-11-01 02:56:29 +01:00
</row>
1999-01-19 17:08:26 +01:00
</tbody>
</tgroup>
</table>
2000-01-23 02:27:39 +01:00
</sect3>
1999-01-19 17:08:26 +01:00
2000-01-23 02:27:39 +01:00
<sect3>
2002-11-11 21:14:04 +01:00
<title>Times</title>
2000-07-14 17:26:21 +02:00
2001-05-13 00:51:36 +02:00
<indexterm>
<primary>time</primary>
</indexterm>
2001-09-28 10:15:35 +02:00
<indexterm>
<primary>time without time zone</primary>
</indexterm>
2002-11-11 21:14:04 +01:00
<indexterm>
<primary>time with time zone</primary>
</indexterm>
2001-05-13 00:51:36 +02:00
2000-07-14 17:26:21 +02:00
<para>
The time-of-day types are <type>time [
(<replaceable>p</replaceable>) ] without time zone</type> and
<type>time [ (<replaceable>p</replaceable>) ] with time
zone</type>. <type>time</type> alone is equivalent to
<type>time without time zone</type>.
</para>
2000-03-14 23:52:53 +01:00
<para>
Valid input for these types consists of a time of day followed
by an optional time zone. (See <xref
linkend="datatype-datetime-time-table">
and <xref linkend="datatype-timezone-table">.) If a time zone is
specified in the input for <type>time without time zone</type>,
it is silently ignored. You can also specify a date, but it will
be ignored, except when you use a time zone name that involves a
daylight-savings rule, such as
<literal>America/New_York</literal>. In this case, specifying the date
is required in order to determine whether standard or daylight-savings
time applies. The appropriate time zone offset is recorded in the
<type>time with time zone</type> value.
</para>
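<para>
The following sketch illustrates these rules. The comments describe
the behavior stated above rather than exact output:
<programlisting>
-- a time zone given for time without time zone is silently ignored
SELECT TIME '04:05:06 PST';

-- a numeric offset is recorded in the time with time zone value
SELECT TIME WITH TIME ZONE '04:05:06-08:00';

-- a zone name with a daylight-savings rule: the date is needed to
-- decide which offset applies, but is not itself stored
SELECT TIME WITH TIME ZONE '2003-04-12 04:05:06 America/New_York';
</programlisting>
</para>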
2000-03-14 23:52:53 +01:00
2002-11-11 21:14:04 +01:00
<table id="datatype-datetime-time-table">
2001-02-14 20:37:26 +01:00
<title>Time Input</title>
2000-03-14 23:52:53 +01:00
<tgroup cols="2">
2003-11-01 02:56:29 +01:00
<thead>
<row>
<entry>Example</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry><literal>04:05:06.789</literal></entry>
<entry>ISO 8601</entry>
</row>
<row>
<entry><literal>04:05:06</literal></entry>
<entry>ISO 8601</entry>
</row>
<row>
<entry><literal>04:05</literal></entry>
<entry>ISO 8601</entry>
</row>
<row>
<entry><literal>040506</literal></entry>
<entry>ISO 8601</entry>
</row>
<row>
<entry><literal>04:05 AM</literal></entry>
2009-06-17 23:58:49 +02:00
<entry>same as 04:05; AM does not affect value</entry>
2003-11-01 02:56:29 +01:00
</row>
<row>
<entry><literal>04:05 PM</literal></entry>
<entry>same as 16:05; input hour must be &lt;= 12</entry>
2003-11-01 02:56:29 +01:00
</row>
<row>
<entry><literal>04:05:06.789-8</literal></entry>
<entry>ISO 8601</entry>
</row>
<row>
<entry><literal>04:05:06-08:00</literal></entry>
<entry>ISO 8601</entry>
</row>
<row>
<entry><literal>04:05-08:00</literal></entry>
<entry>ISO 8601</entry>
</row>
<row>
<entry><literal>040506-08</literal></entry>
<entry>ISO 8601</entry>
</row>
<row>
<entry><literal>04:05:06 PST</literal></entry>
2006-10-16 21:58:27 +02:00
<entry>time zone specified by abbreviation</entry>
2003-11-01 02:56:29 +01:00
</row>
2006-07-06 03:46:38 +02:00
<row>
<entry><literal>2003-04-12 04:05:06 America/New_York</literal></entry>
<entry>time zone specified by full name</entry>
</row>
2003-11-01 02:56:29 +01:00
</tbody>
</tgroup>
</table>
2000-03-14 23:52:53 +01:00
2003-07-29 02:03:19 +02:00
<table tocentry="1" id="datatype-timezone-table">
<title>Time Zone Input</title>
<tgroup cols="2">
2003-11-01 02:56:29 +01:00
<thead>
<row>
<entry>Example</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry><literal>PST</literal></entry>
2006-10-17 23:03:21 +02:00
<entry>Abbreviation (for Pacific Standard Time)</entry>
2003-11-01 02:56:29 +01:00
</row>
2006-07-06 03:46:38 +02:00
<row>
<entry><literal>America/New_York</literal></entry>
<entry>Full time zone name</entry>
</row>
2006-10-17 23:03:21 +02:00
<row>
<entry><literal>PST8PDT</literal></entry>
<entry>POSIX-style time zone specification</entry>
</row>
2003-11-01 02:56:29 +01:00
<row>
<entry><literal>-8:00</literal></entry>
<entry>ISO-8601 offset for PST</entry>
</row>
<row>
<entry><literal>-800</literal></entry>
<entry>ISO-8601 offset for PST</entry>
</row>
<row>
<entry><literal>-8</literal></entry>
<entry>ISO-8601 offset for PST</entry>
</row>
<row>
<entry><literal>zulu</literal></entry>
2003-11-16 21:29:16 +01:00
<entry>Military abbreviation for UTC</entry>
2003-11-01 02:56:29 +01:00
</row>
<row>
<entry><literal>z</literal></entry>
<entry>Short form of <literal>zulu</literal></entry>
</row>
</tbody>
2003-07-29 02:03:19 +02:00
</tgroup>
</table>
2004-08-10 02:55:08 +02:00
<para>
Refer to <xref linkend="datatype-timezones"> for more information on how
to specify time zones.
</para>
2000-01-23 02:27:39 +01:00
</sect3>
<sect3>
2003-03-13 02:30:29 +01:00
<title>Time Stamps</title>
2002-11-11 21:14:04 +01:00
<indexterm>
<primary>timestamp</primary>
</indexterm>
2001-09-28 10:15:35 +02:00
2002-11-22 00:31:20 +01:00
<indexterm>
<primary>timestamp with time zone</primary>
</indexterm>
2001-09-28 10:15:35 +02:00
<indexterm>
<primary>timestamp without time zone</primary>
</indexterm>
2002-11-11 21:14:04 +01:00
<para>
Valid input for the time stamp types consists of the concatenation
of a date and a time, followed by an optional time zone,
followed by an optional <literal>AD</literal> or <literal>BC</literal>.
(Alternatively, <literal>AD</literal>/<literal>BC</literal> can appear
before the time zone, but this is not the preferred ordering.)
Thus:

<programlisting>
1999-01-08 04:05:06
</programlisting>
and:
<programlisting>
1999-01-08 04:05:06 -8:00
</programlisting>

are valid values, which follow the <acronym>ISO</acronym> 8601
standard. In addition, the common format:
<programlisting>
January 8 04:05:06 1999 PST
</programlisting>
is supported.
</para>
2001-12-08 04:24:23 +01:00
<para>
The <acronym>SQL</acronym> standard differentiates
<type>timestamp without time zone</type>
and <type>timestamp with time zone</type> literals by the presence of a
<quote>+</quote> or <quote>-</quote> symbol and time zone offset after
the time. Hence, according to the standard,

<programlisting>TIMESTAMP '2004-10-19 10:23:54'</programlisting>

is a <type>timestamp without time zone</type>, while

<programlisting>TIMESTAMP '2004-10-19 10:23:54+02'</programlisting>

is a <type>timestamp with time zone</type>.
<productname>PostgreSQL</productname> never examines the content of a
literal string before determining its type, and therefore will treat
both of the above as <type>timestamp without time zone</type>. To
ensure that a literal is treated as <type>timestamp with time
zone</type>, give it the correct explicit type:

<programlisting>TIMESTAMP WITH TIME ZONE '2004-10-19 10:23:54+02'</programlisting>

In a literal that has been determined to be <type>timestamp without time
zone</type>, <productname>PostgreSQL</productname> will silently ignore
any time zone indication.
That is, the resulting value is derived from the date/time
fields in the input value, and is not adjusted for time zone.
</para>
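<para>
As a short sketch of this behavior, the two literals can be compared
directly. The results in the comments assume the ISO output style and
a <varname>timezone</varname> setting of <literal>UTC</literal>:
<programlisting>
SET timezone TO 'UTC';

-- typed as timestamp without time zone; the +02 is ignored
SELECT TIMESTAMP '2004-10-19 10:23:54+02';
-- 2004-10-19 10:23:54

-- explicitly typed; the +02 offset is applied
SELECT TIMESTAMP WITH TIME ZONE '2004-10-19 10:23:54+02';
-- 2004-10-19 08:23:54+00
</programlisting>
</para>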
2001-12-08 04:24:23 +01:00
2002-11-22 00:31:20 +01:00
<para>
For <type>timestamp with time zone</type>, the internally stored
value is always in UTC (Coordinated Universal
Time, traditionally known as Greenwich Mean Time,
<acronym>GMT</>). An input value that has an explicit
time zone specified is converted to UTC using the appropriate offset
for that time zone. If no time zone is stated in the input string,
then it is assumed to be in the time zone indicated by the system's
<xref linkend="guc-timezone"> parameter, and is converted to UTC using the
offset for the <varname>timezone</> zone.
</para>
<para>
When a <type>timestamp with time
zone</type> value is output, it is always converted from UTC to the
current <varname>timezone</> zone, and displayed as local time in that
zone. To see the time in another time zone, either change
<varname>timezone</> or use the <literal>AT TIME ZONE</> construct
(see <xref linkend="functions-datetime-zoneconvert">).
</para>
<para>
Conversions between <type>timestamp without time zone</type> and
<type>timestamp with time zone</type> normally assume that the
<type>timestamp without time zone</type> value should be taken or given
as <varname>timezone</> local time. A different time zone can
be specified for the conversion using <literal>AT TIME ZONE</>.
</para>
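<para>
A minimal sketch of <literal>AT TIME ZONE</literal> in both
directions follows. The zone name is an arbitrary example, and the
result shown assumes the ISO output style:
<programlisting>
-- timestamp with time zone, shown as local time in the named zone
SELECT TIMESTAMP WITH TIME ZONE '2004-10-19 10:23:54+02'
       AT TIME ZONE 'America/New_York';
-- 2004-10-19 04:23:54

-- timestamp without time zone, taken as local time in the named zone;
-- the result is a timestamp with time zone, displayed in the current timezone
SELECT TIMESTAMP '2004-10-19 10:23:54' AT TIME ZONE 'America/New_York';
</programlisting>
</para>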
2000-01-23 02:27:39 +01:00
</sect3>
1999-01-19 17:08:26 +01:00
2000-01-23 02:27:39 +01:00
<sect3>
2003-03-13 02:30:29 +01:00
<title>Special Values</title>
2000-05-02 22:02:03 +02:00
2001-05-13 00:51:36 +02:00
<indexterm>
<primary>time</primary>
2001-11-21 06:53:41 +01:00
<secondary>constants</secondary>
2001-05-13 00:51:36 +02:00
</indexterm>
<indexterm>
<primary>date</primary>
2001-11-21 06:53:41 +01:00
<secondary>constants</secondary>
2001-05-13 00:51:36 +02:00
</indexterm>
2000-05-02 22:02:03 +02:00
<para>
<productname>PostgreSQL</productname> supports several
special date/time input values for convenience, as shown in <xref
linkend="datatype-datetime-special-table">. The values
<literal>infinity</literal> and <literal>-infinity</literal>
are specially represented inside the system and will be displayed
unchanged; but the others are simply notational shorthands
that will be converted to ordinary date/time values when read.
(In particular, <literal>now</> and related strings are converted
to a specific time value as soon as they are read.)
All of these values need to be enclosed in single quotes when used
as constants in SQL commands.
</para>
2000-01-23 02:27:39 +01:00
2002-11-11 21:14:04 +01:00
<table id="datatype-datetime-special-table">
2002-11-22 00:31:20 +01:00
<title>Special Date/Time Inputs</title>
2007-04-17 19:30:35 +02:00
<tgroup cols="3">
2003-11-01 02:56:29 +01:00
<thead>
<row>
<entry>Input String</entry>
2003-03-13 02:30:29 +01:00
<entry>Valid Types</entry>
2003-11-01 02:56:29 +01:00
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry><literal>epoch</literal></entry>
2003-03-13 02:30:29 +01:00
<entry><type>date</type>, <type>timestamp</type></entry>
2003-11-01 02:56:29 +01:00
<entry>1970-01-01 00:00:00+00 (Unix system time zero)</entry>
</row>
<row>
<entry><literal>infinity</literal></entry>
2008-10-14 19:12:33 +02:00
<entry><type>date</type>, <type>timestamp</type></entry>
2003-11-01 02:56:29 +01:00
<entry>later than all other time stamps</entry>
</row>
<row>
<entry><literal>-infinity</literal></entry>
2008-10-14 19:12:33 +02:00
<entry><type>date</type>, <type>timestamp</type></entry>
2003-11-01 02:56:29 +01:00
<entry>earlier than all other time stamps</entry>
</row>
<row>
<entry><literal>now</literal></entry>
2003-03-13 02:30:29 +01:00
<entry><type>date</type>, <type>time</type>, <type>timestamp</type></entry>
2003-11-01 02:56:29 +01:00
<entry>current transaction's start time</entry>
</row>
<row>
<entry><literal>today</literal></entry>
2003-03-13 02:30:29 +01:00
<entry><type>date</type>, <type>timestamp</type></entry>
2003-11-01 02:56:29 +01:00
<entry>midnight today</entry>
</row>
<row>
<entry><literal>tomorrow</literal></entry>
2003-03-13 02:30:29 +01:00
<entry><type>date</type>, <type>timestamp</type></entry>
2003-11-01 02:56:29 +01:00
<entry>midnight tomorrow</entry>
</row>
<row>
<entry><literal>yesterday</literal></entry>
2003-03-13 02:30:29 +01:00
<entry><type>date</type>, <type>timestamp</type></entry>
2003-11-01 02:56:29 +01:00
<entry>midnight yesterday</entry>
</row>
<row>
<entry><literal>allballs</literal></entry>
2003-03-13 02:30:29 +01:00
<entry><type>time</type></entry>
2003-11-01 02:56:29 +01:00
<entry>00:00:00.00 UTC</entry>
</row>
</tbody>
2001-11-21 06:53:41 +01:00
</tgroup>
</table>
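<para>
The special input strings can be tried as sketched below. The table
names <literal>event_log_fixed</literal> and
<literal>event_log</literal> are hypothetical, used only to show a
likely pitfall of <literal>now</literal> being converted as soon as
it is read:
<programlisting>
SELECT 'epoch'::timestamp;
SELECT 'infinity'::timestamp, '-infinity'::timestamp;
SELECT 'today'::date, 'tomorrow'::date;
SELECT 'allballs'::time;

-- The constant is converted when the command is read, so this default
-- is likely to be fixed at the time the table is created.
CREATE TEMPORARY TABLE event_log_fixed (
    logged_at timestamp with time zone DEFAULT TIMESTAMP WITH TIME ZONE 'now'
);

-- A function call is evaluated each time the default is applied.
CREATE TEMPORARY TABLE event_log (
    logged_at timestamp with time zone DEFAULT CURRENT_TIMESTAMP
);
</programlisting>
</para>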
1998-12-18 17:11:12 +01:00
2004-08-10 02:55:08 +02:00
<para>
The following <acronym>SQL</acronym>-compatible functions can also
be used to obtain the current time value for the corresponding data
type:
<literal>CURRENT_DATE</literal>, <literal>CURRENT_TIME</literal>,
<literal>CURRENT_TIMESTAMP</literal>, <literal>LOCALTIME</literal>,
<literal>LOCALTIMESTAMP</literal>. The latter four accept an
optional subsecond precision specification. (See <xref
linkend="functions-datetime-current">.) Note that these are
SQL functions and are <emphasis>not</> recognized in data input strings.
</para>
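<para>
A brief sketch of these functions, with and without the optional
precision (the precision values are arbitrary):
<programlisting>
SELECT CURRENT_DATE;
SELECT CURRENT_TIME (0), CURRENT_TIMESTAMP (2);
SELECT LOCALTIME, LOCALTIMESTAMP (0);
</programlisting>
Note that these are written without quotes, unlike the special input
strings in <xref linkend="datatype-datetime-special-table">.
</para>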
2001-11-21 06:53:41 +01:00
</sect3>
</sect2>
2000-01-23 02:27:39 +01:00
2001-01-13 19:34:51 +01:00
<sect2 id="datatype-datetime-output">
2000-01-23 02:27:39 +01:00
<title>Date/Time Output</title>
1999-08-06 15:43:42 +02:00
2001-05-13 00:51:36 +02:00
<indexterm>
<primary>date</primary>
<secondary>output format</secondary>
2003-08-31 19:32:24 +02:00
<seealso>formatting</seealso>
2001-05-13 00:51:36 +02:00
</indexterm>
<indexterm>
<primary>time</primary>
<secondary>output format</secondary>
2003-08-31 19:32:24 +02:00
<seealso>formatting</seealso>
2001-05-13 00:51:36 +02:00
</indexterm>
1999-08-06 15:43:42 +02:00
<para>
The output format of the date/time types can be set to one of four
styles: ISO 8601,
<acronym>SQL</acronym> (Ingres), traditional <productname>POSTGRES</>
(Unix <application>date</> format), or
German. The default
is the <acronym>ISO</acronym> format. (The
<acronym>SQL</acronym> standard requires the use of the ISO 8601
format. The name of the <quote>SQL</quote> output format is a
historical accident.) <xref
linkend="datatype-datetime-output-table"> shows examples of each
output style. The output of the <type>date</type> and
<type>time</type> types is only the date or time part,
in accordance with the given examples.
</para>
2000-01-23 02:27:39 +01:00
2002-11-11 21:14:04 +01:00
<table id="datatype-datetime-output-table">
2001-02-14 20:37:26 +01:00
<title>Date/Time Output Styles</title>
2000-01-23 02:27:39 +01:00
<tgroup cols="3">
<thead>
2003-11-01 02:56:29 +01:00
<row>
<entry>Style Specification</entry>
<entry>Description</entry>
<entry>Example</entry>
</row>
2000-01-23 02:27:39 +01:00
</thead>
<tbody>
2003-11-01 02:56:29 +01:00
<row>
<entry>ISO</entry>
<entry>ISO 8601/SQL standard</entry>
<entry>1997-12-17 07:37:16-08</entry>
</row>
<row>
<entry>SQL</entry>
<entry>traditional style</entry>
<entry>12/17/1997 07:37:16.00 PST</entry>
</row>
<row>
<entry>POSTGRES</entry>
<entry>original style</entry>
<entry>Wed Dec 17 07:37:16 1997 PST</entry>
</row>
<row>
<entry>German</entry>
<entry>regional style</entry>
<entry>17.12.1997 07:37:16.00 PST</entry>
</row>
2000-01-23 02:27:39 +01:00
</tbody>
</tgroup>
</table>
1999-08-06 15:43:42 +02:00
<para>
In the <acronym>SQL</acronym> and POSTGRES styles, day appears before
month if DMY field ordering has been specified; otherwise, month appears
before day.
(See <xref linkend="datatype-datetime-input">
for how this setting also affects interpretation of input values.)
<xref linkend="datatype-datetime-output2-table"> shows an
example.
</para>
1998-03-01 09:16:16 +01:00
2002-11-11 21:14:04 +01:00
<table id="datatype-datetime-output2-table">
<title>Date Order Conventions</title>
2000-01-23 02:27:39 +01:00
<tgroup cols="3">
<thead>
2003-11-01 02:56:29 +01:00
<row>
<entry><varname>datestyle</varname> Setting</entry>
<entry>Input Ordering</entry>
<entry>Example Output</entry>
</row>
2000-01-23 02:27:39 +01:00
</thead>
<tbody>
2003-11-01 02:56:29 +01:00
<row>
<entry><literal>SQL, DMY</></entry>
<entry><replaceable>day</replaceable>/<replaceable>month</replaceable>/<replaceable>year</replaceable></entry>
<entry>17/12/1997 15:37:16.00 CET</entry>
</row>
<row>
<entry><literal>SQL, MDY</></entry>
<entry><replaceable>month</replaceable>/<replaceable>day</replaceable>/<replaceable>year</replaceable></entry>
<entry>12/17/1997 07:37:16.00 PST</entry>
</row>
<row>
<entry><literal>Postgres, DMY</></entry>
<entry><replaceable>day</replaceable>/<replaceable>month</replaceable>/<replaceable>year</replaceable></entry>
<entry>Wed 17 Dec 07:37:16 1997 PST</entry>
</row>
2000-01-23 02:27:39 +01:00
</tbody>
</tgroup>
</table>
1998-03-01 09:16:16 +01:00
1999-08-06 15:43:42 +02:00
<para>
The date/time styles can be selected by the user using the
<command>SET datestyle</command> command, the <xref
linkend="guc-datestyle"> parameter in the
<filename>postgresql.conf</filename> configuration file, or the
<envar>PGDATESTYLE</envar> environment variable on the server or
client. The formatting function <function>to_char</function>
(see <xref linkend="functions-formatting">) is also available as
a more flexible way to format date/time output.
</para>
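<para>
The following is a brief sketch of the two approaches described above.
The timestamp value is arbitrary, and the exact formatting of the seconds
field in the first result can vary between versions and builds:
<programlisting>
-- Select SQL-style output with day-before-month ordering for this session
SET datestyle TO 'SQL, DMY';
SELECT timestamp '1997-12-17 07:37:16';
-- the date part is displayed day first, as 17/12/1997

-- to_char formats according to its template, independently of datestyle
SELECT to_char(timestamp '1997-12-17 07:37:16', 'DD Mon YYYY HH24:MI:SS');
-- 17 Dec 1997 07:37:16
</programlisting>
</para>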
</sect2>
<sect2 id="datatype-timezones">
<title>Time Zones</title>
<indexterm zone="datatype-timezones">
<primary>time zone</primary>
</indexterm>
<para>
Time zones, and time-zone conventions, are influenced by
political decisions, not just earth geometry. Time zones around the
world became somewhat standardized during the 1900s,
but continue to be prone to arbitrary changes, particularly with
respect to daylight-savings rules.
<productname>PostgreSQL</productname> uses the widely used
<literal>zoneinfo</> time zone database for information about
historical time zone rules. For times in the future, the assumption
is that the latest known rules for a given time zone will
continue to be observed indefinitely.
</para>
<para>
<productname>PostgreSQL</productname> endeavors to be compatible with
the <acronym>SQL</acronym> standard definitions for typical usage.
However, the <acronym>SQL</acronym> standard has an odd mix of date and
time types and capabilities. Two obvious problems are:
<itemizedlist>
<listitem>
<para>
Although the <type>date</type> type
cannot have an associated time zone, the
<type>time</type> type can.
Time zones in the real world have little meaning unless
associated with a date as well as a time,
since the offset can vary through the year with daylight-saving
time boundaries.
</para>
</listitem>
<listitem>
<para>
The default time zone is specified as a constant numeric offset
from <acronym>UTC</>. It is therefore impossible to adapt to
daylight-saving time when doing date/time arithmetic across
<acronym>DST</acronym> boundaries.
</para>
</listitem>
</itemizedlist>
</para>
<para>
To address these difficulties, we recommend using date/time types
that contain both date and time when using time zones. We
do <emphasis>not</> recommend using the type <type>time with
time zone</type> (though it is supported by
<productname>PostgreSQL</productname> for legacy applications and
for compliance with the <acronym>SQL</acronym> standard).
<productname>PostgreSQL</productname> assumes
your local time zone for any type containing only date or time.
</para>
<para>
All timezone-aware dates and times are stored internally in
<acronym>UTC</acronym>. They are converted to local time
in the zone specified by the <xref linkend="guc-timezone"> configuration
parameter before being displayed to the client.
</para>
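<para>
As an illustration, the same stored instant is displayed differently
depending on the session's <varname>timezone</> setting. This is a sketch
with an arbitrary input value; the results in the comments assume the
<literal>ISO</> date style:
<programlisting>
SET timezone TO 'UTC';
SELECT timestamp with time zone '2004-10-19 10:23:54+02';
-- 2004-10-19 08:23:54+00

SET timezone TO 'America/New_York';
SELECT timestamp with time zone '2004-10-19 10:23:54+02';
-- 2004-10-19 04:23:54-04
</programlisting>
Both queries refer to the same point in time; only the presentation changes.
</para>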
<para>
<productname>PostgreSQL</productname> allows you to specify time zones in
three different forms:
<itemizedlist>
<listitem>
<para>
A full time zone name, for example <literal>America/New_York</>.
The recognized time zone names are listed in the
<literal>pg_timezone_names</literal> view (see <xref
linkend="view-pg-timezone-names">).
<productname>PostgreSQL</productname> uses the widely used
<literal>zoneinfo</> time zone data for this purpose, so the same
names are also recognized by much other software.
</para>
</listitem>
<listitem>
<para>
A time zone abbreviation, for example <literal>PST</>. Such a
specification merely defines a particular offset from UTC, in
contrast to full time zone names, which can imply a set of daylight
savings transition-date rules as well. The recognized abbreviations
are listed in the <literal>pg_timezone_abbrevs</> view (see <xref
linkend="view-pg-timezone-abbrevs">). You cannot set the
configuration parameters <xref linkend="guc-timezone"> or
<xref linkend="guc-log-timezone"> to a time
zone abbreviation, but you can use abbreviations in
date/time input values and with the <literal>AT TIME ZONE</>
operator.
</para>
</listitem>
<listitem>
<para>
In addition to the time zone names and abbreviations,
<productname>PostgreSQL</productname> will accept POSIX-style time zone
specifications of the form <replaceable>STD</><replaceable>offset</> or
<replaceable>STD</><replaceable>offset</><replaceable>DST</>, where
<replaceable>STD</> is a zone abbreviation, <replaceable>offset</> is a
numeric offset in hours west from UTC, and <replaceable>DST</> is an
optional daylight-savings zone abbreviation, assumed to stand for one
hour ahead of the given offset. For example, if <literal>EST5EDT</>
were not already a recognized zone name, it would be accepted and would
be functionally equivalent to United States East Coast time. When a
daylight-savings zone name is present, it is assumed to be used
according to the same daylight-savings transition rules used in the
<literal>zoneinfo</> time zone database's <filename>posixrules</> entry.
In a standard <productname>PostgreSQL</productname> installation,
<filename>posixrules</> is the same as <literal>US/Eastern</>, so
that POSIX-style time zone specifications follow USA daylight-savings
rules. If needed, you can adjust this behavior by replacing the
<filename>posixrules</> file.
</para>
</listitem>
</itemizedlist>
In short, this is the difference between abbreviations
and full names: abbreviations always represent a fixed offset from
UTC, whereas most of the full names imply a local daylight-savings time
rule, and so have two possible UTC offsets.
</para>
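<para>
The difference between a full name and an abbreviation can be seen with the
<literal>AT TIME ZONE</> operator. The following sketch uses an arbitrary
instant in July, when daylight-savings time is in effect in New York; the
results in the comments assume the <literal>ISO</> date style:
<programlisting>
-- Full name: the daylight-savings rule applies (UTC-4 in July)
SELECT timestamp with time zone '2009-07-01 12:00:00+00'
       AT TIME ZONE 'America/New_York';
-- 2009-07-01 08:00:00

-- Abbreviation: always the fixed offset UTC-5, regardless of the date
SELECT timestamp with time zone '2009-07-01 12:00:00+00'
       AT TIME ZONE 'EST';
-- 2009-07-01 07:00:00
</programlisting>
</para>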
<para>
Be aware that the POSIX-style time zone feature can
lead to silently accepting bogus input, since there is no check on the
reasonableness of the zone abbreviations. For example, <literal>SET
TIMEZONE TO FOOBAR0</> will work, leaving the system effectively using
a rather peculiar abbreviation for UTC.
Another issue to keep in mind is that in POSIX time zone names,
positive offsets are used for locations <emphasis>west</> of Greenwich.
Everywhere else, <productname>PostgreSQL</productname> follows the
ISO-8601 convention that positive time zone offsets are <emphasis>east</>
of Greenwich.
</para>
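<para>
The sign conventions can be compared directly. In this sketch,
<literal>XYZ3</> is a made-up POSIX-style specification chosen purely for
illustration; the result in the comment assumes the <literal>ISO</>
date style:
<programlisting>
-- POSIX-style specification: the 3 means three hours WEST of Greenwich
SET timezone TO 'XYZ3';
SELECT timestamp with time zone '2009-01-01 12:00:00+00';
-- 2009-01-01 09:00:00-03
-- The displayed offset follows the ISO convention, so the same zone
-- is shown with a negative sign.
</programlisting>
</para>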
<para>
In all cases, time zone names are recognized case-insensitively.
(This is a change from <productname>PostgreSQL</productname> versions
prior to 8.2, which were case-sensitive in some contexts but not others.)
</para>
<para>
Neither full names nor abbreviations are hard-wired into the server;
they are obtained from configuration files stored under
<filename>.../share/timezone/</> and <filename>.../share/timezonesets/</>
of the installation directory
(see <xref linkend="datetime-config-files">).
</para>
<para>
The <xref linkend="guc-timezone"> configuration parameter can
be set in the file <filename>postgresql.conf</>, or in any of the
other standard ways described in <xref linkend="runtime-config">.
There are also several special ways to set it:
<itemizedlist>
<listitem>
<para>
If <varname>timezone</> is not specified in
<filename>postgresql.conf</> or as a server command-line option,
the server attempts to use the value of the <envar>TZ</envar>
environment variable as the default time zone. If <envar>TZ</envar>
is not defined or is not any of the time zone names known to
<productname>PostgreSQL</productname>, the server attempts to
determine the operating system's default time zone by checking the
behavior of the C library function <literal>localtime()</>. The
default time zone is selected as the closest match among
<productname>PostgreSQL</productname>'s known time zones.
(These rules are also used to choose the default value of
<xref linkend="guc-log-timezone">, if not specified.)
</para>
</listitem>
<listitem>
<para>
The <acronym>SQL</acronym> command <command>SET TIME ZONE</command>
sets the time zone for the session. This is an alternative spelling
of <command>SET TIMEZONE TO</> with a more SQL-spec-compatible syntax.
</para>
</listitem>
<listitem>
<para>
The <envar>PGTZ</envar> environment variable is used by
<application>libpq</application> clients
to send a <command>SET TIME ZONE</command>
command to the server upon connection.
</para>
</listitem>
</itemizedlist>
</para>
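<para>
For example, the two spellings of the session-level command can be used
interchangeably. This is a minimal sketch; the zone name is only an
illustration:
<programlisting>
-- SQL-spec-compatible spelling
SET TIME ZONE 'Europe/Rome';

-- Equivalent PostgreSQL spelling
SET TIMEZONE TO 'Europe/Rome';

-- Inspect the current setting, then return to the default
SHOW timezone;
RESET timezone;
</programlisting>
</para>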
</sect2>
<sect2 id="datatype-interval-input">
<title>Interval Input</title>
<indexterm>
<primary>interval</primary>
</indexterm>
<para>
<type>interval</type> values can be written using the following
verbose syntax:
<synopsis>
<optional>@</> <replaceable>quantity</> <replaceable>unit</> <optional><replaceable>quantity</> <replaceable>unit</>...</> <optional><replaceable>direction</></optional>
</synopsis>
where <replaceable>quantity</> is a number (possibly signed);
<replaceable>unit</> is <literal>microsecond</literal>,
<literal>millisecond</literal>, <literal>second</literal>,
<literal>minute</literal>, <literal>hour</literal>, <literal>day</literal>,
<literal>week</literal>, <literal>month</literal>, <literal>year</literal>,
<literal>decade</literal>, <literal>century</literal>, <literal>millennium</literal>,
or abbreviations or plurals of these units;
<replaceable>direction</> can be <literal>ago</literal> or
empty. The at sign (<literal>@</>) is optional noise. The amounts
of the different units are implicitly added with appropriate
sign accounting. <literal>ago</literal> negates all the fields.
This syntax is also used for interval output, if
<xref linkend="guc-intervalstyle"> is set to
<literal>postgres_verbose</>.
</para>
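<para>
A few inputs in the verbose syntax, as a sketch with arbitrary values:
<programlisting>
SELECT interval '1 year 2 months 3 days';

-- The leading @ is noise; "ago" negates every field
SELECT interval '@ 1 day 2 hours ago' = interval '-1 day -2 hours';
-- true
</programlisting>
</para>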
<para>
Quantities of days, hours, minutes, and seconds can be specified without
explicit unit markings. For example, <literal>'1 12:59:10'</> is read
the same as <literal>'1 day 12 hours 59 min 10 sec'</>. Also,
a combination of years and months can be specified with a dash;
for example <literal>'200-10'</> is read the same as <literal>'200 years
10 months'</>. (These shorter forms are in fact the only ones allowed
by the <acronym>SQL</acronym> standard, and are used for output when
<varname>IntervalStyle</> is set to <literal>sql_standard</literal>.)
</para>
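<para>
The equivalences stated above can be checked directly, for example:
<programlisting>
SELECT interval '1 12:59:10' = interval '1 day 12 hours 59 min 10 sec';
-- true

SELECT interval '200-10' = interval '200 years 10 months';
-- true
</programlisting>
</para>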
<para>
Interval values can also be written as ISO 8601 time intervals, using
either the <quote>format with designators</> of the standard's section
4.4.3.2 or the <quote>alternative format</> of section 4.4.3.3. The
format with designators looks like this:
<synopsis>
P <replaceable>quantity</> <replaceable>unit</> <optional> <replaceable>quantity</> <replaceable>unit</> ...</optional> <optional> T <optional> <replaceable>quantity</> <replaceable>unit</> ...</optional></optional>
</synopsis>
The string must start with a <literal>P</>, and may include a
<literal>T</> that introduces the time-of-day units. The
available unit abbreviations are given in <xref
linkend="datatype-interval-iso8601-units">. Units may be
omitted, and may be specified in any order, but units smaller than
a day must appear after <literal>T</>. In particular, the meaning of
<literal>M</> depends on whether it is before or after
<literal>T</>.
</para>
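<para>
The position-dependent meaning of <literal>M</> can be seen by comparing
ISO 8601 inputs with their verbose equivalents (a short sketch):
<programlisting>
-- M before T means months
SELECT interval 'P1M' = interval '1 month';
-- true

-- M after T means minutes
SELECT interval 'PT1M' = interval '1 minute';
-- true
</programlisting>
</para>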
<table id="datatype-interval-iso8601-units">
<title>ISO 8601 Interval Unit Abbreviations</title>
<tgroup cols="2">
<thead>
<row>
<entry>Abbreviation</entry>
<entry>Meaning</entry>
</row>
</thead>
<tbody>
<row>
<entry>Y</entry>
<entry>Years</entry>
</row>
<row>
<entry>M</entry>
<entry>Months (in the date part)</entry>
</row>
<row>
<entry>W</entry>
<entry>Weeks</entry>
</row>
<row>
<entry>D</entry>
<entry>Days</entry>
</row>
<row>
<entry>H</entry>
<entry>Hours</entry>
</row>
<row>
<entry>M</entry>
<entry>Minutes (in the time part)</entry>
</row>
<row>
<entry>S</entry>
<entry>Seconds</entry>
</row>
</tbody>
</tgroup>
</table>
<para>
In the alternative format:
<synopsis>
P <optional> <replaceable>years</>-<replaceable>months</>-<replaceable>days</> </optional> <optional> T <replaceable>hours</>:<replaceable>minutes</>:<replaceable>seconds</> </optional>
</synopsis>
the string must begin with <literal>P</literal>, and a
<literal>T</> separates the date and time parts of the interval.
The values are given as numbers similar to ISO 8601 dates.
</para>
<para>
When writing an interval constant with a <replaceable>fields</>
specification, or when assigning a string to an interval column that was
defined with a <replaceable>fields</> specification, the interpretation of
unmarked quantities depends on the <replaceable>fields</>. For
example <literal>INTERVAL '1' YEAR</> is read as 1 year, whereas
<literal>INTERVAL '1'</> means 1 second. Also, field values
<quote>to the right</> of the least significant field allowed by the
<replaceable>fields</> specification are silently discarded. For
example, writing <literal>INTERVAL '1 day 2:03:04' HOUR TO MINUTE</>
results in dropping the seconds field, but not the day field.
</para>
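<para>
The cases described above can be written as follows. The results in the
comments assume the default <literal>postgres</> interval style:
<programlisting>
SELECT INTERVAL '1' YEAR;
-- 1 year

SELECT INTERVAL '1';
-- 00:00:01

-- The seconds field is discarded; the day field is kept
SELECT INTERVAL '1 day 2:03:04' HOUR TO MINUTE;
-- 1 day 02:03:00
</programlisting>
</para>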
<para>
According to the <acronym>SQL</> standard all fields of an interval
value must have the same sign, so a leading negative sign applies to all
fields; for example the negative sign in the interval literal
<literal>'-1 2:03:04'</> applies to both the days and hour/minute/second
parts. <productname>PostgreSQL</> allows the fields to have different
signs, and traditionally treats each field in the textual representation
as independently signed, so that the hour/minute/second part is
considered positive in this example. If <varname>IntervalStyle</> is
set to <literal>sql_standard</literal> then a leading sign is considered
to apply to all fields (but only if no additional signs appear).
Otherwise the traditional <productname>PostgreSQL</> interpretation is
used. To avoid ambiguity, it's recommended to attach an explicit sign
to each field if any field is negative.
</para>
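<para>
The two interpretations of a leading sign can be compared by testing
against inputs in which every field is explicitly signed (a sketch):
<programlisting>
-- Traditional interpretation: only the days field is negative
SET intervalstyle TO postgres;
SELECT interval '-1 2:03:04' = interval '-1 day +2:03:04';
-- true

-- SQL-standard interpretation: the leading sign applies to all fields
SET intervalstyle TO sql_standard;
SELECT interval '-1 2:03:04' = interval '-1 day -2:03:04';
-- true
</programlisting>
</para>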
<para>
Internally <type>interval</> values are stored as months, days,
and seconds. This is done because the number of days in a month
varies, and a day can have 23 or 25 hours if a daylight savings
time adjustment is involved. The months and days fields are integers
while the seconds field can store fractions. Because intervals are
usually created from constant strings or <type>timestamp</> subtraction,
this storage method works well in most cases. Functions
<function>justify_days</> and <function>justify_hours</> are
available for adjusting days and hours that overflow their normal
ranges.
</para>
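<para>
For instance, the justify functions can be applied as follows. The results
in the comments assume the default <literal>postgres</> interval style:
<programlisting>
-- Hours beyond 24 are carried into the days field
SELECT justify_hours(interval '27 hours');
-- 1 day 03:00:00

-- Days beyond 30 are carried into the months field
SELECT justify_days(interval '35 days');
-- 1 mon 5 days
</programlisting>
</para>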
<para>
In the verbose input format, and in some fields of the more compact
input formats, field values can have fractional parts; for example
<literal>'1.5 week'</> or <literal>'01:02:03.45'</>. Such input is
converted to the appropriate number of months, days, and seconds
for storage. When this would result in a fractional number of
months or days, the fraction is added to the lower-order fields
using the conversion factors 1 month = 30 days and 1 day = 24 hours.
For example, <literal>'1.5 month'</> becomes 1 month and 15 days.
Only seconds will ever be shown as fractional on output.
</para>
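<para>
The conversion of fractional fields can be checked by comparison with the
equivalent whole-unit input (a sketch with arbitrary values):
<programlisting>
-- 1.5 weeks is 10.5 days, that is, 10 days and 12 hours
SELECT interval '1.5 week' = interval '10 days 12 hours';
-- true

-- Half a month is carried down as 15 days
SELECT interval '1.5 month' = interval '1 month 15 days';
-- true
</programlisting>
</para>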
<para>
<xref linkend="datatype-interval-input-examples"> shows some examples
of valid <type>interval</> input.
</para>
<table id="datatype-interval-input-examples">
<title>Interval Input</title>
<tgroup cols="2">
<thead>
<row>
<entry>Example</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry>1-2</entry>
<entry>SQL standard format: 1 year 2 months</entry>
</row>
<row>
<entry>3 4:05:06</entry>
<entry>SQL standard format: 3 days 4 hours 5 minutes 6 seconds</entry>
</row>
<row>
<entry>1 year 2 months 3 days 4 hours 5 minutes 6 seconds</entry>
<entry>Traditional Postgres format: 1 year 2 months 3 days 4 hours 5 minutes 6 seconds</entry>
</row>
<row>
<entry>P1Y2M3DT4H5M6S</entry>
<entry>ISO 8601 <quote>format with designators</>: same meaning as above</entry>
</row>
<row>
<entry>P0001-02-03T04:05:06</entry>
<entry>ISO 8601 <quote>alternative format</>: same meaning as above</entry>
</row>
</tbody>
</tgroup>
</table>
</sect2>
<sect2 id="datatype-interval-output">
<title>Interval Output</title>
<indexterm>
<primary>interval</primary>
<secondary>output format</secondary>
<seealso>formatting</seealso>
</indexterm>
<para>
The output format of the interval type can be set to one of the
four styles <literal>sql_standard</>, <literal>postgres</>,
<literal>postgres_verbose</>, or <literal>iso_8601</>,
using the command <literal>SET intervalstyle</literal>.
The default is the <literal>postgres</> format.
<xref linkend="interval-style-output-table"> shows examples of each
output style.
</para>
<para>
The <literal>sql_standard</> style produces output that conforms to
the SQL standard's specification for interval literal strings, if
the interval value meets the standard's restrictions (either year-month
only or day-time only, with no mixing of positive
and negative components). Otherwise the output looks like a standard
year-month literal string followed by a day-time literal string,
with explicit signs added to disambiguate mixed-sign intervals.
</para>
<para>
The output of the <literal>postgres</> style matches the output of
<productname>PostgreSQL</> releases prior to 8.4 when the
<xref linkend="guc-datestyle"> parameter was set to <literal>ISO</>.
</para>
<para>
The output of the <literal>postgres_verbose</> style matches the output of
<productname>PostgreSQL</> releases prior to 8.4 when the
<varname>DateStyle</> parameter was set to non-<literal>ISO</> output.
</para>
<para>
The output of the <literal>iso_8601</> style matches the <quote>format
with designators</> described in section 4.4.3.2 of the
ISO 8601 standard.
</para>
<table id="interval-style-output-table">
<title>Interval Output Style Examples</title>
<tgroup cols="4">
<thead>
<row>
<entry>Style Specification</entry>
<entry>Year-Month Interval</entry>
<entry>Day-Time Interval</entry>
<entry>Mixed Interval</entry>
</row>
</thead>
<tbody>
<row>
<entry><literal>sql_standard</></entry>
<entry>1-2</entry>
<entry>3 4:05:06</entry>
<entry>-1-2 +3 -4:05:06</entry>
</row>
<row>
<entry><literal>postgres</></entry>
<entry>1 year 2 mons</entry>
<entry>3 days 04:05:06</entry>
<entry>-1 year -2 mons +3 days -04:05:06</entry>
</row>
<row>
<entry><literal>postgres_verbose</></entry>
<entry>@ 1 year 2 mons</entry>
<entry>@ 3 days 4 hours 5 mins 6 secs</entry>
<entry>@ 1 year 2 mons -3 days 4 hours 5 mins 6 secs ago</entry>
</row>
<row>
<entry><literal>iso_8601</></entry>
<entry>P1Y2M</entry>
<entry>P3DT4H5M6S</entry>
<entry>P-1Y-2M3DT-4H-5M-6S</entry>
</row>
</tbody>
</tgroup>
</table>
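<para>
As a sketch of how the style is switched within a session, the day-time
value from <xref linkend="interval-style-output-table"> can be displayed
in two of the styles:
<programlisting>
SET intervalstyle TO sql_standard;
SELECT interval '3 days 4 hours 5 minutes 6 seconds';
-- 3 4:05:06

SET intervalstyle TO iso_8601;
SELECT interval '3 days 4 hours 5 minutes 6 seconds';
-- P3DT4H5M6S
</programlisting>
</para>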
</sect2>
<sect2 id="datatype-datetime-internals">
<title>Internals</title>
<para>
<productname>PostgreSQL</productname> uses Julian dates
for all date/time calculations. This has the useful property of correctly
calculating dates from 4713 BC
to far into the future, using the assumption that the length of the
year is 365.2425 days.
</para>
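<para>
The Julian day number of a date can be inspected with the
<literal>J</> template pattern of the formatting functions. This is only
an illustration of the numbering; it does not show the internal storage
format:
<programlisting>
SELECT to_char(date '2000-01-01', 'J');
-- 2451545

SELECT to_date('2451545', 'J');
-- 2000-01-01
</programlisting>
</para>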
<para>
Date conventions before the 19th century make for interesting reading,
but are not consistent enough to warrant coding into a date/time handler.
</para>
</sect2>
</sect1>
<sect1 id="datatype-boolean">
<title>Boolean Type</title>
<indexterm zone="datatype-boolean">
<primary>Boolean</primary>
<secondary>data type</secondary>
</indexterm>
<indexterm zone="datatype-boolean">
<primary>true</primary>
</indexterm>
<indexterm zone="datatype-boolean">
<primary>false</primary>
</indexterm>
<para>
<productname>PostgreSQL</productname> provides the
standard <acronym>SQL</acronym> type <type>boolean</type>.
The <type>boolean</type> type has two states,
<quote>true</quote> and <quote>false</quote>. A third state,
<quote>unknown</quote>, is represented by the
<acronym>SQL</acronym> null value.
</para>
<para>
Valid literal values for the <quote>true</quote> state are:
<simplelist>
<member><literal>TRUE</literal></member>
<member><literal>'t'</literal></member>
<member><literal>'true'</literal></member>
<member><literal>'y'</literal></member>
<member><literal>'yes'</literal></member>
<member><literal>'on'</literal></member>
<member><literal>'1'</literal></member>
</simplelist>
For the <quote>false</quote> state, the following values can be
used:
<simplelist>
<member><literal>FALSE</literal></member>
<member><literal>'f'</literal></member>
<member><literal>'false'</literal></member>
<member><literal>'n'</literal></member>
<member><literal>'no'</literal></member>
<member><literal>'off'</literal></member>
<member><literal>'0'</literal></member>
</simplelist>
Leading or trailing whitespace is ignored, and case does not matter.
The key words
<literal>TRUE</literal> and <literal>FALSE</literal> are the preferred
(<acronym>SQL</acronym>-compliant) usage.
</para>
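<para>
As a short sketch of the string forms, including the handling of case and
surrounding whitespace (the column aliases are arbitrary):
<programlisting>
SELECT 'yes'::boolean AS a, ' OFF '::boolean AS b, '1'::boolean AS c;
 a | b | c
---+---+---
 t | f | t
</programlisting>
</para>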
<example id="datatype-boolean-example">
<title>Using the <type>boolean</type> type</title>
<programlisting>
CREATE TABLE test1 (a boolean, b text);
INSERT INTO test1 VALUES (TRUE, 'sic est');
INSERT INTO test1 VALUES (FALSE, 'non est');
SELECT * FROM test1;
a | b
---+---------
t | sic est
f | non est
SELECT * FROM test1 WHERE a;
a | b
---+---------
t | sic est
</programlisting>
</example>
<para>
<xref linkend="datatype-boolean-example"> shows that
<type>boolean</type> values are output using the letters
<literal>t</literal> and <literal>f</literal>.
</para>
<para>
<type>boolean</type> uses 1 byte of storage.
</para>
</sect1>
<sect1 id="datatype-enum">
<title>Enumerated Types</title>
<indexterm zone="datatype-enum">
<primary>data type</primary>
<secondary>enumerated (enum)</secondary>
</indexterm>
<indexterm zone="datatype-enum">
<primary>enumerated types</primary>
</indexterm>
<para>
Enumerated (enum) types are data types that
comprise a static, ordered set of values.
They are equivalent to the <type>enum</type>
types supported in a number of programming languages. An example of an enum
type might be the days of the week, or a set of status values for
a piece of data.
</para>
<sect2>
<title>Declaration of Enumerated Types</title>
<para>
Enum types are created using the <xref
linkend="sql-createtype" endterm="sql-createtype-title"> command,
for example:
<programlisting>
CREATE TYPE mood AS ENUM ('sad', 'ok', 'happy');
</programlisting>
Once created, the enum type can be used in table and function
definitions much like any other type:
</para>
<example>
<title>Basic Enum Usage</title>
<programlisting>
CREATE TYPE mood AS ENUM ('sad', 'ok', 'happy');
CREATE TABLE person (
name text,
current_mood mood
);
INSERT INTO person VALUES ('Moe', 'happy');
SELECT * FROM person WHERE current_mood = 'happy';
name | current_mood
------+--------------
Moe | happy
(1 row)
</programlisting>
</example>
</sect2>
<sect2>
<title>Ordering</title>
<para>
The ordering of the values in an enum type is the
order in which the values were listed when the type was created.
All standard comparison operators and related
aggregate functions are supported for enums. For example:
</para>
<example>
<title>Enum Ordering</title>
<programlisting>
INSERT INTO person VALUES ('Larry', 'sad');
INSERT INTO person VALUES ('Curly', 'ok');
SELECT * FROM person WHERE current_mood > 'sad';
name | current_mood
-------+--------------
Moe | happy
Curly | ok
(2 rows)
SELECT * FROM person WHERE current_mood > 'sad' ORDER BY current_mood;
name | current_mood
-------+--------------
Curly | ok
Moe | happy
(2 rows)
SELECT name
FROM person
WHERE current_mood = (SELECT MIN(current_mood) FROM person);
name
-------
Larry
(1 row)
</programlisting>
</example>
</sect2>
<sect2>
<title>Type Safety</title>
<para>
Each enumerated data type is separate and cannot
be compared with other enumerated types.
</para>
<example>
<title>Lack of Casting</title>
<programlisting>
CREATE TYPE happiness AS ENUM ('happy', 'very happy', 'ecstatic');
CREATE TABLE holidays (
num_weeks integer,
happiness happiness
);
INSERT INTO holidays(num_weeks,happiness) VALUES (4, 'happy');
INSERT INTO holidays(num_weeks,happiness) VALUES (6, 'very happy');
INSERT INTO holidays(num_weeks,happiness) VALUES (8, 'ecstatic');
INSERT INTO holidays(num_weeks,happiness) VALUES (2, 'sad');
ERROR: invalid input value for enum happiness: "sad"
SELECT person.name, holidays.num_weeks FROM person, holidays
WHERE person.current_mood = holidays.happiness;
ERROR: operator does not exist: mood = happiness
</programlisting>
</example>
<para>
If you really need to do something like that, you can either
write a custom operator or add explicit casts to your query:
</para>
<example>
<title>Comparing Different Enums by Casting to Text</title>
<programlisting>
SELECT person.name, holidays.num_weeks FROM person, holidays
WHERE person.current_mood::text = holidays.happiness::text;
name | num_weeks
------+-----------
Moe | 4
(1 row)
</programlisting>
</example>
</sect2>
<sect2>
<title>Implementation Details</title>
<para>
An enum value occupies four bytes on disk. The length of an enum
value's textual label is limited by the <symbol>NAMEDATALEN</symbol>
setting compiled into <productname>PostgreSQL</productname>; in standard
builds this means at most 63 bytes.
</para>
<para>
Enum labels are case sensitive, so
<type>'happy'</type> is not the same as <type>'HAPPY'</type>.
White space in the labels is significant too.
</para>
<para>
The translations from internal enum values to textual labels are
kept in the system catalog
<link linkend="catalog-pg-enum"><structname>pg_enum</structname></link>.
Querying this catalog directly can be useful.
</para>
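<para>
For example, the labels of the <type>mood</type> type created earlier
could be listed with a query along these lines. This is a sketch; no
particular row order should be assumed from the first query:
<programlisting>
-- Labels as stored in the catalog
SELECT e.enumlabel
FROM pg_enum e
     JOIN pg_type t ON t.oid = e.enumtypid
WHERE t.typname = 'mood';

-- The enum support functions return the labels in declared order
SELECT enum_range(NULL::mood);
-- {sad,ok,happy}
</programlisting>
</para>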
</sect2>
</sect1>
<sect1 id="datatype-geometric">
<title>Geometric Types</title>
<para>
Geometric data types represent two-dimensional spatial
objects. <xref linkend="datatype-geo-table"> shows the geometric
types available in <productname>PostgreSQL</productname>. The
most fundamental type, the point, forms the basis for all of the
other types.
</para>
<table id="datatype-geo-table">
<title>Geometric Types</title>
<tgroup cols="4">
<thead>
<row>
<entry>Name</entry>
<entry>Storage Size</entry>
<entry>Description</entry>
<entry>Representation</entry>
</row>
</thead>
<tbody>
<row>
<entry><type>point</type></entry>
<entry>16 bytes</entry>
<entry>Point on a plane</entry>
<entry>(x,y)</entry>
</row>
<row>
<entry><type>line</type></entry>
<entry>32 bytes</entry>
<entry>Infinite line (not fully implemented)</entry>
<entry>((x1,y1),(x2,y2))</entry>
</row>
<row>
<entry><type>lseg</type></entry>
<entry>32 bytes</entry>
<entry>Finite line segment</entry>
<entry>((x1,y1),(x2,y2))</entry>
</row>
<row>
<entry><type>box</type></entry>
<entry>32 bytes</entry>
<entry>Rectangular box</entry>
<entry>((x1,y1),(x2,y2))</entry>
</row>
<row>
<entry><type>path</type></entry>
<entry>16+16n bytes</entry>
<entry>Closed path (similar to polygon)</entry>
<entry>((x1,y1),...)</entry>
</row>
<row>
<entry><type>path</type></entry>
<entry>16+16n bytes</entry>
<entry>Open path</entry>
<entry>[(x1,y1),...]</entry>
</row>
<row>
<entry><type>polygon</type></entry>
<entry>40+16n bytes</entry>
<entry>Polygon (similar to closed path)</entry>
<entry>((x1,y1),...)</entry>
</row>
<row>
<entry><type>circle</type></entry>
<entry>24 bytes</entry>
<entry>Circle</entry>
<entry>&lt;(x,y),r&gt; (center point and radius)</entry>
</row>
</tbody>
</tgroup>
</table>
<para>
A rich set of functions and operators is available to perform various geometric
operations such as scaling, translation, rotation, and determining
intersections. They are explained in <xref linkend="functions-geometry">.
</para>
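<para>
As a brief illustration of the kinds of operations mentioned above, the
following sketch translates a box, scales it, and tests two boxes for
overlap. The coordinate values are arbitrary and were chosen only for
this example.
</para>
<programlisting>
-- Translate a box by the vector (2,0).
SELECT box '((0,0),(1,1))' + point '(2,0)';

-- Scale a box by a factor of 2 relative to the origin.
SELECT box '((0,0),(1,1))' * point '(2,0)';

-- Test whether two boxes overlap.
SELECT box '((0,0),(1,1))' &amp;&amp; box '((0.5,0.5),(2,2))';
</programlisting>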
<sect2>
<title>Points</title>
<indexterm>
<primary>point</primary>
</indexterm>
<para>
Points are the fundamental two-dimensional building block for geometric types.
Values of type <type>point</type> are specified using the following syntax:

<synopsis>
( <replaceable>x</replaceable> , <replaceable>y</replaceable> )
<replaceable>x</replaceable> , <replaceable>y</replaceable>
</synopsis>

where <replaceable>x</> and <replaceable>y</> are the respective
coordinates, as floating-point numbers.
</para>
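<para>
A minimal sketch of the two input syntaxes, using arbitrary coordinates.
Both literals should denote the same point, so the final comparison with
the <literal>~=</literal> (<quote>same as</quote>) operator is expected
to return true.
</para>
<programlisting>
SELECT point '(1.5,-2)';
SELECT point '1.5,-2';

-- Compare the two spellings of the same point.
SELECT point '(1.5,-2)' ~= point '1.5,-2';
</programlisting>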
</sect2>
<sect2>
<title>Line Segments</title>
<indexterm>
<primary>lseg</primary>
</indexterm>
<indexterm>
<primary>line segment</primary>
</indexterm>
<para>
Line segments (<type>lseg</type>) are represented by pairs of points.
Values of type <type>lseg</type> are specified using the following syntax:

<synopsis>
( ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ( <replaceable>x2</replaceable> , <replaceable>y2</replaceable> ) )
( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ( <replaceable>x2</replaceable> , <replaceable>y2</replaceable> )
<replaceable>x1</replaceable> , <replaceable>y1</replaceable> , <replaceable>x2</replaceable> , <replaceable>y2</replaceable>
</synopsis>

where
<literal>(<replaceable>x1</replaceable>,<replaceable>y1</replaceable>)</literal>
and
<literal>(<replaceable>x2</replaceable>,<replaceable>y2</replaceable>)</literal>
are the end points of the line segment.
</para>
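<para>
The three input syntaxes can be tried as follows. This is only a sketch
with arbitrary end points; the last statement uses the
<function>length</function> function to show that the value behaves as a
segment from (0,0) to (3,4).
</para>
<programlisting>
SELECT lseg '((0,0),(3,4))';
SELECT lseg '(0,0),(3,4)';
SELECT lseg '0,0,3,4';

-- Length of the segment; for these end points it is 5.
SELECT length(lseg '((0,0),(3,4))');
</programlisting>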
</sect2>
<sect2>
<title>Boxes</title>
<indexterm>
<primary>box (data type)</primary>
</indexterm>
<indexterm>
<primary>rectangle</primary>
</indexterm>
<para>
Boxes are represented by pairs of points that are opposite
corners of the box.
Values of type <type>box</type> are specified using the following syntax:

<synopsis>
( ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ( <replaceable>x2</replaceable> , <replaceable>y2</replaceable> ) )
( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ( <replaceable>x2</replaceable> , <replaceable>y2</replaceable> )
<replaceable>x1</replaceable> , <replaceable>y1</replaceable> , <replaceable>x2</replaceable> , <replaceable>y2</replaceable>
</synopsis>

where
<literal>(<replaceable>x1</replaceable>,<replaceable>y1</replaceable>)</literal>
and
<literal>(<replaceable>x2</replaceable>,<replaceable>y2</replaceable>)</literal>
are any two opposite corners of the box.
</para>
<para>
Boxes are output using the second syntax.
Any two opposite corners can be supplied on input, but the values
will be reordered as needed to store the
upper right and lower left corners, in that order.
</para>
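<para>
The reordering of corners can be observed with a small sketch such as
the following, which uses arbitrary coordinates. Both literals describe
the same rectangle from different pairs of opposite corners, so they are
expected to display identically and to compare as the same box.
</para>
<programlisting>
-- Lower left and upper right corners supplied.
SELECT box '((0,0),(1,1))';

-- Upper left and lower right corners supplied.
SELECT box '((0,1),(1,0))';

-- Both inputs are stored as the same box.
SELECT box '((0,0),(1,1))' ~= box '((0,1),(1,0))';
</programlisting>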
</sect2>
<sect2>
<title>Paths</title>
<indexterm>
<primary>path (data type)</primary>
</indexterm>
<para>
Paths are represented by lists of connected points. Paths can be
<firstterm>open</firstterm>, where
the first and last points in the list are considered not connected, or
<firstterm>closed</firstterm>,
where the first and last points are considered connected.
</para>
<para>
Values of type <type>path</type> are specified using the following syntax:

<synopsis>
( ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ... , ( <replaceable>xn</replaceable> , <replaceable>yn</replaceable> ) )
[ ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ... , ( <replaceable>xn</replaceable> , <replaceable>yn</replaceable> ) ]
( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ... , ( <replaceable>xn</replaceable> , <replaceable>yn</replaceable> )
( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> , ... , <replaceable>xn</replaceable> , <replaceable>yn</replaceable> )
<replaceable>x1</replaceable> , <replaceable>y1</replaceable> , ... , <replaceable>xn</replaceable> , <replaceable>yn</replaceable>
</synopsis>

where the points are the end points of the line segments
comprising the path. Square brackets (<literal>[]</>) indicate
an open path, while parentheses (<literal>()</>) indicate a
closed path. When the outermost parentheses are omitted, as
in the third through fifth syntaxes, a closed path is assumed.
</para>
<para>
Paths are output using the first or second syntax, as appropriate.
</para>
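<para>
The distinction between open and closed paths can be checked with the
<function>isopen</function> and <function>isclosed</function> functions,
and a path can be converted with <function>pclose</function>. The
following is a sketch only; the three points are arbitrary.
</para>
<programlisting>
-- Parentheses denote a closed path.
SELECT isclosed(path '((0,0),(1,1),(2,0))');

-- Square brackets denote an open path.
SELECT isopen(path '[(0,0),(1,1),(2,0)]');

-- Convert an open path to a closed one.
SELECT pclose(path '[(0,0),(1,1),(2,0)]');
</programlisting>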
</sect2>
<sect2>
<title>Polygons</title>
<indexterm>
<primary>polygon</primary>
</indexterm>
<para>
Polygons are represented by lists of points (the vertexes of the
polygon). Polygons are very similar to closed paths, but are
stored differently
and have their own set of support routines.
</para>
<para>
Values of type <type>polygon</type> are specified using the following syntax:

<synopsis>
( ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ... , ( <replaceable>xn</replaceable> , <replaceable>yn</replaceable> ) )
( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ... , ( <replaceable>xn</replaceable> , <replaceable>yn</replaceable> )
( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> , ... , <replaceable>xn</replaceable> , <replaceable>yn</replaceable> )
<replaceable>x1</replaceable> , <replaceable>y1</replaceable> , ... , <replaceable>xn</replaceable> , <replaceable>yn</replaceable>
</synopsis>

where the points are the end points of the line segments
comprising the boundary of the polygon.
</para>
<para>
Polygons are output using the first syntax.
</para>
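<para>
As a small sketch with arbitrary vertexes, the same triangle can be
written in the fully parenthesized form or as a bare list of
coordinates; <function>npoints</function> reports the number of points
stored.
</para>
<programlisting>
SELECT polygon '((0,0),(4,0),(4,3))';
SELECT polygon '0,0,4,0,4,3';

-- Number of vertexes in the polygon; 3 for this triangle.
SELECT npoints(polygon '((0,0),(4,0),(4,3))');
</programlisting>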
</sect2>
<sect2>
<title>Circles</title>
<indexterm>
<primary>circle</primary>
</indexterm>
<para>
Circles are represented by a center point and radius.
Values of type <type>circle</type> are specified using the following syntax:

<synopsis>
&lt; ( <replaceable>x</replaceable> , <replaceable>y</replaceable> ) , <replaceable>r</replaceable> &gt;
( ( <replaceable>x</replaceable> , <replaceable>y</replaceable> ) , <replaceable>r</replaceable> )
( <replaceable>x</replaceable> , <replaceable>y</replaceable> ) , <replaceable>r</replaceable>
<replaceable>x</replaceable> , <replaceable>y</replaceable> , <replaceable>r</replaceable>
</synopsis>

where
<literal>(<replaceable>x</replaceable>,<replaceable>y</replaceable>)</literal>
is the center point and <replaceable>r</replaceable> is the radius of the circle.
</para>
<para>
Circles are output using the first syntax.
</para>
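<para>
The following sketch, with arbitrary values, writes one circle in three
of the accepted syntaxes and then extracts its parts with the
<function>center</function> and <function>radius</function> functions.
</para>
<programlisting>
SELECT circle '&lt;(1,2),3&gt;';
SELECT circle '((1,2),3)';
SELECT circle '1,2,3';

-- Extract the center point and the radius.
SELECT center(circle '&lt;(1,2),3&gt;'), radius(circle '&lt;(1,2),3&gt;');
</programlisting>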
</sect2>
</sect1>
<sect1 id="datatype-net-types">
<title>Network Address Types</title>
<indexterm zone="datatype-net-types">
<primary>network</primary>
<secondary>data types</secondary>
</indexterm>
<para>
<productname>PostgreSQL</> offers data types to store IPv4, IPv6, and MAC
addresses, as shown in <xref linkend="datatype-net-types-table">. It
is better to use these types instead of plain text types to store
network addresses, because
these types offer input error checking and specialized
operators and functions (see <xref linkend="functions-net">).
</para>
<table tocentry="1" id="datatype-net-types-table">
<title>Network Address Types</title>
<tgroup cols="3">
<thead>
<row>
<entry>Name</entry>
<entry>Storage Size</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry><type>cidr</type></entry>
<entry>7 or 19 bytes</entry>
<entry>IPv4 and IPv6 networks</entry>
</row>
<row>
<entry><type>inet</type></entry>
<entry>7 or 19 bytes</entry>
<entry>IPv4 and IPv6 hosts and networks</entry>
</row>
<row>
<entry><type>macaddr</type></entry>
<entry>6 bytes</entry>
<entry>MAC addresses</entry>
</row>
</tbody>
</tgroup>
</table>
<para>
When sorting <type>inet</type> or <type>cidr</type> data types,
IPv4 addresses will always sort before IPv6 addresses, even
IPv6 addresses that encapsulate or map an IPv4 address, such as
::10.2.3.4 or ::ffff:10.4.3.2.
</para>
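<para>
The sort order can be observed with a query such as the following
sketch. The column name <literal>addr</literal> and the sample
addresses are chosen only for this example; the plain IPv4 address is
expected to appear first in the result.
</para>
<programlisting>
SELECT addr
FROM (VALUES (inet '::ffff:10.4.3.2'),
             (inet '192.168.1.1'),
             (inet '::10.2.3.4')) AS t(addr)
ORDER BY addr;
</programlisting>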
<sect2 id="datatype-inet">
<title><type>inet</type></title>
<indexterm>
<primary>inet (data type)</primary>
</indexterm>
<para>
The <type>inet</type> type holds an IPv4 or IPv6 host address, and
optionally its subnet, all in one field.
The subnet is represented by the number of network address bits
present in the host address (the
<quote>netmask</quote>). If the netmask is 32 and the address is IPv4,
then the value does not indicate a subnet, only a single host.
In IPv6, the address length is 128 bits, so 128 bits specify a
unique host address. Note that if you
want to accept only networks, you should use the
<type>cidr</type> type rather than <type>inet</type>.
</para>
<para>
The input format for this type is
<replaceable class="parameter">address/y</replaceable>
where
<replaceable class="parameter">address</replaceable>
is an IPv4 or IPv6 address and
<replaceable class="parameter">y</replaceable>
is the number of bits in the netmask. If the
<replaceable class="parameter">/y</replaceable>
portion is missing, the
netmask is 32 for IPv4 and 128 for IPv6, so the value represents
just a single host. On display, the
<replaceable class="parameter">/y</replaceable>
portion is suppressed if the netmask specifies a single host.
</para>
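<para>
A short sketch of these input and display rules, using arbitrary
addresses. The <function>masklen</function> function is used to show
the netmask length that was assumed when none was written.
</para>
<programlisting>
-- Host address together with its subnet; displayed with /24.
SELECT inet '192.168.1.5/24';

-- No netmask given: a single host, displayed without /32.
SELECT inet '192.168.1.5';

-- The assumed netmask length is 32 for an IPv4 address.
SELECT masklen(inet '192.168.1.5');

-- A single IPv6 host.
SELECT inet '::1';
</programlisting>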
</sect2>
<sect2 id="datatype-cidr">
<title><type>cidr</></title>
<indexterm>
<primary>cidr</primary>
</indexterm>
<para>
The <type>cidr</type> type holds an IPv4 or IPv6 network specification.
Input and output formats follow Classless Inter-Domain Routing
conventions.
The format for specifying networks is <replaceable
class="parameter">address/y</> where <replaceable
class="parameter">address</> is the network represented as an
IPv4 or IPv6 address, and <replaceable
class="parameter">y</> is the number of bits in the netmask. If
<replaceable class="parameter">y</> is omitted, it is calculated
using assumptions from the older classful network numbering system, except
that it will be at least large enough to include all of the octets
written in the input. It is an error to specify a network address
that has bits set to the right of the specified netmask.
</para>
<para>
<xref linkend="datatype-net-cidr-table"> shows some examples.
</para>
<table id="datatype-net-cidr-table">
<title><type>cidr</> Type Input Examples</title>
<tgroup cols="3">
<thead>
<row>
<entry><type>cidr</type> Input</entry>
<entry><type>cidr</type> Output</entry>
<entry><literal><function>abbrev</function>(<type>cidr</type>)</literal></entry>
</row>
</thead>
<tbody>
<row>
<entry>192.168.100.128/25</entry>
<entry>192.168.100.128/25</entry>
<entry>192.168.100.128/25</entry>
</row>
<row>
<entry>192.168/24</entry>
<entry>192.168.0.0/24</entry>
<entry>192.168.0/24</entry>
</row>
<row>
<entry>192.168/25</entry>
<entry>192.168.0.0/25</entry>
<entry>192.168.0.0/25</entry>
</row>
<row>
<entry>192.168.1</entry>
<entry>192.168.1.0/24</entry>
<entry>192.168.1/24</entry>
</row>
<row>
<entry>192.168</entry>
<entry>192.168.0.0/24</entry>
<entry>192.168.0/24</entry>
</row>
<row>
<entry>128.1</entry>
<entry>128.1.0.0/16</entry>
<entry>128.1/16</entry>
</row>
<row>
<entry>128</entry>
<entry>128.0.0.0/16</entry>
<entry>128.0/16</entry>
</row>
<row>
<entry>128.1.2</entry>
<entry>128.1.2.0/24</entry>
<entry>128.1.2/24</entry>
</row>
<row>
<entry>10.1.2</entry>
<entry>10.1.2.0/24</entry>
<entry>10.1.2/24</entry>
</row>
<row>
<entry>10.1</entry>
<entry>10.1.0.0/16</entry>
<entry>10.1/16</entry>
</row>
<row>
<entry>10</entry>
<entry>10.0.0.0/8</entry>
<entry>10/8</entry>
</row>
<row>
<entry>10.1.2.3/32</entry>
<entry>10.1.2.3/32</entry>
<entry>10.1.2.3/32</entry>
</row>
<row>
<entry>2001:4f8:3:ba::/64</entry>
<entry>2001:4f8:3:ba::/64</entry>
<entry>2001:4f8:3:ba::/64</entry>
</row>
<row>
<entry>2001:4f8:3:ba:2e0:81ff:fe22:d1f1/128</entry>
<entry>2001:4f8:3:ba:2e0:81ff:fe22:d1f1/128</entry>
<entry>2001:4f8:3:ba:2e0:81ff:fe22:d1f1</entry>
</row>
<row>
<entry>::ffff:1.2.3.0/120</entry>
<entry>::ffff:1.2.3.0/120</entry>
<entry>::ffff:1.2.3/120</entry>
</row>
<row>
<entry>::ffff:1.2.3.0/128</entry>
<entry>::ffff:1.2.3.0/128</entry>
<entry>::ffff:1.2.3.0/128</entry>
</row>
</tbody>
</tgroup>
</table>
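<para>
A few of the rows above can be reproduced with statements like the
following sketch; the expected results noted in the comments are taken
from the table. The last statement illustrates the rule that bits to
the right of the netmask must be zero, and is expected to be rejected
with an error.
</para>
<programlisting>
-- Netmask omitted: derived from the octets written, giving 10.1.0.0/16.
SELECT cidr '10.1';

-- Abbreviated display format, giving 10.1.2/24.
SELECT abbrev(cidr '10.1.2.0/24');

-- Rejected: the address has bits set to the right of the /25 netmask.
SELECT cidr '192.168.100.129/25';
</programlisting>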
</sect2>
<sect2 id="datatype-inet-vs-cidr">
<title><type>inet</type> vs. <type>cidr</type></title>
<para>
The essential difference between <type>inet</type> and <type>cidr</type>
data types is that <type>inet</type> accepts values with nonzero bits to
the right of the netmask, whereas <type>cidr</type> does not.
</para>
<tip>
<para>
If you do not like the output format for <type>inet</type> or
<type>cidr</type> values, try the functions <function>host</>,
<function>text</>, and <function>abbrev</>.
</para>
</tip>
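<para>
The difference between the two types, and the effect of the formatting
functions named in the tip, can be seen in a sketch like this one. The
addresses are arbitrary examples.
</para>
<programlisting>
-- Accepted: inet allows nonzero bits to the right of the netmask.
SELECT inet '192.168.1.5/24';

-- Rejected with an error: cidr does not.
SELECT cidr '192.168.1.5/24';

-- Alternative output formats.
SELECT host(inet '192.168.1.5/24');   -- 192.168.1.5
SELECT text(inet '192.168.1.5');      -- 192.168.1.5/32
SELECT abbrev(cidr '10.1.0.0/16');    -- 10.1/16
</programlisting>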
</sect2>
<sect2 id="datatype-macaddr">
<title><type>macaddr</></>
<indexterm>
<primary>macaddr (data type)</primary>
</indexterm>
<indexterm>
<primary>MAC address</primary>
<see>macaddr</see>
</indexterm>
<para>
The <type>macaddr</> type stores MAC addresses, such as the
hardware addresses of Ethernet cards (although MAC addresses are
used for other purposes as well). Input is accepted in the
following formats:
<simplelist>
<member><literal>'08:00:2b:01:02:03'</></member>
<member><literal>'08-00-2b-01-02-03'</></member>
<member><literal>'08002b:010203'</></member>
<member><literal>'08002b-010203'</></member>
<member><literal>'0800.2b01.0203'</></member>
<member><literal>'08002b010203'</></member>
</simplelist>
These examples would all specify the same address. Both upper and
lower case are accepted for the digits
<literal>a</> through <literal>f</>. Output is always in the
first of the forms shown.
</para>
<para>
IEEE Std 802-2001 specifies the second shown form (with hyphens)
as the canonical form for MAC addresses, and specifies the first
form (with colons) as the bit-reversed notation, in which the bits
of each octet are written in reverse order, so that
08-00-2b-01-02-03 = 10:00:D4:80:40:C0. This convention is widely
ignored nowadays, and it is only relevant for obsolete network
protocols (such as Token Ring). <productname>PostgreSQL</> makes no
provisions for bit reversal, and all accepted formats use the
canonical LSB order.
</para>
<para>
The remaining four input formats are not part of any standard.
</para>
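<para>
As a sketch of the input rules described above, the following
statements enter the same address in different formats. The first is
expected to be displayed in the colon-separated form, and the
comparison is expected to return true.
</para>
<programlisting>
-- Hyphenated input; output uses the first (colon-separated) form.
SELECT macaddr '08-00-2b-01-02-03';

-- Different formats and letter case denote the same address.
SELECT macaddr '0800.2b01.0203' = macaddr '08002B:010203';
</programlisting>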
</sect2>
</sect1>
<sect1 id="datatype-bit">
<title>Bit String Types</title>
<indexterm zone="datatype-bit">
<primary>bit string</primary>
<secondary>data type</secondary>
</indexterm>
<para>
Bit strings are strings of 1's and 0's. They can be used to store
or visualize bit masks. There are two SQL bit types:
<type>bit(<replaceable>n</replaceable>)</type> and <type>bit
varying(<replaceable>n</replaceable>)</type>, where
<replaceable>n</replaceable> is a positive integer.
</para>
<para>
<type>bit</type> type data must match the length
<replaceable>n</replaceable> exactly; it is an error to attempt to
store shorter or longer bit strings. <type>bit varying</type> data is
of variable length up to the maximum length
<replaceable>n</replaceable>; longer strings will be rejected.
Writing <type>bit</type> without a length is equivalent to
<literal>bit(1)</literal>, while <type>bit varying</type> without a length
specification means unlimited length.
</para>
<note>
<para>
If one explicitly casts a bit-string value to
<type>bit(<replaceable>n</>)</type>, it will be truncated or
zero-padded on the right to be exactly <replaceable>n</> bits,
without raising an error. Similarly,
if one explicitly casts a bit-string value to
<type>bit varying(<replaceable>n</>)</type>, it will be truncated
on the right if it is more than <replaceable>n</> bits.
</para>
</note>
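<para>
The casting behavior described in the note can be sketched as follows.
The bit-string literals are arbitrary, and the comments give the
results that follow from the rules above.
</para>
<programlisting>
SELECT B'10'::bit(3);             -- zero-padded on the right: 100
SELECT B'10101'::bit(3);          -- truncated on the right: 101
SELECT B'10101'::bit varying(3);  -- truncated on the right: 101
SELECT B'10'::bit varying(5);     -- within the limit, unchanged: 10
</programlisting>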
<para>
Refer to <xref
linkend="sql-syntax-bit-strings"> for information about the syntax
of bit string constants. Bit-logical operators and string
manipulation functions are available; see <xref
linkend="functions-bitstring">.
</para>
<example>
<title>Using the bit string types</title>
<programlisting>
CREATE TABLE test (a BIT(3), b BIT VARYING(5));
INSERT INTO test VALUES (B'101', B'00');
INSERT INTO test VALUES (B'10', B'101');
<computeroutput>
ERROR:  bit string length 2 does not match type bit(3)
</computeroutput>
INSERT INTO test VALUES (B'10'::bit(3), B'101');
SELECT * FROM test;
<computeroutput>
  a  |  b
-----+-----
 101 | 00
 100 | 101
</computeroutput>
</programlisting>
2001-05-22 18:37:17 +02:00
</example>
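<para>
As a further sketch of the casting rules (this query is illustrative and
not part of the example above), an explicit cast to a shorter
<type>bit</type> or <type>bit varying</type> length truncates on the
right instead of raising an error:
<programlisting>
-- Explicit casts truncate on the right; no error is raised.
SELECT B'10101'::bit(3) AS fixed_bits,
       B'10101'::bit varying(3) AS varying_bits;
<computeroutput>
 fixed_bits | varying_bits
------------+--------------
 101        | 101
</computeroutput>
</programlisting>
Assigning the same five-bit value to a <type>bit(3)</type> column without
a cast would still be rejected, as the <command>INSERT</command> in the
example shows.
</para>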
2001-01-13 19:34:51 +01:00
2007-04-06 21:22:38 +02:00
<para>
A bit string value requires 1 byte for each group of 8 bits, plus
5 or 8 bytes overhead depending on the length of the string
(but long values might be compressed or moved out-of-line, as explained
in <xref linkend="datatype-character"> for character strings).
</para>
2001-01-13 19:34:51 +01:00
</sect1>
2007-10-21 22:04:37 +02:00
<sect1 id="datatype-textsearch">
<title>Text Search Types</title>
2007-04-20 23:51:46 +02:00
2007-10-21 22:04:37 +02:00
<indexterm zone="datatype-textsearch">
<primary>full text search</primary>
<secondary>data types</secondary>
2007-04-20 23:51:46 +02:00
</indexterm>
2007-10-21 22:04:37 +02:00
<indexterm zone="datatype-textsearch">
<primary>text search</primary>
<secondary>data types</secondary>
</indexterm>
2007-04-21 19:26:18 +02:00
<para>
2007-10-21 22:04:37 +02:00
<productname>PostgreSQL</productname> provides two data types that
are designed to support full text search, which is the activity of
searching through a collection of natural-language <firstterm>documents</>
to locate those that best match a <firstterm>query</>.
2009-06-17 23:58:49 +02:00
The <type>tsvector</type> type represents a document in a form optimized
for text search; the <type>tsquery</type> type similarly represents
2009-04-27 18:27:36 +02:00
a text query.
2007-10-21 22:04:37 +02:00
<xref linkend="textsearch"> provides a detailed explanation of this
facility, and <xref linkend="functions-textsearch"> summarizes the
related functions and operators.
2007-04-21 19:26:18 +02:00
</para>
2007-08-29 22:37:14 +02:00
2007-10-21 22:04:37 +02:00
<sect2 id="datatype-tsvector">
<title><type>tsvector</type></title>
2007-08-29 22:37:14 +02:00
2007-10-21 22:04:37 +02:00
<indexterm>
<primary>tsvector (data type)</primary>
</indexterm>
2007-08-29 22:37:14 +02:00
2007-10-21 22:04:37 +02:00
<para>
A <type>tsvector</type> value is a sorted list of distinct
2009-06-17 23:58:49 +02:00
<firstterm>lexemes</>, which are words that have been
2009-04-27 18:27:36 +02:00
<firstterm>normalized</> to merge different variants of the same word
(see <xref linkend="textsearch"> for details). Sorting and
2007-10-21 22:04:37 +02:00
duplicate-elimination are done automatically during input, as shown in
this example:
2007-08-29 22:37:14 +02:00
<programlisting>
SELECT 'a fat cat sat on a mat and ate a fat rat'::tsvector;
tsvector
----------------------------------------------------
2008-05-16 18:31:02 +02:00
'a' 'and' 'ate' 'cat' 'fat' 'mat' 'on' 'rat' 'sat'
2007-08-29 22:37:14 +02:00
</programlisting>
2008-05-16 18:31:02 +02:00
To represent
2007-11-21 05:01:37 +01:00
lexemes containing whitespace or punctuation, surround them with quotes:
2007-10-21 22:04:37 +02:00
<programlisting>
SELECT $$the lexeme ' ' contains spaces$$::tsvector;
tsvector
-------------------------------------------
2008-05-16 18:31:02 +02:00
' ' 'contains' 'lexeme' 'spaces' 'the'
2007-10-21 22:04:37 +02:00
</programlisting>
2009-04-27 18:27:36 +02:00
(We use dollar-quoted string literals in this example and the next one
to avoid the confusion of having to double quote marks within the
2007-11-21 05:01:37 +01:00
literals.) Embedded quotes and backslashes must be doubled:
2007-08-29 22:37:14 +02:00
<programlisting>
2007-10-21 22:04:37 +02:00
SELECT $$the lexeme 'Joe''s' contains a quote$$::tsvector;
tsvector
------------------------------------------------
2008-05-16 18:31:02 +02:00
'Joe''s' 'a' 'contains' 'lexeme' 'quote' 'the'
2007-08-29 22:37:14 +02:00
</programlisting>
2009-04-27 18:27:36 +02:00
Optionally, integer <firstterm>positions</>
can be attached to lexemes:
2007-08-29 22:37:14 +02:00
<programlisting>
SELECT 'a:1 fat:2 cat:3 sat:4 on:5 a:6 mat:7 and:8 ate:9 a:10 fat:11 rat:12'::tsvector;
tsvector
-------------------------------------------------------------------------------
2008-05-16 18:31:02 +02:00
'a':1,6,10 'and':8 'ate':9 'cat':3 'fat':2,11 'mat':7 'on':5 'rat':12 'sat':4
2007-08-29 22:37:14 +02:00
</programlisting>
2007-10-21 22:04:37 +02:00
A position normally indicates the source word's location in the
document. Positional information can be used for
<firstterm>proximity ranking</firstterm>. Position values can
2009-04-27 18:27:36 +02:00
range from 1 to 16383; larger numbers are silently set to 16383.
2008-01-12 22:51:36 +01:00
Duplicate positions for the same lexeme are discarded.
2007-10-21 22:04:37 +02:00
</para>
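<para>
A small sketch of the position rules just described, using a made-up
input: the repeated position <literal>2</literal> should be kept only
once, and <literal>20000</literal> exceeds the maximum, so it should be
stored as <literal>16383</literal>:
<programlisting>
-- Duplicate position 2 and out-of-range position 20000 for one lexeme.
SELECT 'fat:2,2,20000'::tsvector;
</programlisting>
</para>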
2007-08-29 22:37:14 +02:00
2007-10-21 22:04:37 +02:00
<para>
Lexemes that have positions can further be labeled with a
<firstterm>weight</>, which can be <literal>A</literal>,
<literal>B</literal>, <literal>C</literal>, or <literal>D</literal>.
<literal>D</literal> is the default and hence is not shown on output:
2007-08-29 22:37:14 +02:00
<programlisting>
2007-10-21 22:04:37 +02:00
SELECT 'a:1A fat:2B,4C cat:5D'::tsvector;
tsvector
----------------------------
'a':1A 'cat':5 'fat':2B,4C
</programlisting>
2007-08-29 22:37:14 +02:00
2007-10-21 22:04:37 +02:00
Weights are typically used to reflect document structure, for example
by marking title words differently from body words. Text search
ranking functions can assign different priorities to the different
weight markers.
</para>
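<para>
One way to attach weights according to document structure is sketched
below. The two string literals stand in for the title and body of a
hypothetical document; <function>setweight</function> labels every
position in its argument, and <literal>||</literal> concatenates the
resulting <type>tsvector</type> values:
<programlisting>
-- Title words get weight A; body words keep the default weight D.
SELECT setweight(to_tsvector('english', 'Fat Rats'), 'A') ||
       setweight(to_tsvector('english', 'The rats ate the cheese'), 'D');
</programlisting>
A ranking function can then treat matches on title words as more
important than matches on body words; see
<xref linkend="textsearch"> for the ranking functions themselves.
</para>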
<para>
It is important to understand that the
<type>tsvector</type> type itself does not perform any normalization;
2009-04-27 18:27:36 +02:00
it assumes the words it is given are normalized appropriately
2007-10-21 22:04:37 +02:00
for the application. For example,
<programlisting>
SELECT 'The Fat Rats'::tsvector;
tsvector
--------------------
2008-05-16 18:31:02 +02:00
'Fat' 'Rats' 'The'
2007-08-29 22:37:14 +02:00
</programlisting>
2007-10-21 22:04:37 +02:00
For most English-text-searching applications the above words would
be considered non-normalized, but <type>tsvector</type> doesn't care.
Raw document text should usually be passed through
<function>to_tsvector</> to normalize the words appropriately
for searching:
2007-08-29 22:37:14 +02:00
2007-10-21 22:04:37 +02:00
<programlisting>
SELECT to_tsvector('english', 'The Fat Rats');
to_tsvector
-----------------
'fat':2 'rat':3
</programlisting>
2007-08-29 22:37:14 +02:00
2007-10-21 22:04:37 +02:00
Again, see <xref linkend="textsearch"> for more detail.
</para>
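<para>
The practical effect of skipping normalization can be seen by matching
both forms against a query with the <literal>@@</literal> operator. This
is only an illustrative sketch; the column aliases are arbitrary:
<programlisting>
-- The raw cast keeps 'Rats' as-is, so the lexeme 'rat' is not found.
-- to_tsvector and to_tsquery both reduce the word to 'rat', so it is.
SELECT 'The Fat Rats'::tsvector @@ 'rat'::tsquery AS raw_match,
       to_tsvector('english', 'The Fat Rats')
           @@ to_tsquery('english', 'rats') AS normalized_match;
 raw_match | normalized_match
-----------+------------------
 f         | t
</programlisting>
</para>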
2007-08-29 22:37:14 +02:00
2007-10-21 22:04:37 +02:00
</sect2>
2007-08-29 22:37:14 +02:00
2007-10-21 22:04:37 +02:00
<sect2 id="datatype-tsquery">
<title><type>tsquery</type></title>
<indexterm>
<primary>tsquery (data type)</primary>
</indexterm>
<para>
A <type>tsquery</type> value stores lexemes that are to be
2009-06-17 23:58:49 +02:00
searched for, and combines them honoring the boolean operators
2007-10-21 22:04:37 +02:00
<literal>&</literal> (AND), <literal>|</literal> (OR), and
<literal>!</> (NOT). Parentheses can be used to enforce grouping
of the operators:
2007-08-29 22:37:14 +02:00
<programlisting>
2007-10-21 22:04:37 +02:00
SELECT 'fat & rat'::tsquery;
tsquery
2007-08-29 22:37:14 +02:00
---------------
2007-10-21 22:04:37 +02:00
'fat' & 'rat'
SELECT 'fat & (rat | cat)'::tsquery;
tsquery
---------------------------
'fat' & ( 'rat' | 'cat' )
SELECT 'fat & rat & ! cat'::tsquery;
tsquery
------------------------
'fat' & 'rat' & !'cat'
</programlisting>
In the absence of parentheses, <literal>!</> (NOT) binds most tightly,
and <literal>&</literal> (AND) binds more tightly than
<literal>|</literal> (OR).
</para>
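<para>
The following sketch shows how these binding rules affect a match. The
one-lexeme document is made up for illustration:
<programlisting>
-- Without parentheses the query means fat | (rat & cat), which the
-- lexeme 'fat' alone satisfies. Forcing (fat | rat) & cat additionally
-- requires 'cat', which is absent.
SELECT 'fat'::tsvector @@ 'fat | rat & cat'::tsquery   AS default_grouping,
       'fat'::tsvector @@ '(fat | rat) & cat'::tsquery AS forced_grouping;
 default_grouping | forced_grouping
------------------+-----------------
 t                | f
</programlisting>
</para>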
<para>
Optionally, lexemes in a <type>tsquery</type> can be labeled with
one or more weight letters, which restricts them to match only
2009-04-27 18:27:36 +02:00
<type>tsvector</> lexemes with matching weights:
2007-10-21 22:04:37 +02:00
<programlisting>
2007-08-29 22:37:14 +02:00
SELECT 'fat:ab & cat'::tsquery;
tsquery
------------------
'fat':AB & 'cat'
</programlisting>
2007-10-21 22:04:37 +02:00
</para>
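<para>
As a sketch of how the weight restriction behaves, the same query can be
matched against two made-up <type>tsvector</type> values that differ only
in the weight attached to <literal>fat</literal>:
<programlisting>
-- 'fat:ab' matches only occurrences of 'fat' labeled A or B.
SELECT 'fat:1A cat:2'::tsvector @@ 'fat:ab & cat'::tsquery AS weight_a,
       'fat:1C cat:2'::tsvector @@ 'fat:ab & cat'::tsquery AS weight_c;
 weight_a | weight_c
----------+----------
 t        | f
</programlisting>
</para>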
2007-08-29 22:37:14 +02:00
2008-05-16 18:31:02 +02:00
<para>
Also, lexemes in a <type>tsquery</type> can be labeled with <literal>*</>
to specify prefix matching:
<programlisting>
SELECT 'super:*'::tsquery;
tsquery
-----------
'super':*
</programlisting>
This query will match any word in a <type>tsvector</> that begins
with <quote>super</>.
</para>
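<para>
For comparison, this sketch matches a made-up one-word document with and
without the prefix label:
<programlisting>
-- Only the prefix form matches the longer lexeme 'supernova'.
SELECT 'supernova'::tsvector @@ 'super:*'::tsquery AS prefix_match,
       'supernova'::tsvector @@ 'super'::tsquery   AS exact_match;
 prefix_match | exact_match
--------------+-------------
 t            | f
</programlisting>
</para>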
2007-10-21 22:04:37 +02:00
<para>
2009-04-27 18:27:36 +02:00
Quoting rules for lexemes are the same as described previously for
2007-10-21 22:04:37 +02:00
lexemes in <type>tsvector</>; and, as with <type>tsvector</>,
2009-04-27 18:27:36 +02:00
any required normalization of words must be done before converting
to the <type>tsquery</> type. The <function>to_tsquery</>
2007-10-21 22:04:37 +02:00
function is convenient for performing such normalization:
2007-08-29 22:37:14 +02:00
<programlisting>
2007-12-04 00:49:51 +01:00
SELECT to_tsquery('Fat:ab & Cats');
2007-10-21 22:04:37 +02:00
to_tsquery
------------------
2007-12-04 00:49:51 +01:00
'fat':AB & 'cat'
2007-08-29 22:37:14 +02:00
</programlisting>
2007-10-21 22:04:37 +02:00
</para>
2007-08-29 22:37:14 +02:00
2007-10-21 22:04:37 +02:00
</sect2>
</sect1>
<sect1 id="datatype-uuid">
<title><acronym>UUID</acronym> Type</title>
2007-08-29 22:37:14 +02:00
2007-10-21 22:04:37 +02:00
<indexterm zone="datatype-uuid">
<primary>UUID</primary>
</indexterm>
<para>
The data type <type>uuid</type> stores Universally Unique Identifiers
(UUID) as defined by RFC 4122, ISO/IEC 9834-8:2005, and related standards.
2009-04-27 18:27:36 +02:00
(Some systems refer to this data type as a globally unique identifier, or
GUID,<indexterm><primary>GUID</primary></indexterm> instead.) This
2007-10-21 22:04:37 +02:00
identifier is a 128-bit quantity that is generated by an algorithm chosen
to make it very unlikely that the same identifier will be generated by
anyone else in the known universe using the same algorithm. Therefore,
for distributed systems, these identifiers provide a better uniqueness
2009-04-27 18:27:36 +02:00
guarantee than sequence generators, which
2007-10-21 22:04:37 +02:00
are only unique within a single database.
</para>
<para>
A UUID is written as a sequence of lower-case hexadecimal digits,
in several groups separated by hyphens, specifically a group of 8
digits followed by three groups of 4 digits followed by a group of
12 digits, for a total of 32 digits representing the 128 bits. An
example of a UUID in this standard form is:
<programlisting>
a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11
</programlisting>
<productname>PostgreSQL</productname> also accepts the following
alternative forms for input:
upper-case digits, the standard format surrounded by
2008-11-03 23:14:40 +01:00
braces, omission of some or all hyphens, and a hyphen added after any
group of four digits. Examples are:
2007-10-21 22:04:37 +02:00
<programlisting>
A0EEBC99-9C0B-4EF8-BB6D-6BB9BD380A11
{a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11}
a0eebc999c0b4ef8bb6d6bb9bd380a11
2008-11-03 23:14:40 +01:00
a0ee-bc99-9c0b-4ef8-bb6d-6bb9-bd38-0a11
{a0eebc99-9c0b4ef8-bb6d6bb9-bd380a11}
2007-10-21 22:04:37 +02:00
</programlisting>
Output is always in the standard form.
</para>
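<para>
A brief sketch of this behavior, reusing two of the alternative input
forms listed above (the column aliases are arbitrary):
<programlisting>
-- Both inputs denote the same UUID and are shown in the standard form.
SELECT 'A0EEBC99-9C0B-4EF8-BB6D-6BB9BD380A11'::uuid AS from_upper_case,
       'a0eebc999c0b4ef8bb6d6bb9bd380a11'::uuid     AS from_unhyphenated;
           from_upper_case            |          from_unhyphenated
--------------------------------------+--------------------------------------
 a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11 | a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11
</programlisting>
</para>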
<para>
<productname>PostgreSQL</productname> provides storage and comparison
functions for UUIDs, but the core database does not include any
function for generating UUIDs, because no single algorithm is well
suited for every application. The contrib module
<filename>contrib/uuid-ossp</filename> provides functions that implement
several standard algorithms.
Alternatively, UUIDs could be generated by client applications or
other libraries invoked through a server-side function.
</para>
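<para>
A minimal sketch of server-side generation, assuming the
<filename>contrib/uuid-ossp</filename> module has been installed in the
current database. The table <literal>items</literal> and its columns are
hypothetical and serve only to show a generated default:
<programlisting>
-- Requires contrib/uuid-ossp; uuid_generate_v4() returns a random UUID.
SELECT uuid_generate_v4();

-- Hypothetical table that fills in its key automatically.
CREATE TABLE items (
    id   uuid PRIMARY KEY DEFAULT uuid_generate_v4(),
    name text
);
INSERT INTO items (name) VALUES ('first item');
</programlisting>
Whether a random (version 4) or another UUID variant is appropriate
depends on the application.
</para>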
2007-08-29 22:37:14 +02:00
</sect1>
2007-04-02 17:27:02 +02:00
<sect1 id="datatype-xml">
<title><acronym>XML</> Type</title>
<indexterm zone="datatype-xml">
<primary>XML</primary>
</indexterm>
<para>
2009-04-27 18:27:36 +02:00
The <type>xml</type> data type can be used to store XML data. Its
2007-04-02 17:27:02 +02:00
advantage over storing XML data in a <type>text</type> field is that it
2009-06-17 23:58:49 +02:00
checks the input values for well-formedness, and there are support
functions to perform type-safe operations on it; see <xref
2007-04-05 03:46:27 +02:00
linkend="functions-xml">. Use of this data type requires the
installation to have been built with <command>configure
--with-libxml</>.
2007-04-02 17:27:02 +02:00
</para>
<para>
2007-04-05 03:46:27 +02:00
The <type>xml</type> type can store well-formed
2007-04-02 17:27:02 +02:00
<quote>documents</quote>, as defined by the XML standard, as well
as <quote>content</quote> fragments, which are defined by the
production <literal>XMLDecl? content</literal> in the XML
standard. Roughly, this means that content fragments can have
more than one top-level element or character node. The expression
<literal><replaceable>xmlvalue</replaceable> IS DOCUMENT</literal>
can be used to evaluate whether a particular <type>xml</type>
value is a full document or only a content fragment.
</para>
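<para>
As an illustrative sketch, the test can be applied to one well-formed
document and to one content fragment with several top-level nodes:
<programlisting><![CDATA[
-- The first value has a single root element; the second does not.
SELECT XMLPARSE(DOCUMENT '<book><title>Manual</title></book>') IS DOCUMENT AS doc,
       XMLPARSE(CONTENT 'abc<foo>bar</foo><bar>foo</bar>') IS DOCUMENT AS fragment;
 doc | fragment
-----+----------
 t   | f
]]></programlisting>
</para>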
2007-05-21 19:10:29 +02:00
<sect2>
<title>Creating XML Values</title>
2007-04-02 17:27:02 +02:00
<para>
To produce a value of type <type>xml</type> from character data,
use the function
<function>xmlparse</function>:<indexterm><primary>xmlparse</primary></indexterm>
<synopsis>
XMLPARSE ( { DOCUMENT | CONTENT } <replaceable>value</replaceable>)
</synopsis>
Examples:
<programlisting><![CDATA[
2008-02-13 23:46:55 +01:00
XMLPARSE (DOCUMENT '<?xml version="1.0"?><book><title>Manual</title><chapter>...</chapter></book>')
2007-05-21 19:10:29 +02:00
XMLPARSE (CONTENT 'abc<foo>bar</foo><bar>foo</bar>')
2007-04-02 17:27:02 +02:00
]]></programlisting>
While this is the only way to convert character strings into XML
values according to the SQL standard, the PostgreSQL-specific
syntaxes:
<programlisting><![CDATA[
xml '<foo>bar</foo>'
'<foo>bar</foo>'::xml
]]></programlisting>
can also be used.
</para>
<para>
2009-04-27 18:27:36 +02:00
The <type>xml</type> type does not validate input values
2009-06-17 23:58:49 +02:00
against a document type declaration
(DTD),<indexterm><primary>DTD</primary></indexterm>
even when the input value specifies a DTD.
2007-04-02 17:27:02 +02:00
</para>
<para>
2009-04-27 18:27:36 +02:00
The inverse operation, producing a character string value from
2007-04-02 17:27:02 +02:00
<type>xml</type>, uses the function
<function>xmlserialize</function>:<indexterm><primary>xmlserialize</primary></indexterm>
<synopsis>
XMLSERIALIZE ( { DOCUMENT | CONTENT } <replaceable>value</replaceable> AS <replaceable>type</replaceable> )
</synopsis>
2009-04-27 18:27:36 +02:00
<replaceable>type</replaceable> can be
2007-04-02 17:27:02 +02:00
<type>character</type>, <type>character varying</type>, or
2009-06-17 23:58:49 +02:00
<type>text</type> (or an alias for one of those). Again, according
2007-04-02 17:27:02 +02:00
to the SQL standard, this is the only way to convert between type
<type>xml</type> and character types, but PostgreSQL also allows
you to simply cast the value.
</para>
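<para>
The two routes from <type>xml</type> to a character type might be
sketched as follows. The input value is made up for illustration:
<programlisting><![CDATA[
-- SQL-standard form.
SELECT XMLSERIALIZE(CONTENT '<foo>bar</foo>'::xml AS text);

-- PostgreSQL-specific shorthand: a plain cast.
SELECT '<foo>bar</foo>'::xml::text;
]]></programlisting>
</para>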
<para>
2009-04-27 18:27:36 +02:00
When a character string value is cast to or from type
2007-04-02 17:27:02 +02:00
<type>xml</type> without going through <function>XMLPARSE</function> or
<function>XMLSERIALIZE</function>, respectively, the choice of
<literal>DOCUMENT</literal> versus <literal>CONTENT</literal> is
determined by the <quote>XML option</quote>
<indexterm><primary>XML option</primary></indexterm>
session configuration parameter, which can be set using the
2009-04-27 18:27:36 +02:00
standard command:
2007-04-02 17:27:02 +02:00
<synopsis>
SET XML OPTION { DOCUMENT | CONTENT };
</synopsis>
or the more PostgreSQL-like syntax
<synopsis>
SET xmloption TO { DOCUMENT | CONTENT };
</synopsis>
The default is <literal>CONTENT</literal>, so all forms of XML
data are allowed.
</para>
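<para>
A sketch of how the setting affects a plain cast, using a made-up
fragment that is valid content but not a document:
<programlisting><![CDATA[
SET xmloption TO DOCUMENT;
-- Expected to be rejected: text precedes the only element, so the
-- value is not a well-formed document.
SELECT 'abc<foo>bar</foo>'::xml;

SET xmloption TO CONTENT;
-- Accepted: the same value is a valid content fragment.
SELECT 'abc<foo>bar</foo>'::xml;
]]></programlisting>
</para>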
2007-05-21 19:10:29 +02:00
</sect2>
2007-04-02 17:27:02 +02:00
2007-05-21 19:10:29 +02:00
<sect2>
<title>Encoding Handling</title>
2007-04-02 17:27:02 +02:00
<para>
Care must be taken when dealing with multiple character encodings
on the client, server, and in the XML data passed through them.
When using the text mode to pass queries to the server and query
results to the client (which is the normal mode), PostgreSQL
converts all character data passed between the client and the
server, in either direction, to the character encoding of the receiving
end; see <xref linkend="multibyte">. This includes string
representations of XML values, such as in the above examples.
This would ordinarily mean that encoding declarations contained in
2009-04-27 18:27:36 +02:00
XML data can become invalid as the character data is converted
2009-06-17 23:58:49 +02:00
to other encodings while travelling between client and server,
2009-04-27 18:27:36 +02:00
because the embedded encoding declaration is not changed. To cope
with this behavior, encoding declarations contained in
character strings presented for input to the <type>xml</type> type
are <emphasis>ignored</emphasis>, and content is assumed
2007-04-02 17:27:02 +02:00
to be in the current server encoding. Consequently, for correct
2009-04-27 18:27:36 +02:00
processing, character strings of XML data must be sent
2007-04-02 17:27:02 +02:00
from the client in the current client encoding. It is the
2009-04-27 18:27:36 +02:00
responsibility of the client to either convert documents to the
2009-06-17 23:58:49 +02:00
current client encoding before sending them to the server, or to
2007-04-02 17:27:02 +02:00
adjust the client encoding appropriately. On output, values of
type <type>xml</type> will not have an encoding declaration, and
2009-04-27 18:27:36 +02:00
clients should assume all data is in the current client
2007-04-02 17:27:02 +02:00
encoding.
</para>
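<para>
A client that cannot easily convert its documents can instead inspect
and adjust the session's encoding settings. The following is only a
sketch; <literal>LATIN1</literal> is an arbitrary example and must be
replaced by the encoding the XML data is actually stored in:
<programlisting>
-- Check what the server and the current session are using.
SHOW server_encoding;
SHOW client_encoding;

-- Tell the server that subsequent strings arrive in LATIN1
-- (example value only).
SET client_encoding TO 'LATIN1';
</programlisting>
</para>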
<para>
2009-04-27 18:27:36 +02:00
When using binary mode to pass query parameters to the server
2007-05-03 17:05:56 +02:00
and query results back to the client, no character set conversion
2007-04-02 17:27:02 +02:00
is performed, so the situation is different. In this case, an
encoding declaration in the XML data will be observed, and if it
is absent, the data will be assumed to be in UTF-8 (as required by
2009-04-27 18:27:36 +02:00
the XML standard; note that PostgreSQL does not support UTF-16).
On output, data will have an encoding declaration
2007-04-02 17:27:02 +02:00
specifying the client encoding, unless the client encoding is
UTF-8, in which case it will be omitted.
</para>
<para>
Processing XML data with PostgreSQL will be less
2009-04-27 18:27:36 +02:00
error-prone and more efficient if the XML data encoding, client encoding,
2007-04-02 17:27:02 +02:00
and server encoding are the same. Since XML data is internally
processed in UTF-8, computations will be most efficient if the
server encoding is also UTF-8.
</para>
2009-06-10 22:25:41 +02:00
<caution>
<para>
Some XML-related functions might not work at all on non-ASCII data
when the server encoding is not UTF-8. This is known to be an
issue for <function>xpath()</> in particular.
</para>
</caution>
2007-05-21 19:10:29 +02:00
</sect2>
<sect2>
<title>Accessing XML Values</title>
<para>
The <type>xml</type> data type is unusual in that it does not
provide any comparison operators. This is because there is no
well-defined and universally useful comparison algorithm for XML
data. One consequence of this is that you cannot retrieve rows by
comparing an <type>xml</type> column against a search value. XML
values should therefore typically be accompanied by a separate key
field such as an ID. An alternative solution for comparing XML
values is to convert them to character strings first, but note
that character string comparison has little to do with a useful
XML comparison method.
</para>
<para>
Since there are no comparison operators for the <type>xml</type>
data type, it is not possible to create an index directly on a
column of this type. If speedy searches in XML data are desired,
2009-04-27 18:27:36 +02:00
possible workarounds include casting the expression to a
2007-05-21 19:10:29 +02:00
character string type and indexing that, or indexing an XPath
2009-04-27 18:27:36 +02:00
expression. Of course, the actual query would have to be adjusted
2007-05-21 19:10:29 +02:00
to search by the indexed expression.
</para>
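<para>
The XPath workaround might look like the following sketch. The table
<literal>docs</literal>, its columns, and the XPath expression are all
hypothetical, and the example assumes that the indexed expression is
accepted as immutable by the server version in use:
<programlisting>
-- Hypothetical table holding one XML document per row.
CREATE TABLE docs (id integer PRIMARY KEY, content xml);

-- Index the text of the first title node, cast to text so that
-- ordinary comparison operators are available.
CREATE INDEX docs_title_idx
    ON docs (((xpath('/book/title/text()', content))[1]::text));

-- The query must use exactly the same expression to use the index.
SELECT id FROM docs
 WHERE (xpath('/book/title/text()', content))[1]::text = 'Manual';
</programlisting>
</para>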
<para>
2009-04-27 18:27:36 +02:00
The text-search functionality in PostgreSQL can also be used to speed
up full-document searches of XML data. The necessary
preprocessing support is, however, not yet available in the PostgreSQL
distribution.
2007-05-21 19:10:29 +02:00
</para>
</sect2>
2007-04-02 17:27:02 +02:00
</sect1>
2003-03-13 02:30:29 +01:00
&array;
2004-06-07 06:04:47 +02:00
&rowtypes;
2002-04-25 04:56:56 +02:00
<sect1 id="datatype-oid">
<title>Object Identifier Types</title>
<indexterm zone="datatype-oid">
<primary>object identifier</primary>
<secondary>data type</secondary>
</indexterm>
<indexterm zone="datatype-oid">
<primary>oid</primary>
</indexterm>
<indexterm zone="datatype-oid">
<primary>regproc</primary>
</indexterm>
<indexterm zone="datatype-oid">
<primary>regprocedure</primary>
</indexterm>
<indexterm zone="datatype-oid">
<primary>regoper</primary>
</indexterm>
<indexterm zone="datatype-oid">
<primary>regoperator</primary>
</indexterm>
<indexterm zone="datatype-oid">
<primary>regclass</primary>
</indexterm>
<indexterm zone="datatype-oid">
<primary>regtype</primary>
</indexterm>
2007-08-21 03:11:32 +02:00
<indexterm zone="datatype-oid">
<primary>regconfig</primary>
</indexterm>
<indexterm zone="datatype-oid">
<primary>regdictionary</primary>
</indexterm>
2002-04-25 22:14:43 +02:00
<indexterm zone="datatype-oid">
<primary>xid</primary>
</indexterm>
<indexterm zone="datatype-oid">
<primary>cid</primary>
</indexterm>
<indexterm zone="datatype-oid">
<primary>tid</primary>
</indexterm>
2002-04-25 04:56:56 +02:00
<para>
Object identifiers (OIDs) are used internally by
2003-12-01 23:08:02 +01:00
<productname>PostgreSQL</productname> as primary keys for various
2005-03-13 10:36:31 +01:00
system tables. OIDs are not added to user-created tables, unless
<literal>WITH OIDS</literal> is specified when the table is
created, or the <xref linkend="guc-default-with-oids">
configuration variable is enabled. Type <type>oid</> represents
an object identifier. There are also several alias types for
<type>oid</>: <type>regproc</>, <type>regprocedure</>,
2007-08-21 03:11:32 +02:00
<type>regoper</>, <type>regoperator</>, <type>regclass</>,
<type>regtype</>, <type>regconfig</>, and <type>regdictionary</>.
<xref linkend="datatype-oid-table"> shows an overview.
2002-04-25 04:56:56 +02:00
</para>
<para>
2003-12-01 23:08:02 +01:00
The <type>oid</> type is currently implemented as an unsigned
four-byte integer. Therefore, it is not large enough to provide
database-wide uniqueness in large databases, or even in large
individual tables. So, using a user-created table's OID column as
a primary key is discouraged. OIDs are best used only for
references to system tables.
2002-04-25 04:56:56 +02:00
</para>
<para>
2003-03-13 02:30:29 +01:00
The <type>oid</> type itself has few operations beyond comparison.
2005-03-13 10:36:31 +01:00
It can be cast to integer, however, and then manipulated using the
standard integer operators. (Beware of possible
signed-versus-unsigned confusion if you do this.)
2002-04-25 04:56:56 +02:00
</para>
<para>
2003-03-13 02:30:29 +01:00
The OID alias types have no operations of their own except
2002-04-25 04:56:56 +02:00
for specialized input and output routines. These routines are able
to accept and display symbolic names for system objects, rather than
the raw numeric value that type <type>oid</> would use. The alias
2005-01-08 06:19:18 +01:00
types allow simplified lookup of OID values for objects. For example,
to examine the <structname>pg_attribute</> rows related to a table
2007-02-01 01:28:19 +01:00
<literal>mytable</>, one could write:
2005-01-08 06:19:18 +01:00
<programlisting>
SELECT * FROM pg_attribute WHERE attrelid = 'mytable'::regclass;
</programlisting>
2007-02-01 01:28:19 +01:00
rather than:
2005-01-08 06:19:18 +01:00
<programlisting>
SELECT * FROM pg_attribute
WHERE attrelid = (SELECT oid FROM pg_class WHERE relname = 'mytable');
</programlisting>
The second query does not look much worse, but it is oversimplified:
a far more complicated sub-select would be needed to
select the right OID if there are multiple tables named
<literal>mytable</> in different schemas.
The <type>regclass</> input converter handles the table lookup according
to the schema path setting, and so it does the <quote>right thing</>
automatically. Similarly, casting a table's OID to
<type>regclass</> is handy for symbolic display of a numeric OID.
2002-04-25 04:56:56 +02:00
</para>
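<para>
The output direction can be sketched in the same way. Here the
<structname>pg_type</> catalog is used as the example table, so the query
can be run in any database:
<programlisting>
-- attrelid is stored as a numeric OID; casting it to regclass
-- displays the relation name instead.
SELECT attrelid::regclass AS table_name, attname
  FROM pg_attribute
 WHERE attrelid = 'pg_type'::regclass AND attnum > 0;
</programlisting>
</para>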
2002-11-11 21:14:04 +01:00
<table id="datatype-oid-table">
2002-04-25 04:56:56 +02:00
<title>Object Identifier Types</title>
<tgroup cols="4">
<thead>
<row>
2003-11-01 02:56:29 +01:00
<entry>Name</entry>
<entry>References</entry>
<entry>Description</entry>
<entry>Value Example</entry>
2002-04-25 04:56:56 +02:00
</row>
</thead>
<tbody>
<row>
2003-11-01 02:56:29 +01:00
<entry><type>oid</></entry>
<entry>any</entry>
<entry>numeric object identifier</entry>
<entry><literal>564182</></entry>
2002-04-25 04:56:56 +02:00
</row>
<row>
2003-11-01 02:56:29 +01:00
<entry><type>regproc</></entry>
<entry><structname>pg_proc</></entry>
<entry>function name</entry>
<entry><literal>sum</></entry>
2002-04-25 04:56:56 +02:00
</row>
<row>
2003-11-01 02:56:29 +01:00
<entry><type>regprocedure</></entry>
<entry><structname>pg_proc</></entry>
<entry>function with argument types</entry>
<entry><literal>sum(int4)</></entry>
2002-04-25 04:56:56 +02:00
</row>
<row>
2003-11-01 02:56:29 +01:00
<entry><type>regoper</></entry>
<entry><structname>pg_operator</></entry>
<entry>operator name</entry>
<entry><literal>+</></entry>
2002-04-25 04:56:56 +02:00
</row>
<row>
2003-11-01 02:56:29 +01:00
<entry><type>regoperator</></entry>
<entry><structname>pg_operator</></entry>
<entry>operator with argument types</entry>
<entry><literal>*(integer,integer)</> or <literal>-(NONE,integer)</></entry>
2002-04-25 04:56:56 +02:00
</row>
<row>
2003-11-01 02:56:29 +01:00
<entry><type>regclass</></entry>
<entry><structname>pg_class</></entry>
<entry>relation name</entry>
<entry><literal>pg_type</></entry>
2002-04-25 04:56:56 +02:00
</row>
<row>
2003-11-01 02:56:29 +01:00
<entry><type>regtype</></entry>
<entry><structname>pg_type</></entry>
<entry>data type name</entry>
<entry><literal>integer</></entry>
2002-04-25 04:56:56 +02:00
</row>
2007-08-21 03:11:32 +02:00
<row>
<entry><type>regconfig</></entry>
<entry><structname>pg_ts_config</></entry>
<entry>text search configuration</entry>
<entry><literal>english</></entry>
</row>
<row>
<entry><type>regdictionary</></entry>
<entry><structname>pg_ts_dict</></entry>
<entry>text search dictionary</entry>
<entry><literal>simple</></entry>
</row>
2002-04-25 04:56:56 +02:00
</tbody>
</tgroup>
</table>
<para>
2002-04-25 22:14:43 +02:00
All of the OID alias types accept schema-qualified names, and will
2002-04-25 04:56:56 +02:00
display schema-qualified names on output if the object would not
be found in the current search path without being qualified.
The <type>regproc</> and <type>regoper</> alias types will only
accept input names that are unique (not overloaded), so they are
of limited use; for most uses <type>regprocedure</> or
2009-04-27 18:27:36 +02:00
<type>regoperator</> are more appropriate. For <type>regoperator</>,
2002-11-11 21:14:04 +01:00
unary operators are identified by writing <literal>NONE</> for the unused
2002-04-25 04:56:56 +02:00
operand.
</para>
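<para>
As a sketch, the value examples from
<xref linkend="datatype-oid-table"> can be tried directly. Casting the
result to <type>oid</> shows the underlying numeric identifier:
<programlisting>
-- Argument types disambiguate overloaded functions and operators.
SELECT 'sum(int4)'::regprocedure;
SELECT '*(integer,integer)'::regoperator;

-- NONE marks the unused left operand of the unary minus operator.
SELECT '-(NONE,integer)'::regoperator;

-- The same lookup, displayed as a raw OID.
SELECT 'sum(int4)'::regprocedure::oid;
</programlisting>
</para>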
2005-10-03 01:50:16 +02:00
<para>
2009-04-27 18:27:36 +02:00
An additional property of the OID alias types is the creation of
dependencies. If a
2005-10-03 01:50:16 +02:00
constant of one of these types appears in a stored expression
(such as a column default expression or view), it creates a dependency
on the referenced object. For example, if a column has a default
expression <literal>nextval('my_seq'::regclass)</>,
<productname>PostgreSQL</productname>
understands that the default expression depends on the sequence
<literal>my_seq</>; the system will not let the sequence be dropped
without first removing the default expression.
</para>
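<para>
The dependency can be observed with a short sketch. The table
<literal>my_table</literal> is hypothetical; <literal>my_seq</literal> is
the sequence named above:
<programlisting>
CREATE SEQUENCE my_seq;
CREATE TABLE my_table (
    id integer DEFAULT nextval('my_seq'::regclass)
);

-- Rejected: the column default depends on the sequence.
DROP SEQUENCE my_seq;

-- After the default is removed, the sequence can be dropped.
ALTER TABLE my_table ALTER COLUMN id DROP DEFAULT;
DROP SEQUENCE my_seq;
</programlisting>
</para>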
2002-04-25 22:14:43 +02:00
<para>
Another identifier type used by the system is <type>xid</>, or transaction
2002-09-21 20:32:54 +02:00
(abbreviated <abbrev>xact</>) identifier. This is the data type of the system columns
2003-03-13 02:30:29 +01:00
<structfield>xmin</> and <structfield>xmax</>. Transaction identifiers are 32-bit quantities.
2002-04-25 22:14:43 +02:00
</para>
<para>
2002-11-15 04:11:18 +01:00
A third identifier type used by the system is <type>cid</>, or
command identifier. This is the data type of the system columns
2003-03-13 02:30:29 +01:00
<structfield>cmin</> and <structfield>cmax</>. Command identifiers are also 32-bit quantities.
2002-04-25 22:14:43 +02:00
</para>
<para>
A final identifier type used by the system is <type>tid</>, or tuple
2003-11-01 02:56:29 +01:00
identifier (row identifier). This is the data type of the system column
2002-04-25 22:14:43 +02:00
<structfield>ctid</>. A tuple ID is a pair
(block number, tuple index within block) that identifies the
2003-11-01 02:56:29 +01:00
physical location of the row within its table.
2002-04-25 22:14:43 +02:00
</para>
2003-03-13 02:30:29 +01:00
<para>
(The system columns are further explained in <xref
linkend="ddl-system-columns">.)
</para>
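<para>
These system columns are not shown by <literal>SELECT *</literal> but can
be requested by name. A sketch, assuming a user table called
<literal>mytable</> exists:
<programlisting>
-- ctid has type tid, xmin and xmax have type xid,
-- cmin and cmax have type cid.
SELECT ctid, xmin, xmax, cmin, cmax, * FROM mytable;
</programlisting>
</para>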
2002-04-25 04:56:56 +02:00
</sect1>
2002-08-22 02:01:51 +02:00
<sect1 id="datatype-pseudo">
<title>Pseudo-Types</title>
<indexterm zone="datatype-pseudo">
<primary>record</primary>
</indexterm>
<indexterm zone="datatype-pseudo">
<primary>any</primary>
</indexterm>
2007-06-07 01:00:50 +02:00
<indexterm zone="datatype-pseudo">
<primary>anyelement</primary>
</indexterm>
2002-08-22 02:01:51 +02:00
<indexterm zone="datatype-pseudo">
<primary>anyarray</primary>
</indexterm>
2003-08-10 00:50:22 +02:00
<indexterm zone="datatype-pseudo">
2007-06-07 01:00:50 +02:00
<primary>anynonarray</primary>
2003-08-10 00:50:22 +02:00
</indexterm>
2007-04-02 05:49:42 +02:00
<indexterm zone="datatype-pseudo">
<primary>anyenum</primary>
</indexterm>
2002-08-22 02:01:51 +02:00
<indexterm zone="datatype-pseudo">
<primary>void</primary>
</indexterm>
<indexterm zone="datatype-pseudo">
<primary>trigger</primary>
</indexterm>
<indexterm zone="datatype-pseudo">
<primary>language_handler</primary>
</indexterm>
<indexterm zone="datatype-pseudo">
<primary>cstring</primary>
</indexterm>
<indexterm zone="datatype-pseudo">
<primary>internal</primary>
</indexterm>
<indexterm zone="datatype-pseudo">
<primary>opaque</primary>
</indexterm>
<para>
The <productname>PostgreSQL</productname> type system contains a
number of special-purpose entries that are collectively called
<firstterm>pseudo-types</>. A pseudo-type cannot be used as a
column data type, but it can be used to declare a function's
argument or result type. Each of the available pseudo-types is
useful in situations where a function's behavior does not
correspond to simply taking or returning a value of a specific
<acronym>SQL</acronym> data type. <xref
linkend="datatype-pseudotypes-table"> lists the existing
pseudo-types.
</para>
<table id="datatype-pseudotypes-table">
<title>Pseudo-Types</title>
<tgroup cols="2">
<thead>
<row>
<entry>Name</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry><type>any</></entry>
<entry>Indicates that a function accepts any input data type.</entry>
</row>
<row>
<entry><type>anyarray</></entry>
<entry>Indicates that a function accepts any array data type
(see <xref linkend="extend-types-polymorphic">).</entry>
</row>
<row>
<entry><type>anyelement</></entry>
<entry>Indicates that a function accepts any data type
(see <xref linkend="extend-types-polymorphic">).</entry>
</row>
<row>
<entry><type>anyenum</></entry>
<entry>Indicates that a function accepts any enum data type
(see <xref linkend="extend-types-polymorphic"> and
<xref linkend="datatype-enum">).</entry>
</row>
<row>
<entry><type>anynonarray</></entry>
<entry>Indicates that a function accepts any non-array data type
(see <xref linkend="extend-types-polymorphic">).</entry>
</row>
<row>
<entry><type>cstring</></entry>
<entry>Indicates that a function accepts or returns a null-terminated C string.</entry>
</row>
<row>
<entry><type>internal</></entry>
<entry>Indicates that a function accepts or returns a server-internal
data type.</entry>
</row>
<row>
<entry><type>language_handler</></entry>
<entry>A procedural language call handler is declared to return <type>language_handler</>.</entry>
</row>
<row>
<entry><type>record</></entry>
<entry>Identifies a function returning an unspecified row type.</entry>
</row>
<row>
<entry><type>trigger</></entry>
<entry>A trigger function is declared to return <type>trigger</>.</entry>
</row>
<row>
<entry><type>void</></entry>
<entry>Indicates that a function returns no value.</entry>
</row>
<row>
<entry><type>opaque</></entry>
<entry>An obsolete type name that formerly served all the above purposes.</entry>
</row>
</tbody>
</tgroup>
</table>
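<para>
One way to inspect the pseudo-types known to a particular server is to
query the <structname>pg_type</structname> system catalog, in which
pseudo-types are marked by a <structfield>typtype</> value of
<literal>p</literal>. This is only a sketch: the result depends on the
server version and might include entries that do not appear in
<xref linkend="datatype-pseudotypes-table">.
<programlisting>
-- list the pseudo-types recorded in the system catalog
SELECT typname FROM pg_type WHERE typtype = 'p' ORDER BY typname;
</programlisting>
</para>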
<para>
Functions coded in C (whether built-in or dynamically loaded) can be
declared to accept or return any of these pseudo data types. It is up to
the function author to ensure that the function will behave safely
when a pseudo-type is used as an argument type.
</para>
<para>
Functions coded in procedural languages can use pseudo-types only as
allowed by their implementation languages. At present the procedural
languages all forbid use of a pseudo-type as an argument type, and allow
only <type>void</> and <type>record</> as a result type (plus
<type>trigger</> when the function is used as a trigger). Some also
support polymorphic functions using the types <type>anyarray</>,
<type>anyelement</>, <type>anyenum</>, and <type>anynonarray</>.
</para>
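<para>
As a minimal sketch, assuming <application>PL/pgSQL</application> is
installed in the current database and is one of the languages that
supports polymorphic functions, such functions might be declared as
follows. The function names <function>first_elem</function> and
<function>log_note</function> are hypothetical and used only for
illustration.
<programlisting>
-- hypothetical polymorphic function: returns the first element of any array
CREATE FUNCTION first_elem(arr anyarray) RETURNS anyelement AS $$
BEGIN
    RETURN arr[1];
END;
$$ LANGUAGE plpgsql;

-- the result type follows the element type of the array argument
SELECT first_elem(ARRAY[10, 20, 30]);
SELECT first_elem(ARRAY['a'::text, 'b'::text]);

-- hypothetical function declared to return void
CREATE FUNCTION log_note(msg text) RETURNS void AS $$
BEGIN
    RAISE NOTICE 'note: %', msg;
END;
$$ LANGUAGE plpgsql;

SELECT log_note('hello');
</programlisting>
</para>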
<para>
The <type>internal</> pseudo-type is used to declare functions
that are meant only to be called internally by the database
system, and not by direct invocation in an <acronym>SQL</acronym>
query. If a function has at least one <type>internal</>-type
argument, then it cannot be called from <acronym>SQL</acronym>. To
preserve the type safety of this restriction, it is important to
follow this coding rule: do not create any function that is
declared to return <type>internal</> unless it has at least one
<type>internal</> argument.
</para>
</sect1>
</chapter>