Data Types
data type
type
data type
PostgreSQL has a rich set of native data
types available to users. Users may add new types to
PostgreSQL using the command.
shows all the built-in general-purpose data
types. Most of the alternative names listed in the
Aliases
column are the names used internally by
PostgreSQL for historical reasons. In
addition, some internally used or deprecated types are available,
but they are not listed here.
Data Types
Name
Aliases
Description
bigint
int8
signed eight-byte integer
bigserial
serial8
autoincrementing eight-byte integer
bit [ (n) ]
fixed-length bit string
bit varying [ (n) ]
varbit
variable-length bit string
boolean
bool
logical Boolean (true/false)
box
rectangular box in the plane
bytea
binary data (byte array>)
character varying [ (n) ]
varchar [ (n) ]
variable-length character string
character [ (n) ]
char [ (n) ]
fixed-length character string
cidr
IPv4 or IPv6 network address
circle
circle in the plane
date
calendar date (year, month, day)
double precision
float8
double precision floating-point number
inet
IPv4 or IPv6 host address
integer
int, int4
signed four-byte integer
interval [ (p) ]
time span
line
infinite line in the plane
lseg
line segment in the plane
macaddr
MAC address
money
currency amount
numeric [ (p,
s) ]
decimal [ (p,
s) ]
exact numeric of selectable precision
path
geometric path in the plane
point
geometric point in the plane
polygon
closed geometric path in the plane
real
float4
single precision floating-point number
smallint
int2
signed two-byte integer
serial
serial4
autoincrementing four-byte integer
text
variable-length character string
time [ (p) ] [ without time zone ]
time of day
time [ (p) ] with time zone
timetz
time of day, including time zone
timestamp [ (p) ] [ without time zone ]
date and time
timestamp [ (p) ] with time zone
timestamptz
date and time, including time zone
xml
XML data
Compatibility
The following types (or spellings thereof) are specified by
SQL: bit, bit
varying, boolean, char,
character varying, character,
varchar, date, double
precision, integer, interval,
numeric, decimal, real,
smallint, time (with or without time zone),
timestamp (with or without time zone),
xml.
Each data type has an external representation determined by its input
and output functions. Many of the built-in types have
obvious external formats. However, several types are either unique
to PostgreSQL, such as geometric
paths, or have several possibilities for formats, such as the date
and time types.
Some of the input and output functions are not invertible. That is,
the result of an output function may lose accuracy when compared to
the original input.
Numeric Types
data type
numeric
Numeric types consist of two-, four-, and eight-byte integers,
four- and eight-byte floating-point numbers, and selectable-precision
decimals. lists the
available types.
Numeric Types
Name
Storage Size
Description
Range
smallint>
2 bytes
small-range integer
-32768 to +32767
integer>
4 bytes
usual choice for integer
-2147483648 to +2147483647
bigint>
8 bytes
large-range integer
-9223372036854775808 to 9223372036854775807
decimal>
variable
user-specified precision, exact
no limit
numeric>
variable
user-specified precision, exact
no limit
real>
4 bytes
variable-precision, inexact
6 decimal digits precision
double precision>
8 bytes
variable-precision, inexact
15 decimal digits precision
serial>
4 bytes
autoincrementing integer
1 to 2147483647
bigserial
8 bytes
large autoincrementing integer
1 to 9223372036854775807
The syntax of constants for the numeric types is described in
. The numeric types have a
full set of corresponding arithmetic operators and
functions. Refer to for more
information. The following sections describe the types in detail.
Integer Types
integer
smallint
bigint
int4
integer
int2
smallint
int8
bigint
The types smallint, integer, and
bigint store whole numbers, that is, numbers without
fractional components, of various ranges. Attempts to store
values outside of the allowed range will result in an error.
The type integer is the usual choice, as it offers
the best balance between range, storage size, and performance.
The smallint type is generally only used if disk
space is at a premium. The bigint type should only
be used if the integer range is not sufficient,
because the latter is definitely faster.
The bigint type may not function correctly on all
platforms, since it relies on compiler support for eight-byte
integers. On a machine without such support, bigint
acts the same as integer (but still takes up eight
bytes of storage). However, we are not aware of any reasonable
platform where this is actually the case.
SQL only specifies the integer types
integer (or int) and
smallint. The type bigint, and the
type names int2, int4, and
int8 are extensions, which are shared with various
other SQL database systems.
Arbitrary Precision Numbers
numeric (data type)
arbitrary precision numbers
decimal
numeric
The type numeric can store numbers with up to 1000
digits of precision and perform calculations exactly. It is
especially recommended for storing monetary amounts and other
quantities where exactness is required. However, arithmetic on
numeric values is very slow compared to the integer
types, or to the floating-point types described in the next section.
In what follows we use these terms: The
scale of a numeric is the
count of decimal digits in the fractional part, to the right of
the decimal point. The precision of a
numeric is the total count of significant digits in
the whole number, that is, the number of digits to both sides of
the decimal point. So the number 23.5141 has a precision of 6
and a scale of 4. Integers can be considered to have a scale of
zero.
Both the maximum precision and the maximum scale of a
numeric column can be
configured. To declare a column of type numeric use
the syntax
NUMERIC(precision, scale)
The precision must be positive, the scale zero or positive.
Alternatively,
NUMERIC(precision)
selects a scale of 0. Specifying
NUMERIC
without any precision or scale creates a column in which numeric
values of any precision and scale can be stored, up to the
implementation limit on precision. A column of this kind will
not coerce input values to any particular scale, whereas
numeric columns with a declared scale will coerce
input values to that scale. (The SQL standard
requires a default scale of 0, i.e., coercion to integer
precision. We find this a bit useless. If you're concerned
about portability, always specify the precision and scale
explicitly.)
If the scale of a value to be stored is greater than the declared
scale of the column, the system will round the value to the specified
number of fractional digits. Then, if the number of digits to the
left of the decimal point exceeds the declared precision minus the
declared scale, an error is raised.
Numeric values are physically stored without any extra leading or
trailing zeroes. Thus, the declared precision and scale of a column
are maximums, not fixed allocations. (In this sense the numeric>
type is more akin to varchar(n>)
than to char(n>).) The actual storage
requirement is two bytes for each group of four decimal digits,
plus eight bytes overhead.
NaN
not a number
not a number
numeric (data type)
In addition to ordinary numeric values, the numeric
type allows the special value NaN>, meaning
not-a-number
. Any operation on NaN>
yields another NaN>. When writing this value
as a constant in a SQL command, you must put quotes around it,
for example UPDATE table SET x = 'NaN'>. On input,
the string NaN> is recognized in a case-insensitive manner.
In most implementations of the not-a-number> concept,
NaN> is not considered equal to any other numeric
value (including NaN>). In order to allow
numeric> values to be sorted and used in tree-based
indexes, PostgreSQL> treats NaN>
values as equal, and greater than all non-NaN>
values.
The types decimal and numeric are
equivalent. Both types are part of the SQL
standard.
Floating-Point Types
real
double precision
float4
real
float8
double precision
floating point
The data types real and double
precision are inexact, variable-precision numeric types.
In practice, these types are usually implementations of
IEEE Standard 754 for Binary Floating-Point
Arithmetic (single and double precision, respectively), to the
extent that the underlying processor, operating system, and
compiler support it.
Inexact means that some values cannot be converted exactly to the
internal format and are stored as approximations, so that storing
and printing back out a value may show slight discrepancies.
Managing these errors and how they propagate through calculations
is the subject of an entire branch of mathematics and computer
science and will not be discussed further here, except for the
following points:
If you require exact storage and calculations (such as for
monetary amounts), use the numeric type instead.
If you want to do complicated calculations with these types
for anything important, especially if you rely on certain
behavior in boundary cases (infinity, underflow), you should
evaluate the implementation carefully.
Comparing two floating-point values for equality may or may
not work as expected.
On most platforms, the real type has a range of at least
1E-37 to 1E+37 with a precision of at least 6 decimal digits. The
double precision type typically has a range of around
1E-307 to 1E+308 with a precision of at least 15 digits. Values that
are too large or too small will cause an error. Rounding may
take place if the precision of an input number is too high.
Numbers too close to zero that are not representable as distinct
from zero will cause an underflow error.
not a number
double precision
In addition to ordinary numeric values, the floating-point types
have several special values:
Infinity
-Infinity
NaN
These represent the IEEE 754 special values
infinity
, negative infinity
, and
not-a-number
, respectively. (On a machine whose
floating-point arithmetic does not follow IEEE 754, these values
will probably not work as expected.) When writing these values
as constants in a SQL command, you must put quotes around them,
for example UPDATE table SET x = 'Infinity'>. On input,
these strings are recognized in a case-insensitive manner.
IEEE754 specifies that NaN> should not compare equal
to any other floating-point value (including NaN>).
In order to allow floating-point values to be sorted and used
in tree-based indexes, PostgreSQL> treats
NaN> values as equal, and greater than all
non-NaN> values.
PostgreSQL also supports the SQL-standard
notations float and
float(p) for specifying
inexact numeric types. Here, p specifies
the minimum acceptable precision in binary digits.
PostgreSQL accepts
float(1) to float(24) as selecting the
real type, while
float(25) to float(53) select
double precision. Values of p
outside the allowed range draw an error.
float with no precision specified is taken to mean
double precision.
Prior to PostgreSQL 7.4, the precision in
float(p) was taken to mean
so many decimal digits. This has been corrected to match the SQL
standard, which specifies that the precision is measured in binary
digits. The assumption that real and
double precision have exactly 24 and 53 bits in the
mantissa respectively is correct for IEEE-standard floating point
implementations. On non-IEEE platforms it may be off a little, but
for simplicity the same ranges of p are used
on all platforms.
Serial Types
serial
bigserial
serial4
serial8
auto-increment
serial
sequence
and serial type
The data types serial and bigserial
are not true types, but merely
a notational convenience for setting up unique identifier columns
(similar to the AUTO_INCREMENT property
supported by some other databases). In the current
implementation, specifying
CREATE TABLE tablename (
colname SERIAL
);
is equivalent to specifying:
CREATE SEQUENCE tablename_colname_seq;
CREATE TABLE tablename (
colname integer NOT NULL DEFAULT nextval('tablename_colname_seq')
);
ALTER SEQUENCE tablename_colname_seq OWNED BY tablename.colname;
Thus, we have created an integer column and arranged for its default
values to be assigned from a sequence generator. A NOT NULL>
constraint is applied to ensure that a null value cannot be explicitly
inserted, either. (In most cases you would also want to attach a
UNIQUE> or PRIMARY KEY> constraint to prevent
duplicate values from being inserted by accident, but this is
not automatic.) Lastly, the sequence is marked as owned by>
the column, so that it will be dropped if the column or table is dropped.
Prior to PostgreSQL 7.3, serial
implied UNIQUE. This is no longer automatic. If
you wish a serial column to be in a unique constraint or a
primary key, it must now be specified, same as with
any other data type.
To insert the next value of the sequence into the serial
column, specify that the serial
column should be assigned its default value. This can be done
either by excluding the column from the list of columns in
the INSERT statement, or through the use of
the DEFAULT key word.
The type names serial and serial4 are
equivalent: both create integer columns. The type
names bigserial and serial8 work just
the same way, except that they create a bigint
column. bigserial should be used if you anticipate
the use of more than 231> identifiers over the
lifetime of the table.
The sequence created for a serial column is
automatically dropped when the owning column is dropped.
You can drop the sequence without dropping the column, but this
will force removal of the column default expression.
Monetary Types
The money type is deprecated. Use
numeric or decimal instead, in
combination with the to_char function.
The money type stores a currency amount with a fixed
fractional precision; see .
Input is accepted in a variety of formats, including integer and
floating-point literals, as well as typical
currency formatting, such as '$1,000.00'.
Output is generally in the latter form but depends on the locale.
Monetary Types
Name
Storage Size
Description
Range
money
4 bytes
currency amount
-21474836.48 to +21474836.47
Character Types
character string
data types
string
character string
character
character varying
text
char
varchar
Character Types
Name
Description
character varying(n>), varchar(n>)
variable-length with limit
character(n>), char(n>)
fixed-length, blank padded
text
variable unlimited length
shows the
general-purpose character types available in
PostgreSQL.
SQL defines two primary character types:
character varying(n>) and
character(n>), where n>
is a positive integer. Both of these types can store strings up to
n> characters in length. An attempt to store a
longer string into a column of these types will result in an
error, unless the excess characters are all spaces, in which case
the string will be truncated to the maximum length. (This somewhat
bizarre exception is required by the SQL
standard.) If the string to be stored is shorter than the declared
length, values of type character will be space-padded;
values of type character varying will simply store the
shorter
string.
If one explicitly casts a value to character
varying(n>) or
character(n>), then an over-length
value will be truncated to n> characters without
raising an error. (This too is required by the
SQL standard.)
The notations varchar(n>) and
char(n>) are aliases for character
varying(n>) and
character(n>), respectively.
character without length specifier is equivalent to
character(1). If character varying is used
without length specifier, the type accepts strings of any size. The
latter is a PostgreSQL> extension.
In addition, PostgreSQL provides the
text type, which stores strings of any length.
Although the type text is not in the
SQL standard, several other SQL database
management systems have it as well.
Values of type character are physically padded
with spaces to the specified width n>, and are
stored and displayed that way. However, the padding spaces are
treated as semantically insignificant. Trailing spaces are
disregarded when comparing two values of type character,
and they will be removed when converting a character value
to one of the other string types. Note that trailing spaces
are> semantically significant in
character varying and text values.
The storage requirement for data of these types is 4 bytes plus the
actual string, and in case of character plus the
padding. Long strings are compressed by the system automatically, so
the physical requirement on disk may be less. Long values are also
stored in background tables so they do not interfere with rapid
access to the shorter column values. In any case, the longest
possible character string that can be stored is about 1 GB. (The
maximum value that will be allowed for n> in the data
type declaration is less than that. It wouldn't be very useful to
change this because with multibyte character encodings the number of
characters and bytes can be quite different anyway. If you desire to
store long strings with no specific upper limit, use
text or character varying without a length
specifier, rather than making up an arbitrary length limit.)
There are no performance differences between these three types,
apart from the increased storage size when using the blank-padded
type. While character(n>) has performance
advantages in some other database systems, it has no such advantages in
PostgreSQL. In most situations
text or character varying should be used
instead.
Refer to for information about
the syntax of string literals, and to
for information about available operators and functions. The
database character set determines the character set used to store
textual values; for more information on character set support,
refer to .
Using the character types
CREATE TABLE test1 (a character(4));
INSERT INTO test1 VALUES ('ok');
SELECT a, char_length(a) FROM test1; --
a | char_length
------+-------------
ok | 2
CREATE TABLE test2 (b varchar(5));
INSERT INTO test2 VALUES ('ok');
INSERT INTO test2 VALUES ('good ');
INSERT INTO test2 VALUES ('too long');
ERROR: value too long for type character varying(5)
INSERT INTO test2 VALUES ('too long'::varchar(5)); -- explicit truncation
SELECT b, char_length(b) FROM test2;
b | char_length
-------+-------------
ok | 2
good | 5
too l | 5
The char_length function is discussed in
.
There are two other fixed-length character types in
PostgreSQL, shown in . The name
type exists only for storage of identifiers
in the internal system catalogs and is not intended for use by the general user. Its
length is currently defined as 64 bytes (63 usable characters plus
terminator) but should be referenced using the constant
NAMEDATALEN. The length is set at compile time (and
is therefore adjustable for special uses); the default maximum
length may change in a future release. The type "char"
(note the quotes) is different from char(1) in that it
only uses one byte of storage. It is internally used in the system
catalogs as a poor-man's enumeration type.
Special Character Types
Name
Storage Size
Description
"char"
1 byte
single-character internal type
name
64 bytes
internal type for object names
Binary Data Types
binary data
bytea
The bytea data type allows storage of binary strings;
see .
Binary Data Types
Name
Storage Size
Description
bytea
4 bytes plus the actual binary string
variable-length binary string
A binary string is a sequence of octets (or bytes). Binary
strings are distinguished from character strings by two
characteristics: First, binary strings specifically allow storing
octets of value zero and other non-printable
octets (usually, octets outside the range 32 to 126).
Character strings disallow zero octets, and also disallow any
other octet values and sequences of octet values that are invalid
according to the database's selected character set encoding.
Second, operations on binary strings process the actual bytes,
whereas the processing of character strings depends on locale settings.
In short, binary strings are appropriate for storing data that the
programmer thinks of as raw bytes>, whereas character
strings are appropriate for storing text.
When entering bytea values, octets of certain
values must be escaped (but all octet
values can be escaped) when used as part
of a string literal in an SQL statement. In
general, to escape an octet, it is converted into the three-digit
octal number equivalent of its decimal octet value, and preceded
by two backslashes (or one backslash if
standard_conforming_strings> is off>).
shows the characters
that must be escaped, and gives the alternate escape sequences
where applicable.
bytea> Literal Escaped Octets
Decimal Octet Value
Description
Escaped Input Representation
Example
Output Representation
0
zero octet
'\\000'
SELECT '\\000'::bytea;
\000
39
single quote
'\'' or '\\047'
SELECT '\''::bytea;
'
92
backslash
'\\\\' or '\\134'
SELECT '\\\\'::bytea;
\\
0 to 31 and 127 to 255
non-printable
octets
'\\xxx'> (octal value)
SELECT '\\001'::bytea;
\001
The requirement to escape non-printable
octets actually
varies depending on locale settings. In some instances you can get away
with leaving them unescaped. Note that the result in each of the examples
in was exactly one octet in
length, even though the output representation of the zero octet and
backslash are more than one character.
The reason that you have to write so many backslashes, as shown
in , is that an input
string written as a string literal must pass through two parse
phases in the PostgreSQL server.
The first backslash of each pair is interpreted as an escape
character by the string-literal parser (assuming
standard_conforming_strings> is off>)
and is therefore consumed, leaving the second backslash of the
pair. The remaining backslash is then recognized by the
bytea input function as starting either a three
digit octal value or escaping another backslash. For example,
a string literal passed to the server as '\\001'
becomes \001 after passing through the
string-literal parser. The \001 is then sent
to the bytea input function, where it is converted
to a single octet with a decimal value of 1. Note that the
apostrophe character is not treated specially by bytea,
so it follows the normal rules for string literals. (See also
.)
Bytea octets are also escaped in the output. In general, each
non-printable
octet is converted into
its equivalent three-digit octal value and preceded by one backslash.
Most printable
octets are represented by their standard
representation in the client character set. The octet with decimal
value 92 (backslash) has a special alternative output representation.
Details are in .
bytea> Output Escaped Octets
Decimal Octet Value
Description
Escaped Output Representation
Example
Output Result
92
backslash
\\
SELECT '\\134'::bytea;
\\
0 to 31 and 127 to 255
non-printable
octets
\xxx> (octal value)
SELECT '\\001'::bytea;
\001
32 to 126
printable
octets
client character set representation
SELECT '\\176'::bytea;
~
Depending on the front end to PostgreSQL> you use,
you may have additional work to do in terms of escaping and
unescaping bytea strings. For example, you may also
have to escape line feeds and carriage returns if your interface
automatically translates these.
The SQL standard defines a different binary
string type, called BLOB or BINARY LARGE
OBJECT. The input format is different from
bytea, but the provided functions and operators are
mostly the same.
Date/Time Types
date
time
time without time zone
time with time zone
timestamp
timestamp with time zone
timestamp without time zone
interval
time span
PostgreSQL supports the full set of
SQL date and time types, shown in . The operations available
on these data types are described in
.
Date/Time Types
Name
Storage Size
Description
Low Value
High Value
Resolution
timestamp [ (p) ] [ without time zone ]
8 bytes
both date and time
4713 BC
5874897 AD
1 microsecond / 14 digits
timestamp [ (p) ] with time zone
8 bytes
both date and time, with time zone
4713 BC
5874897 AD
1 microsecond / 14 digits
interval [ (p) ]
12 bytes
time intervals
-178000000 years
178000000 years
1 microsecond / 14 digits
date
4 bytes
dates only
4713 BC
5874897 AD
1 day
time [ (p) ] [ without time zone ]
8 bytes
times of day only
00:00:00
24:00:00
1 microsecond / 14 digits
time [ (p) ] with time zone
12 bytes
times of day only, with time zone
00:00:00+1459
24:00:00-1459
1 microsecond / 14 digits
Prior to PostgreSQL 7.3, writing just
timestamp was equivalent to timestamp with
time zone. This was changed for SQL compliance.
time, timestamp, and
interval accept an optional precision value
p which specifies the number of
fractional digits retained in the seconds field. By default, there
is no explicit bound on precision. The allowed range of
p is from 0 to 6 for the
timestamp and interval types.
When timestamp> values are stored as double precision floating-point
numbers (currently the default), the effective limit of precision
may be less than 6. timestamp values are stored as seconds
before or after midnight 2000-01-01. Microsecond precision is achieved for
dates within a few years of 2000-01-01, but the precision degrades for
dates further away. When timestamp values are stored as
eight-byte integers (a compile-time
option), microsecond precision is available over the full range of
values. However eight-byte integer timestamps have a more limited range of
dates than shown above: from 4713 BC up to 294276 AD. The same
compile-time option also determines whether time and
interval values are stored as floating-point or eight-byte
integers. In the floating-point case, large interval values
degrade in precision as the size of the interval increases.
For the time types, the allowed range of
p is from 0 to 6 when eight-byte integer
storage is used, or from 0 to 10 when floating-point storage is used.
The type time with time zone is defined by the SQL
standard, but the definition exhibits properties which lead to
questionable usefulness. In most cases, a combination of
date, time, timestamp without time
zone, and timestamp with time zone should
provide a complete range of date/time functionality required by
any application.
The types abstime
and reltime are lower precision types which are used internally.
You are discouraged from using these types in new
applications and are encouraged to move any old
ones over when appropriate. Any or all of these internal types
might disappear in a future release.
Date/Time Input
Date and time input is accepted in almost any reasonable format, including
ISO 8601, SQL-compatible,
traditional POSTGRES, and others.
For some formats, ordering of month, day, and year in date input is
ambiguous and there is support for specifying the expected
ordering of these fields. Set the parameter
to MDY> to select month-day-year interpretation,
DMY> to select day-month-year interpretation, or
YMD> to select year-month-day interpretation.
PostgreSQL is more flexible in
handling date/time input than the
SQL standard requires.
See
for the exact parsing rules of date/time input and for the
recognized text fields including months, days of the week, and
time zones.
Remember that any date or time literal input needs to be enclosed
in single quotes, like text strings. Refer to
for more
information.
SQL requires the following syntax
type [ (p) ] 'value'
where p in the optional precision
specification is an integer corresponding to the number of
fractional digits in the seconds field. Precision can be
specified for time, timestamp, and
interval types. The allowed values are mentioned
above. If no precision is specified in a constant specification,
it defaults to the precision of the literal value.
Dates
date
shows some possible
inputs for the date type.
Date Input
Example
Description
January 8, 1999
unambiguous in any datestyle input mode
1999-01-08
ISO 8601; January 8 in any mode
(recommended format)
1/8/1999
January 8 in MDY> mode;
August 1 in DMY> mode
1/18/1999
January 18 in MDY> mode;
rejected in other modes
01/02/03
January 2, 2003 in MDY> mode;
February 1, 2003 in DMY> mode;
February 3, 2001 in YMD> mode
1999-Jan-08
January 8 in any mode
Jan-08-1999
January 8 in any mode
08-Jan-1999
January 8 in any mode
99-Jan-08
January 8 in YMD> mode, else error
08-Jan-99
January 8, except error in YMD> mode
Jan-08-99
January 8, except error in YMD> mode
19990108
ISO 8601; January 8, 1999 in any mode
990108
ISO 8601; January 8, 1999 in any mode
1999.008
year and day of year
J2451187
Julian day
January 8, 99 BC
year 99 before the Common Era
Times
time
time without time zone
time with time zone
The time-of-day types are time [
(p) ] without time zone and
time [ (p) ] with time
zone. Writing just time is equivalent to
time without time zone.
Valid input for these types consists of a time of day followed
by an optional time zone. (See
and .) If a time zone is
specified in the input for time without time zone,
it is silently ignored. You can also specify a date but it will
be ignored, except when you use a time zone name that involves a
daylight-savings rule, such as
America/New_York. In this case specifying the date
is required in order to determine whether standard or daylight-savings
time applies. The appropriate time zone offset is recorded in the
time with time zone value.
Time Input
Example
Description
04:05:06.789
ISO 8601
04:05:06
ISO 8601
04:05
ISO 8601
040506
ISO 8601
04:05 AM
same as 04:05; AM does not affect value
04:05 PM
same as 16:05; input hour must be <= 12
04:05:06.789-8
ISO 8601
04:05:06-08:00
ISO 8601
04:05-08:00
ISO 8601
040506-08
ISO 8601
04:05:06 PST
time zone specified by abbreviation
2003-04-12 04:05:06 America/New_York
time zone specified by full name
Time Zone Input
Example
Description
PST
Abbreviation (for Pacific Standard Time)
America/New_York
Full time zone name
PST8PDT
POSIX-style time zone specification
-8:00
ISO-8601 offset for PST
-800
ISO-8601 offset for PST
-8
ISO-8601 offset for PST
zulu
Military abbreviation for UTC
z
Short form of zulu
Refer to for more information on how
to specify time zones.
Time Stamps
timestamp
timestamp with time zone
timestamp without time zone
Valid input for the time stamp types consists of a concatenation
of a date and a time, followed by an optional time zone,
followed by an optional AD or BC.
(Alternatively, AD/BC can appear
before the time zone, but this is not the preferred ordering.)
Thus
1999-01-08 04:05:06
and
1999-01-08 04:05:06 -8:00
are valid values, which follow the ISO 8601
standard. In addition, the wide-spread format
January 8 04:05:06 1999 PST
is supported.
The SQL standard differentiates timestamp without time zone
and timestamp with time zone literals by the presence of a
+
or -
. Hence, according to the standard,
TIMESTAMP '2004-10-19 10:23:54'
is a timestamp without time zone, while
TIMESTAMP '2004-10-19 10:23:54+02'
is a timestamp with time zone.
PostgreSQL never examines the content of a
literal string before determining its type, and therefore will treat
both of the above as timestamp without time zone. To
ensure that a literal is treated as timestamp with time
zone, give it the correct explicit type:
TIMESTAMP WITH TIME ZONE '2004-10-19 10:23:54+02'
In a literal that has been decided to be timestamp without time
zone, PostgreSQL will silently ignore
any time zone indication.
That is, the resulting value is derived from the date/time
fields in the input value, and is not adjusted for time zone.
For timestamp with time zone, the internally stored
value is always in UTC (Universal
Coordinated Time, traditionally known as Greenwich Mean Time,
GMT>). An input value that has an explicit
time zone specified is converted to UTC using the appropriate offset
for that time zone. If no time zone is stated in the input string,
then it is assumed to be in the time zone indicated by the system's
parameter, and is converted to UTC using the
offset for the timezone> zone.
When a timestamp with time
zone value is output, it is always converted from UTC to the
current timezone> zone, and displayed as local time in that
zone. To see the time in another time zone, either change
timezone> or use the AT TIME ZONE> construct
(see ).
Conversions between timestamp without time zone and
timestamp with time zone normally assume that the
timestamp without time zone value should be taken or given
as timezone> local time. A different zone reference can
be specified for the conversion using AT TIME ZONE>.
Intervals
interval
interval values can be written with the following syntax:
@> quantity> unit> quantity> unit>...> direction>
Where: quantity> is a number (possibly signed);
unit> is second,
minute, hour, day,
week, month, year,
decade, century, millennium,
or abbreviations or plurals of these units;
direction> can be ago or
empty. The at sign (@>) is optional noise. The amounts
of different units are implicitly added up with appropriate
sign accounting.
Quantities of days, hours, minutes, and seconds can be specified without
explicit unit markings. For example, '1 12:59:10'> is read
the same as '1 day 12 hours 59 min 10 sec'>.
The optional subsecond precision p should
be between 0 and 6, and defaults to the precision of the input literal.
Internally interval> values are stored as months, days,
and seconds. This is done because the number of days in a month
varies, and a day can have 23 or 25 hours if a daylight savings
time adjustment is involved. Because intervals are usually created
from constant strings or timestamp> subtraction, this
storage method works well in most cases. Functions
justify_days> and justify_hours> are
available for adjusting days and hours that overflow their normal
periods.
Special Values
time
constants
date
constants
PostgreSQL supports several
special date/time input values for convenience, as shown in . The values
infinity and -infinity
are specially represented inside the system and will be displayed
the same way; but the others are simply notational shorthands
that will be converted to ordinary date/time values when read.
(In particular, now> and related strings are converted
to a specific time value as soon as they are read.)
All of these values need to be written in single quotes when used
as constants in SQL commands.
Special Date/Time Inputs
Input String
Valid Types
Description
epoch
date, timestamp
1970-01-01 00:00:00+00 (Unix system time zero)
infinity
timestamp
later than all other time stamps
-infinity
timestamp
earlier than all other time stamps
now
date, time, timestamp
current transaction's start time
today
date, timestamp
midnight today
tomorrow
date, timestamp
midnight tomorrow
yesterday
date, timestamp
midnight yesterday
allballs
time
00:00:00.00 UTC
The following SQL-compatible functions can also
be used to obtain the current time value for the corresponding data
type:
CURRENT_DATE, CURRENT_TIME,
CURRENT_TIMESTAMP, LOCALTIME,
LOCALTIMESTAMP. The latter four accept an
optional subsecond precision specification. (See .) Note however that these are
SQL functions and are not> recognized as data input strings.
Date/Time Output
date
output format
formatting
time
output format
formatting
The output format of the date/time types can be set to one of the four
styles ISO 8601,
SQL (Ingres), traditional POSTGRES, and
German, using the command SET datestyle. The default
is the ISO format. (The
SQL standard requires the use of the ISO 8601
format. The name of the SQL
output format is a
historical accident.) shows examples of each
output style. The output of the date and
time types is of course only the date or time part
in accordance with the given examples.
Date/Time Output Styles
Style Specification
Description
Example
ISO
ISO 8601/SQL standard
1997-12-17 07:37:16-08
SQL
traditional style
12/17/1997 07:37:16.00 PST
POSTGRES
original style
Wed Dec 17 07:37:16 1997 PST
German
regional style
17.12.1997 07:37:16.00 PST
In the SQL and POSTGRES styles, day appears before
month if DMY field ordering has been specified, otherwise month appears
before day.
(See
for how this setting also affects interpretation of input values.)
shows an
example.
Date Order Conventions
datestyle Setting
Input Ordering
Example Output
SQL, DMY>
day/month/year
17/12/1997 15:37:16.00 CET
SQL, MDY>
month/day/year
12/17/1997 07:37:16.00 PST
Postgres, DMY>
day/month/year
Wed 17 Dec 07:37:16 1997 PST
interval output looks like the input format, except
that units like century or
week are converted to years and days and
ago is converted to an appropriate sign. In
ISO mode the output looks like
quantity> unit> ... > > days> > hours>:minutes>:seconds>
The date/time styles can be selected by the user using the
SET datestyle command, the parameter in the
postgresql.conf configuration file, or the
PGDATESTYLE environment variable on the server or
client. The formatting function to_char
(see ) is also available as
a more flexible way to format the date/time output.
Time Zones
time zone
Time zones, and time-zone conventions, are influenced by
political decisions, not just earth geometry. Time zones around the
world became somewhat standardized during the 1900's,
but continue to be prone to arbitrary changes, particularly with
respect to daylight-savings rules.
PostgreSQL currently supports daylight-savings
rules over the time period 1902 through 2038 (corresponding to the full
range of conventional Unix system time). Times outside that range are
taken to be in standard time> for the selected time zone, no
matter what part of the year they fall in.
PostgreSQL endeavors to be compatible with
the SQL standard definitions for typical usage.
However, the SQL standard has an odd mix of date and
time types and capabilities. Two obvious problems are:
Although the date type
does not have an associated time zone, the
time type can.
Time zones in the real world have little meaning unless
associated with a date as well as a time,
since the offset may vary through the year with daylight-saving
time boundaries.
The default time zone is specified as a constant numeric offset
from UTC>. It is therefore not possible to adapt to
daylight-saving time when doing date/time arithmetic across
DST boundaries.
To address these difficulties, we recommend using date/time types
that contain both date and time when using time zones. We
recommend not using the type time with
time zone (though it is supported by
PostgreSQL for legacy applications and
for compliance with the SQL standard).
PostgreSQL assumes
your local time zone for any type containing only date or time.
All timezone-aware dates and times are stored internally in
UTC. They are converted to local time
in the zone specified by the configuration
parameter before being displayed to the client.
PostgreSQL allows you to specify time zones in
three different forms:
A full time zone name, for example America/New_York>.
The recognized time zone names are listed in the
pg_timezone_names view (see ).
PostgreSQL uses the widely-used
zic> time zone data for this purpose, so the same
names are also recognized by much other software.
A time zone abbreviation, for example PST>. Such a
specification merely defines a particular offset from UTC, in
contrast to full time zone names which may imply a set of daylight
savings transition-date rules as well. The recognized abbreviations
are listed in the pg_timezone_abbrevs> view (see ). You cannot set the
configuration parameter using a time
zone abbreviation, but you can use abbreviations in
date/time input values and with the AT TIME ZONE>
operator.
In addition to the timezone names and abbreviations,
PostgreSQL will accept POSIX-style time zone
specifications of the form STD>offset> or
STD>offset>DST>, where
STD> is a zone abbreviation, offset> is a
numeric offset in hours west from UTC, and DST> is an
optional daylight-savings zone abbreviation, assumed to stand for one
hour ahead of the given offset. For example, if EST5EDT>
were not already a recognized zone name, it would be accepted and would
be functionally equivalent to USA East Coast time. When a
daylight-savings zone name is present, it is assumed to be used
according to USA time zone rules, so this feature is of limited use
outside North America. One should also be wary that this provision can
lead to silently accepting bogus input, since there is no check on the
reasonableness of the zone abbreviations. For example, SET
TIMEZONE TO FOOBAR0> will work, leaving the system effectively using
a rather peculiar abbreviation for GMT.
There is a conceptual and practical difference between the abbreviations
and the full names: abbreviations always represent a fixed offset from
UTC, whereas most of the full names imply a local daylight-savings time
rule and so have two possible UTC offsets.
In all cases, timezone names are recognized case-insensitively.
(This is a change from PostgreSQL versions
prior to 8.2, which were case-sensitive in some contexts and not others.)
Neither full names nor abbreviations are hard-wired into the server;
they are obtained from configuration files stored under
.../share/timezone/> and .../share/timezonesets/>
of the installation directory
(see ).
The configuration parameter can
be set in the file postgresql.conf>, or in any of the
other standard ways described in .
There are also several special ways to set it:
If timezone> is not specified in
postgresql.conf> nor as a server command-line option,
the server attempts to use the value of the TZ
environment variable as the default time zone. If TZ
is not defined or is not any of the time zone names known to
PostgreSQL, the server attempts to
determine the operating system's default time zone by checking the
behavior of the C library function localtime()>. The
default time zone is selected as the closest match among
PostgreSQL's known time zones.
The SQL command SET TIME ZONE
sets the time zone for the session. This is an alternative spelling
of SET TIMEZONE TO> with a more SQL-spec-compatible syntax.
The PGTZ environment variable, if set at the
client, is used by libpq
applications to send a SET TIME ZONE
command to the server upon connection.
Internals
PostgreSQL uses Julian dates
for all date/time calculations. They have the nice property of correctly
predicting/calculating any date more recent than 4713 BC
to far into the future, using the assumption that the length of the
year is 365.2425 days.
Date conventions before the 19th century make for interesting reading,
but are not consistent enough to warrant coding into a date/time handler.
Boolean Type
Boolean
data type
true
false
PostgreSQL provides the
standard SQL type boolean.
boolean can have one of only two states:
true
or false
. A third state,
unknown
, is represented by the
SQL null value.
Valid literal values for the true
state are:
TRUE
't'
'true'
'y'
'yes'
'1'
For the false
state, the following values can be
used:
FALSE
'f'
'false'
'n'
'no'
'0'
Using the key words TRUE and
FALSE is preferred (and
SQL-compliant).
Using the boolean type
CREATE TABLE test1 (a boolean, b text);
INSERT INTO test1 VALUES (TRUE, 'sic est');
INSERT INTO test1 VALUES (FALSE, 'non est');
SELECT * FROM test1;
a | b
---+---------
t | sic est
f | non est
SELECT * FROM test1 WHERE a;
a | b
---+---------
t | sic est
shows that
boolean values are output using the letters
t and f.
boolean uses 1 byte of storage.
Geometric Types
Geometric data types represent two-dimensional spatial
objects. shows the geometric
types available in PostgreSQL. The
most fundamental type, the point, forms the basis for all of the
other types.
Geometric Types
Name
Storage Size
Representation
Description
point
16 bytes
Point on the plane
(x,y)
line
32 bytes
Infinite line (not fully implemented)
((x1,y1),(x2,y2))
lseg
32 bytes
Finite line segment
((x1,y1),(x2,y2))
box
32 bytes
Rectangular box
((x1,y1),(x2,y2))
path
16+16n bytes
Closed path (similar to polygon)
((x1,y1),...)
path
16+16n bytes
Open path
[(x1,y1),...]
polygon
40+16n bytes
Polygon (similar to closed path)
((x1,y1),...)
circle
24 bytes
Circle
<(x,y),r> (center and radius)
A rich set of functions and operators is available to perform various geometric
operations such as scaling, translation, rotation, and determining
intersections. They are explained in .
Points
point
Points are the fundamental two-dimensional building block for geometric types.
Values of type point are specified using the following syntax:
( x , y )
x , y
where x> and y> are the respective
coordinates as floating-point numbers.
Line Segments
lseg
line segment
Line segments (lseg) are represented by pairs of points.
Values of type lseg are specified using the following syntax:
( ( x1 , y1 ) , ( x2 , y2 ) )
( x1 , y1 ) , ( x2 , y2 )
x1 , y1 , x2 , y2
where
(x1,y1)
and
(x2,y2)
are the end points of the line segment.
Boxes
box (data type)
rectangle
Boxes are represented by pairs of points that are opposite
corners of the box.
Values of type box are specified using the following syntax:
( ( x1 , y1 ) , ( x2 , y2 ) )
( x1 , y1 ) , ( x2 , y2 )
x1 , y1 , x2 , y2
where
(x1,y1)
and
(x2,y2)
are any two opposite corners of the box.
Boxes are output using the first syntax.
The corners are reordered on input to store
the upper right corner, then the lower left corner.
Other corners of the box can be entered, but the lower
left and upper right corners are determined from the input and stored.
Paths
path (data type)
Paths are represented by lists of connected points. Paths can be
open, where
the first and last points in the list are not considered connected, or
closed,
where the first and last points are considered connected.
Values of type path are specified using the following syntax:
( ( x1 , y1 ) , ... , ( xn , yn ) )
[ ( x1 , y1 ) , ... , ( xn , yn ) ]
( x1 , y1 ) , ... , ( xn , yn )
( x1 , y1 , ... , xn , yn )
x1 , y1 , ... , xn , yn
where the points are the end points of the line segments
comprising the path. Square brackets ([]>) indicate
an open path, while parentheses (()>) indicate a
closed path.
Paths are output using the first syntax.
Polygons
polygon
Polygons are represented by lists of points (the vertexes of the
polygon). Polygons should probably be
considered equivalent to closed paths, but are stored differently
and have their own set of support routines.
Values of type polygon are specified using the following syntax:
( ( x1 , y1 ) , ... , ( xn , yn ) )
( x1 , y1 ) , ... , ( xn , yn )
( x1 , y1 , ... , xn , yn )
x1 , y1 , ... , xn , yn
where the points are the end points of the line segments
comprising the boundary of the polygon.
Polygons are output using the first syntax.
Circles
circle
Circles are represented by a center point and a radius.
Values of type circle are specified using the following syntax:
< ( x , y ) , r >
( ( x , y ) , r )
( x , y ) , r
x , y , r
where
(x,y)
is the center and r is the radius of the circle.
Circles are output using the first syntax.
Network Address Types
network
data types
PostgreSQL> offers data types to store IPv4, IPv6, and MAC
addresses, as shown in . It
is preferable to use these types instead of plain text types to store
network addresses, because
these types offer input error checking and several specialized
operators and functions (see ).
Network Address Types
Name
Storage Size
Description
cidr
12 or 24 bytes
IPv4 and IPv6 networks
inet
12 or 24 bytes
IPv4 and IPv6 hosts and networks
macaddr
6 bytes
MAC addresses
When sorting inet or cidr data types,
IPv4 addresses will always sort before IPv6 addresses, including
IPv4 addresses encapsulated or mapped into IPv6 addresses, such as
::10.2.3.4 or ::ffff::10.4.3.2.
inet
inet (data type)
The inet type holds an IPv4 or IPv6 host address, and
optionally the identity of the subnet it is in, all in one field.
The subnet identity is represented by stating how many bits of
the host address represent the network address (the
netmask
). If the netmask is 32 and the address is IPv4,
then the value does not indicate a subnet, only a single host.
In IPv6, the address length is 128 bits, so 128 bits specify a
unique host address. Note that if you
want to accept networks only, you should use the
cidr type rather than inet.
The input format for this type is
address/y
where
address
is an IPv4 or IPv6 address and
y
is the number of bits in the netmask. If the
/y
part is left off, then the
netmask is 32 for IPv4 and 128 for IPv6, so the value represents
just a single host. On display, the
/y
portion is suppressed if the netmask specifies a single host.
cidr>
cidr
The cidr type holds an IPv4 or IPv6 network specification.
Input and output formats follow Classless Internet Domain Routing
conventions.
The format for specifying networks is address/y> where address> is the network represented as an
IPv4 or IPv6 address, and y> is the number of bits in the netmask. If
y> is omitted, it is calculated
using assumptions from the older classful network numbering system, except
that it will be at least large enough to include all of the octets
written in the input. It is an error to specify a network address
that has bits set to the right of the specified netmask.
shows some examples.
cidr> Type Input Examples
cidr Input
cidr Output
abbrev(cidr)
192.168.100.128/25
192.168.100.128/25
192.168.100.128/25
192.168/24
192.168.0.0/24
192.168.0/24
192.168/25
192.168.0.0/25
192.168.0.0/25
192.168.1
192.168.1.0/24
192.168.1/24
192.168
192.168.0.0/24
192.168.0/24
128.1
128.1.0.0/16
128.1/16
128
128.0.0.0/16
128.0/16
128.1.2
128.1.2.0/24
128.1.2/24
10.1.2
10.1.2.0/24
10.1.2/24
10.1
10.1.0.0/16
10.1/16
10
10.0.0.0/8
10/8
10.1.2.3/32
10.1.2.3/32
10.1.2.3/32
2001:4f8:3:ba::/64
2001:4f8:3:ba::/64
2001:4f8:3:ba::/64
2001:4f8:3:ba:2e0:81ff:fe22:d1f1/128
2001:4f8:3:ba:2e0:81ff:fe22:d1f1/128
2001:4f8:3:ba:2e0:81ff:fe22:d1f1
::ffff:1.2.3.0/120
::ffff:1.2.3.0/120
::ffff:1.2.3/120
::ffff:1.2.3.0/128
::ffff:1.2.3.0/128
::ffff:1.2.3.0/128
inet vs. cidr
The essential difference between inet and cidr
data types is that inet accepts values with nonzero bits to
the right of the netmask, whereas cidr does not.
If you do not like the output format for inet or
cidr values, try the functions host>,
text>, and abbrev>.
macaddr>>
macaddr (data type)
MAC address
macaddr
The macaddr> type stores MAC addresses, i.e., Ethernet
card hardware addresses (although MAC addresses are used for
other purposes as well). Input is accepted in various customary
formats, including
'08002b:010203'>
'08002b-010203'>
'0800.2b01.0203'>
'08-00-2b-01-02-03'>
'08:00:2b:01:02:03'>
which would all specify the same
address. Upper and lower case is accepted for the digits
a> through f>. Output is always in the
last of the forms shown.
Bit String Types
bit string
data type
Bit strings are strings of 1's and 0's. They can be used to store
or visualize bit masks. There are two SQL bit types:
bit(n) and bit
varying(n), where
n is a positive integer.
bit type data must match the length
n exactly; it is an error to attempt to
store shorter or longer bit strings. bit varying data is
of variable length up to the maximum length
n; longer strings will be rejected.
Writing bit without a length is equivalent to
bit(1), while bit varying without a length
specification means unlimited length.
If one explicitly casts a bit-string value to
bit(n>), it will be truncated or
zero-padded on the right to be exactly n> bits,
without raising an error. Similarly,
if one explicitly casts a bit-string value to
bit varying(n>), it will be truncated
on the right if it is more than n> bits.
Refer to for information about the syntax
of bit string constants. Bit-logical operators and string
manipulation functions are available; see .
Using the bit string types
CREATE TABLE test (a BIT(3), b BIT VARYING(5));
INSERT INTO test VALUES (B'101', B'00');
INSERT INTO test VALUES (B'10', B'101');
ERROR: bit string length 2 does not match type bit(3)
INSERT INTO test VALUES (B'10'::bit(3), B'101');
SELECT * FROM test;
a | b
-----+-----
101 | 00
100 | 101
&array;
&rowtypes;
Object Identifier Types
object identifier
data type
oid
regproc
regprocedure
regoper
regoperator
regclass
regtype
xid
cid
tid
Object identifiers (OIDs) are used internally by
PostgreSQL as primary keys for various
system tables. OIDs are not added to user-created tables, unless
WITH OIDS is specified when the table is
created, or the
configuration variable is enabled. Type oid> represents
an object identifier. There are also several alias types for
oid>: regproc>, regprocedure>,
regoper>, regoperator>, regclass>, and
regtype>. shows an
overview.
The oid> type is currently implemented as an unsigned
four-byte integer. Therefore, it is not large enough to provide
database-wide uniqueness in large databases, or even in large
individual tables. So, using a user-created table's OID column as
a primary key is discouraged. OIDs are best used only for
references to system tables.
The oid> type itself has few operations beyond comparison.
It can be cast to integer, however, and then manipulated using the
standard integer operators. (Beware of possible
signed-versus-unsigned confusion if you do this.)
The OID alias types have no operations of their own except
for specialized input and output routines. These routines are able
to accept and display symbolic names for system objects, rather than
the raw numeric value that type oid> would use. The alias
types allow simplified lookup of OID values for objects. For example,
to examine the pg_attribute> rows related to a table
mytable>, one could write
SELECT * FROM pg_attribute WHERE attrelid = 'mytable'::regclass;
rather than
SELECT * FROM pg_attribute
WHERE attrelid = (SELECT oid FROM pg_class WHERE relname = 'mytable');
While that doesn't look all that bad by itself, it's still oversimplified.
A far more complicated sub-select would be needed to
select the right OID if there are multiple tables named
mytable> in different schemas.
The regclass> input converter handles the table lookup according
to the schema path setting, and so it does the right thing>
automatically. Similarly, casting a table's OID to
regclass> is handy for symbolic display of a numeric OID.
Object Identifier Types
Name
References
Description
Value Example
oid>
any
numeric object identifier
564182>
regproc>
pg_proc>
function name
sum>
regprocedure>
pg_proc>
function with argument types
sum(int4)>
regoper>
pg_operator>
operator name
+>
regoperator>
pg_operator>
operator with argument types
*(integer,integer)> or -(NONE,integer)>
regclass>
pg_class>
relation name
pg_type>
regtype>
pg_type>
data type name
integer>
All of the OID alias types accept schema-qualified names, and will
display schema-qualified names on output if the object would not
be found in the current search path without being qualified.
The regproc> and regoper> alias types will only
accept input names that are unique (not overloaded), so they are
of limited use; for most uses regprocedure> or
regoperator> is more appropriate. For regoperator>,
unary operators are identified by writing NONE> for the unused
operand.
An additional property of the OID alias types is that if a
constant of one of these types appears in a stored expression
(such as a column default expression or view), it creates a dependency
on the referenced object. For example, if a column has a default
expression nextval('my_seq'::regclass)>,
PostgreSQL
understands that the default expression depends on the sequence
my_seq>; the system will not let the sequence be dropped
without first removing the default expression.
Another identifier type used by the system is xid>, or transaction
(abbreviated xact>) identifier. This is the data type of the system columns
xmin> and xmax>. Transaction identifiers are 32-bit quantities.
A third identifier type used by the system is cid>, or
command identifier. This is the data type of the system columns
cmin> and cmax>. Command identifiers are also 32-bit quantities.
A final identifier type used by the system is tid>, or tuple
identifier (row identifier). This is the data type of the system column
ctid>. A tuple ID is a pair
(block number, tuple index within block) that identifies the
physical location of the row within its table.
(The system columns are further explained in .)
Pseudo-Types
record
any
anyarray
anyelement
void
trigger
language_handler
cstring
internal
opaque
The PostgreSQL type system contains a
number of special-purpose entries that are collectively called
pseudo-types>. A pseudo-type cannot be used as a
column data type, but it can be used to declare a function's
argument or result type. Each of the available pseudo-types is
useful in situations where a function's behavior does not
correspond to simply taking or returning a value of a specific
SQL data type. lists the existing
pseudo-types.
Pseudo-Types
Name
Description
any>
Indicates that a function accepts any input data type whatever.
anyarray>
Indicates that a function accepts any array data type
(see ).
anyelement>
Indicates that a function accepts any data type
(see ).
cstring>
Indicates that a function accepts or returns a null-terminated C string.
internal>
Indicates that a function accepts or returns a server-internal
data type.
language_handler>
A procedural language call handler is declared to return language_handler>.
record>
Identifies a function returning an unspecified row type.
trigger>
A trigger function is declared to return trigger.>
void>
Indicates that a function returns no value.
opaque>
An obsolete type name that formerly served all the above purposes.
Functions coded in C (whether built-in or dynamically loaded) may be
declared to accept or return any of these pseudo data types. It is up to
the function author to ensure that the function will behave safely
when a pseudo-type is used as an argument type.
Functions coded in procedural languages may use pseudo-types only as
allowed by their implementation languages. At present the procedural
languages all forbid use of a pseudo-type as argument type, and allow
only void> and record> as a result type (plus
trigger> when the function is used as a trigger). Some also
support polymorphic functions using the types anyarray> and
anyelement>.
The internal> pseudo-type is used to declare functions
that are meant only to be called internally by the database
system, and not by direct invocation in a SQL
query. If a function has at least one internal>-type
argument then it cannot be called from SQL. To
preserve the type safety of this restriction it is important to
follow this coding rule: do not create any function that is
declared to return internal> unless it has at least one
internal> argument.
XML> Type
XML
The data type xml can be used to store XML data. Its
advantage over storing XML data in a text field is that it
checks the input values for well-formedness, and there are support
functions to perform type-safe operations on it; see .
In particular, the xml type can store well-formed
documents
, as defined by the XML standard, as well
as content
fragments, which are defined by the
production XMLDecl? content in the XML
standard. Roughly, this means that content fragments can have
more than one top-level element or character node. The expression
xmlvalue IS DOCUMENT
can be used to evaluate whether a particular xml
value is a full document or only a content fragment.
To produce a value of type xml from character data,
use the function xmlparse:
XMLPARSE ( { DOCUMENT | CONTENT } value)
Examples:
Manual...')
XMLPARSE (CONTENT 'abcbarfoo')
]]>
While this is the only way to convert character strings into XML
values according to the SQL standard, the PostgreSQL-specific
syntaxes
bar'
'bar'::xml
]]>
can also be used.
The xml type does not validate its input values
against a possibly included document type declaration (DTD).
The inverse operation, producing character string type values from
xml, uses the function
xmlserialize:
XMLSERIALIZE ( { DOCUMENT | CONTENT } value AS type )
type can be one of
character, character varying, or
text (or an alias name for those). Again, according
to the SQL standard, this is the only way to convert between type
xml and character types, but PostgreSQL also allows
you to simply cast the value.
Care must be taken when dealing with multiple character encodings
on the client, server, and in the XML data passed through them.
When using the text mode to pass queries to the server and query
results to the client (which is the normal mode), PostgreSQL
converts all character data passed between the client and the
server and vice versa to the character encoding of the respective
end; see . This includes string
representations of XML values, such as in the above examples.
This would ordinarily mean that encoding declarations contained in
XML data might become invalid as the character data is converted
to other encodings while travelling between client and server,
while the embedded encoding declaration is not changed. To cope
with this behavior, an encoding declaration contained in a
character string presented for input to the xml type
is ignored, and the content is always assumed
to be in the current server encoding. Consequently, for correct
processing, such character strings of XML data must be sent off
from the client in the current client encoding. It is the
responsibility of the client to either convert the document to the
current client encoding before sending it off to the server or to
adjust the client encoding appropriately. On output, values of
type xml will not have an encoding declaration, and
clients must assume that the data is in the current client
encoding.
When using the binary mode to pass query parameters to the server
and query results back the the client, no character set conversion
is performed, so the situation is different. In this case, an
encoding declaration in the XML data will be observed, and if it
is absent, the data will be assumed to be in UTF-8 (as required by
the XML standard; note that PostgreSQL does not support UTF-16 at
all). On output, data will have an encoding declaration
specifying the client encoding, unless the client encoding is
UTF-8, in which case it will be omitted.
Needless to say, processing XML data with PostgreSQL will be less
error-prone and more efficient if data encoding, client encoding,
and server encoding are the same. Since XML data is internally
processed in UTF-8, computations will be most efficient if the
server encoding is also UTF-8.
XML> (Extensible Markup Language) support is not
just the existance of an xml data type, but a
variety of features supported by a database system. These
capabilities include import/export, indexing, searching,
transforming, and XML> to SQL> mapping.
PostgreSQL> supports some but not all of these
XML> capabilities. For an overview of XML>
use in databases, see >.
Import/Export
There is no facility for mapping XML> to relational
tables. An external tool must be used for this. One simple way to
export XML> is to use psql> in
HTML> mode (\pset format html>), and convert
the XHTML> output to XML using an external tool.
Indexing
/contrib/xml2> functions can be used in expression
indexes to index specific XML> fields. To index the
full contents of XML> documents, the full-text indexing
tool /contrib/tsearch2> can be used. Of course,
Tsearch2 indexes have no XML> awareness so additional
/contrib/xml2> checks should be added to queries.
Searching
XPath searches are implemented using /contrib/xml2>.
It processes XML> text documents and returns results
based on the requested query.
Transforming
/contrib/xml2> supports XSLT> (Extensible
Stylesheet Language Transformation).
XML to SQL Mapping
This involves converting XML> data to and from
relational structures. PostgreSQL> has no internal
support for such mapping, and relies on external tools to do such
conversions.