Data Types

PostgreSQL has a rich set of native data types available to users. Users may add new types to PostgreSQL using the CREATE TYPE command.

The table below shows all the built-in general-purpose data types. Most of the alternative names listed in the Aliases column are the names used internally by PostgreSQL for historical reasons. In addition, some internally used or deprecated types are available, but they are not listed here.

Data Types

Name | Aliases | Description
--- | --- | ---
bigint | int8 | signed eight-byte integer
bigserial | serial8 | autoincrementing eight-byte integer
bit [ (n) ] |  | fixed-length bit string
bit varying [ (n) ] | varbit | variable-length bit string
boolean | bool | logical Boolean (true/false)
box |  | rectangular box in the plane
bytea |  | binary data (byte array)
character varying [ (n) ] | varchar [ (n) ] | variable-length character string
character [ (n) ] | char [ (n) ] | fixed-length character string
cidr |  | IPv4 or IPv6 network address
circle |  | circle in the plane
date |  | calendar date (year, month, day)
double precision | float8 | double precision floating-point number
inet |  | IPv4 or IPv6 host address
integer | int, int4 | signed four-byte integer
interval [ (p) ] |  | time span
line |  | infinite line in the plane
lseg |  | line segment in the plane
macaddr |  | MAC address
money |  | currency amount
numeric [ (p, s) ] | decimal [ (p, s) ] | exact numeric of selectable precision
path |  | geometric path in the plane
point |  | geometric point in the plane
polygon |  | closed geometric path in the plane
real | float4 | single precision floating-point number
smallint | int2 | signed two-byte integer
serial | serial4 | autoincrementing four-byte integer
text |  | variable-length character string
time [ (p) ] [ without time zone ] |  | time of day
time [ (p) ] with time zone | timetz | time of day, including time zone
timestamp [ (p) ] [ without time zone ] |  | date and time
timestamp [ (p) ] with time zone | timestamptz | date and time, including time zone
xml |  | XML data
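As a small illustration of the Aliases column, the sketch below declares each column with an alias spelling. The table name alias_demo and its column names are hypothetical and chosen only for this example.

```sql
-- Hypothetical table: every column is declared with an alias from the table above.
CREATE TABLE alias_demo (
    id      int8,          -- alias for bigint
    flag    bool,          -- alias for boolean
    ratio   float8,        -- alias for double precision
    label   varchar(20),   -- alias for character varying(20)
    seen_at timestamptz    -- alias for timestamp with time zone
);
```

Describing the table afterwards (for example with \d alias_demo in psql) should report the names from the Name column rather than the aliases that were typed.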

Compatibility

The following types (or spellings thereof) are specified by SQL: bit, bit varying, boolean, char, character varying, character, varchar, date, double precision, integer, interval, numeric, decimal, real, smallint, time (with or without time zone), timestamp (with or without time zone), xml.

Each data type has an external representation determined by its input and output functions. Many of the built-in types have obvious external formats. However, several types are either unique to PostgreSQL, such as geometric paths, or have several possible formats, such as the date and time types. Some of the input and output functions are not invertible; that is, the result of an output function may lose accuracy when compared to the original input.

Numeric Types

Numeric types consist of two-, four-, and eight-byte integers, four- and eight-byte floating-point numbers, and selectable-precision decimals. The table below lists the available types.

Numeric Types

Name | Storage Size | Description | Range
--- | --- | --- | ---
smallint | 2 bytes | small-range integer | -32768 to +32767
integer | 4 bytes | usual choice for integer | -2147483648 to +2147483647
bigint | 8 bytes | large-range integer | -9223372036854775808 to 9223372036854775807
decimal | variable | user-specified precision, exact | no limit
numeric | variable | user-specified precision, exact | no limit
real | 4 bytes | variable-precision, inexact | 6 decimal digits precision
double precision | 8 bytes | variable-precision, inexact | 15 decimal digits precision
serial | 4 bytes | autoincrementing integer | 1 to 2147483647
bigserial | 8 bytes | large autoincrementing integer | 1 to 9223372036854775807
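The integer ranges in the table above can be probed with plain casts. The following is a minimal sketch; the comments describe the expected behavior rather than quoting exact server messages.

```sql
SELECT 32767::smallint;           -- largest smallint value, accepted
SELECT 32768::smallint;           -- rejected with an out-of-range error
SELECT 2147483647::integer + 1;   -- rejected: the sum does not fit in integer
SELECT 2147483647::bigint + 1;    -- 2147483648, since bigint has room for it
```

Casting one operand to bigint before the addition is usually enough to avoid the overflow, at the cost of eight-byte arithmetic.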

The syntax of constants for the numeric types is described in the section on numeric constants. The numeric types have a full set of corresponding arithmetic operators and functions. Refer to the chapter on functions and operators for more information. The following sections describe the types in detail.

Integer Types

The types smallint, integer, and bigint store whole numbers, that is, numbers without fractional components, of various ranges. Attempts to store values outside of the allowed range will result in an error.

The type integer is the usual choice, as it offers the best balance between range, storage size, and performance. The smallint type is generally only used if disk space is at a premium. The bigint type should only be used if the integer range is not sufficient, because the latter is definitely faster.

The bigint type may not function correctly on all platforms, since it relies on compiler support for eight-byte integers. On a machine without such support, bigint acts the same as integer (but still takes up eight bytes of storage). However, we are not aware of any reasonable platform where this is actually the case.

SQL only specifies the integer types integer (or int) and smallint. The type bigint and the type names int2, int4, and int8 are extensions, which are shared with various other SQL database systems.

Arbitrary Precision Numbers

The type numeric can store numbers with up to 1000 digits of precision and perform calculations exactly. It is especially recommended for storing monetary amounts and other quantities where exactness is required. However, arithmetic on numeric values is very slow compared to the integer types, or to the floating-point types described in the next section.

In what follows we use these terms:

The scale of a numeric is the count of decimal digits in the fractional part, to the right of the decimal point.
The precision of a numeric is the total count of significant digits in the whole number, that is, the number of digits to both sides of the decimal point.

So the number 23.5141 has a precision of 6 and a scale of 4. Integers can be considered to have a scale of zero.

Both the maximum precision and the maximum scale of a numeric column can be configured. To declare a column of type numeric use the syntax

    NUMERIC(precision, scale)

The precision must be positive, the scale zero or positive. Alternatively,

    NUMERIC(precision)

selects a scale of 0. Specifying

    NUMERIC

without any precision or scale creates a column in which numeric values of any precision and scale can be stored, up to the implementation limit on precision. A column of this kind will not coerce input values to any particular scale, whereas numeric columns with a declared scale will coerce input values to that scale. (The SQL standard requires a default scale of 0, i.e., coercion to integer precision. We find this a bit useless. If you're concerned about portability, always specify the precision and scale explicitly.)

If the scale of a value to be stored is greater than the declared scale of the column, the system will round the value to the specified number of fractional digits. Then, if the number of digits to the left of the decimal point exceeds the declared precision minus the declared scale, an error is raised.

Numeric values are physically stored without any extra leading or trailing zeroes. Thus, the declared precision and scale of a column are maximums, not fixed allocations. (In this sense the numeric type is more akin to varchar(n) than to char(n).) The actual storage requirement is two bytes for each group of four decimal digits, plus eight bytes overhead.

In addition to ordinary numeric values, the numeric type allows the special value NaN, meaning not-a-number. Any operation on NaN yields another NaN.
When writing this value as a constant in a SQL command, you must put quotes around it, for example UPDATE table SET x = 'NaN'. On input, the string NaN is recognized in a case-insensitive manner.

In most implementations of the not-a-number concept, NaN is not considered equal to any other numeric value (including NaN). In order to allow numeric values to be sorted and used in tree-based indexes, PostgreSQL treats NaN values as equal, and greater than all non-NaN values.

The types decimal and numeric are equivalent. Both types are part of the SQL standard.

Floating-Point Types

The data types real and double precision are inexact, variable-precision numeric types. In practice, these types are usually implementations of IEEE Standard 754 for Binary Floating-Point Arithmetic (single and double precision, respectively), to the extent that the underlying processor, operating system, and compiler support it.

Inexact means that some values cannot be converted exactly to the internal format and are stored as approximations, so that storing and printing back out a value may show slight discrepancies. Managing these errors and how they propagate through calculations is the subject of an entire branch of mathematics and computer science and will not be discussed further here, except for the following points:

If you require exact storage and calculations (such as for monetary amounts), use the numeric type instead.

If you want to do complicated calculations with these types for anything important, especially if you rely on certain behavior in boundary cases (infinity, underflow), you should evaluate the implementation carefully.

Comparing two floating-point values for equality may or may not work as expected.

On most platforms, the real type has a range of at least 1E-37 to 1E+37 with a precision of at least 6 decimal digits.
The double precision type typically has a range of around 1E-307 to 1E+308 with a precision of at least 15 digits. Values that are too large or too small will cause an error. Rounding may take place if the precision of an input number is too high. Numbers too close to zero that are not representable as distinct from zero will cause an underflow error.

In addition to ordinary numeric values, the floating-point types have several special values:

    Infinity
    -Infinity
    NaN

These represent the IEEE 754 special values infinity, negative infinity, and not-a-number, respectively. (On a machine whose floating-point arithmetic does not follow IEEE 754, these values will probably not work as expected.) When writing these values as constants in a SQL command, you must put quotes around them, for example UPDATE table SET x = 'Infinity'. On input, these strings are recognized in a case-insensitive manner.

IEEE 754 specifies that NaN should not compare equal to any other floating-point value (including NaN). In order to allow floating-point values to be sorted and used in tree-based indexes, PostgreSQL treats NaN values as equal, and greater than all non-NaN values.

PostgreSQL also supports the SQL-standard notations float and float(p) for specifying inexact numeric types. Here, p specifies the minimum acceptable precision in binary digits. PostgreSQL accepts float(1) to float(24) as selecting the real type, while float(25) to float(53) select double precision. Values of p outside the allowed range draw an error. float with no precision specified is taken to mean double precision.

Prior to PostgreSQL 7.4, the precision in float(p) was taken to mean so many decimal digits. This has been corrected to match the SQL standard, which specifies that the precision is measured in binary digits. The assumption that real and double precision have exactly 24 and 53 bits in the mantissa respectively is correct for IEEE-standard floating point implementations.
On non-IEEE platforms it may be off a little, but for simplicity the same ranges of p are used on all platforms.

Serial Types

The data types serial and bigserial are not true types, but merely a notational convenience for setting up unique identifier columns (similar to the AUTO_INCREMENT property supported by some other databases). In the current implementation, specifying

    CREATE TABLE tablename (
        colname SERIAL
    );

is equivalent to specifying:

    CREATE SEQUENCE tablename_colname_seq;
    CREATE TABLE tablename (
        colname integer NOT NULL DEFAULT nextval('tablename_colname_seq')
    );
    ALTER SEQUENCE tablename_colname_seq OWNED BY tablename.colname;

Thus, we have created an integer column and arranged for its default values to be assigned from a sequence generator. A NOT NULL constraint is applied to ensure that a null value cannot be explicitly inserted, either. (In most cases you would also want to attach a UNIQUE or PRIMARY KEY constraint to prevent duplicate values from being inserted by accident, but this is not automatic.) Lastly, the sequence is marked as owned by the column, so that it will be dropped if the column or table is dropped.

Prior to PostgreSQL 7.3, serial implied UNIQUE. This is no longer automatic. If you wish a serial column to be in a unique constraint or a primary key, it must now be specified, same as with any other data type.

To insert the next value of the sequence into the serial column, specify that the serial column should be assigned its default value. This can be done either by excluding the column from the list of columns in the INSERT statement, or through the use of the DEFAULT key word.

The type names serial and serial4 are equivalent: both create integer columns. The type names bigserial and serial8 work just the same way, except that they create a bigint column.
bigserial should be used if you anticipate the use of more than 2^31 identifiers over the lifetime of the table.

The sequence created for a serial column is automatically dropped when the owning column is dropped. You can drop the sequence without dropping the column, but this will force removal of the column default expression.

Monetary Types

The money type is deprecated. Use numeric or decimal instead, in combination with the to_char function.

The money type stores a currency amount with a fixed fractional precision; see the table below. Input is accepted in a variety of formats, including integer and floating-point literals, as well as typical currency formatting, such as '$1,000.00'. Output is generally in the latter form but depends on the locale.

Monetary Types

Name | Storage Size | Description | Range
--- | --- | --- | ---
money | 4 bytes | currency amount | -21474836.48 to +21474836.47
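The recommended replacement for money can be sketched as follows: a numeric column with a declared scale of 2 holds the amount, and to_char formats it for display. The table name invoice_demo and the format pattern are illustrative assumptions, not something this chapter prescribes.

```sql
-- Hypothetical table holding an exact currency amount.
CREATE TABLE invoice_demo (amount numeric(10,2));

-- The input has three fractional digits, so it is rounded to the declared scale.
INSERT INTO invoice_demo VALUES (1234.567);   -- stored as 1234.57

-- Format for display; FM suppresses the leading padding blanks.
SELECT amount,
       to_char(amount, 'FM999,999,990.00') AS formatted   -- 1,234.57
FROM invoice_demo;
```

Unlike money output, this pattern does not depend on the locale. A currency symbol would have to be added explicitly.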

Character Types

Name | Description
--- | ---
character varying(n), varchar(n) | variable-length with limit
character(n), char(n) | fixed-length, blank padded
text | variable unlimited length

The table above shows the general-purpose character types available in PostgreSQL.

SQL defines two primary character types: character varying(n) and character(n), where n is a positive integer. Both of these types can store strings up to n characters in length. An attempt to store a longer string into a column of these types will result in an error, unless the excess characters are all spaces, in which case the string will be truncated to the maximum length. (This somewhat bizarre exception is required by the SQL standard.) If the string to be stored is shorter than the declared length, values of type character will be space-padded; values of type character varying will simply store the shorter string.

If one explicitly casts a value to character varying(n) or character(n), then an over-length value will be truncated to n characters without raising an error. (This too is required by the SQL standard.)

The notations varchar(n) and char(n) are aliases for character varying(n) and character(n), respectively. character without length specifier is equivalent to character(1). If character varying is used without length specifier, the type accepts strings of any size. The latter is a PostgreSQL extension.

In addition, PostgreSQL provides the text type, which stores strings of any length. Although the type text is not in the SQL standard, several other SQL database management systems have it as well.

Values of type character are physically padded with spaces to the specified width n, and are stored and displayed that way. However, the padding spaces are treated as semantically insignificant. Trailing spaces are disregarded when comparing two values of type character, and they will be removed when converting a character value to one of the other string types. Note that trailing spaces are semantically significant in character varying and text values.

The storage requirement for data of these types is 4 bytes plus the actual string, and in case of character plus the padding.
Long strings are compressed by the system automatically, so the physical requirement on disk may be less. Long values are also stored in background tables so they do not interfere with rapid access to the shorter column values. In any case, the longest possible character string that can be stored is about 1 GB. (The maximum value that will be allowed for n in the data type declaration is less than that. It wouldn't be very useful to change this because with multibyte character encodings the number of characters and bytes can be quite different anyway. If you desire to store long strings with no specific upper limit, use text or character varying without a length specifier, rather than making up an arbitrary length limit.)

There are no performance differences between these three types, apart from the increased storage size when using the blank-padded type. While character(n) has performance advantages in some other database systems, it has no such advantages in PostgreSQL. In most situations text or character varying should be used instead.

Refer to the section on string constants for information about the syntax of string literals, and to the chapter on functions and operators for information about available operators and functions. The database character set determines the character set used to store textual values; for more information, refer to the section on character set support.

Using the character types

    CREATE TABLE test1 (a character(4));
    INSERT INTO test1 VALUES ('ok');
    SELECT a, char_length(a) FROM test1;

      a   | char_length
    ------+-------------
     ok   |           2

    CREATE TABLE test2 (b varchar(5));
    INSERT INTO test2 VALUES ('ok');
    INSERT INTO test2 VALUES ('good ');
    INSERT INTO test2 VALUES ('too long');
    ERROR:  value too long for type character varying(5)
    INSERT INTO test2 VALUES ('too long'::varchar(5)); -- explicit truncation
    SELECT b, char_length(b) FROM test2;

       b   | char_length
    -------+-------------
     ok    |           2
     good  |           5
     too l |           5

The char_length function is discussed in the section on string functions.

There are two other fixed-length character types in PostgreSQL, shown in the table below.
The name type exists only for storage of identifiers in the internal system catalogs and is not intended for use by the general user. Its length is currently defined as 64 bytes (63 usable characters plus terminator) but should be referenced using the constant NAMEDATALEN. The length is set at compile time (and is therefore adjustable for special uses); the default maximum length may change in a future release. The type "char" (note the quotes) is different from char(1) in that it only uses one byte of storage. It is internally used in the system catalogs as a poor-man's enumeration type.

Special Character Types

Name | Storage Size | Description
--- | --- | ---
"char" | 1 byte | single-character internal type
name | 64 bytes | internal type for object names

Binary Data Types

The bytea data type allows storage of binary strings; see the table below.

Binary Data Types

Name | Storage Size | Description
--- | --- | ---
bytea | 4 bytes plus the actual binary string | variable-length binary string

A binary string is a sequence of octets (or bytes). Binary strings are distinguished from character strings by two characteristics: First, binary strings specifically allow storing octets of value zero and other non-printable octets (usually, octets outside the range 32 to 126). Character strings disallow zero octets, and also disallow any other octet values and sequences of octet values that are invalid according to the database's selected character set encoding. Second, operations on binary strings process the actual bytes, whereas the processing of character strings depends on locale settings. In short, binary strings are appropriate for storing data that the programmer thinks of as raw bytes, whereas character strings are appropriate for storing text.

When entering bytea values, octets of certain values must be escaped (but all octet values can be escaped) when used as part of a string literal in an SQL statement. In general, to escape an octet, it is converted into the three-digit octal number equivalent of its decimal octet value, and preceded by two backslashes (or one backslash if standard_conforming_strings is on, since the string-literal parser then leaves backslashes untouched). The table below shows the characters that must be escaped, and gives the alternate escape sequences where applicable.

bytea Literal Escaped Octets

Decimal Octet Value | Description | Escaped Input Representation | Example | Output Representation
--- | --- | --- | --- | ---
0 | zero octet | '\\000' | SELECT '\\000'::bytea; | \000
39 | single quote | '\'' or '\\047' | SELECT '\''::bytea; | '
92 | backslash | '\\\\' or '\\134' | SELECT '\\\\'::bytea; | \\
0 to 31 and 127 to 255 | non-printable octets | '\\xxx' (octal value) | SELECT '\\001'::bytea; | \001

The requirement to escape non-printable octets actually varies depending on locale settings. In some instances you can get away with leaving them unescaped. Note that the result in each of the examples in the input table above was exactly one octet in length, even though the output representations of the zero octet and the backslash are more than one character.

The reason that you have to write so many backslashes, as shown in the input table above, is that an input string written as a string literal must pass through two parse phases in the PostgreSQL server. The first backslash of each pair is interpreted as an escape character by the string-literal parser (assuming standard_conforming_strings is off) and is therefore consumed, leaving the second backslash of the pair. The remaining backslash is then recognized by the bytea input function as starting either a three-digit octal value or escaping another backslash. For example, a string literal passed to the server as '\\001' becomes \001 after passing through the string-literal parser. The \001 is then sent to the bytea input function, where it is converted to a single octet with a decimal value of 1. Note that the apostrophe character is not treated specially by bytea, so it follows the normal rules for string literals. (See also the section on string constants.)

Bytea octets are also escaped in the output. In general, each non-printable octet is converted into its equivalent three-digit octal value and preceded by one backslash. Most printable octets are represented by their standard representation in the client character set. The octet with decimal value 92 (backslash) has a special alternative output representation. Details are in the table below.

bytea Output Escaped Octets

Decimal Octet Value | Description | Escaped Output Representation | Example | Output Result
--- | --- | --- | --- | ---
92 | backslash | \\ | SELECT '\\134'::bytea; | \\
0 to 31 and 127 to 255 | non-printable octets | \xxx (octal value) | SELECT '\\001'::bytea; | \001
32 to 126 | printable octets | client character set representation | SELECT '\\176'::bytea; | ~
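The point that each escaped example is a single octet, whatever its printed form looks like, can be checked with octet_length. This is a minimal sketch. It uses the E'...' escape-string syntax (an assumption about the server version) so that the backslash handling does not depend on the standard_conforming_strings setting.

```sql
SELECT octet_length(E'\\000'::bytea);     -- 1: the zero octet
SELECT octet_length(E'\\\\'::bytea);      -- 1: a single backslash
SELECT octet_length(E'\\001'::bytea);     -- 1: a non-printable octet
SELECT octet_length(E'a\\000b'::bytea);   -- 3: 'a', zero octet, 'b'
```

In each case the string-literal parser consumes one level of backslashes and the bytea input function consumes the other, which is the two-phase parsing described above.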

Depending on the front end to PostgreSQL you use, you may have additional work to do in terms of escaping and unescaping bytea strings. For example, you may also have to escape line feeds and carriage returns if your interface automatically translates these.

The SQL standard defines a different binary string type, called BLOB or BINARY LARGE OBJECT. The input format is different from bytea, but the provided functions and operators are mostly the same.

Date/Time Types

PostgreSQL supports the full set of SQL date and time types, shown in the table below. The operations available on these data types are described in the section on date/time functions and operators.

Date/Time Types

Name | Storage Size | Description | Low Value | High Value | Resolution
--- | --- | --- | --- | --- | ---
timestamp [ (p) ] [ without time zone ] | 8 bytes | both date and time | 4713 BC | 5874897 AD | 1 microsecond / 14 digits
timestamp [ (p) ] with time zone | 8 bytes | both date and time, with time zone | 4713 BC | 5874897 AD | 1 microsecond / 14 digits
interval [ (p) ] | 12 bytes | time intervals | -178000000 years | 178000000 years | 1 microsecond / 14 digits
date | 4 bytes | dates only | 4713 BC | 5874897 AD | 1 day
time [ (p) ] [ without time zone ] | 8 bytes | times of day only | 00:00:00 | 24:00:00 | 1 microsecond / 14 digits
time [ (p) ] with time zone | 12 bytes | times of day only, with time zone | 00:00:00+1459 | 24:00:00-1459 | 1 microsecond / 14 digits

Prior to PostgreSQL 7.3, writing just timestamp was equivalent to timestamp with time zone. This was changed for SQL compliance.

time, timestamp, and interval accept an optional precision value p which specifies the number of fractional digits retained in the seconds field. By default, there is no explicit bound on precision. The allowed range of p is from 0 to 6 for the timestamp and interval types.

When timestamp values are stored as double precision floating-point numbers (currently the default), the effective limit of precision may be less than 6. timestamp values are stored as seconds before or after midnight 2000-01-01. Microsecond precision is achieved for dates within a few years of 2000-01-01, but the precision degrades for dates further away. When timestamp values are stored as eight-byte integers (a compile-time option), microsecond precision is available over the full range of values. However, eight-byte integer timestamps have a more limited range of dates than shown above: from 4713 BC up to 294276 AD. The same compile-time option also determines whether time and interval values are stored as floating-point or eight-byte integers. In the floating-point case, large interval values degrade in precision as the size of the interval increases.

For the time types, the allowed range of p is from 0 to 6 when eight-byte integer storage is used, or from 0 to 10 when floating-point storage is used.

The type time with time zone is defined by the SQL standard, but the definition exhibits properties which lead to questionable usefulness. In most cases, a combination of date, time, timestamp without time zone, and timestamp with time zone should provide a complete range of date/time functionality required by any application.

The types abstime and reltime are lower precision types which are used internally. You are discouraged from using these types in new applications and are encouraged to move any old ones over when appropriate.
Any or all of these internal types might disappear in a future release.

Date/Time Input

Date and time input is accepted in almost any reasonable format, including ISO 8601, SQL-compatible, traditional POSTGRES, and others. For some formats, ordering of month, day, and year in date input is ambiguous and there is support for specifying the expected ordering of these fields. Set the DateStyle parameter to MDY to select month-day-year interpretation, DMY to select day-month-year interpretation, or YMD to select year-month-day interpretation.

PostgreSQL is more flexible in handling date/time input than the SQL standard requires. See the date/time support appendix for the exact parsing rules of date/time input and for the recognized text fields including months, days of the week, and time zones.

Remember that any date or time literal input needs to be enclosed in single quotes, like text strings. Refer to the section on constants for more information. SQL requires the following syntax

    type [ (p) ] 'value'

where p in the optional precision specification is an integer corresponding to the number of fractional digits in the seconds field. Precision can be specified for time, timestamp, and interval types. The allowed values are mentioned above. If no precision is specified in a constant specification, it defaults to the precision of the literal value.

Dates

The table below shows some possible inputs for the date type.
Date Input

Example | Description
--- | ---
January 8, 1999 | unambiguous in any datestyle input mode
1999-01-08 | ISO 8601; January 8 in any mode (recommended format)
1/8/1999 | January 8 in MDY mode; August 1 in DMY mode
1/18/1999 | January 18 in MDY mode; rejected in other modes
01/02/03 | January 2, 2003 in MDY mode; February 1, 2003 in DMY mode; February 3, 2001 in YMD mode
1999-Jan-08 | January 8 in any mode
Jan-08-1999 | January 8 in any mode
08-Jan-1999 | January 8 in any mode
99-Jan-08 | January 8 in YMD mode, else error
08-Jan-99 | January 8, except error in YMD mode
Jan-08-99 | January 8, except error in YMD mode
19990108 | ISO 8601; January 8, 1999 in any mode
990108 | ISO 8601; January 8, 1999 in any mode
1999.008 | year and day of year
J2451187 | Julian day
January 8, 99 BC | year 99 before the Common Era
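The ambiguity of slash-separated input can be seen by switching the field ordering between two otherwise identical queries. This is a minimal sketch that assumes ISO output style so the results are easy to compare.

```sql
SET datestyle = 'ISO, MDY';
SELECT DATE '1/8/1999';      -- 1999-01-08 (January 8)

SET datestyle = 'ISO, DMY';
SELECT DATE '1/8/1999';      -- 1999-08-01 (August 1)

SELECT DATE '1999-01-08';    -- 1999-01-08 regardless of the ordering setting
```

This is why the ISO 8601 form is the recommended input format: its meaning does not change with the session's settings.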

Times

The time-of-day types are time [ (p) ] without time zone and time [ (p) ] with time zone. Writing just time is equivalent to time without time zone.

Valid input for these types consists of a time of day followed by an optional time zone. (See the Time Input and Time Zone Input tables below.) If a time zone is specified in the input for time without time zone, it is silently ignored. You can also specify a date but it will be ignored, except when you use a time zone name that involves a daylight-savings rule, such as America/New_York. In this case specifying the date is required in order to determine whether standard or daylight-savings time applies. The appropriate time zone offset is recorded in the time with time zone value.

Time Input

Example | Description
--- | ---
04:05:06.789 | ISO 8601
04:05:06 | ISO 8601
04:05 | ISO 8601
040506 | ISO 8601
04:05 AM | same as 04:05; AM does not affect value
04:05 PM | same as 16:05; input hour must be <= 12
04:05:06.789-8 | ISO 8601
04:05:06-08:00 | ISO 8601
04:05-08:00 | ISO 8601
040506-08 | ISO 8601
04:05:06 PST | time zone specified by abbreviation
2003-04-12 04:05:06 America/New_York | time zone specified by full name
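A few of the inputs from the table above, written as typed literals, illustrate how the two time-of-day types treat a zone. The results in the comments are what a session with the default ISO output style should print.

```sql
SELECT TIME '04:05 PM';                        -- 16:05:00
SELECT TIME '04:05:06 PST';                    -- 04:05:06 (zone silently ignored)
SELECT TIME WITH TIME ZONE '04:05:06-08:00';   -- 04:05:06-08 (offset is kept)
```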

Time Zone Input

Example | Description
--- | ---
PST | Abbreviation (for Pacific Standard Time)
America/New_York | Full time zone name
PST8PDT | POSIX-style time zone specification
-8:00 | ISO-8601 offset for PST
-800 | ISO-8601 offset for PST
-8 | ISO-8601 offset for PST
zulu | Military abbreviation for UTC
z | Short form of zulu

Refer to the section on time zones for more information on how to specify time zones.

Time Stamps

Valid input for the time stamp types consists of a concatenation of a date and a time, followed by an optional time zone, followed by an optional AD or BC. (Alternatively, AD/BC can appear before the time zone, but this is not the preferred ordering.) Thus

    1999-01-08 04:05:06

and

    1999-01-08 04:05:06 -8:00

are valid values, which follow the ISO 8601 standard. In addition, the widespread format

    January 8 04:05:06 1999 PST

is supported.

The SQL standard differentiates timestamp without time zone and timestamp with time zone literals by the presence of a + or -. Hence, according to the standard,

    TIMESTAMP '2004-10-19 10:23:54'

is a timestamp without time zone, while

    TIMESTAMP '2004-10-19 10:23:54+02'

is a timestamp with time zone. PostgreSQL never examines the content of a literal string before determining its type, and therefore will treat both of the above as timestamp without time zone. To ensure that a literal is treated as timestamp with time zone, give it the correct explicit type:

    TIMESTAMP WITH TIME ZONE '2004-10-19 10:23:54+02'

In a literal that has been decided to be timestamp without time zone, PostgreSQL will silently ignore any time zone indication. That is, the resulting value is derived from the date/time fields in the input value, and is not adjusted for time zone.

For timestamp with time zone, the internally stored value is always in UTC (Universal Coordinated Time, traditionally known as Greenwich Mean Time, GMT). An input value that has an explicit time zone specified is converted to UTC using the appropriate offset for that time zone. If no time zone is stated in the input string, then it is assumed to be in the time zone indicated by the system's timezone parameter, and is converted to UTC using the offset for that zone.
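The difference between the two literal types can be sketched with the example value used above. The session time zone is set to UTC here purely so that the stored UTC value is visible directly, and ISO output style is assumed.

```sql
SET timezone = 'UTC';

-- Typed as timestamp without time zone: the +02 is silently ignored.
SELECT TIMESTAMP '2004-10-19 10:23:54+02';
-- 2004-10-19 10:23:54

-- Typed as timestamp with time zone: the +02 offset is applied on input.
SELECT TIMESTAMP WITH TIME ZONE '2004-10-19 10:23:54+02';
-- 2004-10-19 08:23:54+00
```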
When a timestamp with time zone value is output, it is always converted from UTC to the current timezone zone, and displayed as local time in that zone. To see the time in another time zone, either change timezone or use the AT TIME ZONE construct (see the section on AT TIME ZONE).

Conversions between timestamp without time zone and timestamp with time zone normally assume that the timestamp without time zone value should be taken or given as timezone local time. A different zone reference can be specified for the conversion using AT TIME ZONE.

Intervals

interval values can be written with the following syntax:

    [@] quantity unit [quantity unit...] [direction]

Where: quantity is a number (possibly signed); unit is second, minute, hour, day, week, month, year, decade, century, millennium, or abbreviations or plurals of these units; direction can be ago or empty. The at sign (@) is optional noise. The amounts of different units are implicitly added up with appropriate sign accounting.

Quantities of days, hours, minutes, and seconds can be specified without explicit unit markings. For example, '1 12:59:10' is read the same as '1 day 12 hours 59 min 10 sec'.

The optional subsecond precision p should be between 0 and 6, and defaults to the precision of the input literal.

Internally interval values are stored as months, days, and seconds. This is done because the number of days in a month varies, and a day can have 23 or 25 hours if a daylight savings time adjustment is involved. Because intervals are usually created from constant strings or timestamp subtraction, this storage method works well in most cases. Functions justify_days and justify_hours are available for adjusting days and hours that overflow their normal periods.

Special Values

PostgreSQL supports several special date/time input values for convenience, as shown in the table below.
The values infinity and -infinity are specially represented inside the system and will be displayed the same way; but the others are simply notational shorthands that will be converted to ordinary date/time values when read. (In particular, now and related strings are converted to a specific time value as soon as they are read.) All of these values need to be written in single quotes when used as constants in SQL commands.

Special Date/Time Inputs

 Input String | Valid Types           | Description
--------------+-----------------------+------------------------------------------------
 epoch        | date, timestamp       | 1970-01-01 00:00:00+00 (Unix system time zero)
 infinity     | timestamp             | later than all other time stamps
 -infinity    | timestamp             | earlier than all other time stamps
 now          | date, time, timestamp | current transaction's start time
 today        | date, timestamp       | midnight today
 tomorrow     | date, timestamp       | midnight tomorrow
 yesterday    | date, timestamp       | midnight yesterday
 allballs     | time                  | 00:00:00.00 UTC

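The idea that most special input strings are shorthands resolved once, at read time, can be illustrated with a small Python model. The function name `resolve_special` and its `now` parameter (standing in for the current transaction's start time) are hypothetical; infinity and -infinity are deliberately left out because they are represented specially rather than converted.

```python
from datetime import datetime, timedelta, timezone


def resolve_special(text: str, now: datetime) -> datetime:
    """Convert a shorthand input string to an ordinary timestamp at read time."""
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    shorthands = {
        "epoch": datetime(1970, 1, 1, tzinfo=timezone.utc),
        "now": now,
        "today": midnight,
        "tomorrow": midnight + timedelta(days=1),
        "yesterday": midnight - timedelta(days=1),
    }
    # Unknown strings (including infinity/-infinity) raise KeyError in this model.
    return shorthands[text]


# Hypothetical transaction start time used for the demonstration.
transaction_start = datetime(2004, 10, 19, 10, 23, 54, tzinfo=timezone.utc)
today = resolve_special("today", transaction_start)
tomorrow = resolve_special("tomorrow", transaction_start)
```

Because the result is an ordinary value, reading the same shorthand later with a different `now` gives a different timestamp; the stored value itself does not change afterwards.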
The following SQL-compatible functions can also be used to obtain the current time value for the corresponding data type: CURRENT_DATE, CURRENT_TIME, CURRENT_TIMESTAMP, LOCALTIME, LOCALTIMESTAMP. The latter four accept an optional subsecond precision specification. (See .) Note however that these are SQL functions and are not recognized as data input strings.

Date/Time Output

The output format of the date/time types can be set to one of the four styles ISO 8601, SQL (Ingres), traditional POSTGRES, and German, using the command SET datestyle. The default is the ISO format. (The SQL standard requires the use of the ISO 8601 format. The name of the SQL output format is a historical accident.) The table below shows examples of each output style. The output of the date and time types is of course only the date or time part in accordance with the given examples.

Date/Time Output Styles

 Style Specification | Description           | Example
---------------------+-----------------------+------------------------------
 ISO                 | ISO 8601/SQL standard | 1997-12-17 07:37:16-08
 SQL                 | traditional style     | 12/17/1997 07:37:16.00 PST
 POSTGRES            | original style        | Wed Dec 17 07:37:16 1997 PST
 German              | regional style        | 17.12.1997 07:37:16.00 PST

In the SQL and POSTGRES styles, day appears before month if DMY field ordering has been specified, otherwise month appears before day. (See for how this setting also affects interpretation of input values.) The table below shows an example.

Date Order Conventions

 datestyle Setting | Input Ordering | Example Output
-------------------+----------------+------------------------------
 SQL, DMY          | day/month/year | 17/12/1997 15:37:16.00 CET
 SQL, MDY          | month/day/year | 12/17/1997 07:37:16.00 PST
 Postgres, DMY     | day/month/year | Wed 17 Dec 07:37:16 1997 PST

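As a rough illustration of how the style and the field ordering combine, the following Python sketch formats one value in each of the four styles. It is an approximation for explanation only: the function name `format_timestamp` and its parameters are assumptions, and the sketch omits the fractional seconds and time zone suffix that appear in the examples above.

```python
from datetime import datetime

DAYS = ("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def format_timestamp(value: datetime, style: str = "ISO", order: str = "MDY") -> str:
    """Render a timestamp in one of the four output styles (zone omitted)."""
    clock = f"{value:%H:%M:%S}"
    day = f"{value.day:02d}"
    month = f"{value.month:02d}"
    year = f"{value.year:04d}"
    if style == "ISO":
        return f"{year}-{month}-{day} {clock}"
    if style == "German":
        return f"{day}.{month}.{year} {clock}"
    if style == "SQL":
        # Only the SQL and POSTGRES styles are affected by DMY/MDY ordering.
        first, second = (day, month) if order == "DMY" else (month, day)
        return f"{first}/{second}/{year} {clock}"
    if style == "POSTGRES":
        weekday = DAYS[value.weekday()]
        name = MONTHS[value.month - 1]
        date_part = f"{day} {name}" if order == "DMY" else f"{name} {day}"
        return f"{weekday} {date_part} {clock} {year}"
    raise ValueError(f"unknown style: {style}")


sample = datetime(1997, 12, 17, 7, 37, 16)
iso_text = format_timestamp(sample, "ISO")
sql_dmy_text = format_timestamp(sample, "SQL", "DMY")
postgres_text = format_timestamp(sample, "POSTGRES", "MDY")
```

Note that the ISO and German branches ignore the `order` argument, mirroring the statement that field ordering matters only in the SQL and POSTGRES styles.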
interval output looks like the input format, except that units like century or week are converted to years and days and ago is converted to an appropriate sign. In ISO mode the output looks like

    quantity unit ... days hours:minutes:seconds

The date/time styles can be selected by the user using the SET datestyle command, the datestyle parameter in the postgresql.conf configuration file, or the PGDATESTYLE environment variable on the server or client. The formatting function to_char (see ) is also available as a more flexible way to format the date/time output.

Time Zones

Time zones, and time-zone conventions, are influenced by political decisions, not just earth geometry. Time zones around the world became somewhat standardized during the 1900s, but continue to be prone to arbitrary changes, particularly with respect to daylight-savings rules. PostgreSQL currently supports daylight-savings rules over the time period 1902 through 2038 (corresponding to the full range of conventional Unix system time). Times outside that range are taken to be in standard time for the selected time zone, no matter what part of the year they fall in.

PostgreSQL endeavors to be compatible with the SQL standard definitions for typical usage. However, the SQL standard has an odd mix of date and time types and capabilities. Two obvious problems are:

Although the date type does not have an associated time zone, the time type can. Time zones in the real world have little meaning unless associated with a date as well as a time, since the offset may vary through the year with daylight-saving time boundaries.

The default time zone is specified as a constant numeric offset from UTC. It is therefore not possible to adapt to daylight-saving time when doing date/time arithmetic across DST boundaries.

To address these difficulties, we recommend using date/time types that contain both date and time when using time zones.
We recommend not using the type time with time zone (though it is supported by PostgreSQL for legacy applications and for compliance with the SQL standard). PostgreSQL assumes your local time zone for any type containing only date or time. All timezone-aware dates and times are stored internally in UTC. They are converted to local time in the zone specified by the configuration parameter before being displayed to the client. PostgreSQL allows you to specify time zones in three different forms: A full time zone name, for example America/New_York. The recognized time zone names are listed in the pg_timezone_names view (see ). PostgreSQL uses the widely-used zic time zone data for this purpose, so the same names are also recognized by much other software. A time zone abbreviation, for example PST. Such a specification merely defines a particular offset from UTC, in contrast to full time zone names which may imply a set of daylight savings transition-date rules as well. The recognized abbreviations are listed in the pg_timezone_abbrevs view (see ). You cannot set the configuration parameter using a time zone abbreviation, but you can use abbreviations in date/time input values and with the AT TIME ZONE operator. In addition to the timezone names and abbreviations, PostgreSQL will accept POSIX-style time zone specifications of the form STDoffset or STDoffsetDST, where STD is a zone abbreviation, offset is a numeric offset in hours west from UTC, and DST is an optional daylight-savings zone abbreviation, assumed to stand for one hour ahead of the given offset. For example, if EST5EDT were not already a recognized zone name, it would be accepted and would be functionally equivalent to USA East Coast time. When a daylight-savings zone name is present, it is assumed to be used according to USA time zone rules, so this feature is of limited use outside North America. 
One should also be wary that this provision can lead to silently accepting bogus input, since there is no check on the reasonableness of the zone abbreviations. For example, SET TIMEZONE TO FOOBAR0 will work, leaving the system effectively using a rather peculiar abbreviation for GMT.

There is a conceptual and practical difference between the abbreviations and the full names: abbreviations always represent a fixed offset from UTC, whereas most of the full names imply a local daylight-savings time rule and so have two possible UTC offsets.

In all cases, timezone names are recognized case-insensitively. (This is a change from PostgreSQL versions prior to 8.2, which were case-sensitive in some contexts and not others.)

Neither full names nor abbreviations are hard-wired into the server; they are obtained from configuration files stored under .../share/timezone/ and .../share/timezonesets/ of the installation directory (see ).

The timezone configuration parameter can be set in the file postgresql.conf, or in any of the other standard ways described in . There are also several special ways to set it:

If timezone is not specified in postgresql.conf nor as a server command-line option, the server attempts to use the value of the TZ environment variable as the default time zone. If TZ is not defined or is not any of the time zone names known to PostgreSQL, the server attempts to determine the operating system's default time zone by checking the behavior of the C library function localtime(). The default time zone is selected as the closest match among PostgreSQL's known time zones.

The SQL command SET TIME ZONE sets the time zone for the session. This is an alternative spelling of SET TIMEZONE TO with a more SQL-spec-compatible syntax.

The PGTZ environment variable, if set at the client, is used by libpq applications to send a SET TIME ZONE command to the server upon connection.

Internals

PostgreSQL uses Julian dates for all date/time calculations.
They have the nice property of correctly predicting/calculating any date more recent than 4713 BC to far into the future, using the assumption that the length of the year is 365.2425 days.

Date conventions before the 19th century make for interesting reading, but are not consistent enough to warrant coding into a date/time handler.

Boolean Type

PostgreSQL provides the standard SQL type boolean. boolean can have one of only two states: true or false. A third state, unknown, is represented by the SQL null value.

Valid literal values for the true state are: TRUE, 't', 'true', 'y', 'yes', '1'.

For the false state, the following values can be used: FALSE, 'f', 'false', 'n', 'no', '0'.

Using the key words TRUE and FALSE is preferred (and SQL-compliant).

Using the boolean type

    CREATE TABLE test1 (a boolean, b text);
    INSERT INTO test1 VALUES (TRUE, 'sic est');
    INSERT INTO test1 VALUES (FALSE, 'non est');
    SELECT * FROM test1;
     a |    b
    ---+---------
     t | sic est
     f | non est

    SELECT * FROM test1 WHERE a;
     a |    b
    ---+---------
     t | sic est

The example above shows that boolean values are output using the letters t and f.

boolean uses 1 byte of storage.

Geometric Types

Geometric data types represent two-dimensional spatial objects. The table below shows the geometric types available in PostgreSQL. The most fundamental type, the point, forms the basis for all of the other types.

Geometric Types

 Name    | Storage Size | Description                           | Representation
---------+--------------+---------------------------------------+-------------------------------
 point   | 16 bytes     | Point on the plane                    | (x,y)
 line    | 32 bytes     | Infinite line (not fully implemented) | ((x1,y1),(x2,y2))
 lseg    | 32 bytes     | Finite line segment                   | ((x1,y1),(x2,y2))
 box     | 32 bytes     | Rectangular box                       | ((x1,y1),(x2,y2))
 path    | 16+16n bytes | Closed path (similar to polygon)      | ((x1,y1),...)
 path    | 16+16n bytes | Open path                             | [(x1,y1),...]
 polygon | 40+16n bytes | Polygon (similar to closed path)      | ((x1,y1),...)
 circle  | 24 bytes     | Circle                                | <(x,y),r> (center and radius)

A rich set of functions and operators is available to perform various geometric operations such as scaling, translation, rotation, and determining intersections. They are explained in .

Points

Points are the fundamental two-dimensional building block for geometric types. Values of type point are specified using the following syntax:

    ( x , y )
      x , y

where x and y are the respective coordinates as floating-point numbers.

Line Segments

Line segments (lseg) are represented by pairs of points. Values of type lseg are specified using the following syntax:

    ( ( x1 , y1 ) , ( x2 , y2 ) )
      ( x1 , y1 ) , ( x2 , y2 )
        x1 , y1   ,   x2 , y2

where (x1,y1) and (x2,y2) are the end points of the line segment.

Boxes

Boxes are represented by pairs of points that are opposite corners of the box. Values of type box are specified using the following syntax:

    ( ( x1 , y1 ) , ( x2 , y2 ) )
      ( x1 , y1 ) , ( x2 , y2 )
        x1 , y1   ,   x2 , y2

where (x1,y1) and (x2,y2) are any two opposite corners of the box.

Boxes are output using the first syntax. The corners are reordered on input to store the upper right corner, then the lower left corner. Other corners of the box can be entered, but the lower left and upper right corners are determined from the input and stored.

Paths

Paths are represented by lists of connected points. Paths can be open, where the first and last points in the list are not considered connected, or closed, where the first and last points are considered connected.

Values of type path are specified using the following syntax:

    ( ( x1 , y1 ) , ... , ( xn , yn ) )
    [ ( x1 , y1 ) , ... , ( xn , yn ) ]
      ( x1 , y1 ) , ... , ( xn , yn )
      ( x1 , y1   , ... ,   xn , yn )
        x1 , y1   , ... ,   xn , yn

where the points are the end points of the line segments comprising the path. Square brackets ([]) indicate an open path, while parentheses (()) indicate a closed path.

Paths are output using the first syntax.
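The difference between open and closed paths can be made concrete with a short Python sketch that computes the total length of a path. It models only the definition given above (a closed path has one extra segment joining the last point back to the first); the function name `path_length` is hypothetical and the code makes no claim about how PostgreSQL implements its geometric operators.

```python
from math import hypot


def path_length(points, closed):
    """Sum the segment lengths of a path given as a list of (x, y) tuples."""
    # Consecutive points are always connected.
    segments = list(zip(points, points[1:]))
    if closed and len(points) > 1:
        # A closed path also connects the last point back to the first.
        segments.append((points[-1], points[0]))
    return sum(hypot(x2 - x1, y2 - y1) for (x1, y1), (x2, y2) in segments)


# Four corners of a unit square, written as a list of points.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
open_length = path_length(square, closed=False)    # three sides
closed_length = path_length(square, closed=True)   # all four sides
```

The same list of points therefore describes two different objects depending on whether it is written with square brackets or with parentheses.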
Polygons

Polygons are represented by lists of points (the vertexes of the polygon). Polygons should probably be considered equivalent to closed paths, but are stored differently and have their own set of support routines.

Values of type polygon are specified using the following syntax:

    ( ( x1 , y1 ) , ... , ( xn , yn ) )
      ( x1 , y1 ) , ... , ( xn , yn )
      ( x1 , y1   , ... ,   xn , yn )
        x1 , y1   , ... ,   xn , yn

where the points are the end points of the line segments comprising the boundary of the polygon.

Polygons are output using the first syntax.

Circles

Circles are represented by a center point and a radius. Values of type circle are specified using the following syntax:

    < ( x , y ) , r >
    ( ( x , y ) , r )
      ( x , y ) , r
        x , y   , r

where (x,y) is the center and r is the radius of the circle.

Circles are output using the first syntax.

Network Address Types

PostgreSQL offers data types to store IPv4, IPv6, and MAC addresses, as shown in the table below. It is preferable to use these types instead of plain text types to store network addresses, because these types offer input error checking and several specialized operators and functions (see ).

Network Address Types

 Name    | Storage Size   | Description
---------+----------------+----------------------------------
 cidr    | 12 or 24 bytes | IPv4 and IPv6 networks
 inet    | 12 or 24 bytes | IPv4 and IPv6 hosts and networks
 macaddr | 6 bytes        | MAC addresses

When sorting inet or cidr data types, IPv4 addresses will always sort before IPv6 addresses, including IPv4 addresses encapsulated or mapped into IPv6 addresses, such as ::10.2.3.4 or ::ffff:10.4.3.2.

inet

The inet type holds an IPv4 or IPv6 host address, and optionally the identity of the subnet it is in, all in one field. The subnet identity is represented by stating how many bits of the host address represent the network address (the netmask). If the netmask is 32 and the address is IPv4, then the value does not indicate a subnet, only a single host. In IPv6, the address length is 128 bits, so 128 bits specify a unique host address. Note that if you want to accept networks only, you should use the cidr type rather than inet.

The input format for this type is address/y where address is an IPv4 or IPv6 address and y is the number of bits in the netmask. If the /y part is left off, then the netmask is 32 for IPv4 and 128 for IPv6, so the value represents just a single host. On display, the /y portion is suppressed if the netmask specifies a single host.

cidr

The cidr type holds an IPv4 or IPv6 network specification. Input and output formats follow Classless Inter-Domain Routing conventions. The format for specifying networks is address/y where address is the network represented as an IPv4 or IPv6 address, and y is the number of bits in the netmask. If y is omitted, it is calculated using assumptions from the older classful network numbering system, except that it will be at least large enough to include all of the octets written in the input. It is an error to specify a network address that has bits set to the right of the specified netmask.

The table below shows some examples.
cidr Type Input Examples

 cidr Input                           | cidr Output                          | abbrev(cidr)
--------------------------------------+--------------------------------------+----------------------------------
 192.168.100.128/25                   | 192.168.100.128/25                   | 192.168.100.128/25
 192.168/24                           | 192.168.0.0/24                       | 192.168.0/24
 192.168/25                           | 192.168.0.0/25                       | 192.168.0.0/25
 192.168.1                            | 192.168.1.0/24                       | 192.168.1/24
 192.168                              | 192.168.0.0/24                       | 192.168.0/24
 128.1                                | 128.1.0.0/16                         | 128.1/16
 128                                  | 128.0.0.0/16                         | 128.0/16
 128.1.2                              | 128.1.2.0/24                         | 128.1.2/24
 10.1.2                               | 10.1.2.0/24                          | 10.1.2/24
 10.1                                 | 10.1.0.0/16                          | 10.1/16
 10                                   | 10.0.0.0/8                           | 10/8
 10.1.2.3/32                          | 10.1.2.3/32                          | 10.1.2.3/32
 2001:4f8:3:ba::/64                   | 2001:4f8:3:ba::/64                   | 2001:4f8:3:ba::/64
 2001:4f8:3:ba:2e0:81ff:fe22:d1f1/128 | 2001:4f8:3:ba:2e0:81ff:fe22:d1f1/128 | 2001:4f8:3:ba:2e0:81ff:fe22:d1f1
 ::ffff:1.2.3.0/120                   | ::ffff:1.2.3.0/120                   | ::ffff:1.2.3/120
 ::ffff:1.2.3.0/128                   | ::ffff:1.2.3.0/128                   | ::ffff:1.2.3.0/128

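The rule that a cidr value may not have bits set to the right of the netmask, while inet accepts such values, has a close analogue in Python's standard ipaddress module. The sketch below uses that module purely as an illustration; the helper names `accepts_as_cidr` and `accepts_as_inet` are hypothetical, and the module's parsing is not identical to PostgreSQL's.

```python
import ipaddress


def accepts_as_cidr(text: str) -> bool:
    """True if text is a network with no host bits set (cidr-like rule)."""
    try:
        # strict=True rejects addresses with bits set right of the netmask.
        ipaddress.ip_network(text, strict=True)
        return True
    except ValueError:
        return False


def accepts_as_inet(text: str) -> bool:
    """True if text is a host address with an optional netmask (inet-like rule)."""
    try:
        ipaddress.ip_interface(text)
        return True
    except ValueError:
        return False


network_ok = accepts_as_cidr("192.168.100.128/25")   # no host bits set
host_bits_set = accepts_as_cidr("192.168.0.1/24")    # the final .1 lies outside the /24
host_with_mask = accepts_as_inet("192.168.0.1/24")   # fine for a host-in-subnet value
```

One difference worth keeping in mind: the ipaddress module requires all four octets of an IPv4 address, so abbreviated inputs from the table above, such as 192.168/24, are not accepted by this sketch.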
inet vs. cidr

The essential difference between inet and cidr data types is that inet accepts values with nonzero bits to the right of the netmask, whereas cidr does not.

If you do not like the output format for inet or cidr values, try the functions host, text, and abbrev.

macaddr

The macaddr type stores MAC addresses, i.e., Ethernet card hardware addresses (although MAC addresses are used for other purposes as well). Input is accepted in various customary formats, including

    '08002b:010203'
    '08002b-010203'
    '0800.2b01.0203'
    '08-00-2b-01-02-03'
    '08:00:2b:01:02:03'

which would all specify the same address. Upper and lower case is accepted for the digits a through f. Output is always in the last of the forms shown.

Bit String Types

Bit strings are strings of 1's and 0's. They can be used to store or visualize bit masks. There are two SQL bit types: bit(n) and bit varying(n), where n is a positive integer.

bit type data must match the length n exactly; it is an error to attempt to store shorter or longer bit strings. bit varying data is of variable length up to the maximum length n; longer strings will be rejected. Writing bit without a length is equivalent to bit(1), while bit varying without a length specification means unlimited length.

If one explicitly casts a bit-string value to bit(n), it will be truncated or zero-padded on the right to be exactly n bits, without raising an error.
Similarly, if one explicitly casts a bit-string value to bit varying(n), it will be truncated on the right if it is more than n bits.

Refer to for information about the syntax of bit string constants. Bit-logical operators and string manipulation functions are available; see .

Using the bit string types

    CREATE TABLE test (a BIT(3), b BIT VARYING(5));
    INSERT INTO test VALUES (B'101', B'00');
    INSERT INTO test VALUES (B'10', B'101');
    ERROR:  bit string length 2 does not match type bit(3)
    INSERT INTO test VALUES (B'10'::bit(3), B'101');
    SELECT * FROM test;
      a  |  b
    -----+-----
     101 | 00
     100 | 101

Object Identifier Types

Object identifiers (OIDs) are used internally by PostgreSQL as primary keys for various system tables. OIDs are not added to user-created tables, unless WITH OIDS is specified when the table is created, or the configuration variable is enabled. Type oid represents an object identifier. There are also several alias types for oid: regproc, regprocedure, regoper, regoperator, regclass, and regtype. The Object Identifier Types table below shows an overview.

The oid type is currently implemented as an unsigned four-byte integer. Therefore, it is not large enough to provide database-wide uniqueness in large databases, or even in large individual tables. So, using a user-created table's OID column as a primary key is discouraged. OIDs are best used only for references to system tables.

The oid type itself has few operations beyond comparison. It can be cast to integer, however, and then manipulated using the standard integer operators. (Beware of possible signed-versus-unsigned confusion if you do this.)

The OID alias types have no operations of their own except for specialized input and output routines. These routines are able to accept and display symbolic names for system objects, rather than the raw numeric value that type oid would use.
The alias types allow simplified lookup of OID values for objects. For example, to examine the pg_attribute rows related to a table mytable, one could write

    SELECT * FROM pg_attribute WHERE attrelid = 'mytable'::regclass;

rather than

    SELECT * FROM pg_attribute
      WHERE attrelid = (SELECT oid FROM pg_class WHERE relname = 'mytable');

While that doesn't look all that bad by itself, it's still oversimplified. A far more complicated sub-select would be needed to select the right OID if there are multiple tables named mytable in different schemas. The regclass input converter handles the table lookup according to the schema path setting, and so it does the right thing automatically. Similarly, casting a table's OID to regclass is handy for symbolic display of a numeric OID.

Object Identifier Types

 Name         | References  | Description                  | Value Example
--------------+-------------+------------------------------+---------------------------------------
 oid          | any         | numeric object identifier    | 564182
 regproc      | pg_proc     | function name                | sum
 regprocedure | pg_proc     | function with argument types | sum(int4)
 regoper      | pg_operator | operator name                | +
 regoperator  | pg_operator | operator with argument types | *(integer,integer) or -(NONE,integer)
 regclass     | pg_class    | relation name                | pg_type
 regtype      | pg_type     | data type name               | integer

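The earlier caution about signed-versus-unsigned confusion when an oid is handled as an integer can be illustrated with plain arithmetic. The Python sketch below assumes that the conversion keeps the same four bytes and merely reinterprets them as a signed value; that assumption, and the function name `oid_as_signed_int`, are for illustration only and are not taken from the text above.

```python
import struct


def oid_as_signed_int(oid: int) -> int:
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    if not 0 <= oid <= 0xFFFFFFFF:
        raise ValueError("oid must fit in an unsigned four-byte integer")
    # Pack as unsigned, unpack the same four bytes as signed
    # (assumed model of the conversion, not verified server behavior).
    return struct.unpack("<i", struct.pack("<I", oid))[0]


small = oid_as_signed_int(564182)        # unchanged: below 2**31
boundary = oid_as_signed_int(2**31)      # first value that turns negative
largest = oid_as_signed_int(2**32 - 1)   # all bits set
```

Under this model, values below 2**31 are unaffected, while larger ones appear as negative integers, which is why comparisons and arithmetic on the converted value need care.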
All of the OID alias types accept schema-qualified names, and will display schema-qualified names on output if the object would not be found in the current search path without being qualified. The regproc and regoper alias types will only accept input names that are unique (not overloaded), so they are of limited use; for most uses regprocedure or regoperator is more appropriate. For regoperator, unary operators are identified by writing NONE for the unused operand.

An additional property of the OID alias types is that if a constant of one of these types appears in a stored expression (such as a column default expression or view), it creates a dependency on the referenced object. For example, if a column has a default expression nextval('my_seq'::regclass), PostgreSQL understands that the default expression depends on the sequence my_seq; the system will not let the sequence be dropped without first removing the default expression.

Another identifier type used by the system is xid, or transaction (abbreviated xact) identifier. This is the data type of the system columns xmin and xmax. Transaction identifiers are 32-bit quantities.

A third identifier type used by the system is cid, or command identifier. This is the data type of the system columns cmin and cmax. Command identifiers are also 32-bit quantities.

A final identifier type used by the system is tid, or tuple identifier (row identifier). This is the data type of the system column ctid. A tuple ID is a pair (block number, tuple index within block) that identifies the physical location of the row within its table.

(The system columns are further explained in .)

Pseudo-Types

The PostgreSQL type system contains a number of special-purpose entries that are collectively called pseudo-types. A pseudo-type cannot be used as a column data type, but it can be used to declare a function's argument or result type.
Each of the available pseudo-types is useful in situations where a function's behavior does not correspond to simply taking or returning a value of a specific SQL data type. The table below lists the existing pseudo-types.

Pseudo-Types

 Name             | Description
------------------+----------------------------------------------------------------------------
 any              | Indicates that a function accepts any input data type whatever.
 anyarray         | Indicates that a function accepts any array data type (see ).
 anyelement       | Indicates that a function accepts any data type (see ).
 cstring          | Indicates that a function accepts or returns a null-terminated C string.
 internal         | Indicates that a function accepts or returns a server-internal data type.
 language_handler | A procedural language call handler is declared to return language_handler.
 record           | Identifies a function returning an unspecified row type.
 trigger          | A trigger function is declared to return trigger.
 void             | Indicates that a function returns no value.
 opaque           | An obsolete type name that formerly served all the above purposes.

Functions coded in C (whether built-in or dynamically loaded) may be declared to accept or return any of these pseudo data types. It is up to the function author to ensure that the function will behave safely when a pseudo-type is used as an argument type.

Functions coded in procedural languages may use pseudo-types only as allowed by their implementation languages. At present the procedural languages all forbid use of a pseudo-type as argument type, and allow only void and record as a result type (plus trigger when the function is used as a trigger). Some also support polymorphic functions using the types anyarray and anyelement.

The internal pseudo-type is used to declare functions that are meant only to be called internally by the database system, and not by direct invocation in a SQL query. If a function has at least one internal-type argument then it cannot be called from SQL. To preserve the type safety of this restriction it is important to follow this coding rule: do not create any function that is declared to return internal unless it has at least one internal argument.

XML Type

The data type xml can be used to store XML data. Its advantage over storing XML data in a text field is that it checks the input values for well-formedness, and there are support functions to perform type-safe operations on it; see .

In particular, the xml type can store well-formed documents, as defined by the XML standard, as well as content fragments, which are defined by the production XMLDecl? content in the XML standard. Roughly, this means that content fragments can have more than one top-level element or character node. The expression xmlvalue IS DOCUMENT can be used to evaluate whether a particular xml value is a full document or only a content fragment.
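The distinction between a full document and a content fragment can be approximated in Python with the standard xml.etree.ElementTree parser. This is a rough sketch of the idea only: the helper names and the arbitrary wrapper element are assumptions, and the check does not reproduce the exact rules applied by the xml type (for example, a fragment that begins with an XML declaration would be rejected by this wrapper trick).

```python
import xml.etree.ElementTree as ET


def is_document(text: str) -> bool:
    """True if text parses as a well-formed document with a single root element."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def is_content(text: str) -> bool:
    """True if text is well-formed once placed inside a single element."""
    try:
        # The wrapper name is arbitrary; it only supplies the missing root.
        ET.fromstring("<wrapper>" + text + "</wrapper>")
        return True
    except ET.ParseError:
        return False


fragment = "abc<foo>bar</foo><bar>foo</bar>"
fragment_is_document = is_document(fragment)  # text and two top-level elements
fragment_is_content = is_content(fragment)    # acceptable as a fragment
```

A value with exactly one top-level element satisfies both checks, while text with several top-level nodes satisfies only the second.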
To produce a value of type xml from character data, use the function xmlparse:

    XMLPARSE ( { DOCUMENT | CONTENT } value)

Examples:

    XMLPARSE (DOCUMENT '<?xml version="1.0"?><book><title>Manual</title><chapter>...</chapter></book>')
    XMLPARSE (CONTENT 'abc<foo>bar</foo><bar>foo</bar>')

While this is the only way to convert character strings into XML values according to the SQL standard, the PostgreSQL-specific syntaxes

    xml '<foo>bar</foo>'
    '<foo>bar</foo>'::xml

can also be used.

The xml type does not validate its input values against a possibly included document type declaration (DTD).

The inverse operation, producing character string type values from xml, uses the function xmlserialize:

    XMLSERIALIZE ( { DOCUMENT | CONTENT } value AS type )

type can be one of character, character varying, or text (or an alias name for those). Again, according to the SQL standard, this is the only way to convert between type xml and character types, but PostgreSQL also allows you to simply cast the value.

Care must be taken when dealing with multiple character encodings on the client, server, and in the XML data passed through them. When using the text mode to pass queries to the server and query results to the client (which is the normal mode), PostgreSQL converts all character data passed between the client and the server and vice versa to the character encoding of the respective end; see . This includes string representations of XML values, such as in the above examples. This would ordinarily mean that encoding declarations contained in XML data might become invalid as the character data is converted to other encodings while travelling between client and server, while the embedded encoding declaration is not changed. To cope with this behavior, an encoding declaration contained in a character string presented for input to the xml type is ignored, and the content is always assumed to be in the current server encoding. Consequently, for correct processing, such character strings of XML data must be sent off from the client in the current client encoding.
It is the responsibility of the client to either convert the document to the current client encoding before sending it off to the server or to adjust the client encoding appropriately. On output, values of type xml will not have an encoding declaration, and clients must assume that the data is in the current client encoding.

When using the binary mode to pass query parameters to the server and query results back to the client, no character set conversion is performed, so the situation is different. In this case, an encoding declaration in the XML data will be observed, and if it is absent, the data will be assumed to be in UTF-8 (as required by the XML standard; note that PostgreSQL does not support UTF-16 at all). On output, data will have an encoding declaration specifying the client encoding, unless the client encoding is UTF-8, in which case it will be omitted.

Needless to say, processing XML data with PostgreSQL will be less error-prone and more efficient if data encoding, client encoding, and server encoding are the same. Since XML data is internally processed in UTF-8, computations will be most efficient if the server encoding is also UTF-8.

XML (Extensible Markup Language) support is not just the existence of an xml data type, but a variety of features supported by a database system. These capabilities include import/export, indexing, searching, transforming, and XML to SQL mapping. PostgreSQL supports some but not all of these XML capabilities. For an overview of XML use in databases, see .

Import/Export

There is no facility for mapping XML to relational tables. An external tool must be used for this. One simple way to export XML is to use psql in HTML mode (\pset format html), and convert the XHTML output to XML using an external tool.

Indexing

/contrib/xml2 functions can be used in expression indexes to index specific XML fields. To index the full contents of XML documents, the full-text indexing tool /contrib/tsearch2 can be used.
Of course, Tsearch2 indexes have no XML awareness so additional /contrib/xml2 checks should be added to queries.

Searching

XPath searches are implemented using /contrib/xml2. It processes XML text documents and returns results based on the requested query.

Transforming

/contrib/xml2 supports XSLT (Extensible Stylesheet Language Transformation).

XML to SQL Mapping

This involves converting XML data to and from relational structures. PostgreSQL has no internal support for such mapping, and relies on external tools to do such conversions.