Functions and Operators

Functions and Operators function operator PostgreSQL provides a large number of functions and operators for the built-in data types. Users can also define their own functions and operators, as described in . The psql commands \df and \do can be used to show the list of all actually available functions and operators, respectively. If you are concerned about portability then take note that most of the functions and operators described in this chapter, with the exception of the most trivial arithmetic and comparison operators and some explicitly marked functions, are not specified by the SQL standard. Some of the extended functionality is present in other SQL database management systems, and in many cases this functionality is compatible and consistent between the various implementations. This chapter is also not exhaustive; additional functions appear in relevant sections of the manual. Logical Operators operator logical Boolean operators operators, logical The usual logical operators are available: AND (operator) OR (operator) NOT (operator) conjunction disjunction negation AND OR NOT SQL uses a three-valued Boolean logic where the null value represents unknown. Observe the following truth tables: a b a AND b a OR b TRUE TRUE TRUE TRUE TRUE FALSE FALSE TRUE TRUE NULL NULL TRUE FALSE FALSE FALSE FALSE FALSE NULL FALSE NULL NULL NULL NULL NULL a NOT a TRUE FALSE FALSE TRUE NULL NULL The operators AND and OR are commutative, that is, you can switch the left and right operand without affecting the result. But see for more information about the order of evaluation of subexpressions. Comparison Operators comparison operators The usual comparison operators are available, shown in . Comparison Operators Operator Description < less than > greater than <= less than or equal to >= greater than or equal to = equal <> or != not equal

The != operator is converted to <> in the parser stage. It is not possible to implement != and <> operators that do different things. Comparison operators are available for all data types where this makes sense. All comparison operators are binary operators that return values of type boolean; expressions like 1 < 2 < 3 are not valid (because there is no < operator to compare a Boolean value with 3). BETWEEN In addition to the comparison operators, the special BETWEEN construct is available. a BETWEEN x AND y is equivalent to a >= x AND a <= y Similarly, a NOT BETWEEN x AND y is equivalent to a < x OR a > y There is no difference between the two respective forms apart from the CPU cycles required to rewrite the first one into the second one internally. BETWEEN SYMETRIC BETWEEN SYMMETRIC is the same as BETWEEN except there is no requirement that the argument to the left of AND be less than or equal to the argument on the right; the proper range is automatically determined. IS NULL IS NOT NULL ISNULL NOTNULL To check whether a value is or is not null, use the constructs expression IS NULL expression IS NOT NULL or the equivalent, but nonstandard, constructs expression ISNULL expression NOTNULL null valuecomparing Do not write expression = NULL because NULL is not equal to NULL. (The null value represents an unknown value, and it is not known whether two unknown values are equal.) This behavior conforms to the SQL standard. Some applications might expect that expression = NULL returns true if expression evaluates to the null value. It is highly recommended that these applications be modified to comply with the SQL standard. However, if that cannot be done the configuration variable is available. If it is enabled, PostgreSQL will convert x = NULL clauses to x IS NULL. This was the default behavior in PostgreSQL releases 6.5 through 7.1. If the expression is row-valued, then IS NULL is true when the row expression itself is null or when all the row's fields are null, while IS NOT NULL is true when the row expression itself is non-null and all the row's fields are non-null. This definition conforms to the SQL standard, and is a change from the inconsistent behavior exhibited by PostgreSQL versions prior to 8.2. IS DISTINCT FROM IS NOT DISTINCT FROM The ordinary comparison operators yield null (signifying unknown) when either input is null. Another way to do comparisons is with the IS NOT DISTINCT FROM construct: expression IS DISTINCT FROM expression expression IS NOT DISTINCT FROM expression For non-null inputs, IS DISTINCT FROM is the same as the <> operator. However, when both inputs are null it will return false, and when just one input is null it will return true. Similarly, IS NOT DISTINCT FROM is identical to = for non-null inputs, but it returns true when both inputs are null, and false when only one input is null. Thus, these constructs effectively act as though null were a normal data value, rather than unknown. IS TRUE IS NOT TRUE IS FALSE IS NOT FALSE IS UNKNOWN IS NOT UNKNOWN Boolean values can also be tested using the constructs expression IS TRUE expression IS NOT TRUE expression IS FALSE expression IS NOT FALSE expression IS UNKNOWN expression IS NOT UNKNOWN These will always return true or false, never a null value, even when the operand is null. A null input is treated as the logical value unknown. Notice that IS UNKNOWN and IS NOT UNKNOWN are effectively the same as IS NULL and IS NOT NULL, respectively, except that the input expression must be of Boolean type. Mathematical Functions and Operators Mathematical operators are provided for many PostgreSQL types. For types without common mathematical conventions for all possible permutations (e.g., date/time types) we describe the actual behavior in subsequent sections. shows the available mathematical operators. Mathematical Operators Operator Description Example Result + addition 2 + 3 5 - subtraction 2 - 3 -1 * multiplication 2 * 3 6 / division (integer division truncates results) 4 / 2 2 % modulo (remainder) 5 % 4 1 ^ exponentiation 2.0 ^ 3.0 8 |/ square root |/ 25.0 5 ||/ cube root ||/ 27.0 3 ! factorial 5 ! 120 !! factorial (prefix operator) !! 5 120 @ absolute value @ -5.0 5 & bitwise AND 91 & 15 11 | bitwise OR 32 | 3 35 # bitwise XOR 17 # 5 20 ~ bitwise NOT ~1 -2 << bitwise shift left 1 << 4 16 >> bitwise shift right 8 >> 2 2

The bitwise operators work only on integral data types, whereas the others are available for all numeric data types. The bitwise operators are also available for the bit string types bit and bit varying, as shown in . shows the available mathematical functions. In the table, dp indicates double precision. Many of these functions are provided in multiple forms with different argument types. Except where noted, any given form of a function returns the same data type as its argument. The functions working with double precision data are mostly implemented on top of the host system's C library; accuracy and behavior in boundary cases can therefore vary depending on the host system. abs cbrt ceiling degrees exp floor ln log mod π power radians random round setseed sign sqrt trunc width_bucket Mathematical Functions Function Return Type Description Example Result abs(x) (same as x) absolute value abs(-17.4) 17.4 cbrt(dp) dp cube root cbrt(27.0) 3 ceil(dp or numeric) (same as input) smallest integer not less than argument ceil(-42.8) -42 ceiling(dp or numeric) (same as input) smallest integer not less than argument (alias for ceil) ceiling(-95.3) -95 degrees(dp) dp radians to degrees degrees(0.5) 28.6478897565412 exp(dp or numeric) (same as input) exponential exp(1.0) 2.71828182845905 floor(dp or numeric) (same as input) largest integer not greater than argument floor(-42.8) -43 ln(dp or numeric) (same as input) natural logarithm ln(2.0) 0.693147180559945 log(dp or numeric) (same as input) base 10 logarithm log(100.0) 2 log(b numeric, x numeric) numeric logarithm to base b log(2.0, 64.0) 6.0000000000 mod(y, x) (same as argument types) remainder of y/x mod(9,4) 1 pi() dp π constant pi() 3.14159265358979 power(a dp, b dp) dp a raised to the power of b power(9.0, 3.0) 729 power(a numeric, b numeric) numeric a raised to the power of b power(9.0, 3.0) 729 radians(dp) dp degrees to radians radians(45.0) 0.785398163397448 random() dp random value between 0.0 and 1.0 random() round(dp or numeric) (same as input) round to nearest integer round(42.4) 42 round(v numeric, s int) numeric round to s decimal places round(42.4382, 2) 42.44 setseed(dp) void set seed for subsequent random() calls (value between 0 and 1.0) setseed(0.54823) sign(dp or numeric) (same as input) sign of the argument (-1, 0, +1) sign(-8.4) -1 sqrt(dp or numeric) (same as input) square root sqrt(2.0) 1.4142135623731 trunc(dp or numeric) (same as input) truncate toward zero trunc(42.8) 42 trunc(v numeric, s int) numeric truncate to s decimal places trunc(42.4382, 2) 42.43 width_bucket(op numeric, b1 numeric, b2 numeric, count int) int return the bucket to which operand would be assigned in an equidepth histogram with count buckets, in the range b1 to b2 width_bucket(5.35, 0.024, 10.06, 5) 3 width_bucket(op dp, b1 dp, b2 dp, count int) int return the bucket to which operand would be assigned in an equidepth histogram with count buckets, in the range b1 to b2 width_bucket(5.35, 0.024, 10.06, 5) 3

Finally, shows the available trigonometric functions. All trigonometric functions take arguments and return values of type double precision. acos asin atan atan2 cos cot sin tan Trigonometric Functions Function Description acos(x) inverse cosine asin(x) inverse sine atan(x) inverse tangent atan2(x, y) inverse tangent of x/y cos(x) cosine cot(x) cotangent sin(x) sine tan(x) tangent

String Functions and Operators This section describes functions and operators for examining and manipulating string values. Strings in this context include values of all the types character, character varying, and text. Unless otherwise noted, all of the functions listed below work on all of these types, but be wary of potential effects of the automatic padding when using the character type. Generally, the functions described here also work on data of non-string types by converting that data to a string representation first. Some functions also exist natively for the bit-string types. SQL defines some string functions with a special syntax where certain key words rather than commas are used to separate the arguments. Details are in . These functions are also implemented using the regular syntax for function invocation. (See .) bit_length char_length convert lower octet_length overlay position substring trim upper <acronym>SQL</acronym> String Functions and Operators Function Return Type Description Example Result string || string text String concatenation character string concatenation 'Post' || 'greSQL' PostgreSQL bit_length(string) int Number of bits in string bit_length('jose') 32 char_length(string) or character_length(string) int Number of characters in string character string length length of a character string character string, length char_length('jose') 4 convert(string using conversion_name) text Change encoding using specified conversion name. Conversions can be defined by CREATE CONVERSION. Also there are some pre-defined conversion names. See for available conversion names. convert('PostgreSQL' using iso_8859_1_to_utf8) 'PostgreSQL' in UTF8 (Unicode, 8-bit) encoding lower(string) text Convert string to lower case lower('TOM') tom octet_length(string) int Number of bytes in string octet_length('jose') 4 overlay(string placing string from int for int) text Replace substring overlay('Txxxxas' placing 'hom' from 2 for 4) Thomas position(substring in string) int Location of specified substring position('om' in 'Thomas') 3 substring(string from int for int) text Extract substring substring('Thomas' from 2 for 3) hom substring(string from pattern) text Extract substring matching POSIX regular expression. See for more information on pattern matching. substring('Thomas' from '...$') mas substring(string from pattern for escape) text Extract substring matching SQL regular expression. See for more information on pattern matching. substring('Thomas' from '%#"o_a#"_' for '#') oma trim(leading | trailing | both characters from string) text Remove the longest string containing only the characters (a space by default) from the start/end/both ends of the string trim(both 'x' from 'xTomxx') Tom upper(string) text Convert string to uppercase upper('tom') TOM

Additional string manipulation functions are available and are listed in . Some of them are used internally to implement the SQL-standard string functions listed in . ascii btrim chr decode encode initcap lpad ltrim md5 pg_client_encoding quote_ident quote_literal repeat replace rpad rtrim split_part strpos substr to_ascii to_hex translate Other String Functions Function Return Type Description Example Result ascii(string) int ASCII code of the first byte of the argument ascii('x') 120 btrim(string text , characters text) text Remove the longest string consisting only of characters in characters (a space by default) from the start and end of string btrim('xyxtrimyyx', 'xy') trim chr(int) text Character with the given ASCII code chr(65) A convert(string text, src_encoding name, dest_encoding name) text Convert string to dest_encoding. The original encoding is specified by src_encoding. If src_encoding is omitted, database encoding is assumed. convert( 'text_in_utf8', 'UTF8', 'LATIN1') text_in_utf8 represented in ISO 8859-1 encoding decode(string text, type text) bytea Decode binary data from string previously encoded with encode. Parameter type is same as in encode. decode('MTIzAAE=', 'base64') 123\000\001 encode(data bytea, type text) text Encode binary data to ASCII-only representation. Supported types are: base64, hex, escape. encode( E'123\\000\\001', 'base64') MTIzAAE= initcap(string) text Convert the first letter of each word to uppercase and the rest to lowercase. Words are sequences of alphanumeric characters separated by non-alphanumeric characters. initcap('hi THOMAS') Hi Thomas length(string) int Number of characters in string length('jose') 4 lpad(string text, length int , fill text) text Fill up the string to length length by prepending the characters fill (a space by default). If the string is already longer than length then it is truncated (on the right). lpad('hi', 5, 'xy') xyxhi ltrim(string text , characters text) text Remove the longest string containing only characters from characters (a space by default) from the start of string ltrim('zzzytrim', 'xyz') trim md5(string) text Calculates the MD5 hash of string, returning the result in hexadecimal md5('abc') 900150983cd24fb0 d6963f7d28e17f72 pg_client_encoding() name Current client encoding name pg_client_encoding() SQL_ASCII quote_ident(string) text Return the given string suitably quoted to be used as an identifier in an SQL statement string. Quotes are added only if necessary (i.e., if the string contains non-identifier characters or would be case-folded). Embedded quotes are properly doubled. quote_ident('Foo bar') "Foo bar" quote_literal(string) text Return the given string suitably quoted to be used as a string literal in an SQL statement string. Embedded single-quotes and backslashes are properly doubled. quote_literal( 'O\'Reilly') 'O''Reilly' regexp_replace(string text, pattern text, replacement text [,flags text]) text Replace substring matching POSIX regular expression. See for more information on pattern matching. regexp_replace('Thomas', '.[mN]a.', 'M') ThM repeat(string text, number int) text Repeat string the specified number of times repeat('Pg', 4) PgPgPgPg replace(string text, from text, to text) text Replace all occurrences in string of substring from with substring to replace( 'abcdefabcdef', 'cd', 'XX') abXXefabXXef rpad(string text, length int , fill text) text Fill up the string to length length by appending the characters fill (a space by default). If the string is already longer than length then it is truncated. rpad('hi', 5, 'xy') hixyx rtrim(string text , characters text) text Remove the longest string containing only characters from characters (a space by default) from the end of string rtrim('trimxxxx', 'x') trim split_part(string text, delimiter text, field int) text Split string on delimiter and return the given field (counting from one) split_part('abc~@~def~@~ghi', '~@~', 2) def strpos(string, substring) int Location of specified substring (same as position(substring in string), but note the reversed argument order) strpos('high', 'ig') 2 substr(string, from , count) text Extract substring (same as substring(string from from for count)) substr('alphabet', 3, 2) ph to_ascii(string text , encoding text) text Convert string to ASCII from another encoding (only supports conversion from LATIN1, LATIN2, LATIN9, and WIN1250 encodings) to_ascii('Karel') Karel to_hex(number int or bigint) text Convert number to its equivalent hexadecimal representation to_hex(2147483647) 7fffffff translate(string text, from text, to text) text Any character in string that matches a character in the from set is replaced by the corresponding character in the to set translate('12345', '14', 'ax') a23x5

Built-in Conversions Conversion Name The conversion names follow a standard naming scheme: The official name of the source encoding with all non-alphanumeric characters replaced by underscores followed by _to_ followed by the equally processed destination encoding name. Therefore the names might deviate from the customary encoding names. Source Encoding Destination Encoding ascii_to_mic SQL_ASCII MULE_INTERNAL ascii_to_utf8 SQL_ASCII UTF8 big5_to_euc_tw BIG5 EUC_TW big5_to_mic BIG5 MULE_INTERNAL big5_to_utf8 BIG5 UTF8 euc_cn_to_mic EUC_CN MULE_INTERNAL euc_cn_to_utf8 EUC_CN UTF8 euc_jp_to_mic EUC_JP MULE_INTERNAL euc_jp_to_sjis EUC_JP SJIS euc_jp_to_utf8 EUC_JP UTF8 euc_kr_to_mic EUC_KR MULE_INTERNAL euc_kr_to_utf8 EUC_KR UTF8 euc_tw_to_big5 EUC_TW BIG5 euc_tw_to_mic EUC_TW MULE_INTERNAL euc_tw_to_utf8 EUC_TW UTF8 gb18030_to_utf8 GB18030 UTF8 gbk_to_utf8 GBK UTF8 iso_8859_10_to_utf8 LATIN6 UTF8 iso_8859_13_to_utf8 LATIN7 UTF8 iso_8859_14_to_utf8 LATIN8 UTF8 iso_8859_15_to_utf8 LATIN9 UTF8 iso_8859_16_to_utf8 LATIN10 UTF8 iso_8859_1_to_mic LATIN1 MULE_INTERNAL iso_8859_1_to_utf8 LATIN1 UTF8 iso_8859_2_to_mic LATIN2 MULE_INTERNAL iso_8859_2_to_utf8 LATIN2 UTF8 iso_8859_2_to_windows_1250 LATIN2 WIN1250 iso_8859_3_to_mic LATIN3 MULE_INTERNAL iso_8859_3_to_utf8 LATIN3 UTF8 iso_8859_4_to_mic LATIN4 MULE_INTERNAL iso_8859_4_to_utf8 LATIN4 UTF8 iso_8859_5_to_koi8_r ISO_8859_5 KOI8 iso_8859_5_to_mic ISO_8859_5 MULE_INTERNAL iso_8859_5_to_utf8 ISO_8859_5 UTF8 iso_8859_5_to_windows_1251 ISO_8859_5 WIN1251 iso_8859_5_to_windows_866 ISO_8859_5 WIN866 iso_8859_6_to_utf8 ISO_8859_6 UTF8 iso_8859_7_to_utf8 ISO_8859_7 UTF8 iso_8859_8_to_utf8 ISO_8859_8 UTF8 iso_8859_9_to_utf8 LATIN5 UTF8 johab_to_utf8 JOHAB UTF8 koi8_r_to_iso_8859_5 KOI8 ISO_8859_5 koi8_r_to_mic KOI8 MULE_INTERNAL koi8_r_to_utf8 KOI8 UTF8 koi8_r_to_windows_1251 KOI8 WIN1251 koi8_r_to_windows_866 KOI8 WIN866 mic_to_ascii MULE_INTERNAL SQL_ASCII mic_to_big5 MULE_INTERNAL BIG5 mic_to_euc_cn MULE_INTERNAL EUC_CN mic_to_euc_jp MULE_INTERNAL EUC_JP mic_to_euc_kr MULE_INTERNAL EUC_KR mic_to_euc_tw MULE_INTERNAL EUC_TW mic_to_iso_8859_1 MULE_INTERNAL LATIN1 mic_to_iso_8859_2 MULE_INTERNAL LATIN2 mic_to_iso_8859_3 MULE_INTERNAL LATIN3 mic_to_iso_8859_4 MULE_INTERNAL LATIN4 mic_to_iso_8859_5 MULE_INTERNAL ISO_8859_5 mic_to_koi8_r MULE_INTERNAL KOI8 mic_to_sjis MULE_INTERNAL SJIS mic_to_windows_1250 MULE_INTERNAL WIN1250 mic_to_windows_1251 MULE_INTERNAL WIN1251 mic_to_windows_866 MULE_INTERNAL WIN866 sjis_to_euc_jp SJIS EUC_JP sjis_to_mic SJIS MULE_INTERNAL sjis_to_utf8 SJIS UTF8 tcvn_to_utf8 WIN1258 UTF8 uhc_to_utf8 UHC UTF8 utf8_to_ascii UTF8 SQL_ASCII utf8_to_big5 UTF8 BIG5 utf8_to_euc_cn UTF8 EUC_CN utf8_to_euc_jp UTF8 EUC_JP utf8_to_euc_kr UTF8 EUC_KR utf8_to_euc_tw UTF8 EUC_TW utf8_to_gb18030 UTF8 GB18030 utf8_to_gbk UTF8 GBK utf8_to_iso_8859_1 UTF8 LATIN1 utf8_to_iso_8859_10 UTF8 LATIN6 utf8_to_iso_8859_13 UTF8 LATIN7 utf8_to_iso_8859_14 UTF8 LATIN8 utf8_to_iso_8859_15 UTF8 LATIN9 utf8_to_iso_8859_16 UTF8 LATIN10 utf8_to_iso_8859_2 UTF8 LATIN2 utf8_to_iso_8859_3 UTF8 LATIN3 utf8_to_iso_8859_4 UTF8 LATIN4 utf8_to_iso_8859_5 UTF8 ISO_8859_5 utf8_to_iso_8859_6 UTF8 ISO_8859_6 utf8_to_iso_8859_7 UTF8 ISO_8859_7 utf8_to_iso_8859_8 UTF8 ISO_8859_8 utf8_to_iso_8859_9 UTF8 LATIN5 utf8_to_johab UTF8 JOHAB utf8_to_koi8_r UTF8 KOI8 utf8_to_sjis UTF8 SJIS utf8_to_tcvn UTF8 WIN1258 utf8_to_uhc UTF8 UHC utf8_to_windows_1250 UTF8 WIN1250 utf8_to_windows_1251 UTF8 WIN1251 utf8_to_windows_1252 UTF8 WIN1252 utf8_to_windows_1253 UTF8 WIN1253 utf8_to_windows_1254 UTF8 WIN1254 utf8_to_windows_1255 UTF8 WIN1255 utf8_to_windows_1256 UTF8 WIN1256 utf8_to_windows_1257 UTF8 WIN1257 utf8_to_windows_866 UTF8 WIN866 utf8_to_windows_874 UTF8 WIN874 windows_1250_to_iso_8859_2 WIN1250 LATIN2 windows_1250_to_mic WIN1250 MULE_INTERNAL windows_1250_to_utf8 WIN1250 UTF8 windows_1251_to_iso_8859_5 WIN1251 ISO_8859_5 windows_1251_to_koi8_r WIN1251 KOI8 windows_1251_to_mic WIN1251 MULE_INTERNAL windows_1251_to_utf8 WIN1251 UTF8 windows_1251_to_windows_866 WIN1251 WIN866 windows_1252_to_utf8 WIN1252 UTF8 windows_1256_to_utf8 WIN1256 UTF8 windows_866_to_iso_8859_5 WIN866 ISO_8859_5 windows_866_to_koi8_r WIN866 KOI8 windows_866_to_mic WIN866 MULE_INTERNAL windows_866_to_utf8 WIN866 UTF8 windows_866_to_windows_1251 WIN866 WIN windows_874_to_utf8 WIN874 UTF8

Binary String Functions and Operators binary data functions This section describes functions and operators for examining and manipulating values of type bytea. SQL defines some string functions with a special syntax where certain key words rather than commas are used to separate the arguments. Details are in . Some functions are also implemented using the regular syntax for function invocation. (See .) <acronym>SQL</acronym> Binary String Functions and Operators Function Return Type Description Example Result string || string bytea String concatenation binary string concatenation E'\\\\Post'::bytea || E'\\047gres\\000'::bytea \\Post'gres\000 get_bit(string, offset) int Extract bit from string get_bit get_bit(E'Th\\000omas'::bytea, 45) 1 get_byte(string, offset) int Extract byte from string get_byte get_byte(E'Th\\000omas'::bytea, 4) 109 octet_length(string) int Number of bytes in binary string octet_length( E'jo\\000se'::bytea) 5 position(substring in string) int Location of specified substring position(E'\\000om'::bytea in E'Th\\000omas'::bytea) 3 set_bit(string, offset, newvalue) bytea Set bit in string set_bit set_bit(E'Th\\000omas'::bytea, 45, 0) Th\000omAs set_byte(string, offset, newvalue) bytea Set byte in string set_byte set_byte(E'Th\\000omas'::bytea, 4, 64) Th\000o@as substring(string from int for int) bytea Extract substring substring substring(E'Th\\000omas'::bytea from 2 for 3) h\000o trim(both bytes from string) bytea Remove the longest string containing only the bytes in bytes from the start and end of string trim(E'\\000'::bytea from E'\\000Tom\\000'::bytea) Tom

Additional binary string manipulation functions are available and are listed in . Some of them are used internally to implement the SQL-standard string functions listed in . Other Binary String Functions Function Return Type Description Example Result btrim(string bytea, bytes bytea) bytea Remove the longest string consisting only of bytes in bytes from the start and end of string btrim(E'\\000trim\\000'::bytea, E'\\000'::bytea) trim decode(string text, type text) bytea Decode binary string from string previously encoded with encode. Parameter type is same as in encode. decode(E'123\\000456', 'escape') 123\000456 encode(string bytea, type text) text Encode binary string to ASCII-only representation. Supported types are: base64, hex, escape. encode(E'123\\000456'::bytea, 'escape') 123\000456 length(string) int Length of binary string binary string length length of a binary string binary strings, length length(E'jo\\000se'::bytea) 5 md5(string) text Calculates the MD5 hash of string, returning the result in hexadecimal md5(E'Th\\000omas'::bytea) 8ab2d3c9689aaf18 b4958c334c82d8b1

Bit String Functions and Operators bit strings functions This section describes functions and operators for examining and manipulating bit strings, that is values of the types bit and bit varying. Aside from the usual comparison operators, the operators shown in can be used. Bit string operands of &, |, and # must be of equal length. When bit shifting, the original length of the string is preserved, as shown in the examples. Bit String Operators Operator Description Example Result || concatenation B'10001' || B'011' 10001011 & bitwise AND B'10001' & B'01101' 00001 | bitwise OR B'10001' | B'01101' 11101 # bitwise XOR B'10001' # B'01101' 11100 ~ bitwise NOT ~ B'10001' 01110 << bitwise shift left B'10001' << 3 01000 >> bitwise shift right B'10001' >> 2 00100

The following SQL-standard functions work on bit strings as well as character strings: length, bit_length, octet_length, position, substring. In addition, it is possible to cast integral values to and from type bit. Some examples: 44::bit(10) 0000101100 44::bit(3) 100 cast(-44 as bit(12)) 111111010100 '1110'::bit(4)::integer 14 Note that casting to just bit means casting to bit(1), and so it will deliver only the least significant bit of the integer. Prior to PostgreSQL 8.0, casting an integer to bit(n) would copy the leftmost n bits of the integer, whereas now it copies the rightmost n bits. Also, casting an integer to a bit string width wider than the integer itself will sign-extend on the left. Pattern Matching pattern matching There are three separate approaches to pattern matching provided by PostgreSQL: the traditional SQL LIKE operator, the more recent SIMILAR TO operator (added in SQL:1999), and POSIX-style regular expressions. Additionally, a pattern matching function, substring, is available, using either SIMILAR TO-style or POSIX-style regular expressions. If you have pattern matching needs that go beyond this, consider writing a user-defined function in Perl or Tcl. <function>LIKE</function> LIKE string LIKE pattern ESCAPE escape-character string NOT LIKE pattern ESCAPE escape-character Every pattern defines a set of strings. The LIKE expression returns true if the string is contained in the set of strings represented by pattern. (As expected, the NOT LIKE expression returns false if LIKE returns true, and vice versa. An equivalent expression is NOT (string LIKE pattern).) If pattern does not contain percent signs or underscore, then the pattern only represents the string itself; in that case LIKE acts like the equals operator. An underscore (_) in pattern stands for (matches) any single character; a percent sign (%) matches any string of zero or more characters. Some examples: 'abc' LIKE 'abc' true 'abc' LIKE 'a%' true 'abc' LIKE '_b_' true 'abc' LIKE 'c' false LIKE pattern matches always cover the entire string. To match a sequence anywhere within a string, the pattern must therefore start and end with a percent sign. To match a literal underscore or percent sign without matching other characters, the respective character in pattern must be preceded by the escape character. The default escape character is the backslash but a different one can be selected by using the ESCAPE clause. To match the escape character itself, write two escape characters. Note that the backslash already has a special meaning in string literals, so to write a pattern constant that contains a backslash you must write two backslashes in an SQL statement (assuming escape string syntax is used). Thus, writing a pattern that actually matches a literal backslash means writing four backslashes in the statement. You can avoid this by selecting a different escape character with ESCAPE; then a backslash is not special to LIKE anymore. (But it is still special to the string literal parser, so you still need two of them.) It's also possible to select no escape character by writing ESCAPE ''. This effectively disables the escape mechanism, which makes it impossible to turn off the special meaning of underscore and percent signs in the pattern. The key word ILIKE can be used instead of LIKE to make the match case-insensitive according to the active locale. This is not in the SQL standard but is a PostgreSQL extension. The operator ~~ is equivalent to LIKE, and ~~* corresponds to ILIKE. There are also !~~ and !~~* operators that represent NOT LIKE and NOT ILIKE, respectively. All of these operators are PostgreSQL-specific. <function>SIMILAR TO</function> Regular Expressions regular expression SIMILAR TO substring regexp_replace string SIMILAR TO pattern ESCAPE escape-character string NOT SIMILAR TO pattern ESCAPE escape-character The SIMILAR TO operator returns true or false depending on whether its pattern matches the given string. It is much like LIKE, except that it interprets the pattern using the SQL standard's definition of a regular expression. SQL regular expressions are a curious cross between LIKE notation and common regular expression notation. Like LIKE, the SIMILAR TO operator succeeds only if its pattern matches the entire string; this is unlike common regular expression practice, wherein the pattern can match any part of the string. Also like LIKE, SIMILAR TO uses _ and % as wildcard characters denoting any single character and any string, respectively (these are comparable to . and .* in POSIX regular expressions). In addition to these facilities borrowed from LIKE, SIMILAR TO supports these pattern-matching metacharacters borrowed from POSIX regular expressions: | denotes alternation (either of two alternatives). * denotes repetition of the previous item zero or more times. + denotes repetition of the previous item one or more times. Parentheses () can be used to group items into a single logical item. A bracket expression [...] specifies a character class, just as in POSIX regular expressions. Notice that bounded repetition (? and {...}) are not provided, though they exist in POSIX. Also, the dot (.) is not a metacharacter. As with LIKE, a backslash disables the special meaning of any of these metacharacters; or a different escape character can be specified with ESCAPE. Some examples: 'abc' SIMILAR TO 'abc' true 'abc' SIMILAR TO 'a' false 'abc' SIMILAR TO '%(b|d)%' true 'abc' SIMILAR TO '(b|c)%' false The substring function with three parameters, substring(string from pattern for escape-character), provides extraction of a substring that matches an SQL regular expression pattern. As with SIMILAR TO, the specified pattern must match to the entire data string, else the function fails and returns null. To indicate the part of the pattern that should be returned on success, the pattern must contain two occurrences of the escape character followed by a double quote ("). The text matching the portion of the pattern between these markers is returned. Some examples: substring('foobar' from '%#"o_b#"%' for '#') oob substring('foobar' from '#"o_b#"%' for '#') NULL <acronym>POSIX</acronym> Regular Expressions regular expression pattern matching lists the available operators for pattern matching using POSIX regular expressions. Regular Expression Match Operators Operator Description Example ~ Matches regular expression, case sensitive 'thomas' ~ '.*thomas.*' ~* Matches regular expression, case insensitive 'thomas' ~* '.*Thomas.*' !~ Does not match regular expression, case sensitive 'thomas' !~ '.*Thomas.*' !~* Does not match regular expression, case insensitive 'thomas' !~* '.*vadim.*'

POSIX regular expressions provide a more powerful means for pattern matching than the LIKE and SIMILAR TO operators. Many Unix tools such as egrep, sed, or awk use a pattern matching language that is similar to the one described here. A regular expression is a character sequence that is an abbreviated definition of a set of strings (a regular set). A string is said to match a regular expression if it is a member of the regular set described by the regular expression. As with LIKE, pattern characters match string characters exactly unless they are special characters in the regular expression language — but regular expressions use different special characters than LIKE does. Unlike LIKE patterns, a regular expression is allowed to match anywhere within a string, unless the regular expression is explicitly anchored to the beginning or end of the string. Some examples: 'abc' ~ 'abc' true 'abc' ~ '^a' true 'abc' ~ '(b|d)' true 'abc' ~ '^(b|c)' false The substring function with two parameters, substring(string from pattern), provides extraction of a substring that matches a POSIX regular expression pattern. It returns null if there is no match, otherwise the portion of the text that matched the pattern. But if the pattern contains any parentheses, the portion of the text that matched the first parenthesized subexpression (the one whose left parenthesis comes first) is returned. You can put parentheses around the whole expression if you want to use parentheses within it without triggering this exception. If you need parentheses in the pattern before the subexpression you want to extract, see the non-capturing parentheses described below. Some examples: substring('foobar' from 'o.b') oob substring('foobar' from 'o(.)b') o The regexp_replace function provides substitution of new text for substrings that match POSIX regular expression patterns. It has the syntax regexp_replace(source, pattern, replacement , flags ). The source string is returned unchanged if there is no match to the pattern. If there is a match, the source string is returned with the replacement string substituted for the matching substring. The replacement string can contain \n, where n is 1 through 9, to indicate that the source substring matching the n'th parenthesized subexpression of the pattern should be inserted, and it can contain \& to indicate that the substring matching the entire pattern should be inserted. Write \\ if you need to put a literal backslash in the replacement text. (As always, remember to double backslashes written in literal constant strings, assuming escape string syntax is used.) The flags parameter is an optional text string containing zero or more single-letter flags that change the function's behavior. Flag i specifies case-insensitive matching, while flag g specifies replacement of each matching substring rather than only the first one. Some examples: regexp_replace('foobarbaz', 'b..', 'X') fooXbaz regexp_replace('foobarbaz', 'b..', 'X', 'g') fooXX regexp_replace('foobarbaz', 'b(..)', E'X\\1Y', 'g') fooXarYXazY PostgreSQL's regular expressions are implemented using a package written by Henry Spencer. Much of the description of regular expressions below is copied verbatim from his manual entry. Regular Expression Details Regular expressions (REs), as defined in POSIX 1003.2, come in two forms: extended REs or EREs (roughly those of egrep), and basic REs or BREs (roughly those of ed). PostgreSQL supports both forms, and also implements some extensions that are not in the POSIX standard, but have become widely used anyway due to their availability in programming languages such as Perl and Tcl. REs using these non-POSIX extensions are called advanced REs or AREs in this documentation. AREs are almost an exact superset of EREs, but BREs have several notational incompatibilities (as well as being much more limited). We first describe the ARE and ERE forms, noting features that apply only to AREs, and then describe how BREs differ. The form of regular expressions accepted by PostgreSQL can be chosen by setting the run-time parameter. The usual setting is advanced, but one might choose extended for maximum backwards compatibility with pre-7.4 releases of PostgreSQL. A regular expression is defined as one or more branches, separated by |. It matches anything that matches one of the branches. A branch is zero or more quantified atoms or constraints, concatenated. It matches a match for the first, followed by a match for the second, etc; an empty branch matches the empty string. A quantified atom is an atom possibly followed by a single quantifier. Without a quantifier, it matches a match for the atom. With a quantifier, it can match some number of matches of the atom. An atom can be any of the possibilities shown in . The possible quantifiers and their meanings are shown in . A constraint matches an empty string, but matches only when specific conditions are met. A constraint can be used where an atom could be used, except it cannot be followed by a quantifier. The simple constraints are shown in ; some more constraints are described later. Regular Expression Atoms Atom Description (re) (where re is any regular expression) matches a match for re, with the match noted for possible reporting (?:re) as above, but the match is not noted for reporting (a non-capturing set of parentheses) (AREs only) . matches any single character [chars] a bracket expression, matching any one of the chars (see for more detail) \k (where k is a non-alphanumeric character) matches that character taken as an ordinary character, e.g. \\ matches a backslash character \c where c is alphanumeric (possibly followed by other characters) is an escape, see (AREs only; in EREs and BREs, this matches c) { when followed by a character other than a digit, matches the left-brace character {; when followed by a digit, it is the beginning of a bound (see below) x where x is a single character with no other significance, matches that character

An RE cannot end with \. Remember that the backslash (\) already has a special meaning in PostgreSQL string literals. To write a pattern constant that contains a backslash, you must write two backslashes in the statement, assuming escape string syntax is used. Regular Expression Quantifiers Quantifier Matches * a sequence of 0 or more matches of the atom + a sequence of 1 or more matches of the atom ? a sequence of 0 or 1 matches of the atom {m} a sequence of exactly m matches of the atom {m,} a sequence of m or more matches of the atom {m,n} a sequence of m through n (inclusive) matches of the atom; m cannot exceed n *? non-greedy version of * +? non-greedy version of + ?? non-greedy version of ? {m}? non-greedy version of {m} {m,}? non-greedy version of {m,} {m,n}? non-greedy version of {m,n}

The forms using {...} are known as bounds. The numbers m and n within a bound are unsigned decimal integers with permissible values from 0 to 255 inclusive. Non-greedy quantifiers (available in AREs only) match the same possibilities as their corresponding normal (greedy) counterparts, but prefer the smallest number rather than the largest number of matches. See for more detail. A quantifier cannot immediately follow another quantifier. A quantifier cannot begin an expression or subexpression or follow ^ or |. Regular Expression Constraints Constraint Description ^ matches at the beginning of the string $ matches at the end of the string (?=re) positive lookahead matches at any point where a substring matching re begins (AREs only) (?!re) negative lookahead matches at any point where no substring matching re begins (AREs only)

Lookahead constraints cannot contain back references (see ), and all parentheses within them are considered non-capturing. Bracket Expressions A bracket expression is a list of characters enclosed in []. It normally matches any single character from the list (but see below). If the list begins with ^, it matches any single character not from the rest of the list. If two characters in the list are separated by -, this is shorthand for the full range of characters between those two (inclusive) in the collating sequence, e.g. [0-9] in ASCII matches any decimal digit. It is illegal for two ranges to share an endpoint, e.g. a-c-e. Ranges are very collating-sequence-dependent, so portable programs should avoid relying on them. To include a literal ] in the list, make it the first character (following a possible ^). To include a literal -, make it the first or last character, or the second endpoint of a range. To use a literal - as the first endpoint of a range, enclose it in [. and .] to make it a collating element (see below). With the exception of these characters, some combinations using [ (see next paragraphs), and escapes (AREs only), all other special characters lose their special significance within a bracket expression. In particular, \ is not special when following ERE or BRE rules, though it is special (as introducing an escape) in AREs. Within a bracket expression, a collating element (a character, a multiple-character sequence that collates as if it were a single character, or a collating-sequence name for either) enclosed in [. and .] stands for the sequence of characters of that collating element. The sequence is a single element of the bracket expression's list. A bracket expression containing a multiple-character collating element can thus match more than one character, e.g. if the collating sequence includes a ch collating element, then the RE [[.ch.]]*c matches the first five characters of chchcc. PostgreSQL currently has no multicharacter collating elements. This information describes possible future behavior. Within a bracket expression, a collating element enclosed in [= and =] is an equivalence class, standing for the sequences of characters of all collating elements equivalent to that one, including itself. (If there are no other equivalent collating elements, the treatment is as if the enclosing delimiters were [. and .].) For example, if o and ^ are the members of an equivalence class, then [[=o=]], [[=^=]], and [o^] are all synonymous. An equivalence class cannot be an endpoint of a range. Within a bracket expression, the name of a character class enclosed in [: and :] stands for the list of all characters belonging to that class. Standard character class names are: alnum, alpha, blank, cntrl, digit, graph, lower, print, punct, space, upper, xdigit. These stand for the character classes defined in ctype3. A locale can provide others. A character class cannot be used as an endpoint of a range. There are two special cases of bracket expressions: the bracket expressions [[:<:]] and [[:>:]] are constraints, matching empty strings at the beginning and end of a word respectively. A word is defined as a sequence of word characters that is neither preceded nor followed by word characters. A word character is an alnum character (as defined by ctype3) or an underscore. This is an extension, compatible with but not specified by POSIX 1003.2, and should be used with caution in software intended to be portable to other systems. The constraint escapes described below are usually preferable (they are no more standard, but are certainly easier to type). Regular Expression Escapes Escapes are special sequences beginning with \ followed by an alphanumeric character. Escapes come in several varieties: character entry, class shorthands, constraint escapes, and back references. A \ followed by an alphanumeric character but not constituting a valid escape is illegal in AREs. In EREs, there are no escapes: outside a bracket expression, a \ followed by an alphanumeric character merely stands for that character as an ordinary character, and inside a bracket expression, \ is an ordinary character. (The latter is the one actual incompatibility between EREs and AREs.) Character-entry escapes exist to make it easier to specify non-printing and otherwise inconvenient characters in REs. They are shown in . Class-shorthand escapes provide shorthands for certain commonly-used character classes. They are shown in . A constraint escape is a constraint, matching the empty string if specific conditions are met, written as an escape. They are shown in . A back reference (\n) matches the same string matched by the previous parenthesized subexpression specified by the number n (see ). For example, ([bc])\1 matches bb or cc but not bc or cb. The subexpression must entirely precede the back reference in the RE. Subexpressions are numbered in the order of their leading parentheses. Non-capturing parentheses do not define subexpressions. Keep in mind that an escape's leading \ will need to be doubled when entering the pattern as an SQL string constant. For example: '123' ~ E'^\\d{3}' true Regular Expression Character-Entry Escapes Escape Description \a alert (bell) character, as in C \b backspace, as in C \B synonym for \ to help reduce the need for backslash doubling \cX (where X is any character) the character whose low-order 5 bits are the same as those of X, and whose other bits are all zero \e the character whose collating-sequence name is ESC, or failing that, the character with octal value 033 \f form feed, as in C \n newline, as in C \r carriage return, as in C \t horizontal tab, as in C \uwxyz (where wxyz is exactly four hexadecimal digits) the UTF16 (Unicode, 16-bit) character U+wxyz in the local byte ordering \Ustuvwxyz (where stuvwxyz is exactly eight hexadecimal digits) reserved for a somewhat-hypothetical Unicode extension to 32 bits \v vertical tab, as in C \xhhh (where hhh is any sequence of hexadecimal digits) the character whose hexadecimal value is 0xhhh (a single character no matter how many hexadecimal digits are used) \0 the character whose value is 0 \xy (where xy is exactly two octal digits, and is not a back reference) the character whose octal value is 0xy \xyz (where xyz is exactly three octal digits, and is not a back reference) the character whose octal value is 0xyz

Hexadecimal digits are 0-9, a-f, and A-F. Octal digits are 0-7. The character-entry escapes are always taken as ordinary characters. For example, \135 is ] in ASCII, but \135 does not terminate a bracket expression. Regular Expression Class-Shorthand Escapes Escape Description \d [[:digit:]] \s [[:space:]] \w [[:alnum:]_] (note underscore is included) \D [^[:digit:]] \S [^[:space:]] \W [^[:alnum:]_] (note underscore is included)

Within bracket expressions, \d, \s, and \w lose their outer brackets, and \D, \S, and \W are illegal. (So, for example, [a-c\d] is equivalent to [a-c[:digit:]]. Also, [a-c\D], which is equivalent to [a-c^[:digit:]], is illegal.) Regular Expression Constraint Escapes Escape Description \A matches only at the beginning of the string (see for how this differs from ^) \m matches only at the beginning of a word \M matches only at the end of a word \y matches only at the beginning or end of a word \Y matches only at a point that is not the beginning or end of a word \Z matches only at the end of the string (see for how this differs from $)

A word is defined as in the specification of [[:<:]] and [[:>:]] above. Constraint escapes are illegal within bracket expressions. Regular Expression Back References Escape Description \m (where m is a nonzero digit) a back reference to the m'th subexpression \mnn (where m is a nonzero digit, and nn is some more digits, and the decimal value mnn is not greater than the number of closing capturing parentheses seen so far) a back reference to the mnn'th subexpression

There is an inherent historical ambiguity between octal character-entry escapes and back references, which is resolved by heuristics, as hinted at above. A leading zero always indicates an octal escape. A single non-zero digit, not followed by another digit, is always taken as a back reference. A multidigit sequence not starting with a zero is taken as a back reference if it comes after a suitable subexpression (i.e. the number is in the legal range for a back reference), and otherwise is taken as octal. Regular Expression Metasyntax In addition to the main syntax described above, there are some special forms and miscellaneous syntactic facilities available. Normally the flavor of RE being used is determined by regex_flavor. However, this can be overridden by a director prefix. If an RE begins with ***:, the rest of the RE is taken as an ARE regardless of regex_flavor. If an RE begins with ***=, the rest of the RE is taken to be a literal string, with all characters considered ordinary characters. An ARE can begin with embedded options: a sequence (?xyz) (where xyz is one or more alphabetic characters) specifies options affecting the rest of the RE. These options override any previously determined options (including both the RE flavor and case sensitivity). The available option letters are shown in . ARE Embedded-Option Letters Option Description b rest of RE is a BRE c case-sensitive matching (overrides operator type) e rest of RE is an ERE i case-insensitive matching (see ) (overrides operator type) m historical synonym for n n newline-sensitive matching (see ) p partial newline-sensitive matching (see ) q rest of RE is a literal (quoted) string, all ordinary characters s non-newline-sensitive matching (default) t tight syntax (default; see below) w inverse partial newline-sensitive (weird) matching (see ) x expanded syntax (see below)

Embedded options take effect at the ) terminating the sequence. They can appear only at the start of an ARE (after the ***: director if any). In addition to the usual (tight) RE syntax, in which all characters are significant, there is an expanded syntax, available by specifying the embedded x option. In the expanded syntax, white-space characters in the RE are ignored, as are all characters between a # and the following newline (or the end of the RE). This permits paragraphing and commenting a complex RE. There are three exceptions to that basic rule: a white-space character or # preceded by \ is retained white space or # within a bracket expression is retained white space and comments cannot appear within multicharacter symbols, such as (?: For this purpose, white-space characters are blank, tab, newline, and any character that belongs to the space character class. Finally, in an ARE, outside bracket expressions, the sequence (?#ttt) (where ttt is any text not containing a )) is a comment, completely ignored. Again, this is not allowed between the characters of multicharacter symbols, like (?:. Such comments are more a historical artifact than a useful facility, and their use is deprecated; use the expanded syntax instead. None of these metasyntax extensions is available if an initial ***= director has specified that the user's input be treated as a literal string rather than as an RE. Regular Expression Matching Rules In the event that an RE could match more than one substring of a given string, the RE matches the one starting earliest in the string. If the RE could match more than one substring starting at that point, either the longest possible match or the shortest possible match will be taken, depending on whether the RE is greedy or non-greedy. Whether an RE is greedy or not is determined by the following rules: Most atoms, and all constraints, have no greediness attribute (because they cannot match variable amounts of text anyway). Adding parentheses around an RE does not change its greediness. A quantified atom with a fixed-repetition quantifier ({m} or {m}?) has the same greediness (possibly none) as the atom itself. A quantified atom with other normal quantifiers (including {m,n} with m equal to n) is greedy (prefers longest match). A quantified atom with a non-greedy quantifier (including {m,n}? with m equal to n) is non-greedy (prefers shortest match). A branch — that is, an RE that has no top-level | operator — has the same greediness as the first quantified atom in it that has a greediness attribute. An RE consisting of two or more branches connected by the | operator is always greedy. The above rules associate greediness attributes not only with individual quantified atoms, but with branches and entire REs that contain quantified atoms. What that means is that the matching is done in such a way that the branch, or whole RE, matches the longest or shortest possible substring as a whole. Once the length of the entire match is determined, the part of it that matches any particular subexpression is determined on the basis of the greediness attribute of that subexpression, with subexpressions starting earlier in the RE taking priority over ones starting later. An example of what this means: SELECT SUBSTRING('XY1234Z', 'Y*([0-9]{1,3})'); Result: 123 SELECT SUBSTRING('XY1234Z', 'Y*?([0-9]{1,3})'); Result: 1 In the first case, the RE as a whole is greedy because Y* is greedy. It can match beginning at the Y, and it matches the longest possible string starting there, i.e., Y123. The output is the parenthesized part of that, or 123. In the second case, the RE as a whole is non-greedy because Y*? is non-greedy. It can match beginning at the Y, and it matches the shortest possible string starting there, i.e., Y1. The subexpression [0-9]{1,3} is greedy but it cannot change the decision as to the overall match length; so it is forced to match just 1. In short, when an RE contains both greedy and non-greedy subexpressions, the total match length is either as long as possible or as short as possible, according to the attribute assigned to the whole RE. The attributes assigned to the subexpressions only affect how much of that match they are allowed to eat relative to each other. The quantifiers {1,1} and {1,1}? can be used to force greediness or non-greediness, respectively, on a subexpression or a whole RE. Match lengths are measured in characters, not collating elements. An empty string is considered longer than no match at all. For example: bb* matches the three middle characters of abbbc; (week|wee)(night|knights) matches all ten characters of weeknights; when (.*).* is matched against abc the parenthesized subexpression matches all three characters; and when (a*)* is matched against bc both the whole RE and the parenthesized subexpression match an empty string. If case-independent matching is specified, the effect is much as if all case distinctions had vanished from the alphabet. When an alphabetic that exists in multiple cases appears as an ordinary character outside a bracket expression, it is effectively transformed into a bracket expression containing both cases, e.g. x becomes [xX]. When it appears inside a bracket expression, all case counterparts of it are added to the bracket expression, e.g. [x] becomes [xX] and [^x] becomes [^xX]. If newline-sensitive matching is specified, . and bracket expressions using ^ will never match the newline character (so that matches will never cross newlines unless the RE explicitly arranges it) and ^and $ will match the empty string after and before a newline respectively, in addition to matching at beginning and end of string respectively. But the ARE escapes \A and \Z continue to match beginning or end of string only. If partial newline-sensitive matching is specified, this affects . and bracket expressions as with newline-sensitive matching, but not ^ and $. If inverse partial newline-sensitive matching is specified, this affects ^ and $ as with newline-sensitive matching, but not . and bracket expressions. This isn't very useful but is provided for symmetry. Limits and Compatibility No particular limit is imposed on the length of REs in this implementation. However, programs intended to be highly portable should not employ REs longer than 256 bytes, as a POSIX-compliant implementation can refuse to accept such REs. The only feature of AREs that is actually incompatible with POSIX EREs is that \ does not lose its special significance inside bracket expressions. All other ARE features use syntax which is illegal or has undefined or unspecified effects in POSIX EREs; the *** syntax of directors likewise is outside the POSIX syntax for both BREs and EREs. Many of the ARE extensions are borrowed from Perl, but some have been changed to clean them up, and a few Perl extensions are not present. Incompatibilities of note include \b, \B, the lack of special treatment for a trailing newline, the addition of complemented bracket expressions to the things affected by newline-sensitive matching, the restrictions on parentheses and back references in lookahead constraints, and the longest/shortest-match (rather than first-match) matching semantics. Two significant incompatibilities exist between AREs and the ERE syntax recognized by pre-7.4 releases of PostgreSQL: In AREs, \ followed by an alphanumeric character is either an escape or an error, while in previous releases, it was just another way of writing the alphanumeric. This should not be much of a problem because there was no reason to write such a sequence in earlier releases. In AREs, \ remains a special character within [], so a literal \ within a bracket expression must be written \\. While these differences are unlikely to create a problem for most applications, you can avoid them if necessary by setting regex_flavor to extended. Basic Regular Expressions BREs differ from EREs in several respects. |, +, and ? are ordinary characters and there is no equivalent for their functionality. The delimiters for bounds are \{ and \}, with { and } by themselves ordinary characters. The parentheses for nested subexpressions are $ and $, with ( and ) by themselves ordinary characters. ^ is an ordinary character except at the beginning of the RE or the beginning of a parenthesized subexpression, $ is an ordinary character except at the end of the RE or the end of a parenthesized subexpression, and * is an ordinary character if it appears at the beginning of the RE or the beginning of a parenthesized subexpression (after a possible leading ^). Finally, single-digit back references are available, and \< and \> are synonyms for [[:<:]] and [[:>:]] respectively; no other escapes are available. Data Type Formatting Functions formatting to_char to_date to_number to_timestamp The PostgreSQL formatting functions provide a powerful set of tools for converting various data types (date/time, integer, floating point, numeric) to formatted strings and for converting from formatted strings to specific data types. lists them. These functions all follow a common calling convention: the first argument is the value to be formatted and the second argument is a template that defines the output or input format. The to_timestamp function can also take a single double precision argument to convert from Unix epoch to timestamp with time zone. (Integer Unix epochs are implicitly cast to double precision.) Formatting Functions Function Return Type Description Example to_char(timestamp, text) text convert time stamp to string to_char(current_timestamp, 'HH12:MI:SS') to_char(interval, text) text convert interval to string to_char(interval '15h 2m 12s', 'HH24:MI:SS') to_char(int, text) text convert integer to string to_char(125, '999') to_char(double precision, text) text convert real/double precision to string to_char(125.8::real, '999D9') to_char(numeric, text) text convert numeric to string to_char(-125.8, '999D99S') to_date(text, text) date convert string to date to_date('05 Dec 2000', 'DD Mon YYYY') to_number(text, text) numeric convert string to numeric to_number('12,454.8-', '99G999D9S') to_timestamp(text, text) timestamp with time zone convert string to time stamp to_timestamp('05 Dec 2000', 'DD Mon YYYY') to_timestamp(double precision) timestamp with time zone convert UNIX epoch to time stamp to_timestamp(200120400)

In an output template string (for to_char), there are certain patterns that are recognized and replaced with appropriately-formatted data from the value to be formatted. Any text that is not a template pattern is simply copied verbatim. Similarly, in an input template string (for anything but to_char), template patterns identify the parts of the input data string to be looked at and the values to be found there. shows the template patterns available for formatting date and time values. Template Patterns for Date/Time Formatting Pattern Description HH hour of day (01-12) HH12 hour of day (01-12) HH24 hour of day (00-23) MI minute (00-59) SS second (00-59) MS millisecond (000-999) US microsecond (000000-999999) SSSS seconds past midnight (0-86399) AM or A.M. or PM or P.M. meridian indicator (uppercase) am or a.m. or pm or p.m. meridian indicator (lowercase) Y,YYY year (4 and more digits) with comma YYYY year (4 and more digits) YYY last 3 digits of year YY last 2 digits of year Y last digit of year IYYY ISO year (4 and more digits) IYY last 3 digits of ISO year IY last 2 digits of ISO year I last digit of ISO year BC or B.C. or AD or A.D. era indicator (uppercase) bc or b.c. or ad or a.d. era indicator (lowercase) MONTH full uppercase month name (blank-padded to 9 chars) Month full mixed-case month name (blank-padded to 9 chars) month full lowercase month name (blank-padded to 9 chars) MON abbreviated uppercase month name (3 chars in English, localized lengths vary) Mon abbreviated mixed-case month name (3 chars in English, localized lengths vary) mon abbreviated lowercase month name (3 chars in English, localized lengths vary) MM month number (01-12) DAY full uppercase day name (blank-padded to 9 chars) Day full mixed-case day name (blank-padded to 9 chars) day full lowercase day name (blank-padded to 9 chars) DY abbreviated uppercase day name (3 chars in English, localized lengths vary) Dy abbreviated mixed-case day name (3 chars in English, localized lengths vary) dy abbreviated lowercase day name (3 chars in English, localized lengths vary) DDD day of year (001-366) IDDD ISO day of year (001-371; day 1 of the year is Monday of the first ISO week.) DD day of month (01-31) D day of the week, Sunday(1) to Saturday(7) ID ISO day of the week, Monday(1) to Sunday(7) W week of month (1-5) (The first week starts on the first day of the month.) WW week number of year (1-53) (The first week starts on the first day of the year.) IW ISO week number of year (1 - 53; the first Thursday of the new year is in week 1.) CC century (2 digits) (The twenty-first century starts on 2001-01-01.) J Julian Day (days since midnight, January 1, 4712 BC) Q quarter RM month in Roman numerals (I-XII; I=January) (uppercase) rm month in Roman numerals (i-xii; i=January) (lowercase) TZ time-zone name (uppercase) tz time-zone name (lowercase)

Certain modifiers can be applied to any template pattern to alter its behavior. For example, FMMonth is the Month pattern with the FM modifier. shows the modifier patterns for date/time formatting. Template Pattern Modifiers for Date/Time Formatting Modifier Description Example FM prefix fill mode (suppress padding blanks and zeroes) FMMonth TH suffix uppercase ordinal number suffix DDTH th suffix lowercase ordinal number suffix DDth FX prefix fixed format global option (see usage notes) FX Month DD Day TM prefix translation mode (print localized day and month names based on lc_messages) TMMonth SP suffix spell mode (not yet implemented) DDSP

Usage notes for date/time formatting: FM suppresses leading zeroes and trailing blanks that would otherwise be added to make the output of a pattern be fixed-width. TM does not include trailing blanks. to_timestamp and to_date skip multiple blank spaces in the input string if the FX option is not used. FX must be specified as the first item in the template. For example to_timestamp('2000 JUN', 'YYYY MON') is correct, but to_timestamp('2000 JUN', 'FXYYYY MON') returns an error, because to_timestamp expects one space only. Ordinary text is allowed in to_char templates and will be output literally. You can put a substring in double quotes to force it to be interpreted as literal text even if it contains pattern key words. For example, in '"Hello Year "YYYY', the YYYY will be replaced by the year data, but the single Y in Year will not be. If you want to have a double quote in the output you must precede it with a backslash, for example E'\\"YYYY Month\\"'. (Two backslashes are necessary because the backslash already has a special meaning when using the escape string syntax.) The YYYY conversion from string to timestamp or date has a restriction if you use a year with more than 4 digits. You must use some non-digit character or template after YYYY, otherwise the year is always interpreted as 4 digits. For example (with the year 20000): to_date('200001131', 'YYYYMMDD') will be interpreted as a 4-digit year; instead use a non-digit separator after the year, like to_date('20000-1131', 'YYYY-MMDD') or to_date('20000Nov31', 'YYYYMonDD'). In conversions from string to timestamp or date, the CC field is ignored if there is a YYY, YYYY or Y,YYY field. If CC is used with YY or Y then the year is computed as (CC-1)*100+YY. An ISO week date (as distinct from a Gregorian date) can be specified to to_timestamp and to_date in one of two ways: Year, week and weekday, for example to_date('2006-42-4', 'IYYY-IW-ID') returns the date 2006-10-19. If you omit the weekday it is assumed to be 1 (Monday). Year and day of year, for example to_date('2006-291', 'IYYY-IDDD') also returns 2006-10-19. Attempting to construct a date using a mixture of ISO week and Gregorian date fields is nonsensical, and could yield unexpected results. In the context of an ISO year, the concept of a 'month' or 'day of month' has no meaning. In the context of a Gregorian year, the ISO week has no meaning. Users should take care to keep Gregorian and ISO date specifications separate. Millisecond (MS) and microsecond (US) values in a conversion from string to timestamp are used as part of the seconds after the decimal point. For example to_timestamp('12:3', 'SS:MS') is not 3 milliseconds, but 300, because the conversion counts it as 12 + 0.3 seconds. This means for the format SS:MS, the input values 12:3, 12:30, and 12:300 specify the same number of milliseconds. To get three milliseconds, one must use 12:003, which the conversion counts as 12 + 0.003 = 12.003 seconds. Here is a more complex example: to_timestamp('15:12:02.020.001230', 'HH:MI:SS.MS.US') is 15 hours, 12 minutes, and 2 seconds + 20 milliseconds + 1230 microseconds = 2.021230 seconds. to_char(..., 'ID')'s day of the week numbering matches the extract('isodow', ...) function, but to_char(..., 'D')'s does not match extract('dow', ...)'s day numbering. to_char(interval) formats HH and HH12 as hours in a single day, while HH24 can output hours exceeding a single day, e.g. >24. shows the template patterns available for formatting numeric values. Template Patterns for Numeric Formatting Pattern Description 9 value with the specified number of digits 0 value with leading zeros . (period) decimal point , (comma) group (thousand) separator PR negative value in angle brackets S sign anchored to number (uses locale) L currency symbol (uses locale) D decimal point (uses locale) G group separator (uses locale) MI minus sign in specified position (if number < 0) PL plus sign in specified position (if number > 0) SG plus/minus sign in specified position RN roman numeral (input between 1 and 3999) TH or th ordinal number suffix V shift specified number of digits (see notes) EEEE scientific notation (not implemented yet)

Usage notes for numeric formatting: A sign formatted using SG, PL, or MI is not anchored to the number; for example, to_char(-12, 'S9999') produces ' -12', but to_char(-12, 'MI9999') produces '- 12'. The Oracle implementation does not allow the use of MI ahead of 9, but rather requires that 9 precede MI. 9 results in a value with the same number of digits as there are 9s. If a digit is not available it outputs a space. TH does not convert values less than zero and does not convert fractional numbers. PL, SG, and TH are PostgreSQL extensions. V effectively multiplies the input values by 10^n, where n is the number of digits following V. to_char does not support the use of V combined with a decimal point. (E.g., 99.9V99 is not allowed.) shows some examples of the use of the to_char function. <function>to_char</function> Examples Expression Result to_char(current_timestamp, 'Day, DD HH12:MI:SS') 'Tuesday , 06 05:39:18' to_char(current_timestamp, 'FMDay, FMDD HH12:MI:SS') 'Tuesday, 6 05:39:18' to_char(-0.1, '99.99') ' -.10' to_char(-0.1, 'FM9.99') '-.1' to_char(0.1, '0.9') ' 0.1' to_char(12, '9990999.9') ' 0012.0' to_char(12, 'FM9990999.9') '0012.' to_char(485, '999') ' 485' to_char(-485, '999') '-485' to_char(485, '9 9 9') ' 4 8 5' to_char(1485, '9,999') ' 1,485' to_char(1485, '9G999') ' 1 485' to_char(148.5, '999.999') ' 148.500' to_char(148.5, 'FM999.999') '148.5' to_char(148.5, 'FM999.990') '148.500' to_char(148.5, '999D999') ' 148,500' to_char(3148.5, '9G999D999') ' 3 148,500' to_char(-485, '999S') '485-' to_char(-485, '999MI') '485-' to_char(485, '999MI') '485 ' to_char(485, 'FM999MI') '485' to_char(485, 'PL999') '+485' to_char(485, 'SG999') '+485' to_char(-485, 'SG999') '-485' to_char(-485, '9SG99') '4-85' to_char(-485, '999PR') '<485>' to_char(485, 'L999') 'DM 485 to_char(485, 'RN') ' CDLXXXV' to_char(485, 'FMRN') 'CDLXXXV' to_char(5.2, 'FMRN') 'V' to_char(482, '999th') ' 482nd' to_char(485, '"Good number:"999') 'Good number: 485' to_char(485.8, '"Pre:"999" Post:" .999') 'Pre: 485 Post: .800' to_char(12, '99V999') ' 12000' to_char(12.4, '99V999') ' 12400' to_char(12.45, '99V9') ' 125'

Date/Time Functions and Operators shows the available functions for date/time value processing, with details appearing in the following subsections. illustrates the behaviors of the basic arithmetic operators (+, *, etc.). For formatting functions, refer to . You should be familiar with the background information on date/time data types from . All the functions and operators described below that take time or timestamp inputs actually come in two variants: one that takes time with time zone or timestamp with time zone, and one that takes time without time zone or timestamp without time zone. For brevity, these variants are not shown separately. Also, the + and * operators come in commutative pairs (for example both date + integer and integer + date); we show only one of each such pair. Date/Time Operators Operator Example Result + date '2001-09-28' + integer '7' date '2001-10-05' + date '2001-09-28' + interval '1 hour' timestamp '2001-09-28 01:00:00' + date '2001-09-28' + time '03:00' timestamp '2001-09-28 03:00:00' + interval '1 day' + interval '1 hour' interval '1 day 01:00:00' + timestamp '2001-09-28 01:00' + interval '23 hours' timestamp '2001-09-29 00:00:00' + time '01:00' + interval '3 hours' time '04:00:00' - - interval '23 hours' interval '-23:00:00' - date '2001-10-01' - date '2001-09-28' integer '3' - date '2001-10-01' - integer '7' date '2001-09-24' - date '2001-09-28' - interval '1 hour' timestamp '2001-09-27 23:00:00' - time '05:00' - time '03:00' interval '02:00:00' - time '05:00' - interval '2 hours' time '03:00:00' - timestamp '2001-09-28 23:00' - interval '23 hours' timestamp '2001-09-28 00:00:00' - interval '1 day' - interval '1 hour' interval '1 day -01:00:00' - timestamp '2001-09-29 03:00' - timestamp '2001-09-27 12:00' interval '1 day 15:00:00' * 900 * interval '1 second' interval '00:15:00' * 21 * interval '1 day' interval '21 days' * double precision '3.5' * interval '1 hour' interval '03:30:00' / interval '1 hour' / double precision '1.5' interval '00:40:00'

age clock_timestamp current_date current_time current_timestamp date_part date_trunc extract isfinite justify_days justify_hours justify_interval localtime localtimestamp now statement_timestamp timeofday transaction_timestamp Date/Time Functions Function Return Type Description Example Result age(timestamp, timestamp) interval Subtract arguments, producing a symbolic result that uses years and months age(timestamp '2001-04-10', timestamp '1957-06-13') 43 years 9 mons 27 days age(timestamp) interval Subtract from current_date age(timestamp '1957-06-13') 43 years 8 mons 3 days clock_timestamp() timestamp with time zone Current date and time (changes during statement execution); see current_date date Current date; see current_time time with time zone Current time of day; see current_timestamp timestamp with time zone Current date and time (start of current transaction); see date_part(text, timestamp) double precision Get subfield (equivalent to extract); see date_part('hour', timestamp '2001-02-16 20:38:40') 20 date_part(text, interval) double precision Get subfield (equivalent to extract); see date_part('month', interval '2 years 3 months') 3 date_trunc(text, timestamp) timestamp Truncate to specified precision; see also date_trunc('hour', timestamp '2001-02-16 20:38:40') 2001-02-16 20:00:00 extract(field from timestamp) double precision Get subfield; see extract(hour from timestamp '2001-02-16 20:38:40') 20 extract(field from interval) double precision Get subfield; see extract(month from interval '2 years 3 months') 3 isfinite(timestamp) boolean Test for finite time stamp (not equal to infinity) isfinite(timestamp '2001-02-16 21:28:30') true isfinite(interval) boolean Test for finite interval isfinite(interval '4 hours') true justify_days(interval) interval Adjust interval so 30-day time periods are represented as months justify_days(interval '30 days') 1 month justify_hours(interval) interval Adjust interval so 24-hour time periods are represented as days justify_hours(interval '24 hours') 1 day justify_interval(interval) interval Adjust interval using justify_days and justify_hours, with additional sign adjustments justify_interval(interval '1 mon -1 hour') 29 days 23:00:00 localtime time Current time of day; see localtimestamp timestamp Current date and time (start of current transaction); see now() timestamp with time zone Current date and time (start of current transaction); see statement_timestamp() timestamp with time zone Current date and time (start of current statement); see timeofday() text Current date and time (like clock_timestamp, but as a text string); see transaction_timestamp() timestamp with time zone Current date and time (start of current transaction); see

In addition to these functions, the SQL OVERLAPS operator is supported: (start1, end1) OVERLAPS (start2, end2) (start1, length1) OVERLAPS (start2, length2) This expression yields true when two time periods (defined by their endpoints) overlap, false when they do not overlap. The endpoints can be specified as pairs of dates, times, or time stamps; or as a date, time, or time stamp followed by an interval. SELECT (DATE '2001-02-16', DATE '2001-12-21') OVERLAPS (DATE '2001-10-30', DATE '2002-10-30'); Result: true SELECT (DATE '2001-02-16', INTERVAL '100 days') OVERLAPS (DATE '2001-10-30', DATE '2002-10-30'); Result: false When adding an interval value to (or subtracting an interval value from) a timestamp with time zone value, the days component advances (or decrements) the date of the timestamp with time zone by the indicated number of days. Across daylight saving time changes (with the session time zone set to a time zone that recognizes DST), this means interval '1 day' does not necessarily equal interval '24 hours'. For example, with the session time zone set to CST7CDT, timestamp with time zone '2005-04-02 12:00-07' + interval '1 day' will produce timestamp with time zone '2005-04-03 12:00-06', while adding interval '24 hours' to the same initial timestamp with time zone produces timestamp with time zone '2005-04-03 13:00-06', as there is a change in daylight saving time at 2005-04-03 02:00 in time zone CST7CDT. <function>EXTRACT</function>, <function>date_part</function> date_part extract EXTRACT(field FROM source) The extract function retrieves subfields such as year or hour from date/time values. source must be a value expression of type timestamp, time, or interval. (Expressions of type date will be cast to timestamp and can therefore be used as well.) field is an identifier or string that selects what field to extract from the source value. The extract function returns values of type double precision. The following are valid field names: century The century SELECT EXTRACT(CENTURY FROM TIMESTAMP '2000-12-16 12:21:13'); Result: 20 SELECT EXTRACT(CENTURY FROM TIMESTAMP '2001-02-16 20:38:40'); Result: 21 The first century starts at 0001-01-01 00:00:00 AD, although they did not know it at the time. This definition applies to all Gregorian calendar countries. There is no century number 0, you go from -1 to 1. If you disagree with this, please write your complaint to: Pope, Cathedral Saint-Peter of Roma, Vatican. PostgreSQL releases before 8.0 did not follow the conventional numbering of centuries, but just returned the year field divided by 100. day The day (of the month) field (1 - 31) SELECT EXTRACT(DAY FROM TIMESTAMP '2001-02-16 20:38:40'); Result: 16 decade The year field divided by 10 SELECT EXTRACT(DECADE FROM TIMESTAMP '2001-02-16 20:38:40'); Result: 200 dow The day of the week as Sunday(0) to Saturday(6) SELECT EXTRACT(DOW FROM TIMESTAMP '2001-02-16 20:38:40'); Result: 5 Note that extract's day of the week numbering is different from that of the to_char(..., 'D') function. doy The day of the year (1 - 365/366) SELECT EXTRACT(DOY FROM TIMESTAMP '2001-02-16 20:38:40'); Result: 47 epoch For date and timestamp values, the number of seconds since 1970-01-01 00:00:00-00 (can be negative); for interval values, the total number of seconds in the interval SELECT EXTRACT(EPOCH FROM TIMESTAMP WITH TIME ZONE '2001-02-16 20:38:40-08'); Result: 982384720 SELECT EXTRACT(EPOCH FROM INTERVAL '5 days 3 hours'); Result: 442800 Here is how you can convert an epoch value back to a time stamp: SELECT TIMESTAMP WITH TIME ZONE 'epoch' + 982384720 * INTERVAL '1 second'; hour The hour field (0 - 23) SELECT EXTRACT(HOUR FROM TIMESTAMP '2001-02-16 20:38:40'); Result: 20 isodow The day of the week as Monday(1) to Sunday(7) SELECT EXTRACT(ISODOW FROM TIMESTAMP '2001-02-18 20:38:40'); Result: 7 This is identical to dow except for Sunday. This matches the ISO 8601 day of the week numbering. isoyear The ISO 8601 year that the date falls in (not applicable to intervals). SELECT EXTRACT(ISOYEAR FROM DATE '2006-01-01'); Result: 2005 SELECT EXTRACT(ISOYEAR FROM DATE '2006-01-02'); Result: 2006 Each ISO year begins with the Monday of the week containing the 4th of January, so in early January or late December the ISO year may be different from the Gregorian year. See the week field for more information. This field is not available in PostgreSQL releases prior to 8.3. microseconds The seconds field, including fractional parts, multiplied by 1 000 000. Note that this includes full seconds. SELECT EXTRACT(MICROSECONDS FROM TIME '17:12:28.5'); Result: 28500000 millennium The millennium SELECT EXTRACT(MILLENNIUM FROM TIMESTAMP '2001-02-16 20:38:40'); Result: 3 Years in the 1900s are in the second millennium. The third millennium starts January 1, 2001. PostgreSQL releases before 8.0 did not follow the conventional numbering of millennia, but just returned the year field divided by 1000. milliseconds The seconds field, including fractional parts, multiplied by 1000. Note that this includes full seconds. SELECT EXTRACT(MILLISECONDS FROM TIME '17:12:28.5'); Result: 28500 minute The minutes field (0 - 59) SELECT EXTRACT(MINUTE FROM TIMESTAMP '2001-02-16 20:38:40'); Result: 38 month For timestamp values, the number of the month within the year (1 - 12) ; for interval values the number of months, modulo 12 (0 - 11) SELECT EXTRACT(MONTH FROM TIMESTAMP '2001-02-16 20:38:40'); Result: 2 SELECT EXTRACT(MONTH FROM INTERVAL '2 years 3 months'); Result: 3 SELECT EXTRACT(MONTH FROM INTERVAL '2 years 13 months'); Result: 1 quarter The quarter of the year (1 - 4) that the day is in SELECT EXTRACT(QUARTER FROM TIMESTAMP '2001-02-16 20:38:40'); Result: 1 second The seconds field, including fractional parts (0 - 5960 if leap seconds are implemented by the operating system) SELECT EXTRACT(SECOND FROM TIMESTAMP '2001-02-16 20:38:40'); Result: 40 SELECT EXTRACT(SECOND FROM TIME '17:12:28.5'); Result: 28.5 timezone The time zone offset from UTC, measured in seconds. Positive values correspond to time zones east of UTC, negative values to zones west of UTC. timezone_hour The hour component of the time zone offset timezone_minute The minute component of the time zone offset week The number of the week of the year that the day is in. By definition (ISO 8601), the first week of a year contains January 4 of that year. (The ISO-8601 week starts on Monday.) In other words, the first Thursday of a year is in week 1 of that year. Because of this, it is possible for early January dates to be part of the 52nd or 53rd week of the previous year. For example, 2005-01-01 is part of the 53rd week of year 2004, and 2006-01-01 is part of the 52nd week of year 2005. SELECT EXTRACT(WEEK FROM TIMESTAMP '2001-02-16 20:38:40'); Result: 7 year The year field. Keep in mind there is no 0 AD, so subtracting BC years from AD years should be done with care. SELECT EXTRACT(YEAR FROM TIMESTAMP '2001-02-16 20:38:40'); Result: 2001 The extract function is primarily intended for computational processing. For formatting date/time values for display, see . The date_part function is modeled on the traditional Ingres equivalent to the SQL-standard function extract: date_part('field', source) Note that here the field parameter needs to be a string value, not a name. The valid field names for date_part are the same as for extract. SELECT date_part('day', TIMESTAMP '2001-02-16 20:38:40'); Result: 16 SELECT date_part('hour', INTERVAL '4 hours 3 minutes'); Result: 4 <function>date_trunc</function> date_trunc The function date_trunc is conceptually similar to the trunc function for numbers. date_trunc('field', source) source is a value expression of type timestamp or interval. (Values of type date and time are cast automatically, to timestamp or interval respectively.) field selects to which precision to truncate the input value. The return value is of type timestamp or interval with all fields that are less significant than the selected one set to zero (or one, for day and month). Valid values for field are: microseconds milliseconds second minute hour day week month quarter year decade century millennium Examples: SELECT date_trunc('hour', TIMESTAMP '2001-02-16 20:38:40'); Result: 2001-02-16 20:00:00 SELECT date_trunc('year', TIMESTAMP '2001-02-16 20:38:40'); Result: 2001-01-01 00:00:00 <literal>AT TIME ZONE</literal> time zone conversion AT TIME ZONE The AT TIME ZONE construct allows conversions of time stamps to different time zones. shows its variants. <literal>AT TIME ZONE</literal> Variants Expression Return Type Description timestamp without time zone AT TIME ZONE zone timestamp with time zone Treat given time stamp without time zone as located in the specified time zone timestamp with time zone AT TIME ZONE zone timestamp without time zone Convert given time stamp with time zone to the new time zone time with time zone AT TIME ZONE zone time with time zone Convert given time with time zone to the new time zone

In these expressions, the desired time zone zone can be specified either as a text string (e.g., 'PST') or as an interval (e.g., INTERVAL '-08:00'). In the text case, a time zone name can be specified in any of the ways described in . Examples (supposing that the local time zone is PST8PDT): SELECT TIMESTAMP '2001-02-16 20:38:40' AT TIME ZONE 'MST'; Result: 2001-02-16 19:38:40-08 SELECT TIMESTAMP WITH TIME ZONE '2001-02-16 20:38:40-05' AT TIME ZONE 'MST'; Result: 2001-02-16 18:38:40 The first example takes a time stamp without time zone and interprets it as MST time (UTC-7), which is then converted to PST (UTC-8) for display. The second example takes a time stamp specified in EST (UTC-5) and converts it to local time in MST (UTC-7). The function timezone(zone, timestamp) is equivalent to the SQL-conforming construct timestamp AT TIME ZONE zone. Current Date/Time date current time current PostgreSQL provides a number of functions that return values related to the current date and time. These SQL-standard functions all return values based on the start time of the current transaction: CURRENT_DATE CURRENT_TIME CURRENT_TIMESTAMP CURRENT_TIME(precision) CURRENT_TIMESTAMP(precision) LOCALTIME LOCALTIMESTAMP LOCALTIME(precision) LOCALTIMESTAMP(precision) CURRENT_TIME and CURRENT_TIMESTAMP deliver values with time zone; LOCALTIME and LOCALTIMESTAMP deliver values without time zone. CURRENT_TIME, CURRENT_TIMESTAMP, LOCALTIME, and LOCALTIMESTAMP can optionally be given a precision parameter, which causes the result to be rounded to that many fractional digits in the seconds field. Without a precision parameter, the result is given to the full available precision. Some examples: SELECT CURRENT_TIME; Result: 14:39:53.662522-05 SELECT CURRENT_DATE; Result: 2001-12-23 SELECT CURRENT_TIMESTAMP; Result: 2001-12-23 14:39:53.662522-05 SELECT CURRENT_TIMESTAMP(2); Result: 2001-12-23 14:39:53.66-05 SELECT LOCALTIMESTAMP; Result: 2001-12-23 14:39:53.662522 Since these functions return the start time of the current transaction, their values do not change during the transaction. This is considered a feature: the intent is to allow a single transaction to have a consistent notion of the current time, so that multiple modifications within the same transaction bear the same time stamp. Other database systems might advance these values more frequently. PostgreSQL also provides functions that return the start time of the current statement, as well as the actual current time at the instant the function is called. The complete list of non-SQL-standard time functions is: now() transaction_timestamp() statement_timestamp() clock_timestamp() timeofday() now() is a traditional PostgreSQL equivalent to CURRENT_TIMESTAMP. transaction_timestamp() is likewise equivalent to CURRENT_TIMESTAMP, but is named to clearly reflect what it returns. statement_timestamp() returns the start time of the current statement (more specifically, the time of receipt of the latest command message from the client). statement_timestamp() and transaction_timestamp() return the same value during the first command of a transaction, but might differ during subsequent commands. clock_timestamp() returns the actual current time, and therefore its value changes even within a single SQL command. timeofday() is a historical PostgreSQL function. Like clock_timestamp(), it returns the actual current time, but as a formatted text string rather than a timestamp with time zone value. All the date/time data types also accept the special literal value now to specify the current date and time (again, interpreted as the transaction start time). Thus, the following three all return the same result: SELECT CURRENT_TIMESTAMP; SELECT now(); SELECT TIMESTAMP 'now'; -- incorrect for use with DEFAULT You do not want to use the third form when specifying a DEFAULT clause while creating a table. The system will convert now to a timestamp as soon as the constant is parsed, so that when the default value is needed, the time of the table creation would be used! The first two forms will not be evaluated until the default value is used, because they are function calls. Thus they will give the desired behavior of defaulting to the time of row insertion. Delaying Execution pg_sleep sleep delay The following function is available to delay execution of the server process: pg_sleep(seconds) pg_sleep makes the current session's process sleep until seconds seconds have elapsed. seconds is a value of type double precision, so fractional-second delays can be specified. For example: SELECT pg_sleep(1.5); The effective resolution of the sleep interval is platform-specific; 0.01 seconds is a common value. The sleep delay will be at least as long as specified. It might be longer depending on factors such as server load. Make sure that your session does not hold more locks than necessary when calling pg_sleep. Otherwise other sessions might have to wait for your sleeping process, slowing down the entire system. Geometric Functions and Operators The geometric types point, box, lseg, line, path, polygon, and circle have a large set of native support functions and operators, shown in , , and . Note that the same as operator, ~=, represents the usual notion of equality for the point, box, polygon, and circle types. Some of these types also have an = operator, but = compares for equal areas only. The other scalar comparison operators (<= and so on) likewise compare areas for these types. Geometric Operators Operator Description Example + Translation box '((0,0),(1,1))' + point '(2.0,0)' - Translation box '((0,0),(1,1))' - point '(2.0,0)' * Scaling/rotation box '((0,0),(1,1))' * point '(2.0,0)' / Scaling/rotation box '((0,0),(2,2))' / point '(2.0,0)' # Point or box of intersection '((1,-1),(-1,1))' # '((1,1),(-1,-1))' # Number of points in path or polygon # '((1,0),(0,1),(-1,0))' @-@ Length or circumference @-@ path '((0,0),(1,0))' @@ Center @@ circle '((0,0),10)' ## Closest point to first operand on second operand point '(0,0)' ## lseg '((2,0),(0,2))' <-> Distance between circle '((0,0),1)' <-> circle '((5,0),1)' && Overlaps? box '((0,0),(1,1))' && box '((0,0),(2,2))' << Is strictly left of? circle '((0,0),1)' << circle '((5,0),1)' >> Is strictly right of? circle '((5,0),1)' >> circle '((0,0),1)' &< Does not extend to the right of? box '((0,0),(1,1))' &< box '((0,0),(2,2))' &> Does not extend to the left of? box '((0,0),(3,3))' &> box '((0,0),(2,2))' <<| Is strictly below? box '((0,0),(3,3))' <<| box '((3,4),(5,5))' |>> Is strictly above? box '((3,4),(5,5))' |>> box '((0,0),(3,3))' &<| Does not extend above? box '((0,0),(1,1))' &<| box '((0,0),(2,2))' |&> Does not extend below? box '((0,0),(3,3))' |&> box '((0,0),(2,2))' <^ Is below (allows touching)? circle '((0,0),1)' <^ circle '((0,5),1)' >^ Is above (allows touching)? circle '((0,5),1)' >^ circle '((0,0),1)' ?# Intersects? lseg '((-1,0),(1,0))' ?# box '((-2,-2),(2,2))' ?- Is horizontal? ?- lseg '((-1,0),(1,0))' ?- Are horizontally aligned? point '(1,0)' ?- point '(0,0)' ?| Is vertical? ?| lseg '((-1,0),(1,0))' ?| Are vertically aligned? point '(0,1)' ?| point '(0,0)' ?-| Is perpendicular? lseg '((0,0),(0,1))' ?-| lseg '((0,0),(1,0))' ?|| Are parallel? lseg '((-1,0),(1,0))' ?|| lseg '((-1,2),(1,2))' @> Contains? circle '((0,0),2)' @> point '(1,1)' <@ Contained in or on? point '(1,1)' <@ circle '((0,0),2)' ~= Same as? polygon '((0,0),(1,1))' ~= polygon '((1,1),(0,0))'

Before PostgreSQL 8.2, the containment operators @> and <@ were respectively called ~ and @. These names are still available, but are deprecated and will eventually be retired. area center diameter height isclosed isopen length npoints pclose popen radius width Geometric Functions Function Return Type Description Example area(object) double precision area area(box '((0,0),(1,1))') center(object) point center center(box '((0,0),(1,2))') diameter(circle) double precision diameter of circle diameter(circle '((0,0),2.0)') height(box) double precision vertical size of box height(box '((0,0),(1,1))') isclosed(path) boolean a closed path? isclosed(path '((0,0),(1,1),(2,0))') isopen(path) boolean an open path? isopen(path '[(0,0),(1,1),(2,0)]') length(object) double precision length length(path '((-1,0),(1,0))') npoints(path) int number of points npoints(path '[(0,0),(1,1),(2,0)]') npoints(polygon) int number of points npoints(polygon '((1,1),(0,0))') pclose(path) path convert path to closed pclose(path '[(0,0),(1,1),(2,0)]') point(lseg, lseg) point intersection point(lseg '((-1,0),(1,0))',lseg '((-2,-2),(2,2))') ]]> popen(path) path convert path to open popen(path '((0,0),(1,1),(2,0))') radius(circle) double precision radius of circle radius(circle '((0,0),2.0)') width(box) double precision horizontal size of box width(box '((0,0),(1,1))')

Geometric Type Conversion Functions Function Return Type Description Example box(circle) box circle to box box(circle '((0,0),2.0)') box(point, point) box points to box box(point '(0,0)', point '(1,1)') box(polygon) box polygon to box box(polygon '((0,0),(1,1),(2,0))') circle(box) circle box to circle circle(box '((0,0),(1,1))') circle(point, double precision) circle center and radius to circle circle(point '(0,0)', 2.0) circle(polygon) circle polygon to circle circle(polygon '((0,0),(1,1),(2,0))') lseg(box) lseg box diagonal to line segment lseg(box '((-1,0),(1,0))') lseg(point, point) lseg points to line segment lseg(point '(-1,0)', point '(1,0)') path(polygon) point polygon to path path(polygon '((0,0),(1,1),(2,0))') point(double precision, double precision) point construct point point(23.4, -44.5) point(box) point center of box point(box '((-1,0),(1,0))') point(circle) point center of circle point(circle '((0,0),2.0)') point(lseg) point center of line segment point(lseg '((-1,0),(1,0))') point(polygon) point center of polygon point(polygon '((0,0),(1,1),(2,0))') polygon(box) polygon box to 4-point polygon polygon(box '((0,0),(1,1))') polygon(circle) polygon circle to 12-point polygon polygon(circle '((0,0),2.0)') polygon(npts, circle) polygon circle to npts-point polygon polygon(12, circle '((0,0),2.0)') polygon(path) polygon path to polygon polygon(path '((0,0),(1,1),(2,0))')

It is possible to access the two component numbers of a point as though it were an array with indices 0 and 1. For example, if t.p is a point column then SELECT p[0] FROM t retrieves the X coordinate and UPDATE t SET p[1] = ... changes the Y coordinate. In the same way, a value of type box or lseg can be treated as an array of two point values. The area function works for the types box, circle, and path. The area function only works on the path data type if the points in the path are non-intersecting. For example, the path '((0,0),(0,1),(2,1),(2,2),(1,2),(1,0),(0,0))'::PATH won't work, however, the following visually identical path '((0,0),(0,1),(1,1),(1,2),(2,2),(2,1),(1,1),(1,0),(0,0))'::PATH will work. If the concept of an intersecting versus non-intersecting path is confusing, draw both of the above paths side by side on a piece of graph paper. Network Address Functions and Operators shows the operators available for the cidr and inet types. The operators <<, <<=, >>, and >>= test for subnet inclusion. They consider only the network parts of the two addresses, ignoring any host part, and determine whether one network part is identical to or a subnet of the other. <type>cidr</type> and <type>inet</type> Operators Operator Description Example < is less than inet '192.168.1.5' < inet '192.168.1.6' <= is less than or equal inet '192.168.1.5' <= inet '192.168.1.5' = equals inet '192.168.1.5' = inet '192.168.1.5' >= is greater or equal inet '192.168.1.5' >= inet '192.168.1.5' > is greater than inet '192.168.1.5' > inet '192.168.1.4' <> is not equal inet '192.168.1.5' <> inet '192.168.1.4' << is contained within inet '192.168.1.5' << inet '192.168.1/24' <<= is contained within or equals inet '192.168.1/24' <<= inet '192.168.1/24' >> contains inet '192.168.1/24' >> inet '192.168.1.5' >>= contains or equals inet '192.168.1/24' >>= inet '192.168.1/24' ~ bitwise NOT ~ inet '192.168.1.6' & bitwise AND inet '192.168.1.6' & inet '0.0.0.255' | bitwise OR inet '192.168.1.6' | inet '0.0.0.255' + addition inet '192.168.1.6' + 25 - subtraction inet '192.168.1.43' - 36 - subtraction inet '192.168.1.43' - inet '192.168.1.19'

shows the functions available for use with the cidr and inet types. The host, text, and abbrev functions are primarily intended to offer alternative display formats. <type>cidr</type> and <type>inet</type> Functions Function Return Type Description Example Result abbrev(inet) text abbreviated display format as text abbrev(inet '10.1.0.0/16') 10.1.0.0/16 abbrev(cidr) text abbreviated display format as text abbrev(cidr '10.1.0.0/16') 10.1/16 broadcast(inet) inet broadcast address for network broadcast('192.168.1.5/24') 192.168.1.255/24 family(inet) int extract family of address; 4 for IPv4, 6 for IPv6 family('::1') 6 host(inet) text extract IP address as text host('192.168.1.5/24') 192.168.1.5 hostmask(inet) inet construct host mask for network hostmask('192.168.23.20/30') 0.0.0.3 masklen(inet) int extract netmask length masklen('192.168.1.5/24') 24 netmask(inet) inet construct netmask for network netmask('192.168.1.5/24') 255.255.255.0 network(inet) cidr extract network part of address network('192.168.1.5/24') 192.168.1.0/24 set_masklen(inet, int) inet set netmask length for inet value set_masklen('192.168.1.5/24', 16) 192.168.1.5/16 set_masklen(cidr, int) cidr set netmask length for cidr value set_masklen('192.168.1.0/24'::cidr, 16) 192.168.0.0/16 text(inet) text extract IP address and netmask length as text text(inet '192.168.1.5') 192.168.1.5/32

Any cidr value can be cast to inet implicitly or explicitly; therefore, the functions shown above as operating on inet also work on cidr values. (Where there are separate functions for inet and cidr, it is because the behavior should be different for the two cases.) Also, it is permitted to cast an inet value to cidr. When this is done, any bits to the right of the netmask are silently zeroed to create a valid cidr value. In addition, you can cast a text value to inet or cidr using normal casting syntax: for example, inet(expression) or colname::cidr. shows the functions available for use with the macaddr type. The function trunc(macaddr) returns a MAC address with the last 3 bytes set to zero. This can be used to associate the remaining prefix with a manufacturer. <type>macaddr</type> Functions Function Return Type Description Example Result trunc(macaddr) macaddr set last 3 bytes to zero trunc(macaddr '12:34:56:78:90:ab') 12:34:56:00:00:00

The macaddr type also supports the standard relational operators (>, <=, etc.) for lexicographical ordering. Sequence Manipulation Functions sequence nextval currval lastval setval This section describes PostgreSQL's functions for operating on sequence objects. Sequence objects (also called sequence generators or just sequences) are special single-row tables created with CREATE SEQUENCE. A sequence object is usually used to generate unique identifiers for rows of a table. The sequence functions, listed in , provide simple, multiuser-safe methods for obtaining successive sequence values from sequence objects. Sequence Functions Function Return Type Description currval(regclass) bigint Return value most recently obtained with nextval for specified sequence nextval(regclass) bigint Advance sequence and return new value setval(regclass, bigint) bigint Set sequence's current value setval(regclass, bigint, boolean) bigint Set sequence's current value and is_called flag

The sequence to be operated on by a sequence-function call is specified by a regclass argument, which is just the OID of the sequence in the pg_class system catalog. You do not have to look up the OID by hand, however, since the regclass data type's input converter will do the work for you. Just write the sequence name enclosed in single quotes, so that it looks like a literal constant. To achieve some compatibility with the handling of ordinary SQL names, the string will be converted to lowercase unless it contains double quotes around the sequence name. Thus: nextval('foo') operates on sequence foo nextval('FOO') operates on sequence foo nextval('"Foo"') operates on sequence Foo The sequence name can be schema-qualified if necessary: nextval('myschema.foo') operates on myschema.foo nextval('"myschema".foo') same as above nextval('foo') searches search path for foo See for more information about regclass. Before PostgreSQL 8.1, the arguments of the sequence functions were of type text, not regclass, and the above-described conversion from a text string to an OID value would happen at run time during each call. For backwards compatibility, this facility still exists, but internally it is now handled as an implicit coercion from text to regclass before the function is invoked. When you write the argument of a sequence function as an unadorned literal string, it becomes a constant of type regclass. Since this is really just an OID, it will track the originally identified sequence despite later renaming, schema reassignment, etc. This early binding behavior is usually desirable for sequence references in column defaults and views. But sometimes you will want late binding where the sequence reference is resolved at run time. To get late-binding behavior, force the constant to be stored as a text constant instead of regclass: nextval('foo'::text) foo is looked up at runtime Note that late binding was the only behavior supported in PostgreSQL releases before 8.1, so you might need to do this to preserve the semantics of old applications. Of course, the argument of a sequence function can be an expression as well as a constant. If it is a text expression then the implicit coercion will result in a run-time lookup. The available sequence functions are: nextval Advance the sequence object to its next value and return that value. This is done atomically: even if multiple sessions execute nextval concurrently, each will safely receive a distinct sequence value. currval Return the value most recently obtained by nextval for this sequence in the current session. (An error is reported if nextval has never been called for this sequence in this session.) Notice that because this is returning a session-local value, it gives a predictable answer whether or not other sessions have executed nextval since the current session did. lastval Return the value most recently returned by nextval in the current session. This function is identical to currval, except that instead of taking the sequence name as an argument it fetches the value of the last sequence that nextval was used on in the current session. It is an error to call lastval if nextval has not yet been called in the current session. setval Reset the sequence object's counter value. The two-parameter form sets the sequence's last_value field to the specified value and sets its is_called field to true, meaning that the next nextval will advance the sequence before returning a value. In the three-parameter form, is_called can be set either true or false. If it's set to false, the next nextval will return exactly the specified value, and sequence advancement commences with the following nextval. For example, SELECT setval('foo', 42); Next nextval will return 43 SELECT setval('foo', 42, true); Same as above SELECT setval('foo', 42, false); Next nextval will return 42 The result returned by setval is just the value of its second argument. If a sequence object has been created with default parameters, nextval calls on it will return successive values beginning with 1. Other behaviors can be obtained by using special parameters in the command; see its command reference page for more information. To avoid blocking of concurrent transactions that obtain numbers from the same sequence, a nextval operation is never rolled back; that is, once a value has been fetched it is considered used, even if the transaction that did the nextval later aborts. This means that aborted transactions might leave unused holes in the sequence of assigned values. setval operations are never rolled back, either. Conditional Expressions CASE conditional expression This section describes the SQL-compliant conditional expressions available in PostgreSQL. If your needs go beyond the capabilities of these conditional expressions you might want to consider writing a stored procedure in a more expressive programming language. <literal>CASE</> The SQL CASE expression is a generic conditional expression, similar to if/else statements in other languages: CASE WHEN condition THEN result WHEN ... ELSE result END CASE clauses can be used wherever an expression is valid. condition is an expression that returns a boolean result. If the result is true then the value of the CASE expression is the result that follows the condition. If the result is false any subsequent WHEN clauses are searched in the same manner. If no WHEN condition is true then the value of the case expression is the result in the ELSE clause. If the ELSE clause is omitted and no condition matches, the result is null. An example: SELECT * FROM test; a --- 1 2 3 SELECT a, CASE WHEN a=1 THEN 'one' WHEN a=2 THEN 'two' ELSE 'other' END FROM test; a | case ---+------- 1 | one 2 | two 3 | other The data types of all the result expressions must be convertible to a single output type. See for more detail. The following simple CASE expression is a specialized variant of the general form above: CASE expression WHEN value THEN result WHEN ... ELSE result END The expression is computed and compared to all the value specifications in the WHEN clauses until one is found that is equal. If no match is found, the result in the ELSE clause (or a null value) is returned. This is similar to the switch statement in C. The example above can be written using the simple CASE syntax: SELECT a, CASE a WHEN 1 THEN 'one' WHEN 2 THEN 'two' ELSE 'other' END FROM test; a | case ---+------- 1 | one 2 | two 3 | other A CASE expression does not evaluate any subexpressions that are not needed to determine the result. For example, this is a possible way of avoiding a division-by-zero failure: SELECT ... WHERE CASE WHEN x <> 0 THEN y/x > 1.5 ELSE false END; <literal>COALESCE</> COALESCE NVL IFNULL COALESCE(value , ...) The COALESCE function returns the first of its arguments that is not null. Null is returned only if all arguments are null. It is often used to substitute a default value for null values when data is retrieved for display, for example: SELECT COALESCE(description, short_description, '(none)') ... Like a CASE expression, COALESCE will not evaluate arguments that are not needed to determine the result; that is, arguments to the right of the first non-null argument are not evaluated. This SQL-standard function provides capabilities similar to NVL and IFNULL, which are used in some other database systems. <literal>NULLIF</> NULLIF NULLIF(value1, value2) The NULLIF function returns a null value if value1 and value2 are equal; otherwise it returns value1. This can be used to perform the inverse operation of the COALESCE example given above: SELECT NULLIF(value, '(none)') ... If value1 is (none), return a null, otherwise return value1. <literal>GREATEST</literal> and <literal>LEAST</literal> GREATEST LEAST GREATEST(value , ...) LEAST(value , ...) The GREATEST and LEAST functions select the largest or smallest value from a list of any number of expressions. The expressions must all be convertible to a common data type, which will be the type of the result (see for details). NULL values in the list are ignored. The result will be NULL only if all the expressions evaluate to NULL. Note that GREATEST and LEAST are not in the SQL standard, but are a common extension. Array Functions and Operators shows the operators available for array types. <type>array</type> Operators Operator Description Example Result = equal ARRAY[1.1,2.1,3.1]::int[] = ARRAY[1,2,3] t <> not equal ARRAY[1,2,3] <> ARRAY[1,2,4] t < less than ARRAY[1,2,3] < ARRAY[1,2,4] t > greater than ARRAY[1,4,3] > ARRAY[1,2,4] t <= less than or equal ARRAY[1,2,3] <= ARRAY[1,2,3] t >= greater than or equal ARRAY[1,4,3] >= ARRAY[1,4,3] t @> contains ARRAY[1,4,3] @> ARRAY[3,1] t <@ is contained by ARRAY[2,7] <@ ARRAY[1,7,4,2,6] t && overlap (have elements in common) ARRAY[1,4,3] && ARRAY[2,1] t || array-to-array concatenation ARRAY[1,2,3] || ARRAY[4,5,6] {1,2,3,4,5,6} || array-to-array concatenation ARRAY[1,2,3] || ARRAY[[4,5,6],[7,8,9]] {{1,2,3},{4,5,6},{7,8,9}} || element-to-array concatenation 3 || ARRAY[4,5,6] {3,4,5,6} || array-to-element concatenation ARRAY[4,5,6] || 7 {4,5,6,7}

Array comparisons compare the array contents element-by-element, using the default B-Tree comparison function for the element data type. In multidimensional arrays the elements are visited in row-major order (last subscript varies most rapidly). If the contents of two arrays are equal but the dimensionality is different, the first difference in the dimensionality information determines the sort order. (This is a change from versions of PostgreSQL prior to 8.2: older versions would claim that two arrays with the same contents were equal, even if the number of dimensions or subscript ranges were different.) See for more details about array operator behavior. shows the functions available for use with array types. See for more discussion and examples of the use of these functions. <type>array</type> Functions Function Return Type Description Example Result array_append(anyarray, anyelement) anyarray append an element to the end of an array array_append(ARRAY[1,2], 3) {1,2,3} array_cat(anyarray, anyarray) anyarray concatenate two arrays array_cat(ARRAY[1,2,3], ARRAY[4,5]) {1,2,3,4,5} array_dims(anyarray) text returns a text representation of array's dimensions array_dims(ARRAY[[1,2,3], [4,5,6]]) [1:2][1:3] array_lower(anyarray, int) int returns lower bound of the requested array dimension array_lower('[0:2]={1,2,3}'::int[], 1) 0 array_prepend(anyelement, anyarray) anyarray append an element to the beginning of an array array_prepend(1, ARRAY[2,3]) {1,2,3} array_to_string(anyarray, text) text concatenates array elements using provided delimiter array_to_string(ARRAY[1, 2, 3], '~^~') 1~^~2~^~3 array_upper(anyarray, int) int returns upper bound of the requested array dimension array_upper(ARRAY[1,2,3,4], 1) 4 string_to_array(text, text) text[] splits string into array elements using provided delimiter string_to_array('xx~^~yy~^~zz', '~^~') {xx,yy,zz}

Aggregate Functions aggregate function built-in Aggregate functions compute a single result value from a set of input values. The built-in aggregate functions are listed in and . The special syntax considerations for aggregate functions are explained in . Consult for additional introductory information. General-Purpose Aggregate Functions Function Argument Type Return Type Description average avg(expression) smallint, int, bigint, real, double precision, numeric, or interval numeric for any integer type argument, double precision for a floating-point argument, otherwise the same as the argument data type the average (arithmetic mean) of all input values bit_and bit_and(expression) smallint, int, bigint, or bit same as argument data type the bitwise AND of all non-null input values, or null if none bit_or bit_or(expression) smallint, int, bigint, or bit same as argument data type the bitwise OR of all non-null input values, or null if none bool_and bool_and(expression) bool bool true if all input values are true, otherwise false bool_or bool_or(expression) bool bool true if at least one input value is true, otherwise false count(*) bigint number of input rows count(expression) any bigint number of input rows for which the value of expression is not null every every(expression) bool bool equivalent to bool_and max(expression) any array, numeric, string, or date/time type same as argument type maximum value of expression across all input values min(expression) any array, numeric, string, or date/time type same as argument type minimum value of expression across all input values sum(expression) smallint, int, bigint, real, double precision, numeric, or interval bigint for smallint or int arguments, numeric for bigint arguments, double precision for floating-point arguments, otherwise the same as the argument data type sum of expression across all input values

It should be noted that except for count, these functions return a null value when no rows are selected. In particular, sum of no rows returns null, not zero as one might expect. The coalesce function can be used to substitute zero for null when necessary. ANY SOME Boolean aggregates bool_and and bool_or correspond to standard SQL aggregates every and any or some. As for any and some, it seems that there is an ambiguity built into the standard syntax: SELECT b1 = ANY((SELECT b2 FROM t2 ...)) FROM t1 ...; Here ANY can be considered both as leading to a subquery or as an aggregate if the select expression returns 1 row. Thus the standard name cannot be given to these aggregates. Users accustomed to working with other SQL database management systems might be surprised by the performance of the count aggregate when it is applied to the entire table. A query like: SELECT count(*) FROM sometable; will be executed by PostgreSQL using a sequential scan of the entire table. shows aggregate functions typically used in statistical analysis. (These are separated out merely to avoid cluttering the listing of more-commonly-used aggregates.) Where the description mentions N, it means the number of input rows for which all the input expressions are non-null. In all cases, null is returned if the computation is meaningless, for example when N is zero. statistics linear regression Aggregate Functions for Statistics Function Argument Type Return Type Description correlation corr(Y, X) double precision double precision correlation coefficient covariance population covar_pop(Y, X) double precision double precision population covariance covariance sample covar_samp(Y, X) double precision double precision sample covariance regr_avgx(Y, X) double precision double precision average of the independent variable (sum(X)/N) regr_avgy(Y, X) double precision double precision average of the dependent variable (sum(Y)/N) regr_count(Y, X) double precision bigint number of input rows in which both expressions are nonnull regression intercept regr_intercept(Y, X) double precision double precision y-intercept of the least-squares-fit linear equation determined by the (X, Y) pairs regr_r2(Y, X) double precision double precision square of the correlation coefficient regression slope regr_slope(Y, X) double precision double precision slope of the least-squares-fit linear equation determined by the (X, Y) pairs regr_sxx(Y, X) double precision double precision sum(X^2) - sum(X)^2/N (sum of squares of the independent variable) regr_sxy(Y, X) double precision double precision sum(X*Y) - sum(X) * sum(Y)/N (sum of products of independent times dependent variable) regr_syy(Y, X) double precision double precision sum(Y^2) - sum(Y)^2/N (sum of squares of the dependent variable) standard deviation stddev(expression) smallint, int, bigint, real, double precision, or numeric double precision for floating-point arguments, otherwise numeric historical alias for stddev_samp standard deviation population stddev_pop(expression) smallint, int, bigint, real, double precision, or numeric double precision for floating-point arguments, otherwise numeric population standard deviation of the input values standard deviation sample stddev_samp(expression) smallint, int, bigint, real, double precision, or numeric double precision for floating-point arguments, otherwise numeric sample standard deviation of the input values variance variance(expression) smallint, int, bigint, real, double precision, or numeric double precision for floating-point arguments, otherwise numeric historical alias for var_samp variance population var_pop(expression) smallint, int, bigint, real, double precision, or numeric double precision for floating-point arguments, otherwise numeric population variance of the input values (square of the population standard deviation) variance sample var_samp(expression) smallint, int, bigint, real, double precision, or numeric double precision for floating-point arguments, otherwise numeric sample variance of the input values (square of the sample standard deviation)

Subquery Expressions EXISTS IN NOT IN ANY ALL SOME subquery This section describes the SQL-compliant subquery expressions available in PostgreSQL. All of the expression forms documented in this section return Boolean (true/false) results. <literal>EXISTS</literal> EXISTS (subquery) The argument of EXISTS is an arbitrary SELECT statement, or subquery. The subquery is evaluated to determine whether it returns any rows. If it returns at least one row, the result of EXISTS is true; if the subquery returns no rows, the result of EXISTS is false. The subquery can refer to variables from the surrounding query, which will act as constants during any one evaluation of the subquery. The subquery will generally only be executed far enough to determine whether at least one row is returned, not all the way to completion. It is unwise to write a subquery that has any side effects (such as calling sequence functions); whether the side effects occur or not might be difficult to predict. Since the result depends only on whether any rows are returned, and not on the contents of those rows, the output list of the subquery is normally uninteresting. A common coding convention is to write all EXISTS tests in the form EXISTS(SELECT 1 WHERE ...). There are exceptions to this rule however, such as subqueries that use INTERSECT. This simple example is like an inner join on col2, but it produces at most one output row for each tab1 row, even if there are multiple matching tab2 rows: SELECT col1 FROM tab1 WHERE EXISTS(SELECT 1 FROM tab2 WHERE col2 = tab1.col2); <literal>IN</literal> expression IN (subquery) The right-hand side is a parenthesized subquery, which must return exactly one column. The left-hand expression is evaluated and compared to each row of the subquery result. The result of IN is true if any equal subquery row is found. The result is false if no equal row is found (including the special case where the subquery returns no rows). Note that if the left-hand expression yields null, or if there are no equal right-hand values and at least one right-hand row yields null, the result of the IN construct will be null, not false. This is in accordance with SQL's normal rules for Boolean combinations of null values. As with EXISTS, it's unwise to assume that the subquery will be evaluated completely. row_constructor IN (subquery) The left-hand side of this form of IN is a row constructor, as described in . The right-hand side is a parenthesized subquery, which must return exactly as many columns as there are expressions in the left-hand row. The left-hand expressions are evaluated and compared row-wise to each row of the subquery result. The result of IN is true if any equal subquery row is found. The result is false if no equal row is found (including the special case where the subquery returns no rows). As usual, null values in the rows are combined per the normal rules of SQL Boolean expressions. Two rows are considered equal if all their corresponding members are non-null and equal; the rows are unequal if any corresponding members are non-null and unequal; otherwise the result of that row comparison is unknown (null). If all the per-row results are either unequal or null, with at least one null, then the result of IN is null. <literal>NOT IN</literal> expression NOT IN (subquery) The right-hand side is a parenthesized subquery, which must return exactly one column. The left-hand expression is evaluated and compared to each row of the subquery result. The result of NOT IN is true if only unequal subquery rows are found (including the special case where the subquery returns no rows). The result is false if any equal row is found. Note that if the left-hand expression yields null, or if there are no equal right-hand values and at least one right-hand row yields null, the result of the NOT IN construct will be null, not true. This is in accordance with SQL's normal rules for Boolean combinations of null values. As with EXISTS, it's unwise to assume that the subquery will be evaluated completely. row_constructor NOT IN (subquery) The left-hand side of this form of NOT IN is a row constructor, as described in . The right-hand side is a parenthesized subquery, which must return exactly as many columns as there are expressions in the left-hand row. The left-hand expressions are evaluated and compared row-wise to each row of the subquery result. The result of NOT IN is true if only unequal subquery rows are found (including the special case where the subquery returns no rows). The result is false if any equal row is found. As usual, null values in the rows are combined per the normal rules of SQL Boolean expressions. Two rows are considered equal if all their corresponding members are non-null and equal; the rows are unequal if any corresponding members are non-null and unequal; otherwise the result of that row comparison is unknown (null). If all the per-row results are either unequal or null, with at least one null, then the result of NOT IN is null. <literal>ANY</literal>/<literal>SOME</literal> expression operator ANY (subquery) expression operator SOME (subquery) The right-hand side is a parenthesized subquery, which must return exactly one column. The left-hand expression is evaluated and compared to each row of the subquery result using the given operator, which must yield a Boolean result. The result of ANY is true if any true result is obtained. The result is false if no true result is found (including the special case where the subquery returns no rows). SOME is a synonym for ANY. IN is equivalent to = ANY. Note that if there are no successes and at least one right-hand row yields null for the operator's result, the result of the ANY construct will be null, not false. This is in accordance with SQL's normal rules for Boolean combinations of null values. As with EXISTS, it's unwise to assume that the subquery will be evaluated completely. row_constructor operator ANY (subquery) row_constructor operator SOME (subquery) The left-hand side of this form of ANY is a row constructor, as described in . The right-hand side is a parenthesized subquery, which must return exactly as many columns as there are expressions in the left-hand row. The left-hand expressions are evaluated and compared row-wise to each row of the subquery result, using the given operator. The result of ANY is true if the comparison returns true for any subquery row. The result is false if the comparison returns false for every subquery row (including the special case where the subquery returns no rows). The result is NULL if the comparison does not return true for any row, and it returns NULL for at least one row. See for details about the meaning of a row-wise comparison. <literal>ALL</literal> expression operator ALL (subquery) The right-hand side is a parenthesized subquery, which must return exactly one column. The left-hand expression is evaluated and compared to each row of the subquery result using the given operator, which must yield a Boolean result. The result of ALL is true if all rows yield true (including the special case where the subquery returns no rows). The result is false if any false result is found. The result is NULL if the comparison does not return false for any row, and it returns NULL for at least one row. NOT IN is equivalent to <> ALL. As with EXISTS, it's unwise to assume that the subquery will be evaluated completely. row_constructor operator ALL (subquery) The left-hand side of this form of ALL is a row constructor, as described in . The right-hand side is a parenthesized subquery, which must return exactly as many columns as there are expressions in the left-hand row. The left-hand expressions are evaluated and compared row-wise to each row of the subquery result, using the given operator. The result of ALL is true if the comparison returns true for all subquery rows (including the special case where the subquery returns no rows). The result is false if the comparison returns false for any subquery row. The result is NULL if the comparison does not return false for any subquery row, and it returns NULL for at least one row. See for details about the meaning of a row-wise comparison. Row-wise Comparison comparison subquery result row row_constructor operator (subquery) The left-hand side is a row constructor, as described in . The right-hand side is a parenthesized subquery, which must return exactly as many columns as there are expressions in the left-hand row. Furthermore, the subquery cannot return more than one row. (If it returns zero rows, the result is taken to be null.) The left-hand side is evaluated and compared row-wise to the single subquery result row. See for details about the meaning of a row-wise comparison. Row and Array Comparisons IN NOT IN ANY ALL SOME row-wise comparison comparison row-wise IS DISTINCT FROM IS NOT DISTINCT FROM This section describes several specialized constructs for making multiple comparisons between groups of values. These forms are syntactically related to the subquery forms of the previous section, but do not involve subqueries. The forms involving array subexpressions are PostgreSQL extensions; the rest are SQL-compliant. All of the expression forms documented in this section return Boolean (true/false) results. <literal>IN</literal> expression IN (value , ...) The right-hand side is a parenthesized list of scalar expressions. The result is true if the left-hand expression's result is equal to any of the right-hand expressions. This is a shorthand notation for expression = value1 OR expression = value2 OR ... Note that if the left-hand expression yields null, or if there are no equal right-hand values and at least one right-hand expression yields null, the result of the IN construct will be null, not false. This is in accordance with SQL's normal rules for Boolean combinations of null values. <literal>NOT IN</literal> expression NOT IN (value , ...) The right-hand side is a parenthesized list of scalar expressions. The result is true if the left-hand expression's result is unequal to all of the right-hand expressions. This is a shorthand notation for expression <> value1 AND expression <> value2 AND ... Note that if the left-hand expression yields null, or if there are no equal right-hand values and at least one right-hand expression yields null, the result of the NOT IN construct will be null, not true as one might naively expect. This is in accordance with SQL's normal rules for Boolean combinations of null values. x NOT IN y is equivalent to NOT (x IN y) in all cases. However, null values are much more likely to trip up the novice when working with NOT IN than when working with IN. It's best to express your condition positively if possible. <literal>ANY</literal>/<literal>SOME</literal> (array) expression operator ANY (array expression) expression operator SOME (array expression) The right-hand side is a parenthesized expression, which must yield an array value. The left-hand expression is evaluated and compared to each element of the array using the given operator, which must yield a Boolean result. The result of ANY is true if any true result is obtained. The result is false if no true result is found (including the special case where the array has zero elements). If the array expression yields a null array, the result of ANY will be null. If the left-hand expression yields null, the result of ANY is ordinarily null (though a non-strict comparison operator could possibly yield a different result). Also, if the right-hand array contains any null elements and no true comparison result is obtained, the result of ANY will be null, not false (again, assuming a strict comparison operator). This is in accordance with SQL's normal rules for Boolean combinations of null values. SOME is a synonym for ANY. <literal>ALL</literal> (array) expression operator ALL (array expression) The right-hand side is a parenthesized expression, which must yield an array value. The left-hand expression is evaluated and compared to each element of the array using the given operator, which must yield a Boolean result. The result of ALL is true if all comparisons yield true (including the special case where the array has zero elements). The result is false if any false result is found. If the array expression yields a null array, the result of ALL will be null. If the left-hand expression yields null, the result of ALL is ordinarily null (though a non-strict comparison operator could possibly yield a different result). Also, if the right-hand array contains any null elements and no false comparison result is obtained, the result of ALL will be null, not true (again, assuming a strict comparison operator). This is in accordance with SQL's normal rules for Boolean combinations of null values. Row-wise Comparison row_constructor operator row_constructor Each side is a row constructor, as described in . The two row values must have the same number of fields. Each side is evaluated and they are compared row-wise. Row comparisons are allowed when the operator is =, <>, <, <=, > or >=, or has semantics similar to one of these. (To be specific, an operator can be a row comparison operator if it is a member of a B-Tree operator class, or is the negator of the = member of a B-Tree operator class.) The = and <> cases work slightly differently from the others. Two rows are considered equal if all their corresponding members are non-null and equal; the rows are unequal if any corresponding members are non-null and unequal; otherwise the result of the row comparison is unknown (null). For the <, <=, > and >= cases, the row elements are compared left-to-right, stopping as soon as an unequal or null pair of elements is found. If either of this pair of elements is null, the result of the row comparison is unknown (null); otherwise comparison of this pair of elements determines the result. For example, ROW(1,2,NULL) < ROW(1,3,0) yields true, not null, because the third pair of elements are not considered. Prior to PostgreSQL 8.2, the <, <=, > and >= cases were not handled per SQL specification. A comparison like ROW(a,b) < ROW(c,d) was implemented as a < c AND b < d whereas the correct behavior is equivalent to a < c OR (a = c AND b < d). row_constructor IS DISTINCT FROM row_constructor This construct is similar to a <> row comparison, but it does not yield null for null inputs. Instead, any null value is considered unequal to (distinct from) any non-null value, and any two nulls are considered equal (not distinct). Thus the result will always be either true or false, never null. row_constructor IS NOT DISTINCT FROM row_constructor This construct is similar to a = row comparison, but it does not yield null for null inputs. Instead, any null value is considered unequal to (distinct from) any non-null value, and any two nulls are considered equal (not distinct). Thus the result will always be either true or false, never null. Set Returning Functions set returning functions functions generate_series This section describes functions that possibly return more than one row. Currently the only functions in this class are series generating functions, as detailed in . Series Generating Functions Function Argument Type Return Type Description generate_series(start, stop) int or bigint setof int or setof bigint (same as argument type) Generate a series of values, from start to stop with a step size of one generate_series(start, stop, step) int or bigint setof int or setof bigint (same as argument type) Generate a series of values, from start to stop with a step size of step

When step is positive, zero rows are returned if start is greater than stop. Conversely, when step is negative, zero rows are returned if start is less than stop. Zero rows are also returned for NULL inputs. It is an error for step to be zero. Some examples follow: select * from generate_series(2,4); generate_series ----------------- 2 3 4 (3 rows) select * from generate_series(5,1,-2); generate_series ----------------- 5 3 1 (3 rows) select * from generate_series(4,3); generate_series ----------------- (0 rows) select current_date + s.a as dates from generate_series(0,14,7) as s(a); dates ------------ 2004-02-05 2004-02-12 2004-02-19 (3 rows) System Information Functions shows several functions that extract session and system information. Session Information Functions Name Return Type Description current_database() name name of current database current_schema() name name of current schema current_schemas(boolean) name[] names of schemas in search path optionally including implicit schemas current_user name user name of current execution context inet_client_addr() inet address of the remote connection inet_client_port() int port of the remote connection inet_server_addr() inet address of the local connection inet_server_port() int port of the local connection pg_my_temp_schema() oid OID of session's temporary schema, or 0 if none pg_is_other_temp_schema(oid) boolean is schema another session's temporary schema? pg_postmaster_start_time() timestamp with time zone server start time session_user name session user name user name equivalent to current_user version() text PostgreSQL version information

user current schema current search path current The session_user is normally the user who initiated the current database connection; but superusers can change this setting with . The current_user is the user identifier that is applicable for permission checking. Normally, it is equal to the session user, but it can be changed with . It also changes during the execution of functions with the attribute SECURITY DEFINER. In Unix parlance, the session user is the real user and the current user is the effective user. current_user, session_user, and user have special syntactic status in SQL: they must be called without trailing parentheses. current_schema returns the name of the schema that is at the front of the search path (or a null value if the search path is empty). This is the schema that will be used for any tables or other named objects that are created without specifying a target schema. current_schemas(boolean) returns an array of the names of all schemas presently in the search path. The Boolean option determines whether or not implicitly included system schemas such as pg_catalog are included in the search path returned. The search path can be altered at run time. The command is: SET search_path TO schema , schema, ... inet_client_addr inet_client_port inet_server_addr inet_server_port inet_client_addr returns the IP address of the current client, and inet_client_port returns the port number. inet_server_addr returns the IP address on which the server accepted the current connection, and inet_server_port returns the port number. All these functions return NULL if the current connection is via a Unix-domain socket. pg_my_temp_schema pg_is_other_temp_schema pg_my_temp_schema returns the OID of the current session's temporary schema, or 0 if it has none (because it has not created any temporary tables). pg_is_other_temp_schema returns true if the given OID is the OID of any other session's temporary schema. (This can be useful, for example, to exclude other sessions' temporary tables from a catalog display.) pg_postmaster_start_time pg_postmaster_start_time returns the timestamp with time zone when the server started. version version returns a string describing the PostgreSQL server's version. privilege querying lists functions that allow the user to query object access privileges programmatically. See for more information about privileges. Access Privilege Inquiry Functions Name Return Type Description has_database_privilege(user, database, privilege) boolean does user have privilege for database has_database_privilege(database, privilege) boolean does current user have privilege for database has_function_privilege(user, function, privilege) boolean does user have privilege for function has_function_privilege(function, privilege) boolean does current user have privilege for function has_language_privilege(user, language, privilege) boolean does user have privilege for language has_language_privilege(language, privilege) boolean does current user have privilege for language has_schema_privilege(user, schema, privilege) boolean does user have privilege for schema has_schema_privilege(schema, privilege) boolean does current user have privilege for schema has_table_privilege(user, table, privilege) boolean does user have privilege for table has_table_privilege(table, privilege) boolean does current user have privilege for table has_tablespace_privilege(user, tablespace, privilege) boolean does user have privilege for tablespace has_tablespace_privilege(tablespace, privilege) boolean does current user have privilege for tablespace pg_has_role(user, role, privilege) boolean does user have privilege for role pg_has_role(role, privilege) boolean does current user have privilege for role

has_database_privilege has_function_privilege has_language_privilege has_schema_privilege has_table_privilege has_tablespace_privilege pg_has_role has_database_privilege checks whether a user can access a database in a particular way. The possibilities for its arguments are analogous to has_table_privilege. The desired access privilege type must evaluate to CREATE, CONNECT, TEMPORARY, or TEMP (which is equivalent to TEMPORARY). has_function_privilege checks whether a user can access a function in a particular way. The possibilities for its arguments are analogous to has_table_privilege. When specifying a function by a text string rather than by OID, the allowed input is the same as for the regprocedure data type (see ). The desired access privilege type must evaluate to EXECUTE. An example is: SELECT has_function_privilege('joeuser', 'myfunc(int, text)', 'execute'); has_language_privilege checks whether a user can access a procedural language in a particular way. The possibilities for its arguments are analogous to has_table_privilege. The desired access privilege type must evaluate to USAGE. has_schema_privilege checks whether a user can access a schema in a particular way. The possibilities for its arguments are analogous to has_table_privilege. The desired access privilege type must evaluate to CREATE or USAGE. has_table_privilege checks whether a user can access a table in a particular way. The user can be specified by name or by OID (pg_authid.oid), or if the argument is omitted current_user is assumed. The table can be specified by name or by OID. (Thus, there are actually six variants of has_table_privilege, which can be distinguished by the number and types of their arguments.) When specifying by name, the name can be schema-qualified if necessary. The desired access privilege type is specified by a text string, which must evaluate to one of the values SELECT, INSERT, UPDATE, DELETE, REFERENCES, or TRIGGER. (Case of the string is not significant, however.) An example is: SELECT has_table_privilege('myschema.mytable', 'select'); has_tablespace_privilege checks whether a user can access a tablespace in a particular way. The possibilities for its arguments are analogous to has_table_privilege. The desired access privilege type must evaluate to CREATE. pg_has_role checks whether a user can access a role in a particular way. The possibilities for its arguments are analogous to has_table_privilege. The desired access privilege type must evaluate to MEMBER or USAGE. MEMBER denotes direct or indirect membership in the role (that is, the right to do SET ROLE), while USAGE denotes whether the privileges of the role are immediately available without doing SET ROLE. To test whether a user holds a grant option on the privilege, append WITH GRANT OPTION to the privilege key word; for example 'UPDATE WITH GRANT OPTION'. shows functions that determine whether a certain object is visible in the current schema search path. A table is said to be visible if its containing schema is in the search path and no table of the same name appears earlier in the search path. This is equivalent to the statement that the table can be referenced by name without explicit schema qualification. For example, to list the names of all visible tables: SELECT relname FROM pg_class WHERE pg_table_is_visible(oid); Schema Visibility Inquiry Functions Name Return Type Description pg_conversion_is_visible(conversion_oid) boolean is conversion visible in search path pg_function_is_visible(function_oid) boolean is function visible in search path pg_operator_is_visible(operator_oid) boolean is operator visible in search path pg_opclass_is_visible(opclass_oid) boolean is operator class visible in search path pg_table_is_visible(table_oid) boolean is table visible in search path pg_type_is_visible(type_oid) boolean is type (or domain) visible in search path

pg_conversion_is_visible pg_function_is_visible pg_operator_is_visible pg_opclass_is_visible pg_table_is_visible pg_type_is_visible pg_conversion_is_visible, pg_function_is_visible, pg_operator_is_visible, pg_opclass_is_visible, pg_table_is_visible, and pg_type_is_visible perform the visibility check for conversions, functions, operators, operator classes, tables, and types. Note that pg_table_is_visible can also be used with views, indexes and sequences; pg_type_is_visible can also be used with domains. For functions and operators, an object in the search path is visible if there is no object of the same name and argument data type(s) earlier in the path. For operator classes, both name and associated index access method are considered. All these functions require object OIDs to identify the object to be checked. If you want to test an object by name, it is convenient to use the OID alias types (regclass, regtype, regprocedure, or regoperator), for example: SELECT pg_type_is_visible('myschema.widget'::regtype); Note that it would not make much sense to test an unqualified name in this way — if the name can be recognized at all, it must be visible. format_type pg_get_viewdef pg_get_ruledef pg_get_indexdef pg_get_triggerdef pg_get_constraintdef pg_get_expr pg_get_userbyid pg_get_serial_sequence pg_tablespace_databases lists functions that extract information from the system catalogs. System Catalog Information Functions Name Return Type Description format_type(type_oid, typemod) text get SQL name of a data type pg_get_constraintdef(constraint_oid) text get definition of a constraint pg_get_constraintdef(constraint_oid, pretty_bool) text get definition of a constraint pg_get_expr(expr_text, relation_oid) text decompile internal form of an expression, assuming that any Vars in it refer to the relation indicated by the second parameter pg_get_expr(expr_text, relation_oid, pretty_bool) text decompile internal form of an expression, assuming that any Vars in it refer to the relation indicated by the second parameter pg_get_indexdef(index_oid) text get CREATE INDEX command for index pg_get_indexdef(index_oid, column_no, pretty_bool) text get CREATE INDEX command for index, or definition of just one index column when column_no is not zero pg_get_ruledef(rule_oid) text get CREATE RULE command for rule pg_get_ruledef(rule_oid, pretty_bool) text get CREATE RULE command for rule pg_get_serial_sequence(table_name, column_name) text get name of the sequence that a serial or bigserial column uses pg_get_triggerdef(trigger_oid) text get CREATE [ CONSTRAINT ] TRIGGER command for trigger pg_get_userbyid(roleid) name get role name with given ID pg_get_viewdef(view_name) text get underlying SELECT command for view (deprecated) pg_get_viewdef(view_name, pretty_bool) text get underlying SELECT command for view (deprecated) pg_get_viewdef(view_oid) text get underlying SELECT command for view pg_get_viewdef(view_oid, pretty_bool) text get underlying SELECT command for view pg_tablespace_databases(tablespace_oid) setof oid get the set of database OIDs that have objects in the tablespace

format_type returns the SQL name of a data type that is identified by its type OID and possibly a type modifier. Pass NULL for the type modifier if no specific modifier is known. pg_get_constraintdef, pg_get_indexdef, pg_get_ruledef, and pg_get_triggerdef, respectively reconstruct the creating command for a constraint, index, rule, or trigger. (Note that this is a decompiled reconstruction, not the original text of the command.) pg_get_expr decompiles the internal form of an individual expression, such as the default value for a column. It can be useful when examining the contents of system catalogs. pg_get_viewdef reconstructs the SELECT query that defines a view. Most of these functions come in two variants, one of which can optionally pretty-print the result. The pretty-printed format is more readable, but the default format is more likely to be interpreted the same way by future versions of PostgreSQL; avoid using pretty-printed output for dump purposes. Passing false for the pretty-print parameter yields the same result as the variant that does not have the parameter at all. pg_get_serial_sequence returns the name of the sequence associated with a column, or NULL if no sequence is associated with the column. The first input parameter is a table name with optional schema, and the second parameter is a column name. Because the first parameter is potentially a schema and table, it is not treated as a double-quoted identifier, meaning it is lowercased by default, while the second parameter, being just a column name, is treated as double-quoted and has its case preserved. The function returns a value suitably formatted for passing to the sequence functions (see ). This association can be modified or removed with ALTER SEQUENCE OWNED BY. (The function probably should have been called pg_get_owned_sequence; its name reflects the fact that it's typically used with serial or bigserial columns.) pg_get_userbyid extracts a role's name given its OID. pg_tablespace_databases allows a tablespace to be examined. It returns the set of OIDs of databases that have objects stored in the tablespace. If this function returns any rows, the tablespace is not empty and cannot be dropped. To display the specific objects populating the tablespace, you will need to connect to the databases identified by pg_tablespace_databases and query their pg_class catalogs. col_description obj_description shobj_description comment about database objects The functions shown in extract comments previously stored with the command. A null value is returned if no comment could be found matching the specified parameters. Comment Information Functions Name Return Type Description col_description(table_oid, column_number) text get comment for a table column obj_description(object_oid, catalog_name) text get comment for a database object obj_description(object_oid) text get comment for a database object (deprecated) shobj_description(object_oid, catalog_name) text get comment for a shared database object

col_description returns the comment for a table column, which is specified by the OID of its table and its column number. obj_description cannot be used for table columns since columns do not have OIDs of their own. The two-parameter form of obj_description returns the comment for a database object specified by its OID and the name of the containing system catalog. For example, obj_description(123456,'pg_class') would retrieve the comment for a table with OID 123456. The one-parameter form of obj_description requires only the object OID. It is now deprecated since there is no guarantee that OIDs are unique across different system catalogs; therefore, the wrong comment could be returned. shobj_description is used just like obj_description only that it is used for retrieving comments on shared objects. Some system catalogs are global to all databases within each cluster and their descriptions are stored globally as well. System Administration Functions shows the functions available to query and alter run-time configuration parameters. Configuration Settings Functions Name Return Type Description current_setting(setting_name) text current value of setting set_config(setting_name, new_value, is_local) text set parameter and return new value

SET SHOW configuration of the server functions The function current_setting yields the current value of the setting setting_name. It corresponds to the SQL command SHOW. An example: SELECT current_setting('datestyle'); current_setting ----------------- ISO, MDY (1 row) set_config sets the parameter setting_name to new_value. If is_local is true, the new value will only apply to the current transaction. If you want the new value to apply for the current session, use false instead. The function corresponds to the SQL command SET. An example: SELECT set_config('log_statement_stats', 'off', false); set_config ------------ off (1 row) pg_cancel_backend pg_reload_conf pg_rotate_logfile signal backend processes The functions shown in send control signals to other server processes. Use of these functions is restricted to superusers. Server Signalling Functions Name Return Type Description pg_cancel_backend(pid int) boolean Cancel a backend's current query pg_reload_conf() boolean Cause server processes to reload their configuration files pg_rotate_logfile() boolean Rotate server's log file

Each of these functions returns true if successful and false otherwise. pg_cancel_backend sends a query cancel (SIGINT) signal to a backend process identified by process ID. The process ID of an active backend can be found from the procpid column in the pg_stat_activity view, or by listing the postgres processes on the server with ps. pg_reload_conf sends a SIGHUP signal to the server, causing the configuration files to be reloaded by all server processes. pg_rotate_logfile signals the log-file manager to switch to a new output file immediately. This works only when redirect_stderr is used for logging, since otherwise there is no log-file manager subprocess. pg_start_backup pg_stop_backup pg_switch_xlog pg_current_xlog_location pg_current_xlog_insert_location pg_xlogfile_name_offset pg_xlogfile_name backup The functions shown in assist in making on-line backups. Use of the first three functions is restricted to superusers. Backup Control Functions Name Return Type Description pg_start_backup(label text) text Set up for performing on-line backup pg_stop_backup() text Finish performing on-line backup pg_switch_xlog() text Force switch to a new transaction log file pg_current_xlog_location() text Get current transaction log write location pg_current_xlog_insert_location() text Get current transaction log insert location pg_xlogfile_name_offset(location text) text, integer Convert transaction log location string to file name and decimal byte offset within file pg_xlogfile_name(location text) text Convert transaction log location string to file name

pg_start_backup accepts a single parameter which is an arbitrary user-defined label for the backup. (Typically this would be the name under which the backup dump file will be stored.) The function writes a backup label file into the database cluster's data directory, and then returns the backup's starting transaction log location as text. The user need not pay any attention to this result value, but it is provided in case it is of use. postgres=# select pg_start_backup('label_goes_here'); pg_start_backup ----------------- 0/D4445B8 (1 row) pg_stop_backup removes the label file created by pg_start_backup, and instead creates a backup history file in the transaction log archive area. The history file includes the label given to pg_start_backup, the starting and ending transaction log locations for the backup, and the starting and ending times of the backup. The return value is the backup's ending transaction log location (which again might be of little interest). After noting the ending location, the current transaction log insertion point is automatically advanced to the next transaction log file, so that the ending transaction log file can be archived immediately to complete the backup. pg_switch_xlog moves to the next transaction log file, allowing the current file to be archived (assuming you are using continuous archiving). The result is the ending transaction log location within the just-completed transaction log file. If there has been no transaction log activity since the last transaction log switch, pg_switch_xlog does nothing and returns the end location of the previous transaction log file. pg_current_xlog_location displays the current transaction log write location in the same format used by the above functions. Similarly pg_current_xlog_insert_location displays the current transaction log insertion point. The insertion point is the logical end of transaction log at any instant, while the write location is the end of what has actually been written out from the server's internal buffers. The write location is the end of what can be examined from outside the server, and is usually what you want if you are interested in archiving partially-complete transaction log files. The insertion point is made available primarily for server debugging purposes. These are both read-only operations and do not require superuser permissions. You can use pg_xlogfile_name_offset to extract the corresponding transaction log file name and byte offset from the results of any of the above functions. For example: postgres=# select * from pg_xlogfile_name_offset(pg_stop_backup()); file_name | file_offset --------------------------+------------- 00000001000000000000000D | 4039624 (1 row) Similarly, pg_xlogfile_name extracts just the transaction log file name. When the given transction log location is exactly at an transaction log file boundary, both these functions return the name of the preceding transaction log file. This is usually the desired behavior for managing transaction log archiving behavior, since the preceding file is the last one that currently needs to be archived. For details about proper usage of these functions, see . The functions shown in calculate the actual disk space usage of database objects. pg_column_size pg_database_size pg_relation_size pg_size_pretty pg_tablespace_size pg_total_relation_size Database Object Size Functions Name Return Type Description pg_column_size(any) int Number of bytes used to store a particular value (possibly compressed) pg_database_size(oid) bigint Disk space used by the database with the specified OID pg_database_size(name) bigint Disk space used by the database with the specified name pg_relation_size(oid) bigint Disk space used by the table or index with the specified OID pg_relation_size(text) bigint Disk space used by the table or index with the specified name. The table name can be qualified with a schema name pg_size_pretty(bigint) text Converts a size in bytes into a human-readable format with size units pg_tablespace_size(oid) bigint Disk space used by the tablespace with the specified OID pg_tablespace_size(name) bigint Disk space used by the tablespace with the specified name pg_total_relation_size(oid) bigint Total disk space used by the table with the specified OID, including indexes and toasted data pg_total_relation_size(text) bigint Total disk space used by the table with the specified name, including indexes and toasted data. The table name can be qualified with a schema name

pg_column_size shows the space used to store any individual data value. pg_database_size and pg_tablespace_size accept the OID or name of a database or tablespace, and return the total disk space used therein. pg_relation_size accepts the OID or name of a table, index or toast table, and returns the size in bytes. pg_size_pretty can be used to format the result of one of the other functions in a human-readable way, using kB, MB, GB or TB as appropriate. pg_total_relation_size accepts the OID or name of a table or toast table, and returns the size in bytes of the data and all associated indexes and toast tables. The functions shown in provide native file access to files on the machine hosting the server. Only files within the database cluster directory and the log_directory can be accessed. Use a relative path for files within the cluster directory, and a path matching the log_directory configuration setting for log files. Use of these functions is restricted to superusers. Generic File Access Functions Name Return Type Description pg_ls_dir(dirname text) setof text List the contents of a directory pg_read_file(filename text, offset bigint, length bigint) text Return the contents of a text file pg_stat_file(filename text) record Return information about a file

pg_ls_dir pg_ls_dir returns all the names in the specified directory, except the special entries . and ... pg_read_file pg_read_file returns part of a text file, starting at the given offset, returning at most length bytes (less if the end of file is reached first). If offset is negative, it is relative to the end of the file. pg_stat_file pg_stat_file returns a record containing the file size, last accessed time stamp, last modified time stamp, last file status change time stamp (Unix platforms only), file creation time stamp (Windows only), and a boolean indicating if it is a directory. Typical usages include: SELECT * FROM pg_stat_file('filename'); SELECT (pg_stat_file('filename')).modification; The functions shown in manage advisory locks. For details about proper usage of these functions, see . Advisory Lock Functions Name Return Type Description pg_advisory_lock(key bigint) void Obtain exclusive advisory lock pg_advisory_lock(key1 int, key2 int) void Obtain exclusive advisory lock pg_advisory_lock_shared(key bigint) void Obtain shared advisory lock pg_advisory_lock_shared(key1 int, key2 int) void Obtain shared advisory lock pg_try_advisory_lock(key bigint) boolean Obtain exclusive advisory lock if available pg_try_advisory_lock(key1 int, key2 int) boolean Obtain exclusive advisory lock if available pg_try_advisory_lock_shared(key bigint) boolean Obtain shared advisory lock if available pg_try_advisory_lock_shared(key1 int, key2 int) boolean Obtain shared advisory lock if available pg_advisory_unlock(key bigint) boolean Release an exclusive advisory lock pg_advisory_unlock(key1 int, key2 int) boolean Release an exclusive advisory lock pg_advisory_unlock_shared(key bigint) boolean Release a shared advisory lock pg_advisory_unlock_shared(key1 int, key2 int) boolean Release a shared advisory lock pg_advisory_unlock_all() void Release all advisory locks held by the current session

pg_advisory_lock pg_advisory_lock locks an application-defined resource, which can be identified either by a single 64-bit key value or two 32-bit key values (note that these two key spaces do not overlap). If another session already holds a lock on the same resource, the function will wait until the resource becomes available. The lock is exclusive. Multiple lock requests stack, so that if the same resource is locked three times it must be also unlocked three times to be released for other sessions' use. pg_advisory_lock_shared pg_advisory_lock_shared works the same as pg_advisory_lock, except the lock can be shared with other sessions requesting shared locks. Only would-be exclusive lockers are locked out. pg_try_advisory_lock pg_try_advisory_lock is similar to pg_advisory_lock, except the function will not wait for the lock to become available. It will either obtain the lock immediately and return true, or return false if the lock cannot be acquired now. pg_try_advisory_lock_shared pg_try_advisory_lock_shared works the same as pg_try_advisory_lock, except it attempts to acquire shared rather than exclusive lock. pg_advisory_unlock pg_advisory_unlock will release a previously-acquired exclusive advisory lock. It will return true if the lock is successfully released. If the lock was in fact not held, it will return false, and in addition, an SQL warning will be raised by the server. pg_advisory_unlock_shared pg_advisory_unlock_shared works the same as pg_advisory_unlock, except to release a shared advisory lock. pg_advisory_unlock_all pg_advisory_unlock_all will release all advisory locks held by the current session. (This function is implicitly invoked at session end, even if the client disconnects ungracefully.) XML Functions The functions and function-like expressions described in this section operate on values of type xml. Check for information about the xml type. The function-like expressions xmlparse and xmlserialize for converting to and from type xml are not repeated here. Producing XML Content A set of functions and function-like expressions are available for producing XML content from SQL data. As such, they are particularly suitable for formatting query results into XML documents for processing in client applications. <literal>xmlcomment</literal> xmlcomment xmlcomment(text) The function xmlcomment creates an XML value containing an XML comment with the specified text as content. The text cannot contain -- or end with a - so that the resulting construct is a valid XML comment. If the argument is null, the result is null. Example: ]]> <literal>xmlconcat</literal> xmlconcat xmlconcat(xml, ...) The function xmlconcat concatenates a list of individual XML values to create a single value containing an XML content fragment. Null values are omitted; the result is only null if there are no nonnull arguments. Example: ', 'foo'); xmlconcat ---------------------- foo ]]> XML declarations, if present are combined as follows. If all argument values have the same XML version declaration, that version is used in the result, else no version is used. If all argument values have the standalone declaration value yes, then that value is used in the result. If all argument values have a standalone declaration value and at least one is no, then that is used in the result. Else the result will have no standalone declaration. If the result is determined to require a standalone declaration but no version declaration, a version declaration with version 1.0 will be used because XML requires an XML declaration to contain a version declaration. Encoding declarations are ignored and removed in all cases. Example: ', ''); xmlconcat ----------------------------------- ]]> <literal>xmlelement</literal> xmlelement xmlelement(name name , xmlattributes(value AS attname , ... ) , content, ...) The xmlelement expression produces an XML element with the given name, attributes, and content. Examples: SELECT xmlelement(name foo, xmlattributes('xyz' as bar)); xmlelement ------------------ SELECT xmlelement(name foo, xmlattributes(current_date as bar), 'cont', 'ent'); xmlelement ------------------------------------- content ]]> Element and attribute names that are not valid XML names are escaped by replacing the offending characters by the sequence _xHHHH_, where HHHH is the character's Unicode codepoint in hexadecimal notation. For example: ]]> An explicit attribute name need not be specified if the attribute value is a column reference, in which case the column's name will be used as attribute name by default. In any other case, the attribute must be given an explicit name. So this example is valid: CREATE TABLE test (a xml, b xml); SELECT xmlelement(name test, xmlattributes(a, b)) FROM test; But these are not: SELECT xmlelement(name test, xmlattributes('constant'), a, b) FROM test; SELECT xmlelement(name test, xmlattributes(func(a, b))) FROM test; Element content, if specified, will be formatted according to data type. If the content is itself of type xml, complex XML documents can be constructed. For example: ]]> Content of other types will be formatted into valid XML character data. This means in particular that the characters <, >, and & will be converted to entities. Binary data (data type bytea) will be represented in base64 or hex encoding, depending on the setting of the configuration parameter . The particular behavior for individual data types is expected evolve in order to align the SQL and PostgreSQL data types with the XML Schema specification, at which point a more precise description will appear. <literal>xmlforest</literal> xmlforest xmlforest(content AS name , ...) The xmlforest expression produces an XML forest (sequence) of elements using the given names and content. Examples: abc123 SELECT xmlforest(table_name, column_name) FROM information_schema.columns WHERE table_schema = 'pg_catalog'; xmlforest ------------------------------------------------------------------------------------------- pg_authidrolname pg_authidrolsuper ... ]]> As seen in the second example, the element name can be omitted if the content value is a column reference, in which case the column name is used by default. Otherwise, a name must be specified. Element names that are not valid XML names are escaped as shown for xmlelement above. Similarly, content data is escaped to make valid XML content, unless it is already of type xml. Note that XML forests are not valid XML documents if they consist of more than one element. So it might be useful to wrap xmlforest expressions in xmlelement. <literal>xmlpi</literal> xmlpi xmlpi(name target , content) The xmlpi expression creates an XML processing instruction. The content, if present, must not contain the character sequence ?>. Example: ]]> <literal>xmlroot</literal> xmlroot xmlroot(xml, version text|no value , standalone yes|no|no value) The xmlroot expression alters the properties of the root node of an XML value. If a version is specified, this replaces the value in the version declaration, if a standalone value is specified, this replaces the value in the standalone declaration. abc'), version '1.0', standalone yes); xmlroot ---------------------------------------- abc ]]> XML Predicates IS DOCUMENT xml IS DOCUMENT The expression IS DOCUMENT returns true if the argument XML value is a proper XML document, false if it is not (that is, it is a content fragment), or null if the argument is null. See about the difference between documents and content fragments. Mapping Tables to XML The following functions map the contents of relational tables to XML values. They can be thought of as XML export functionality. table_to_xml(tbl regclass, nulls boolean, tableforest boolean, targetns text) query_to_xml(query text, nulls boolean, tableforest boolean, targetns text) cursor_to_xml(cursor refcursor, count int, nulls boolean, tableforest boolean, targetns text) The return type of each function is xml. table_to_xml maps the content of the named table, passed as parameter tbl. The regclass accepts strings identifying tables using the usual notation, including optional schema qualifications and double quotes. query_to_xml executes the query whose text is passed as parameter query and maps the result set. cursor_to_xml fetches the indicated number of rows from the cursor specificed by the parameter cursor. This variant is recommendable if large tables have to be mapped, because the result value is built up in memory by each function. If tableforest is false, then the resulting XML document looks like this: data data ... ... ]]> If tableforest is true, the result is an XML content fragment that looks like this: data data ... ... ]]> If no table name is avaible, that is, when mapping a query or a cursor, the string table is used in the first format, row in the second format. The choice between these formats is up to the user. The first format is a proper XML document, which will be important in many applications. The second format tends to be more useful in the cursor_to_xml function if the result values are to be reassembled into one document later on. The functions for producing XML content discussed above, in particular xmlelement, can be used to alter the results to taste. The data values are mapping in the same way as described for the function xmlelement above. The parameter nulls determines whether null values should be included in the output. If true, null values in columns are represented as ]]> where xsi is the XML namespace prefix for XML Schema Instance. An appropriate namespace declaration will be added to the result value. If false, columns containing null values are simply omitted from the output. The parameter targetns specifies the desired XML namespace of the result. If no particular namespace is wanted, an empty string should be passed. The following functions return XML Schema documents describing the mappings made by the data mappings produced by the corresponding functions above. table_to_xmlschema(tbl regclass, nulls boolean, tableforest boolean, targetns text) query_to_xmlschema(query text, nulls boolean, tableforest boolean, targetns text) cursor_to_xmlschema(cursor refcursor, nulls boolean, tableforest boolean, targetns text) It is essential that the same parameters are passed in order to obtain matching XML data mappings and XML Schema documents. The following functions produce XML data mappings and the corresponding XML Schema in one document (or forest), linked together. They can be useful where self-contained and self-describing results are wanted. table_to_xml_and_xmlschema(tbl regclass, nulls boolean, tableforest boolean, targetns text) query_to_xml_and_xmlschema(query text, nulls boolean, tableforest boolean, targetns text) As an example for using the output produced by these functions, shows an XSLT stylesheet that converts the output of table_to_xml_and_xmlschema to an HTML document containing a tabular rendition of the table data. In a similar manner, the result data of these functions can be converted into other XML-based formats.

XSLT stylesheet for converting SQL/XML output to HTML <xsl:value-of select="name(current())"/>

]]> Processing XML XML support is not just the existence of an xml data type, but a variety of features supported by a database system. These capabilities include import/export, indexing, searching, transforming, and XML to SQL mapping. PostgreSQL supports some but not all of these XML capabilities. For an overview of XML use in databases, see . Indexing contrib/xml2/ functions can be used in expression indexes to index specific XML fields. To index the full contents of XML documents, the full-text indexing tool contrib/tsearch2/ can be used. Of course, Tsearch2 indexes have no XML awareness so additional contrib/xml2/ checks should be added to queries. Searching XPath searches are implemented using contrib/xml2/. It processes XML text documents and returns results based on the requested query. Transforming contrib/xml2/ supports XSLT (Extensible Stylesheet Language Transformation). XML to SQL Mapping This involves converting XML data to and from relational structures. PostgreSQL has no internal support for such mapping, and relies on external tools to do such conversions.