SQL Syntax A description of the general syntax of SQL. SQL manipulates sets of data. The language is composed of various key words. Arithmetic and procedural expressions are allowed. We will cover these topics in this chapter; subsequent chapters will include details on data types, functions, and operators. Key Words SQL92 defines key words for the language which have specific meaning. Some key words are reserved, which indicates that they are restricted to appear in only certain contexts. Other key words are not restricted, which indicates that in certain contexts they have a specific meaning but are not otherwise constrained. Postgres implements an extended subset of the SQL92 and SQL3 languages. Some language elements are not as restricted in this implementation as is called for in the language standards, in part due to the extensibility features of Postgres. Information on SQL92 and SQL3 key words is derived from . Reserved Key Words SQL92 and SQL3 have reserved key words which are not allowed as identifiers and not allowed in any usage other than as fundamental tokens in SQL statements. Postgres has additional key words which have similar restrictions. In particular, these key words are not allowed as column or table names, though in some cases they are allowed to be column labels (i.e. in AS clauses). Any string can be specified as an identifier if surrounded by double quotes (like this!). Some care is required since such an identifier will be case sensitive and will retain embedded whitespace other special characters. The following are Postgres reserved words which are neither SQL92 nor SQL3 reserved words. These are allowed to be present as column labels, but not as identifiers: ABORT ANALYZE BINARY CLUSTER CONSTRAINT COPY DO EXPLAIN EXTEND LISTEN LOAD LOCK MOVE NEW NONE NOTIFY RESET SETOF SHOW UNLISTEN UNTIL VACUUM VERBOSE The following are Postgres reserved words which are also SQL92 or SQL3 reserved words, and which are allowed to be present as column labels, but not as identifiers: CASE COALESCE CROSS CURRENT CURRENT_USER DEC DECIMAL ELSE END FALSE FOREIGN GLOBAL GROUP LOCAL NULLIF NUMERIC ORDER POSITION PRECISION SESSION_USER TABLE THEN TRANSACTION TRUE USER WHEN The following are Postgres reserved words which are also SQL92 or SQL3 reserved words: ADD ALL ALTER AND ANY AS ASC BEGIN BETWEEN BOTH BY CASCADE CAST CHAR CHARACTER CHECK CLOSE COLLATE COLUMN COMMIT CONSTRAINT CREATE CURRENT_DATE CURRENT_TIME CURRENT_TIMESTAMP CURSOR DECLARE DEFAULT DELETE DESC DISTINCT DROP EXECUTE EXISTS EXTRACT FETCH FLOAT FOR FROM FULL GRANT HAVING IN INNER INSERT INTERVAL INTO IS JOIN LEADING LEFT LIKE LOCAL NAMES NATIONAL NATURAL NCHAR NO NOT NULL ON OR OUTER PARTIAL PRIMARY PRIVILEGES PROCEDURE PUBLIC REFERENCES REVOKE RIGHT ROLLBACK SELECT SET SUBSTRING TO TRAILING TRIM UNION UNIQUE UPDATE USING VALUES VARCHAR VARYING VIEW WHERE WITH WORK The following are SQL92 reserved key words which are not Postgres reserved key words, but which if used as function names are always translated into the function length: CHAR_LENGTH CHARACTER_LENGTH The following are SQL92 or SQL3 reserved key words which are not Postgres reserved key words, but if used as type names are always translated into an alternate, native type: BOOLEAN DOUBLE FLOAT INT INTEGER INTERVAL REAL SMALLINT The following are either SQL92 or SQL3 reserved key words which are not key words in Postgres. These have no proscribed usage in Postgres at the time of writing (v6.5) but may become reserved key words in the future: Some of these key words represent functions in SQL92. These functions are defined in Postgres, but the parser does not consider the names to be key words and they are allowed in other contexts. ALLOCATE ARE ASSERTION AT AUTHORIZATION AVG BIT BIT_LENGTH CASCADED CATALOG COLLATION CONNECT CONNECTION CONTINUE CONVERT CORRESPONDING COUNT DATE DEALLOCATE DEC DESCRIBE DESCRIPTOR DIAGNOSTICS DISCONNECT DOMAIN ESCAPE EXCEPT EXCEPTION EXEC EXTERNAL FIRST FOUND GET GO GOTO IDENTITY INDICATOR INPUT INTERSECT LAST LOWER MAX MIN MODULE OCTET_LENGTH OPEN OUTPUT OVERLAPS PREPARE PRESERVE ROWS SCHEMA SECTION SESSION SIZE SOME SQL SQLCODE SQLERROR SQLSTATE SUM SYSTEM_USER TEMPORARY TRANSLATE TRANSLATION UNKNOWN UPPER USAGE VALUE WHENEVER WRITE Non-reserved Keywords SQL92 and SQL3 have non-reserved keywords which have a prescribed meaning in the language but which are also allowed as identifiers. Postgres has additional keywords which allow similar unrestricted usage. In particular, these keywords are allowed as column or table names. The following are Postgres non-reserved key words which are neither SQL92 nor SQL3 non-reserved key words: ACCESS AFTER AGGREGATE BACKWARD BEFORE CACHE CREATEDB CREATEUSER CYCLE DATABASE DELIMITERS EACH ENCODING EXCLUSIVE FORWARD FUNCTION HANDLER INCREMENT INDEX INHERITS INSENSITIVE INSTEAD ISNULL LANCOMPILER LOCATION MAXVALUE MINVALUE MODE NOCREATEDB NOCREATEUSER NOTHING NOTNULL OIDS OPERATOR PASSWORD PROCEDURAL RECIPE RENAME RETURNS ROW RULE SEQUENCE SERIAL SHARE START STATEMENT STDIN STDOUT TRUSTED VALID VERSION The following are Postgres non-reserved key words which are SQL92 or SQL3 reserved key words: ABSOLUTE ACTION CONSTRAINTS DAY DEFERRABLE DEFERRED HOUR IMMEDIATE INITIALLY INSENSITIVE ISOLATION KEY LANGUAGE LEVEL MATCH MINUTE MONTH NEXT OF ONLY OPTION PENDANT PRIOR PRIVILEGES READ RELATIVE RESTRICT SCROLL SECOND TIME TIMESTAMP TIMEZONE_HOUR TIMEZONE_MINUTE TRIGGER YEAR ZONE The following are Postgres non-reserved key words which are also either SQL92 or SQL3 non-reserved key words: COMMITTED SERIALIZABLE TYPE The following are either SQL92 or SQL3 non-reserved key words which are not key words of any kind in Postgres: ADA C CATALOG_NAME CHARACTER_SET_CATALOG CHARACTER_SET_NAME CHARACTER_SET_SCHEMA CLASS_ORIGIN COBOL COLLATION_CATALOG COLLATION_NAME COLLATION_SCHEMA COLUMN_NAME COMMAND_FUNCTION CONDITION_NUMBER CONNECTION_NAME CONSTRAINT_CATALOG CONSTRAINT_NAME CONSTRAINT_SCHEMA CURSOR_NAME DATA DATE_TIME_INTERVAL_CODE DATE_TIME_INTERVAL_PRECISION DYNAMIC_FUNCTION FORTRAN LENGTH MESSAGE_LENGTH MESSAGE_OCTET_LENGTH MORE MUMPS NAME NULLABLE NUMBER PAD PASCAL PLI REPEATABLE RETURNED_LENGTH RETURNED_OCTET_LENGTH RETURNED_SQLSTATE ROW_COUNT SCALE SCHEMA_NAME SERVER_NAME SPACE SUBCLASS_ORIGIN TABLE_NAME UNCOMMITTED UNNAMED Comments A comment is an arbitrary sequence of characters following double dashes up to the end of the line. We also support double-slashes as comments, e.g.: -- This is a standard SQL comment // And this is another supported comment style, like C++ We also support C-style block comments, e.g.: /* multi line comment */ Names Names in SQL are sequences of less than NAMEDATALEN alphanumeric characters, starting with an alphabetic character. By default, NAMEDATALEN is set to 32, but at the time the system is built, NAMEDATALEN can be changed by changing the #define in src/backend/include/postgres.h. Underscore ("_") is considered an alphabetic character. In some contexts, names may contain other characters if surrounded by double quotes. For example, table or column names may contain otherwise disallowed characters such as spaces, ampersands, etc. using this technique. Constants There are three implicitly typed constants for use in Postgres: strings, integers, and floating point numbers. Constants can also be specified with explicit types, which can enable more accurate representation and more efficient handling by the backend. The implicit constants are described below; explicit constants are discussed afterwards. String Constants Strings in SQL are arbitrary sequences of ASCII characters bounded by single quotes ("'", e.g. 'This is a string'). Uppercase alphabetics within strings are accepted literally. Non-printing characters may be embedded within strings by prepending them with a backslash ("\"; e.g. "\tab". SQL92 allows single quotes to be embedded in strings by typing two adjacent single quotes (e.g. 'Dianne''s horse'), and for historical reasons Postgres also allows single quotes to be escaped with a backslash (e.g. 'Dianne\'s horse'). Because of the limitations on instance sizes, string constants are currently limited to a length of a little less than 8192 bytes. Larger strings may be handled using the Postgres Large Object interface. Integer Constants Integer constants in SQL are collection of ASCII digits with no decimal point. Legal values range from -2147483648 to +2147483647. This will vary depending on the operating system and host machine. Note that larger integers can be specified for int8 by using SQL92 string notation or Postgres type notation: int8 '4000000000' -- string style '4000000000'::int8 -- Postgres (historical) style Floating Point Constants Floating point constants consist of an integer part, a decimal point, and a fraction part or scientific notation of the following format: {dig}.{dig} [e [+-] {dig}] where dig is one or more digits. You must include at least one dig after the period and after the [+-] if you use those options. An exponent with a missing mantissa has a mantissa of 1 inserted. There may be no extra characters embedded in the string. Floating point constaints are of type float8. float4 can be specified explicitly by using SQL92 string notation or Postgres type notation: float4 '1.23' -- string style '1.23'::float4 -- Postgres (historical) style Constants of Postgres User-Defined Types A constant of an arbitrary type can be entered using any one of the following notations: type 'string' 'string'::type CAST 'string' AS type The value inside the string is passed to the input conversion routine for the type called type. The result is a constant of the indicated type. The explicit typecast may be omitted if there is no ambiguity as to the type the constant must be, in which case it is automatically coerced. Array constants Array constants are arrays of any Postgres type, including other arrays, string constants, etc. The general format of an array constant is the following: {val1delimval2delim} where delim is the delimiter for the type stored in the pg_type class. (For built-in types, this is the comma character (","). An example of an array constant is {{1,2,3},{4,5,6},{7,8,9}} This constant is a two-dimensional, 3 by 3 array consisting of three sub-arrays of integers. Individual array elements can and should be placed between quotation marks whenever possible to avoid ambiguity problems with respect to leading white space. Fields and Columns Fields A field is either an attribute of a given class or one of the following: oid stands for the unique identifier of an instance which is added by Postgres to all instances automatically. Oids are not reused and are 32 bit quantities. xmin The identity of the inserting transaction. xmax The identity of the deleting transaction. cmin The command identifier within the transaction. cmax The identity of the deleting command. For further information on these fields consult . Times are represented internally as instances of the abstime data type. Transaction and command identifiers are 32 bit quantities. Transactions are assigned sequentially starting at 512. Columns A column is a construct of the form: instance{.composite_field}.field `['number`]' instance identifies a particular class and can be thought of as standing for the instances of that class. An instance variable is either a class name, a surrogate for a class defined by means of a FROM clause, or the keyword NEW or CURRENT. NEW and CURRENT can only appear in the action portion of a rule, while other instance variables can be used in any SQL statement. composite_field is a field of of one of the Postgres composite types, while successive composite fields address attributes in the class(s) to which the composite field evaluates. Lastly, field is a normal (base type) field in the class(s) last addressed. If field is of type array, then the optional number designator indicates a specific element in the array. If no number is indicated, then all array elements are returned. Operators Any built-in system, or user-defined operator may be used in SQL. For the list of built-in and system operators consult . For a list of user-defined operators consult your system administrator or run a query on the pg_operator class. Parentheses may be used for arbitrary grouping of operators in expressions. Expressions SQL92 allows expressions to transform data in tables. Expressions may contain operators (see for more details) and functions ( has more information). An expression is one of the following: ( a_expr ) constant attribute a_expr binary_operator a_expr a_expr right_unary_operator left_unary_operator a_expr parameter functional expression aggregate expression We have already discussed constants and attributes. The three kinds of operator expressions indicate respectively binary (infix), right-unary (suffix) and left-unary (prefix) operators. The following sections discuss the remaining options. Parameters A parameter is used to indicate a parameter in a SQL function. Typically this is used in SQL function definition statement. The form of a parameter is: $number For example, consider the definition of a function, dept, as CREATE FUNCTION dept (name) RETURNS dept AS 'select * from dept where name=$1' LANGUAGE 'sql'; Functional Expressions A functional expression is the name of a legal SQL function, followed by its argument list enclosed in parentheses: function (a_expr [, a_expr ... ] ) For example, the following computes the square root of an employee salary: sqrt(emp.salary) Aggregate Expressions An aggregate expression represents the application of an aggregate function across the rows selected by a query. An aggregate function reduces multiple inputs to a single output value, such as the sum or average of the inputs. The syntax of an aggregate expression is one of the following: aggregate_name (expression) aggregate_name (DISTINCT expression) aggregate_name ( * ) where aggregate_name is a previously defined aggregate, and expression is any expression that doesn't itself contain an aggregate expression. The first form of aggregate expression invokes the aggregate across all input rows for which the given expression yields a non-null value. The second form invokes the aggregate for all distinct non-null values of the expression found in the input rows. The last form invokes the aggregate once for each input row regardless of null or non-null values; since no particular input value is specified, it is generally only useful for the count() aggregate. For example, count(*) yields the total number of input rows; count(f1) yields the number of input rows in which f1 is non-null; count(distinct f1) yields the number of distinct non-null values of f1. Target List A target list is a parenthesized, comma-separated list of one or more elements, each of which must be of the form: a_expr [ AS result_attname ] where result_attname is the name of the attribute to be created (or an already existing attribute name in the case of update statements.) If result_attname is not present, then a_expr must contain only one attribute name which is assumed to be the name of the result field. In Postgres default naming is only used if a_expr is an attribute. Qualification A qualification consists of any number of clauses connected by the logical operators: NOT AND OR A clause is an a_expr that evaluates to a boolean over a set of instances. From List The from list is a comma-separated list of from expressions. Each "from expression" is of the form: [ class_reference ] instance_variable {, [ class_ref ] instance_variable... } where class_reference is of the form class_name [ * ] The "from expression" defines one or more instance variables to range over the class indicated in class_reference. One can also request the instance variable to range over all classes that are beneath the indicated class in the inheritance hierarchy by postpending the designator asterisk ("*").