mirror of
https://git.postgresql.org/git/postgresql.git
synced 2024-09-19 06:30:20 +02:00
863a62064c
test isn't that complete up to now, but I think it shows enough of the capabilities of the module. The Makefile assumes it is located in a directory under pgsql/src/pl. Since it includes Makefile.global and Makefile.port and doesn't use any own compiler/linker calls, it should build on most of our supported platforms (I only tested under Linux up to now). It requires flex and bison I think. Maybe we should ship prepared gram.c etc. like for the main parser too? Jan
449 lines
17 KiB
Plaintext
449 lines
17 KiB
Plaintext
PL/pgSQL
|
|
A procedural language for the PostgreSQL RDBMS
|
|
|
|
Jan Wieck <jwieck@debis.com>
|
|
|
|
|
|
|
|
Preface
|
|
|
|
PL/pgSQL is a procedural language based on SQL designed for
|
|
the PostgreSQL database system.
|
|
|
|
The extensibility features of PostgreSQL are mostly based on
|
|
the ability to define functions for various operations.
|
|
Functions could have been written in PostgreSQL's SQL dialect
|
|
or in the C programming language. Functions written in C are
|
|
compiled into a shared object and loaded by the database
|
|
backend process on demand. Also the trigger features of
|
|
PostgreSQL are based on functions but required the use of the
|
|
C language.
|
|
|
|
Since version 6.3 PostgreSQL supports the definition of
|
|
procedural languages. In the case of a function or trigger
|
|
procedure defined in a procedural language, the database has
|
|
no builtin knowlege how to interpret the functions source
|
|
text. Instead, the function and trigger calls are passed into
|
|
a handler that knows the details of the language. The
|
|
handler itself is a function compiled into a shared object
|
|
and loaded on demand.
|
|
|
|
|
|
Overview
|
|
|
|
The PL/pgSQL language is case insensitive. All keywords and
|
|
identifiers can be used in upper-/lowercase mixed.
|
|
|
|
PL/pgSQL is a block oriented language. A block is defined as
|
|
|
|
[<<label>>]
|
|
[DECLARE
|
|
-- declarations]
|
|
BEGIN
|
|
-- statements
|
|
END;
|
|
|
|
There can be any number of subblocks in the statements
|
|
section of a block. Subblocks can be used to hide variables
|
|
from outside a block of statements (see Scope and visability
|
|
below). The variables declared in the declarations section
|
|
preceding a block are initialized to their default values
|
|
every time the block is entered, not only once per function
|
|
call.
|
|
|
|
It is important not to misunderstand the meaning of BEGIN/END
|
|
for grouping statements in PL/pgSQL and the database commands
|
|
for transaction control. Functions or trigger procedures
|
|
cannot start or commit transactions and PostgreSQL
|
|
transactions cannot have subtransactions.
|
|
|
|
|
|
Comments
|
|
|
|
There are two types of comments in PL/pgSQL. A double dash
|
|
'--' starts a comment that extends to the end of the line. A
|
|
'/*' starts a block comment that extends to the next '*/'.
|
|
Block comments cannot be nested, but double dash comments can
|
|
be enclosed into a block comment and double dashes can hide
|
|
'/*' and '*/'.
|
|
|
|
|
|
Declarations
|
|
|
|
All variables, rows and records used in a block or it's
|
|
subblocks must be declared in the declarations section of the
|
|
block except for the loop variable of a FOR loop iterating
|
|
over a range of integer values. The parameters given to the
|
|
function are automatically declared with the usual
|
|
identifiers $n. The declarations have the following syntax:
|
|
|
|
<name> [CONSTANT] <type> [NOT NULL]
|
|
[DEFAULT | := <value>];
|
|
|
|
Declares a variable of the specified type. If the
|
|
variable is declared as CONSTANT, the value cannot be
|
|
changed. If NOT NULL is specified, an assignment of a
|
|
NULL value results in a runtime error. Since the
|
|
default value of a variable is the SQL NULL value,
|
|
all variables declared as NOT NULL must also have a
|
|
default value.
|
|
|
|
The default value is evaluated at the actual function
|
|
call. So assigning 'now' to an abstime variable
|
|
causes the variable to have the time of the actual
|
|
function call, not when the function was compiled
|
|
(during it's first call since the lifetime of the
|
|
database connection).
|
|
|
|
<name> <class>%ROWTYPE;
|
|
|
|
Declares a row with the structure of the given class.
|
|
Class must be an existing table- or viewname of the
|
|
database. The fields of the row are accessed in the
|
|
dot notation. Parameters to a procedure could be
|
|
tuple types. In that case the corresponding
|
|
identifier $n will be a rowtype. Only the user
|
|
attributes of a tuple are accessible in the row.
|
|
There must be no whitespaces between the classname,
|
|
the percent and the ROWTYPE keyword.
|
|
|
|
The fields of the rowtype inherit the tables
|
|
fieldsizes for char() etc. data types (atttypmod
|
|
from pg_attribute).
|
|
|
|
<name> RECORD;
|
|
|
|
Records are similar to rowtypes, but they have no
|
|
predefined structure They are used in selections and
|
|
FOR loops to hold one actual database tuple from a
|
|
select operation. One and the same record can be used
|
|
in different selections. Accessing a record or an
|
|
attempt to assign a value to a record field when
|
|
there's no actual tuple in it results in a runtime
|
|
error.
|
|
|
|
The new and old tuples in triggers are given to the
|
|
trigger procedure as records. This is necessary,
|
|
because under PostgreSQL one and the same trigger
|
|
procedure can handle trigger events for different
|
|
tables.
|
|
|
|
<name> ALIAS FOR $n;
|
|
|
|
For better readability of the code it's possible to
|
|
define an alias for a positional parameter to the
|
|
function.
|
|
|
|
RENAME <oldname> TO <newname>;
|
|
|
|
Change the name of a variable, record or rowtype.
|
|
This is useful if new or old should be referenced by
|
|
another name inside a trigger procedure.
|
|
|
|
Datatypes
|
|
|
|
The type of a variable can be any of the existing data types
|
|
of the database. <type> above is defined as:
|
|
|
|
postgesql-basetype
|
|
or variable%TYPE
|
|
or rowtype.field%TYPE
|
|
or class.field%TYPE
|
|
|
|
As for the rowtype declaration, there must be no whitespaces
|
|
between the classname, the percent and the TYPE keyword.
|
|
|
|
Expressions
|
|
|
|
All expressions used in PL/pgSQL statements are processed
|
|
using the backends executor. Since even a constant looking
|
|
expression can have a totally different meaning for a
|
|
particular data type (as 'now' for abstime), it is impossible
|
|
for the PL/pgSQL parser to identify real constant values
|
|
other than the NULL keyword. The expressions are evaluated by
|
|
internally executing a query
|
|
|
|
SELECT <expr>
|
|
|
|
over the SPI manager. In the expression, occurences of
|
|
variable identifiers are substituted by parameters and the
|
|
actual values from the variables are passed to the executor
|
|
as query parameters. All the expressions used in a PL/pgSQL
|
|
function are only prepared and saved once.
|
|
|
|
If record fields are used in expressions or database
|
|
statements, the data types of the fields should not change
|
|
between calls of one and the same expression. Keep this in
|
|
mind when writing trigger procedures that handle events for
|
|
more than one table.
|
|
|
|
Statements
|
|
|
|
Anything not understood by the parser as specified below will
|
|
be put into a query and sent down to the database engine to
|
|
execute. The resulting query should not return any data
|
|
(insert, update, delete queries and all utility statements).
|
|
|
|
Assignment
|
|
|
|
An assignment of a value to a variable or rowtype field
|
|
is written as:
|
|
|
|
<identifier> := <expr>;
|
|
|
|
If the expressions result data type doesn't match the
|
|
variables data type, or the variables atttypmod value is
|
|
known (as for char(20)), the result value will be
|
|
implicitly casted by the PL/pgSQL executor using the
|
|
result types output- and the variables type input-
|
|
functions. Note that this could potentially result in
|
|
runtime errors generated by the types input functions.
|
|
|
|
An assignment of a complete selection into a record or
|
|
rowtype can be done as:
|
|
|
|
SELECT expressions INTO <target> FROM fromlist;
|
|
|
|
Target can be a record or rowtype variable, or a comma
|
|
separated list of variables and record/row fields.
|
|
|
|
If a rowtype or a variable list is used as target, the
|
|
selected values must exactly match the structure of the
|
|
target(s) or a runtime error occurs. The fromlist can be
|
|
followed by any valid qualification, grouping, sorting
|
|
etc.
|
|
|
|
There is a special condition [NOT] FOUND that can be used
|
|
immediately after a SELECT INTO to check if the data has
|
|
been found.
|
|
|
|
SELECT * INTO myrec FROM EMP WHERE empname = myname;
|
|
IF NOT FOUND THEN
|
|
RAISE EXCEPTION 'employee % not found', myname;
|
|
END IF;
|
|
|
|
If the selection returns multiple rows, only the first is
|
|
moved into the target fields. All others are discarded.
|
|
|
|
|
|
Calling another function
|
|
|
|
If a function should be called, this is normally done by
|
|
a SELECT query. But there are cases where someone isn't
|
|
interested in the functions result.
|
|
|
|
PERFORM querystring;
|
|
|
|
executes a 'SELECT querystring' over the SPI manager and
|
|
discards the result.
|
|
|
|
|
|
Returning from the function
|
|
|
|
RETURN <expr>;
|
|
|
|
The function terminates and the value of <expr> will be
|
|
returned to the upper executor. The return value of a
|
|
function cannot be undefined. If control reaches the end
|
|
of the toplevel block of the function without hitting a
|
|
RETURN statement, a runtime error will occur.
|
|
|
|
|
|
Aborting and messages
|
|
|
|
As indicated above there is an RAISE statement that can
|
|
throw messages into the PostgreSQL elog mechanism.
|
|
|
|
RAISE level 'format' [, identifier [...]];
|
|
|
|
Inside the format, % can be used as a placeholder for the
|
|
following, comma separated identifiers. The identifiers
|
|
must specify an existing variable or row/record field.
|
|
|
|
|
|
Conditionals
|
|
|
|
IF <expr> THEN
|
|
-- statements
|
|
[ELSE
|
|
-- statements]
|
|
END IF;
|
|
|
|
The expression <expr> must return a value that at least
|
|
can be casted into a boolean.
|
|
|
|
|
|
Loops
|
|
|
|
There are multiple types of loops.
|
|
|
|
[<<label>>]
|
|
LOOP
|
|
-- statements
|
|
END LOOP;
|
|
|
|
An unconditional loop that must be terminated explicitly
|
|
by an EXIT statement. The optional label can be used by
|
|
EXIT statements of nested loops to specify which level of
|
|
nesting should be terminated.
|
|
|
|
[<<label>>]
|
|
WHILE <expr> LOOP
|
|
-- statements
|
|
END LOOP;
|
|
|
|
A conditional loop that is executed as long as the
|
|
evaluation of <expr> returns true.
|
|
|
|
[<<label>>]
|
|
FOR <name> IN [REVERSE] <expr>..<expr> LOOP
|
|
-- statements
|
|
END LOOP.
|
|
|
|
A loop that iterates over a range of integer values. The
|
|
variable <name> is automatically created as type integer
|
|
and exists only inside the loop. The two expressions
|
|
giving the lower and upper bound of the range are
|
|
evaluated only when entering the loop. The iteration step
|
|
is 1.
|
|
|
|
FOR <recname|rowname> IN <select_clause> LOOP
|
|
-- statements
|
|
END LOOP;
|
|
|
|
The record or row is assigned all the rows resulting from
|
|
the select clause and the statements executed for each.
|
|
If the loop is terminated with an EXIT statement, the
|
|
last accessed row is still accessible in the record or
|
|
rowtype.
|
|
|
|
EXIT [label] [WHEN <expr>];
|
|
|
|
If no label given, the innermost loop is terminated and
|
|
the statement following END LOOP is executed next. If
|
|
label is given, it must be the label of the current or an
|
|
upper level of nested loops or blocks. Then the named
|
|
loop or block is terminated and control continues with
|
|
the statement after the loops/blocks corresponding END.
|
|
|
|
Trigger procedures
|
|
|
|
PL/pgSQL can also be used to define trigger procedures. They
|
|
are created using CREATE FUNCTION as a function with no
|
|
arguments and a return type of opaque.
|
|
|
|
There are some PostgreSQL specific details in functions used
|
|
as trigger procedures.
|
|
|
|
First they have some special variables created above the
|
|
toplevel statement block. These are:
|
|
|
|
new (record)
|
|
The new database tuple on INSERT/UPDATE operations at
|
|
ROW level.
|
|
|
|
old (record)
|
|
The old database tuple on UPDATE/DELETE operations at
|
|
ROW level.
|
|
|
|
tg_name (type name)
|
|
The triggers name from pg_trigger.
|
|
|
|
tg_when (type text)
|
|
A string of either 'BEFORE' or 'AFTER' depending on
|
|
the triggers definition.
|
|
|
|
tg_level (type text)
|
|
A string of either 'ROW' or 'STATEMENT' depending on
|
|
the triggers definition.
|
|
|
|
tg_op (type text)
|
|
A string of 'INSERT', 'UPDATE' or 'DELETE' telling
|
|
for which operation the trigger is actually fired.
|
|
|
|
tg_relid (type oid)
|
|
The Oid of the relation for which the trigger is
|
|
actually fired.
|
|
|
|
tg_relname (type name)
|
|
The relations name for which the trigger is actually
|
|
fired.
|
|
|
|
tg_nargs (type integer)
|
|
The number of arguments given to the trigger
|
|
procedure in the CREATE TRIGGER statement.
|
|
|
|
tg_argv[] (types text)
|
|
The arguments from the CREATE TRIGGER statement. The
|
|
index counts from 0 and can be given as expression.
|
|
Invalid indices (< 0 or >= tg_nargs) result in a NULL
|
|
value.
|
|
|
|
Second, they must return either NULL, or a record/rowtype
|
|
containing exactly the structure of the table the trigger was
|
|
fired for. Triggers fired AFTER might allways return NULL
|
|
with no effect. Triggers fired BEFORE signal the trigger
|
|
manager to skip the operation for this actual row when
|
|
returning NULL. Otherwise, the returned record/rowtype
|
|
replaces the inserted/updated tuple in the operation. It is
|
|
possible to replace single values directly in new and return
|
|
that, or to build a complete new record/rowtype to return.
|
|
|
|
Exceptions
|
|
|
|
PostgreSQL doesn't have a very smart exception handling
|
|
model. Whenever the parser, planner/optimizer or executor
|
|
decide that a statement cannot be processed any longer, the
|
|
whole transaction gets aborted and the the system jumps back
|
|
into the mainloop using longjmp() to get the next query from
|
|
the client application.
|
|
|
|
It is possible to hook into the longjmp() mechanism to notice
|
|
that this happens. But currently it's impossible to tell what
|
|
really caused the abort (input/output conversion error,
|
|
floating point error, parse error) And it's possible that the
|
|
backend is in an inconsistent state at this point so
|
|
returning to the upper executor or issuing more commands
|
|
might corrupt the whole database.
|
|
|
|
Thus, the only thing PL/pgSQL currently does when it
|
|
encounters an abort during execution of a function or trigger
|
|
procedure is to write some additional DEBUG log messages
|
|
telling in which function and where (line number and type of
|
|
statement) this happened.
|
|
|
|
This might change in the future.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|