PL/Python - Python Procedural LanguagePL/Python>>
Python>>
Introduction
The PL/Python procedural language allows
PostgreSQL functions to be written in
the Python language.
The current version of PL/Python
functions as a trusted language only; access to the file system and
other local resources is disabled. Specifically,
PL/Python uses the Python restricted
execution environment, further restricts it to prevent the use of
the file open> call, and allows only modules from a
specific list to be imported. Presently, that list includes:
array, bisect, binascii, calendar, cmath, codecs, errno, marshal,
math, md5, mpz, operator, pcre, pickle, random, re, regex, sre,
sha, string, StringIO, struct, time, whrandom, and zlib.
In the current version, any database error encountered while
running a PL/Python function will result
in the immediate termination of that function by the server. It is
not possible to trap error conditions using Python try
... catch constructs. For example, a syntax error in an
SQL statement passed to the plpy.execute() call
will terminate the function. This behavior may be changed in a
future release.
Installation
To build PL/Python, the option needs
to be specified when running configure. If
after building and installing you have a file called
plpython.so (possibly a different extension),
then everything went well. Otherwise you should have seen a notice
like this flying by:
*** Cannot build PL/Python because libpython is not a shared library.
*** You might have to rebuild your Python installation. Refer to
*** the documentation for details.
That means you have to rebuild (part of) your Python installation
to supply this shared library.
The catch is that the Python distribution or the Python maintainers
do not provide any direct way to do this. The closest thing we can
offer you is the information in Python FAQ
3.30. On some operating systems you don't really have to
build a shared library, but then you will have to convince the
PostgreSQL build system of this. Consult the
Makefile in the
src/pl/plpython directory for details.
Using PL/Python
There are sample functions in
plpython_function.sql. The Python code you
write gets transformed into a function. E.g.,
CREATE FUNCTION myfunc(text) RETURNS text AS
'return args[0]'
LANGUAGE 'plpython';
gets transformed into
def __plpython_procedure_myfunc_23456():
return args[0]
where 23456 is the Oid of the function.
If you do not provide a return value, Python returns the default
None which may or may not be what you want. The
language module translates Python's None into SQL NULL.
PostgreSQL> function variables are available in the global
args list. In the myfunc
example, args[0]> contains whatever was passed in as the text
argument. For myfunc2(text, integer), args[0]>
would contain the text variable and args[1] the integer variable.
The global dictionary SD is available to store data between
function calls. This variable is private static data. The global
dictionary GD is public data, available to all python functions
within a backend. Use with care.
Each function gets its own restricted execution object in the
Python interpreter, so that global data and function arguments from
myfunc are not available to
myfunc2. The exception is the data in the GD
dictionary, as mentioned above.
When a function is used in a trigger, the dictionary TD contains
transaction related values. The trigger tuples are in TD["new"]>
and/or TD["old"]> depending on the trigger event. TD["event"]>
contains the event as a string (INSERT>, UPDATE>, DELETE>, or
UNKNOWN>). TD["when"] contains one of (BEFORE>, AFTER>, or
UNKNOWN>). TD["level"]> contains one of ROW>, STATEMENT>, or
UNKNOWN>. TD["name"]> contains the trigger name, and TD["relid"]>
contains the relation id of the table on which the trigger occurred.
If the trigger was called with arguments they are available
in TD["args"][0]> to TD["args"][(n -1)]>.
If the trigger when is BEFORE>, you may return None or "OK"
from the Python function to indicate the tuple is unmodified,
"SKIP"> to abort the event, or "MODIFIED"> to indicate you've
modified the tuple.
The PL/Python language module automatically imports a Python module
called plpy. The functions and constants in
this module are available to you in the Python code as
plpy.foo. At present
plpy implements the functions
plpy.error("msg"),
plpy.fatal("msg"),
plpy.debug("msg"), and
plpy.notice("msg"). They are mostly equivalent
to calling elog(LEVEL>, "msg"),
where LEVEL> is DEBUG, ERROR, FATAL or NOTICE.
plpy.error and plpy.fatal
actually raise a Python exception which, if uncaught, causes the
PL/Python module to call elog(ERROR, msg) when
the function handler returns from the Python interpreter. Long
jumping out of the Python interpreter is probably not good.
raise plpy.ERROR("msg") and raise
plpy.FATAL("msg") are equivalent to calling
plpy.error or plpy.fatal.
Additionally, the plpy module provides two functions called
execute and prepare.
Calling plpy.execute with a query string, and
an optional limit argument, causes that query to be run, and the
result returned in a result object. The result object emulates a
list or dictionary object. The result object can be accessed by
row number, and field name. It has these additional methods:
nrows() which returns the number of rows
returned by the query, and status which is the
SPI_exec return variable. The result object
can be modified.
rv = plpy.execute("SELECT * FROM my_table", 5)
returns up to 5 rows from my_table. Ff my_table has a column
my_field it would be accessed as
foo = rv[i]["my_field"]
The second function plpy.prepare is called
with a query string, and a list of argument types if you have bind
variables in the query.
plan = plpy.prepare("SELECT last_name FROM my_users WHERE first_name = $1", [ "text" ])
text is the type of the variable you will be passing as $1. After
preparing you use the function plpy.execute to
run it.
rv = plpy.execute(plan, [ "name" ], 5)
The limit argument is optional in the call to
plpy.execute.
When you prepare a plan using the PL/Python module it is
automatically saved. Read the SPI documentation () for a description of what this means. The take
home message is if you do
plan = plpy.prepare("SOME QUERY")
plan = plpy.prepare("SOME OTHER QUERY")
you are leaking memory, as I know of no way to free a saved plan.
The alternative of using unsaved plans it even more painful (for
me).