2002-01-07 03:29:15 +01:00
<!-- $Header: /cvsroot/pgsql/doc/src/sgml/plpython.sgml,v 1.8 2002/01/07 02:29:13 petere Exp $ -->
2001-05-12 19:49:32 +02:00
<chapter id="plpython">
<title>PL/Python - Python Procedural Language</title>

<indexterm zone="plpython"><primary>PL/Python</></>
<indexterm zone="plpython"><primary>Python</></>

<sect1 id="plpython-intro">
<title>Introduction</title>

<para>
The <application>PL/Python</application> procedural language allows
<productname>PostgreSQL</productname> functions to be written in
the <ulink url="http://www.python.org">Python</ulink> language.
</para>

<para>
The current version of <application>PL/Python</application>
functions as a trusted language only; access to the file system and
other local resources is disabled. Specifically,
<application>PL/Python</application> uses the Python restricted
execution environment, further restricts it to prevent the use of
the file <function>open</> call, and allows only modules from a
specific list to be imported. Presently, that list includes:
array, bisect, binascii, calendar, cmath, codecs, errno, marshal,
math, md5, mpz, operator, pcre, pickle, random, re, regex, sre,
sha, string, StringIO, struct, time, whrandom, and zlib.
</para>

<para>
In the current version, any database error encountered while
running a <application>PL/Python</application> function will result
in the immediate termination of that function by the server. It is
not possible to trap error conditions using Python <literal>try
... except</literal> constructs. For example, a syntax error in an
SQL statement passed to the <literal>plpy.execute()</literal> call
will terminate the function. This behavior may be changed in a
future release.
</para>

</sect1>

<sect1 id="plpython-install">
<title>Installation</title>

<para>
To build PL/Python, the <option>--with-python</option> option needs
to be specified when running <filename>configure</filename>. If
after building and installing you have a file called
<filename>plpython.so</filename> (possibly with a different extension),
then the build succeeded. Otherwise you should have seen a notice
like this during the build:
<screen>
*** Cannot build PL/Python because libpython is not a shared library.
*** You might have to rebuild your Python installation.  Refer to
*** the documentation for details.
</screen>
This means that you have to rebuild (part of) your Python installation
to supply this shared library.
</para>

<para>
The catch is that neither the Python distribution nor the Python
maintainers provide any direct way to do this. The closest thing we can
offer you is the information in <ulink
url="http://www.python.org/doc/FAQ.html#3.30">Python FAQ
3.30</ulink>. On some operating systems you do not actually have to
build a shared library, but then you will have to convince the
PostgreSQL build system of this. Consult the
<filename>Makefile</filename> in the
<filename>src/pl/plpython</filename> directory for details.
</para>

</sect1>

<sect1 id="plpython-using">
<title>Using PL/Python</title>

<para>
There are sample functions in
<filename>plpython_function.sql</filename>. The Python code you
write becomes the body of a Python function. For example,
<programlisting>
CREATE FUNCTION myfunc(text) RETURNS text AS
    'return args[0]'
    LANGUAGE 'plpython';
</programlisting>

gets transformed into

<programlisting>
def __plpython_procedure_myfunc_23456():
    return args[0]
</programlisting>

where 23456 is the OID of the function.
</para>
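
<para>
As a rough illustration of the transformation described above, the
following standalone Python sketch builds such a function definition
from a body string and then calls it. The helper
<function>wrap_body</function> and the way <varname>args</varname> is
supplied through a namespace dictionary are assumptions made for this
example only; they are not the actual PL/Python implementation, which
does this work inside the server.

```python
def wrap_body(name, oid, body):
    """Build the source of a Python function from a PL/Python-style body.

    Hypothetical helper for illustration; not part of PL/Python.
    """
    header = "def __plpython_procedure_%s_%d():" % (name, oid)
    # Indent every line of the body so it becomes the function's suite.
    indented = ["    " + line for line in body.splitlines()]
    return "\n".join([header] + indented) + "\n"


source = wrap_body("myfunc", 23456, "return args[0]")

# Compile the generated definition in a namespace that carries a
# global args list, then call the function by its generated name.
namespace = {"args": ["hello"]}
exec(source, namespace)
result = namespace["__plpython_procedure_myfunc_23456"]()

print(source)
print(result)
```

The sketch shows why the body can use <literal>return</literal> and
refer to <varname>args</varname> without declaring it: the body is
the suite of a function, and <varname>args</varname> is looked up as
a global.
</para>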

<para>
If you do not provide a return value, Python returns the default
<symbol>None</symbol>, which may or may not be what you want. The
language module translates Python's <symbol>None</symbol> into SQL
<literal>NULL</literal>.
</para>

<para>
<productname>PostgreSQL</> function arguments are available in the global
<varname>args</varname> list. In the <function>myfunc</function>
example, <varname>args[0]</> contains whatever was passed in as the text
argument. For <literal>myfunc2(text, integer)</literal>, <varname>args[0]</>
would contain the <type>text</type> argument and <varname>args[1]</varname>
the <type>integer</type> argument.
</para>

<para>
The global dictionary <varname>SD</varname> is available to store data
between function calls. This dictionary is private static data, visible
only to the function that owns it. The global dictionary
<varname>GD</varname> is public data, available to all Python functions
within a backend. Use with care.
</para>
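
<para>
The difference between the two dictionaries can be sketched outside
the server as follows. Here <varname>SD</varname> and
<varname>GD</varname> are ordinary dictionaries standing in for the
ones PL/Python supplies, and <function>body</function> is a
hypothetical function body; inside a real PL/Python function the
dictionaries already exist and must not be redefined.

```python
# Stand-ins for the dictionaries PL/Python provides (assumption: plain
# dicts are enough to show the idea).
SD = {}  # private to one function
GD = {}  # shared by all PL/Python functions in the backend


def body(args):
    # Count the calls of this particular function in SD ...
    SD["calls"] = SD.get("calls", 0) + 1
    # ... and keep a backend-wide total in GD.
    GD["total_calls"] = GD.get("total_calls", 0) + 1
    return SD["calls"]


first = body(["x"])
second = body(["x"])
print(first, second, GD["total_calls"])
```

Because <varname>GD</varname> is shared, a key name such as
<literal>total_calls</literal> can collide with keys used by other
functions, which is one reason to use it with care.
</para>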

<para>
Each function gets its own restricted execution object in the
Python interpreter, so that global data and function arguments from
<function>myfunc</function> are not available to
<function>myfunc2</function>. The exception is the data in the GD
dictionary, as mentioned above.
</para>

<para>
When a function is used in a trigger, the dictionary <varname>TD</varname>
contains trigger-related values. The trigger tuples are in
<literal>TD["new"]</> and/or <literal>TD["old"]</>, depending on the
trigger event. <literal>TD["event"]</> contains the event as a string
(<literal>INSERT</>, <literal>UPDATE</>, <literal>DELETE</>, or
<literal>UNKNOWN</>). <literal>TD["when"]</> contains one of
<literal>BEFORE</>, <literal>AFTER</>, or <literal>UNKNOWN</>.
<literal>TD["level"]</> contains one of <literal>ROW</>,
<literal>STATEMENT</>, or <literal>UNKNOWN</>. <literal>TD["name"]</>
contains the trigger name, and <literal>TD["relid"]</> contains the
relation ID of the table on which the trigger occurred.
If the trigger was called with arguments, they are available
in <literal>TD["args"][0]</> to
<literal>TD["args"][<replaceable>n</replaceable>-1]</>.
</para>

<para>
If <literal>TD["when"]</literal> is <literal>BEFORE</>, you may return
<literal>None</literal> or <literal>"OK"</literal> from the Python
function to indicate that the tuple is unmodified,
<literal>"SKIP"</> to abort the event, or <literal>"MODIFIED"</> to
indicate that you have modified the tuple.
</para>
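
<para>
A trigger body that uses these return values might look like the
following sketch. It runs as plain Python: <function>trigger_body</function>
takes the dictionary as a parameter only so that it can be tried
outside the server, the column <literal>name</literal> and the sample
values are hypothetical, and the assumption that
<literal>TD["new"]</literal> behaves like a dictionary keyed by
column name is made for illustration.

```python
def trigger_body(TD):
    # The special return values only matter for BEFORE triggers.
    if TD["when"] != "BEFORE":
        return None
    if TD["event"] == "DELETE":
        return "OK"

    new = TD["new"]
    if new["name"] is None:
        # Abort the event for rows without a name.
        return "SKIP"

    stripped = new["name"].strip()
    if stripped != new["name"]:
        # Change the tuple and report that it was changed.
        new["name"] = stripped
        return "MODIFIED"
    return "OK"


sample_td = {
    "event": "INSERT",
    "when": "BEFORE",
    "level": "ROW",
    "name": "trim_name",
    "relid": 12345,
    "new": {"name": "  Ada "},
    "args": [],
}
outcome = trigger_body(sample_td)
print(outcome, repr(sample_td["new"]["name"]))
```

In a real PL/Python trigger function, <varname>TD</varname> is
provided as a global and the code above would be the function body
itself rather than a nested definition.
</para>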

<para>
The PL/Python language module automatically imports a Python module
called <literal>plpy</literal>. The functions and constants in
this module are available to you in the Python code as
<literal>plpy.<replaceable>foo</replaceable></literal>. At present
<literal>plpy</literal> implements the functions
<literal>plpy.debug("msg")</literal>,
<literal>plpy.error("msg")</literal>,
<literal>plpy.fatal("msg")</literal>, and
<literal>plpy.notice("msg")</literal>. They are mostly equivalent
to calling <literal>elog(<replaceable>LEVEL</>, "msg")</literal>,
where <replaceable>LEVEL</> is DEBUG, ERROR, FATAL, or NOTICE.
<function>plpy.error</function> and <function>plpy.fatal</function>
actually raise a Python exception which, if uncaught, causes the
PL/Python module to call <literal>elog(ERROR, msg)</literal> when
the function handler returns from the Python interpreter. Long-jumping
out of the Python interpreter directly is probably not a good idea.
<literal>raise plpy.ERROR("msg")</literal> and <literal>raise
plpy.FATAL("msg")</literal> are equivalent to calling
<function>plpy.error</function> or <function>plpy.fatal</function>.
</para>
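
<para>
The following sketch shows how a function body might report a
problem through <function>plpy.error</function>. Because the real
<literal>plpy</literal> module exists only inside the server, the
example defines a small stand-in class with the same function names;
the stand-in, the exception class <literal>PlpyError</literal>, and
the function <function>body</function> are assumptions for
illustration, and the outer <literal>try</literal> block merely plays
the role of the function handler.

```python
class PlpyError(Exception):
    """Stand-in for the exception type behind plpy.ERROR (assumption)."""


class plpy:
    """Minimal stand-in for the plpy module, for use outside the server."""
    ERROR = PlpyError
    notices = []

    @staticmethod
    def error(msg):
        # Equivalent to: raise plpy.ERROR(msg)
        raise plpy.ERROR(msg)

    @staticmethod
    def notice(msg):
        plpy.notices.append(msg)


def body(args):
    if args[0] is None:
        plpy.error("argument must not be null")
    plpy.notice("converting %r" % args[0])
    return args[0].upper()


result = body(["abc"])

try:
    body([None])
    failed = False
except plpy.ERROR as exc:
    # In the server, the handler would turn this into elog(ERROR, msg).
    failed = True
    message = str(exc)

print(result, failed, message)
```
</para>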

<para>
Additionally, the <literal>plpy</literal> module provides two functions called
<function>execute</function> and <function>prepare</function>.
Calling <function>plpy.execute</function> with a query string and
an optional limit argument causes that query to be run and the
result to be returned in a result object. The result object emulates a
list or dictionary object. It can be accessed by row number and
field name. It has these additional methods:
<function>nrows()</function>, which returns the number of rows
returned by the query, and <function>status</function>, which is the
<function>SPI_exec</function> return value. The result object
can be modified. For example,

<programlisting>
rv = plpy.execute("SELECT * FROM my_table", 5)
</programlisting>
returns up to 5 rows from my_table. If my_table has a column
my_field, it would be accessed as
<programlisting>
foo = rv[i]["my_field"]
</programlisting>
The second function, <function>plpy.prepare</function>, is called
with a query string and, if you have bind variables in the query,
a list of argument types. For example,
<programlisting>
plan = plpy.prepare("SELECT last_name FROM my_users WHERE first_name = $1", [ "text" ])
</programlisting>
Here, text is the type of the variable you will be passing as $1. After
preparing a plan, you use the function <function>plpy.execute</function> to
run it:
<programlisting>
rv = plpy.execute(plan, [ "name" ], 5)
</programlisting>
The limit argument is optional in the call to
<function>plpy.execute</function>.
</para>
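
<para>
To show how a result object is typically walked, the sketch below
collects one column from every returned row using
<function>nrows()</function> and access by row number and field name.
It is meant to run outside the server, so <literal>FakeResult</literal>
and <literal>FakePlpy</literal> are hypothetical stand-ins that ignore
the SQL text and simply return canned rows; only the access pattern
in <function>collect_field</function> reflects the interface
described above.

```python
class FakeResult(list):
    """Stand-in result object: a list of row dictionaries plus nrows()."""

    def nrows(self):
        return len(self)


class FakePlpy:
    """Stand-in for plpy that serves canned rows instead of running SQL."""

    def __init__(self, rows):
        self._rows = rows

    def execute(self, query, limit=None):
        rows = self._rows if limit is None else self._rows[:limit]
        return FakeResult(dict(row) for row in rows)


def collect_field(plpy, limit):
    rv = plpy.execute("SELECT * FROM my_table", limit)
    # Access each row by number and the column by field name.
    return [rv[i]["my_field"] for i in range(rv.nrows())]


plpy = FakePlpy([{"my_field": "a"}, {"my_field": "b"}, {"my_field": "c"}])
limited = collect_field(plpy, 2)
everything = collect_field(plpy, None)
print(limited, everything)
```
</para>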

<para>
When you prepare a plan using the PL/Python module it is
automatically saved. Read the SPI documentation (<xref
linkend="spi">) for a description of what this means. The
take-home message is that if you do
<programlisting>
plan = plpy.prepare("SOME QUERY")
plan = plpy.prepare("SOME OTHER QUERY")
</programlisting>
you are leaking memory, as I know of no way to free a saved plan.
The alternative of using unsaved plans is even more painful (for
me).
</para>
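
<para>
One way to avoid preparing the same query on every call is to keep
the plan in the <varname>SD</varname> dictionary and prepare it only
when it is missing. The sketch below illustrates that idea outside
the server: <literal>FakePlpy</literal> is a hypothetical stand-in
that only counts calls to <function>prepare</function>,
<varname>SD</varname> is an ordinary dictionary, and
<function>body</function> is an example function body. This is a
suggestion under those assumptions, not a statement about how
PL/Python manages plans internally.

```python
SD = {}  # stand-in for the per-function dictionary PL/Python provides


class FakePlpy:
    """Stand-in for plpy that records how often prepare() is called."""

    def __init__(self):
        self.prepare_calls = 0

    def prepare(self, query, argtypes=None):
        self.prepare_calls += 1
        return ("plan", query, tuple(argtypes or ()))

    def execute(self, plan, params=None, limit=None):
        return []  # no database here, so there are no rows


plpy = FakePlpy()


def body(args):
    # Prepare the plan once and reuse the saved plan on later calls.
    if "plan" not in SD:
        SD["plan"] = plpy.prepare(
            "SELECT last_name FROM my_users WHERE first_name = $1",
            ["text"])
    return plpy.execute(SD["plan"], [args[0]], 5)


body(["Ada"])
body(["Grace"])
print(plpy.prepare_calls)
```

With this pattern only one plan per function is ever saved in a
backend, regardless of how often the function is called.
</para>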

</sect1>

</chapter>