postgresql/doc/src/sgml/plpython.sgml

249 lines
9.4 KiB
Plaintext
Raw Normal View History

2003-04-07 03:29:26 +02:00
<!-- $Header: /cvsroot/pgsql/doc/src/sgml/plpython.sgml,v 1.16 2003/04/07 01:29:25 petere Exp $ -->
<chapter id="plpython">
<title>PL/Python - Python Procedural Language</title>
2001-11-12 20:19:39 +01:00
<indexterm zone="plpython"><primary>PL/Python</></>
<indexterm zone="plpython"><primary>Python</></>
<para>
The <application>PL/Python</application> procedural language allows
<productname>PostgreSQL</productname> functions to be written in the
<ulink url="http://www.python.org">Python</ulink> language.
</para>
2002-01-07 03:29:15 +01:00
<para>
To install PL/Python in a particular database, use
<literal>createlang plpython <replaceable>dbname</></literal>.
</para>
2002-01-07 03:29:15 +01:00
2003-04-07 03:29:26 +02:00
<tip>
<para>
If a language is installed into <literal>template1</>, all subsequently
created databases will have the language installed automatically.
</para>
</tip>
<note>
<para>
Users of source packages must specially enable the build of
2003-04-07 03:29:26 +02:00
PL/Python during the installation process. (Refer to the
installation instructions for more information.) Users of binary
packages might find PL/Python in a separate subpackage.
</para>
</note>
2002-01-07 03:29:15 +01:00
<sect1 id="plpython-funcs">
<title>PL/Python Functions</title>
<para>
2003-04-07 03:29:26 +02:00
The Python code you write gets transformed into a Python function. E.g.,
<programlisting>
CREATE FUNCTION myfunc(text) RETURNS text
AS 'return args[0]'
2003-04-07 03:29:26 +02:00
LANGUAGE plpython;
</programlisting>
gets transformed into
<programlisting>
def __plpython_procedure_myfunc_23456():
return args[0]
</programlisting>
2002-03-22 20:20:45 +01:00
where 23456 is the OID of the function.
</para>
<para>
If you do not provide a return value, Python returns the default
2003-04-07 03:29:26 +02:00
<symbol>None</symbol>. The
language module translates Python's <symbol>None</symbol> into the
SQL null value.
</para>
<para>
The <productname>PostgreSQL</> function parameters are available in
the global <varname>args</varname> list. In the
<function>myfunc</function> example, <varname>args[0]</> contains
whatever was passed in as the text argument. For
<literal>myfunc2(text, integer)</literal>, <varname>args[0]</>
2003-04-07 03:29:26 +02:00
would contain the <type>text</type> argument and
<varname>args[1]</varname> the <type>integer</type> argument.
</para>
<para>
The global dictionary <varname>SD</varname> is available to store
data between function calls. This variable is private static data.
The global dictionary <varname>GD</varname> is public data,
available to all Python functions within a session. Use with care.
</para>
<para>
Each function gets its own restricted execution object in the
Python interpreter, so that global data and function arguments from
<function>myfunc</function> are not available to
<function>myfunc2</function>. The exception is the data in the
<varname>GD</varname> dictionary, as mentioned above.
</para>
</sect1>
<sect1 id="plpython-trigger">
<title>Trigger Functions</title>
<para>
When a function is used in a trigger, the dictionary
<literal>TD</literal> contains trigger-related values. The trigger
rows are in <literal>TD["new"]</> and/or <literal>TD["old"]</>
depending on the trigger event. <literal>TD["event"]</> contains
the event as a string (<literal>INSERT</>, <literal>UPDATE</>,
<literal>DELETE</>, or <literal>UNKNOWN</>).
<literal>TD["when"]</> contains one of <literal>BEFORE</>,
<literal>AFTER</>, and <literal>UNKNOWN</>.
<literal>TD["level"]</> contains one of <literal>ROW</>,
<literal>STATEMENT</>, and <literal>UNKNOWN</>.
<literal>TD["name"]</> contains the trigger name, and
2003-04-07 03:29:26 +02:00
<literal>TD["relid"]</> contains the OID of the table on
which the trigger occurred. If the trigger was called with
arguments they are available in <literal>TD["args"][0]</> to
<literal>TD["args"][(n-1)]</>.
</para>
<para>
2003-04-07 03:29:26 +02:00
If <literal>TD["when"]</literal> is <literal>BEFORE</>, you may
return <literal>None</literal> or <literal>"OK"</literal> from the
Python function to indicate the row is unmodified,
<literal>"SKIP"</> to abort the event, or <literal>"MODIFY"</> to
indicate you've modified the row.
</para>
</sect1>
<sect1 id="plpython-database">
<title>Database Access</title>
<para>
The PL/Python language module automatically imports a Python module
called <literal>plpy</literal>. The functions and constants in
this module are available to you in the Python code as
<literal>plpy.<replaceable>foo</replaceable></literal>. At present
<literal>plpy</literal> implements the functions
<literal>plpy.debug("msg")</literal>,
<literal>plpy.log("msg")</literal>,
<literal>plpy.info("msg")</literal>,
<literal>plpy.notice("msg")</literal>,
<literal>plpy.warning("msg")</literal>,
<literal>plpy.error("msg")</literal>, and
<literal>plpy.fatal("msg")</literal>. They are mostly equivalent
to calling <literal>elog(<replaceable>LEVEL</>, "msg")</literal>
from C code. <function>plpy.error</function> and
<function>plpy.fatal</function> actually raise a Python exception
which, if uncaught, causes the PL/Python module to call
<literal>elog(ERROR, msg)</literal> when the function handler
returns from the Python interpreter. Long-jumping out of the
Python interpreter is probably not good. <literal>raise
plpy.ERROR("msg")</literal> and <literal>raise
plpy.FATAL("msg")</literal> are equivalent to calling
<function>plpy.error</function> and
<function>plpy.fatal</function>, respectively.
</para>
<para>
Additionally, the <literal>plpy</literal> module provides two
functions called <function>execute</function> and
<function>prepare</function>. Calling
<function>plpy.execute</function> with a query string and an
optional limit argument causes that query to be run and the result
to be returned in a result object. The result object emulates a
list or dictionary object. The result object can be accessed by
2003-04-07 03:29:26 +02:00
row number and column name. It has these additional methods:
<function>nrows</function> which returns the number of rows
returned by the query, and <function>status</function> which is the
2003-04-07 03:29:26 +02:00
<function>SPI_exec()</function> return value. The result object
can be modified.
</para>
<para>
For example,
<programlisting>
rv = plpy.execute("SELECT * FROM my_table", 5)
</programlisting>
returns up to 5 rows from <literal>my_table</literal>. If
<literal>my_table</literal> has a column
2003-04-07 03:29:26 +02:00
<literal>my_column</literal>, it would be accessed as
<programlisting>
2003-04-07 03:29:26 +02:00
foo = rv[i]["my_column"]
</programlisting>
</para>
<para>
2003-04-07 03:29:26 +02:00
The second function, <function>plpy.prepare</function>, prepares the
execution plan for a query. It is called with a query string and a
list of parameter types, if you have parameter references in the
query. For example:
<programlisting>
plan = plpy.prepare("SELECT last_name FROM my_users WHERE first_name = $1", [ "text" ])
</programlisting>
<literal>text</literal> is the type of the variable you will be
2003-04-07 03:29:26 +02:00
passing for <literal>$1</literal>. After preparing a statement, you
use the function <function>plpy.execute</function> to run it:
<programlisting>
rv = plpy.execute(plan, [ "name" ], 5)
</programlisting>
2003-04-07 03:29:26 +02:00
The third argument is the limit and is optional.
</para>
<para>
In the current version, any database error encountered while
running a <application>PL/Python</application> function will result
in the immediate termination of that function by the server; it is
not possible to trap error conditions using Python <literal>try
... catch</literal> constructs. For example, a syntax error in an
2003-04-07 03:29:26 +02:00
SQL statement passed to the <literal>plpy.execute</literal> call
will terminate the function. This behavior may be changed in a
future release.
</para>
<para>
When you prepare a plan using the PL/Python module it is
automatically saved. Read the SPI documentation (<xref
linkend="spi">) for a description of what this means.
In order to make effective use of this across function calls
one needs to use one of the persistent storage dictionaries
2003-04-07 03:29:26 +02:00
<literal>SD</literal> or <literal>GD</literal> (see
<xref linkend="plpython-funcs">). For example:
<programlisting>
2003-04-07 03:29:26 +02:00
CREATE FUNCTION usesavedplan() RETURNS trigger AS '
if SD.has_key("plan"):
plan = SD["plan"]
else:
plan = plpy.prepare("SELECT 1")
SD["plan"] = plan
# rest of function
' LANGUAGE plpython;
</programlisting>
</para>
</sect1>
<sect1 id="plpython-trusted">
<title>Restricted Environment</title>
<para>
The current version of <application>PL/Python</application>
functions as a trusted language only; access to the file system and
other local resources is disabled. Specifically,
<application>PL/Python</application> uses the Python restricted
execution environment, further restricts it to prevent the use of
the file <function>open</> call, and allows only modules from a
specific list to be imported. Presently, that list includes:
2002-09-22 20:47:24 +02:00
<literal>array</>, <literal>bisect</>, <literal>binascii</>,
<literal>calendar</>, <literal>cmath</>, <literal>codecs</>,
<literal>errno</>, <literal>marshal</>, <literal>math</>, <literal>md5</>,
<literal>mpz</>, <literal>operator</>, <literal>pcre</>,
<literal>pickle</>, <literal>random</>, <literal>re</>, <literal>regex</>,
<literal>sre</>, <literal>sha</>, <literal>string</>, <literal>StringIO</>,
<literal>struct</>, <literal>time</>, <literal>whrandom</>, and
<literal>zlib</>.
</para>
</sect1>
</chapter>