postgresql/doc/src/sgml/plpython.sgml

<!-- $Header: /cvsroot/pgsql/doc/src/sgml/plpython.sgml,v 1.16 2003/04/07 01:29:25 petere Exp $ -->

<chapter id="plpython">
 <title>PL/Python - Python Procedural Language</title>

 <indexterm zone="plpython"><primary>PL/Python</></>
 <indexterm zone="plpython"><primary>Python</></>

 <para>
  The <application>PL/Python</application> procedural language allows
  <productname>PostgreSQL</productname> functions to be written in the
  <ulink url="http://www.python.org">Python</ulink> language.
 </para>

 <para>
  To install PL/Python in a particular database, use
  <literal>createlang plpython <replaceable>dbname</></literal>.
 </para>

  <tip>
   <para>
    If a language is installed into <literal>template1</>, all subsequently
    created databases will have the language installed automatically.
   </para>
  </tip>

 <note>
  <para>
   Users of source packages must specially enable the build of
   PL/Python during the installation process.  (Refer to the
   installation instructions for more information.)  Users of binary
   packages might find PL/Python in a separate subpackage.
  </para>
 </note>

 <sect1 id="plpython-funcs">
  <title>PL/Python Functions</title>

  <para>
   The Python code you write gets transformed into a Python function.  E.g.,
<programlisting>
CREATE FUNCTION myfunc(text) RETURNS text
    AS 'return args[0]'
    LANGUAGE plpython;
</programlisting>

   gets transformed into

<programlisting>
def __plpython_procedure_myfunc_23456():
        return args[0]
</programlisting>

   where 23456 is the OID of the function.
  </para>

  <para>
   If you do not provide a return value, Python returns the default
   <symbol>None</symbol>. The
   language module translates Python's <symbol>None</symbol> into the
   SQL null value.
  </para>

  <para>
   The <productname>PostgreSQL</> function parameters are available in
   the global <varname>args</varname> list.  In the
   <function>myfunc</function> example, <varname>args[0]</> contains
   whatever was passed in as the text argument.  For
   <literal>myfunc2(text, integer)</literal>, <varname>args[0]</>
   would contain the <type>text</type> argument and
   <varname>args[1]</varname> the <type>integer</type> argument.
  </para>

  <para>
   The global dictionary <varname>SD</varname> is available to store
   data between function calls.  This variable is private static data.
   The global dictionary <varname>GD</varname> is public data,
   available to all Python functions within a session.  Use with care.
  </para>

  <para>
   Each function gets its own restricted execution object in the
   Python interpreter, so that global data and function arguments from
   <function>myfunc</function> are not available to
   <function>myfunc2</function>.  The exception is the data in the
   <varname>GD</varname> dictionary, as mentioned above.
  </para>
 </sect1>

 <sect1 id="plpython-trigger">
  <title>Trigger Functions</title>

  <para>
   When a function is used in a trigger, the dictionary
   <literal>TD</literal> contains trigger-related values.  The trigger
   rows are in <literal>TD["new"]</> and/or <literal>TD["old"]</>
   depending on the trigger event.  <literal>TD["event"]</> contains
   the event as a string (<literal>INSERT</>, <literal>UPDATE</>,
   <literal>DELETE</>, or <literal>UNKNOWN</>).
   <literal>TD["when"]</> contains one of <literal>BEFORE</>,
   <literal>AFTER</>, and <literal>UNKNOWN</>.
   <literal>TD["level"]</> contains one of <literal>ROW</>,
   <literal>STATEMENT</>, and <literal>UNKNOWN</>.
   <literal>TD["name"]</> contains the trigger name, and
   <literal>TD["relid"]</> contains the OID of the table on
   which the trigger occurred.  If the trigger was called with
   arguments they are available in <literal>TD["args"][0]</> to
   <literal>TD["args"][(n-1)]</>.
  </para>

  <para>
   If <literal>TD["when"]</literal> is <literal>BEFORE</>, you may
   return <literal>None</literal> or <literal>"OK"</literal> from the
   Python function to indicate the row is unmodified,
   <literal>"SKIP"</> to abort the event, or <literal>"MODIFY"</> to
   indicate you've modified the row.
  </para>
 </sect1>

 <sect1 id="plpython-database">
  <title>Database Access</title>

  <para>
   The PL/Python language module automatically imports a Python module
   called <literal>plpy</literal>.  The functions and constants in
   this module are available to you in the Python code as
   <literal>plpy.<replaceable>foo</replaceable></literal>.  At present
   <literal>plpy</literal> implements the functions
   <literal>plpy.debug("msg")</literal>,
   <literal>plpy.log("msg")</literal>,
   <literal>plpy.info("msg")</literal>,
   <literal>plpy.notice("msg")</literal>,
   <literal>plpy.warning("msg")</literal>,
   <literal>plpy.error("msg")</literal>, and
   <literal>plpy.fatal("msg")</literal>.  They are mostly equivalent
   to calling <literal>elog(<replaceable>LEVEL</>, "msg")</literal>
   from C code.  <function>plpy.error</function> and
   <function>plpy.fatal</function> actually raise a Python exception
   which, if uncaught, causes the PL/Python module to call
   <literal>elog(ERROR, msg)</literal> when the function handler
   returns from the Python interpreter.  Long-jumping out of the
   Python interpreter is probably not good.  <literal>raise
   plpy.ERROR("msg")</literal> and <literal>raise
   plpy.FATAL("msg")</literal> are equivalent to calling
   <function>plpy.error</function> and
   <function>plpy.fatal</function>, respectively.
  </para>

  <para>
   Additionally, the <literal>plpy</literal> module provides two
   functions called <function>execute</function> and
   <function>prepare</function>.  Calling
   <function>plpy.execute</function> with a query string and an
   optional limit argument causes that query to be run and the result
   to be returned in a result object.  The result object emulates a
   list or dictionary object.  The result object can be accessed by
   row number and column name.  It has these additional methods:
   <function>nrows</function> which returns the number of rows
   returned by the query, and <function>status</function> which is the
   <function>SPI_exec()</function> return value.  The result object
   can be modified.
  </para>

  <para>
   For example,
<programlisting>
rv = plpy.execute("SELECT * FROM my_table", 5)
</programlisting>
   returns up to 5 rows from <literal>my_table</literal>.  If
   <literal>my_table</literal> has a column
   <literal>my_column</literal>, it would be accessed as
<programlisting>
foo = rv[i]["my_column"]
</programlisting>
  </para>

  <para>
   The second function, <function>plpy.prepare</function>, prepares the
   execution plan for a query.  It is called with a query string and a
   list of parameter types, if you have parameter references in the
   query.  For example:
<programlisting>
plan = plpy.prepare("SELECT last_name FROM my_users WHERE first_name = $1", [ "text" ])
</programlisting>
   <literal>text</literal> is the type of the variable you will be
   passing for <literal>$1</literal>.  After preparing a statement, you
   use the function <function>plpy.execute</function> to run it:
<programlisting>
rv = plpy.execute(plan, [ "name" ], 5)
</programlisting>
   The third argument is the limit and is optional.
  </para>

  <para>
   In the current version, any database error encountered while
   running a <application>PL/Python</application> function will result
   in the immediate termination of that function by the server; it is
   not possible to trap error conditions using Python <literal>try
   ... catch</literal> constructs.  For example, a syntax error in an
   SQL statement passed to the <literal>plpy.execute</literal> call
   will terminate the function.  This behavior may be changed in a
   future release.
  </para>

  <para>
   When you prepare a plan using the PL/Python module it is
   automatically saved.  Read the SPI documentation (<xref
   linkend="spi">) for a description of what this means.
   In order to make effective use of this across function calls
   one needs to use one of the persistent storage dictionaries
   <literal>SD</literal> or <literal>GD</literal> (see
   <xref linkend="plpython-funcs">). For example:
<programlisting>
CREATE FUNCTION usesavedplan() RETURNS trigger AS '
    if SD.has_key("plan"):
        plan = SD["plan"]
    else:
        plan = plpy.prepare("SELECT 1")
        SD["plan"] = plan
    # rest of function
' LANGUAGE plpython;
</programlisting>
  </para>
 </sect1>

 <sect1 id="plpython-trusted">
  <title>Restricted Environment</title>

  <para>
   The current version of <application>PL/Python</application>
   functions as a trusted language only; access to the file system and
   other local resources is disabled.  Specifically,
   <application>PL/Python</application> uses the Python restricted
   execution environment, further restricts it to prevent the use of
   the file <function>open</> call, and allows only modules from a
   specific list to be imported.  Presently, that list includes:
   <literal>array</>, <literal>bisect</>, <literal>binascii</>,
   <literal>calendar</>, <literal>cmath</>, <literal>codecs</>,
   <literal>errno</>, <literal>marshal</>, <literal>math</>, <literal>md5</>,
   <literal>mpz</>, <literal>operator</>, <literal>pcre</>,
   <literal>pickle</>, <literal>random</>, <literal>re</>, <literal>regex</>,
   <literal>sre</>, <literal>sha</>, <literal>string</>, <literal>StringIO</>,
   <literal>struct</>, <literal>time</>, <literal>whrandom</>, and
   <literal>zlib</>.
  </para>
 </sect1>

</chapter>