postgresql/doc/src/sgml/plpython.sgml

<!-- $Header: /cvsroot/pgsql/doc/src/sgml/plpython.sgml,v 1.8 2002/01/07 02:29:13 petere Exp $ -->

<chapter id="plpython">
 <title>PL/Python - Python Procedural Language</title>

 <indexterm zone="plpython"><primary>PL/Python</></>
 <indexterm zone="plpython"><primary>Python</></>

 <sect1 id="plpython-intro">
  <title>Introduction</title>

  <para>
   The <application>PL/Python</application> procedural language allows
   <productname>PostgreSQL</productname> functions to be written in
   the <ulink url="http://www.python.org">Python</ulink> language.
  </para>

  <para>
   The current version of <application>PL/Python</application>
   functions as a trusted language only; access to the file system and
   other local resources is disabled.  Specifically,
   <application>PL/Python</application> uses the Python restricted
   execution environment, further restricts it to prevent the use of
   the file <function>open</> call, and allows only modules from a
   specific list to be imported.  Presently, that list includes:
   array, bisect, binascii, calendar, cmath, codecs, errno, marshal,
   math, md5, mpz, operator, pcre, pickle, random, re, regex, sre,
   sha, string, StringIO, struct, time, whrandom, and zlib.
  </para>

  <para>
   In the current version, any database error encountered while
   running a <application>PL/Python</application> function will result
   in the immediate termination of that function by the server.  It is
   not possible to trap error conditions using Python <literal>try
   ... except</literal> constructs.  For example, a syntax error in an
   SQL statement passed to the <literal>plpy.execute()</literal> call
   will terminate the function.  This behavior may be changed in a
   future release.
  </para>
 </sect1>

 <sect1 id="plpython-install">
  <title>Installation</title>

  <para>
   To build PL/Python, the <option>--with-python</option> option needs
   to be specified when running <filename>configure</filename>.  If
   after building and installing you have a file called
   <filename>plpython.so</filename> (possibly a different extension),
   then everything went well.  Otherwise you should have seen a notice
   like this flying by:
<screen>
*** Cannot build PL/Python because libpython is not a shared library.
*** You might have to rebuild your Python installation.  Refer to
*** the documentation for details.
</screen>
   That means you have to rebuild (part of) your Python installation
   to supply this shared library.
  </para>

  <para>
   The catch is that the Python distribution or the Python maintainers
   do not provide any direct way to do this.  The closest thing we can
   offer you is the information in <ulink
   url="http://www.python.org/doc/FAQ.html#3.30">Python FAQ
   3.30</ulink>.  On some operating systems you don't really have to
   build a shared library, but then you will have to convince the
   PostgreSQL build system of this.  Consult the
   <filename>Makefile</filename> in the
   <filename>src/pl/plpython</filename> directory for details.
  </para>
 </sect1>

 <sect1 id="plpython-using">
  <title>Using PL/Python</title>

  <para>
   There are sample functions in
   <filename>plpython_function.sql</filename>.  The Python code you
   write gets transformed into a function.  E.g.,
<programlisting>
CREATE FUNCTION myfunc(text) RETURNS text AS
'return args[0]'
LANGUAGE 'plpython';
</programlisting>

   gets transformed into

<programlisting>
def __plpython_procedure_myfunc_23456():
	return args[0]
</programlisting>

   where 23456 is the OID of the function.
  </para>
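
  <para>
   The wrapping step can be imitated in ordinary Python.  The sketch
   below is not the handler's actual code: <function>wrap_body</function>
   and <function>call_wrapped</function> are hypothetical helpers, and
   placing <varname>args</varname> in the function's global namespace
   is an assumption made for illustration.
```python
# Illustration only: mimic how a PL/Python body becomes a Python function.
# wrap_body and call_wrapped are hypothetical helpers, not part of PL/Python.

def wrap_body(name, oid, body):
    """Build the source of a def statement around the user-supplied body."""
    lines = ["def __plpython_procedure_%s_%d():" % (name, oid)]
    for line in body.splitlines():
        lines.append("\t" + line)   # indent each body line by one tab
    return "\n".join(lines) + "\n"

def call_wrapped(name, oid, body, call_args):
    """Compile the wrapped body and call it with args visible as a global."""
    namespace = {"args": list(call_args)}
    exec(wrap_body(name, oid, body), namespace)
    return namespace["__plpython_procedure_%s_%d" % (name, oid)]()

source = wrap_body("myfunc", 23456, "return args[0]")
result = call_wrapped("myfunc", 23456, "return args[0]", ["hello"])
```
   Here <varname>source</varname> holds the generated
   <literal>def</literal> statement and <varname>result</varname> holds
   the value returned by the wrapped body.  A body that finishes
   without a <literal>return</literal> statement yields
   <symbol>None</symbol> in the same way.
  </para>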

  <para>
   If you do not provide a return value, Python returns the default
   <symbol>None</symbol>, which may or may not be what you want.  The
   language module translates Python's <symbol>None</symbol> into SQL NULL.
  </para>

  <para>
   <productname>PostgreSQL</> function variables are available in the global
   <varname>args</varname> list.  In the <function>myfunc</function>
   example, <varname>args[0]</> contains whatever was passed in as the text
   argument.  For <literal>myfunc2(text, integer)</literal>, <varname>args[0]</>
   would contain the <type>text</type> variable and <varname>args[1]</varname> the <type>integer</type> variable.
  </para>

  <para>
   The global dictionary <varname>SD</varname> is available to store data
   between function calls.  It is private static data, visible only to the
   function that owns it.  The global dictionary <varname>GD</varname> is
   public data, available to all Python functions within a backend.  Use it
   with care.
  </para>

  <para>
   Each function gets its own restricted execution object in the
   Python interpreter, so that global data and function arguments from
   <function>myfunc</function> are not available to
   <function>myfunc2</function>.  The exception is the data in the GD
   dictionary, as mentioned above.
  </para>
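
  <para>
   The difference between the two dictionaries can be sketched outside
   the server.  The snippet below uses plain dictionaries as stand-ins
   for SD and GD; the function names, the explicit
   <varname>SD</varname> parameter, and the keys
   <literal>"calls"</literal> and <literal>"total"</literal> are
   invented for the example.
```python
# Stand-ins for the dictionaries PL/Python supplies: one SD per function,
# and a single GD shared by every function in the backend.
GD = {}
SD_myfunc = {}
SD_myfunc2 = {}

def myfunc_body(SD):
    SD["calls"] = SD.get("calls", 0) + 1      # private to this function
    GD["total"] = GD.get("total", 0) + 1      # visible to all functions
    return SD["calls"]

def myfunc2_body(SD):
    SD["calls"] = SD.get("calls", 0) + 1      # a separate counter
    GD["total"] = GD.get("total", 0) + 1      # the same shared counter
    return SD["calls"]

myfunc_body(SD_myfunc)
myfunc_body(SD_myfunc)
myfunc2_body(SD_myfunc2)
```
   After these three calls each private dictionary holds only its own
   function's count, while the shared dictionary has seen all of them.
   In a real function, only the statements inside the bodies would be
   written, with SD and GD referenced as globals.
  </para>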

  <para>
   When a function is used in a trigger, the dictionary <varname>TD</varname>
   contains trigger-related values.  The trigger tuples are in <literal>TD["new"]</>
   and/or <literal>TD["old"]</>, depending on the trigger event.  <literal>TD["event"]</>
   contains the event as a string (<literal>INSERT</>, <literal>UPDATE</>, <literal>DELETE</>, or
   <literal>UNKNOWN</>).  <literal>TD["when"]</> contains one of <literal>BEFORE</>, <literal>AFTER</>, or
   <literal>UNKNOWN</>.  <literal>TD["level"]</> contains one of <literal>ROW</>, <literal>STATEMENT</>, or
   <literal>UNKNOWN</>.  <literal>TD["name"]</> contains the trigger name, and <literal>TD["relid"]</>
   contains the relation ID of the table on which the trigger occurred.
   If the trigger was called with arguments, they are available
   in <literal>TD["args"][0]</> to <literal>TD["args"][(n - 1)]</>.
  </para>

  <para>
   If the trigger <quote>when</quote> is <literal>BEFORE</>, you may return <literal>None</literal> or <literal>"OK"</literal>
   from the Python function to indicate the tuple is unmodified,
   <literal>"SKIP"</> to abort the event, or <literal>"MODIFIED"</> to indicate you've
   modified the tuple.
  </para>
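
  <para>
   A <literal>BEFORE</literal> row trigger might use these values as
   sketched below.  The snippet is written as a plain function taking
   <varname>TD</varname> as a parameter so that it can be run outside
   the server; in PL/Python, TD would be a global.  The column name
   <literal>name</literal> is invented, and the row in
   <literal>TD["new"]</literal> is assumed to behave like a dictionary
   keyed by column name.
```python
# Body of a hypothetical BEFORE ROW trigger, written as a plain function.
# In PL/Python, TD is supplied by the handler; here it is passed in.

def before_row_trigger(TD):
    if TD["when"] != "BEFORE" or TD["level"] != "ROW":
        return None                      # same meaning as "OK"
    if TD["event"] == "DELETE":
        return "OK"                      # leave the deletion untouched
    new = TD["new"]
    if new["name"] is None:
        return "SKIP"                    # abort the event for this row
    trimmed = new["name"].strip()
    if trimmed != new["name"]:
        new["name"] = trimmed
        return "MODIFIED"                # tell the handler the tuple changed
    return "OK"

# Hand-built stand-in for the dictionary the handler would provide.
td = {"event": "INSERT", "when": "BEFORE", "level": "ROW",
      "name": "trim_name", "new": {"name": "  alice "}}
outcome = before_row_trigger(td)
```
   The function changes <literal>TD["new"]</literal> in place and
   reports the change through its return value; returning
   <literal>"OK"</literal> after editing the row would tell the handler
   that the tuple is unmodified.
  </para>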

  <para>
   The PL/Python language module automatically imports a Python module
   called <literal>plpy</literal>.  The functions and constants in
   this module are available to you in the Python code as
   <literal>plpy.<replaceable>foo</replaceable></literal>.  At present
   <literal>plpy</literal> implements the functions
   <literal>plpy.error("msg")</literal>,
   <literal>plpy.fatal("msg")</literal>,
   <literal>plpy.debug("msg")</literal>, and
   <literal>plpy.notice("msg")</literal>.  They are mostly equivalent
   to calling <literal>elog(<replaceable>LEVEL</>, "msg")</literal>,
   where <replaceable>LEVEL</> is DEBUG, ERROR, FATAL or NOTICE.
   <function>plpy.error</function> and <function>plpy.fatal</function>
   actually raise a Python exception which, if uncaught, causes the
   PL/Python module to call <literal>elog(ERROR, msg)</literal> when
   the function handler returns from the Python interpreter.  Long
   jumping out of the Python interpreter is probably not good.
   <literal>raise plpy.ERROR("msg")</literal> and <literal>raise
   plpy.FATAL("msg")</literal> are equivalent to calling
   <function>plpy.error</function> or <function>plpy.fatal</function>.
  </para>
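
  <para>
   The following sketch shows how a function body might report
   progress and reject bad input.  Because <literal>plpy</literal>
   exists only inside the server, the snippet begins with a minimal
   stand-in that imitates <function>plpy.notice</function>,
   <function>plpy.error</function>, and <literal>plpy.ERROR</literal>;
   the stand-in classes and the function
   <function>check_quantity</function> are invented for the example.
```python
# Minimal stand-in for the plpy module so the example runs outside the
# server. Only the pieces described above are imitated.
class FakeError(Exception):
    pass

class FakePlpy:
    ERROR = FakeError

    def __init__(self):
        self.messages = []

    def notice(self, msg):
        self.messages.append(("NOTICE", msg))

    def error(self, msg):
        # Like plpy.error, this raises an exception rather than returning.
        raise self.ERROR(msg)

plpy = FakePlpy()

def check_quantity(args):
    # Body of a hypothetical function check_quantity(integer).
    quantity = args[0]
    plpy.notice("checking quantity %d" % quantity)
    if quantity == 0:
        plpy.error("quantity must not be zero")
    return quantity

accepted = check_quantity([3])
```
   Calling <literal>check_quantity([0])</literal> raises the stand-in
   exception.  In the server, an uncaught exception of this kind is
   what leads the handler to call <literal>elog(ERROR, msg)</literal>.
  </para>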

  <para>
   Additionally, the <literal>plpy</literal> module provides two functions called
   <function>execute</function> and <function>prepare</function>.
   Calling <function>plpy.execute</function> with a query string and
   an optional limit argument causes that query to be run and the
   result to be returned in a result object.  The result object emulates a
   list or dictionary object: it can be accessed by
   row number and field name.  It has these additional methods:
   <function>nrows()</function>, which returns the number of rows
   returned by the query, and <function>status</function>, which is the
   <function>SPI_exec</function> return value.  The result object
   can be modified.

<programlisting>
rv = plpy.execute("SELECT * FROM my_table", 5)
</programlisting>
   returns up to 5 rows from <literal>my_table</literal>.  If <literal>my_table</literal> has a column
   <literal>my_field</literal>, it would be accessed as
<programlisting>
foo = rv[i]["my_field"]
</programlisting>
   The second function <function>plpy.prepare</function> is called
   with a query string, and a list of argument types if you have bind
   variables in the query.
<programlisting>
plan = plpy.prepare("SELECT last_name FROM my_users WHERE first_name = $1", [ "text" ])
</programlisting>
   <literal>text</literal> is the type of the variable you will be passing as
   <literal>$1</literal>.  After preparing the plan, use the function
   <function>plpy.execute</function> to run it.
<programlisting>
rv = plpy.execute(plan, [ "name" ], 5)
</programlisting>
   The limit argument is optional in the call to
   <function>plpy.execute</function>.
  </para>
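
  <para>
   Walking over a result object could look like the sketch below.  The
   classes <literal>FakeResult</literal> and
   <literal>FakePlpy</literal> are stand-ins written for this example
   so that the loop can be run outside the server; they imitate only
   row access by number, field access by name, and
   <function>nrows()</function>.  The table contents are made up.
```python
# Stand-ins for plpy.execute and its result object. Real results come from
# the database; these classes only imitate the access pattern.
class FakeResult:
    def __init__(self, rows):
        self._rows = rows

    def __getitem__(self, i):
        return self._rows[i]

    def nrows(self):
        return len(self._rows)

class FakePlpy:
    def __init__(self, rows):
        self._rows = rows

    def execute(self, query, limit=None):
        # The stand-in ignores the query text and only applies the limit.
        rows = self._rows if limit is None else self._rows[:limit]
        return FakeResult(rows)

plpy = FakePlpy([{"my_field": n} for n in range(1, 8)])

# The part that would appear in a PL/Python function body:
rv = plpy.execute("SELECT * FROM my_table", 5)
values = []
for i in range(rv.nrows()):
    values.append(rv[i]["my_field"])
```
   The loop relies only on <function>nrows()</function> and on
   indexing by row number and field name, which are the operations
   described above.
  </para>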

  <para>
   When you prepare a plan using the PL/Python module, it is
   automatically saved.  Read the SPI documentation (<xref
   linkend="spi">) for a description of what this means.  The
   take-home message is that if you do
<programlisting>
plan = plpy.prepare("SOME QUERY")
plan = plpy.prepare("SOME OTHER QUERY")
</programlisting>
   you are leaking memory, because there is no known way to free a saved plan.
   The alternative of using unsaved plans is even more painful.
  </para>
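
  <para>
   One way to limit the problem is to prepare each query only once and
   keep the plan in the SD dictionary for later calls.  The sketch
   below assumes that a plan object can be stored in SD like any other
   value.  <literal>FakePlpy</literal>, the plain-dictionary SD, and
   the function <function>last_names</function> are stand-ins invented
   so that the pattern can be run outside the server.
```python
# Stand-ins for SD and plpy so the pattern can be run outside the server.
SD = {}

class FakePlpy:
    def __init__(self):
        self.prepared = 0                 # counts calls to prepare

    def prepare(self, query, argtypes=None):
        self.prepared += 1
        return ("plan", query)            # placeholder for a plan object

    def execute(self, plan, params=None, limit=None):
        return [{"last_name": "doe"}]     # made-up result row

plpy = FakePlpy()

def last_names(args):
    # Prepare only on the first call; later calls reuse the saved plan.
    if "plan" not in SD:
        SD["plan"] = plpy.prepare(
            "SELECT last_name FROM my_users WHERE first_name = $1",
            ["text"])
    return plpy.execute(SD["plan"], [args[0]])

last_names(["jane"])
last_names(["john"])
```
   Both calls run the query, but only the first one prepares it, so
   only one plan is ever saved for this function.
  </para>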
 </sect1>

</chapter>