postgresql/doc/src/sgml/plperl.sgml

<!--
$PostgreSQL: pgsql/doc/src/sgml/plperl.sgml,v 2.32 2004/11/21 21:17:01 tgl Exp $
-->

 <chapter id="plperl">
  <title>PL/Perl - Perl Procedural Language</title>

  <indexterm zone="plperl">
   <primary>PL/Perl</primary>
  </indexterm>

  <indexterm zone="plperl">
   <primary>Perl</primary>
  </indexterm>

  <para>
   PL/Perl is a loadable procedural language that enables you to write
   <productname>PostgreSQL</productname> functions in the <ulink
   url="http://www.perl.com">Perl</ulink> programming language.
  </para>

  <para>
   To install PL/Perl in a particular database, use
   <literal>createlang plperl <replaceable>dbname</></literal>.
  </para>

  <tip>
   <para>
    If a language is installed into <literal>template1</>, all subsequently
    created databases will have the language installed automatically.
   </para>
  </tip>

  <note>
   <para>
    Users of source packages must specially enable the build of
    PL/Perl during the installation process.  (Refer to <xref
    linkend="install-short"> for more information.)  Users of
    binary packages might find PL/Perl in a separate subpackage.
   </para>
  </note>

 <sect1 id="plperl-funcs">
  <title>PL/Perl Functions and Arguments</title>

  <para>
   To create a function in the PL/Perl language, use the standard syntax:
<programlisting>
CREATE FUNCTION <replaceable>funcname</replaceable> (<replaceable>argument-types</replaceable>) RETURNS <replaceable>return-type</replaceable> AS $$
    # PL/Perl function body
$$ LANGUAGE plperl;
</programlisting>
   The body of the function is ordinary Perl code.
  </para>

   <para>
    The syntax of the <command>CREATE FUNCTION</command> command requires
    the function body to be written as a string constant.  It is usually
    most convenient to use dollar quoting (see <xref
    linkend="sql-syntax-dollar-quoting">) for the string constant.
    If you choose to use regular single-quoted string constant syntax,
    you must escape single quote marks (<literal>'</>) and backslashes
    (<literal>\</>) used in the body of the function, typically by
    doubling them (see <xref linkend="sql-syntax-strings">).
   </para>

  <para>
   Arguments and results are handled as in any other Perl subroutine:
   arguments are passed in <varname>@_</varname>, and a result value
   is returned with <literal>return</> or as the last expression
   evaluated in the function.
  </para>

  <para>
   For example, a function returning the greater of two integer values
   could be defined as:

<programlisting>
CREATE FUNCTION perl_max (integer, integer) RETURNS integer AS $$
    if ($_[0] &gt; $_[1]) { return $_[0]; }
    return $_[1];
$$ LANGUAGE plperl;
</programlisting>
  </para>

  <para>
   If an SQL null value<indexterm><primary>null value</><secondary
   sortas="PL/Perl">in PL/Perl</></indexterm> is passed to a function,
   the argument value will appear as <quote>undefined</> in Perl.  The
   above function definition will not behave very nicely with null
   inputs (in fact, it will act as though they are zeroes).  We could
   add <literal>STRICT</> to the function definition to make
   <productname>PostgreSQL</productname> do something more reasonable:
   if a null value is passed, the function will not be called at all,
   but will just return a null result automatically.  Alternatively,
   we could check for undefined inputs in the function body.  For
   example, suppose that we wanted <function>perl_max</function> with
   one null and one nonnull argument to return the nonnull argument,
   rather than a null value:

<programlisting>
CREATE FUNCTION perl_max (integer, integer) RETURNS integer AS $$
    my ($a,$b) = @_;
    if (! defined $a) {
        if (! defined $b) { return undef; }
        return $b;
    }
    if (! defined $b) { return $a; }
    if ($a &gt; $b) { return $a; }
    return $b;
$$ LANGUAGE plperl;
</programlisting>
   As shown above, to return an SQL null value from a PL/Perl
   function, return an undefined value.  This can be done whether the
   function is strict or not.
  </para>

  <para>
   Composite-type arguments are passed to the function as references
   to hashes.  The keys of the hash are the attribute names of the
   composite type.  Here is an example:

<programlisting>
CREATE TABLE employee (
    name text,
    basesalary integer,
    bonus integer
);

CREATE FUNCTION empcomp(employee) RETURNS integer AS $$
    my ($emp) = @_;
    return $emp-&gt;{basesalary} + $emp-&gt;{bonus};
$$ LANGUAGE plperl;

SELECT name, empcomp(employee) FROM employee;
</programlisting>
  </para>
 </sect1>

 <sect1 id="plperl-database">
  <title>Database Access from PL/Perl</title>

  <para>
   Access to the database itself from your Perl function can be done
   via the function <function>spi_exec_query</function> described
   below, or via an experimental module <ulink
   url="http://www.cpan.org/modules/by-module/DBD/APILOS/"><literal>DBD::PgSPI</literal></ulink>
   (also available at <ulink
   url="http://www.cpan.org/SITES.html"><acronym>CPAN</> mirror
   sites</ulink>).  This module makes available a
   <acronym>DBI</>-compliant database-handle named
   <varname>$pg_dbh</varname> that can be used to perform queries with
   normal <acronym>DBI</>
   syntax.<indexterm><primary>DBI</></indexterm>
  </para>

  <para>
   PL/Perl itself presently provides two additional Perl commands:

   <variablelist>
    <varlistentry>
     <indexterm>
      <primary>spi_exec_query</primary>
      <secondary>in PL/Perl</secondary>
     </indexterm>

     <term><literal><function>spi_exec_query</>(<replaceable>query</replaceable> [, <replaceable>max-rows</replaceable>])</literal></term>
     <term><literal><function>spi_exec_query</>(<replaceable>command</replaceable>)</literal></term>
     <listitem>
      <para>
       Executes an SQL command.  Here is an example of a query
       (<command>SELECT</command> command) with the optional maximum
       number of rows:
<programlisting>
$rv = spi_exec_query('SELECT * FROM my_table', 5);
</programlisting>
        This returns up to 5 rows from the table
        <literal>my_table</literal>.  If <literal>my_table</literal>
        has a column <literal>my_column</literal>, you can get that
        value from row <literal>$i</literal> of the result like this:
<programlisting>
$foo = $rv-&gt;{rows}[$i]-&gt;{my_column};
</programlisting>
       The total number of rows returned from a <command>SELECT</command>
       query can be accessed like this:
<programlisting>
$nrows = $rv-&gt;{processed}
</programlisting>
      </para>

      <para>
       Here is an example using a different command type:
<programlisting>
$query = "INSERT INTO my_table VALUES (1, 'test')";
$rv = spi_exec_query($query);
</programlisting>
       You can then access the command status (e.g.,
       <literal>SPI_OK_INSERT</literal>) like this:
<programlisting>
$res = $rv-&gt;{status};
</programlisting>
       To get the number of rows affected, do:
<programlisting>
$nrows = $rv-&gt;{processed};
</programlisting>
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <indexterm>
      <primary>elog</primary>
      <secondary>in PL/Perl</secondary>
     </indexterm>

     <term><literal><function>elog</>(<replaceable>level</replaceable>, <replaceable>msg</replaceable>)</literal></term>
     <listitem>
      <para>
       Emit a log or error message. Possible levels are
       <literal>DEBUG</>, <literal>LOG</>, <literal>INFO</>,
       <literal>NOTICE</>, <literal>WARNING</>, and <literal>ERROR</>.
       <literal>ERROR</>
        raises an error condition; if this is not trapped by the surrounding
        Perl code, the error propagates out to the calling query, causing
        the current transaction or subtransaction to be aborted.  This
        is effectively the same as the Perl <literal>die</> command.
        The other levels simply report the message to the system log
        and/or client.
      </para>
     </listitem>
    </varlistentry>
   </variablelist>
  </para>
 </sect1>

 <sect1 id="plperl-data">
  <title>Data Values in PL/Perl</title>

  <para>
   The argument values supplied to a PL/Perl function's code are
   simply the input arguments converted to text form (just as if they
   had been displayed by a <command>SELECT</command> statement).
   Conversely, the <literal>return</> command will accept any string
   that is acceptable input format for the function's declared return
   type.  So, the PL/Perl programmer can manipulate data values as if
   they were just text.
  </para>

  <para>
   PL/Perl can also return row sets and composite types, and row sets
   of composite types.  Here is an example of a PL/Perl function
   returning a row set of a row type.  Note that a composite type is
   always represented as a hash reference.
<programlisting>
CREATE TABLE test (
    i int,
    v varchar
);

INSERT INTO test (i, v) VALUES (1, 'first line');
INSERT INTO test (i, v) VALUES (2, 'second line');
INSERT INTO test (i, v) VALUES (3, 'third line');
INSERT INTO test (i, v) VALUES (4, 'immortal');

CREATE FUNCTION test_munge() RETURNS SETOF test AS $$
    my $res = [];
    my $rv = spi_exec_query('select i, v from test;');
    my $status = $rv-&gt;{status};
    my $nrows = $rv-&gt;{processed};
    foreach my $rn (0 .. $nrows - 1) {
        my $row = $rv-&gt;{rows}[$rn];
        $row-&gt;{i} += 200 if defined($row-&gt;{i});
        $row-&gt;{v} =~ tr/A-Za-z/a-zA-Z/ if (defined($row-&gt;{v}));
        push @$res, $row;
    }
    return $res;
$$ LANGUAGE plperl;

SELECT * FROM test_munge();
</programlisting>
  </para>

  <para>
   Here is an example of a PL/Perl function returning a composite
   type:
<programlisting>
CREATE TYPE testrowperl AS (f1 integer, f2 text, f3 text);

CREATE OR REPLACE FUNCTION perl_row() RETURNS testrowperl AS $$
    return {f2 =&gt; 'hello', f1 =&gt; 1, f3 =&gt; 'world'};
$$ LANGUAGE plperl;
</programlisting>
  </para>

  <para>
   Here is an example of a PL/Perl function returning a row set of a
   composite type.  Since a row set is always a reference to an array
   and a composite type is always a reference to a hash, a rowset of a
   composite type is a reference to an array of hash references.
<programlisting>
CREATE TYPE testsetperl AS (f1 integer, f2 text, f3 text);

CREATE OR REPLACE FUNCTION perl_set() RETURNS SETOF testsetperl AS $$
    return [
        { f1 =&gt; 1, f2 =&gt; 'Hello', f3 =&gt;  'World' },
        { f1 =&gt; 2, f2 =&gt; 'Hello', f3 =&gt;  'PostgreSQL' },
        { f1 =&gt; 3, f2 =&gt; 'Hello', f3 =&gt;  'PL/Perl' }
    ];
$$  LANGUAGE plperl;
</programlisting>
  </para>
 </sect1>

 <sect1 id="plperl-global">
  <title>Global Values in PL/Perl</title>

  <para>
   You can use the global hash <varname>%_SHARED</varname> to store
   data between function calls.  For example:
<programlisting>
CREATE OR REPLACE FUNCTION set_var(name text, val text) RETURNS text AS $$
    if ($_SHARED{$_[0]} = $_[1]) {
        return 'ok';
    } else {
        return "can't set shared variable $_[0] to $_[1]";
    }
$$ LANGUAGE plperl;

CREATE OR REPLACE FUNCTION get_var(name text) RETURNS text AS $$
    return $_SHARED{$_[0]};
$$ LANGUAGE plperl;

SELECT set_var('sample', 'Hello, PL/Perl!  How's tricks?');
SELECT get_var('sample');
</programlisting>
  </para>
 </sect1>

 <sect1 id="plperl-trusted">
  <title>Trusted and Untrusted PL/Perl</title>

  <indexterm zone="plperl-trusted">
   <primary>trusted</primary>
   <secondary>PL/Perl</secondary>
  </indexterm>

  <para>
   Normally, PL/Perl is installed as a <quote>trusted</> programming
   language named <literal>plperl</>.  In this setup, certain Perl
   operations are disabled to preserve security.  In general, the
   operations that are restricted are those that interact with the
   environment. This includes file handle operations,
   <literal>require</literal>, and <literal>use</literal> (for
   external modules).  There is no way to access internals of the
   database server process or to gain OS-level access with the
   permissions of the server process,
   as a C function can do.  Thus, any unprivileged database user may
   be permitted to use this language.
  </para>

  <para>
   Here is an example of a function that will not work because file
   system operations are not allowed for security reasons:
<programlisting>
CREATE FUNCTION badfunc() RETURNS integer AS $$
    open(TEMP, "&gt;/tmp/badfile");
    print TEMP "Gotcha!\n";
    return 1;
$$ LANGUAGE plperl;
</programlisting>
   The creation of the function will succeed, but executing it will not.
  </para>

  <para>
   Sometimes it is desirable to write Perl functions that are not
   restricted.  For example, one might want a Perl function that sends
   mail.  To handle these cases, PL/Perl can also be installed as an
   <quote>untrusted</> language (usually called
   <application>PL/PerlU</application><indexterm><primary>PL/PerlU</></indexterm>).
   In this case the full Perl language is available.  If the
   <command>createlang</command> program is used to install the
   language, the language name <literal>plperlu</literal> will select
   the untrusted PL/Perl variant.
  </para>

  <para>
   The writer of a <application>PL/PerlU</> function must take care that the function
   cannot be used to do anything unwanted, since it will be able to do
   anything that could be done by a user logged in as the database
   administrator.  Note that the database system allows only database
   superusers to create functions in untrusted languages.
  </para>

  <para>
   If the above function was created by a superuser using the language
   <literal>plperlu</>, execution would succeed.
  </para>
 </sect1>

 <sect1 id="plperl-triggers">
  <title>PL/Perl Triggers</title>

  <para>
   PL/Perl can be used to write trigger functions.  In a trigger function,
   the hash reference <varname>$_TD</varname> contains information about the
   current trigger event.  The fields of the <varname>$_TD</varname> hash
   reference are:

   <variablelist>
    <varlistentry>
     <term><literal>$_TD-&gt;{new}{foo}</literal></term>
     <listitem>
      <para>
       <literal>NEW</literal> value of column <literal>foo</literal>
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term><literal>$_TD-&gt;{old}{foo}</literal></term>
     <listitem>
      <para>
       <literal>OLD</literal> value of column <literal>foo</literal>
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term><literal>$_TD{name}</literal></term>
     <listitem>
      <para>
       Name of the trigger being called
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term><literal>$_TD{event}</literal></term>
     <listitem>
      <para>
       Trigger event: <literal>INSERT</>, <literal>UPDATE</>, <literal>DELETE</>, or <literal>UNKNOWN</>
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term><literal>$_TD{when}</literal></term>
     <listitem>
      <para>
       When the trigger was called: <literal>BEFORE</literal>, <literal>AFTER</literal>, or <literal>UNKNOWN</literal>
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term><literal>$_TD{level}</literal></term>
     <listitem>
      <para>
       The trigger level: <literal>ROW</literal>, <literal>STATEMENT</literal>, or <literal>UNKNOWN</literal>
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term><literal>$_TD{relid}</literal></term>
     <listitem>
      <para>
       OID of the table on which the trigger fired
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term><literal>$_TD{relname}</literal></term>
     <listitem>
      <para>
       Name of the table on which the trigger fired
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term><literal>@{$_TD{argv}}</literal></term>
     <listitem>
      <para>
       Arguments of the trigger function
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term><literal>$_TD{argc}</literal></term>
     <listitem>
      <para>
       Number of arguments of the trigger function
      </para>
     </listitem>
    </varlistentry>
   </variablelist>
  </para>

  <para>
   Triggers can return one of the following:

   <variablelist>
    <varlistentry>
     <term><literal>return;</literal></term>
     <listitem>
      <para>
       Execute the statement
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term><literal>"SKIP"</literal></term>
     <listitem>
      <para>
       Don't execute the statement
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term><literal>"MODIFY"</literal></term>
     <listitem>
      <para>
       Indicates that the <literal>NEW</literal> row was modified by
       the trigger function
      </para>
     </listitem>
    </varlistentry>
   </variablelist>
  </para>

  <para>
   Here is an example of a trigger function, illustrating some of the
   above:
<programlisting>
CREATE TABLE test (
    i int,
    v varchar
);

CREATE OR REPLACE FUNCTION valid_id() RETURNS trigger AS $$
    if (($_TD-&gt;{new}{i} &gt;= 100) || ($_TD-&gt;{new}{i} &lt;= 0)) {
        return "SKIP";    # skip INSERT/UPDATE command
    } elsif ($_TD-&gt;{new}{v} ne "immortal") {
        $_TD-&gt;{new}{v} .= "(modified by trigger)";
        return "MODIFY";  # modify row and execute INSERT/UPDATE command
    } else {
        return;           # execute INSERT/UPDATE command
    }
$$ LANGUAGE plperl;

CREATE TRIGGER test_valid_id_trig
    BEFORE INSERT OR UPDATE ON test
    FOR EACH ROW EXECUTE PROCEDURE valid_id();
</programlisting>
  </para>
 </sect1>

 <sect1 id="plperl-missing">
  <title>Limitations and Missing Features</title>

  <para>
   The following features are currently missing from PL/Perl, but they
   would make welcome contributions.

   <itemizedlist>
    <listitem>
     <para>
      PL/Perl functions cannot call each other directly (because they
      are anonymous subroutines inside Perl).
     </para>
    </listitem>

    <listitem>
     <para>
      SPI is not yet fully implemented.
     </para>
    </listitem>

    <listitem>
     <para>
      In the current implementation, if you are fetching or returning
      very large data sets, you should be aware that these will all go
      into memory.  Future features will help with this.  In the
      meantime, we suggest that you not use PL/Perl if you will fetch
      or return very large result sets.
     </para>
    </listitem>
   </itemizedlist>
  </para>
 </sect1>

</chapter>

<!-- Keep this comment at the end of the file
Local variables:
mode:sgml
sgml-omittag:nil
sgml-shorttag:t
sgml-minimize-attributes:nil
sgml-always-quote-attributes:t
sgml-indent-step:1
sgml-indent-data:t
sgml-parent-document:nil
sgml-default-dtd-file:"./reference.ced"
sgml-exposed-tags:nil
sgml-local-catalogs:("/usr/lib/sgml/catalog")
sgml-local-ecat-files:nil
End:
-->