postgresql/doc/src/sgml/plperl.sgml
Bruce Momjian 8dec0c1bf2 lease find enclosed a patch that matches the PL/Perl documentation
(fairly closely, I hope) to the current PL/Perl implementation.

David Fetter
2004-07-21 20:23:05 +00:00

517 lines
15 KiB
Plaintext

<!--
$PostgreSQL: pgsql/doc/src/sgml/plperl.sgml,v 2.24 2004/07/21 20:22:57 momjian Exp $
-->
<chapter id="plperl">
<title>PL/Perl - Perl Procedural Language</title>
<indexterm zone="plperl">
<primary>PL/Perl</primary>
</indexterm>
<indexterm zone="plperl">
<primary>Perl</primary>
</indexterm>
<para>
PL/Perl is a loadable procedural language that enables you to write
<productname>PostgreSQL</productname> functions in the <ulink
url="http://www.perl.com">Perl</ulink> programming language.
</para>
<para>
To install PL/Perl in a particular database, use
<literal>createlang plperl <replaceable>dbname</></literal>.
</para>
<tip>
<para>
If a language is installed into <literal>template1</>, all subsequently
created databases will have the language installed automatically.
</para>
</tip>
<note>
<para>
Users of source packages must specially enable the build of
PL/Perl during the installation process. (Refer to <xref
linkend="install-short"> for more information.) Users of
binary packages might find PL/Perl in a separate subpackage.
</para>
</note>
<sect1 id="plperl-funcs">
<title>PL/Perl Functions and Arguments</title>
<para>
To create a function in the PL/Perl language, use the standard syntax:
<programlisting>
CREATE FUNCTION <replaceable>funcname</replaceable>
(<replaceable>argument-types</replaceable>) RETURNS <replaceable>return-type</replaceable> AS $$
# PL/Perl function body
$$ LANGUAGE plperl;
</programlisting>
The body of the function is ordinary Perl code. Since the body of
the function is treated as a string by
<productname>PostgreSQL</productname>, it can be specified using
dollar quoting (as shown above), or via the legacy single quote
syntax (see <xref linkend="sql-syntax-strings"> for more
information).
</para>
<para>
Arguments and results are handled as in any other Perl subroutine:
Arguments are passed in <varname>@_</varname>, and a result value
is returned with <literal>return</> or as the last expression
evaluated in the function.
</para>
<para>
For example, a function returning the greater of two integer values
could be defined as:
<programlisting>
CREATE FUNCTION perl_max (integer, integer) RETURNS integer AS $$
if ($_[0] > $_[1]) { return $_[0]; }
return $_[1];
$$ LANGUAGE plperl;
</programlisting>
</para>
<para>
If an SQL <literal>NULL</literal> value<indexterm><primary>null
value</><secondary sortas="PL/Perl">in PL/Perl</></indexterm> is
passed to a function, the argument value will appear as
<quote>undefined</> in Perl. The above function definition will not
behave very nicely with <literal>NULL</literal> inputs (in fact, it
will act as though they are zeroes). We could add <literal>STRICT</>
to the function definition to make
<productname>PostgreSQL</productname> do something more reasonable: if
a <literal>NULL</literal> value is passed, the function will not be
called at all, but will just return a <literal>NULL</literal> result
automatically. Alternatively, we could check for undefined inputs in
the function body. For example, suppose that we wanted
<function>perl_max</function> with one <literal>NULL</literal> and one
non-<literal>NULL</literal> argument to return the
non-<literal>NULL</literal> argument, rather than a
<literal>NULL</literal> value:
<programlisting>
CREATE FUNCTION perl_max (integer, integer) RETURNS integer AS $$
my ($a,$b) = @_;
if (! defined $a) {
if (! defined $b) { return undef; }
return $b;
}
if (! defined $b) { return $a; }
if ($a > $b) { return $a; }
return $b;
$$ LANGUAGE plperl;
</programlisting>
</para>
<para>
As shown above, to return an SQL <literal>NULL</literal> value from
a PL/Perl function, return an undefined value. This can be done
whether the function is strict or not.
</para>
<para>
Composite-type arguments are passed to the function as references
to hashes. The keys of the hash are the attribute names of the
composite type. Here is an example:
<programlisting>
CREATE TABLE employee (
name text,
basesalary integer,
bonus integer
);
CREATE FUNCTION empcomp(employee) RETURNS integer AS $$
my ($emp) = @_;
return $emp->{basesalary} + $emp->{bonus};
$$ LANGUAGE plperl;
SELECT name, empcomp(employee) FROM employee;
</programlisting>
</para>
<para>
There is now support for returning a composite-type result value.
</para>
</sect1>
<sect1 id="plperl-database">
<title>Database Access from PL/Perl</title>
<para>
Access to the database itself from your Perl function can be done via
spi_exec_query, or via an experimental module <ulink
url="http://www.cpan.org/modules/by-module/DBD/APILOS/"><literal>DBD::PgSPI</literal></ulink>
(also available at <ulink url="http://www.cpan.org/SITES.html"><acronym>CPAN</>
mirror sites</ulink>). This module makes available a
<acronym>DBI</>-compliant database-handle named
<varname>$pg_dbh</varname> that can be used to perform queries
with normal <acronym>DBI</> syntax.<indexterm><primary>DBI</></indexterm>
</para>
<para>
PL/Perl itself presently provides two additional Perl commands:
<variablelist>
<varlistentry>
<indexterm>
<primary>spi_exec_query</primary>
<secondary>in PL/Perl</secondary>
</indexterm>
<indexterm>
<primary>elog</primary>
<secondary>in PL/Perl</secondary>
</indexterm>
<term><function>spi_exec_query(</> [ <replaceable>SELECT query</replaceable> [, <replaceable>max_rows</replaceable>]] | [<replaceable>non-SELECT query</replaceable>] ) </term>
<listitem>
<para>
Here is an example of a SELECT query with the optional maximum
number of rows.
<programlisting>
$rv = spi_exec_query('SELECT * from my_table', 5);
</programlisting>
This returns up to 5 rows from my_table.
</para>
<para>
If my_table has a column my_column, it would be accessed as
<programlisting>
$foo = $rv->{rows}[$i]->{my_column};
</programlisting>
</para>
<para>
The number of rows actually returned would be:
<programlisting>
$nrows = @{$rv->{rows}};
</programlisting>
</para>
<para>
Here is an example using a non-SELECT statement.
<programlisting>
$query = "INSERT INTO my_table VALUES (1, 'test')";
$rv = spi_exec_query($query);
</programlisting>
You can then access status (SPI_OK_INSERT, e.g.) like this.
<programlisting>
$res = $rv->{status};
</programlisting>
</para>
<para>
To get the rows affected, do:
<programlisting>
$nrows = $rv->{rows};
</programlisting>
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><function>elog</> <replaceable>level</replaceable>, <replaceable>msg</replaceable></term>
<listitem>
<para>
Emit a log or error message. Possible levels are
<literal>DEBUG</>, <literal>LOG</>, <literal>INFO</>,
<literal>NOTICE</>, <literal>WARNING</>, and <literal>ERROR</>.
<literal>ERROR</> raises an error condition: further execution
of the function is abandoned, and the current transaction is
aborted.
</para>
</listitem>
</varlistentry>
</variablelist>
</para>
</sect1>
<sect1 id="plperl-data">
<title>Data Values in PL/Perl</title>
<para>
The argument values supplied to a PL/Perl function's code are
simply the input arguments converted to text form (just as if they
had been displayed by a <command>SELECT</command> statement).
Conversely, the <literal>return</> command will accept any string
that is acceptable input format for the function's declared return
type. So, the PL/Perl programmer can manipulate data values as if
they were just text.
</para>
<para>
PL/Perl can now return rowsets and composite types, and rowsets of
composite types.
</para>
<para>
Here is an example of a PL/Perl function returning a rowset of a row type:
<programlisting>
CREATE TABLE test (
i int,
v varchar
);
INSERT INTO test (i, v) VALUES (1,'first line');
INSERT INTO test (i, v) VALUES (2,'second line');
INSERT INTO test (i, v) VALUES (3,'third line');
INSERT INTO test (i, v) VALUES (4,'immortal');
create function test_munge() returns setof test language plperl as $$
my $res = [];
my $rv = spi_exec_query('select i,v from test;');
my $status = $rv->{status};
my $rows = @{$rv->{rows}};
my $processed = $rv->{processed};
foreach my $rn (0..$rows-1) {
my $row = $rv->{rows}[$rn];
$row->{i} += 200 if defined($row->{i});
$row->{v} =~ tr/A-Za-z/a-zA-Z/ if (defined($row->{v}));
push @$res,$row;
}
return $res;
$$;
select * from test_munge();
</programlisting>
</para>
<para>
Here is an example of a PL/Perl function returning a composite type:
<programlisting>
CREATE TYPE testrowperl AS (f1 integer, f2 text, f3 text);
CREATE OR REPLACE FUNCTION perl_row() RETURNS testrowperl AS $$
return {f2 => 'hello', f1 => 1, f3 => 'world'};
$$ LANGUAGE plperl;
</programlisting>
</para>
<para>
Here is an example of a PL/Perl function returning a rowset of a composite type.
<programlisting>
CREATE TYPE testsetperl AS (f1 integer, f2 text, f3 text);
CREATE OR REPLACE FUNCTION perl_set() RETURNS SETOF testsetperl AS $$
return[
{f1 => 1, f2 => 'hello', f3 => 'world'},
{f1 => 2, f2 => 'hello', f3 => 'postgres'},
{f1 => 3, f2 => 'hello', f3 => 'plperl'}
];
$$ LANGUAGE plperl;
</programlisting>
</para>
</sect1>
<sect1 id="plperl-global">
<title>Global Values in PL/Perl</title>
<para>
You can use the %_SHARED to store data between function calls. WHY
IS THIS A HASH, AND NOT A HASH REF?
</para>
<para>
For example:
<programlisting>
CREATE OR REPLACE FUNCTION set_var(TEXT) RETURNS TEXT AS $$
$_SHARED{first} = 'Hello, PL/Perl!';
return 'ok';
$$ LANGUAGE plperl;
CREATE OR REPLACE FUNCTION get_var() RETURNS text AS $$
return $_SHARED{first};
$$ LANGUAGE plperl;
SELECT set_var('hello plperl');
SELECT get_var();
</programlisting>
</para>
</sect1>
<sect1 id="plperl-trusted">
<title>Trusted and Untrusted PL/Perl</title>
<indexterm zone="plperl-trusted">
<primary>trusted</primary>
<secondary>PL/Perl</secondary>
</indexterm>
<para>
Normally, PL/Perl is installed as a <quote>trusted</> programming
language named <literal>plperl</>. In this setup, certain Perl
operations are disabled to preserve security. In general, the
operations that are restricted are those that interact with the
environment. This includes file handle operations,
<literal>require</literal>, and <literal>use</literal> (for
external modules). There is no way to access internals of the
database server process or to gain OS-level access with the
permissions of the server process,
as a C function can do. Thus, any unprivileged database user may
be permitted to use this language.
</para>
<para>
Here is an example of a function that will not work because file
system operations are not allowed for security reasons:
<programlisting>
CREATE FUNCTION badfunc() RETURNS integer AS $$
open(TEMP, ">/tmp/badfile");
print TEMP "Gotcha!\n";
return 1;
$$ LANGUAGE plperl;
</programlisting>
The creation of the function will succeed, but executing it will not.
</para>
<para>
Sometimes it is desirable to write Perl functions that are not
restricted. For example, one might want a Perl function that sends
mail. To handle these cases, PL/Perl can also be installed as an
<quote>untrusted</> language (usually called
<application>PL/PerlU</application><indexterm><primary>PL/PerlU</></indexterm>).
In this case the full Perl language is available. If the
<command>createlang</command> program is used to install the
language, the language name <literal>plperlu</literal> will select
the untrusted PL/Perl variant.
</para>
<para>
The writer of a <application>PL/PerlU</> function must take care that the function
cannot be used to do anything unwanted, since it will be able to do
anything that could be done by a user logged in as the database
administrator. Note that the database system allows only database
superusers to create functions in untrusted languages.
</para>
<para>
If the above function was created by a superuser using the language
<literal>plperlu</>, execution would succeed.
</para>
</sect1>
<sect1 id="plperl-triggers">
<title>PL/Perl Triggers</title>
<para>
PL/Perl can now be used to write trigger functions using the
<varname>$_TD</varname> hash reference.
</para>
<para>
Some useful parts of the $_TD hash reference are:
<programlisting>
$_TD->{new}{foo} # NEW value of column foo
$_TD->{old}{bar} # OLD value of column bar
$_TD{name} # Name of the trigger being called
$_TD{event} # INSERT, UPDATE, DELETE or UNKNOWN
$_TD{when} # BEFORE, AFTER or UNKNOWN
$_TD{level} # ROW, STATEMENT or UNKNOWN
$_TD{relid} # Relation ID of the table on which the trigger occurred.
$_TD{relname} # Name of the table on which the trigger occurred.
@{$_TD{argv}} # Array of arguments to the trigger function. May be empty.
$_TD{argc} # Number of arguments to the trigger. Why is this here?
</programlisting>
</para>
<para>
Triggers can return one of the following:
<programlisting>
return; -- Executes the statement
SKIP; -- Doesn't execute the statement
MODIFY; -- Says it modified a NEW row
</programlisting>
</para>
<para>
Here is an example of a trigger function, illustrating some of the
above.
<programlisting>
CREATE TABLE test (
i int,
v varchar
);
CREATE OR REPLACE FUNCTION valid_id() RETURNS trigger AS $$
if (($_TD->{new}{i}>=100) || ($_TD->{new}{i}<=0)) {
return "SKIP"; # Skip INSERT/UPDATE command
} elsif ($_TD->{new}{v} ne "immortal") {
$_TD->{new}{v} .= "(modified by trigger)";
return "MODIFY"; # Modify tuple and proceed INSERT/UPDATE command
} else {
return; # Proceed INSERT/UPDATE command
}
$$ LANGUAGE plperl;
CREATE TRIGGER "test_valid_id_trig" BEFORE INSERT OR UPDATE ON test
FOR EACH ROW EXECUTE PROCEDURE "valid_id"();
</programlisting>
</para>
</sect1>
<sect1 id="plperl-missing">
<title>Limitations and Missing Features</title>
<para>
The following features are currently missing from PL/Perl, but they
would make welcome contributions.
<itemizedlist>
<listitem>
<para>
PL/Perl functions cannot call each other directly (because they
are anonymous subroutines inside Perl).
</para>
</listitem>
<listitem>
<para>
<application>Full SPI</application> is not yet implemented.
</para>
</listitem>
<listitem>
<para>
In the current implementation, if you are fetching or
returning very large datasets, you should be aware that these
will all go into memory. Future features will help with this.
In the meantime, we suggest that you not use pl/perl if you
will fetch or return very large result sets.
</para>
</listitem>
</itemizedlist>
</para>
</sect1>
</chapter>
<!-- Keep this comment at the end of the file
Local variables:
mode:sgml
sgml-omittag:nil
sgml-shorttag:t
sgml-minimize-attributes:nil
sgml-always-quote-attributes:t
sgml-indent-step:1
sgml-indent-data:t
sgml-parent-document:nil
sgml-default-dtd-file:"./reference.ced"
sgml-exposed-tags:nil
sgml-local-catalogs:("/usr/lib/sgml/catalog")
sgml-local-ecat-files:nil
End:
-->