Update information about compiling extension modules.

Peter Eisentraut 2001-01-12 22:15:32 +00:00
parent 6162432de9
commit a32542a1c0
3 changed files with 386 additions and 471 deletions


@ -1,373 +1,296 @@
<!--
$Header: /cvsroot/pgsql/doc/src/sgml/dfunc.sgml,v 1.11 2000/09/29 20:21:33 petere Exp $
$Header: /cvsroot/pgsql/doc/src/sgml/dfunc.sgml,v 1.12 2001/01/12 22:15:32 petere Exp $
-->
<chapter id="dfunc">
<title id="dfunc-title">Linking Dynamically-Loaded Functions</title>
<sect2 id="dfunc">
<title id="dfunc-title">Compiling and Linking Dynamically-Loaded Functions</title>
<para>
<para>
Before you can use your
<productname>PostgreSQL</productname> extension functions written in
C, they must be compiled and linked in a special way so that they
can be dynamically loaded by the server as needed. To be
precise, a <firstterm>shared library</firstterm> needs to be created.
</para>
After you have created and registered a user-defined
function, your work is essentially done.
<productname>Postgres</productname>,
however, must load the object code
(e.g., a <literal>.o</literal> file, or
a shared library) that implements your function. As
previously mentioned, <productname>Postgres</productname>
loads your code at
runtime, as required. In order to allow your code to be
dynamically loaded, you may have to compile and
link-edit it in a special way. This section briefly
describes how to perform the compilation and
link-editing required before you can load your user-defined
functions into a running <productname>Postgres</productname> server.
</para>
<para>
For more information you should read the documentation of your
operating system, in particular the manual pages for the C compiler,
<command>cc</command>, and the link editor, <command>ld</command>.
In addition, the <productname>PostgreSQL</productname> source code
contains several working examples in the
<filename>contrib</filename> directory. If you rely on these
examples you will make your modules dependent on the availability
of the <productname>PostgreSQL</productname> source code, however.
</para>
<para>
Creating shared libraries is generally analogous to linking
executables: first the source files are compiled into object files,
then the object files are linked together. The object files need to
be created as <firstterm>position-independent code</firstterm>
(<acronym>PIC</acronym>), which conceptually means that it can be
placed at an arbitrary location in memory when it is loaded by the
executable. (Object files intended for executables are not compiled
that way.) The command to link a shared library contains special
flags to distinguish it from linking an executable. At least
this is the theory; on some systems the practice is much uglier.
</para>
<para>
In the following examples we assume that your source code is in a
file <filename>foo.c</filename> and we will create a shared library
<filename>foo.so</filename>. The intermediate object file will be
called <filename>foo.o</filename> unless otherwise noted. A shared
library can contain more than one object file, but we only use one
here.
</para>
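<para>
As a stand-in for <filename>foo.c</filename> when trying the
commands in this section, a minimal sketch could look like the
following. The function <function>add_one</function> is a
hypothetical placeholder that deliberately avoids the
<productname>PostgreSQL</productname> headers, so it is not a
complete extension function; it only shows that a source file
destined for a shared library exports functions and has no
<function>main</function>.
</para>

```c
/*
 * foo.c: hypothetical stand-in source file for trying the compile and
 * link commands. It does not include postgres.h, so it is only a
 * placeholder, not a complete PostgreSQL extension function.
 */

/* A shared library has no main(); it only exports functions like this. */
int
add_one(int arg)
{
    return arg + 1;
}
```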
<para>
<!--
<tip>
<para>
The old <productname>Postgres</productname> dynamic
loading mechanism required
in-depth knowledge in terms of executable format, placement
and alignment of executable instructions within memory, etc.
on the part of the person writing the dynamic loader. Such
loaders tended to be slow and buggy. As of Version 4.2, the
<productname>Postgres</productname> dynamic loading mechanism
has been rewritten to use
the dynamic loading mechanism provided by the operating
system. This approach is generally faster, more reliable and
more portable than our previous dynamic loading mechanism.
The reason for this is that nearly all modern versions of
Unix use a dynamic loading mechanism to implement shared
libraries and must therefore provide a fast and reliable
mechanism. On the other hand, the object file must be
postprocessed a bit before it can be loaded into
<productname>Postgres</productname>. We
hope that the large increase in speed and reliability will
make up for the slight decrease in convenience.
</para>
</tip>
</para>
Note: Reading GNU Libtool sources is generally a good way of figuring out
this information. The methods used within PostgreSQL source code are not
necessarily ideal.
-->
<variablelist>
<varlistentry>
<term><productname>BSD/OS</productname></term>
<listitem>
<para>
The compiler flag to create <acronym>PIC</acronym> is
<option>-fpic</option>. The linker flag to create shared
libraries is <option>-shared</option>.
<programlisting>
gcc -fpic -c foo.c
ld -shared -o foo.so foo.o
</programlisting>
This is applicable as of version 4.0 of
<productname>BSD/OS</productname>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><productname>FreeBSD</productname></term>
<listitem>
<para>
The compiler flag to create <acronym>PIC</acronym> is
<option>-fpic</option>. To create shared libraries the compiler
flag is <option>-shared</option>.
<programlisting>
gcc -fpic -c foo.c
gcc -shared -o foo.so foo.o
</programlisting>
This is applicable as of version 3.0 of
<productname>FreeBSD</productname>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><productname>HP-UX</productname></term>
<listitem>
<para>
The compiler flag of the system compiler to create
<acronym>PIC</acronym> is <option>+z</option>. When using
<productname>GCC</productname> it's <option>-fpic</option>. The
linker flag for shared libraries is <option>-b</option>. So
<programlisting>
cc +z -c foo.c
</programlisting>
or
<programlisting>
gcc -fpic -c foo.c
</programlisting>
and then
<programlisting>
ld -b -o foo.sl foo.o
</programlisting>
<productname>HP-UX</productname> uses the extension
<filename>.sl</filename> for shared libraries, unlike most other
systems.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><productname>Irix</productname></term>
<listitem>
<para>
<acronym>PIC</acronym> is the default, so no special compiler
options are necessary. The linker option to produce shared
libraries is <option>-shared</option>.
<programlisting>
cc -c foo.c
ld -shared -o foo.so foo.o
</programlisting>
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><productname>Linux</productname></term>
<listitem>
<para>
The compiler flag to create <acronym>PIC</acronym> is
<option>-fpic</option>. On some platforms
<option>-fPIC</option> must be used instead if <option>-fpic</option>
does not work. Refer to the GCC manual for more information.
The compiler flag to create a shared library is
<option>-shared</option>. A complete example looks like this:
<programlisting>
cc -fpic -c foo.c
cc -shared -o foo.so foo.o
</programlisting>
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><productname>NetBSD</productname></term>
<listitem>
<para>
The compiler flag to create <acronym>PIC</acronym> is
<option>-fpic</option>. For <acronym>ELF</acronym> systems, the
compiler with the flag <option>-shared</option> is used to link
shared libraries. On the older non-ELF systems, <literal>ld
-Bshareable</literal> is used.
<programlisting>
gcc -fpic -c foo.c
gcc -shared -o foo.so foo.o
</programlisting>
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><productname>OpenBSD</productname></term>
<listitem>
<para>
The compiler flag to create <acronym>PIC</acronym> is
<option>-fpic</option>. <literal>ld -Bshareable</literal> is
used to link shared libraries.
<programlisting>
gcc -fpic -c foo.c
ld -Bshareable -o foo.so foo.o
</programlisting>
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Digital Unix/Tru64 UNIX</term>
<listitem>
<para>
<acronym>PIC</acronym> is the default, so the compilation command
is the usual one. <command>ld</command> with special options is
used to do the linking:
<programlisting>
cc -c foo.c
ld -shared -expect_unresolved '*' -o foo.so foo.o
</programlisting>
The same procedure is used with GCC instead of the system
compiler; no special options are required.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><productname>Solaris</productname></term>
<listitem>
<para>
The compiler flag to create <acronym>PIC</acronym> is
<option>-KPIC</option> with the Sun compiler and
<option>-fpic</option> with <productname>GCC</productname>. To
link shared libraries, the compiler option is
<option>-G</option> with either compiler or alternatively
<option>-shared</option> with <productname>GCC</productname>.
<programlisting>
cc -KPIC -c foo.c
cc -G -o foo.so foo.o
</programlisting>
or
<programlisting>
gcc -fpic -c foo.c
gcc -G -o foo.so foo.o
</programlisting>
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><productname>Unixware</productname></term>
<listitem>
<para>
The compiler flag to create <acronym>PIC</acronym> is <option>-K
PIC</option> with the SCO compiler and <option>-fpic</option>
with <productname>GCC</productname>. To link shared libraries,
the compiler option is <option>-G</option> with the SCO compiler
and <option>-shared</option> with
<productname>GCC</productname>.
<programlisting>
cc -K PIC -c foo.c
cc -G -o foo.so foo.o
</programlisting>
or
<programlisting>
gcc -fpic -c foo.c
gcc -shared -o foo.so foo.o
</programlisting>
</para>
</listitem>
</varlistentry>
</variablelist>
</para>
<tip>
<para>
You should expect to read (and reread, and re-reread) the
manual pages for the C compiler, cc(1), and the link
editor, ld(1), if you have specific questions. In
addition, the contrib area (<filename>PGROOT/contrib</filename>)
and the regression test suites in the directory
<filename>PGROOT/src/test/regress</filename> contain several
working examples of this process. If you copy an example then
you should not have any problems.
If you want to package your extension modules for wide distribution
you should consider using <ulink
url="http://www.gnu.org/software/libtool/"><productname>GNU
Libtool</productname></ulink> for building shared libraries. It
encapsulates the platform differences into a general and powerful
interface. Serious packaging also requires attention to
library versioning, symbol resolution methods, and other issues.
</para>
</tip>
<para>
The following terminology will be used below:
<itemizedlist>
<listitem>
<para>
<firstterm>Dynamic loading</firstterm>
is what <productname>Postgres</productname> does to an object file. The
object file is copied into the running <productname>Postgres</productname>
server and the functions and variables within the
file are made available to the functions within
the <productname>Postgres</productname> process.
<productname>Postgres</productname> does this using
the dynamic loading mechanism provided by the
operating system.
</para>
</listitem>
<listitem>
<para>
<firstterm>Loading and link editing</firstterm>
is what you do to an object file in order to produce
another kind of object file (e.g., an executable
program or a shared library). You perform
this using the link editing program, ld(1).
</para>
</listitem>
</itemizedlist>
</para>
<para>
The following general restrictions and notes also apply
to the discussion below:
<itemizedlist>
<listitem>
<para>
Paths given to the create function command must be
absolute paths (i.e., start with "/") that refer to
directories visible on the machine on which the
<productname>Postgres</productname> server is running.
<tip>
<para>
Relative paths do in fact work,
but are relative to
the directory where the database resides (which is generally
invisible to the frontend application). Obviously, it makes
no sense to make the path relative to the directory in which
the user started the frontend application, since the server
could be running on a completely different machine!
</para>
</tip>
</para>
</listitem>
<listitem>
<para>
The <productname>Postgres</productname> user must be able to traverse the path
given to the create function command and be able to
read the object file. This is because the <productname>Postgres</productname>
server runs as the <productname>Postgres</productname> user, not as the user
who starts up the frontend process. (Making the
file or a higher-level directory unreadable and/or
unexecutable by the "postgres" user is an extremely
common mistake.)
</para>
</listitem>
<listitem>
<para>
Symbol names defined within object files must not
conflict with each other or with symbols defined in
<productname>Postgres</productname>.
</para>
</listitem>
<listitem>
<para>
The GNU C compiler usually does not provide the special
options that are required to use the operating
system's dynamic loader interface. In such cases,
the C compiler that comes with the operating system
must be used.
</para>
</listitem>
</itemizedlist>
</para>
<sect1 id="dload-linux">
<title>Linux</title>
<para>
The resulting shared library file can then be loaded into
<productname>Postgres</productname>. When specifying the file name
to the <command>CREATE FUNCTION</command> command, one must give it
the name of the shared library file (ending in
<filename>.so</filename>) rather than the simple object file.
<note>
<para>
Under Linux ELF, object files can be generated by specifying the compiler
flag -fpic.
Actually, <productname>Postgres</productname> does not care what
you name the file as long as it is a shared library file.
</para>
</note>
<para>
For example,
<programlisting>
# simple Linux example
% cc -fpic -c <replaceable>foo.c</replaceable>
</programlisting>
produces an object file called <replaceable>foo.o</replaceable>
that can then be
dynamically loaded into <productname>Postgres</productname>.
No additional loading or link-editing must be performed.
</para>
</sect1>
Paths given to the <command>CREATE FUNCTION</command> command must
be absolute paths (i.e., start with <literal>/</literal>) that refer
to directories visible on the machine on which the
<productname>Postgres</productname> server is running. Relative
paths do in fact work, but are relative to the directory where the
database resides (which is generally invisible to the frontend
application). Obviously, it makes no sense to make the path
relative to the directory in which the user started the frontend
application, since the server could be running on a completely
different machine! The user id the
<productname>Postgres</productname> server runs as must be able to
traverse the path given to the <command>CREATE FUNCTION</command>
command and be able to read the shared library file. (Making the
file or a higher-level directory not readable and/or not executable
by the <quote>postgres</quote> user is a common mistake.)
</para>
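<para>
The path rules above can be checked before running
<command>CREATE FUNCTION</command>. The following is a rough sketch
only: both helper functions are hypothetical and not part of
<productname>PostgreSQL</productname>. Note that
<function>access()</function> tests the permissions of the user
running the check, so it is only meaningful when run as the user id
the server runs as.
</para>

```c
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: true if the path is absolute, i.e. starts with "/". */
bool
path_is_absolute(const char *path)
{
    return path != NULL && path[0] == '/';
}

/*
 * Hypothetical helper: true if the path is absolute and the calling user
 * can read the file. access() also fails if a directory on the way to
 * the file cannot be traversed, which covers the common mistake of an
 * unreadable file or unsearchable higher-level directory.
 */
bool
library_path_is_usable(const char *path)
{
    return path_is_absolute(path) && access(path, R_OK) == 0;
}
```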
<!--
<sect1 id="dload-ultrix">
<title><acronym>ULTRIX</acronym></title>
<para>
It is very easy to build dynamically-loaded object
files under ULTRIX. ULTRIX does not have any shared library
mechanism and hence does not place any restrictions on
the dynamic loader interface. On the other
hand, we had to (re)write a non-portable dynamic loader
ourselves and could not use true shared libraries.
Under ULTRIX, the only restriction is that you must
produce each object file with the option -G 0. (Notice
that that's the numeral ``0'' and not the letter
``O''). For example,
<programlisting>
# simple ULTRIX example
% cc -G 0 -c foo.c
</programlisting>
produces an object file called foo.o that can then be
dynamically loaded into <productname>Postgres</productname>.
No additional loading or link-editing must be performed.
</para>
</sect1>
-->
<sect1 id="dload-osf">
<title><acronym>DEC OSF/1</acronym></title>
<para>
Under DEC OSF/1, you can take any simple object file
and produce a shared object file by running the ld command
over it with the correct options. The commands to
do this look like:
<programlisting>
# simple DEC OSF/1 example
% cc -c foo.c
% ld -shared -expect_unresolved '*' -o foo.so foo.o
</programlisting>
The resulting shared object file can then be loaded
into <productname>Postgres</productname>. When specifying the object file name to
the create function command, one must give it the name
of the shared object file (ending in .so) rather than
the simple object file.
<tip>
<para>
Actually, <productname>Postgres</productname> does not care
what you name the
file as long as it is a shared object file. If you prefer
to name your shared object files with the extension .o, this
is fine with <productname>Postgres</productname>
so long as you make sure that the correct
file name is given to the create function command. In
other words, you must simply be consistent. However, from a
pragmatic point of view, we discourage this practice because
you will undoubtedly confuse yourself with regards to which
files have been made into shared object files and which have
not. For example, it's very hard to write Makefiles to do
the link-editing automatically if both the object file and
the shared object file end in .o!
</para>
</tip>
If the file you specify is
not a shared object, the backend will hang!
</para>
</sect1>
<sect1 id="dload-other">
<title>
<acronym>SunOS 4.x</acronym>, <acronym>Solaris 2.x</acronym> and
<acronym>HP-UX</acronym></title>
<para>
Under SunOS 4.x, Solaris 2.x and HP-UX, the simple
object file must be created by compiling the source
file with special compiler flags and a shared library
must be produced.
The necessary steps with HP-UX are as follows. The +z
flag to the HP-UX C compiler produces
<firstterm>Position Independent Code</firstterm> (PIC)
and the +u flag removes
some alignment restrictions that the PA-RISC architecture
normally enforces. The object file must be turned
into a shared library using the HP-UX link editor with
the -b option. This sounds complicated but is actually
very simple, since the commands to do it are just:
<programlisting>
# simple HP-UX example
% cc +z +u -c foo.c
% ld -b -o foo.sl foo.o
</programlisting>
</para>
<para>
As with the .so files mentioned in the last subsection,
the create function command must be told which file is
the correct file to load (i.e., you must give it the
location of the shared library, or .sl file).
Under SunOS 4.x, the commands look like:
<programlisting>
# simple SunOS 4.x example
% cc -PIC -c foo.c
% ld -dc -dp -Bdynamic -o foo.so foo.o
</programlisting>
and the equivalent lines under Solaris 2.x are:
<programlisting>
# simple Solaris 2.x example
% cc -K PIC -c foo.c
% ld -G -Bdynamic -o foo.so foo.o
</programlisting>
or
<programlisting>
# simple Solaris 2.x example
% gcc -fPIC -c foo.c
% ld -G -Bdynamic -o foo.so foo.o
</programlisting>
</para>
<para>
When linking shared libraries, you may have to specify
some additional shared libraries (typically system
libraries, such as the C and math libraries) on your ld
command line.
</para>
</sect1>
<!--
Future integration: Create separate sections for these operating
systems and integrate the info from this old man page.
- thomas 2000-04-21
Under HP-UX, DEC OSF/1, AIX and SunOS 4, all object files must be
turned into
.IR "shared libraries"
using the operating system's native object file loader,
.IR ld(1).
.PP
Under HP-UX, an object file must be compiled using the native HP-UX C
compiler,
.IR /bin/cc ,
with both the \*(lq+z\*(rq and \*(lq+u\*(rq flags turned on. The
first flag turns the object file into \*(lqposition-independent
code\*(rq (PIC); the second flag removes some alignment restrictions
that the PA-RISC architecture normally enforces. The object file must
then be turned into a shared library using the HP-UX loader,
.IR /bin/ld .
The command lines to compile a C source file, \*(lqfoo.c\*(rq, look
like:
.nf
cc <other flags> +z +u -c foo.c
ld <other flags> -b -o foo.sl foo.o
.fi
The object file name in the
.BR as
clause should end in \*(lq.sl\*(rq.
.PP
An extra step is required under versions of HP-UX prior to 9.00. If
the Postgres header file
.nf
include/c.h
.fi
is not included in the source file, then the following line must also
be added at the top of every source file:
.nf
#pragma HP_ALIGN HPUX_NATURAL_S500
.fi
However, this line must not appear in programs compiled under HP-UX
9.00 or later.
.PP
Under DEC OSF/1, an object file must be compiled and then turned
into a shared library using the OSF/1 loader,
.IR /bin/ld .
In this case, the command lines look like:
.nf
cc <other flags> -c foo.c
ld <other flags> -shared -expect_unresolved '*' -o foo.so foo.o
.fi
The object file name in the
.BR as
clause should end in \*(lq.so\*(rq.
.PP
Under SunOS 4, an object file must be compiled and then turned into a
shared library using the SunOS 4 loader,
.IR /bin/ld .
The command lines look like:
.nf
cc <other flags> -PIC -c foo.c
ld <other flags> -dc -dp -Bdynamic -o foo.so foo.o
.fi
The object file name in the
.BR as
clause should end in \*(lq.so\*(rq.
.PP
Under AIX, object files are compiled normally but building the shared
library requires a couple of steps. First, create the object file:
.nf
@ -389,7 +312,7 @@ procedure.
-->
</chapter>
</sect2>
<!-- Keep this comment at the end of the file
Local variables:


@ -1,5 +1,5 @@
<!--
$Header: /cvsroot/pgsql/doc/src/sgml/Attic/programmer.sgml,v 1.29 2000/11/24 17:44:21 petere Exp $
$Header: /cvsroot/pgsql/doc/src/sgml/Attic/programmer.sgml,v 1.30 2001/01/12 22:15:32 petere Exp $
PostgreSQL Programmer's Guide.
-->
@ -72,7 +72,6 @@ PostgreSQL Programmer's Guide.
&indexcost;
&gist;
&xplang;
&dfunc;
<!-- reference -->


@ -1,5 +1,5 @@
<!--
$Header: /cvsroot/pgsql/doc/src/sgml/xfunc.sgml,v 1.26 2000/12/26 00:10:37 petere Exp $
$Header: /cvsroot/pgsql/doc/src/sgml/xfunc.sgml,v 1.27 2001/01/12 22:15:32 petere Exp $
-->
<chapter id="xfunc">
@ -632,10 +632,10 @@ SELECT clean_EMP();
the <literal>int4</literal> type on Unix
machines might be:
<programlisting>
<programlisting>
/* 4-byte integer, passed by value */
typedef int int4;
</programlisting>
</programlisting>
</para>
<para>
@ -643,13 +643,13 @@ typedef int int4;
be passed by-reference. For example, here is a sample
implementation of a <productname>Postgres</productname> type:
<programlisting>
<programlisting>
/* 16-byte structure, passed by reference */
typedef struct
{
double x, y;
} Point;
</programlisting>
</programlisting>
</para>
<para>
@ -670,12 +670,12 @@ typedef struct
(i.e., it includes the size of the length field
itself). We can define the text type as follows:
<programlisting>
<programlisting>
typedef struct {
int4 length;
char data[1];
} text;
</programlisting>
</programlisting>
</para>
<para>
@ -687,7 +687,7 @@ typedef struct {
For example, if we wanted to store 40 bytes in a text
structure, we might use a code fragment like this:
<programlisting>
<programlisting>
#include "postgres.h"
...
char buffer[40]; /* our source data */
@ -696,7 +696,7 @@ text *destination = (text *) palloc(VARHDRSZ + 40);
destination-&gt;length = VARHDRSZ + 40;
memmove(destination-&gt;data, buffer, 40);
...
</programlisting>
</programlisting>
</para>
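<para>
The length bookkeeping in the fragment above can be tried outside
the server with a self-contained sketch. Everything named
<literal>sketch_</literal> or <literal>SKETCH_</literal> here is
hypothetical: <function>malloc</function> stands in for
<function>palloc</function>, <literal>SKETCH_HDRSZ</literal> stands
in for <literal>VARHDRSZ</literal>, and a C99 flexible array member
replaces the <literal>data[1]</literal> declaration. The point it
illustrates is that the stored length counts the length field
itself.
</para>

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the text type; not the real PostgreSQL definition. */
typedef struct
{
    int32_t length;     /* total size in bytes, including this field */
    char    data[];     /* payload bytes, not null-terminated */
} sketch_text;

/* Hypothetical counterpart of VARHDRSZ: the size of the length header. */
#define SKETCH_HDRSZ offsetof(sketch_text, data)

/*
 * Build a value holding n payload bytes copied from src.
 * malloc stands in for palloc; returns NULL if allocation fails.
 */
sketch_text *
make_sketch_text(const char *src, size_t n)
{
    sketch_text *t = malloc(SKETCH_HDRSZ + n);

    if (t == NULL)
        return NULL;
    t->length = (int32_t) (SKETCH_HDRSZ + n);
    memcpy(t->data, src, n);
    return t;
}

/* The payload size is the stored length minus the header size. */
size_t
sketch_text_payload_size(const sketch_text *t)
{
    return (size_t) t->length - SKETCH_HDRSZ;
}
```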
<para>
@ -709,7 +709,7 @@ memmove(destination-&gt;data, buffer, 40);
<title>Version-0 Calling Conventions for C-Language Functions</title>
<para>
We present the "old style" calling convention first --- although
We present the <quote>old style</quote> calling convention first --- although
this approach is now deprecated, it's easier to get a handle on
initially. In the version-0 method, the arguments and result
of the C function are just declared in normal C style, but being
@ -720,7 +720,7 @@ memmove(destination-&gt;data, buffer, 40);
<para>
Here are some examples:
<programlisting>
<programlisting>
#include &lt;string.h&gt;
#include "postgres.h"
@ -786,7 +786,7 @@ concat_text(text *arg1, text *arg2)
strncat(VARDATA(new_text), VARDATA(arg2), VARSIZE(arg2)-VARHDRSZ);
return new_text;
}
</programlisting>
</programlisting>
</para>
<para>
@ -795,7 +795,7 @@ concat_text(text *arg1, text *arg2)
we could define the functions to <productname>Postgres</productname>
with commands like this:
<programlisting>
<programlisting>
CREATE FUNCTION add_one(int4) RETURNS int4
AS '<replaceable>PGROOT</replaceable>/tutorial/funcs.so' LANGUAGE 'c'
WITH (isStrict);
@ -817,7 +817,7 @@ CREATE FUNCTION copytext(text) RETURNS text
CREATE FUNCTION concat_text(text, text) RETURNS text
AS '<replaceable>PGROOT</replaceable>/tutorial/funcs.so' LANGUAGE 'c'
WITH (isStrict);
</programlisting>
</programlisting>
</para>
<para>
@ -855,13 +855,13 @@ CREATE FUNCTION concat_text(text, text) RETURNS text
The version-1 calling convention relies on macros to suppress most
of the complexity of passing arguments and results. The C declaration
of a version-1 function is always
<programlisting>
Datum funcname(PG_FUNCTION_ARGS)
</programlisting>
<programlisting>
Datum funcname(PG_FUNCTION_ARGS)
</programlisting>
In addition, the macro call
<programlisting>
PG_FUNCTION_INFO_V1(funcname);
</programlisting>
<programlisting>
PG_FUNCTION_INFO_V1(funcname);
</programlisting>
must appear in the same source file (conventionally it's written
just before the function itself). This macro call is not needed
for "internal"-language functions, since Postgres currently assumes
@ -870,16 +870,18 @@ CREATE FUNCTION concat_text(text, text) RETURNS text
</para>
<para>
In a version-1 function,
each actual argument is fetched using a PG_GETARG_xxx() macro that
corresponds to the argument's datatype, and the result is returned
using a PG_RETURN_xxx() macro for the return type.
In a version-1 function, each actual argument is fetched using a
<function>PG_GETARG_<replaceable>xxx</replaceable>()</function>
macro that corresponds to the argument's datatype, and the result
is returned using a
<function>PG_RETURN_<replaceable>xxx</replaceable>()</function>
macro for the return type.
</para>
<para>
Here we show the same functions as above, coded in new style:
<programlisting>
<programlisting>
#include &lt;string.h&gt;
#include "postgres.h"
#include "fmgr.h"
@ -962,7 +964,7 @@ concat_text(PG_FUNCTION_ARGS)
strncat(VARDATA(new_text), VARDATA(arg2), VARSIZE(arg2)-VARHDRSZ);
PG_RETURN_TEXT_P(new_text);
}
</programlisting>
</programlisting>
</para>
<para>
@ -971,27 +973,30 @@ concat_text(PG_FUNCTION_ARGS)
</para>
<para>
At first glance, the version-1 coding conventions may appear to be
just pointless obscurantism. However, they do offer a number of
improvements, because the macros can hide unnecessary detail.
At first glance, the version-1 coding conventions may appear to
be just pointless obscurantism. However, they do offer a number
of improvements, because the macros can hide unnecessary detail.
An example is that in coding add_one_float8, we no longer need to
be aware that float8 is a pass-by-reference type. Another example
is that the GETARG macros for variable-length types hide the need
to deal with fetching "toasted" (compressed or out-of-line) values.
The old-style copytext and concat_text functions shown above are
actually wrong in the presence of toasted values, because they don't
call pg_detoast_datum() on their inputs. (The handler for old-style
dynamically-loaded functions currently takes care of this detail,
but it does so less efficiently than is possible for a version-1
function.)
be aware that float8 is a pass-by-reference type. Another
example is that the GETARG macros for variable-length types hide
the need to deal with fetching "toasted" (compressed or
out-of-line) values. The old-style <function>copytext</function>
and <function>concat_text</function> functions shown above are
actually wrong in the presence of toasted values, because they
don't call <function>pg_detoast_datum()</function> on their
inputs. (The handler for old-style dynamically-loaded functions
currently takes care of this detail, but it does so less
efficiently than is possible for a version-1 function.)
</para>
<para>
The version-1 function call conventions also make it possible to
test for NULL inputs to a non-strict function, return a NULL result
(from either strict or non-strict functions), return "set" results,
and implement trigger functions and procedural-language call handlers.
For more details see <filename>src/backend/utils/fmgr/README</filename>.
test for NULL inputs to a non-strict function, return a NULL
result (from either strict or non-strict functions), return
<quote>set</quote> results, and implement trigger functions and
procedural-language call handlers. For more details see
<filename>src/backend/utils/fmgr/README</filename> in the source
distribution.
</para>
</sect2>
@ -1011,15 +1016,15 @@ concat_text(PG_FUNCTION_ARGS)
function as an opaque structure of type <literal>TUPLE</literal>.
Suppose we want to write a function to answer the query
<programlisting>
* SELECT name, c_overpaid(EMP, 1500) AS overpaid
FROM EMP
WHERE name = 'Bill' or name = 'Sam';
</programlisting>
<programlisting>
SELECT name, c_overpaid(emp, 1500) AS overpaid
FROM emp
WHERE name = 'Bill' OR name = 'Sam';
</programlisting>
In the query above, we can define c_overpaid as:
<programlisting>
<programlisting>
#include "postgres.h"
#include "executor/executor.h" /* for GetAttributeByName() */
@ -1055,31 +1060,31 @@ c_overpaid(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(salary &gt; limit);
}
</programlisting>
</programlisting>
</para>
<para>
<function>GetAttributeByName</function> is the
<productname>Postgres</productname> system function that
returns attributes out of the current instance. It has
three arguments: the argument of type TupleTableSlot* passed into
three arguments: the argument of type <type>TupleTableSlot*</type> passed into
the function, the name of the desired attribute, and a
return parameter that tells whether the attribute
is null. <function>GetAttributeByName</function> returns a Datum
value that you can convert to the proper datatype by using the
appropriate DatumGetXXX() macro.
appropriate <function>DatumGet<replaceable>XXX</replaceable>()</function> macro.
</para>
<para>
The following query lets <productname>Postgres</productname>
know about the c_overpaid function:
know about the <function>c_overpaid</function> function:
<programlisting>
CREATE FUNCTION c_overpaid(EMP, int4)
<programlisting>
CREATE FUNCTION c_overpaid(emp, int4)
RETURNS bool
AS '<replaceable>PGROOT</replaceable>/tutorial/obj/funcs.so'
LANGUAGE 'c';
</programlisting>
</programlisting>
</para>
<para>
@ -1096,7 +1101,7 @@ LANGUAGE 'c';
We now turn to the more difficult task of writing
programming language functions. Be warned: this section
of the manual will not make you a programmer. You must
have a good understanding of <acronym>C</acronym>
have a good understanding of <acronym>C</acronym>
(including the use of pointers and the malloc memory manager)
before trying to write <acronym>C</acronym> functions for
use with <productname>Postgres</productname>. While it may
@ -1113,20 +1118,6 @@ LANGUAGE 'c';
are written in <acronym>C</acronym>.
</para>
<para>
C functions with base type arguments can be written in a
straightforward fashion. The C equivalents of built-in Postgres types
are accessible in a C file if
<filename><replaceable>PGROOT</replaceable>/src/backend/utils/builtins.h</filename>
is included as a header file. This can be achieved by having
<programlisting>
#include &lt;utils/builtins.h&gt;
</programlisting>
at the top of the C source file.
</para>
<para>
The basic rules for building <acronym>C</acronym> functions
are as follows:
@ -1134,66 +1125,65 @@ LANGUAGE 'c';
<itemizedlist>
<listitem>
<para>
Most of the header (include) files for
<productname>Postgres</productname>
should already be installed in
<filename><replaceable>PGROOT</replaceable>/include</filename> (see Figure 2).
You should always include
<programlisting>
-I$PGROOT/include
</programlisting>
on your cc command lines. Sometimes, you may
find that you require header files that are in
the server source itself (i.e., you need a file
we neglected to install in include). In those
cases you may need to add one or more of
<programlisting>
-I$PGROOT/src/backend
-I$PGROOT/src/backend/include
-I$PGROOT/src/backend/port/&lt;PORTNAME&gt;
-I$PGROOT/src/backend/obj
</programlisting>
(where &lt;PORTNAME&gt; is the name of the port, e.g.,
alpha or sparc).
The relevant header (include) files are installed under
<filename>/usr/local/pgsql/include</filename> or equivalent.
You can use <literal>pg_config --includedir</literal> to find
out where it is on your system (or the system that your
users will be running on). For very low-level work you might
need to have a complete <productname>PostgreSQL</productname>
source tree available.
</para>
</listitem>
<listitem>
<para>
When allocating memory, use the
<productname>Postgres</productname>
routines palloc and pfree instead of the
corresponding <acronym>C</acronym> library routines
malloc and free.
The memory allocated by palloc will be freed
automatically at the end of each transaction,
preventing memory leaks.
When allocating memory, use the
<productname>Postgres</productname> routines
<function>palloc</function> and <function>pfree</function>
instead of the corresponding <acronym>C</acronym> library
routines <function>malloc</function> and
<function>free</function>. The memory allocated by
<function>palloc</function> will be freed automatically at the
end of each transaction, preventing memory leaks.
</para>
</listitem>
<listitem>
<para>
Always zero the bytes of your structures using
memset or bzero. Several routines (such as the
hash access method, hash join and the sort algorithm)
compute functions of the raw bits contained in
your structure. Even if you initialize all fields
of your structure, there may be
several bytes of alignment padding (holes in the
structure) that may contain garbage values.
Always zero the bytes of your structures using
<function>memset</function> or <function>bzero</function>.
Several routines (such as the hash access method, hash join
and the sort algorithm) compute functions of the raw bits
contained in your structure. Even if you initialize all
fields of your structure, there may be several bytes of
alignment padding (holes in the structure) that may contain
garbage values.
</para>
</listitem>
<listitem>
<para>
Most of the internal <productname>Postgres</productname>
types are declared in <filename>postgres.h</filename>,
so it's a good
idea to always include that file as well. Including
postgres.h will also include elog.h and palloc.h for you.
Most of the internal <productname>Postgres</productname> types
are declared in <filename>postgres.h</filename>, while the function
manager interfaces (<symbol>PG_FUNCTION_ARGS</symbol>, etc.)
are in <filename>fmgr.h</filename>, so you will need to
include at least these two files. Including
<filename>postgres.h</filename> will also include
<filename>elog.h</filename> and <filename>palloc.h</filename>
for you.
</para>
</listitem>
<listitem>
<para>
Symbol names defined within object files must not conflict
with each other or with symbols defined in the
<productname>PostgreSQL</productname> server executable. You
will have to rename your functions or variables if you get
error messages to this effect.
</para>
</listitem>
<listitem>
<para>
Compiling and loading your object code so that
@ -1208,6 +1198,9 @@ LANGUAGE 'c';
</itemizedlist>
</para>
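<para>
The rule above about zeroing structures can be illustrated with a
small sketch. The type <type>sketch_key</type> and both functions
are hypothetical and exist only for this example. On most platforms
the compiler inserts padding after the <structfield>tag</structfield>
member, and a byte-wise comparison only behaves predictably if that
padding was cleared first. The sketch relies on member assignments
leaving the padding untouched, which compilers do in practice but
which the C standard does not strictly guarantee.
</para>

```c
#include <string.h>

/* Hypothetical key type; most ABIs insert padding bytes after "tag". */
typedef struct
{
    char   tag;
    double value;
} sketch_key;

/* Zero the whole structure, padding included, then set the fields. */
void
init_sketch_key(sketch_key *key, char tag, double value)
{
    memset(key, 0, sizeof(*key));
    key->tag = tag;
    key->value = value;
}

/*
 * Compare the raw bytes, as a routine that computes functions of the
 * raw bits of a structure would. Returns nonzero if all bytes match.
 */
int
sketch_keys_equal(const sketch_key *a, const sketch_key *b)
{
    return memcmp(a, b, sizeof(*a)) == 0;
}
```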
</sect2>
&dfunc;
</sect1>
<sect1 id="xfunc-overload">