From a32542a1c0d2c5efcb5210189b6febbc190946dd Mon Sep 17 00:00:00 2001 From: Peter Eisentraut Date: Fri, 12 Jan 2001 22:15:32 +0000 Subject: [PATCH] Update information about compiling extension modules. --- doc/src/sgml/dfunc.sgml | 629 +++++++++++++++-------------------- doc/src/sgml/programmer.sgml | 3 +- doc/src/sgml/xfunc.sgml | 225 ++++++------- 3 files changed, 386 insertions(+), 471 deletions(-) diff --git a/doc/src/sgml/dfunc.sgml b/doc/src/sgml/dfunc.sgml index 7c547d52d0..aa3fa8bdad 100644 --- a/doc/src/sgml/dfunc.sgml +++ b/doc/src/sgml/dfunc.sgml @@ -1,373 +1,296 @@ - - Linking Dynamically-Loaded Functions + + Compiling and Linking Dynamically-Loaded Functions - + + Before you are able to use your + PostgreSQL extension functions written in + C, they need to be compiled and linked in a special way in order to + allow them to be dynamically loaded as needed by the server. To be + precise, a shared library needs to be created. + - After you have created and registered a user-defined - function, your work is essentially done. - Postgres, - however, must load the object code - (e.g., a .o file, or - a shared library) that implements your function. As - previously mentioned, Postgres - loads your code at - runtime, as required. In order to allow your code to be - dynamically loaded, you may have to compile and - link-edit it in a special way. This section briefly - describes how to perform the compilation and - link-editing required before you can load your user-defined - functions into a running Postgres server. - + + For more information you should read the documentation of your + operating system, in particular the manual pages for the C compiler, + cc, and the link editor, ld. + In addition, the PostgreSQL source code + contains several working examples in the + contrib directory. If you rely on these + examples you will make your modules dependent on the availability + of the PostgreSQL source code, however. 
+ + + + Creating shared libraries is generally analogous to linking + executables: first the source files are compiled into object files, + then the object files are linked together. The object files need to + be created as position-independent code + (PIC), which conceptually means that it can be + placed at an arbitrary location in memory when it is loaded by the + executable. (Object files intended for executables are not compiled + that way.) The command to link a shared library contains special + flags to distinguish it from linking an executable. --- At least + this is the theory. On some systems the practice is much uglier. + + + + In the following examples we assume that your source code is in a + file foo.c and we will create a shared library + foo.so. The intermediate object file will be + called foo.o unless otherwise noted. A shared + library can contain more than one object file, but we only use one + here. + + + + + + BSD/OS + + + The compiler flag to create PIC is + . The linker flag to create shared + libraries is . + +gcc -fpic -c foo.c +ld -shared -o foo.so foo.o + + This is applicable as of version 4.0 of + BSD/OS. + + + + + + FreeBSD + + + The compiler flag to create PIC is + . To create shared libraries the compiler + flag is . + +gcc -fpic -c foo.c +gcc -shared -o foo.so foo.o + + This is applicable as of version 3.0 of + FreeBSD. + + + + + + HP-UX + + + The compiler flag of the system compiler to create + PIC is . When using + GCC it's . The + linker flag for shared libraries is . So + +cc +z -c foo.c + + or + +gcc -fpic -c foo.c + + and then + +ld -b -o foo.sl foo.o + + HP-UX uses the extension + .sl for shared libraries, unlike most other + systems. + + + + + + Irix + + + PIC is the default, no special compiler + options are necessary. The linker option to produce shared + libraries is . + +cc -c foo.c +ld -shared -o foo.so foo.o + + + + + + + Linux + + + The compiler flag to create PIC is + . 
On some platforms in some situations + must be used if + does not work. Refer to the GCC manual for more information. + The compiler flag to create a shared library is + . A complete example looks like this: + +cc -fpic -c foo.c +cc -shared -o foo.so foo.o + + + + + + + NetBSD + + + The compiler flag to create PIC is + . For ELF systems, the + compiler with the flag is used to link + shared libraries. On the older non-ELF systems, ld + -Bshareable is used. + +gcc -fpic -c foo.c +gcc -shared -o foo.so foo.o + + + + + + + OpenBSD + + + The compiler flag to create PIC is + . ld -Bshareable is + used to link shared libraries. + +gcc -fpic -c foo.c +ld -Bshareable -o foo.so foo.o + + + + + + + Digital Unix/Tru64 UNIX + + + + PIC is the default, so the compilation command + is the usual one. ld with special options is + used to do the linking: + +cc -c foo.c +ld -shared -expect_unresolved '*' -o foo.so foo.o + + The same procedure is used with GCC instead of the system + compiler; no special options are required. + + + + + + Solaris + + + The compiler flag to create PIC is + with the Sun compiler and + with GCC. To + link shared libraries, the compiler option is + with either compiler or alternatively + with GCC. + +cc -KPIC -c foo.c +cc -G -o foo.so foo.o + + or + +gcc -fpic -c foo.c +gcc -G -o foo.so foo.o + + + + + + + Unixware + + + The compiler flag to create PIC is with the SCO compiler and + with GCC. To link shared libraries, + the compiler option is with the SCO compiler + and with + GCC. + +cc -K PIC -c foo.c +cc -G -o foo.so foo.o + + or + +gcc -fpic -c foo.c +gcc -shared -o foo.so foo.o + + + + + + + + + - You should expect to read (and reread, and re-reread) the - manual pages for the C compiler, cc(1), and the link - editor, ld(1), if you have specific questions. In - addition, the contrib area (PGROOT/contrib) - and the regression test suites in the directory - PGROOT/src/test/regress contain several - working examples of this process. 
If you copy an example then - you should not have any problems. + If you want to package your extension modules for wide distribution + you should consider using GNU + Libtool for building shared libraries. It + encapsulates the platform differences into a general and powerful + interface. Serious packaging also requires considerations about + library versioning, symbol resolution methods, and other issues. + - - The following terminology will be used below: - - - - - Dynamic loading - is what Postgres does to an object file. The - object file is copied into the running Postgres - server and the functions and variables within the - file are made available to the functions within - the Postgres process. - Postgres does this using - the dynamic loading mechanism provided by the - operating system. - - - - - Loading and link editing - is what you do to an object file in order to produce - another kind of object file (e.g., an executable - program or a shared library). You perform - this using the link editing program, ld(1). - - - - - - - The following general restrictions and notes also apply - to the discussion below: - - - - Paths given to the create function command must be - absolute paths (i.e., start with "/") that refer to - directories visible on the machine on which the - Postgres server is running. - - - - Relative paths do in fact work, - but are relative to - the directory where the database resides (which is generally - invisible to the frontend application). Obviously, it makes - no sense to make the path relative to the directory in which - the user started the frontend application, since the server - could be running on a completely different machine! - - - - - - - - The Postgres user must be able to traverse the path - given to the create function command and be able to - read the object file. This is because the Postgres - server runs as the Postgres user, not as the user - who starts up the frontend process. 
(Making the - file or a higher-level directory unreadable and/or - unexecutable by the "postgres" user is an extremely - common mistake.) - - - - - - Symbol names defined within object files must not - conflict with each other or with symbols defined in - Postgres. - - - - - - The GNU C compiler usually does not provide the special - options that are required to use the operating - system's dynamic loader interface. In such cases, - the C compiler that comes with the operating system - must be used. - - - - - - - Linux + + The resulting shared library file can then be loaded into + Postgres. When specifying the file name + to the CREATE FUNCTION command, one must give it + the name of the shared library file (ending in + .so) rather than the simple object file. + - Under Linux ELF, object files can be generated by specifying the compiler - flag -fpic. + Actually, Postgres does not care what + you name the file as long as it is a shared library file. + - - For example, - -# simple Linux example -% cc -fpic -c foo.c - - - produces an object file called foo.o - that can then be - dynamically loaded into Postgres. - No additional loading or link-editing must be performed. - - + Paths given to the CREATE FUNCTION command must + be absolute paths (i.e., start with /) that refer + to directories visible on the machine on which the + Postgres server is running. Relative + paths do in fact work, but are relative to the directory where the + database resides (which is generally invisible to the frontend + application). Obviously, it makes no sense to make the path + relative to the directory in which the user started the frontend + application, since the server could be running on a completely + different machine! The user id the + Postgres server runs as must be able to + traverse the path given to the CREATE FUNCTION + command and be able to read the shared library file. 
(Making the + file or a higher-level directory not readable and/or not executable + by the postgres user is a common mistake.) + - - - <acronym>DEC OSF/1</acronym> - - - Under DEC OSF/1, you can take any simple object file - and produce a shared object file by running the ld command - over it with the correct options. The commands to - do this look like: - -# simple DEC OSF/1 example -% cc -c foo.c -% ld -shared -expect_unresolved '*' -o foo.so foo.o - - - The resulting shared object file can then be loaded - into Postgres. When specifying the object file name to - the create function command, one must give it the name - of the shared object file (ending in .so) rather than - the simple object file. - - - - Actually, Postgres does not care - what you name the - file as long as it is a shared object file. If you prefer - to name your shared object files with the extension .o, this - is fine with Postgres - so long as you make sure that the correct - file name is given to the create function command. In - other words, you must simply be consistent. However, from a - pragmatic point of view, we discourage this practice because - you will undoubtedly confuse yourself with regards to which - files have been made into shared object files and which have - not. For example, it's very hard to write Makefiles to do - the link-editing automatically if both the object file and - the shared object file end in .o! - - - - If the file you specify is - not a shared object, the backend will hang! - - - - - - <acronym>SunOS 4.x</acronym>, <acronym>Solaris 2.x</acronym> and - <acronym>HP-UX</acronym> - - - Under SunOS 4.x, Solaris 2.x and HP-UX, the simple - object file must be created by compiling the source - file with special compiler flags and a shared library - must be produced. - The necessary steps with HP-UX are as follows. 
The +z - flag to the HP-UX C compiler produces - Position Independent Code (PIC) - and the +u flag removes - some alignment restrictions that the PA-RISC architecture - normally enforces. The object file must be turned - into a shared library using the HP-UX link editor with - the -b option. This sounds complicated but is actually - very simple, since the commands to do it are just: - - -# simple HP-UX example -% cc +z +u -c foo.c -% ld -b -o foo.sl foo.o - - - - - As with the .so files mentioned in the last subsection, - the create function command must be told which file is - the correct file to load (i.e., you must give it the - location of the shared library, or .sl file). - Under SunOS 4.x, the commands look like: - -# simple SunOS 4.x example -% cc -PIC -c foo.c -% ld -dc -dp -Bdynamic -o foo.so foo.o - - - and the equivalent lines under Solaris 2.x are: - -# simple Solaris 2.x example -% cc -K PIC -c foo.c -% ld -G -Bdynamic -o foo.so foo.o - - or - -# simple Solaris 2.x example -% gcc -fPIC -c foo.c -% ld -G -Bdynamic -o foo.so foo.o - - - - - When linking shared libraries, you may have to specify - some additional shared libraries (typically system - libraries, such as the C and math libraries) on your ld - command line. - - - - - + @@ -72,7 +72,6 @@ PostgreSQL Programmer's Guide. &indexcost; &gist; &xplang; - &dfunc; diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml index 5c13afb61d..6ce9cb712f 100644 --- a/doc/src/sgml/xfunc.sgml +++ b/doc/src/sgml/xfunc.sgml @@ -1,5 +1,5 @@ @@ -632,10 +632,10 @@ SELECT clean_EMP(); the int4 type on Unix machines might be: - + /* 4-byte integer, passed by value */ typedef int int4; - + @@ -643,13 +643,13 @@ typedef int int4; be passed by-reference. For example, here is a sample implementation of a Postgres type: - + /* 16-byte structure, passed by reference */ typedef struct { double x, y; } Point; - + @@ -670,12 +670,12 @@ typedef struct (i.e., it includes the size of the length field itself). 
We can define the text type as follows: - + typedef struct { int4 length; char data[1]; } text; - + @@ -687,7 +687,7 @@ typedef struct { For example, if we wanted to store 40 bytes in a text structure, we might use a code fragment like this: - + #include "postgres.h" ... char buffer[40]; /* our source data */ @@ -696,7 +696,7 @@ text *destination = (text *) palloc(VARHDRSZ + 40); destination->length = VARHDRSZ + 40; memmove(destination->data, buffer, 40); ... - + @@ -709,7 +709,7 @@ memmove(destination->data, buffer, 40); Version-0 Calling Conventions for C-Language Functions - We present the "old style" calling convention first --- although + We present the old style calling convention first --- although this approach is now deprecated, it's easier to get a handle on initially. In the version-0 method, the arguments and result of the C function are just declared in normal C style, but being @@ -720,7 +720,7 @@ memmove(destination->data, buffer, 40); Here are some examples: - + #include <string.h> #include "postgres.h" @@ -786,7 +786,7 @@ concat_text(text *arg1, text *arg2) strncat(VARDATA(new_text), VARDATA(arg2), VARSIZE(arg2)-VARHDRSZ); return new_text; } - + @@ -795,7 +795,7 @@ concat_text(text *arg1, text *arg2) we could define the functions to Postgres with commands like this: - + CREATE FUNCTION add_one(int4) RETURNS int4 AS 'PGROOT/tutorial/funcs.so' LANGUAGE 'c' WITH (isStrict); @@ -817,7 +817,7 @@ CREATE FUNCTION copytext(text) RETURNS text CREATE FUNCTION concat_text(text, text) RETURNS text AS 'PGROOT/tutorial/funcs.so' LANGUAGE 'c' WITH (isStrict); - + @@ -855,13 +855,13 @@ CREATE FUNCTION concat_text(text, text) RETURNS text The version-1 calling convention relies on macros to suppress most of the complexity of passing arguments and results. 
The C declaration of a version-1 function is always - - Datum funcname(PG_FUNCTION_ARGS) - + +Datum funcname(PG_FUNCTION_ARGS) + In addition, the macro call - - PG_FUNCTION_INFO_V1(funcname); - + +PG_FUNCTION_INFO_V1(funcname); + must appear in the same source file (conventionally it's written just before the function itself). This macro call is not needed for "internal"-language functions, since Postgres currently assumes @@ -870,16 +870,18 @@ CREATE FUNCTION concat_text(text, text) RETURNS text - In a version-1 function, - each actual argument is fetched using a PG_GETARG_xxx() macro that - corresponds to the argument's datatype, and the result is returned - using a PG_RETURN_xxx() macro for the return type. + In a version-1 function, each actual argument is fetched using a + PG_GETARG_xxx() + macro that corresponds to the argument's datatype, and the result + is returned using a + PG_RETURN_xxx() + macro for the return type. Here we show the same functions as above, coded in new style: - + #include <string.h> #include "postgres.h" #include "fmgr.h" @@ -962,7 +964,7 @@ concat_text(PG_FUNCTION_ARGS) strncat(VARDATA(new_text), VARDATA(arg2), VARSIZE(arg2)-VARHDRSZ); PG_RETURN_TEXT_P(new_text); } - + @@ -971,27 +973,30 @@ concat_text(PG_FUNCTION_ARGS) - At first glance, the version-1 coding conventions may appear to be - just pointless obscurantism. However, they do offer a number of - improvements, because the macros can hide unnecessary detail. + At first glance, the version-1 coding conventions may appear to + be just pointless obscurantism. However, they do offer a number + of improvements, because the macros can hide unnecessary detail. An example is that in coding add_one_float8, we no longer need to - be aware that float8 is a pass-by-reference type. Another example - is that the GETARG macros for variable-length types hide the need - to deal with fetching "toasted" (compressed or out-of-line) values. 
- The old-style copytext and concat_text functions shown above are - actually wrong in the presence of toasted values, because they don't - call pg_detoast_datum() on their inputs. (The handler for old-style - dynamically-loaded functions currently takes care of this detail, - but it does so less efficiently than is possible for a version-1 - function.) + be aware that float8 is a pass-by-reference type. Another + example is that the GETARG macros for variable-length types hide + the need to deal with fetching "toasted" (compressed or + out-of-line) values. The old-style copytext + and concat_text functions shown above are + actually wrong in the presence of toasted values, because they + don't call pg_detoast_datum() on their + inputs. (The handler for old-style dynamically-loaded functions + currently takes care of this detail, but it does so less + efficiently than is possible for a version-1 function.) The version-1 function call conventions also make it possible to - test for NULL inputs to a non-strict function, return a NULL result - (from either strict or non-strict functions), return "set" results, - and implement trigger functions and procedural-language call handlers. - For more details see src/backend/utils/fmgr/README. + test for NULL inputs to a non-strict function, return a NULL + result (from either strict or non-strict functions), return + set results, and implement trigger functions and + procedural-language call handlers. For more details see + src/backend/utils/fmgr/README in the source + distribution. @@ -1011,15 +1016,15 @@ concat_text(PG_FUNCTION_ARGS) function as an opaque structure of type TUPLE. 
Suppose we want to write a function to answer the query - - * SELECT name, c_overpaid(EMP, 1500) AS overpaid - FROM EMP - WHERE name = 'Bill' or name = 'Sam'; - + +SELECT name, c_overpaid(emp, 1500) AS overpaid +FROM emp +WHERE name = 'Bill' OR name = 'Sam'; + In the query above, we can define c_overpaid as: - + #include "postgres.h" #include "executor/executor.h" /* for GetAttributeByName() */ @@ -1055,31 +1060,31 @@ c_overpaid(PG_FUNCTION_ARGS) PG_RETURN_BOOL(salary > limit); } - + GetAttributeByName is the Postgres system function that returns attributes out of the current instance. It has - three arguments: the argument of type TupleTableSlot* passed into + three arguments: the argument of type TupleTableSlot* passed into the function, the name of the desired attribute, and a return parameter that tells whether the attribute is null. GetAttributeByName returns a Datum value that you can convert to the proper datatype by using the - appropriate DatumGetXXX() macro. + appropriate DatumGetXXX() macro. The following query lets Postgres - know about the c_overpaid function: + know about the c_overpaid function: - -CREATE FUNCTION c_overpaid(EMP, int4) + +CREATE FUNCTION c_overpaid(emp, int4) RETURNS bool AS 'PGROOT/tutorial/obj/funcs.so' LANGUAGE 'c'; - + @@ -1096,7 +1101,7 @@ LANGUAGE 'c'; We now turn to the more difficult task of writing programming language functions. Be warned: this section of the manual will not make you a programmer. You must - have a good understanding of C + have a good understanding of C (including the use of pointers and the malloc memory manager) before trying to write C functions for use with Postgres. While it may @@ -1113,20 +1118,6 @@ LANGUAGE 'c'; are written in C. - - C functions with base type arguments can be written in a - straightforward fashion. The C equivalents of built-in Postgres types - are accessible in a C file if - PGROOT/src/backend/utils/builtins.h - is included as a header file. 
This can be achieved by having - - -#include <utils/builtins.h> - - - at the top of the C source file. - - The basic rules for building C functions are as follows: @@ -1134,66 +1125,65 @@ LANGUAGE 'c'; - Most of the header (include) files for - Postgres - should already be installed in - PGROOT/include (see Figure 2). - You should always include - - --I$PGROOT/include - - - on your cc command lines. Sometimes, you may - find that you require header files that are in - the server source itself (i.e., you need a file - we neglected to install in include). In those - cases you may need to add one or more of - - --I$PGROOT/src/backend --I$PGROOT/src/backend/include --I$PGROOT/src/backend/port/<PORTNAME> --I$PGROOT/src/backend/obj - - - (where <PORTNAME> is the name of the port, e.g., - alpha or sparc). + The relevant header (include) files are installed under + /usr/local/pgsql/include or equivalent. + You can use pg_config --includedir to find + out where it is on your system (or the system that your + users will be running on). For very low-level work you might + need to have a complete PostgreSQL + source tree available. + - When allocating memory, use the - Postgres - routines palloc and pfree instead of the - corresponding C library routines - malloc and free. - The memory allocated by palloc will be freed - automatically at the end of each transaction, - preventing memory leaks. + When allocating memory, use the + Postgres routines + palloc and pfree + instead of the corresponding C library + routines malloc and + free. The memory allocated by + palloc will be freed automatically at the + end of each transaction, preventing memory leaks. + - Always zero the bytes of your structures using - memset or bzero. Several routines (such as the - hash access method, hash join and the sort algorithm) - compute functions of the raw bits contained in - your structure. 
Even if you initialize all fields - of your structure, there may be - several bytes of alignment padding (holes in the - structure) that may contain garbage values. + Always zero the bytes of your structures using + memset or bzero. + Several routines (such as the hash access method, hash join + and the sort algorithm) compute functions of the raw bits + contained in your structure. Even if you initialize all + fields of your structure, there may be several bytes of + alignment padding (holes in the structure) that may contain + garbage values. + - Most of the internal Postgres - types are declared in postgres.h, - so it's a good - idea to always include that file as well. Including - postgres.h will also include elog.h and palloc.h for you. + Most of the internal Postgres types + are declared in postgres.h, the function + manager interfaces (PG_FUNCTION_ARGS, etc.) + are in fmgr.h, so you will need to + include at least these two files. Including + postgres.h will also include + elog.h and palloc.h + for you. + + + + Symbol names defined within object files must not conflict + with each other or with symbols defined in the + PostgreSQL server executable. You + will have to rename your functions or variables if you get + error messages to this effect. + + + Compiling and loading your object code so that @@ -1208,6 +1198,9 @@ LANGUAGE 'c'; + +&dfunc; +
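Note on the xfunc.sgml rule about zeroing structures with memset: the point may be easier to see in a small plain-C sketch that does not depend on any PostgreSQL header. The type DemoKey and both helper functions are hypothetical names invented for this illustration; they are not part of the patch or of the server source. The sketch assumes, as the documentation text does, that the compiler leaves padding bytes alone once they have been cleared.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical key type for illustration only.  On most ABIs the
 * compiler inserts padding bytes after "tag" so that "value" is
 * suitably aligned; those bytes are the "holes" the text refers to.
 */
typedef struct
{
    char    tag;
    double  value;
} DemoKey;

/*
 * Clear every byte of the structure (padding included) before
 * assigning the fields, as the rule above recommends.
 */
static void
demo_key_init(DemoKey *key, char tag, double value)
{
    assert(key != NULL);
    memset(key, 0, sizeof(DemoKey));
    key->tag = tag;
    key->value = value;
}

/*
 * Compare the raw bytes of two keys, roughly the way a hash or
 * sort routine working on the raw bits would see them.
 */
static int
demo_key_same_bits(const DemoKey *a, const DemoKey *b)
{
    return memcmp(a, b, sizeof(DemoKey)) == 0;
}
```

Two keys filled through demo_key_init with the same field values should then compare equal byte for byte, even if the memory they occupy held different garbage beforehand. Without the memset call, that comparison could fail although every named field matches.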