diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml new file mode 100644 index 0000000000..7af8722512 --- /dev/null +++ b/doc/src/sgml/catalogs.sgml @@ -0,0 +1,450 @@ +.\" This is -*-nroff-*- +.\" XXX standard disclaimer belongs here.... +.\" $Header: /cvsroot/pgsql/doc/src/sgml/catalogs.sgml,v 2.1 1999/07/22 15:11:03 thomas Exp $ +.TH "SYSTEM CATALOGS" INTRO 03/13/94 PostgreSQL PostgreSQL +.SH "Section 7 - System Catalogs" +.de LS +.PP +.if n .ta 5 +13 +13 +.if t .ta 0.5i +1.3i +1.3i +.in 0 +.nf +.. +.de LE +.fi +.in +.. +.SH "DESCRIPTION" +In this +section we list each of the attributes of the system catalogs and +define their meanings. +.SH "CLASS/TYPE SYSTEM CATALOGS" +These catalogs form the core of the extensibility system: +.LS + \fBname\fP \fBshared/local\fP \fBdescription\fP + pg_aggregate local aggregate functions + pg_am local access methods + pg_amop local operators usable with specific access methods + pg_amproc local procedures used with specific access methods + pg_attribute local class attributes + pg_class local classes + pg_index local secondary indices + pg_inherits local class inheritance hierarchy + pg_language local procedure implementation languages + pg_opclass local operator classes + pg_operator local query language operators + pg_proc local procedures (functions) + pg_type local data types +.LE +.SH "ENTITIES" +These catalogs deal with identification of entities known throughout +the site: +.LS + \fBname\fP \fBshared/local\fP \fBdescription\fP + pg_database shared current databases + pg_group shared user groups + pg_shadow shared valid users +.LE +.SH "RULE SYSTEM CATALOGS" +.LS + \fBname\fP \fBshared/local\fP \fBdescription\fP + pg_listener local processes waiting on alerters + pg_prs2plans local instance system procedures + pg_prs2rule local instance system rules + pg_prs2stub local instance system ``stubs'' + pg_rewrite local rewrite system information +.LE +.SH "LARGE OBJECT CATALOGS" +.PP +These catalogs are specific to the Inversion file system and large +objects in general: +.LS + \fBname\fP \fBshared/local\fP \fBdescription\fP + pg_lobj local description of a large object + pg_naming local Inversion name space mapping + pg_platter local jukebox platter inventory + pg_plmap local jukebox platter extent map +.LE +.SH "INTERNAL CATALOGS" +.PP +These catalogs are internal classes that are not stored as normal +heaps and cannot be accessed through normal means (attempting to do so +causes an error). +.LS + \fBname\fP \fBshared/local\fP \fBdescription\fP + pg_log shared transaction commit/rollback log + pg_magic shared magic constant + pg_time shared commit/rollback times + pg_variable shared special variable values +.LE +.PP +There are several other classes defined with \*(lqpg_\*(rq names. +Aside from those that end in \*(lqind\*(rq (secondary indices), these +are all obsolete or otherwise deprecated. +.SH "CLASS/TYPE SYSTEM CATALOGS" +.PP +The following catalogs relate to the class/type system. +.nf M +/* + * aggregates + * + * see DEFINE AGGREGATE for an explanation of transition functions + */ +pg_aggregate + NameData aggname /* aggregate name (e.g., "count") */ + oid aggowner /* usesysid of creator */ + regproc aggtransfn1 /* first transition function */ + regproc aggtransfn2 /* second transition function */ + regproc aggfinalfn /* final function */ + oid aggbasetype /* type of data on which aggregate + operates */ + oid aggtranstype1 /* type returned by aggtransfn1 */ + oid aggtranstype2 /* type returned by aggtransfn2 */ + oid aggfinaltype /* type returned by aggfinalfn */ + text agginitval1 /* external format of initial + (starting) value of aggtransfn1 */ + text agginitval2 /* external format of initial + (starting) value of aggtransfn2 */ +.fi +.nf M +pg_am + NameData amname /* access method name */ + oid amowner /* usesysid of creator */ + char amkind /* - deprecated */ + /* originally: + h=hashed + o=ordered + s=special */ + int2 amstrategies /* total NUMBER of strategies by which + we can traverse/search this AM */ + int2 amsupport /* total NUMBER of support functions + that this AM uses */ + regproc amgettuple /* "next valid tuple" function */ + regproc aminsert /* "insert this tuple" function */ + regproc amdelete /* "delete this tuple" function */ + regproc amgetattr /* - deprecated */ + regproc amsetlock /* - deprecated */ + regproc amsettid /* - deprecated */ + regproc amfreetuple /* - deprecated */ + regproc ambeginscan /* "start new scan" function */ + regproc amrescan /* "restart this scan" function */ + regproc amendscan /* "end this scan" function */ + regproc ammarkpos /* "mark current scan position" + function */ + regproc amrestrpos /* "restore marked scan position" + function */ + regproc amopen /* - deprecated */ + regproc amclose /* - deprecated */ + regproc ambuild /* "build new index" function */ + regproc amcreate /* - deprecated */ + regproc amdestroy /* - deprecated */ +.fi +.nf M +pg_amop + oid amopid /* access method with which this + operator be used */ + oid amopclaid /* operator class with which this + operator can be used */ + oid amopopr /* the operator */ + int2 amopstrategy /* traversal/search strategy number + to which this operator applies */ + regproc amopselect /* function to calculate the operator + selectivity */ + regproc amopnpages /* function to calculate the number of + pages that will be examined */ +.fi +.nf M +pg_amproc + oid amid /* access method with which this + procedure is associated */ + oid amopclaid /* operator class with which this + operator can be used */ + oid amproc /* the procedure */ + int2 amprocnum /* support function number to which + this operator applies */ +.fi +.nf M +pg_class + NameData relname /* class name */ + oid relowner /* usesysid of owner */ + oid relam /* access method */ + int4 relpages /* # of 8KB pages */ + int4 reltuples /* # of instances */ + abstime relexpires /* time after which instances are + deleted from non-archival storage */ + reltime relpreserved /* timespan after which instances are + deleted from non-archival storage */ + bool relhasindex /* does the class have a secondary + index? */ + bool relisshared /* is the class shared or local? */ + char relkind /* type of relation: + i=index + r=relation (heap) + s=special + u=uncatalogued (temporary) */ + char relarch /* archive mode: + h=heavy + l=light + n=none */ + int2 relnatts /* current # of non-system + attributes */ + int2 relsmgr /* storage manager: + 0=magnetic disk + 1=sony WORM jukebox + 2=main memory */ + int28 relkey /* - unused */ + oid8 relkeyop /* - unused */ + aclitem relacl[1] /* access control lists */ +.fi +.nf M +pg_attribute + oid attrelid /* class containing this attribute */ + NameData attname /* attribute name */ + oid atttypid /* attribute type */ + oid attdefrel /* - deprecated */ + int4 attnvals /* - deprecated */ + oid atttyparg /* - deprecated */ + int2 attlen /* attribute length, in bytes + -1=variable */ + int2 attnum /* attribute number + >0=user attribute + <0=system attribute */ + int2 attbound /* - deprecated */ + bool attbyval /* type passed by value? */ + bool attcanindex /* - deprecated */ + oid attproc /* - deprecated */ + int4 attnelems /* # of array dimensions */ + int4 attcacheoff /* cached offset into tuple */ + bool attisset /* is attribute set-valued? */ +.fi +.nf M +pg_inherits + oid inhrel /* child class */ + oid inhparent /* parent class */ + int4 inhseqno /* - deprecated */ +.fi +.nf M + oid indexrelid /* oid of secondary index class */ + oid indrelid /* oid of indexed heap class */ + oid indproc /* function to compute index key from + attribute(s) in heap + 0=not a functional index */ + int28 indkey /* attribute numbers of key + attribute(s) */ + oid8 indclass /* opclass of each key */ + bool indisclustered /* is the index clustered? + - unused */ + bool indisarchived /* is the index archival? + - unused */ + text indpred /* query plan for partial index + predicate */ +.fi +.nf M +pg_type + NameData typname /* type name */ + oid typowner /* usesysid of owner */ + int2 typlen /* length in internal form + -1=variable-length */ + int2 typprtlen /* length in external form */ + bool typbyval /* type passed by value? */ + char typtype /* kind of type: + c=catalog (composite) + b=base */ + bool typisdefined /* defined or still a shell? */ + char typdelim /* delimiter for array external form */ + oid typrelid /* class (if composite) */ + oid typelem /* type of each array element */ + regproc typinput /* external-internal conversion + function */ + regproc typoutput /* internal-external conversion + function */ + regproc typreceive /* client-server conversion function */ + regproc typsend /* server-client conversion function */ + text typdefault /* default value */ +.fi +.nf M +pg_operator + NameData oprname /* operator name */ + oid oprowner /* usesysid of owner */ + int2 oprprec /* - deprecated */ + char oprkind /* kind of operator: + b=binary + l=left unary + r=right unary */ + bool oprisleft /* is operator left/right associative? */ + bool oprcanhash /* is operator usable for hashjoin? */ + oid oprleft /* left operand type */ + oid oprright /* right operand type */ + oid oprresult /* result type */ + oid oprcom /* commutator operator */ + oid oprnegate /* negator operator */ + oid oprlsortop /* sort operator for left operand */ + oid oprrsortop /* sort operator for right operand */ + regproc oprcode /* function implementing this operator */ + regproc oprrest /* function to calculate operator + restriction selectivity */ + regproc oprjoin /* function to calculate operator + join selectivity */ +.fi +.nf M +pg_opclass + NameData opcname /* operator class name */ +.fi +.nf M +pg_proc + NameData proname /* function name */ + oid proowner /* usesysid of owner */ + oid prolang /* function implementation language */ + bool proisinh /* - deprecated */ + bool proistrusted /* run in server or untrusted function + process? */ + bool proiscachable /* can the function return values be + cached? */ + int2 pronargs /* # of arguments */ + bool proretset /* does the function return a set? + - unused */ + oid prorettype /* return type */ + oid8 proargtypes /* argument types */ + int4 probyte_pct /* % of argument size (in bytes) that + needs to be examined in order to + compute the function */ + int4 properbyte_cpu /* sensitivity of the function's + running time to the size of its + inputs */ + int4 propercall_cpu /* overhead of the function's + invocation (regardless of input + size) */ + int4 prooutin_ratio /* size of the function's output as a + percentage of the size of the input */ + text prosrc /* function definition: + INTERNAL function: actual C name of function + C function: currently, this field is unused + SQL function: text of query(s) + PL function: text in procedural language */ + bytea probin /* path to object file (C functions only) */ +.fi +.nf M +pg_language + NameData lanname /* language name */ + text lancompiler /* - deprecated */ +.fi +.SH "ENTITIES" +.nf M +pg_database + NameData datname /* database name */ + oid datdba /* usesysid of database administrator */ + text datpath /* directory of database under + $PGDATA */ +.fi +.nf M +pg_group + NameData groname /* group name */ + int2 grosysid /* group's UNIX group id */ + int2 grolist[1] /* list of usesysids of group members */ +.fi +.nf M +pg_shadow + NameData usename /* user's name */ + int2 usesysid /* user's UNIX user id */ + bool usecreatedb /* can user create databases? */ + bool usetrace /* can user set trace flags? */ + bool usesuper /* can user be POSTGRES superuser? */ + bool usecatupd /* can user update catalogs? */ +.fi +.SH "RULE SYSTEM CATALOGS" +.nf M +pg_listener + NameData relname /* class for which asynchronous + notification is desired */ + int4 listenerpid /* process id of server corresponding + to a frontend program waiting for + asynchronous notification */ + int4 notification /* whether an event notification for + this process id still pending */ + +.fi +.nf M +pg_prs2rule + NameData prs2name /* rule name */ + char prs2eventtype /* rule event type: + R=retrieve + U=update (replace) + A=append + D=delete */ + oid prs2eventrel /* class to which event applies */ + int2 prs2eventattr /* attribute to which event applies */ + float8 necessary /* - deprecated */ + float8 sufficient /* - deprecated */ + text prs2text /* text of original rule definition */ +.fi +.nf M +pg_prs2plans + oid prs2ruleid /* prs2rule instance for which this + plan is used */ + int2 prs2planno /* plan number (one rule may invoke + multiple plans) */ + text prs2code /* external representation of the plan */ +.fi +.nf M +pg_prs2stub + oid prs2relid /* class to which this rule applies */ + bool prs2islast /* is this the last stub fragment? */ + int4 prs2no /* stub fragment number */ + stub prs2stub /* stub fragment */ +.fi +.nf M +pg_rewrite + NameData rulename /* rule name */ + char ev_type /* event type: + RETRIEVE, REPLACE, APPEND, DELETE + codes are parser-dependent (!?) */ + oid ev_class /* class to which this rule applies */ + int2 ev_attr /* attribute to which this rule applies */ + bool is_instead /* is this an "instead" rule? */ + text ev_qual /* qualification with which to modify + (rewrite) the plan that triggered this + rule */ + text action /* parse tree of action */ +.fi +.SH "LARGE OBJECT CATALOGS" +.nf M +pg_lobj + oid ourid /* 'ourid' from pg_naming that + identifies this object in the + Inversion file system namespace */ + int4 objtype /* storage type code: + 0=Inversion + 1=Unix + 2=External + 3=Jaquith */ + bytea object_descripto/* opaque object-handle structure */ +.fi +.nf M +pg_naming + NameData filename /* filename component */ + oid ourid /* random oid used to identify this + instance in other instances (can't + use the actual oid for obscure + reasons */ + oid parentid /* pg_naming instance of parent + Inversion file system directory */ +.fi +.nf M +pg_platter + NameData plname /* platter name */ + int4 plstart /* the highest OCCUPIED extent */ +.fi +.nf M +pg_plmap + oid plid /* platter (in pg_platter) on which + this extent (of blocks) resides */ + oid pldbid /* database of the class to which this + extent (of blocks) belongs */ + oid plrelid /* class to which this extend (of + blocks) belongs */ + int4 plblkno /* starting block number within the + class */ + int4 ploffset /* offset within the platter at which + this extent begins */ + int4 plextentsz /* length of this extent */ +.fi diff --git a/doc/src/sgml/dfunc.sgml b/doc/src/sgml/dfunc.sgml index 28e7428871..9da366886e 100644 --- a/doc/src/sgml/dfunc.sgml +++ b/doc/src/sgml/dfunc.sgml @@ -1,28 +1,127 @@ - -Linking Dynamically-Loaded Functions + + Linking Dynamically-Loaded Functions - + + + After you have created and registered a user-defined - function, your work is essentially done. Postgres, - however, must load the object code (e.g., a .o file, or + function, your work is essentially done. Postgres, + however, must load the object code (e.g., a .o file, or a shared library) that implements your function. As - previously mentioned, Postgres loads your code at + previously mentioned, Postgres loads your code at runtime, as required. In order to allow your code to be dynamically loaded, you may have to compile and link-edit it in a special way. This section briefly describes how to perform the compilation and link-editing required before you can load your user-defined - functions into a running Postgres server. Note that + functions into a running Postgres server. Note that this process has changed as of Version 4.2. - - -The old Postgres dynamic + + +The old Postgres dynamic loading mechanism required in-depth knowledge in terms of executable format, placement and alignment of executable instructions within memory, etc. on the part of the person writing the dynamic loader. Such loaders tended to be slow and buggy. As of Version 4.2, the -Postgres dynamic loading mechanism has been rewritten to use +Postgres dynamic loading mechanism has been rewritten to use the dynamic loading mechanism provided by the operating system. This approach is generally faster, more reliable and more portable than our previous dynamic loading mechanism. @@ -30,58 +129,59 @@ The reason for this is that nearly all modern versions of UNIX use a dynamic loading mechanism to implement shared libraries and must therefore provide a fast and reliable mechanism. On the other hand, the object file must be -postprocessed a bit before it can be loaded into Postgres. We +postprocessed a bit before it can be loaded into Postgres. We hope that the large increase in speed and reliability will make up for the slight decrease in convenience. - + You should expect to read (and reread, and re-reread) the manual pages for the C compiler, cc(1), and the link editor, ld(1), if you have specific questions. In addition, the regression test suites in the directory - PGROOT/src/regress contain several + PGROOT/src/regress contain several working examples of this process. If you copy what these tests do, you should not have any problems. The following terminology will be used below: - - - - Dynamic loading - is what Postgres does to an object file. The - object file is copied into the running Postgres + + + + Dynamic loading + is what Postgres does to an object file. The + object file is copied into the running Postgres server and the functions and variables within the file are made available to the functions within - the Postgres process. Postgres does this using + the Postgres process. + Postgres does this using the dynamic loading mechanism provided by the operating system. - - - - - Loading and link editing + + + + + Loading and link editing is what you do to an object file in order to produce another kind of object file (e.g., an executable program or a shared library). You perform this using the link editing program, ld(1). - - - - + + + + - + The following general restrictions and notes also apply to the discussion below: - - - + + + Paths given to the create function command must be absolute paths (i.e., start with "/") that refer to directories visible on the machine on which the - Postgres server is running. - - + Postgres server is running. + + Relative paths do in fact work, but are relative to the directory where the database resides (which is generally @@ -89,45 +189,45 @@ invisible to the frontend application). Obviously, it makes no sense to make the path relative to the directory in which the user started the frontend application, since the server could be running on a completely different machine! - - - - - - -The Postgres user must be able to traverse the path + + + + + + +The Postgres user must be able to traverse the path given to the create function command and be able to - read the object file. This is because the Postgres - server runs as the Postgres user, not as the user + read the object file. This is because the Postgres + server runs as the Postgres user, not as the user who starts up the frontend process. (Making the file or a higher-level directory unreadable and/or unexecutable by the "postgres" user is an extremely common mistake.) - - - - + + + + Symbol names defined within object files must not conflict with each other or with symbols defined in - Postgres. - - - - + Postgres. + + + + The GNU C compiler usually does not provide the special options that are required to use the operating system's dynamic loader interface. In such cases, the C compiler that comes with the operating system must be used. - - - + + + - -<Acronym>ULTRIX</Acronym> + +<acronym>ULTRIX</acronym> - + It is very easy to build dynamically-loaded object files under ULTRIX. ULTRIX does not have any shared library mechanism and hence does not place any restrictions on @@ -138,42 +238,42 @@ The GNU C compiler usually does not provide the special produce each object file with the option -G 0. (Notice that that's the numeral ``0'' and not the letter ``O''). For example, - + # simple ULTRIX example % cc -G 0 -c foo.c - + produces an object file called foo.o that can then be - dynamically loaded into Postgres. + dynamically loaded into Postgres. No additional loading or link-editing must be performed. - - + + - -<Acronym>DEC OSF/1</Acronym> + +<acronym>DEC OSF/1</acronym> - + Under DEC OSF/1, you can take any simple object file and produce a shared object file by running the ld command over it with the correct options. The commands to do this look like: - + # simple DEC OSF/1 example % cc -c foo.c % ld -shared -expect_unresolved '*' -o foo.so foo.o - + The resulting shared object file can then be loaded - into Postgres. When specifying the object file name to + into Postgres. When specifying the object file name to the create function command, one must give it the name of the shared object file (ending in .so) rather than the simple object file. - - -Actually, Postgres does not care + + +Actually, Postgres does not care what you name the file as long as it is a shared object file. If you prefer to name your shared object files with the extension .o, this -is fine with Postgres +is fine with Postgres so long as you make sure that the correct file name is given to the create function command. In other words, you must simply be consistent. However, from a @@ -183,19 +283,20 @@ files have been made into shared object files and which have not. For example, it's very hard to write Makefiles to do the link-editing automatically if both the object file and the shared object file end in .o! - - + + If the file you specify is not a shared object, the backend will hang! - - + + - - -<Acronym>SunOS 4.x</Acronym>, <Acronym>Solaris 2.x</Acronym> and <Acronym>HP-UX</Acronym> + + +<acronym>SunOS 4.x</acronym>, <acronym>Solaris 2.x</acronym> and + <acronym>HP-UX</acronym> - + Under SunOS 4.x, Solaris 2.x and HP-UX, the simple object file must be created by compiling the source file with special compiler flags and a shared library @@ -209,44 +310,61 @@ If the file you specify is into a shared library using the HP-UX link editor with the -b option. This sounds complicated but is actually very simple, since the commands to do it are just: - + # simple HP-UX example % cc +z +u -c foo.c % ld -b -o foo.sl foo.o - - + + - + As with the .so files mentioned in the last subsection, the create function command must be told which file is the correct file to load (i.e., you must give it the location of the shared library, or .sl file). Under SunOS 4.x, the commands look like: - + # simple SunOS 4.x example % cc -PIC -c foo.c % ld -dc -dp -Bdynamic -o foo.so foo.o - + and the equivalent lines under Solaris 2.x are: - + # simple Solaris 2.x example % cc -K PIC -c foo.c % ld -G -Bdynamic -o foo.so foo.o - + or - + # simple Solaris 2.x example % gcc -fPIC -c foo.c % ld -G -Bdynamic -o foo.so foo.o - - + + - + When linking shared libraries, you may have to specify some additional shared libraries (typically system libraries, such as the C and math libraries) on your ld command line. - - - + + + + + diff --git a/doc/src/sgml/extend.sgml b/doc/src/sgml/extend.sgml index 0b8eb1af18..cad198c225 100644 --- a/doc/src/sgml/extend.sgml +++ b/doc/src/sgml/extend.sgml @@ -1,252 +1,279 @@ - -Extending <Acronym>SQL</Acronym>: An Overview + + Extending <acronym>SQL</acronym>: An Overview - - In the sections that follow, we will discuss how you - can extend the Postgres SQL query language by adding: - - - + + In the sections that follow, we will discuss how you + can extend the Postgres + SQL query language by adding: + + + + functions - - - - + + + + types - - - - + + + + operators - - - - + + + + aggregates - - - - + + + + - -How Extensibility Works + + How Extensibility Works - - Postgres is extensible because its operation is - catalog-driven. If you are familiar with standard - relational systems, you know that they store information - about databases, tables, columns, etc., in what are - commonly known as system catalogs. (Some systems call - this the data dictionary). The catalogs appear to the - user as classes, like any other, but the DBMS stores - its internal bookkeeping in them. One key difference - between Postgres and standard relational systems is - that Postgres stores much more information in its - catalogs -- not only information about tables and columns, - but also information about its types, functions, access - methods, and so on. These classes can be modified by - the user, and since Postgres bases its internal operation - on these classes, this means that Postgres can be - extended by users. By comparison, conventional - database systems can only be extended by changing hardcoded - procedures within the DBMS or by loading modules - specially-written by the DBMS vendor. - - - Postgres is also unlike most other data managers in - that the server can incorporate user-written code into - itself through dynamic loading. That is, the user can - specify an object code file (e.g., a compiled .o file - or shared library) that implements a new type or function - and Postgres will load it as required. Code written - in SQL are even more trivial to add to the server. - This ability to modify its operation "on the fly" makes - Postgres uniquely suited for rapid prototyping of new - applications and storage structures. - - + + Postgres is extensible because its operation is + catalog-driven. If you are familiar with standard + relational systems, you know that they store information + about databases, tables, columns, etc., in what are + commonly known as system catalogs. (Some systems call + this the data dictionary). The catalogs appear to the + user as classes, like any other, but the DBMS stores + its internal bookkeeping in them. One key difference + between Postgres and standard relational systems is + that Postgres stores much more information in its + catalogs -- not only information about tables and columns, + but also information about its types, functions, access + methods, and so on. These classes can be modified by + the user, and since Postgres bases its internal operation + on these classes, this means that Postgres can be + extended by users. By comparison, conventional + database systems can only be extended by changing hardcoded + procedures within the DBMS or by loading modules + specially-written by the DBMS vendor. + - -The <ProductName>Postgres</ProductName> Type System + + Postgres is also unlike most other data managers in + that the server can incorporate user-written code into + itself through dynamic loading. That is, the user can + specify an object code file (e.g., a compiled .o file + or shared library) that implements a new type or function + and Postgres will load it as required. Code written + in SQL are even more trivial to add to the server. + This ability to modify its operation "on the fly" makes + Postgres uniquely suited for rapid prototyping of new + applications and storage structures. + + - - The Postgres type system can be broken down in several ways. - Types are divided into base types and composite types. - Base types are those, like int4, that are implemented - in a language such as C. They generally correspond to - what are often known as "abstract data types"; Postgres - can only operate on such types through methods provided - by the user and only understands the behavior of such - types to the extent that the user describes them. - Composite types are created whenever the user creates a - class. EMP is an example of a composite type. - - - Postgres stores these types in only one way (within the - file that stores all instances of the class) but the - user can "look inside" at the attributes of these types - from the query language and optimize their retrieval by - (for example) defining indices on the attributes. - Postgres base types are further divided into built-in - types and user-defined types. Built-in types (like - int4) are those that are compiled into the system. - User-defined types are those created by the user in the - manner to be described below. - - + + The <productname>Postgres</productname> Type System - -About the <ProductName>Postgres</ProductName> System Catalogs + + The Postgres type system + can be broken down in several ways. + Types are divided into base types and composite types. + Base types are those, like int4, that are implemented + in a language such as C. They generally correspond to + what are often known as "abstract data types"; Postgres + can only operate on such types through methods provided + by the user and only understands the behavior of such + types to the extent that the user describes them. + Composite types are created whenever the user creates a + class. EMP is an example of a composite type. + - - Having introduced the basic extensibility concepts, we - can now take a look at how the catalogs are actually - laid out. You can skip this section for now, but some - later sections will be incomprehensible without the - information given here, so mark this page for later - reference. - All system catalogs have names that begin with pg_. - The following classes contain information that may be - useful to the end user. (There are many other system - catalogs, but there should rarely be a reason to query - them directly.) + + Postgres stores these types + in only one way (within the + file that stores all instances of the class) but the + user can "look inside" at the attributes of these types + from the query language and optimize their retrieval by + (for example) defining indices on the attributes. + Postgres base types are further + divided into built-in + types and user-defined types. Built-in types (like + int4) are those that are compiled + into the system. + User-defined types are those created by the user in the + manner to be described below. + + - -Postgres System Catalogs -Catalogs - - - -Catalog Name -Description - - - - -pg_database - databases - - -pg_class - classes - - -pg_attribute - class attributes - - -pg_index - secondary indices - - -pg_proc - procedures (both C and SQL) - - -pg_type - types (both base and complex) - - -pg_operator - operators - - -pg_aggregate - aggregates and aggregate functions - - -pg_am - access methods - - -pg_amop - access method operators - - -pg_amproc - access method support functions - - -pg_opclass - access method operator classes - - - -
- + + About the <productname>Postgres</productname> System Catalogs - -
-The major <ProductName>Postgres</ProductName> system catalogs - -
+ + Having introduced the basic extensibility concepts, we + can now take a look at how the catalogs are actually + laid out. You can skip this section for now, but some + later sections will be incomprehensible without the + information given here, so mark this page for later + reference. + All system catalogs have names that begin with + pg_. + The following classes contain information that may be + useful to the end user. (There are many other system + catalogs, but there should rarely be a reason to query + them directly.) - The Reference Manual gives a more detailed explanation - of these catalogs and their attributes. However, - - shows the major entities and their relationships - in the system catalogs. (Attributes that do not refer - to other entities are not shown unless they are part of - a primary key.) - This diagram is more or less incomprehensible until you - actually start looking at the contents of the catalogs - and see how they relate to each other. For now, the - main things to take away from this diagram are as follows: + + Postgres System Catalogs + Catalogs + + + + Catalog Name + Description + + + + + pg_database + databases + + + pg_class + classes + + + pg_attribute + class attributes + + + pg_index + secondary indices + + + pg_proc + procedures (both C and SQL) + + + pg_type + types (both base and complex) + + + pg_operator + operators + + + pg_aggregate + aggregates and aggregate functions + + + pg_am + access methods + + + pg_amop + access method operators + + + pg_amproc + access method support functions + + + pg_opclass + access method operator classes + + + +
+
+ + +
+ The major <productname>Postgres</productname> system catalogs + +
+ + The Reference Manual gives a more detailed explanation + of these catalogs and their attributes. However, + + shows the major entities and their relationships + in the system catalogs. (Attributes that do not refer + to other entities are not shown unless they are part of + a primary key.) + This diagram is more or less incomprehensible until you + actually start looking at the contents of the catalogs + and see how they relate to each other. For now, the + main things to take away from this diagram are as follows: - - - - In several of the sections that follow, we will - present various join queries on the system - catalogs that display information we need to extend - the system. Looking at this diagram should make - some of these join queries (which are often - three- or four-way joins) more understandable, - because you will be able to see that the - attributes used in the queries form foreign keys - in other classes. - - - - Many different features (classes, attributes, - functions, types, access methods, etc.) are - tightly integrated in this schema. A simple - create command may modify many of these catalogs. - - - - Types and procedures - are central to the schema. + + + + In several of the sections that follow, we will + present various join queries on the system + catalogs that display information we need to extend + the system. Looking at this diagram should make + some of these join queries (which are often + three- or four-way joins) more understandable, + because you will be able to see that the + attributes used in the queries form foreign keys + in other classes. + + + + + Many different features (classes, attributes, + functions, types, access methods, etc.) are + tightly integrated in this schema. A simple + create command may modify many of these catalogs. + + + + + Types and procedures + are central to the schema. - - -We use the words procedure and function more or less -interchangably. - - + + + We use the words procedure + and function more or less interchangably. + + - Nearly every catalog contains some reference to - instances in one or both of these classes. For - example, Postgres frequently uses type - signatures (e.g., of functions and operators) to - identify unique instances of other catalogs. + Nearly every catalog contains some reference to + instances in one or both of these classes. For + example, Postgres frequently uses type + signatures (e.g., of functions and operators) to + identify unique instances of other catalogs. + + + + + There are many attributes and relationships that + have obvious meanings, but there are many + (particularly those that have to do with access + methods) that do not. The relationships between + pg_am, pg_amop, pg_amproc, pg_operator and + pg_opclass are particularly hard to understand + and will be described in depth (in the section + on interfacing types and operators to indices) + after we have discussed basic extensions. + + + + +
+
- - - - There are many attributes and relationships that - have obvious meanings, but there are many - (particularly those that have to do with access - methods) that do not. The relationships between - pg_am, pg_amop, pg_amproc, pg_operator and - pg_opclass are particularly hard to understand - and will be described in depth (in the section - on interfacing types and operators to indices) - after we have discussed basic extensions. - - - - - - -
+ diff --git a/doc/src/sgml/indices.sgml b/doc/src/sgml/indices.sgml index 85abcf0ae4..420ee00733 100644 --- a/doc/src/sgml/indices.sgml +++ b/doc/src/sgml/indices.sgml @@ -1,57 +1,384 @@ - -Indices + + Indices and Keys - + + Indexes are primarily used to enhance database + performance. They should be defined on table columns (or class + attributes) which are used as qualifications in repetative queries. + Inappropriate use will result in slower performance, since update + and insertion times are increased in the presence of indices. + - -Partial Indices + + Two forms of indices may be defined: - - -Author - -This is from a reply to a question on the e-mail list -by Paul M. Aoki -on 1998-08-11. + + + + For a value index, + the key fields for the + index are specified as column names; a column may also have + an associated operator class. An operator class is used + to specify the operators to be used for a particular + index. For example, a btree index on four-byte integers + would use the int4_ops class; + this operator class includes + comparison functions for four-byte integers. The default + operator class is the appropriate operator class for that + field type. + + + + + + For a functional index, an index is defined + on the result of a user-defined function applied + to one or more attributes of a single class. + These functional indices + can be used to obtain fast access to data + based on operators that would normally require some + transformation to apply them to the base data. + + + + + + + Postgres provides btree, rtree and hash access methods for + secondary indices. The btree access method is an implementation of + the Lehman-Yao high-concurrency btrees. The rtree access method + implements standard rtrees using Guttman's quadratic split algorithm. + The hash access method is an implementation of Litwin's linear + hashing. We mention the algorithms used solely to indicate that all + of these access methods are fully dynamic and do not have to be + optimized periodically (as is the case with, for example, static hash + access methods). + + + + The Postgres query optimizer will consider using btree indices in a scan + whenever an indexed attribute is involved in a comparison using one of: + + + < + <= + = + >= + > + + + + + Both box classes support indices on the box data + type in Postgres. + The difference between them is that bigbox_ops + scales box coordinates down, to avoid floating point exceptions from + doing multiplication, addition, and subtraction on very large + floating-point coordinates. If the field on which your rectangles lie + is about 20,000 units square or larger, you should use + bigbox_ops. + The poly_ops operator class supports rtree + indices on polygon data. + + + + The Postgres + query optimizer will consider using an rtree index whenever + an indexed attribute is involved in a comparison using one of: + + + << + &< + &> + >> + @ + ~= + && + + + + + The Postgres + query optimizer will consider using a hash index whenever + an indexed attribute is involved in a comparison using + the = operator. + + + + Currently, only the BTREE access method supports multi-column + indexes. Up to 7 keys may be specified. + + + + Use + to remove an index. + + + + The int24_ops + operator class is useful for constructing indices on int2 data, and + doing comparisons against int4 data in query qualifications. + Similarly, int42_ops + support indices on int4 data that is to be compared against int2 data + in queries. + + + + The following select list returns all ops_names: + + +SELECT am.amname AS acc_name, + opc.opcname AS ops_name, + opr.oprname AS ops_comp + FROM pg_am am, pg_amop amop, + pg_opclass opc, pg_operator opr + WHERE amop.amopid = am.oid AND + amop.amopclaid = opc.oid AND + amop.amopopr = opr.oid + ORDER BY acc_name, ops_name, ops_comp + + + + + Keys + + + + Author + + Written by + Herouth Maoz + This originally appeared on the User's Mailing List on 1998-03-02 + in response to the question: + "What is the difference between PRIMARY KEY and UNIQUE constraints?". + + + + + + +Subject: Re: [QUESTIONS] PRIMARY KEY | UNIQUE + + What's the difference between: + + PRIMARY KEY(fields,...) and + UNIQUE (fields,...) + + - Is this an alias? + - If PRIMARY KEY is already unique, then why + is there another kind of key named UNIQUE? + + + + + A primary key is the field(s) used to identify a specific row. For example, + Social Security numbers identifying a person. + + + + A simply UNIQUE combination of fields has nothing to do with identifying + the row. It's simply an integrity constraint. For example, I have + collections of links. Each collection is identified by a unique number, + which is the primary key. This key is used in relations. + + + + However, my application requires that each collection will also have a + unique name. Why? So that a human being who wants to modify a collection + will be able to identify it. It's much harder to know, if you have two + collections named "Life Science", the the one tagged 24433 is the one you + need, and the one tagged 29882 is not. + + + + So, the user selects the collection by its name. We therefore make sure, + withing the database, that names are unique. However, no other table in the + database relates to the collections table by the collection Name. That + would be very inefficient. + + + + Moreover, despite being unique, the collection name does not actually + define the collection! For example, if somebody decided to change the name + of the collection from "Life Science" to "Biology", it will still be the + same collection, only with a different name. As long as the name is unique, + that's OK. + + + + So: + + + + + Primary key: + + + + Is used for identifying the row and relating to it. + + + + + Is impossible (or hard) to update. + + + + + Should not allow NULLs. + + + + + + + + + Unique field(s): + + + + Are used as an alternative access to the row. + + + + + Are updateable, so long as they are kept unique. + + + + + NULLs are acceptable. + + + + + + + + + + As for why no non-unique keys are defined explicitly in standard + SQL syntax? Well, you + must understand that indices are implementation-dependent. + SQL does not + define the implementation, merely the relations between data in the + database. Postgres does allow + non-unique indices, but indices + used to enforce SQL keys are always unique. + + + + Thus, you may query a table by any combination of its columns, despite the + fact that you don't have an index on these columns. The indexes are merely + an implementational aid which each RDBMS offers + you, in order to cause + commonly used queries to be done more efficiently. + Some RDBMS may give you + additional measures, such as keeping a key stored in main memory. They will + have a special command, for example + +CREATE MEMSTORE ON <table> COLUMNS <cols> + + (this is not an existing command, just an example). + + + + In fact, when you create a primary key or a unique combination of fields, + nowhere in the SQL specification does it say + that an index is created, nor that + the retrieval of data by the key is going to be more efficient than a + sequential scan! + + + + So, if you want to use a combination of fields which is not unique as a + secondary key, you really don't have to specify anything - just start + retrieving by that combination! However, if you want to make the retrieval + efficient, you'll have to resort to the means your + RDBMS provider gives you + - be it an index, my imaginary MEMSTORE command, or an intelligent + RDBMS + which creates indices without your knowledge based on the fact that you have + sent it many queries based on a specific combination of keys... (It learns + from experience). + + + + + Partial Indices + + + Author + + This is from a reply to a question on the e-mail list + by Paul M. Aoki + on 1998-08-11. - + + -A partial index -is an index built over a subset of a table; the subset is defined by -a predicate. Postgres - supported partial indices with arbitrary -predicates. I believe IBM's db2 for as/400 supports partial indices -using single-clause predicates. + + A partial index + is an index built over a subset of a table; the subset is defined by + a predicate. Postgres + supported partial indices with arbitrary + predicates. I believe IBM's db2 for as/400 supports partial indices + using single-clause predicates. + - -The main motivation for partial indices is this: -if all of the queries you ask that can -profitably use an index fall into a certain range, why build an index -over the whole table and suffer the associated space/time costs? + + The main motivation for partial indices is this: + if all of the queries you ask that can + profitably use an index fall into a certain range, why build an index + over the whole table and suffer the associated space/time costs? -(There are other reasons too; see - for details.) + (There are other reasons too; see + for details.) + - -The machinery to build, update and query partial indices isn't too -bad. The hairy parts are index selection (which indices do I build?) -and query optimization (which indices do I use?); i.e., the parts -that involve deciding what predicate(s) match the workload/query in -some useful way. For those who are into database theory, the problems -are basically analogous to the corresponding materialized view -problems, albeit with different cost parameters and formulae. These -are, in the general case, hard problems for the standard ordinal -SQL -types; they're super-hard problems with black-box extension types, -because the selectivity estimation technology is so crude. + + The machinery to build, update and query partial indices isn't too + bad. The hairy parts are index selection (which indices do I build?) + and query optimization (which indices do I use?); i.e., the parts + that involve deciding what predicate(s) match the workload/query in + some useful way. For those who are into database theory, the problems + are basically analogous to the corresponding materialized view + problems, albeit with different cost parameters and formulae. These + are, in the general case, hard problems for the standard ordinal + SQL + types; they're super-hard problems with black-box extension types, + because the selectivity estimation technology is so crude. + - -Check , -, -and - -for more information. + + Check , + , + and + + for more information. + + + + + diff --git a/doc/src/sgml/lisp.sgml b/doc/src/sgml/lisp.sgml new file mode 100644 index 0000000000..08f4aca005 --- /dev/null +++ b/doc/src/sgml/lisp.sgml @@ -0,0 +1,104 @@ + + Lisp Programming Interface + + + + pg.el is a socket-level interface to + Postgres for emacs. + + + + + Author + + Written by + Eric Marsden + on 21 Jul 1999. + + + + + pg.el is a socket-level interface to + Postgres for emacs (text + editor extraordinaire). The module is capable of type coercions from a + range of SQL types to the equivalent Emacs Lisp type. It currently + supports neither crypt or Kerberos authentication, nor large objects. + + + + The code (version 0.2) is available under GNU GPL from + + http://www.chez.com/emarsden/downloads/pg.el + + + +Changes since last release: + + + + + now works with XEmacs (tested with Emacs 19.34 & 20.2, and XEmacs + 20.4) + + + + + + added functions to provide database metainformation (list of + databases, of tables, of columns) + + + + + + arguments to `pg:result' are now :keywords + + + + + + MULE-resistant + + + + + + more self-testing code + + + + + + + Please note that this is a programmer's API, and doesn't provide any + form of user interface. Example: + + + (defun demo () + (interactive) + (let* ((conn (pg:connect "template1" "postgres" "postgres")) + (res (pg:exec conn "SELECT * from scshdemo WHERE a = 42"))) + (message "status is %s" (pg:result res :status)) + (message "metadata is %s" (pg:result res :attributes)) + (message "data is %s" (pg:result res :tuples)) + (pg:disconnect conn))) + + + + + diff --git a/doc/src/sgml/oper.sgml b/doc/src/sgml/oper.sgml index a2452db411..673adfbc6a 100644 --- a/doc/src/sgml/oper.sgml +++ b/doc/src/sgml/oper.sgml @@ -12,7 +12,7 @@ Postgres provides a large number of built-in operators on system types. These operators are declared in the system catalog - pg_operator. Every entry in pg_operator includes + pg_operator. Every entry in pg_operator includes the name of the procedure that implements the operator and the class OIDs of the input and output types. diff --git a/doc/src/sgml/postgres.sgml b/doc/src/sgml/postgres.sgml index a0dfcc3e80..99c72d1fa6 100644 --- a/doc/src/sgml/postgres.sgml +++ b/doc/src/sgml/postgres.sgml @@ -1,105 +1,3 @@ - - @@ -123,8 +21,8 @@ Move SQL reference pages up into the User's Guide. + - @@ -156,28 +54,29 @@ Move SQL reference pages up into the User's Guide. - - - - - - - - - - - - + + + + + - + + - + + + + + + + + @@ -277,7 +176,7 @@ Your name here... &oper; &func; &typeconv; - &keys; + &indices; &array; &inherit; &mvcc; @@ -341,7 +240,6 @@ Your name here... &trigger; &spi; &xplang; - @@ -359,6 +257,7 @@ Your name here... &libpgtcl; &odbc; &jdbc; + &lisp; diff --git a/doc/src/sgml/programmer.sgml b/doc/src/sgml/programmer.sgml index 663e3046c9..93ea1dbc81 100644 --- a/doc/src/sgml/programmer.sgml +++ b/doc/src/sgml/programmer.sgml @@ -1,9 +1,15 @@ @@ -172,6 +179,7 @@ Disable it until we put in some info. &ecpg; &odbc; &jdbc; + &lisp; diff --git a/doc/src/sgml/security.sgml b/doc/src/sgml/security.sgml index 1539f98717..85f9d08ef1 100644 --- a/doc/src/sgml/security.sgml +++ b/doc/src/sgml/security.sgml @@ -102,6 +102,221 @@ + + + Host-Based Access Control + + +.SH NAME +$PGDATA/pg_hba.conf +.SH DESCRIPTION + + Host-based access control + is the name for the basic controls PostgreSQL + exercises on what clients are allowed to access a database and how + the users on those clients must authenticate themselves. + + + + Each database system contains a file named + pg_hba.conf, in its PGDATA + directory, which controls who can connect to each database. + + + + Every client accessing a database + must + be covered by one of + the entries in pg_hba.conf. + Otherwise all attempted connections from that + client will be rejected with a "User authentication failed" error + message. + + + + The general format of the pg_hba.conf + file is of a set of records, one per + line. Blank lines and lines beginning with a hash character + ("#") are ignored. A record is + made up of a number of fields which are separated by spaces and/or tabs. + + + + Connections from clients can be made using UNIX domain sockets or Internet + domain sockets (ie. TCP/IP). Connections made using UNIX domain sockets + are controlled using records of the following format: + + +local database authentication method + + + where + + + + database + specifies the database that this record applies to. The value + all + specifies that it applies to all databases. + + + authentication method + specifies the method a user must use to authenticate themselves when + connecting to that database using UNIX domain sockets. The different methods + are described below. + + + + + + Connections made using Internet domain sockets are controlled using records + of the following format. + + +host database TCP/IP address TCP/IP mask authentication method + + + + + The TCP/IP address + is logically anded to both the specified + TCP/IP mask + and the TCP/IP address + of the connecting client. + If the two resulting values are equal then the + record is used for this connection. If a connection matches more than one + record then the earliest one in the file is used. + Both the + TCP/IP address + and the + TCP/IP mask + are specified in dotted decimal notation. + + + + If a connection fails to match any record then the + reject + authentication method is applied (see below). + + + + Authentication Methods + + + The following authentication methods are supported for both UNIX and TCP/IP + domain sockets: + + + + trust + + + The connection is allowed unconditionally. + + + + + + reject + + + The connection is rejected unconditionally. + + + + + + crypt + + + The client is asked for a password for the user. This is sent encrypted + (using crypt(3)) + and compared against the password held in the + pg_shadow table. + If the passwords match, the connection is allowed. + + + + + + password + + + The client is asked for a password for the user. This is sent in clear + and compared against the password held in the + pg_shadow table. + If the passwords match, the connection is allowed. An optional password file + may be specified after the + password + keyword which is used to match the supplied password rather than the pg_shadow + table. See + pg_passwd. + + + + + + + + The following authentication methods are supported for TCP/IP + domain sockets only: + + + + krb4 + + + Kerberos V4 is used to authenticate the user. + + + + + + krb5 + + + Kerberos V5 is used to authenticate the user. + + + + + + ident + + + The ident server on the client is used to authenticate the user (RFC 1413). + An optional map name may be specified after the + ident + keyword which allows ident user names to be mapped onto + Postgres user names. + Maps are held in the file + $PGDATA/pg_ident.conf. + + + + + + + + + Examples + + + +# Trust any connection via UNIX domain sockets. +local trust +# Trust any connection via TCP/IP from this machine. +host all 127.0.0.1 255.255.255.255 trust +# We don't like this machine. +host all 192.168.0.10 255.255.255.0 reject +# This machine can't encrypt so we ask for passwords in clear. +host all 192.168.0.3 255.255.255.0 password +# The rest of this group of machines should provide encrypted passwords. +host all 192.168.0.0 255.255.255.0 crypt + + + + @@ -138,20 +353,50 @@ have to explicitly insert/update the pg_group table. For example: - jolly=> insert into pg_group (groname, grosysid, grolist) - jolly=> values ('posthackers', '1234', '{5443, 8261}'); - INSERT 548224 - jolly=> grant insert on foo to group posthackers; - CHANGE - jolly=> + +jolly=> insert into pg_group (groname, grosysid, grolist) +jolly=> values ('posthackers', '1234', '{5443, 8261}'); +INSERT 548224 +jolly=> grant insert on foo to group posthackers; +CHANGE +jolly=> + + - The fields in pg_group are: - * groname: the group name. This a name and should be purely - alphanumeric. Do not include underscores or other punctuation. - * grosysid: the group id. This is an int4. This should be unique for - each group. - * grolist: the list of pg_user id's that belong in the group. This - is an int4[]. + + The fields in pg_group are: + + + + groname + + + The group name. This a name and should be purely + alphanumeric. Do not include underscores or other punctuation. + + + + + + grosysid + + + The group id. This is an int4. This should be unique for + each group. + + + + + + grolist + + + The list of pg_user id's that belong in the group. This + is an int4[]. + + + + diff --git a/doc/src/sgml/syntax.sgml b/doc/src/sgml/syntax.sgml index e9c3467638..ee27118681 100644 --- a/doc/src/sgml/syntax.sgml +++ b/doc/src/sgml/syntax.sgml @@ -3,14 +3,18 @@ - SQL manipulates sets of data. The language is - composed of various key words. Arithmetic - and procedural expressions are allowed. We will cover these topics - in this chapter; subsequent chapters will include details on data - types, functions, and operators. + A description of the general syntax of SQL. + + SQL manipulates sets of data. The language is + composed of various key words. Arithmetic + and procedural expressions are allowed. We will cover these topics + in this chapter; subsequent chapters will include details on data + types, functions, and operators. + + Key Words @@ -230,7 +234,7 @@ MAXVALUE MINVALUE MODE NOCREATEDB NOCREATEUSER NOTHING NOTNULL OIDS OPERATOR PASSWORD PROCEDURAL -RECIPE RENAME RETURNS ROW RULE +RECIPE RENAME RETURNS ROW RULE SEQUENCE SERIAL SHARE START STATEMENT STDIN STDOUT TRUSTED VALID VERSION @@ -302,18 +306,455 @@ UNCOMMITTED UNNAMED + + Comments + + + A comment + is an arbitrary sequence of characters following double dashes up to the end + of the line. We also support double-slashes as comments, e.g.: + + +-- This is a standard SQL comment +// And this is another supported comment style, like C++ + + +We also support C-style block comments, e.g.: + + +/* multi + line + comment */ + + + + + + Names + + + Names in SQL are sequences of less than NAMEDATALEN alphanumeric characters, + starting with an alphabetic character. By default, NAMEDATALEN is set + to 32, but at the time the system is built, NAMEDATALEN can be changed + by changing the #ifdef in src/backend/include/postgres.h. Underscore + ("_") is considered an alphabetic character. + + + + + Constants + + + There are six types of + constants + for use in SQL. They are described below. + + + + String Constants + + + Strings + in SQL are arbitrary sequences of ASCII characters bounded by single + quotes ("'", e.g. 'This is a string'). + Uppercase alphabetics within strings are accepted + literally. Non-printing characters may be embedded within strings by + prepending them with a backslash + ("\"; e.g. "\tab". + SQL92 allows single quotes to be embedded in strings by typing two + adjacent single quotes (e.g. 'Dianne''s horse'), and for + historical reasons Postgres also allows + single quotes to be escaped with a backslash + (e.g. 'Dianne\'s horse'). + + + + Because of the limitations on + instance sizes, string constants are currently limited to a length of + a little less than 8192 bytes. Larger strings may be handled using the + Postgres Large Object interface. + + + + + Integer Constants + + + Integer constants + in SQL are collection of ASCII digits with no decimal point. Legal + values range from -2147483648 to +2147483647. This will vary + depending on the operating system and host machine. + + + + + Floating Point Constants + + + Floating point constants + consist of an integer part, a decimal point, and a fraction part or + scientific notation of the following format: + + +{dig}.{dig} [e [+-] {dig}] + + + where dig is one or more digits. + You must include at least one dig after the + period and after the [+-] if you use those options. An exponent with + a missing mantissa has a mantissa of 1 inserted. There may be no + extra characters embedded in the string. + Floating point constaints are of type float8. + + + + + Constants of Postgres User-Defined Types + + + A constant of an + arbitrary + type can be entered using the notations: + + +type 'string' +'string'::type +CAST 'string' AS type + + + The value inside the string is passed to the input + conversion routine for the type called type-name. The result is a + constant of the indicated type. The explicit typecast may be omitted + if there is no ambiguity as to the type the constant must be, in which + case it is automatically coerced. + + + + + Array constants + + + Array constants + are arrays of any Postgres type, including other arrays, string + constants, etc. The general format of an array constant is the + following: + + +{val1delimval2delim} + + + where delim + is the delimiter for the type stored in the pg_type class. + (For built-in types, this is the comma character (","). An + example of an array constant is + + +{{1,2,3},{4,5,6},{7,8,9}} + + + This constant is a two-dimensional, 3 by 3 array consisting of three + sub-arrays of integers. + + + + Individual array elements can and should be placed between quotation + marks whenever possible to avoid ambiguity problems with respect to + leading white space. + + + + + + Fields and Columns + + + Fields + + + A field + is either an attribute of a given class or one of the following: + + + + oid + + + stands for the unique identifier of an instance which is added by + Postgres to all instances automatically. Oids are not reused and are 32 + bit quantities. + + + + + + xmin + + + The identity of the inserting transaction. + + + + + + xmax + + + The identity of the deleting transaction. + + + + + + cmin + + + The command identifier within the transaction. + + + + + + cmax + + + The identity of the deleting command. + + + + + + + + For further information on these fields consult + . + Times are represented internally as instances of the + abstime + data type. Transaction and command identifiers are 32 bit quantities. + Transactions are assigned sequentially starting at 512. + + + + + Columns + + + A column is a construct of the form: + + +instance{.composite_field}.field `['number`]' + + + instance + identifies a particular class and can be thought of as standing for + the instances of that class. An instance variable is either a class + name, a surrogate for a class defined by means of a FROM clause, + or the keyword NEW or CURRENT. + NEW and CURRENT can only appear in the action portion of a rule, while + other instance variables can be used in any SQL statement. + composite_field + is a field of of one of the Postgres composite types, + while successive composite fields address attributes in the + class(s) to which the composite field evaluates. Lastly, + field + is a normal (base type) field in the class(s) last addressed. If + field + is of type array, + then the optional number + designator indicates a specific element in the array. If no number is + indicated, then all array elements are returned. + + + + + + Operators + + + Any built-in system, or user-defined operator may be used in SQL. + For the list of built-in and system operators consult + . + For a list of user-defined operators consult your system administrator + or run a query on the pg_operator class. + Parentheses may be used for arbitrary grouping of operators in expressions. + + + Expressions SQL92 allows expressions - to transform data in expressions. Expressions may contain operators + to transform data in tables. Expressions may contain operators (see for more details) and functions ( has more information). - + + + An expression is one of the following: + + + ( a_expr ) + constant + attribute + a_expr binary_operator a_expr + a_expr right_unary_operator + left_unary_operator a_expr + parameter + functional expressions + aggregate expressions + + + + + We have already discussed constants and attributes. The two kinds of + operator expressions indicate respectively binary and left_unary + expressions. The following sections discuss the remaining options. + + + + Parameters + + + A parameter + is used to indicate a parameter in a SQL function. Typically this + is used in SQL function definition statement. The form of a + parameter is: + + +$number + + + + + For example, consider the definition of a function, + dept, as + + +CREATE FUNCTION dept (name) + RETURNS dept + AS 'select * from + dept where name=$1' + LANGUAGE 'sql'; + + + + + + Functional Expressions + + + A functional expression + is the name of a legal SQL function, followed by its argument list + enclosed in parentheses: + + +function (a_expr [, a_expr ) + + + + + For example, the following computes the square root of an employee + salary: + + +sqrt(emp.salary) + + + + + + Aggregate Expression + + + An aggregate expression + represents a simple aggregate (i.e., one that computes a single value) + or an aggregate function (i.e., one that computes a set of values). + The syntax is the following: + + +aggregate_name (attribute) + + + where aggregate_name + must be a previously defined aggregate. + + + + + Target List + + + A target list + is a parenthesized, comma-separated list of one or more elements, each + of which must be of the form: + + +a_expr [ AS result_attname ] + + + where result_attname + is the name of the attribute to be created (or an + already existing attribute name in the case of update statements.) If + result_attname + is not present, then + a_expr + must contain only one attribute name which is assumed to be the name + of the result field. In Postgres + default naming is only used if + a_expr + is an attribute. + + + + + Qualification + + + A qualification + consists of any number of clauses connected by the logical operators: + + + NOT + AND + OR + + + A clause is an a_expr + that evaluates to a boolean over a set of instances. + + + + + From List + + + The from list + is a comma-separated list of from expressions. + Each "from expression" is of the form: + + +[ class_reference ] instance_variable + {, [ class_ref ] instance_variable... } + + + where class_reference + is of the form + + +class_name [ * ] + + + The "from expression" + defines one or more instance variables to range over the class + indicated in class_reference. + One can also request + the instance variable to range over all classes that are beneath the + indicated class in the inheritance hierarchy by postpending the + designator asterisk ("*"). + + diff --git a/doc/src/sgml/user.sgml b/doc/src/sgml/user.sgml index 3323e160bd..0bb677c686 100644 --- a/doc/src/sgml/user.sgml +++ b/doc/src/sgml/user.sgml @@ -1,56 +1,3 @@ - - @@ -67,9 +14,9 @@ Move SQL reference pages up into the User's Guide. + - @@ -157,7 +104,7 @@ Your name here... &oper; &func; &typeconv; - &keys; + &indices; &array; &inherit; &mvcc; diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml index a59d15d08e..036d029d81 100644 --- a/doc/src/sgml/xfunc.sgml +++ b/doc/src/sgml/xfunc.sgml @@ -1,39 +1,124 @@ - -Extending <Acronym>SQL</Acronym>: Functions + + Extending <acronym>SQL</acronym>: Functions - - As it turns out, part of defining a new type is the - definition of functions that describe its behavior. - Consequently, while it is possible to define a new - function without defining a new type, the reverse is - not true. We therefore describe how to add new functions - to Postgres before describing - how to add new types. - Postgres SQL - provides two types of functions: query language functions - (functions written in SQL and programming - language functions (functions written in a compiled - programming language such as C.) Either kind - of function can take a base type, a composite type or - some combination as arguments (parameters). In addition, - both kinds of functions can return a base type or - a composite type. It's easier to define SQL - functions, so we'll start with those. Examples in this section - can also be found in funcs.sql - and funcs.c. - + + As it turns out, part of defining a new type is the + definition of functions that describe its behavior. + Consequently, while it is possible to define a new + function without defining a new type, the reverse is + not true. We therefore describe how to add new functions + to Postgres before describing + how to add new types. + - -Query Language (<Acronym>SQL</Acronym>) Functions + + Postgres SQL + provides three types of functions: - -<Acronym>SQL</Acronym> Functions on Base Types + + + + query language functions + (functions written in SQL) + + + + + procedural language + functions (functions written in, for example, PLTCL or PLSQL) + + + + + programming + language functions (functions written in a compiled + programming language such as C) + + + - - The simplest possible SQL function has no arguments and - simply returns a base type, such as int4: + Every kind + of function can take a base type, a composite type or + some combination as arguments (parameters). In addition, + every kind of function can return a base type or + a composite type. It's easiest to define SQL + functions, so we'll start with those. Examples in this section + can also be found in funcs.sql + and funcs.c. + + + + Query Language (<acronym>SQL</acronym>) Functions + + + SQL functions execute an arbitrary list of SQL queries, returning + the results of the last query in the list. SQL functions in general + return sets. If their returntype is not specified as a + setof, + then an arbitrary element of the last query's result will be returned. + + + + The body of a SQL function following AS + should be a list of queries separated by whitespace characters and + bracketed within quotation marks. Note that quotation marks used in + the queries must be escaped, by preceding them with two + backslashes. + + + + Arguments to the SQL function may be referenced in the queries using + a $n syntax: $1 refers to the first argument, $2 to the second, and so + on. If an argument is complex, then a dot + notation (e.g. "$1.emp") may be + used to access attributes of the argument or + to invoke functions. + + + + Examples + + + To illustrate a simple SQL function, consider the following, + which might be used to debit a bank account: + + +create function TP1 (int4, float8) returns int4 + as 'update BANK set balance = BANK.balance - $2 + where BANK.acctountno = $1 + select(x = 1)' + language 'sql'; + + + A user could execute this function to debit account 17 by $100.00 as + follows: + + +select (x = TP1( 17,100.0)); + + + + + The following more interesting example takes a single argument of type + EMP, and retrieves multiple results: + + +select function hobbies (EMP) returns set of HOBBIES + as 'select (HOBBIES.all) from HOBBIES + where $1.name = HOBBIES.person' + language 'sql'; + + + + + + <acronym>SQL</acronym> Functions on Base Types + + + The simplest possible SQL function has no arguments and + simply returns a base type, such as int4: - + CREATE FUNCTION one() RETURNS int4 AS 'SELECT 1 as RESULT' LANGUAGE 'sql'; @@ -44,23 +129,22 @@ +-------+ |1 | +-------+ - - - - + + + Notice that we defined a target list for the function (with the name RESULT), but the target list of the query that invoked the function overrode the function's target list. Hence, the result is labelled answer instead of one. - - - It's almost as easy to define SQL functions + + + It's almost as easy to define SQL functions that take base types as arguments. In the example below, notice how we refer to the arguments within the function as $1 - and $2. - - + and $2: + + CREATE FUNCTION add_em(int4, int4) RETURNS int4 AS 'SELECT $1 + $2;' LANGUAGE 'sql'; @@ -71,22 +155,22 @@ +-------+ |3 | +-------+ - - - + + + - -<Acronym>SQL</Acronym> Functions on Composite Types + + <acronym>SQL</acronym> Functions on Composite Types - + When specifying functions with arguments of composite types (such as EMP), we must not only specify which argument we want (as we did above with $1 and $2) but also the attributes of that argument. For example, take the function double_salary that computes what your - salary would be if it were doubled. - - + salary would be if it were doubled: + + CREATE FUNCTION double_salary(EMP) RETURNS int4 AS 'SELECT $1.salary * 2 AS salary;' LANGUAGE 'sql'; @@ -100,17 +184,17 @@ +-----+-------+ |Sam | 2400 | +-----+-------+ - - - + + + Notice the use of the syntax $1.salary. Before launching into the subject of functions that return composite types, we must first introduce the function notation for projecting attributes. The simple way to explain this is that we can usually use the - notation attribute(class) and class.attribute interchangably. - - + notation attribute(class) and class.attribute interchangably: + + -- -- this is the same as: -- SELECT EMP.name AS youngster FROM EMP WHERE EMP.age < 30 @@ -124,59 +208,60 @@ +----------+ |Sam | +----------+ - - - + + + As we shall see, however, this is not always the case. This function notation is important when we want to use a function that returns a single instance. We do this by assembling the entire instance within the function, attribute by attribute. This is an example of a function that returns a single EMP instance: - - + + CREATE FUNCTION new_emp() RETURNS EMP AS 'SELECT \'None\'::text AS name, 1000 AS salary, 25 AS age, \'(2,2)\'::point AS cubicle' LANGUAGE 'sql'; - - - - + + + In this case we have specified each of the attributes with a constant value, but any computation or expression could have been substituted for these constants. Defining a function like this can be tricky. Some of the more important caveats are as follows: - - - - - - The target list order must be exactly the same as - that in which the attributes appear in the CREATE - TABLE statement (or when you execute a .* query). - - - - -You must typecast the expressions (using ::) very carefully -or you will see the following error: - - - WARN::function declared to return type EMP does not retrieve (EMP.*) - - - - - -When calling a function that returns an instance, we + + + + + The target list order must be exactly the same as + that in which the attributes appear in the CREATE + TABLE statement (or when you execute a .* query). + + + + + You must typecast the expressions (using ::) very carefully + or you will see the following error: + + + +WARN::function declared to return type EMP does not retrieve (EMP.*) + + + + + + + When calling a function that returns an instance, we cannot retrieve the entire instance. We must either project an attribute out of the instance or pass the entire instance into another function. - + + SELECT name(new_emp()) AS nobody; +-------+ @@ -184,35 +269,35 @@ When calling a function that returns an instance, we +-------+ |None | +-------+ - - - - - -The reason why, in general, we must use the function + + + + + + The reason why, in general, we must use the function syntax for projecting attributes of function return values is that the parser just doesn't understand the other (dot) syntax for projection when combined with function calls. - - + + SELECT new_emp().name AS nobody; WARN:parser: syntax error at or near "." - - - - - - - Any collection of commands in the SQL query + + + + + + + Any collection of commands in the SQL query language can be packaged together and defined as a function. - The commands can include updates (i.e., insert, - update and delete) as well - as select queries. However, the final command - must be a select that returns whatever is + The commands can include updates (i.e., insert, + update and delete) as well + as select queries. However, the final command + must be a select that returns whatever is specified as the function's returntype. - - + + CREATE FUNCTION clean_EMP () RETURNS int4 AS 'DELETE FROM EMP WHERE EMP.salary <= 0; SELECT 1 AS ignore_this' @@ -226,74 +311,312 @@ The reason why, in general, we must use the function |1 | +--+ - - - - + + + + - -Programming Language Functions + + Procedural Language Functions - -Programming Language Functions on Base Types + + Procedural languages aren't built into Postgres. They are offered + by loadable modules. Please refer to the documentation for the + PL in question for details about the syntax and how the AS + clause is interpreted by the PL handler. + - - Internally, Postgres regards a + + There are two procedural languages available with the standard + Postgres distribution (PLTCL and PLSQL), and other + languages can be defined. + Refer to for + more information. + + + + + Internal Functions + + + Internal functions are functions written in C which have been statically + linked into the Postgres backend + process. The AS + clause gives the C-language name of the function, which need not be the + same as the name being declared for SQL use. + (For reasons of backwards compatibility, an empty AS + string is accepted as meaning that the C-language function name is the + same as the SQL name.) Normally, all internal functions present in the + backend are declared as SQL functions during database initialization, + but a user could use CREATE FUNCTION + to create additional alias names for an internal function. + + + + + Compiled (C) Language Functions + + + Functions written in C can be defined to Postgres, which will dynamically + load them into its address space. The AS + clause gives the full path name of the object file that contains the + function. This file is loaded either using + load(l) + or automatically the first time the function is necessary for + execution. Repeated execution of a function will cause negligible + additional overhead, as the function will remain in a main memory + cache. + + + + The string which specifies the object file (the string in the AS clause) + should be the full path + of the object code file for the function, bracketed by quotation + marks. (Postgres will not compile a + function automatically; it must + be compiled before it is used in a CREATE FUNCTION + command. See below for additional information.) + + + + C Language Functions on Base Types + + + The following table gives the C type required for parameters in the C + functions that will be loaded into Postgres. The "Defined In" + column gives the actual header file (in the + .../src/backend/ + directory) that the equivalent C type is defined. However, if you + include utils/builtins.h, + these files will automatically be + included. + + + Equivalent C Types + for Built-In <productname>Postgres</productname> Types + Equivalent C Types + + + + + Built-In Type + + + C Type + + + Defined In + + + + + + abstime + AbsoluteTime + utils/nabstime.h + + + bool + bool + include/c.h + + + box + (BOX *) + utils/geo-decls.h + + + bytea + (bytea *) + include/postgres.h + + + char + char + N/A + + + cid + CID + include/postgres.h + + + datetime + (DateTime *) + include/c.h or include/postgres.h + + + int2 + int2 + include/postgres.h + + + int28 + (int28 *) + include/postgres.h + + + int4 + int4 + include/postgres.h + + + float4 + float32 or (float4 *) + include/c.h or include/postgres.h + + + float8 + float64 or (float8 *) + include/c.h or include/postgres.h + + + lseg + (LSEG *) + include/geo-decls.h + + + name + (Name) + include/postgres.h + + + oid + oid + include/postgres.h + + + oid8 + (oid8 *) + include/postgres.h + + + path + (PATH *) + utils/geo-decls.h + + + point + (POINT *) + utils/geo-decls.h + + + regproc + regproc or REGPROC + include/postgres.h + + + reltime + RelativeTime + utils/nabstime.h + + + text + (text *) + include/postgres.h + + + tid + ItemPointer + storage/itemptr.h + + + timespan + (TimeSpan *) + include/c.h or include/postgres.h + + + tinterval + TimeInterval + utils/nabstime.h + + + uint2 + uint16 + include/c.h + + + uint4 + uint32 + include/c.h + + + xid + (XID *) + include/postgres.h + + + +
+
+ + + Internally, Postgres regards a base type as a "blob of memory." The user-defined functions that you define over a type in turn define the - way that Postgres can operate - on it. That is, Postgres will + way that Postgres can operate + on it. That is, Postgres will only store and retrieve the data from disk and use your user-defined functions to input, process, and output the data. Base types can have one of three internal formats: - -pass by value, fixed-length - -pass by reference, fixed-length - -pass by reference, variable-length - - - - + + + + pass by value, fixed-length + + + + + pass by reference, fixed-length + + + + + pass by reference, variable-length + + + + + + By-value types can only be 1, 2 or 4 bytes in length (even if your computer supports by-value types of other - sizes). Postgres itself + sizes). Postgres itself only passes integer types by value. You should be careful to define your types such that they will be the same size (in bytes) on all architectures. For example, the - long type is dangerous because it + long type is dangerous because it is 4 bytes on some machines and 8 bytes on others, whereas - int type is 4 bytes on most - UNIX machines (though not on most + int type is 4 bytes on most + UNIX machines (though not on most personal computers). A reasonable implementation of - the int4 type on UNIX + the int4 type on UNIX machines might be: - - /* 4-byte integer, passed by value */ - typedef int int4; - - + +/* 4-byte integer, passed by value */ +typedef int int4; + + - + On the other hand, fixed-length types of any size may be passed by-reference. For example, here is a sample - implementation of a Postgres type: + implementation of a Postgres type: - - /* 16-byte structure, passed by reference */ - typedef struct - { - double x, y; - } Point; - - + +/* 16-byte structure, passed by reference */ +typedef struct +{ + double x, y; +} Point; + + - + Only pointers to such types can be used when passing - them in and out of Postgres functions. + them in and out of Postgres functions. Finally, all variable-length types must also be passed by reference. All variable-length types must begin with a length field of exactly 4 bytes, and all data to @@ -302,42 +625,42 @@ The reason why, in general, we must use the function length field is the total length of the structure (i.e., it includes the size of the length field itself). We can define the text type as follows: - - - - typedef struct { - int4 length; - char data[1]; - } text; - - + +typedef struct { + int4 length; + char data[1]; +} text; + + - + Obviously, the data field is not long enough to hold - all possible strings -- it's impossible to declare such - a structure in C. When manipulating + all possible strings; it's impossible to declare such + a structure in C. When manipulating variable-length types, we must be careful to allocate the correct amount of memory and initialize the length field. For example, if we wanted to store 40 bytes in a text structure, we might use a code fragment like this: - - #include "postgres.h" - ... - char buffer[40]; /* our source data */ - ... - text *destination = (text *) palloc(VARHDRSZ + 40); - destination->length = VARHDRSZ + 40; - memmove(destination->data, buffer, 40); - ... - - - + +#include "postgres.h" +... +char buffer[40]; /* our source data */ +... +text *destination = (text *) palloc(VARHDRSZ + 40); +destination->length = VARHDRSZ + 40; +memmove(destination->data, buffer, 40); +... + + + + Now that we've gone over all of the possible structures for base types, we can show some examples of real functions. - Suppose funcs.c look like: - + Suppose funcs.c look like: + + #include <string.h> #include "postgres.h" @@ -394,56 +717,58 @@ The reason why, in general, we must use the function strncat(VARDATA(new_text), VARDATA(arg2), VARSIZE(arg2)-VARHDRSZ); return (new_text); } - - + + - - On OSF/1 we would type: + + On OSF/1 we would type: - + CREATE FUNCTION add_one(int4) RETURNS int4 - AS 'PGROOT/tutorial/funcs.so' LANGUAGE 'c'; + AS 'PGROOT/tutorial/funcs.so' LANGUAGE 'c'; CREATE FUNCTION makepoint(point, point) RETURNS point - AS 'PGROOT/tutorial/funcs.so' LANGUAGE 'c'; + AS 'PGROOT/tutorial/funcs.so' LANGUAGE 'c'; CREATE FUNCTION concat_text(text, text) RETURNS text - AS 'PGROOT/tutorial/funcs.so' LANGUAGE 'c'; + AS 'PGROOT/tutorial/funcs.so' LANGUAGE 'c'; CREATE FUNCTION copytext(text) RETURNS text - AS 'PGROOT/tutorial/funcs.so' LANGUAGE 'c'; - - + AS 'PGROOT/tutorial/funcs.so' LANGUAGE 'c'; + + - + On other systems, we might have to make the filename end in .sl (to indicate that it's a shared library). - -
+ + - -Programming Language Functions on Composite Types + + C Language Functions on Composite Types - + Composite types do not have a fixed layout like C structures. Instances of a composite type may contain null fields. In addition, composite types that are part of an inheritance hierarchy may have different fields than other members of the same inheritance hierarchy. - Therefore, Postgres provides + Therefore, Postgres provides a procedural interface for accessing fields of composite types - from C. As Postgres processes + from C. As Postgres processes a set of instances, each instance will be passed into your - function as an opaque structure of type TUPLE. + function as an opaque structure of type TUPLE. Suppose we want to write a function to answer the query - + + * SELECT name, c_overpaid(EMP, 1500) AS overpaid FROM EMP WHERE name = 'Bill' or name = 'Sam'; - + + In the query above, we can define c_overpaid as: - + #include "postgres.h" #include "executor/executor.h" /* for GetAttributeByName() */ @@ -458,136 +783,248 @@ The reason why, in general, we must use the function return (false); return(salary > limit); } - - + + - - GetAttributeByName is the - Postgres system function that + + GetAttributeByName is the + Postgres system function that returns attributes out of the current instance. It has three arguments: the argument of type TUPLE passed into the function, the name of the desired attribute, and a return parameter that describes whether the attribute - is null. GetAttributeByName will + is null. GetAttributeByName will align data properly so you can cast its return value to the desired type. For example, if you have an attribute - name which is of the type name, the GetAttributeByName + name which is of the type name, the GetAttributeByName call would look like: - + + char *str; ... str = (char *) GetAttributeByName(t, "name", &isnull) - - + + - - The following query lets Postgres + + The following query lets Postgres know about the c_overpaid function: - - * CREATE FUNCTION c_overpaid(EMP, int4) RETURNS bool - AS 'PGROOT/tutorial/obj/funcs.so' LANGUAGE 'c'; - - - + + * CREATE FUNCTION c_overpaid(EMP, int4) RETURNS bool + AS 'PGROOT/tutorial/obj/funcs.so' LANGUAGE 'c'; + + + + While there are ways to construct new instances or modify existing instances from within a C function, these are far too complex to discuss in this manual. - - + + - -Caveats + + Writing Code - + We now turn to the more difficult task of writing programming language functions. Be warned: this section of the manual will not make you a programmer. You must - have a good understanding of C + have a good understanding of C (including the use of pointers and the malloc memory manager) - before trying to write C functions for - use with Postgres. While it may + before trying to write C functions for + use with Postgres. While it may be possible to load functions written in languages other - than C into Postgres, + than C into Postgres, this is often difficult (when it is possible at all) - because other languages, such as FORTRAN - and Pascal often do not follow the same - "calling convention" as C. That is, other + because other languages, such as FORTRAN + and Pascal often do not follow the same + calling convention + as C. That is, other languages do not pass argument and return values between functions in the same way. For this reason, we will assume that your programming language functions - are written in C. - The basic rules for building C functions + are written in C. + + + + C functions with base type arguments can be written in a + straightforward fashion. The C equivalents of built-in Postgres types + are accessible in a C file if + PGROOT/src/backend/utils/builtins.h + is included as a header file. This can be achieved by having + + +#include <utils/builtins.h> + + + at the top of the C source file. + + + + The basic rules for building C functions are as follows: - - - - Most of the header (include) files for - Postgres - should already be installed in - PGROOT/include (see Figure 2). - You should always include - - - -I$PGROOT/include - - on your cc command lines. Sometimes, you may - find that you require header files that are in - the server source itself (i.e., you need a file - we neglected to install in include). In those - cases you may need to add one or more of - - -I$PGROOT/src/backend - -I$PGROOT/src/backend/include - -I$PGROOT/src/backend/port/<PORTNAME> - -I$PGROOT/src/backend/obj - - (where <PORTNAME> is the name of the port, e.g., - alpha or sparc). - - - - When allocating memory, use the - Postgres - routines palloc and pfree instead of the - corresponding C library routines - malloc and free. - The memory allocated by palloc will be freed - automatically at the end of each transaction, - preventing memory leaks. - - - - Always zero the bytes of your structures using - memset or bzero. Several routines (such as the - hash access method, hash join and the sort algorithm) - compute functions of the raw bits contained in - your structure. Even if you initialize all fields - of your structure, there may be - several bytes of alignment padding (holes in the - structure) that may contain garbage values. - - - - Most of the internal Postgres - types are declared in postgres.h, so it's a good - idea to always include that file as well. Including - postgres.h will also include elog.h and palloc.h for you. - - - - Compiling and loading your object code so that - it can be dynamically loaded into - Postgres - always requires special flags. See Appendix A - for a detailed explanation of how to do it for - your particular operating system. - - - - - -
-
+ + + + Most of the header (include) files for + Postgres + should already be installed in + PGROOT/include (see Figure 2). + You should always include + +-I$PGROOT/include + + + on your cc command lines. Sometimes, you may + find that you require header files that are in + the server source itself (i.e., you need a file + we neglected to install in include). In those + cases you may need to add one or more of + + +-I$PGROOT/src/backend +-I$PGROOT/src/backend/include +-I$PGROOT/src/backend/port/<PORTNAME> +-I$PGROOT/src/backend/obj + + + (where <PORTNAME> is the name of the port, e.g., + alpha or sparc). + + + + + When allocating memory, use the + Postgres + routines palloc and pfree instead of the + corresponding C library routines + malloc and free. + The memory allocated by palloc will be freed + automatically at the end of each transaction, + preventing memory leaks. + + + + + Always zero the bytes of your structures using + memset or bzero. Several routines (such as the + hash access method, hash join and the sort algorithm) + compute functions of the raw bits contained in + your structure. Even if you initialize all fields + of your structure, there may be + several bytes of alignment padding (holes in the + structure) that may contain garbage values. + + + + + Most of the internal Postgres + types are declared in postgres.h, + so it's a good + idea to always include that file as well. Including + postgres.h will also include elog.h and palloc.h for you. + + + + + Compiling and loading your object code so that + it can be dynamically loaded into + Postgres + always requires special flags. + See + for a detailed explanation of how to do it for + your particular operating system. + + + + + + + + + Function Overloading + + + More than one function may be defined with the same name, as long as + the arguments they take are different. In other words, function names + can be overloaded. + A function may also have the same name as an attribute. In the case + that there is an ambiguity between a function on a complex type and + an attribute of the complex type, the attribute will always be used. + + + + Name Space Conflicts + + + As of Postgres v6.5, + CREATE FUNCTION can decouple a C language + function name from the name of the entry point. This is now the + preferred technique to accomplish function overloading. + + + + Pre-v6.5 + + + For functions written in C, the SQL name declared in + CREATE FUNCTION + must be exactly the same as the actual name of the function in the + C code (hence it must be a legal C function name). + + + + There is a subtle implication of this restriction: while the + dynamic loading routines in most operating systems are more than + happy to allow you to load any number of shared libraries that + contain conflicting (identically-named) function names, they may + in fact botch the load in interesting ways. For example, if you + define a dynamically-loaded function that happens to have the + same name as a function built into Postgres, the DEC OSF/1 dynamic + loader causes Postgres to call the function within itself rather than + allowing Postgres to call your function. Hence, if you want your + function to be used on different architectures, we recommend that + you do not overload C function names. + + + + There is a clever trick to get around the problem just described. + Since there is no problem overloading SQL functions, you can + define a set of C functions with different names and then define + a set of identically-named SQL function wrappers that take the + appropriate argument types and call the matching C function. + + + + Another solution is not to use dynamic loading, but to link your + functions into the backend statically and declare them as INTERNAL + functions. Then, the functions must all have distinct C names but + they can be declared with the same SQL names (as long as their + argument types differ, of course). This way avoids the overhead of + an SQL wrapper function, at the cost of more effort to prepare a + custom backend executable. + + + + +
+ + diff --git a/doc/src/sgml/xindex.sgml b/doc/src/sgml/xindex.sgml index 4c0db10295..771c30e485 100644 --- a/doc/src/sgml/xindex.sgml +++ b/doc/src/sgml/xindex.sgml @@ -1,3 +1,8 @@ + + Interfacing Extensions To Indices diff --git a/doc/src/sgml/xplang.sgml b/doc/src/sgml/xplang.sgml index a87404bd61..f9752d11ca 100644 --- a/doc/src/sgml/xplang.sgml +++ b/doc/src/sgml/xplang.sgml @@ -1,14 +1,14 @@ - -Procedural Languages + + Procedural Languages - + Beginning with the release of version 6.3, - Postgres supports + Postgres supports the definition of procedural languages. In the case of a function or trigger procedure defined in a procedural language, the database has @@ -18,60 +18,60 @@ handler itself is a special programming language function compiled into a shared object and loaded on demand. - + - -Installing Procedural Languages + +Installing Procedural Languages - - +<procedure> + <title> Procedural Language Installation - + A procedural language is installed in the database in three steps. - - + + The shared object for the language handler must be compiled and installed. By default the handler for PL/pgSQL is built and installed into the database library directory. If Tcl/Tk support is configured in, the handler for PL/Tcl is also built and installed in the same location. - - + + Writing a handler for a new procedural language (PL) is outside the scope of this manual. - - - - + + + + The handler must be declared with the command - - CREATE FUNCTION handler_function_name () RETURNS OPAQUE AS - 'path-to-shared-object' LANGUAGE 'C'; - - The special return type of OPAQUE tells + + CREATE FUNCTION handler_function_name () RETURNS OPAQUE AS + 'path-to-shared-object' LANGUAGE 'C'; + + The special return type of OPAQUE tells the database, that this function does not return one of the defined base- or composite types and is not directly usable - in SQL statements. - - - - + in SQL statements. + + + + The PL must be declared with the command - - CREATE [ TRUSTED ] PROCEDURAL LANGUAGE 'language-name' - HANDLER handler_function_name - LANCOMPILER 'description'; - - The optional keyword TRUSTED tells + + CREATE [ TRUSTED ] PROCEDURAL LANGUAGE 'language-name' + HANDLER handler_function_name + LANCOMPILER 'description'; + + The optional keyword TRUSTED tells if ordinary database users that have no superuser privileges can use this language to create functions and trigger procedures. Since PL functions are @@ -79,129 +79,129 @@ languages that don't gain access to database backends internals or the filesystem. The languages PL/pgSQL and PL/Tcl are known to be trusted. - - - + + + - - Example - - + + Example + + The following command tells the database where to find the shared object for the PL/pgSQL languages call handler function. - - + + CREATE FUNCTION plpgsql_call_handler () RETURNS OPAQUE AS '/usr/local/pgsql/lib/plpgsql.so' LANGUAGE 'C'; - - + + - - + + The command - - + + CREATE TRUSTED PROCEDURAL LANGUAGE 'plpgsql' HANDLER plpgsql_call_handler LANCOMPILER 'PL/pgSQL'; - - + + then defines that the previously declared call handler function should be invoked for functions and trigger procedures where the language attribute is 'plpgsql'. - - + + PL handler functions have a special call interface that is different from regular C language functions. One of the arguments - given to the handler is the object ID in the pg_proc + given to the handler is the object ID in the pg_proc tables entry for the function that should be executed. The handler examines various system catalogs to analyze the functions call arguments and it's return data type. The source text of the functions body is found in the prosrc attribute of - pg_proc. + pg_proc. Due to this, in contrast to C language functions, PL functions can be overloaded like SQL language functions. There can be multiple different PL functions having the same function name, as long as the call arguments differ. - - - Procedural languages defined in the template1 + + + Procedural languages defined in the template1 database are automatically defined in all subsequently created databases. So the database administrator can decide which languages are available by default. - - - - + + + + - -PL/pgSQL + +PL/pgSQL - + PL/pgSQL is a loadable procedural language for the - Postgres database system. - + Postgres database system. + - + This package was originally written by Jan Wieck. - + - -Overview + +Overview - + The design goals of PL/pgSQL were to create a loadable procedural language that - - - + + + can be used to create functions and trigger procedures, - - - - - adds control structures to the SQL language, - - - - + + + + + adds control structures to the SQL language, + + + + can perform complex computations, - - - - + + + + inherits all user defined types, functions and operators, - - - - + + + + can be defined to be trusted by the server, - - - - + + + + is easy to use. - - - - - + + + + + The PL/pgSQL call handler parses the functions source text and produces an internal binary instruction tree on the first time, the function is called by a backend. The produced bytecode is identified in the call handler by the object ID of the function. This ensures, that changing a function by a DROP/CREATE sequence will take effect without establishing a new database connection. - - - For all expressions and SQL statements used in + + + For all expressions and SQL statements used in the function, the PL/pgSQL bytecode interpreter creates a prepared execution plan using the SPI managers SPI_prepare() and SPI_saveplan() functions. This is done the first time, the individual @@ -210,41 +210,41 @@ plans would be required, will only prepare and save those plans that are really used during the entire lifetime of the database connection. - - + + Except for input-/output-conversion and calculation functions for user defined types, anything that can be defined in C language functions can also be done with PL/pgSQL. It is possible to create complex conditional computation functions and later use them to define operators or use them in functional indices. - - + + - -Description + +Description - -Structure of PL/pgSQL + +Structure of PL/pgSQL - + The PL/pgSQL language is case insensitive. All keywords and identifiers can be used in mixed upper- and lowercase. - - + + PL/pgSQL is a block oriented language. A block is defined as - + [<<label>>] [DECLARE declarations] BEGIN statements END; - + There can be any number of subblocks in the statement section of a block. Subblocks can be used to hide variables from outside a @@ -252,76 +252,78 @@ declared in the declarations section preceding a block are initialized to their default values every time the block is entered, not only once per function call. - + - + It is important not to misunderstand the meaning of BEGIN/END for grouping statements in PL/pgSQL and the database commands for transaction control. Functions and trigger procedures cannot - start or commit transactions and Postgres + start or commit transactions and Postgres does not have nested transactions. - - + + - -Comments + +Comments - + There are two types of comments in PL/pgSQL. A double dash '--' starts a comment that extends to the end of the line. A '/*' starts a block comment that extends to the next occurence of '*/'. Block comments cannot be nested, but double dash comments can be enclosed into a block comment and a double dash can hide the block comment delimiters '/*' and '*/'. - - + + - -Declarations + +Declarations - + All variables, rows and records used in a block or it's subblocks must be declared in the declarations section of a block except for the loop variable of a FOR loop iterating over a range of integer values. Parameters given to a PL/pgSQL function are automatically declared with the usual identifiers $n. The declarations have the following syntax: - - + + - - -name [ CONSTANT ] type [ NOT NULL ] [ DEFAULT | := value ]; - - - + + +name [ CONSTANT ] +>typ> [ NOT NULL ] [ DEFAULT | := + value ]; + + + Declares a variable of the specified base type. If the variable is declared as CONSTANT, the value cannot be changed. If NOT NULL is specified, an assignment of a NULL value results in a runtime error. Since the default value of all variables is the - SQL NULL value, all variables declared as NOT NULL + SQL NULL value, all variables declared as NOT NULL must also have a default value specified. - - + + The default value is evaluated ever time the function is called. So - assigning 'now' to a variable of type - datetime causes the variable to have the + assigning 'now' to a variable of type + datetime causes the variable to have the time of the actual function call, not when the function was precompiled into it's bytecode. - - - + + + - - -name class%ROWTYPE; - - - + + +name class%ROWTYPE; + + + Declares a row with the structure of the given class. Class must be an existing table- or viewname of the database. The fields of the row are accessed in the dot notation. Parameters to a function can @@ -331,110 +333,110 @@ attributes of a table row are accessible in the row, no Oid or other system attributes (hence the row could be from a view and view rows don't have useful system attributes). - - + + The fields of the rowtype inherit the tables fieldsizes or precision for char() etc. data types. - - - + + + - - -name RECORD; - - - + + +name RECORD; + + + Records are similar to rowtypes, but they have no predefined structure. They are used in selections and FOR loops to hold one actual database row from a SELECT operation. One and the same record can be used in different selections. Accessing a record or an attempt to assign a value to a record field when there is no actual row in it results in a runtime error. - - + + The NEW and OLD rows in a trigger are given to the procedure as - records. This is necessary because in Postgres + records. This is necessary because in Postgres one and the same trigger procedure can handle trigger events for different tables. - - - + + + - - -name ALIAS FOR $n; - - - + + +name ALIAS FOR $n; + + + For better readability of the code it is possible to define an alias for a positional parameter to a function. - - + + This aliasing is required for composite types given as arguments to a function. The dot notation $1.salary as in SQL functions is not allowed in PL/pgSQL. - - - + + + - - -RENAME oldname TO newname; - - - + + +RENAME oldname TO newname; + + + Change the name of a variable, record or row. This is useful if NEW or OLD should be referenced by another name inside a trigger procedure. - - - + + + - - + + - -Data Types + +Data Types - + The type of a varible can be any of the existing basetypes of - the database. type in the declarations + the database. type in the declarations section above is defined as: - - - - - - Postgres-basetype - - - - - variable%TYPE - - - - - class.field%TYPE - - - - - - variable is the name of a variable, + + + + + + Postgres-basetype + + + + + variable%TYPE + + + + + class.field%TYPE + + + + + + variable is the name of a variable, previously declared in the same function, that is visible at this point. - - - class is the name of an existing table - or view where field is the name of + + + class is the name of an existing table + or view where field is the name of an attribute. - - - Using the class.field%TYPE + + + Using the class.field%TYPE causes PL/pgSQL to lookup the attributes definitions at the first call to the funciton during the lifetime of a backend. Have a table with a char(20) attribute and some PL/pgSQL functions @@ -444,18 +446,18 @@ previously declared in the char(40) and restores the data. Ha - he forgot about the funcitons. The computations inside them will truncate the values to 20 characters. But if they are defined using the - class.field%TYPE + class.field%TYPE declarations, they will automagically handle the size change or if the new table schema defines the attribute as text type. - - + + - -Expressions + +Expressions - + All expressions used in PL/pgSQL statements are processed using the backends executor. Expressions which appear to contain constants may in fact require run-time evaluation (e.g. 'now' for the @@ -463,22 +465,22 @@ datetime type) so it is impossible for the PL/pgSQL parser to identify real constant values other than the NULL keyword. All expressions are evaluated internally by executing a query - - SELECT expression - + + SELECT expression + using the SPI manager. In the expression, occurences of variable identifiers are substituted by parameters and the actual values from the variables are passed to the executor in the parameter array. All expressions used in a PL/pgSQL function are only prepared and saved once. - - + + The type checking done by the Postgres main parser has some side effects to the interpretation of constant values. In detail there is a difference between what the two functions - + CREATE FUNCTION logfunc1 (text) RETURNS datetime AS ' DECLARE logtxt ALIAS FOR $1; @@ -487,11 +489,11 @@ it is impossible for the PL/pgSQL parser RETURN ''now''; END; ' LANGUAGE 'plpgsql'; - + and - + CREATE FUNCTION logfunc2 (text) RETURNS datetime AS ' DECLARE logtxt ALIAS FOR $1; @@ -502,9 +504,9 @@ it is impossible for the PL/pgSQL parser RETURN curtime; END; ' LANGUAGE 'plpgsql'; - + - do. In the case of logfunc1(), the Postgres + do. In the case of logfunc1(), the Postgres main parser knows when preparing the plan for the INSERT, that the string 'now' should be interpreted as datetime because the target field of logtable @@ -512,389 +514,392 @@ it is impossible for the PL/pgSQL parser and this constant value is then used in all invocations of logfunc1() during the lifetime of the backend. Needless to say that this isn't what the programmer wanted. - - - In the case of logfunc2(), the Postgres + + + In the case of logfunc2(), the Postgres main parser does not know what type 'now' should become and therefor it returns a datatype of text containing the string 'now'. During the assignment to the local variable curtime, the PL/pgSQL interpreter casts this string to the datetime type by calling the text_out() and datetime_in() functions for the conversion. - - - This type checking done by the Postgres main + + + This type checking done by the Postgres main parser got implemented after PL/pgSQL was nearly done. It is a difference between 6.3 and 6.4 and affects all functions using the prepared plan feature of the SPI manager. Using a local variable in the above manner is currently the only way in PL/pgSQL to get those values interpreted correctly. - - + + If record fields are used in expressions or statements, the data types of fields should not change between calls of one and the same expression. Keep this in mind when writing trigger procedures that handle events for more than one table. - - + + - -Statements + +Statements - + Anything not understood by the PL/pgSQL parser as specified below will be put into a query and sent down to the database engine to execute. The resulting query should not return any data. - + - + - - + + Assignment - - - + + + An assignment of a value to a variable or row/record field is written as - - identifier := expression; - + + identifier := expression; + If the expressions result data type doesn't match the variables data type, or the variable has a size/precision that is known (as for char(20)), the result value will be implicitly casted by the PL/pgSQL bytecode interpreter using the result types output- and the variables type input-functions. Note that this could potentially result in runtime errors generated by the types input functions. - - + + An assignment of a complete selection into a record or row can be done by - - SELECT expressions INTO target FROM ...; - - target can be a record, a row variable or a + + SELECT expressions INTO target FROM ...; + + target can be a record, a row variable or a comma separated list of variables and record-/row-fields. - - + + if a row or a variable list is used as target, the selected values must exactly match the structure of the target(s) or a runtime error occurs. The FROM keyword can be followed by any valid qualification, grouping, sorting etc. that can be given for a SELECT statement. - - + + There is a special variable named FOUND of type bool that can be used immediately after a SELECT INTO to check if an assignment had success. - + SELECT * INTO myrec FROM EMP WHERE empname = myname; IF NOT FOUND THEN RAISE EXCEPTION ''employee % not found'', myname; END IF; - + If the selection returns multiple rows, only the first is moved into the target fields. All others are silently discarded. - - - + + + - - + + Calling another function - - - - All functions defined in a Prostgres + + + + All functions defined in a Prostgres database return a value. Thus, the normal way to call a function is to execute a SELECT query or doing an assignment (resulting in a PL/pgSQL internal SELECT). But there are cases where someone isn't interested int the functions result. - - PERFORM query - - executes a 'SELECT query' over the + + PERFORM query + + executes a 'SELECT query' over the SPI manager and discards the result. Identifiers like local variables are still substituted into parameters. - - - + + + - - + + Returning from the function - - - - - RETURN expression - - The function terminates and the value of expression + + + + + RETURN expression + + The function terminates and the value of expression will be returned to the upper executor. The return value of a function cannot be undefined. If control reaches the end of the toplevel block of the function without hitting a RETURN statement, a runtime error will occur. - - + + The expressions result will be automatically casted into the functions return type as described for assignments. - - - + + + - - + + Aborting and messages - - - + + + As indicated in the above examples there is a RAISE statement that - can throw messages into the Postgres + can throw messages into the Postgres elog mechanism. - - RAISE level ''format'' [, identifier [...]]; - + + RAISE level + r">forle>'' [, + identifier [...]]; + Inside the format, % is used as a placeholder for the subsequent comma-separated identifiers. Possible levels are DEBUG (silently suppressed in production running databases), NOTICE (written into the database log and forwarded to the client application) and EXCEPTION (written into the database log and aborting the transaction). - - - + + + - - + + Conditionals - - - - - IF expression THEN + + + + + IF expression THEN statements [ELSE statements] END IF; - - The expression must return a value that + + The expression must return a value that at least can be casted into a boolean type. - - - + + + - - + + Loops - - - + + + There are multiple types of loops. - + [<<label>>] LOOP statements END LOOP; - + An unconditional loop that must be terminated explicitly by an EXIT statement. The optional label can be used by EXIT statements of nested loops to specify which level of nesting should be terminated. - + [<<label>>] - WHILE expression LOOP + WHILE expression LOOP statements END LOOP; - + A conditional loop that is executed as long as the evaluation - of expression is true. - + of expression is true. + [<<label>>] - FOR name IN [ REVERSE ] expression .. expression LOOP + FOR name IN [ REVERSE ] +le>expressle> .. expression LOOP statements END LOOP; - + A loop that iterates over a range of integer values. The variable - name is automatically created as type + name is automatically created as type integer and exists only inside the loop. The two expressions giving the lower and upper bound of the range are evaluated only when entering the loop. The iteration step is always 1. - + [<<label>>] - FOR record | row IN select_clause LOOP + FOR record | row IN select_clause LOOP statements END LOOP; - + The record or row is assigned all the rows resulting from the select clause and the statements executed for each. If the loop is terminated with an EXIT statement, the last assigned row is still accessible after the loop. - - EXIT [ label ] [ WHEN expression ]; - - If no label given, + + EXIT [ label ] [ WHEN expression ]; + + If no label given, the innermost loop is terminated and the statement following END LOOP is executed next. - If label is given, it + If label is given, it must be the label of the current or an upper level of nested loop blocks. Then the named loop or block is terminated and control continues with the statement after the loops/blocks corresponding END. - - - + + + - + - + - -Trigger Procedures + +Trigger Procedures - + PL/pgSQL can be used to define trigger procedures. They are created with the usual CREATE FUNCTION command as a function with no arguments and a return type of OPAQUE. - - - There are some Postgres specific details + + + There are some Postgres specific details in functions used as trigger procedures. - - + + First they have some special variables created automatically in the toplevel blocks declaration section. They are - + - + - - + + NEW - - - + + + Datatype RECORD; variable holding the new database row on INSERT/UPDATE operations on ROW level triggers. - - - + + + - - + + OLD - - - + + + Datatype RECORD; variable holding the old database row on UPDATE/DELETE operations on ROW level triggers. - - - + + + - - + + TG_NAME - - - + + + Datatype name; variable that contains the name of the trigger actually fired. - - - + + + - - + + TG_WHEN - - - + + + Datatype text; a string of either 'BEFORE' or 'AFTER' depending on the triggers definition. - - - + + + - - + + TG_LEVEL - - - + + + Datatype text; a string of either 'ROW' or 'STATEMENT' depending on the triggers definition. - - - + + + - - + + TG_OP - - - + + + Datatype text; a string of 'INSERT', 'UPDATE' or 'DELETE' telling for which operation the trigger is actually fired. - - - + + + - - + + TG_RELID - - - + + + Datatype oid; the object ID of the table that caused the trigger invocation. - - - + + + - - + + TG_RELNAME - - - + + + Datatype name; the name of the table that caused the trigger invocation. - - - + + + - - + + TG_NARGS - - - + + + Datatype integer; the number of arguments given to the trigger procedure in the CREATE TRIGGER statement. - - - + + + - - + + TG_ARGV[] - - - + + + Datatype array of text; the arguments from the CREATE TRIGGER statement. The index counts from 0 and can be given as an expression. Invalid indices (< 0 or >= tg_nargs) result in a NULL value. - - - + + + - + - + Second they must return either NULL or a record/row containing exactly the structure of the table the trigger was fired for. Triggers fired AFTER might always return a NULL value with no @@ -904,22 +909,22 @@ Loops row in the operation. It is possible to replace single values directly in NEW and return that or to build a complete new record/row to return. - - + + - -Exceptions + +Exceptions - - Postgres does not have a very smart + + Postgres does not have a very smart exception handling model. Whenever the parser, planner/optimizer or executor decide that a statement cannot be processed any longer, the whole transaction gets aborted and the system jumps back into the mainloop to get the next query from the client application. - - + + It is possible to hook into the error mechanism to notice that this happens. But currently it's impossible to tell what really caused the abort (input/output conversion error, floating point @@ -929,73 +934,73 @@ Loops And even if, at this point the information, that the transaction is aborted, is already sent to the client application, so resuming operation does not make any sense. - - + + Thus, the only thing PL/pgSQL currently does when it encounters an abort during execution of a function or trigger procedure is to write some additional DEBUG level log messages telling in which function and where (line number and type of statement) this happened. - - - + + + - -Examples + +Examples - + Here are only a few functions to demonstrate how easy PL/pgSQL functions can be written. For more complex examples the programmer might look at the regression test for PL/pgSQL. - + - + One painful detail of writing functions in PL/pgSQL is the handling of single quotes. The functions source text on CREATE FUNCTION must be a literal string. Single quotes inside of literal strings must be either doubled or quoted with a backslash. We are still looking for an elegant alternative. In the meantime, doubling the single qoutes as in the examples below should be used. Any solution for this -in future versions of Postgres will be +in future versions of Postgres will be upward compatible. - + - -Some Simple PL/pgSQL Functions + +Some Simple PL/pgSQL Functions - + The following two PL/pgSQL functions are identical to their counterparts from the C language function discussion. - + CREATE FUNCTION add_one (int4) RETURNS int4 AS ' BEGIN RETURN $1 + 1; END; ' LANGUAGE 'plpgsql'; - + - + CREATE FUNCTION concat_text (text, text) RETURNS text AS ' BEGIN RETURN $1 || $2; END; ' LANGUAGE 'plpgsql'; - - + + - + - -PL/pgSQL Function on Composite Type + +PL/pgSQL Function on Composite Type - + Again it is the PL/pgSQL equivalent to the example from The C functions. - + CREATE FUNCTION c_overpaid (EMP, int4) RETURNS bool AS ' DECLARE emprec ALIAS FOR $1; @@ -1007,21 +1012,21 @@ upward compatible. RETURN emprec.salary > sallim; END; ' LANGUAGE 'plpgsql'; - - + + - + - -PL/pgSQL Trigger Procedure + +PL/pgSQL Trigger Procedure - + This trigger ensures, that any time a row is inserted or updated in the table, the current username and time are stamped into the row. And it ensures that an employees name is given and that the salary is a positive value. - + CREATE TABLE emp ( empname text, salary int4, @@ -1052,108 +1057,110 @@ upward compatible. CREATE TRIGGER emp_stamp BEFORE INSERT OR UPDATE ON emp FOR EACH ROW EXECUTE PROCEDURE emp_stamp(); - - + + - + - + - + - -PL/Tcl + +PL/Tcl - + PL/Tcl is a loadable procedural language for the - Postgres database system + Postgres database system that enables the Tcl language to be used to create functions and trigger-procedures. - + - + This package was originally written by Jan Wieck. - + - -Overview + +Overview - + PL/Tcl offers most of the capabilities a function writer has in the C language, except for some restrictions. - - + + The good restriction is, that everything is executed in a safe Tcl-interpreter. In addition to the limited command set of safe Tcl, only a few commands are available to access the database over SPI and to raise messages via elog(). There is no way to access internals of the database backend or gaining OS-level access under the permissions of the - Postgres user ID like in C. + Postgres user ID like in C. Thus, any unprivileged database user may be permitted to use this language. - - + + The other, internal given, restriction is, that Tcl procedures cannot be used to create input-/output-functions for new data types. - - + + The shared object for the PL/Tcl call handler is automatically built - and installed in the Postgres + and installed in the Postgres library directory if the Tcl/Tk support is specified in the configuration step of the installation procedure. - - + + - -Description + +Description - -<ProductName>Postgres</ProductName> Functions and Tcl Procedure Names + +<productname>Postgres</productname> Functions and Tcl Procedure Names - - In Postgres, one and the + + In Postgres, one and the same function name can be used for different functions as long as the number of arguments or their types differ. This would collide with Tcl procedure names. To offer the same flexibility in PL/Tcl, the internal Tcl procedure names contain the object ID of the procedures pg_proc row as part of their name. Thus, different - argtype versions of the same Postgres + argtype versions of the same Postgres function are different for Tcl too. - + - + - -Defining Functions in PL/Tcl + +Defining Functions in PL/Tcl - + To create a function in the PL/Tcl language, use the known syntax - - CREATE FUNCTION funcname (argument-types) RETURNS returntype AS ' + + CREATE FUNCTION funcname + ceable>argumenceable>) RETURNS + returntype AS ' # PL/Tcl function body ' LANGUAGE 'pltcl'; - + When calling this function in a query, the arguments are given as variables $1 ... $n to the Tcl procedure body. So a little max function returning the higher of two int4 values would be created as: - + CREATE FUNCTION tcl_max (int4, int4) RETURNS int4 AS ' if {$1 > $2} {return $1} return $2 ' LANGUAGE 'pltcl'; - + Composite type arguments are given to the procedure as Tcl arrays. The element names @@ -1161,9 +1168,9 @@ upward compatible. type. If an attribute in the actual row has the NULL value, it will not appear in the array! Here is an example that defines the overpaid_2 function (as found in the - older Postgres documentation) in PL/Tcl + older Postgres documentation) in PL/Tcl - + CREATE FUNCTION overpaid_2 (EMP) RETURNS bool AS ' if {200000.0 < $1(salary)} { return "t" @@ -1173,15 +1180,15 @@ upward compatible. } return "f" ' LANGUAGE 'pltcl'; - - + + - + - -Global Data in PL/Tcl + +Global Data in PL/Tcl - + Sometimes (especially when using the SPI functions described later) it is useful to have some global status data that is held between two calls to a procedure. @@ -1191,146 +1198,146 @@ upward compatible. an array is made available to each procedure via the upvar command. The global name of this variable is the procedures internal name and the local name is GD. - - + + - -Trigger Procedures in PL/Tcl + +Trigger Procedures in PL/Tcl - - Trigger procedures are defined in Postgres + + Trigger procedures are defined in Postgres as functions without arguments and a return type of opaque. And so are they in the PL/Tcl language. - - + + The informations from the trigger manager are given to the procedure body in the following variables: - - + + - - + + $TG_name - - - + + + The name of the trigger from the CREATE TRIGGER statement. - - - + + + - - + + $TG_relid - - - + + + The object ID of the table that caused the trigger procedure to be invoked. - - - + + + - - + + $TG_relatts - - - + + + A Tcl list of the tables field names prefixed with an empty list element. So looking up an element name in the list with the lsearch Tcl command returns the same positive number starting from 1 as the fields are numbered in the pg_attribute system catalog. - - - + + + - - + + $TG_when - - - + + + The string BEFORE or AFTER depending on the event of the trigger call. - - - + + + - - + + $TG_level - - - + + + The string ROW or STATEMENT depending on the event of the trigger call. - - - + + + - - + + $TG_op - - - + + + The string INSERT, UPDATE or DELETE depending on the event of the trigger call. - - - + + + - - + + $NEW - - - + + + An array containing the values of the new table row on INSERT/UPDATE actions, or empty on DELETE. - - - + + + - - + + $OLD - - - + + + An array containing the values of the old table row on UPDATE/DELETE actions, or empty on INSERT. - - - + + + - - + + $GD - - - + + + The global status data array as described above. - - - + + + - - + + $args - - - + + + A Tcl list of the arguments to the procedure as given in the CREATE TRIGGER statement. The arguments are also accessible as $1 ... $n in the procedure body. - - - + + + - + - + The return value from a trigger procedure is one of the strings OK or SKIP, or a list as returned by the 'array get' Tcl command. If the return value is OK, the normal operation (INSERT/UPDATE/DELETE) that fired this trigger @@ -1339,14 +1346,14 @@ $args to return a modified row to the trigger manager that will be inserted instead of the one given in $NEW (INSERT/UPDATE only). Needless to say that all this is only meaningful when the trigger is BEFORE and FOR EACH ROW. - - + + Here's a little example trigger procedure that forces an integer value in a table to keep track of the # of updates that are performed on the row. For new row's inserted, the value is initialized to 0 and then incremented on every update operation: - + CREATE FUNCTION trigfunc_modcount() RETURNS OPAQUE AS ' switch $TG_op { INSERT { @@ -1367,161 +1374,163 @@ $args CREATE TRIGGER trig_mytab_modcount BEFORE INSERT OR UPDATE ON mytab FOR EACH ROW EXECUTE PROCEDURE trigfunc_modcount('modcnt'); - + - - + + - -Database Access from PL/Tcl + +Database Access from PL/Tcl - + The following commands are available to access the database from the body of a PL/Tcl procedure: - + - + - - -elog level msg - - - + + +elog level msg + + + Fire a log message. Possible levels are NOTICE, WARN, ERROR, FATAL, DEBUG and NOIND like for the elog() C function. - - - + + + - - -quote string - - - + + +quote string + + + Duplicates all occurences of single quote and backslash characters. It should be used when variables are used in the query string given to spi_exec or spi_prepare (not for the value list on spi_execp). Think about a query string like - + "SELECT '$val' AS ret" - + where the Tcl variable val actually contains "doesn't". This would result in the final query string - + "SELECT 'doesn't' AS ret" - + what would cause a parse error during spi_exec or spi_prepare. It should contain - + "SELECT 'doesn''t' AS ret" - + and has to be written as - + "SELECT '[ quote $val ]' AS ret" - - - - + + + + - - -spi_exec ?-count n? ?-array name? query ?loop-body? - - - + + +spi_exec ?-count n? ?-array +>nam>?e>quee> ?loop-body? + + + Call parser/planner/optimizer/executor for query. The optional -count value tells spi_exec the maximum number of rows to be processed by the query. - - + + If the query is a SELECT statement and the optional loop-body (a body of Tcl commands like in a foreach statement) is given, it is evaluated for each row selected and behaves like expected on continue/break. The values of selected fields are put into variables named as the column names. So a - + spi_exec "SELECT count(*) AS cnt FROM pg_proc" - + will set the variable $cnt to the number of rows in the pg_proc system catalog. If the option -array is given, the column values are stored in the associative array named 'name' indexed by the column name instead of individual variables. - + spi_exec -array C "SELECT * FROM pg_class" { elog DEBUG "have table $C(relname)" } - + will print a DEBUG log message for every row of pg_class. The return value of spi_exec is the number of rows affected by query as found in the global variable SPI_processed. - - - + + + - - -spi_prepare query typelist - - - + + +spi_prepare query typelist + + + Prepares AND SAVES a query plan for later execution. It is a bit different from the C level SPI_prepare in that the plan is automatically copied to the toplevel memory context. Thus, there is currently no way of preparing a plan without saving it. - - + + If the query references arguments, the type names must be given as a Tcl list. The return value from spi_prepare is a query ID to be used in subsequent calls to spi_execp. See spi_execp for a sample. - - - + + + - - -spi_exec ?-count n? ?-array name? ?-nulls str? query ?valuelist? ?loop-body? - - - + + +spi_exec ?-count n? ?-array +>nam>? ?-nullse>se>le>quleble>valueble>? ?loop-body? + + + Execute a prepared plan from spi_prepare with variable substitution. The optional -count value tells spi_execp the maximum number of rows to be processed by the query. - - + + The optional value for -nulls is a string of spaces and 'n' characters telling spi_execp which of the values are NULL's. If given, it must have exactly the length of the number of values. - - + + The queryid is the ID returned by the spi_prepare call. - - + + If there was a typelist given to spi_prepare, a Tcl list of values of exactly the same length must be given to spi_execp after the query. If the type list on spi_prepare was empty, this argument must be omitted. - - + + If the query is a SELECT statement, the same as described for spi_exec happens for the loop-body and the variables for the fields selected. - - + + Here's an example for a PL/Tcl function using a prepared plan: - + CREATE FUNCTION t1_count(int4, int4) RETURNS int4 AS ' if {![ info exists GD(plan) ]} { # prepare the saved plan on the first call @@ -1532,7 +1541,7 @@ spi_exec ?-count n? ?-array name + Note that each backslash that Tcl should see must be doubled in the query creating the function, since the main parser processes @@ -1540,16 +1549,16 @@ spi_exec ?-count n? ?-array name - - + + + - - + + Modules and the unknown command - - - + + + PL/Tcl has a special support for things often used. It recognizes two magic tables, pltcl_modules and pltcl_modfuncs. If these exist, the module 'unknown' is loaded into the interpreter @@ -1558,21 +1567,36 @@ Modules and the unknown command of the modules. If this is true, the module is loaded on demand. To enable this behavior, the PL/Tcl call handler must be compiled with -DPLTCL_UNKNOWN_SUPPORT set. - - + + There are support scripts to maintain these tables in the modules subdirectory of the PL/Tcl source including the source for the unknown module that must get installed initially. - - - + + + - + - + - - + + + - - +