<!--
$PostgreSQL: pgsql/doc/src/sgml/xtypes.sgml,v 1.24 2005/01/08 22:13:38 tgl Exp $
-->

 <sect1 id="xtypes">
  <title>User-Defined Types</title>

  <indexterm zone="xtypes">
   <primary>data type</primary>
   <secondary>user-defined</secondary>
  </indexterm>

  <para>
   As described in <xref linkend="extend-type-system">,
   <productname>PostgreSQL</productname> can be extended to support new
   data types.  This section describes how to define new base types,
   which are data types defined below the level of the <acronym>SQL</>
   language.  Creating a new base type requires implementing functions
   to operate on the type in a low-level language, usually C.
  </para>

  <para>
   The examples in this section can be found in
   <filename>complex.sql</filename> and <filename>complex.c</filename>
   in the <filename>src/tutorial</> directory of the source distribution.
   See the <filename>README</> file in that directory for instructions
   about running the examples.
  </para>

  <para>
   <indexterm>
    <primary>input function</primary>
   </indexterm>
   <indexterm>
    <primary>output function</primary>
   </indexterm>
   A user-defined type must always have input and output
   functions.<indexterm><primary>input function</primary><secondary>of
   a data type</secondary></indexterm><indexterm><primary>output
   function</primary><secondary>of a data type</secondary></indexterm>
   These functions determine how the type appears in strings (for input
   by the user and output to the user) and how the type is organized in
   memory.  The input function takes a null-terminated character string
   as its argument and returns the internal (in memory) representation
   of the type.  The output function takes the internal representation
   of the type as argument and returns a null-terminated character
   string.  If we want to do anything more with the type than merely
   store it, we must provide additional functions to implement whatever
   operations we'd like to have for the type.
  </para>

  <para>
   Suppose we want to define a type <type>complex</> that represents
   complex numbers.  A natural way to represent a complex number in
   memory would be the following C structure:

<programlisting>
typedef struct Complex {
    double      x;
    double      y;
} Complex;
</programlisting>

   We will need to make this a pass-by-reference type, since it's too
   large to fit into a single <type>Datum</> value.
  </para>
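
  <para>
   The size argument can be checked with a few lines of plain C.  This
   is only a sketch outside the server: <literal>FakeDatum</literal>,
   <literal>fits_in_datum</literal> and <literal>complex_size</literal>
   are hypothetical names, and the sketch assumes that a
   <type>Datum</> is no wider than a pointer.
  </para>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct Complex
{
    double x;
    double y;
} Complex;

/* Stand-in for the server's Datum: assumed here to be pointer-sized. */
typedef uintptr_t FakeDatum;

/* True if a value of the given size could be stored directly in a FakeDatum. */
bool
fits_in_datum(size_t size)
{
    return size <= sizeof(FakeDatum);
}

/* Size of the in-memory representation; it holds two doubles. */
size_t
complex_size(void)
{
    assert(sizeof(Complex) >= 2 * sizeof(double));
    return sizeof(Complex);
}
```

  <para>
   On common platforms <literal>complex_size()</literal> is 16 bytes,
   which is more than a pointer-sized value can hold, so the type has
   to be passed by reference.
  </para>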

  <para>
   As the external string representation of the type, we choose a
   string of the form <literal>(x,y)</literal>.
  </para>

  <para>
   The input and output functions are usually not hard to write,
   especially the output function.  But when defining the external
   string representation of the type, remember that you must eventually
   write a complete and robust parser for that representation as your
   input function.  For instance:

<programlisting>
PG_FUNCTION_INFO_V1(complex_in);

Datum
complex_in(PG_FUNCTION_ARGS)
{
    char       *str = PG_GETARG_CSTRING(0);
    double      x,
                y;
    Complex    *result;

    /* accept "(x,y)", allowing whitespace around each token */
    if (sscanf(str, " ( %lf , %lf )", &amp;x, &amp;y) != 2)
        ereport(ERROR,
                (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                 errmsg("invalid input syntax for complex: \"%s\"",
                        str)));

    /* the result is pass-by-reference, so allocate it with palloc */
    result = (Complex *) palloc(sizeof(Complex));
    result-&gt;x = x;
    result-&gt;y = y;
    PG_RETURN_POINTER(result);
}
</programlisting>

   The output function can simply be:

<programlisting>
PG_FUNCTION_INFO_V1(complex_out);

Datum
complex_out(PG_FUNCTION_ARGS)
{
    Complex    *complex = (Complex *) PG_GETARG_POINTER(0);
    char       *result;

    /* 100 bytes is ample for two %g-formatted doubles plus punctuation */
    result = (char *) palloc(100);
    snprintf(result, 100, "(%g,%g)", complex-&gt;x, complex-&gt;y);
    PG_RETURN_CSTRING(result);
}
</programlisting>
  </para>
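
  <para>
   A parser built on <function>sscanf</function> alone returns 2 as
   soon as both numbers have been read, so a missing closing
   parenthesis or trailing text goes unnoticed.  One possible way to
   tighten this, shown as a standalone sketch rather than as the
   tutorial's code, is to add a <literal>%n</literal> conversion and
   check that the whole string was consumed.  The name
   <literal>complex_parse_strict</literal> is hypothetical, and a real
   input function would report the failure with
   <function>ereport</function> instead of returning a flag.
  </para>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

typedef struct Complex
{
    double x;
    double y;
} Complex;

/*
 * Hypothetical helper: parse "(x,y)" strictly.  Returns true and fills
 * *out only if the whole string matches, including the closing
 * parenthesis, with nothing but whitespace after it.
 */
bool
complex_parse_strict(const char *str, Complex *out)
{
    double x;
    double y;
    int    end = -1;    /* stays -1 unless sscanf reaches the %n */

    assert(str != NULL && out != NULL);

    /* %n records how many characters were consumed; it is not counted
     * in the return value, so success still means "2". */
    if (sscanf(str, " ( %lf , %lf ) %n", &x, &y, &end) != 2)
        return false;

    /* the ")" was never matched, or something follows it */
    if (end < 0 || str[end] != '\0')
        return false;

    out->x = x;
    out->y = y;
    return true;
}
```

  <para>
   With this helper, <literal>" ( 1.5 , -2 ) "</literal> is accepted,
   while <literal>"(1,2"</literal> and <literal>"(1,2)xyz"</literal>
   are rejected.
  </para>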

  <para>
   Be careful to make the input and output functions inverses of
   each other.  If they are not, you will have severe problems when you
   need to dump your data into a file and then read it back in.  This
   is a particularly common problem when floating-point numbers are
   involved, because a short decimal rendering may not read back as
   the same value.
  </para>
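
  <para>
   The floating-point hazard is easy to reproduce outside the server.
   The following sketch uses a hypothetical helper,
   <literal>double_round_trips</literal>, that prints a
   <type>double</type> with a chosen number of significant digits,
   parses the text back, and compares the result with the original.
   It assumes IEEE 754 doubles and a correctly rounding
   <function>strtod</function>.
  </para>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: print v with the given number of significant
 * digits (as "%g" would, which defaults to 6), parse the text back,
 * and report whether the original value survived.
 */
bool
double_round_trips(int digits, double v)
{
    char buf[64];
    int  n = snprintf(buf, sizeof(buf), "%.*g", digits, v);

    assert(n > 0 && (size_t) n < sizeof(buf));   /* not truncated */
    return strtod(buf, NULL) == v;
}
```

  <para>
   A value such as 0.5 survives with 6 digits, but 1.0/3.0 does not;
   with 17 significant digits both do.  An output function that must
   be an exact inverse of its input function therefore needs more
   digits than a plain <literal>%g</literal> prints.
  </para>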

  <para>
   Optionally, a user-defined type can provide binary input and output
   routines.  Binary I/O is normally faster but less portable than textual
   I/O.  As with textual I/O, it is up to you to define exactly what the
   external binary representation is.  Most of the built-in data types
   try to provide a machine-independent binary representation.  For
   <type>complex</type>, we will piggy-back on the binary I/O converters
   for type <type>float8</>:

<programlisting>
PG_FUNCTION_INFO_V1(complex_recv);

Datum
complex_recv(PG_FUNCTION_ARGS)
{
    StringInfo  buf = (StringInfo) PG_GETARG_POINTER(0);
    Complex    *result;

    /* read the two components in the order complex_send wrote them */
    result = (Complex *) palloc(sizeof(Complex));
    result-&gt;x = pq_getmsgfloat8(buf);
    result-&gt;y = pq_getmsgfloat8(buf);
    PG_RETURN_POINTER(result);
}

PG_FUNCTION_INFO_V1(complex_send);

Datum
complex_send(PG_FUNCTION_ARGS)
{
    Complex    *complex = (Complex *) PG_GETARG_POINTER(0);
    StringInfoData buf;

    pq_begintypsend(&amp;buf);
    pq_sendfloat8(&amp;buf, complex-&gt;x);
    pq_sendfloat8(&amp;buf, complex-&gt;y);
    PG_RETURN_BYTEA_P(pq_endtypsend(&amp;buf));
}
</programlisting>
  </para>
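
  <para>
   To illustrate what a machine-independent binary representation
   means, the sketch below writes a <type>double</type> as eight bytes,
   most significant byte first, and reads it back.  It is a standalone
   illustration and not the implementation of the
   <function>pq_sendfloat8</function> or
   <function>pq_getmsgfloat8</function> routines; the names
   <literal>put_float8_be</literal> and
   <literal>get_float8_be</literal> are hypothetical, and the code
   assumes 64-bit IEEE 754 doubles.
  </para>

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Write v into out[0..7], most significant byte first. */
void
put_float8_be(unsigned char out[8], double v)
{
    uint64_t bits;
    int      i;

    assert(sizeof(bits) == sizeof(v));   /* assumes a 64-bit double */
    memcpy(&bits, &v, sizeof(bits));     /* reinterpret without aliasing */
    for (i = 0; i < 8; i++)
        out[i] = (unsigned char) (bits >> (56 - 8 * i));
}

/* Rebuild a double from eight bytes written by put_float8_be. */
double
get_float8_be(const unsigned char in[8])
{
    uint64_t bits = 0;
    double   v;
    int      i;

    for (i = 0; i < 8; i++)
        bits = (bits << 8) | in[i];
    memcpy(&v, &bits, sizeof(v));
    return v;
}
```

  <para>
   Because the byte order is fixed by the shifts rather than by the
   host's memory layout, the same eight bytes are produced on
   little-endian and big-endian machines.
  </para>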

  <para>
   To define the <type>complex</type> type, we need to create the
   user-defined I/O functions before creating the type:

<programlisting>
CREATE FUNCTION complex_in(cstring)
    RETURNS complex
    AS '<replaceable>filename</replaceable>'
    LANGUAGE C IMMUTABLE STRICT;

CREATE FUNCTION complex_out(complex)
    RETURNS cstring
    AS '<replaceable>filename</replaceable>'
    LANGUAGE C IMMUTABLE STRICT;

CREATE FUNCTION complex_recv(internal)
    RETURNS complex
    AS '<replaceable>filename</replaceable>'
    LANGUAGE C IMMUTABLE STRICT;

CREATE FUNCTION complex_send(complex)
    RETURNS bytea
    AS '<replaceable>filename</replaceable>'
    LANGUAGE C IMMUTABLE STRICT;
</programlisting>

   Notice that the declarations of the input and output functions must
   reference the not-yet-defined type.  This is allowed, but will draw
   warning messages that may be ignored.  The input function must
   appear first.
  </para>

  <para>
   Finally, we can declare the data type:
<programlisting>
CREATE TYPE complex (
   internallength = 16,
   input = complex_in,
   output = complex_out,
   receive = complex_recv,
   send = complex_send,
   alignment = double
);
</programlisting>
  </para>

  <para>
   When you define a new base type,
   <productname>PostgreSQL</productname> automatically provides support
   for arrays of that
   type.<indexterm><primary>array</primary><secondary>of user-defined
   type</secondary></indexterm>  For historical reasons, the array type
   has the same name as the base type with the underscore character
   (<literal>_</>) prepended, so the array type for
   <type>complex</type> is named <type>_complex</type>.
  </para>

  <para>
   Once the data type exists, we can declare additional functions to
   provide useful operations on the data type.  Operators can then be
   defined atop the functions, and if needed, operator classes can be
   created to support indexing of the data type.  These additional
   layers are discussed in the following sections.
  </para>

  <para>
   If the values of your data type might exceed a few hundred bytes in
   size (in internal form), you should make the data type
   TOAST-able.<indexterm><primary>TOAST</primary><secondary>and
   user-defined types</secondary></indexterm>  To do this, the internal
   representation must follow the standard layout for variable-length
   data: the first four bytes must be an <type>int32</type> containing
   the total length in bytes of the datum (including the length word
   itself).  The C functions operating on the data type must be careful
   to unpack any toasted values they are handed, by using
   <function>PG_DETOAST_DATUM</>.
   (This detail is customarily hidden by defining type-specific
   <function>GETARG</function> macros.)  Then,
   when running the <command>CREATE TYPE</command> command, specify the
   internal length as <literal>variable</> and select the appropriate
   storage option.
  </para>
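
  <para>
   The variable-length layout can be pictured with a small standalone
   C sketch: a four-byte length word that counts itself, followed
   directly by the payload.  The names <literal>VarText</literal>,
   <literal>vartext_make</literal> and
   <literal>vartext_payload_len</literal> are hypothetical.  Code
   running inside the server would allocate with
   <function>palloc</function> and use the server's own macros to set
   and read the length word rather than touching the field directly;
   <function>malloc</function> is used here only to keep the sketch
   self-contained.
  </para>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical variable-length value: length word first, then payload. */
typedef struct VarText
{
    int32_t total_len;   /* size of the whole value, header included */
    char    data[];      /* payload bytes follow immediately */
} VarText;

/* Build a value holding n payload bytes; returns NULL if out of memory. */
VarText *
vartext_make(const char *bytes, size_t n)
{
    size_t   total = offsetof(VarText, data) + n;
    VarText *v = malloc(total);

    if (v == NULL)
        return NULL;
    v->total_len = (int32_t) total;      /* the length counts itself */
    memcpy(v->data, bytes, n);
    return v;
}

/* Payload size is the total length minus the four-byte header. */
size_t
vartext_payload_len(const VarText *v)
{
    assert(v->total_len >= (int32_t) offsetof(VarText, data));
    return (size_t) v->total_len - offsetof(VarText, data);
}
```

  <para>
   A three-byte payload therefore yields a total length of 7: four
   bytes for the length word plus three bytes of data.
  </para>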

  <para>
   For further details see the description of the
   <xref linkend="sql-createtype" endterm="sql-createtype-title"> command.
  </para>
 </sect1>

<!-- Keep this comment at the end of the file
Local variables:
mode:sgml
sgml-omittag:nil
sgml-shorttag:t
sgml-minimize-attributes:nil
sgml-always-quote-attributes:t
sgml-indent-step:1
sgml-indent-data:t
sgml-parent-document:nil
sgml-default-dtd-file:"./reference.ced"
sgml-exposed-tags:nil
sgml-local-catalogs:("/usr/lib/sgml/catalog")
sgml-local-ecat-files:nil
End:
-->