postgresql/doc/src/sgml/ref/create_type.sgml

541 lines
20 KiB
Plaintext

<!--
$PostgreSQL: pgsql/doc/src/sgml/ref/create_type.sgml,v 1.48 2003/11/29 19:51:38 pgsql Exp $
PostgreSQL documentation
-->
<refentry id="SQL-CREATETYPE">
<refmeta>
<refentrytitle id="sql-createtype-title">CREATE TYPE</refentrytitle>
<refmiscinfo>SQL - Language Statements</refmiscinfo>
</refmeta>
<refnamediv>
<refname>CREATE TYPE</refname>
<refpurpose>define a new data type</refpurpose>
</refnamediv>
<indexterm zone="sql-createtype">
<primary>CREATE TYPE</primary>
</indexterm>
<refsynopsisdiv>
<synopsis>
CREATE TYPE <replaceable class="parameter">name</replaceable> AS
( <replaceable class="PARAMETER">attribute_name</replaceable> <replaceable class="PARAMETER">data_type</replaceable> [, ... ] )
CREATE TYPE <replaceable class="parameter">name</replaceable> (
INPUT = <replaceable class="parameter">input_function</replaceable>,
OUTPUT = <replaceable class="parameter">output_function</replaceable>
[ , RECEIVE = <replaceable class="parameter">receive_function</replaceable> ]
[ , SEND = <replaceable class="parameter">send_function</replaceable> ]
[ , INTERNALLENGTH = { <replaceable class="parameter">internallength</replaceable> | VARIABLE } ]
[ , PASSEDBYVALUE ]
[ , ALIGNMENT = <replaceable class="parameter">alignment</replaceable> ]
[ , STORAGE = <replaceable class="parameter">storage</replaceable> ]
[ , DEFAULT = <replaceable class="parameter">default</replaceable> ]
[ , ELEMENT = <replaceable class="parameter">element</replaceable> ]
[ , DELIMITER = <replaceable class="parameter">delimiter</replaceable> ]
)
</synopsis>
</refsynopsisdiv>
<refsect1>
<title>Description</title>
<para>
<command>CREATE TYPE</command> registers a new data type for use in
the current data base. The user who defines a type becomes its
owner.
</para>
<para>
If a schema name is given then the type is created in the specified
schema. Otherwise it is created in the current schema. The type
name must be distinct from the name of any existing type or domain
in the same schema. (Because tables have associated data types,
the type name must also be distinct from the name of any existing
table in the same schema.)
</para>
<refsect2>
<title>Composite Types</title>
<para>
The first form of <command>CREATE TYPE</command>
creates a composite type.
The composite type is specified by a list of attribute names and data types.
This is essentially the same as the row type
of a table, but using <command>CREATE TYPE</command> avoids the need to
create an actual table when all that is wanted is to define a type.
A stand-alone composite type is useful as the return type of a function.
</para>
</refsect2>
<refsect2>
<title>Base Types</title>
<para>
The second form of <command>CREATE TYPE</command> creates a new base type
(scalar type). The parameters may appear in any order, not only that
illustrated above, and most are optional. You must register
two or more functions (using <command>CREATE FUNCTION</command>) before
defining the type. The support functions
<replaceable class="parameter">input_function</replaceable> and
<replaceable class="parameter">output_function</replaceable>
are required, while the functions
<replaceable class="parameter">receive_function</replaceable> and
<replaceable class="parameter">send_function</replaceable>
are optional. Generally these functions have to be coded in C
or another low-level language.
</para>
<para>
The <replaceable class="parameter">input_function</replaceable>
converts the type's external textual representation to the internal
representation used by the
operators and functions defined for the type.
<replaceable class="parameter">output_function</replaceable>
performs the reverse transformation. The input function may be
declared as taking one argument of type <type>cstring</type>,
or as taking three arguments of types
<type>cstring</type>, <type>oid</type>, <type>integer</type>.
The first argument is the input text as a C string, the second
argument is the element type in case this is an array type,
and the third is the <literal>typmod</> of the destination column, if known.
The input function should return a value of the data type itself.
The output function may be
declared as taking one argument of the new data type, or as taking
two arguments of which the second is type <type>oid</type>.
The second argument is again the array element type for array types.
The output function should return type <type>cstring</type>.
</para>
<para>
The optional <replaceable class="parameter">receive_function</replaceable>
converts the type's external binary representation to the internal
representation. If this function is not supplied, the type cannot
participate in binary input. The binary representation should be
chosen to be cheap to convert to internal form, while being reasonably
portable. (For example, the standard integer data types use network
byte order as the external binary representation, while the internal
representation is in the machine's native byte order.) The receive
function should perform adequate checking to ensure that the value is
valid.
The receive function may be declared as taking one argument of type
<type>internal</type>, or two arguments of types <type>internal</type>
and <type>oid</type>. It must return a value of the data type itself.
(The first argument is a pointer to a <type>StringInfo</type> buffer
holding the received byte string; the optional second argument is the
element type in case this is an array type.) Similarly, the optional
<replaceable class="parameter">send_function</replaceable> converts
from the internal representation to the external binary representation.
If this function is not supplied, the type cannot participate in binary
output. The send function may be
declared as taking one argument of the new data type, or as taking
two arguments of which the second is type <type>oid</type>.
The second argument is again the array element type for array types.
The send function must return type <type>bytea</type>.
</para>
<para>
You should at this point be wondering how the input and output functions
can be declared to have results or arguments of the new type, when they have
to be created before the new type can be created. The answer is that the
input function must be created first, then the output function (and
the binary I/O functions if wanted), and finally the data type.
<productname>PostgreSQL</productname> will first see the name of the new
data type as the return type of the input function. It will create a
<quote>shell</> type, which is simply a placeholder entry in
the system catalog, and link the input function definition to the shell
type. Similarly the other functions will be linked to the (now already
existing) shell type. Finally, <command>CREATE TYPE</> replaces the
shell entry with a complete type definition, and the new type can be used.
</para>
<para>
While the details of the new type's internal representation are only
known to the I/O functions and other functions you create to work with
the type, there are several properties of the internal representation
that must be declared to <productname>PostgreSQL</productname>.
Foremost of these is
<replaceable class="parameter">internallength</replaceable>.
Base data types can be fixed-length, in which case
<replaceable class="parameter">internallength</replaceable> is a
positive integer, or variable length, indicated by setting
<replaceable class="parameter">internallength</replaceable>
to <literal>VARIABLE</literal>. (Internally, this is represented
by setting <literal>typlen</> to -1.) The internal representation of all
variable-length types must start with a 4-byte integer giving the total
length of this value of the type.
</para>
<para>
The optional flag <literal>PASSEDBYVALUE</literal> indicates that
values of this data type are passed by value, rather than by
reference. You may not pass by value types whose internal
representation is larger than the size of the <type>Datum</> type
(4 bytes on most machines, 8 bytes on a few).
</para>
<para>
The <replaceable class="parameter">alignment</replaceable> parameter
specifies the storage alignment required for the data type. The
allowed values equate to alignment on 1, 2, 4, or 8 byte boundaries.
Note that variable-length types must have an alignment of at least
4, since they necessarily contain an <type>int4</> as their first component.
</para>
<para>
The <replaceable class="parameter">storage</replaceable> parameter
allows selection of storage strategies for variable-length data
types. (Only <literal>plain</literal> is allowed for fixed-length
types.) <literal>plain</literal> specifies that data of the type
will always be stored in-line and not compressed.
<literal>extended</literal> specifies that the system will first
try to compress a long data value, and will move the value out of
the main table row if it's still too long.
<literal>external</literal> allows the value to be moved out of the
main table, but the system will not try to compress it.
<literal>main</literal> allows compression, but discourages moving
the value out of the main table. (Data items with this storage
strategy may still be moved out of the main table if there is no
other way to make a row fit, but they will be kept in the main
table preferentially over <literal>extended</literal> and
<literal>external</literal> items.)
</para>
<para>
A default value may be specified, in case a user wants columns of the
data type to default to something other than the null value.
Specify the default with the <literal>DEFAULT</literal> key word.
(Such a default may be overridden by an explicit <literal>DEFAULT</literal>
clause attached to a particular column.)
</para>
<para>
To indicate that a type is an array, specify the type of the array
elements using the <literal>ELEMENT</> key word. For example, to
define an array of 4-byte integers (<type>int4</type>), specify
<literal>ELEMENT = int4</literal>. More details about array types
appear below.
</para>
<para>
To indicate the delimiter to be used between values in the external
representation of arrays of this type, <replaceable
class="parameter">delimiter</replaceable> can be
set to a specific character. The default delimiter is the comma
(<literal>,</literal>). Note that the delimiter is associated
with the array element type, not the array type itself.
</para>
</refsect2>
<refsect2>
<title>Array Types</title>
<para>
Whenever a user-defined base data type is created,
<productname>PostgreSQL</productname> automatically creates an
associated array type, whose name consists of the base type's
name prepended with an underscore. The parser understands this
naming convention, and translates requests for columns of type
<literal>foo[]</> into requests for type <literal>_foo</>.
The implicitly-created array type is variable length and uses the
built-in input and output functions <literal>array_in</> and
<literal>array_out</>.
</para>
<para>
You might reasonably ask why there is an <option>ELEMENT</>
option, if the system makes the correct array type automatically.
The only case where it's useful to use <option>ELEMENT</> is when you are
making a fixed-length type that happens to be internally an array of a number of
identical things, and you want to allow these things to be accessed
directly by subscripting, in addition to whatever operations you plan
to provide for the type as a whole. For example, type <type>name</>
allows its constituent <type>char</> elements to be accessed this way.
A 2-D <type>point</> type could allow its two component numbers to be
accessed like <literal>point[0]</> and <literal>point[1]</>.
Note that
this facility only works for fixed-length types whose internal form
is exactly a sequence of identical fixed-length fields. A subscriptable
variable-length type must have the generalized internal representation
used by <literal>array_in</> and <literal>array_out</>.
For historical reasons (i.e., this is clearly wrong but it's far too
late to change it), subscripting of fixed-length array types starts from
zero, rather than from one as for variable-length arrays.
</para>
</refsect2>
</refsect1>
<refsect1>
<title>Parameters</title>
<variablelist>
<varlistentry>
<term><replaceable class="parameter">name</replaceable></term>
<listitem>
<para>
The name (optionally schema-qualified) of a type to be created.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><replaceable class="parameter">attribute_name</replaceable></term>
<listitem>
<para>
The name of an attribute (column) for the composite type.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><replaceable class="parameter">data_type</replaceable></term>
<listitem>
<para>
The name of an existing data type to become a column of the
composite type.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><replaceable class="parameter">input_function</replaceable></term>
<listitem>
<para>
The name of a function that converts data from the type's
external textual form to its internal form.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><replaceable class="parameter">output_function</replaceable></term>
<listitem>
<para>
The name of a function that converts data from the type's
internal form to its external textual form.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><replaceable class="parameter">receive_function</replaceable></term>
<listitem>
<para>
The name of a function that converts data from the type's
external binary form to its internal form.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><replaceable class="parameter">send_function</replaceable></term>
<listitem>
<para>
The name of a function that converts data from the type's
internal form to its external binary form.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><replaceable class="parameter">internallength</replaceable></term>
<listitem>
<para>
A numeric constant that specifies the length in bytes of the new
type's internal representation. The default assumption is that
it is variable-length.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><replaceable class="parameter">alignment</replaceable></term>
<listitem>
<para>
The storage alignment requirement of the data type. If specified,
it must be <literal>char</literal>, <literal>int2</literal>,
<literal>int4</literal>, or <literal>double</literal>; the
default is <literal>int4</literal>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><replaceable class="parameter">storage</replaceable></term>
<listitem>
<para>
The storage strategy for the data type. If specified, must be
<literal>plain</literal>, <literal>external</literal>,
<literal>extended</literal>, or <literal>main</literal>; the
default is <literal>plain</literal>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><replaceable class="parameter">default</replaceable></term>
<listitem>
<para>
The default value for the data type. If this is omitted, the
default is null.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><replaceable class="parameter">element</replaceable></term>
<listitem>
<para>
The type being created is an array; this specifies the type of
the array elements.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><replaceable class="parameter">delimiter</replaceable></term>
<listitem>
<para>
The delimiter character to be used between values in arrays made
of this type.
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
<refsect1 id="SQL-CREATETYPE-notes">
<title>Notes</title>
<para>
User-defined type names cannot begin with the underscore character
(<literal>_</literal>) and can only be 62 characters
long (or in general <symbol>NAMEDATALEN</symbol> - 2, rather than
the <symbol>NAMEDATALEN</symbol> - 1 characters allowed for other
names). Type names beginning with underscore are reserved for
internally-created array type names.
</para>
<para>
In <productname>PostgreSQL</productname> versions before 7.3, it
was customary to avoid creating a shell type by replacing the
functions' forward references to the type name with the placeholder
pseudotype <type>opaque</>. The <type>cstring</> arguments and
results also had to be declared as <type>opaque</> before 7.3. To
support loading of old dump files, <command>CREATE TYPE</> will
accept functions declared using <type>opaque</>, but it will issue
a notice and change the function's declaration to use the correct
types.
</para>
</refsect1>
<refsect1>
<title>Examples</title>
<para>
This example creates a composite type and uses it in
a function definition:
<programlisting>
CREATE TYPE compfoo AS (f1 int, f2 text);
CREATE FUNCTION getfoo() RETURNS SETOF compfoo AS
'SELECT fooid, fooname FROM foo' LANGUAGE SQL;
</programlisting>
</para>
<para>
This example creates the base data type <type>box</type> and then uses the
type in a table definition:
<programlisting>
CREATE TYPE box (
INTERNALLENGTH = 16,
INPUT = my_box_in_function,
OUTPUT = my_box_out_function
);
CREATE TABLE myboxes (
id integer,
description box
);
</programlisting>
</para>
<para>
If the internal structure of <type>box</type> were an array of four
<type>float4</> elements, we might instead use
<programlisting>
CREATE TYPE box (
INTERNALLENGTH = 16,
INPUT = my_box_in_function,
OUTPUT = my_box_out_function,
ELEMENT = float4
);
</programlisting>
which would allow a box value's component numbers to be accessed
by subscripting. Otherwise the type behaves the same as before.
</para>
<para>
This example creates a large object type and uses it in
a table definition:
<programlisting>
CREATE TYPE bigobj (
INPUT = lo_filein, OUTPUT = lo_fileout,
INTERNALLENGTH = VARIABLE
);
CREATE TABLE big_objs (
id integer,
obj bigobj
);
</programlisting>
</para>
<para>
More examples, including suitable input and output functions, are
in <xref linkend="extend">.
</para>
</refsect1>
<refsect1 id="SQL-CREATETYPE-compatibility">
<title>Compatibility</title>
<para>
This <command>CREATE TYPE</command> command is a
<productname>PostgreSQL</productname> extension. There is a
<command>CREATE TYPE</command> statement in SQL99 that is rather
different in detail.
</para>
</refsect1>
<refsect1 id="SQL-CREATETYPE-see-also">
<title>See Also</title>
<simplelist type="inline">
<member><xref linkend="sql-createfunction" endterm="sql-createfunction-title"></member>
<member><xref linkend="sql-droptype" endterm="sql-droptype-title"></member>
</simplelist>
</refsect1>
</refentry>
<!-- Keep this comment at the end of the file
Local variables:
mode: sgml
sgml-omittag:nil
sgml-shorttag:t
sgml-minimize-attributes:nil
sgml-always-quote-attributes:t
sgml-indent-step:1
sgml-indent-data:t
sgml-parent-document:nil
sgml-default-dtd-file:"../reference.ced"
sgml-exposed-tags:nil
sgml-local-catalogs:"/usr/lib/sgml/catalog"
sgml-local-ecat-files:nil
End:
-->