541 lines
21 KiB
Plaintext
541 lines
21 KiB
Plaintext
<!-- doc/src/sgml/rowtypes.sgml -->
|
|
|
|
<sect1 id="rowtypes">
|
|
<title>Composite Types</title>
|
|
|
|
<indexterm>
|
|
<primary>composite type</primary>
|
|
</indexterm>
|
|
|
|
<indexterm>
|
|
<primary>row type</primary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
A <firstterm>composite type</firstterm> represents the structure of a row or record;
|
|
it is essentially just a list of field names and their data types.
|
|
<productname>PostgreSQL</productname> allows composite types to be
|
|
used in many of the same ways that simple types can be used. For example, a
|
|
column of a table can be declared to be of a composite type.
|
|
</para>
|
|
|
|
<sect2 id="rowtypes-declaring">
|
|
<title>Declaration of Composite Types</title>
|
|
|
|
<para>
|
|
Here are two simple examples of defining composite types:
|
|
<programlisting>
|
|
CREATE TYPE complex AS (
|
|
r double precision,
|
|
i double precision
|
|
);
|
|
|
|
CREATE TYPE inventory_item AS (
|
|
name text,
|
|
supplier_id integer,
|
|
price numeric
|
|
);
|
|
</programlisting>
|
|
The syntax is comparable to <command>CREATE TABLE</command>, except that only
|
|
field names and types can be specified; no constraints (such as <literal>NOT
|
|
NULL</literal>) can presently be included. Note that the <literal>AS</literal> keyword
|
|
is essential; without it, the system will think a different kind
|
|
of <command>CREATE TYPE</command> command is meant, and you will get odd syntax
|
|
errors.
|
|
</para>
|
|
|
|
<para>
|
|
Having defined the types, we can use them to create tables:
|
|
|
|
<programlisting>
|
|
CREATE TABLE on_hand (
|
|
item inventory_item,
|
|
count integer
|
|
);
|
|
|
|
INSERT INTO on_hand VALUES (ROW('fuzzy dice', 42, 1.99), 1000);
|
|
</programlisting>
|
|
|
|
or functions:
|
|
|
|
<programlisting>
|
|
CREATE FUNCTION price_extension(inventory_item, integer) RETURNS numeric
|
|
AS 'SELECT $1.price * $2' LANGUAGE SQL;
|
|
|
|
SELECT price_extension(item, 10) FROM on_hand;
|
|
</programlisting>
|
|
|
|
</para>
|
|
|
|
<para>
|
|
Whenever you create a table, a composite type is also automatically
|
|
created, with the same name as the table, to represent the table's
|
|
row type. For example, had we said:
|
|
<programlisting>
|
|
CREATE TABLE inventory_item (
|
|
name text,
|
|
supplier_id integer REFERENCES suppliers,
|
|
price numeric CHECK (price > 0)
|
|
);
|
|
</programlisting>
|
|
then the same <literal>inventory_item</literal> composite type shown above would
|
|
come into being as a
|
|
byproduct, and could be used just as above. Note however an important
|
|
restriction of the current implementation: since no constraints are
|
|
associated with a composite type, the constraints shown in the table
|
|
definition <emphasis>do not apply</emphasis> to values of the composite type
|
|
outside the table. (To work around this, create a
|
|
<glossterm linkend="glossary-domain">domain</glossterm> over the composite
|
|
type, and apply the desired constraints as <literal>CHECK</literal>
|
|
constraints of the domain.)
|
|
</para>
|
|
</sect2>
|
|
|
|
<sect2 id="rowtypes-constructing">
|
|
<title>Constructing Composite Values</title>
|
|
|
|
<indexterm>
|
|
<primary>composite type</primary>
|
|
<secondary>constant</secondary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
To write a composite value as a literal constant, enclose the field
|
|
values within parentheses and separate them by commas. You can put double
|
|
quotes around any field value, and must do so if it contains commas or
|
|
parentheses. (More details appear <link
|
|
linkend="rowtypes-io-syntax">below</link>.) Thus, the general format of
|
|
a composite constant is the following:
|
|
<synopsis>
|
|
'( <replaceable>val1</replaceable> , <replaceable>val2</replaceable> , ... )'
|
|
</synopsis>
|
|
An example is:
|
|
<programlisting>
|
|
'("fuzzy dice",42,1.99)'
|
|
</programlisting>
|
|
which would be a valid value of the <literal>inventory_item</literal> type
|
|
defined above. To make a field be NULL, write no characters at all
|
|
in its position in the list. For example, this constant specifies
|
|
a NULL third field:
|
|
<programlisting>
|
|
'("fuzzy dice",42,)'
|
|
</programlisting>
|
|
If you want an empty string rather than NULL, write double quotes:
|
|
<programlisting>
|
|
'("",42,)'
|
|
</programlisting>
|
|
Here the first field is a non-NULL empty string, the third is NULL.
|
|
</para>
|
|
|
|
<para>
|
|
(These constants are actually only a special case of
|
|
the generic type constants discussed in <xref
|
|
linkend="sql-syntax-constants-generic"/>. The constant is initially
|
|
treated as a string and passed to the composite-type input conversion
|
|
routine. An explicit type specification might be necessary to tell
|
|
which type to convert the constant to.)
|
|
</para>
|
|
|
|
<para>
|
|
The <literal>ROW</literal> expression syntax can also be used to
|
|
construct composite values. In most cases this is considerably
|
|
simpler to use than the string-literal syntax since you don't have
|
|
to worry about multiple layers of quoting. We already used this
|
|
method above:
|
|
<programlisting>
|
|
ROW('fuzzy dice', 42, 1.99)
|
|
ROW('', 42, NULL)
|
|
</programlisting>
|
|
The ROW keyword is actually optional as long as you have more than one
|
|
field in the expression, so these can be simplified to:
|
|
<programlisting>
|
|
('fuzzy dice', 42, 1.99)
|
|
('', 42, NULL)
|
|
</programlisting>
|
|
The <literal>ROW</literal> expression syntax is discussed in more detail in <xref
|
|
linkend="sql-syntax-row-constructors"/>.
|
|
</para>
|
|
</sect2>
|
|
|
|
<sect2 id="rowtypes-accessing">
|
|
<title>Accessing Composite Types</title>
|
|
|
|
<para>
|
|
To access a field of a composite column, one writes a dot and the field
|
|
name, much like selecting a field from a table name. In fact, it's so
|
|
much like selecting from a table name that you often have to use parentheses
|
|
to keep from confusing the parser. For example, you might try to select
|
|
some subfields from our <literal>on_hand</literal> example table with something
|
|
like:
|
|
|
|
<programlisting>
|
|
SELECT item.name FROM on_hand WHERE item.price > 9.99;
|
|
</programlisting>
|
|
|
|
This will not work since the name <literal>item</literal> is taken to be a table
|
|
name, not a column name of <literal>on_hand</literal>, per SQL syntax rules.
|
|
You must write it like this:
|
|
|
|
<programlisting>
|
|
SELECT (item).name FROM on_hand WHERE (item).price > 9.99;
|
|
</programlisting>
|
|
|
|
or if you need to use the table name as well (for instance in a multitable
|
|
query), like this:
|
|
|
|
<programlisting>
|
|
SELECT (on_hand.item).name FROM on_hand WHERE (on_hand.item).price > 9.99;
|
|
</programlisting>
|
|
|
|
Now the parenthesized object is correctly interpreted as a reference to
|
|
the <literal>item</literal> column, and then the subfield can be selected from it.
|
|
</para>
|
|
|
|
<para>
|
|
Similar syntactic issues apply whenever you select a field from a composite
|
|
value. For instance, to select just one field from the result of a function
|
|
that returns a composite value, you'd need to write something like:
|
|
|
|
<programlisting>
|
|
SELECT (my_func(...)).field FROM ...
|
|
</programlisting>
|
|
|
|
Without the extra parentheses, this will generate a syntax error.
|
|
</para>
|
|
|
|
<para>
|
|
The special field name <literal>*</literal> means <quote>all fields</quote>, as
|
|
further explained in <xref linkend="rowtypes-usage"/>.
|
|
</para>
|
|
</sect2>
|
|
|
|
<sect2 id="rowtypes-modifying">
|
|
<title>Modifying Composite Types</title>
|
|
|
|
<para>
|
|
Here are some examples of the proper syntax for inserting and updating
|
|
composite columns.
|
|
First, inserting or updating a whole column:
|
|
|
|
<programlisting>
|
|
INSERT INTO mytab (complex_col) VALUES((1.1,2.2));
|
|
|
|
UPDATE mytab SET complex_col = ROW(1.1,2.2) WHERE ...;
|
|
</programlisting>
|
|
|
|
The first example omits <literal>ROW</literal>, the second uses it; we
|
|
could have done it either way.
|
|
</para>
|
|
|
|
<para>
|
|
We can update an individual subfield of a composite column:
|
|
|
|
<programlisting>
|
|
UPDATE mytab SET complex_col.r = (complex_col).r + 1 WHERE ...;
|
|
</programlisting>
|
|
|
|
Notice here that we don't need to (and indeed cannot)
|
|
put parentheses around the column name appearing just after
|
|
<literal>SET</literal>, but we do need parentheses when referencing the same
|
|
column in the expression to the right of the equal sign.
|
|
</para>
|
|
|
|
<para>
|
|
And we can specify subfields as targets for <command>INSERT</command>, too:
|
|
|
|
<programlisting>
|
|
INSERT INTO mytab (complex_col.r, complex_col.i) VALUES(1.1, 2.2);
|
|
</programlisting>
|
|
|
|
Had we not supplied values for all the subfields of the column, the
|
|
remaining subfields would have been filled with null values.
|
|
</para>
|
|
</sect2>
|
|
|
|
<sect2 id="rowtypes-usage">
|
|
<title>Using Composite Types in Queries</title>
|
|
|
|
<para>
|
|
There are various special syntax rules and behaviors associated with
|
|
composite types in queries. These rules provide useful shortcuts,
|
|
but can be confusing if you don't know the logic behind them.
|
|
</para>
|
|
|
|
<para>
|
|
In <productname>PostgreSQL</productname>, a reference to a table name (or alias)
|
|
in a query is effectively a reference to the composite value of the
|
|
table's current row. For example, if we had a table
|
|
<structname>inventory_item</structname> as shown
|
|
<link linkend="rowtypes-declaring">above</link>, we could write:
|
|
<programlisting>
|
|
SELECT c FROM inventory_item c;
|
|
</programlisting>
|
|
This query produces a single composite-valued column, so we might get
|
|
output like:
|
|
<programlisting>
|
|
c
|
|
------------------------
|
|
("fuzzy dice",42,1.99)
|
|
(1 row)
|
|
</programlisting>
|
|
Note however that simple names are matched to column names before table
|
|
names, so this example works only because there is no column
|
|
named <structfield>c</structfield> in the query's tables.
|
|
</para>
|
|
|
|
<para>
|
|
The ordinary qualified-column-name
|
|
syntax <replaceable>table_name</replaceable><literal>.</literal><replaceable>column_name</replaceable>
|
|
can be understood as applying <link linkend="field-selection">field
|
|
selection</link> to the composite value of the table's current row.
|
|
(For efficiency reasons, it's not actually implemented that way.)
|
|
</para>
|
|
|
|
<para>
|
|
When we write
|
|
<programlisting>
|
|
SELECT c.* FROM inventory_item c;
|
|
</programlisting>
|
|
then, according to the SQL standard, we should get the contents of the
|
|
table expanded into separate columns:
|
|
<programlisting>
|
|
name | supplier_id | price
|
|
------------+-------------+-------
|
|
fuzzy dice | 42 | 1.99
|
|
(1 row)
|
|
</programlisting>
|
|
as if the query were
|
|
<programlisting>
|
|
SELECT c.name, c.supplier_id, c.price FROM inventory_item c;
|
|
</programlisting>
|
|
<productname>PostgreSQL</productname> will apply this expansion behavior to
|
|
any composite-valued expression, although as shown <link
|
|
linkend="rowtypes-accessing">above</link>, you need to write parentheses
|
|
around the value that <literal>.*</literal> is applied to whenever it's not a
|
|
simple table name. For example, if <function>myfunc()</function> is a function
|
|
returning a composite type with columns <structfield>a</structfield>,
|
|
<structfield>b</structfield>, and <structfield>c</structfield>, then these two queries have the
|
|
same result:
|
|
<programlisting>
|
|
SELECT (myfunc(x)).* FROM some_table;
|
|
SELECT (myfunc(x)).a, (myfunc(x)).b, (myfunc(x)).c FROM some_table;
|
|
</programlisting>
|
|
</para>
|
|
|
|
<tip>
|
|
<para>
|
|
<productname>PostgreSQL</productname> handles column expansion by
|
|
actually transforming the first form into the second. So, in this
|
|
example, <function>myfunc()</function> would get invoked three times per row
|
|
with either syntax. If it's an expensive function you may wish to
|
|
avoid that, which you can do with a query like:
|
|
<programlisting>
|
|
SELECT m.* FROM some_table, LATERAL myfunc(x) AS m;
|
|
</programlisting>
|
|
Placing the function in
|
|
a <literal>LATERAL</literal> <literal>FROM</literal> item keeps it from
|
|
being invoked more than once per row. <literal>m.*</literal> is still
|
|
expanded into <literal>m.a, m.b, m.c</literal>, but now those variables
|
|
are just references to the output of the <literal>FROM</literal> item.
|
|
(The <literal>LATERAL</literal> keyword is optional here, but we show it
|
|
to clarify that the function is getting <structfield>x</structfield>
|
|
from <structname>some_table</structname>.)
|
|
</para>
|
|
</tip>
|
|
|
|
<para>
|
|
The <replaceable>composite_value</replaceable><literal>.*</literal> syntax results in
|
|
column expansion of this kind when it appears at the top level of
|
|
a <link linkend="queries-select-lists"><command>SELECT</command> output
|
|
list</link>, a <link linkend="dml-returning"><literal>RETURNING</literal>
|
|
list</link> in <command>INSERT</command>/<command>UPDATE</command>/<command>DELETE</command>/<command>MERGE</command>,
|
|
a <link linkend="queries-values"><literal>VALUES</literal> clause</link>, or
|
|
a <link linkend="sql-syntax-row-constructors">row constructor</link>.
|
|
In all other contexts (including when nested inside one of those
|
|
constructs), attaching <literal>.*</literal> to a composite value does not
|
|
change the value, since it means <quote>all columns</quote> and so the
|
|
same composite value is produced again. For example,
|
|
if <function>somefunc()</function> accepts a composite-valued argument,
|
|
these queries are the same:
|
|
|
|
<programlisting>
|
|
SELECT somefunc(c.*) FROM inventory_item c;
|
|
SELECT somefunc(c) FROM inventory_item c;
|
|
</programlisting>
|
|
|
|
In both cases, the current row of <structname>inventory_item</structname> is
|
|
passed to the function as a single composite-valued argument.
|
|
Even though <literal>.*</literal> does nothing in such cases, using it is good
|
|
style, since it makes clear that a composite value is intended. In
|
|
particular, the parser will consider <literal>c</literal> in <literal>c.*</literal> to
|
|
refer to a table name or alias, not to a column name, so that there is
|
|
no ambiguity; whereas without <literal>.*</literal>, it is not clear
|
|
whether <literal>c</literal> means a table name or a column name, and in fact
|
|
the column-name interpretation will be preferred if there is a column
|
|
named <literal>c</literal>.
|
|
</para>
|
|
|
|
<para>
|
|
Another example demonstrating these concepts is that all these queries
|
|
mean the same thing:
|
|
<programlisting>
|
|
SELECT * FROM inventory_item c ORDER BY c;
|
|
SELECT * FROM inventory_item c ORDER BY c.*;
|
|
SELECT * FROM inventory_item c ORDER BY ROW(c.*);
|
|
</programlisting>
|
|
All of these <literal>ORDER BY</literal> clauses specify the row's composite
|
|
value, resulting in sorting the rows according to the rules described
|
|
in <xref linkend="composite-type-comparison"/>. However,
|
|
if <structname>inventory_item</structname> contained a column
|
|
named <structfield>c</structfield>, the first case would be different from the
|
|
others, as it would mean to sort by that column only. Given the column
|
|
names previously shown, these queries are also equivalent to those above:
|
|
<programlisting>
|
|
SELECT * FROM inventory_item c ORDER BY ROW(c.name, c.supplier_id, c.price);
|
|
SELECT * FROM inventory_item c ORDER BY (c.name, c.supplier_id, c.price);
|
|
</programlisting>
|
|
(The last case uses a row constructor with the key word <literal>ROW</literal>
|
|
omitted.)
|
|
</para>
|
|
|
|
<para>
|
|
Another special syntactical behavior associated with composite values is
|
|
that we can use <firstterm>functional notation</firstterm> for extracting a field
|
|
of a composite value. The simple way to explain this is that
|
|
the notations <literal><replaceable>field</replaceable>(<replaceable>table</replaceable>)</literal>
|
|
and <literal><replaceable>table</replaceable>.<replaceable>field</replaceable></literal>
|
|
are interchangeable. For example, these queries are equivalent:
|
|
|
|
<programlisting>
|
|
SELECT c.name FROM inventory_item c WHERE c.price > 1000;
|
|
SELECT name(c) FROM inventory_item c WHERE price(c) > 1000;
|
|
</programlisting>
|
|
|
|
Moreover, if we have a function that accepts a single argument of a
|
|
composite type, we can call it with either notation. These queries are
|
|
all equivalent:
|
|
|
|
<programlisting>
|
|
SELECT somefunc(c) FROM inventory_item c;
|
|
SELECT somefunc(c.*) FROM inventory_item c;
|
|
SELECT c.somefunc FROM inventory_item c;
|
|
</programlisting>
|
|
</para>
|
|
|
|
<para>
|
|
This equivalence between functional notation and field notation
|
|
makes it possible to use functions on composite types to implement
|
|
<quote>computed fields</quote>.
|
|
<indexterm>
|
|
<primary>computed field</primary>
|
|
</indexterm>
|
|
<indexterm>
|
|
<primary>field</primary>
|
|
<secondary>computed</secondary>
|
|
</indexterm>
|
|
An application using the last query above wouldn't need to be directly
|
|
aware that <literal>somefunc</literal> isn't a real column of the table.
|
|
</para>
|
|
|
|
<tip>
|
|
<para>
|
|
Because of this behavior, it's unwise to give a function that takes a
|
|
single composite-type argument the same name as any of the fields of
|
|
that composite type. If there is ambiguity, the field-name
|
|
interpretation will be chosen if field-name syntax is used, while the
|
|
function will be chosen if function-call syntax is used. However,
|
|
<productname>PostgreSQL</productname> versions before 11 always chose the
|
|
field-name interpretation, unless the syntax of the call required it to
|
|
be a function call. One way to force the function interpretation in
|
|
older versions is to schema-qualify the function name, that is, write
|
|
<literal><replaceable>schema</replaceable>.<replaceable>func</replaceable>(<replaceable>compositevalue</replaceable>)</literal>.
|
|
</para>
|
|
</tip>
|
|
</sect2>
|
|
|
|
<sect2 id="rowtypes-io-syntax">
|
|
<title>Composite Type Input and Output Syntax</title>
|
|
|
|
<para>
|
|
The external text representation of a composite value consists of items that
|
|
are interpreted according to the I/O conversion rules for the individual
|
|
field types, plus decoration that indicates the composite structure.
|
|
The decoration consists of parentheses (<literal>(</literal> and <literal>)</literal>)
|
|
around the whole value, plus commas (<literal>,</literal>) between adjacent
|
|
items. Whitespace outside the parentheses is ignored, but within the
|
|
parentheses it is considered part of the field value, and might or might not be
|
|
significant depending on the input conversion rules for the field data type.
|
|
For example, in:
|
|
<programlisting>
|
|
'( 42)'
|
|
</programlisting>
|
|
the whitespace will be ignored if the field type is integer, but not if
|
|
it is text.
|
|
</para>
|
|
|
|
<para>
|
|
As shown previously, when writing a composite value you can write double
|
|
quotes around any individual field value.
|
|
You <emphasis>must</emphasis> do so if the field value would otherwise
|
|
confuse the composite-value parser. In particular, fields containing
|
|
parentheses, commas, double quotes, or backslashes must be double-quoted.
|
|
To put a double quote or backslash in a quoted composite field value,
|
|
precede it with a backslash. (Also, a pair of double quotes within a
|
|
double-quoted field value is taken to represent a double quote character,
|
|
analogously to the rules for single quotes in SQL literal strings.)
|
|
Alternatively, you can avoid quoting and use backslash-escaping to
|
|
protect all data characters
|
|
that would otherwise be taken as composite syntax.
|
|
</para>
|
|
|
|
<para>
|
|
A completely empty field value (no characters at all between the commas
|
|
or parentheses) represents a NULL. To write a value that is an empty
|
|
string rather than NULL, write <literal>""</literal>.
|
|
</para>
|
|
|
|
<para>
|
|
The composite output routine will put double quotes around field values
|
|
if they are empty strings or contain parentheses, commas,
|
|
double quotes, backslashes, or white space. (Doing so for white space
|
|
is not essential, but aids legibility.) Double quotes and backslashes
|
|
embedded in field values will be doubled.
|
|
</para>
|
|
|
|
<note>
|
|
<para>
|
|
Remember that what you write in an SQL command will first be interpreted
|
|
as a string literal, and then as a composite. This doubles the number of
|
|
backslashes you need (assuming escape string syntax is used).
|
|
For example, to insert a <type>text</type> field
|
|
containing a double quote and a backslash in a composite
|
|
value, you'd need to write:
|
|
<programlisting>
|
|
INSERT ... VALUES ('("\"\\")');
|
|
</programlisting>
|
|
The string-literal processor removes one level of backslashes, so that
|
|
what arrives at the composite-value parser looks like
|
|
<literal>("\"\\")</literal>. In turn, the string
|
|
fed to the <type>text</type> data type's input routine
|
|
becomes <literal>"\</literal>. (If we were working
|
|
with a data type whose input routine also treated backslashes specially,
|
|
<type>bytea</type> for example, we might need as many as eight backslashes
|
|
in the command to get one backslash into the stored composite field.)
|
|
Dollar quoting (see <xref linkend="sql-syntax-dollar-quoting"/>) can be
|
|
used to avoid the need to double backslashes.
|
|
</para>
|
|
</note>
|
|
|
|
<tip>
|
|
<para>
|
|
The <literal>ROW</literal> constructor syntax is usually easier to work with
|
|
than the composite-literal syntax when writing composite values in SQL
|
|
commands.
|
|
In <literal>ROW</literal>, individual field values are written the same way
|
|
they would be written when not members of a composite.
|
|
</para>
|
|
</tip>
|
|
</sect2>
|
|
|
|
</sect1>
|