<!--
$Header: /cvsroot/pgsql/doc/src/sgml/xaggr.sgml,v 1.12 2001/05/12 22:51:36 petere Exp $
-->
<chapter id="xaggr">
<title>Extending <acronym>SQL</acronym>: Aggregates</title>
<indexterm zone="xaggr">
 <primary>aggregate functions</primary>
 <secondary>extending</secondary>
</indexterm>
<para>
 Aggregate functions in <productname>PostgreSQL</productname>
 are expressed in terms of <firstterm>state values</firstterm>
 and <firstterm>state transition functions</firstterm>.
 That is, an aggregate is defined in terms of a state that is
 updated each time an input item is processed.  To define a new
 aggregate function, one selects a data type for the state value,
 an initial value for the state, and a state transition
 function.  The state transition function is just an
 ordinary function that could also be used outside the
 context of the aggregate.  A <firstterm>final function</firstterm>
 can also be specified, in case the desired result of the aggregate
 is different from the data that needs to be kept in the running
 state value.
</para>
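<para>
 As a minimal sketch of the point that a transition function is an
 ordinary function, the following calls the built-in integer addition
 function <function>int4pl</function> directly and then reuses it as
 the transition function of an aggregate.  The aggregate name
 <function>my_sum</function> and the table <literal>test_int</literal>
 with an <type>int4</type> column <literal>a</literal> are hypothetical
 names chosen for illustration.
</para>
<programlisting>
-- The transition function can be called like any other function.
SELECT int4pl(2, 3);

-- The same function serves as the state transition function:
-- the state starts at 0 and each input value is added to it.
CREATE AGGREGATE my_sum (
    sfunc = int4pl,
    basetype = int4,
    stype = int4,
    initcond = '0'
);

SELECT my_sum(a) FROM test_int;
</programlisting>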
<para>
 Thus, in addition to the input and result data types seen by a user
 of the aggregate, there is an internal state-value data type that
 may be different from both the input and result types.
</para>
<para>
 If we define an aggregate that does not use a final function,
 we have an aggregate that computes a running function of
 the column values from each row.  "Sum" is an
 example of this kind of aggregate.  "Sum" starts at
 zero and always adds the current row's value to
 its running total.  For example, to make a "Sum"
 aggregate that works on a data type for complex numbers,
 we only need the addition function for that data type.
 The aggregate definition is:

<programlisting>
CREATE AGGREGATE complex_sum (
    sfunc = complex_add,
    basetype = complex,
    stype = complex,
    initcond = '(0,0)'
);

SELECT complex_sum(a) FROM test_complex;

+------------+
|complex_sum |
+------------+
|(34,53.9)   |
+------------+
</programlisting>

 (In practice, we'd just name the aggregate "sum", and rely on
 <productname>PostgreSQL</productname> to figure out which kind
 of sum to apply to a complex column.)
</para>
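<para>
 The naming remark can be sketched as follows, assuming the
 <type>complex</type> type, the <function>complex_add</function>
 function, and the <literal>test_complex</literal> table used in this
 chapter already exist.  Because aggregates are distinguished by their
 input data type, this definition can coexist with the built-in
 <function>sum</function> aggregates for the numeric types.
</para>
<programlisting>
-- Same definition as complex_sum, but under the conventional name.
CREATE AGGREGATE sum (
    sfunc = complex_add,
    basetype = complex,
    stype = complex,
    initcond = '(0,0)'
);

-- The input type of column a selects the complex version of sum.
SELECT sum(a) FROM test_complex;
</programlisting>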
<para>
 The above definition of "Sum" will return zero (the initial
 state condition) if there are no non-null input values.
 Perhaps we want to return NULL in that case instead, since SQL92
 expects "Sum" to behave that way.  We can do this simply by
 omitting the "initcond" phrase, so that the initial state
 condition is NULL.  Ordinarily this would mean that the sfunc
 would need to check for a NULL state-condition input, but for
 "Sum" and some other simple aggregates like "Max" and "Min",
 it's sufficient to insert the first non-null input value into
 the state variable and then start applying the transition function
 at the second non-null input value.  <productname>PostgreSQL</productname>
 will do that automatically if the initial condition is NULL and
 the transition function is marked "strict" (i.e., not to be called
 for NULL inputs).
</para>
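<para>
 A minimal sketch of this behavior, using the built-in function
 <function>int4larger</function> as the transition function and
 assuming it is marked strict, as the built-in functions normally are.
 The aggregate name <function>my_max</function> and the table
 <literal>test_int</literal> with an <type>int4</type> column
 <literal>a</literal> are hypothetical.
</para>
<programlisting>
-- No initcond: the state starts out NULL.  Because int4larger is
-- strict, the first non-null input becomes the state, and the
-- function is applied from the second non-null input onward.
CREATE AGGREGATE my_max (
    sfunc = int4larger,
    basetype = int4,
    stype = int4
);

-- Yields NULL when test_int has no non-null values in column a.
SELECT my_max(a) FROM test_int;
</programlisting>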
<para>
 Another bit of default behavior for a "strict" transition function
 is that the previous state value is retained unchanged whenever a
 NULL input value is encountered.  Thus, NULLs are ignored.  If you
 need some other behavior for NULL inputs, just define your transition
 function as non-strict, and code it to test for NULL inputs and do
 whatever is needed.
</para>
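<para>
 As a sketch of a non-strict transition function, the following
 aggregate counts the NULL inputs instead of ignoring them.  The names
 <function>count_nulls_step</function>,
 <function>count_nulls</function>, and <literal>test_int</literal> are
 hypothetical.  The function is not declared strict, so it is called
 for NULL inputs and must test for them itself.
</para>
<programlisting>
-- $1 is the running state, $2 is the current input value.
-- The state is incremented only when the input is NULL.
CREATE FUNCTION count_nulls_step(int4, int4) RETURNS int4
    AS 'SELECT CASE WHEN $2 IS NULL THEN $1 + 1 ELSE $1 END'
    LANGUAGE 'sql';

CREATE AGGREGATE count_nulls (
    sfunc = count_nulls_step,
    basetype = int4,
    stype = int4,
    initcond = '0'
);

SELECT count_nulls(a) FROM test_int;
</programlisting>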
<para>
 "Average" is a more complex example of an aggregate.  It requires
 two pieces of running state: the sum of the inputs and the count
 of the number of inputs.  The final result is obtained by dividing
 these quantities.  Average is typically implemented by using an
 array as the transition state value.  For example,
 the built-in implementation of <function>avg(float8)</function>
 looks like:

<programlisting>
CREATE AGGREGATE avg (
    sfunc = float8_accum,
    basetype = float8,
    stype = float8[],
    finalfunc = float8_avg,
    initcond = '{0,0,0}'
);
</programlisting>

 (<function>float8_accum</function> requires a three-element array,
 not just two elements, because it accumulates the sum of squares
 as well as the sum and count of the inputs.)
</para>
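<para>
 The two-part state can also be sketched with user-defined functions.
 The names <function>my_avg_step</function>,
 <function>my_avg_final</function>, <function>my_avg</function>, and
 <literal>test_float</literal> are hypothetical, and the sketch assumes
 a release that accepts the <literal>ARRAY[...]</literal> constructor
 and the <literal>STRICT</literal> keyword in
 <command>CREATE FUNCTION</command>.  It is an illustration, not the
 built-in implementation.
</para>
<programlisting>
-- State is a two-element array: element 1 is the running sum,
-- element 2 is the running count.  STRICT makes NULL inputs ignored.
CREATE FUNCTION my_avg_step(float8[], float8) RETURNS float8[]
    AS 'SELECT ARRAY[$1[1] + $2, $1[2] + 1]'
    LANGUAGE 'sql' STRICT;

-- The final function divides sum by count, returning NULL when
-- no non-null input was seen.
CREATE FUNCTION my_avg_final(float8[]) RETURNS float8
    AS 'SELECT CASE WHEN $1[2] = 0 THEN NULL ELSE $1[1] / $1[2] END'
    LANGUAGE 'sql';

CREATE AGGREGATE my_avg (
    sfunc = my_avg_step,
    basetype = float8,
    stype = float8[],
    finalfunc = my_avg_final,
    initcond = '{0,0}'
);

SELECT my_avg(x) FROM test_float;
</programlisting>
<para>
 Here the input and result types are both <type>float8</type>, while
 the state type is <type>float8[]</type>, which shows the state-value
 data type differing from what the user of the aggregate sees.
</para>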
<para>
 For further details see
 <!--
 Not available in the Programmer's Guide
  <xref endterm="sql-createaggregate-title"
   linkend="sql-createaggregate-title">.
 -->
 <command>CREATE AGGREGATE</command> in
 <citetitle>The PostgreSQL User's Guide</citetitle>.
</para>
</chapter>
<!-- Keep this comment at the end of the file
Local variables:
mode:sgml
sgml-omittag:nil
sgml-shorttag:t
sgml-minimize-attributes:nil
sgml-always-quote-attributes:t
sgml-indent-step:1
sgml-indent-data:t
sgml-parent-document:nil
sgml-default-dtd-file:"./reference.ced"
sgml-exposed-tags:nil
sgml-local-catalogs:("/usr/lib/sgml/catalog")
sgml-local-ecat-files:nil
End:
-->