intagg
intagg
The intagg module provides an integer aggregator and an
enumerator.
Functions
The aggregator is an aggregate function
int_array_aggregate(integer)>
that produces an integer array
containing exactly the integers it is fed.
Here is a not-tremendously-useful example:
test=# select int_array_aggregate(i) from
test-# generate_series(1,10,2) i;
int_array_aggregate
---------------------
{1,3,5,7,9}
(1 row)
The enumerator is a function
int_array_enum(integer[])>
that returns setof integer>. It is essentially the reverse
operation of the aggregator: given an array of integers, expand it
into a set of rows. For example,
test=# select * from int_array_enum(array[1,3,5,7,9]);
int_array_enum
----------------
1
3
5
7
9
(5 rows)
Sample Uses
Many database systems have the notion of a one to many table. Such a table
usually sits between two indexed tables, for example:
CREATE TABLE left (id INT PRIMARY KEY, ...);
CREATE TABLE right (id INT PRIMARY KEY, ...);
CREATE TABLE one_to_many(left INT REFERENCES left, right INT REFERENCES right);
It is typically used like this:
SELECT right.* from right JOIN one_to_many ON (right.id = one_to_many.right)
WHERE one_to_many.left = item>;
This will return all the items in the right hand table for an entry
in the left hand table. This is a very common construct in SQL.
Now, this methodology can be cumbersome with a very large number of
entries in the one_to_many> table. Often,
a join like this would result in an index scan
and a fetch for each right hand entry in the table for a particular
left hand entry. If you have a very dynamic system, there is not much you
can do. However, if you have some data which is fairly static, you can
create a summary table with the aggregator.
CREATE TABLE summary as
SELECT left, int_array_aggregate(right) AS right
FROM one_to_many
GROUP BY left;
This will create a table with one row per left item, and an array
of right items. Now this is pretty useless without some way of using
the array; that's why there is an array enumerator. You can do
SELECT left, int_array_enum(right) FROM summary WHERE left = item>;
The above query using int_array_enum> produces the same results
as
SELECT left, right FROM one_to_many WHERE left = item>;
The difference is that the query against the summary table has to get
only one row from the table, whereas the direct query against
one_to_many> must index scan and fetch a row for each entry.
On one system, an EXPLAIN> showed a query with a cost of 8488 was
reduced to a cost of 329. The original query was a join involving the
one_to_many> table, which was replaced by:
SELECT right, count(right) FROM
( SELECT left, int_array_enum(right) AS right
FROM summary JOIN (SELECT left FROM left_table WHERE left = item>) AS lefts
ON (summary.left = lefts.left)
) AS list
GROUP BY right
ORDER BY count DESC;