Commit Graph

7 Commits

Author SHA1 Message Date
Bruce Momjian 64d0b8b05f Attached is an update to contrib/tablefunc. It implements a new hashed
version of crosstab. This fixes a major deficiency in real-world use of
the original version. Easiest to undestand with an illustration:

Data:
-------------------------------------------------------------------
select * from cth;
  id | rowid |        rowdt        |   attribute    |      val
----+-------+---------------------+----------------+---------------
   1 | test1 | 2003-03-01 00:00:00 | temperature    | 42
   2 | test1 | 2003-03-01 00:00:00 | test_result    | PASS
   3 | test1 | 2003-03-01 00:00:00 | volts          | 2.6987
   4 | test2 | 2003-03-02 00:00:00 | temperature    | 53
   5 | test2 | 2003-03-02 00:00:00 | test_result    | FAIL
   6 | test2 | 2003-03-02 00:00:00 | test_startdate | 01 March 2003
   7 | test2 | 2003-03-02 00:00:00 | volts          | 3.1234
(7 rows)

Original crosstab:
-------------------------------------------------------------------
SELECT * FROM crosstab(
   'SELECT rowid, attribute, val FROM cth ORDER BY 1,2',4)
AS c(rowid text, temperature text, test_result text, test_startdate
text, volts text);
  rowid | temperature | test_result | test_startdate | volts
-------+-------------+-------------+----------------+--------
  test1 | 42          | PASS        | 2.6987         |
  test2 | 53          | FAIL        | 01 March 2003  | 3.1234
(2 rows)

Hashed crosstab:
-------------------------------------------------------------------
SELECT * FROM crosstab(
   'SELECT rowid, attribute, val FROM cth ORDER BY 1',
   'SELECT DISTINCT attribute FROM cth ORDER BY 1')
AS c(rowid text, temperature int4, test_result text, test_startdate
timestamp, volts float8);
  rowid | temperature | test_result |   test_startdate    | volts
-------+-------------+-------------+---------------------+--------
  test1 |          42 | PASS        |                     | 2.6987
  test2 |          53 | FAIL        | 2003-03-01 00:00:00 | 3.1234
(2 rows)

Notice that the original crosstab slides data over to the left in the
result tuple when it encounters missing data. In order to work around
this you have to be make your source sql do all sorts of contortions
(cartesian join of distinct rowid with distinct attribute; left join
that back to the real source data). The new version avoids this by
building a hash table using a second distinct attribute query.

The new version also allows for "extra" columns (see the README) and
allows the result columns to be coerced into differing datatypes if they
are suitable (as shown above).

In testing a "real-world" data set (69 distinct rowid's, 27 distinct
categories/attributes, multiple missing data points) I saw about a
5-fold improvement in execution time (from about 2200 ms old, to 440 ms
new).

I left the original version intact because: 1) BC, 2) it is probably
slightly faster if you know that you have no missing attributes.

README and regression test adjustments included. If there are no
objections, please apply.

Joe Conway
2003-03-20 06:46:30 +00:00
Tom Lane 1f1c332381 Remove inappropriate double-quoting in connectby() code; adjust
regression test to avoid using VALUE as a name.  From Joe Conway.
2002-11-23 01:54:09 +00:00
Bruce Momjian e5cf1a8a26 SET autocommit no longer needed in /contrib because pg_regress.sh does
it automatically now on regression session startup.
2002-10-21 01:42:14 +00:00
Bruce Momjian aa4c702eac Update /contrib for "autocommit TO 'on'".
Create objects in public schema.

Make spacing/capitalization consistent.

Remove transaction block use for object creation.

Remove unneeded function GRANTs.
2002-10-18 18:41:22 +00:00
Bruce Momjian a62873d279 The attached adds a bit to the contrib/tablefunc regression test for
behavior of connectby() in the presence of infinite recursion. Please
apply this one in addition to the one sent earlier.

Joe Conway
2002-10-03 17:15:36 +00:00
Tom Lane bd04184b11 Attached is a patch to fix some recently raised issues that exist in
contrib/tablefunc. Specifically it replaces the use of VIEWs (for needed
composite type creation) with use of CREATE TYPE. It also performs GRANT
EXECUTE ON FUNCTION foo() TO PUBLIC for all of the created functions. There
was also a cosmetic change to two regression files.

Joe Conway
2002-09-14 19:53:59 +00:00
Bruce Momjian 6fff9a7475 The attached removes the current non-standard file
"contrib/tablefunc/tablefunc-test.sql", and adds a standard regression
test suite to contrib/tablefunc.

Joe Conway
2002-09-12 00:14:40 +00:00