Remove TODO.detail files that contained useless or very old information.

Update TODO accordingly.
2024-10-04 04:46:52 +02:00 · 2004-02-12 18:11:54 +00:00 · 2004-02-12 18:11:54 +00:00 · 2b721d3d41
commit 2b721d3d41
parent 5de02e283f
10 changed files with 102 additions and 14561 deletions
--- a/doc/TODO.detail/foreign
+++ b/doc/TODO.detail/foreign
@@ -1,542 +0,0 @@
 From fjoe@iclub.nsu.ru Tue Jan 23 03:38:45 2001
 Received: from mx.nsu.ru (root@mx.nsu.ru [193.124.215.71])
 	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id DAA14458
 	for <pgman@candle.pha.pa.us>; Tue, 23 Jan 2001 03:38:24 -0500 (EST)
 Received: from iclub.nsu.ru (root@iclub.nsu.ru [193.124.222.66])
 	by mx.nsu.ru (8.9.1/8.9.0) with ESMTP id OAA29153;
 	Tue, 23 Jan 2001 14:31:27 +0600 (NOVT)
 Received: from localhost (fjoe@localhost)
 	by iclub.nsu.ru (8.11.1/8.11.1) with ESMTP id f0N8VOr15273;
 	Tue, 23 Jan 2001 14:31:25 +0600 (NS)
 	(envelope-from fjoe@iclub.nsu.ru)
 Date: Tue, 23 Jan 2001 14:31:24 +0600 (NS)
 From: Max Khon <fjoe@iclub.nsu.ru>
 To: Bruce Momjian <pgman@candle.pha.pa.us>
 cc: PostgreSQL-development <pgsql-hackers@postgresql.org>
 Subject: Re: [HACKERS] Bug in FOREIGN KEY
 In-Reply-To: <200101230416.XAA04293@candle.pha.pa.us>
 Message-ID: <Pine.BSF.4.21.0101231429310.12474-100000@iclub.nsu.ru>
 MIME-Version: 1.0
 Content-Type: TEXT/PLAIN; charset=US-ASCII
 Status: RO
 hi, there!
 On Mon, 22 Jan 2001, Bruce Momjian wrote:
 > 
 > > This problem with foreign keys has been reported to me, and I have confirmed
 > > the bug exists in current sources.  The DELETE should succeed:
 > > 
 > > ---------------------------------------------------------------------------
 > > 
 > > CREATE TABLE primarytest2 (
 > >                            col1 INTEGER, 
 > >                            col2 INTEGER, 
 > >                            PRIMARY KEY(col1, col2)
 > >                           );
 > > 
 > > CREATE TABLE foreigntest2 (col3 INTEGER, 
 > >                            col4 INTEGER,
 > >                            FOREIGN KEY (col3, col4) REFERENCES primarytest2
 > >                          );
 > > test=> BEGIN;
 > > BEGIN
 > > test=> INSERT INTO primarytest2 VALUES (5,5);
 > > INSERT 27618 1
 > > test=> DELETE FROM primarytest2 WHERE col1 = 5 AND col2 = 5;
 > > ERROR:  triggered data change violation on relation "primarytest2"
 I have another (slightly different) example:
 --- cut here ---
 test=> CREATE TABLE pr(obj_id int PRIMARY KEY);
 NOTICE:  CREATE TABLE/PRIMARY KEY will create implicit index 'pr_pkey' for
 table 'pr'
 CREATE
 test=> CREATE TABLE fr(obj_id int REFERENCES pr ON DELETE CASCADE);
 NOTICE:  CREATE TABLE will create implicit trigger(s) for FOREIGN KEY
 check(s)
 CREATE
 test=> BEGIN;
 BEGIN
 test=> INSERT INTO pr (obj_id) VALUES (1);
 INSERT 200539 1
 test=> INSERT INTO fr (obj_id) SELECT obj_id FROM pr;
 INSERT 200540 1
 test=> DELETE FROM fr;
 ERROR:  triggered data change violation on relation "fr"
 test=> 
 --- cut here ---
 we are running postgresql 7.1 beta3
 /fjoe
 From sszabo@megazone23.bigpanda.com Tue Jan 23 13:41:55 2001
 Received: from megazone23.bigpanda.com (rfx-64-6-210-138.users.reflexcom.com [64.6.210.138])
 	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA19924
 	for <pgman@candle.pha.pa.us>; Tue, 23 Jan 2001 13:41:54 -0500 (EST)
 Received: from localhost (sszabo@localhost)
 	by megazone23.bigpanda.com (8.11.1/8.11.1) with ESMTP id f0NIfLa41018;
 	Tue, 23 Jan 2001 10:41:21 -0800 (PST)
 Date: Tue, 23 Jan 2001 10:41:21 -0800 (PST)
 From: Stephan Szabo <sszabo@megazone23.bigpanda.com>
 To: Bruce Momjian <pgman@candle.pha.pa.us>
 cc: Jan Wieck <janwieck@Yahoo.com>, Peter Eisentraut <peter_e@gmx.net>,
        PostgreSQL-development <pgsql-hackers@postgresql.org>
 Subject: Re: [HACKERS] Bug in FOREIGN KEY
 In-Reply-To: <200101230417.XAA04332@candle.pha.pa.us>
 Message-ID: <Pine.BSF.4.21.0101231031290.40955-100000@megazone23.bigpanda.com>
 MIME-Version: 1.0
 Content-Type: TEXT/PLAIN; charset=US-ASCII
 Status: RO
 > >     Think  I misinterpreted the SQL3 specs WR to this detail. The
 > >     checks must be made per statement,  not  at  the  transaction
 > >     level.  I'll  try  to fix it, but we need to define what will
 > >     happen with referential actions in the  case  of  conflicting
 > >     actions on the same key - there are some possible conflicts:
 > > 
 > >     1.  DEFERRED ON DELETE NO ACTION or RESTRICT
 > > 
 > >         Do  the referencing rows reference to the new PK row with
 > >         the  same  key  now,  or  is  this  still  a   constraint
 > >         violation?  I  would say it's not, because the constraint
 > >         condition is satisfied at the end of the transaction. How
 > >         do other databases behave?
 > > 
 > >     2.  DEFERRED ON DELETE CASCADE, SET NULL or SET DEFAULT
 > > 
 > >         Again  I'd  say  that  the  action  should  be suppressed
 > >         because a matching PK row is present at transaction end -
 > >         it's  not  the same old row, but the constraint itself is
 > >         still satisfied.
 I'm not actually sure on the cascade, set null and set default.  The
 way they are written seems to imply to me that it's based on the state
 of the database before/after the command in question as opposed to the
 deferred state of the database because of the stuff about updating the
 state of partially matching rows immediately after the delete/update of
 the row which wouldn't really make sense when deferred.  Does anyone know
 what other systems do with a case something like this all in a
 transaction:
 create table a (a int primary key);
 create table b (b int references a match full on update cascade
 		 on delete cascade deferrable initially deferred);
 insert into a values (1);
 insert into a values (2);
 insert into b values (1);
 delete from a where a=1;
 select * from b;
 commit;
 From pgsql-hackers-owner+M3901@postgresql.org Fri Jan 26 17:00:24 2001
 Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
 	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA10576
 	for <pgman@candle.pha.pa.us>; Fri, 26 Jan 2001 17:00:24 -0500 (EST)
 Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
 	by mail.postgresql.org (8.11.1/8.11.1) with SMTP id f0QLtVq53019;
 	Fri, 26 Jan 2001 16:55:31 -0500 (EST)
 	(envelope-from pgsql-hackers-owner+M3901@postgresql.org)
 Received: from smtp1b.mail.yahoo.com (smtp3.mail.yahoo.com [128.11.68.135])
 	by mail.postgresql.org (8.11.1/8.11.1) with SMTP id f0QLqmq52691
 	for <pgsql-hackers@postgresql.org>; Fri, 26 Jan 2001 16:52:48 -0500 (EST)
 	(envelope-from janwieck@yahoo.com)
 Received: from j13.us.greatbridge.com (HELO jupiter.greatbridge.com) (216.54.52.153)
  by smtp.mail.vip.suc.yahoo.com with SMTP; 26 Jan 2001 22:49:57 -0000
 X-Apparently-From: <janwieck@yahoo.com>
 Received: (from janwieck@localhost)
 	by jupiter.greatbridge.com (8.9.3/8.9.3) id RAA04701;
 	Fri, 26 Jan 2001 17:02:32 -0500
 From: Jan Wieck <janwieck@Yahoo.com>
 Message-Id: <200101262202.RAA04701@jupiter.greatbridge.com>
 Subject: Re: [HACKERS] Bug in FOREIGN KEY
 In-Reply-To: <200101262110.QAA06902@candle.pha.pa.us> from Bruce Momjian at "Jan
 	26, 2001 04:10:22 pm"
 To: Bruce Momjian <pgman@candle.pha.pa.us>
 Date: Fri, 26 Jan 2001 17:02:32 -0500 (EST)
 CC: Jan Wieck <janwieck@Yahoo.com>, Peter Eisentraut <peter_e@gmx.net>,
        PostgreSQL-development <pgsql-hackers@postgresql.org>
 X-Mailer: ELM [version 2.4ME+ PL68 (25)]
 MIME-Version: 1.0
 Content-Type: text/plain; charset=US-ASCII
 Content-Transfer-Encoding: 7bit
 Precedence: bulk
 Sender: pgsql-hackers-owner@postgresql.org
 Status: RO
 Bruce Momjian wrote:
 > Here is another bug:
 >
 > test=> begin;
 > BEGIN
 > test=> INSERT INTO primarytest2 VALUES (5,5);
 > INSERT 18757 1
 > test=> UPDATE primarytest2 SET col2=1 WHERE col1 = 5 AND col2 = 5;
 > ERROR:  deferredTriggerGetPreviousEvent: event for tuple (0,10) not
 > found
    Schema?
 Jan
 --
 #======================================================================#
 # It's easier to get forgiveness for being wrong than for being right. #
 # Let's break this rule - forgive me.                                  #
 #================================================== JanWieck@Yahoo.com #
 _________________________________________________________
 Do You Yahoo!?
 Get your free @yahoo.com address at http://mail.yahoo.com
 From pgsql-hackers-owner+M3864@postgresql.org Fri Jan 26 10:07:36 2001
 Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
 	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id KAA17732
 	for <pgman@candle.pha.pa.us>; Fri, 26 Jan 2001 10:07:35 -0500 (EST)
 Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
 	by mail.postgresql.org (8.11.1/8.11.1) with SMTP id f0QF3lq12782;
 	Fri, 26 Jan 2001 10:03:47 -0500 (EST)
 	(envelope-from pgsql-hackers-owner+M3864@postgresql.org)
 Received: from mailout00.sul.t-online.com (mailout00.sul.t-online.com [194.25.134.16])
 	by mail.postgresql.org (8.11.1/8.11.1) with ESMTP id f0QF0Yq12614
 	for <pgsql-hackers@postgresql.org>; Fri, 26 Jan 2001 10:00:34 -0500 (EST)
 	(envelope-from peter_e@gmx.net)
 Received: from fwd01.sul.t-online.com 
 	by mailout00.sul.t-online.com with smtp 
 	id 14MALp-0006Im-00; Fri, 26 Jan 2001 15:59:45 +0100
 Received: from peter.localdomain (520083510237-0001@[212.185.245.73]) by fmrl01.sul.t-online.com
 	with esmtp id 14MALQ-1Z0gkaC; Fri, 26 Jan 2001 15:59:20 +0100
 Date: Fri, 26 Jan 2001 16:07:27 +0100 (CET)
 From: Peter Eisentraut <peter_e@gmx.net>
 To: Hiroshi Inoue <Inoue@tpf.co.jp>
 cc: Bruce Momjian <pgman@candle.pha.pa.us>,
        PostgreSQL-development <pgsql-hackers@postgresql.org>
 Subject: Re: [HACKERS] Open 7.1 items
 In-Reply-To: <3A70FA87.933B3D51@tpf.co.jp>
 Message-ID: <Pine.LNX.4.30.0101261604030.769-100000@peter.localdomain>
 MIME-Version: 1.0
 Content-Type: TEXT/PLAIN; charset=US-ASCII
 X-Sender: 520083510237-0001@t-dialin.net
 Precedence: bulk
 Sender: pgsql-hackers-owner@postgresql.org
 Status: RO
 Hiroshi Inoue writes:
 > What does this item mean ?
 > Is it the following ?
 >
 > 	begin;
 > 	insert into pk (id) values (1);
 > 	update(delete from) pk where id=1;
 > 	ERROR:  triggered data change violation on relation "pk"
 >
 > If so, isn't it a simple bug ?
 Depends on the definition of "bug".  It's not spec compliant and it's not
 documented and it's annoying.  But it's been like this for a year and the
 issue is well known and can normally be avoided.  It looks like a
 documentation to-do to me.
 -- 
 Peter Eisentraut      peter_e@gmx.net       http://yi.org/peter-e/
 From pgsql-hackers-owner+M3876@postgresql.org Fri Jan 26 13:07:10 2001
 Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
 	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA26086
 	for <pgman@candle.pha.pa.us>; Fri, 26 Jan 2001 13:07:09 -0500 (EST)
 Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
 	by mail.postgresql.org (8.11.1/8.11.1) with SMTP id f0QI4Vq30248;
 	Fri, 26 Jan 2001 13:04:31 -0500 (EST)
 	(envelope-from pgsql-hackers-owner+M3876@postgresql.org)
 Received: from sectorbase2.sectorbase.com ([208.48.122.131])
 	by mail.postgresql.org (8.11.1/8.11.1) with SMTP id f0QI3Aq30098
 	for <pgsql-hackers@postgreSQL.org>; Fri, 26 Jan 2001 13:03:11 -0500 (EST)
 	(envelope-from vmikheev@SECTORBASE.COM)
 Received: by sectorbase2.sectorbase.com with Internet Mail Service (5.5.2653.19)
 	id <D49FAF71>; Fri, 26 Jan 2001 09:41:23 -0800
 Message-ID: <8F4C99C66D04D4118F580090272A7A234D32C1@sectorbase1.sectorbase.com>
 From: "Mikheev, Vadim" <vmikheev@SECTORBASE.COM>
 To: "'Jan Wieck'" <janwieck@Yahoo.com>,
        PostgreSQL HACKERS
  <pgsql-hackers@postgresql.org>,
        Bruce Momjian <root@candle.pha.pa.us>
 Subject: RE: [HACKERS] Open 7.1 items
 Date: Fri, 26 Jan 2001 10:02:59 -0800
 MIME-Version: 1.0
 X-Mailer: Internet Mail Service (5.5.2653.19)
 Content-Type: text/plain;
 	charset="iso-8859-1"
 Precedence: bulk
 Sender: pgsql-hackers-owner@postgresql.org
 Status: RO
 > > FOREIGN KEY INSERT & UPDATE/DELETE in transaction "change violation"
 > 
 >     A well known issue, and I've asked multiple times how exactly
 >     we want to define the behaviour for deferred constraints.  Do
 >     foreign keys reference just to a key value and are happy with
 >     its existence, or do they refer to a particular row?
 I think first. The last is closer to OODBMS world, not to [O]RDBMS one.
 >     Consider you have a deferred "ON DELETE  CASCADE"  constraint
 >     and  do  a  DELETE, INSERT of a PK. Do the FK rows need to be
 >     deleted or not?
 Good example. I think FK should not be deleted. If someone really
 wants to delete "old" FK then he can do 
 DELETE PK;
 SET CONSTRAINTS ... IMMEDIATE; -- FK need to be deleted here
 INSERT PK;
 >     Consider you have a deferred "ON  DELETE  RESTRICT"  and  "ON
 >     UPDATE  CASCADE" constraint. If you DELETE PK1 and UPDATE PK2
 >     to PK1, the FK2 rows need to follow, but does PK2 inherit all
 >     FK1 rows now so it's the master of both groups?
 Yes. Again one can use SET CONSTRAINTS to achieve desirable results.
 It seems that SET CONSTRAINTS was designed for these purposes - ie
 for better flexibility.
 Though, it would be better to look how other DBes handle all these
 cases -:)
 Vadim
 From janwieck@yahoo.com Fri Jan 26 12:20:27 2001
 Received: from smtp6.mail.yahoo.com (smtp6.mail.yahoo.com [128.11.69.103])
 	by candle.pha.pa.us (8.9.0/8.9.0) with SMTP id MAA22158
 	for <root@candle.pha.pa.us>; Fri, 26 Jan 2001 12:20:27 -0500 (EST)
 Received: from j13.us.greatbridge.com (HELO jupiter.greatbridge.com) (216.54.52.153)
  by smtp.mail.vip.suc.yahoo.com with SMTP; 26 Jan 2001 17:20:26 -0000
 X-Apparently-From: <janwieck@yahoo.com>
 Received: (from janwieck@localhost)
 	by jupiter.greatbridge.com (8.9.3/8.9.3) id MAA03196;
 	Fri, 26 Jan 2001 12:30:05 -0500
 From: Jan Wieck <janwieck@yahoo.com>
 Message-Id: <200101261730.MAA03196@jupiter.greatbridge.com>
 Subject: Re: [HACKERS] Open 7.1 items
 To: PostgreSQL HACKERS <pgsql-hackers@postgreSQL.org>,
        Bruce Momjian <root@candle.pha.pa.us>
 Date: Fri, 26 Jan 2001 12:30:05 -0500 (EST)
 X-Mailer: ELM [version 2.4ME+ PL68 (25)]
 MIME-Version: 1.0
 Content-Type: text/plain; charset=US-ASCII
 Content-Transfer-Encoding: 7bit
 Status: RO
 Bruce Momjian wrote:
 > Here are my open 7.1 items.  Thanks for shrinking the list so far.
 >
 > ---------------------------------------------------------------------------
 >
 > FreeBSD locale bug
 > Reorder INSERT firing in rules
    I  don't  recall  why this is wanted. AFAIK there's no reason
    NOT to do so, except for the actual state of being  far  too
    close to a release candidate.
 > Philip Warner UPDATE crash
 > JDBC LargeObject short read return value missing
 > SELECT cash_out(1) crashes all backends
 > LAZY VACUUM
 > FOREIGN KEY INSERT & UPDATE/DELETE in transaction "change violation"
    A well known issue, and I've asked multiple times how exactly
    we want to define the behaviour for deferred constraints.  Do
    foreign keys reference just to a key value and are happy with
    its existence, or do they refer to a particular row?
    Consider you have a deferred "ON DELETE  CASCADE"  constraint
    and  do  a  DELETE, INSERT of a PK. Do the FK rows need to be
    deleted or not?
    Consider you have a deferred "ON  DELETE  RESTRICT"  and  "ON
    UPDATE  CASCADE" constraint. If you DELETE PK1 and UPDATE PK2
    to PK1, the FK2 rows need to follow, but does PK2 inherit all
    FK1 rows now so it's the master of both groups?
    These  are  only two possible combinations. There are many to
    think of.  As said, I've asked before, but no one voted  yet.
    Move  the item to 7.2 anyway, because changing this behaviour
    would require massive changes in the trigger queue *and*  the
    generic  RI triggers, which cannot be tested enough any more.
 Jan
 > Usernames limited in length
 > Does pg_dump preserve COMMENTs?
 > Failure of nested cursors in JDBC
 > JDBC setMaxRows() is global variable affecting other objects
 > Does JDBC Makefile need current dir?
 > Fix for pg_dump of bad system tables
 > Steve Howe failure query with rules
 > ODBC/JDBC not disconnecting properly?
 > Magnus Hagander ODBC issues?
 > Merge MySQL/PgSQL translation scripts
 > Fix ipcclean on Linux
 > Merge global and template BKI files?
 >
 >
 > --
 >   Bruce Momjian                        |  http://candle.pha.pa.us
 >   pgman@candle.pha.pa.us               |  (610) 853-3000
 >   +  If your life is a hard drive,     |  830 Blythe Avenue
 >   +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
 >
 --
 #======================================================================#
 # It's easier to get forgiveness for being wrong than for being right. #
 # Let's break this rule - forgive me.                                  #
 #================================================== JanWieck@Yahoo.com #
 _________________________________________________________
 Do You Yahoo!?
 Get your free @yahoo.com address at http://mail.yahoo.com
 From pgsql-general-owner+M590@postgresql.org Tue Nov 14 16:30:40 2000
 Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
 	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA22313
 	for <pgman@candle.pha.pa.us>; Tue, 14 Nov 2000 17:30:39 -0500 (EST)
 Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
 	by mail.postgresql.org (8.11.1/8.11.1) with SMTP id eAEMSJs66979;
 	Tue, 14 Nov 2000 17:28:21 -0500 (EST)
 	(envelope-from pgsql-general-owner+M590@postgresql.org)
 Received: from megazone23.bigpanda.com (138.210.6.64.reflexcom.com [64.6.210.138])
 	by mail.postgresql.org (8.11.1/8.11.1) with ESMTP id eAEMREs66800
 	for <pgsql-general@postgresql.org>; Tue, 14 Nov 2000 17:27:14 -0500 (EST)
 	(envelope-from sszabo@megazone23.bigpanda.com)
 Received: from localhost (sszabo@localhost)
 	by megazone23.bigpanda.com (8.11.1/8.11.0) with ESMTP id eAEMPpH69059;
 	Tue, 14 Nov 2000 14:25:51 -0800 (PST)
 Date: Tue, 14 Nov 2000 14:25:51 -0800 (PST)
 From: Stephan Szabo <sszabo@megazone23.bigpanda.com>
 To: "Beth K. Gatewood" <bethg@mbt.washington.edu>
 cc: pgsql-general@postgresql.org
 Subject: Re: [GENERAL] a request for some experienced input.....
 In-Reply-To: <3A11ACA1.E5D847DD@mbt.washington.edu>
 Message-ID: <Pine.BSF.4.21.0011141403380.68986-100000@megazone23.bigpanda.com>
 MIME-Version: 1.0
 Content-Type: TEXT/PLAIN; charset=US-ASCII
 Precedence: bulk
 Sender: pgsql-general-owner@postgresql.org
 Status: OR
 On Tue, 14 Nov 2000, Beth K. Gatewood wrote:
 > >
 > 
 > Stephan-
 > 
 > Thank you so much for taking the effort to answer these questions.  Your
 > help is truly appreciated....
 > 
 > I just have a few points for clarification.
 > 
 > >
 > > MATCH PARTIAL is a specific match type which describes which rows are
 > > considered matching rows for purposes of meeting or failing the
 > > constraint.  (In match partial, a fktable (NULL, 2) would match a pk
 > > table (1,2) as well as a pk table (2,2).  It's different from match
 > > full in which case (NULL,2) would be invalid or match unspecified
 > > in which case it would match due to the existence of the NULL in any
 > > case).  There are some bizarre implementation details involved with
 > > it and it's different from the others in ways that make it difficult.
 > > It's in my list of things to do, but I haven't come up with an acceptable
 > > mechanism in my head yet.
 > 
 > Does this mean, currently that I can not have foreign keys with null values?
 Not exactly...
 Match full = In FK row, all columns must be NULL or the value of each
 	column must not be null and there is a row in the PK table where
 	each referencing column equals the corresponding referenced
 	column.
 Unspecified = In FK row, at least one column must be NULL or each
 	referencing column shall be equal to the corresponding referenced
 	column in some row of the referenced table
 Match partial is similar to match full except we ignore the null columns
 for purposes of the each referencing column equals bit.
 For example:
           PK Table Key values: (1,2), (1,3), (3,3)
 Attempted FK Table Key values: (1,2), (1,NULL), (5,NULL), (NULL, NULL)
 (hopefully I get this right)...
 In match full, only the 1st and 4th fk values are valid.
 In match partial, the 1st, 2nd, and 4th fk values are valid.
 In match unspecified, all the fk values are valid.
 The other note is that generally speaking, all three are basically the
 same for the single column key.  If you're only doing references on one
 column, the match type is mostly meaningless.
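
The three match types described above can be sketched as a small check. This is only an illustration in Python, not how PostgreSQL implements the constraint; the function name `fk_is_valid` and the use of `None` for NULL are assumptions made for the example. It reuses the key values from the message.

```python
from typing import Optional, Sequence, Tuple

Key = Tuple[Optional[int], ...]


def fk_is_valid(fk: Key, pk_rows: Sequence[Key], match: str) -> bool:
    """Return True if the FK row passes the given match type (None stands for NULL)."""
    nulls = [value is None for value in fk]
    if match == "full":
        # All columns NULL is fine; a mix of NULL and non-NULL is not.
        if all(nulls):
            return True
        if any(nulls):
            return False
        return tuple(fk) in [tuple(pk) for pk in pk_rows]
    if match == "partial":
        # Like full, but NULL columns are ignored when comparing.
        if all(nulls):
            return True
        return any(
            all(f is None or f == p for f, p in zip(fk, pk)) for pk in pk_rows
        )
    if match == "unspecified":
        # Any NULL column makes the row pass without a lookup.
        if any(nulls):
            return True
        return tuple(fk) in [tuple(pk) for pk in pk_rows]
    raise ValueError(f"unknown match type: {match}")


pk_rows = [(1, 2), (1, 3), (3, 3)]
fk_rows = [(1, 2), (1, None), (5, None), (None, None)]
for match in ("full", "partial", "unspecified"):
    print(match, [fk_is_valid(fk, pk_rows, match) for fk in fk_rows])
```

With a single-column key the three branches give the same answer, which is the point made in the last paragraph above.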
 > > PENDANT adds that for each row of the referenced table the values of
 > > the specified column(s) are the same as the values of the specified
 > > column(s) in some row of the referencing tables.
 > 
 > I am not sure I know what you mean here.....Are you saying that the value for
 > the FK column must match the value for the PK column?
 I haven't really looked at PENDANT, the above was just a small rewrite of
 some descriptive text in the sql99 draft I have.  There's a whole bunch
 of rules in the actual text of the referential constraint definition.
 The base stuff seems to be: (Rf is the referencing columns, T is the
 referenced table)
      3) If PENDANT is specified, then:
         a) For a given row in the referencing table, let pendant
           reference designate an instance in which all Rf are
           non-null.
         b) Let number of pendant paths be the number of pendant
           references to the same referenced row in a referenced table
           from all referencing rows in all base tables.
         c) For every row in T, the number of pendant paths is equal to
 	   or greater than 1.
 So, I'd read it as every row in T must have at least one referencing row
 in some base table.
 There are some details about updates and that you can't mix PENDANT and
 MATCH PARTIAL or SET DEFAULT actions.
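
Read that way, the PENDANT rule could be checked roughly as follows. This is a sketch of the reading given above ("every row in T must have at least one referencing row"), not of any real implementation; `pendant_satisfied` is a hypothetical name and `None` stands for NULL.

```python
def pendant_satisfied(referenced_keys, referencing_rows):
    """True if every referenced key has at least one pendant reference.

    referenced_keys: key tuples of the rows in the referenced table T.
    referencing_rows: referencing-column tuples (Rf) from all base tables.
    """
    paths = {tuple(key): 0 for key in referenced_keys}
    for row in referencing_rows:
        row = tuple(row)
        # A pendant reference is a referencing row with no NULL columns.
        if all(value is not None for value in row) and row in paths:
            paths[row] += 1
    # Rule (c): the number of pendant paths is >= 1 for every row in T.
    return all(count >= 1 for count in paths.values())


print(pendant_satisfied([(1,), (2,)], [(1,), (2,), (2,)]))
print(pendant_satisfied([(1,), (2,)], [(1,), (None,)]))
```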
 > > The main issues in 7.0 are that older versions (might be fixed in
 > > 7.0.3) would fail very badly if you used alter table to rename tables that
 > > were referenced in a fk constraint and that you need to give update
 > > permission to the referenced table.  For the former, 7.1 will (and 7.0.3
 > > may) give an elog(ERROR) to you rather than crashing the backend and the
 > > latter should be fixed for 7.1 (although you still need to have write
 > > perms to the referencing table for referential actions to work properly)
 > 
 > Are the steps to this outlined somewhere then?
 The permissions stuff is just a matter of using GRANT and REVOKE to set
 the permissions that a user has to a table.  
--- a/doc/TODO.detail/fsync
+++ b/doc/TODO.detail/fsync
@@ -1,129 +0,0 @@
 From pgsql-hackers-owner+M908@postgresql.org Sun Nov 19 14:27:43 2000
 Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
 	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id OAA10885
 	for <pgman@candle.pha.pa.us>; Sun, 19 Nov 2000 14:27:42 -0500 (EST)
 Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
 	by mail.postgresql.org (8.11.1/8.11.1) with SMTP id eAJJSMs83653;
 	Sun, 19 Nov 2000 14:28:22 -0500 (EST)
 	(envelope-from pgsql-hackers-owner+M908@postgresql.org)
 Received: from candle.pha.pa.us (candle.navpoint.com [162.33.245.46] (may be forged))
 	by mail.postgresql.org (8.11.1/8.11.1) with ESMTP id eAJJQns83565
 	for <pgsql-hackers@postgreSQL.org>; Sun, 19 Nov 2000 14:26:49 -0500 (EST)
 	(envelope-from pgman@candle.pha.pa.us)
 Received: (from pgman@localhost)
 	by candle.pha.pa.us (8.9.0/8.9.0) id OAA06790;
 	Sun, 19 Nov 2000 14:23:06 -0500 (EST)
 From: Bruce Momjian <pgman@candle.pha.pa.us>
 Message-Id: <200011191923.OAA06790@candle.pha.pa.us>
 Subject: Re: [HACKERS] WAL fsync scheduling
 In-Reply-To: <002101c0525e$2d964480$b97a30d0@sectorbase.com> "from Vadim Mikheev
 	at Nov 19, 2000 11:23:19 am"
 To: Vadim Mikheev <vmikheev@sectorbase.com>
 Date: Sun, 19 Nov 2000 14:23:06 -0500 (EST)
 CC: Tom Samplonius <tom@sdf.com>, Alfred@candle.pha.pa.us,
        Perlstein <bright@wintelcom.net>, Larry@candle.pha.pa.us,
        Rosenman <ler@lerctr.org>,
        PostgreSQL-development <pgsql-hackers@postgresql.org>
 X-Mailer: ELM [version 2.4ME+ PL77 (25)]
 MIME-Version: 1.0
 Content-Transfer-Encoding: 7bit
 Content-Type: text/plain; charset=US-ASCII
 Precedence: bulk
 Sender: pgsql-hackers-owner@postgresql.org
 Status: OR
 [ Charset ISO-8859-1 unsupported, converting... ]
 > > There are two parts to transaction commit.  The first is writing all
 > > dirty buffers or log changes to the kernel, and second is fsync of the
 >    ^^^^^^^^^^^^
 > Backend doesn't write any dirty buffer to the kernel at commit time.
 Yes, I suspected that.
 > 
 > > log file.
 > 
 > The first part is writing commit record into WAL buffers in shmem.
 > This is what XLogInsert does.  After that XLogFlush is called to ensure
 > that  entire commit record is on disk. XLogFlush does *both* write() and
 > fsync() (single slock is used for both writing and fsyncing) if it needs to
 > do it at all.
 Yes, I realize there are new steps in WAL.
 > 
 > > I suggest having a per-backend shared memory byte that has the following
 > > values:
 > > 
 > > START_LOG_WRITE
 > > WAIT_ON_FSYNC
 > > NOT_IN_COMMIT
 > > backend_number_doing_fsync
 > > 
 > > I suggest that when each backend starts a commit, it sets its byte to
 > > START_LOG_WRITE. 
 >   ^^^^^^^^^^^^^^^^^^^^^^^
 > Isn't START_COMMIT more meaningful?
 Yes.
 > 
 > > When it gets ready to fsync, it checks all backends. 
 >    ^^^^^^^^^^^^^^^^^^^^^^^^^^
 > What do you mean by this? The moment just after XLogInsert?
 Just before it calls fsync().
 > 
 > > If all are NOT_IN_COMMIT, it does fsync and continues.
 > 
 > 1st edition:
 > > If one or more are in START_LOG_WRITE, it waits until no one is in
 > > START_LOG_WRITE.  It then checks all WAIT_ON_FSYNC, and if it is the
 > > lowest backend in WAIT_ON_FSYNC, marks all others with its backend
 > > number, and does fsync.  It then clears all backends with its number to
 > > NOT_IN_COMMIT.  Other backend will see they are not the lowest
 > > WAIT_ON_FSYNC and will wait for their byte to be set to NOT_IN_COMMIT
 > > so they can then continue, knowing their data was synced.
 > 
 > 2nd edition:
 > > I have another idea.  If a backend gets to the point that it needs
 > > fsync, and there is another backend in START_LOG_WRITE, it can go to an
 > > interruptible sleep, knowing another backend will perform the fsync and
 > > wake it up.  Therefore, there is no busy-wait or timed sleep.
 > > 
 > > Of course, a backend must set its status to WAIT_ON_FSYNC to avoid a
 > > race condition.
 > 
 > The 2nd edition is much better. But I'm not sure do we really need in
 > these per-backend bytes in shmem. Why not just have some counters?
 > We can use a semaphore to wake-up all waiters at once.
 Yes, that is much better and clearer.  My idea was just to say, "if no
 one is entering commit phase, do the commit.  If someone else is coming,
 sleep and wait for them to do the fsync and wake me up with a signal."  
 > 
 > > This allows a single backend not to sleep, and allows multiple backends
 > > to bunch up only when they are all about to commit.
 > > 
 > > The reason backend numbers are written is so other backends entering the
 > > commit code will not interfere with the backends performing fsync.
 > 
 > Being waked-up backend can check what's written/fsynced by calling XLogFlush.
 Seems that may not be needed anymore with a counter.  The only issue is
 that other backends may enter commit while fsync() is happening.  The
 process that did the fsync must be sure to wake up only the backends
 that were waiting for it, and not other backends that may also be
 doing fsync as a group while the first fsync was happening.  I leave
 those details to people more experienced.  :-)
 I am just glad people liked my idea.
 -- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
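
The "counter plus wake-up" variant discussed in this message can be sketched with threads standing in for backends. This is an assumption-laden illustration, not the WAL code: it swaps the per-backend status bytes for two sequence counters, and `GroupCommit`, `written`, `flushed` and `do_fsync` are hypothetical names. A committer whose record is not yet covered either becomes the one doing the fsync or sleeps until woken, and a woken committer goes back to sleep (or flushes) if its record arrived after the fsync started.

```python
import threading


class GroupCommit:
    """Hypothetical sketch: one committer fsyncs on behalf of everyone waiting."""

    def __init__(self, do_fsync):
        self._do_fsync = do_fsync      # stand-in for the real fsync() call
        self._cond = threading.Condition(threading.Lock())
        self.written = 0               # commit records written so far
        self.flushed = 0               # commit records known to be on disk
        self._flushing = False         # True while some thread is inside fsync

    def commit(self):
        with self._cond:
            self.written += 1
            my_seq = self.written
            while self.flushed < my_seq:
                if self._flushing:
                    # Someone else is syncing: sleep until they wake us,
                    # then re-check whether our record was covered.
                    self._cond.wait()
                    continue
                # Nobody is syncing, so this thread does it for the group.
                self._flushing = True
                target = self.written  # everything written up to now
                self._cond.release()   # let others queue up during fsync
                try:
                    self._do_fsync()
                finally:
                    self._cond.acquire()
                    self._flushing = False
                    self._cond.notify_all()
                self.flushed = max(self.flushed, target)
        return my_seq


if __name__ == "__main__":
    sync_calls = []
    group = GroupCommit(lambda: sync_calls.append(1))
    workers = [threading.Thread(target=group.commit) for _ in range(8)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    print("commits:", group.flushed, "fsync calls:", len(sync_calls))
```

A single committer never sleeps, and the comparison against `flushed` is what keeps a finished fsync from releasing committers whose records were written after it began.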
--- a/doc/TODO.detail/optimizer
+++ b/doc/TODO.detail/optimizer
--- a/doc/TODO.detail/persistent
+++ b/doc/TODO.detail/persistent
@@ -1,102 +0,0 @@
 From owner-pgsql-hackers@hub.org Mon May 11 11:31:09 1998
 Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
 	by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id LAA03006
 	for <maillist@candle.pha.pa.us>; Mon, 11 May 1998 11:31:07 -0400 (EDT)
 Received: from hub.org (hub.org [209.47.148.200]) by renoir.op.net (o1/$ Revision: 1.17 $) with ESMTP id LAA01663 for <maillist@candle.pha.pa.us>; Mon, 11 May 1998 11:24:42 -0400 (EDT)
 Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id LAA21841; Mon, 11 May 1998 11:15:25 -0400 (EDT)
 Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Mon, 11 May 1998 11:15:12 +0000 (EDT)
 Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id LAA21683 for pgsql-hackers-outgoing; Mon, 11 May 1998 11:15:09 -0400 (EDT)
 Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [206.210.65.6]) by hub.org (8.8.8/8.7.5) with ESMTP id LAA21451 for <hackers@postgreSQL.org>; Mon, 11 May 1998 11:15:03 -0400 (EDT)
 Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1])
 	by sss.sss.pgh.pa.us (8.8.5/8.8.5) with ESMTP id LAA24915;
 	Mon, 11 May 1998 11:14:43 -0400 (EDT)
 To: Brett McCormick <brett@work.chicken.org>
 cc: hackers@postgreSQL.org
 Subject: Re: [HACKERS] Re: [PATCHES] Try again: S_LOCK reduced contentionh] 
 In-reply-to: Your message of Mon, 11 May 1998 07:57:23 -0700 (PDT) 
             <13655.4384.345723.466046@abraxas.scene.com> 
 Date: Mon, 11 May 1998 11:14:43 -0400
 Message-ID: <24913.894899683@sss.pgh.pa.us>
 From: Tom Lane <tgl@sss.pgh.pa.us>
 Sender: owner-pgsql-hackers@hub.org
 Precedence: bulk
 Status: RO
 Brett McCormick <brett@work.chicken.org> writes:
 > same way that the current network socket is passed -- through an execv
 > argument.  hopefully, however, the non-execv()ing fork will be in 6.4.
 Um, you missed the point, Brett.  David was hoping to transfer a client
 connection from the postmaster to an *already existing* backend process.
 Fork, with or without exec, solves the problem for a backend that's
 started after the postmaster has accepted the client socket.
 This does lead to a different line of thought, however.  Pre-started
 backends would have access to the "master" connection socket on which
 the postmaster listens for client connections, right?  Suppose that we
 fire the postmaster as postmaster, and demote it to being simply a
 manufacturer of new backend processes as old ones get used up.  Have
 one of the idle backend processes be the one doing the accept() on the
 master socket.  Once it has a client connection, it performs the
 authentication handshake and then starts serving the client (or just
 quits if authentication fails).  Meanwhile the next idle backend process
 has executed accept() on the master socket and is waiting for the next
 client; and shortly the postmaster/factory/whateverwecallitnow notices
 that it needs to start another backend to add to the idle-backend pool.
 This'd probably need some interlocking among the backends.  I have no
 idea whether it'd be safe to have all the idle backends trying to
 do accept() on the master socket simultaneously, but it sounds risky.
 Better to use a mutex so that only one gets to do it while the others
 sleep.
 			regards, tom lane
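
The idle-backend pool with a mutex around accept() might look roughly like this. It is a sketch only: Python threads stand in for pre-started backend processes, each "backend" serves one client and exits, and the names `idle_backend` and `run_pool` are invented for the example. Authentication and the factory process that refills the pool are left out.

```python
import socket
import threading


def idle_backend(listener, accept_lock, worker_id, served):
    """One pre-started backend: wait for a client, serve it, then exit."""
    with accept_lock:
        # Only one idle backend sits in accept(); the others sleep on the lock.
        conn, _addr = listener.accept()
    with conn:
        conn.sendall(f"backend {worker_id}\n".encode())
    served.append(worker_id)


def run_pool(n_workers):
    """Start n_workers idle backends and connect one client per backend."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))        # port 0: let the OS pick a free port
    listener.listen(n_workers)
    port = listener.getsockname()[1]

    accept_lock = threading.Lock()
    served = []
    workers = [
        threading.Thread(target=idle_backend,
                         args=(listener, accept_lock, i, served))
        for i in range(n_workers)
    ]
    for worker in workers:
        worker.start()

    replies = []
    for _ in range(n_workers):
        with socket.create_connection(("127.0.0.1", port)) as client:
            with client.makefile("r") as reader:
                replies.append(reader.readline().strip())

    for worker in workers:
        worker.join()
    listener.close()
    return replies, served


if __name__ == "__main__":
    print(run_pool(3))
```

The lock is released as soon as accept() returns, so the next idle backend is already waiting for the following client while the first one is still serving.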
 From owner-pgsql-hackers@hub.org Mon May 11 11:35:55 1998
 Received: from hub.org (hub.org [209.47.148.200])
 	by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id LAA03043
 	for <maillist@candle.pha.pa.us>; Mon, 11 May 1998 11:35:53 -0400 (EDT)
 Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id LAA23494; Mon, 11 May 1998 11:27:10 -0400 (EDT)
 Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Mon, 11 May 1998 11:27:02 +0000 (EDT)
 Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id LAA23473 for pgsql-hackers-outgoing; Mon, 11 May 1998 11:27:01 -0400 (EDT)
 Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [206.210.65.6]) by hub.org (8.8.8/8.7.5) with ESMTP id LAA23462 for <hackers@postgreSQL.org>; Mon, 11 May 1998 11:26:56 -0400 (EDT)
 Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1])
 	by sss.sss.pgh.pa.us (8.8.5/8.8.5) with ESMTP id LAA25006;
 	Mon, 11 May 1998 11:26:44 -0400 (EDT)
 To: Brett McCormick <brett@work.chicken.org>
 cc: hackers@postgreSQL.org
 Subject: Re: [HACKERS] Re: [PATCHES] Try again: S_LOCK reduced contentionh] 
 In-reply-to: Your message of Mon, 11 May 1998 07:57:23 -0700 (PDT) 
             <13655.4384.345723.466046@abraxas.scene.com> 
 Date: Mon, 11 May 1998 11:26:44 -0400
 Message-ID: <25004.894900404@sss.pgh.pa.us>
 From: Tom Lane <tgl@sss.pgh.pa.us>
 Sender: owner-pgsql-hackers@hub.org
 Precedence: bulk
 Status: RO
 Meanwhile, *I* missed the point about Brett's second comment :-(
 Brett McCormick <brett@work.chicken.org> writes:
 > There will have to be some sort of arg parsing in any case,
 > considering that you can pass configurable arguments to the backend..
 If we do the sort of change David and I were just discussing, then the
 pre-spawned backend would become responsible for parsing and dealing
 with the PGOPTIONS portion of the client's connection request message.
 That's just part of shifting the authentication handshake code from
 postmaster to backend, so it shouldn't be too hard.
 BUT: the whole point is to be able to initialize the backend before it
 is connected to a client.  How much of the expensive backend startup
 work depends on having the client connection options available?
 Any work that needs to know the options will have to wait until after
 the client connects.  If that means most of the startup work can't
 happen in advance anyway, then we're out of luck; a pre-started backend
 won't save enough time to be worth the effort.  (Unless we are willing
 to eliminate or redefine the troublesome options...)
 			regards, tom lane
--- a/doc/TODO.detail/pool
+++ b/doc/TODO.detail/pool
@ -1319,3 +1319,105 @@ DDI: +64(4)916-7201    MOB: +64(21)635-694    OFFICE: +64(4)499-2267
 ---------------------------(end of broadcast)---------------------------
 TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
 From owner-pgsql-hackers@hub.org Mon May 11 11:31:09 1998
 Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
 	by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id LAA03006
 	for <maillist@candle.pha.pa.us>; Mon, 11 May 1998 11:31:07 -0400 (EDT)
 Received: from hub.org (hub.org [209.47.148.200]) by renoir.op.net (o1/$ Revision: 1.17 $) with ESMTP id LAA01663 for <maillist@candle.pha.pa.us>; Mon, 11 May 1998 11:24:42 -0400 (EDT)
 Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id LAA21841; Mon, 11 May 1998 11:15:25 -0400 (EDT)
 Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Mon, 11 May 1998 11:15:12 +0000 (EDT)
 Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id LAA21683 for pgsql-hackers-outgoing; Mon, 11 May 1998 11:15:09 -0400 (EDT)
 Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [206.210.65.6]) by hub.org (8.8.8/8.7.5) with ESMTP id LAA21451 for <hackers@postgreSQL.org>; Mon, 11 May 1998 11:15:03 -0400 (EDT)
 Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1])
 	by sss.sss.pgh.pa.us (8.8.5/8.8.5) with ESMTP id LAA24915;
 	Mon, 11 May 1998 11:14:43 -0400 (EDT)
 To: Brett McCormick <brett@work.chicken.org>
 cc: hackers@postgreSQL.org
 Subject: Re: [HACKERS] Re: [PATCHES] Try again: S_LOCK reduced contentionh] 
 In-reply-to: Your message of Mon, 11 May 1998 07:57:23 -0700 (PDT) 
             <13655.4384.345723.466046@abraxas.scene.com> 
 Date: Mon, 11 May 1998 11:14:43 -0400
 Message-ID: <24913.894899683@sss.pgh.pa.us>
 From: Tom Lane <tgl@sss.pgh.pa.us>
 Sender: owner-pgsql-hackers@hub.org
 Precedence: bulk
 Status: RO
 Brett McCormick <brett@work.chicken.org> writes:
 > same way that the current network socket is passed -- through an execv
 > argument.  hopefully, however, the non-execv()ing fork will be in 6.4.
 Um, you missed the point, Brett.  David was hoping to transfer a client
 connection from the postmaster to an *already existing* backend process.
 Fork, with or without exec, solves the problem for a backend that's
 started after the postmaster has accepted the client socket.
 This does lead to a different line of thought, however.  Pre-started
 backends would have access to the "master" connection socket on which
 the postmaster listens for client connections, right?  Suppose that we
 fire the postmaster as postmaster, and demote it to being simply a
 manufacturer of new backend processes as old ones get used up.  Have
 one of the idle backend processes be the one doing the accept() on the
 master socket.  Once it has a client connection, it performs the
 authentication handshake and then starts serving the client (or just
 quits if authentication fails).  Meanwhile the next idle backend process
 has executed accept() on the master socket and is waiting for the next
 client; and shortly the postmaster/factory/whateverwecallitnow notices
 that it needs to start another backend to add to the idle-backend pool.
 This'd probably need some interlocking among the backends.  I have no
 idea whether it'd be safe to have all the idle backends trying to
 do accept() on the master socket simultaneously, but it sounds risky.
 Better to use a mutex so that only one gets to do it while the others
 sleep.
 			regards, tom lane
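The pre-started-backend idea in the message above can be sketched roughly as follows. This is a minimal illustration only, not PostgreSQL code: the names `worker` and `run_pool` are hypothetical, a POSIX record lock on a temporary file stands in for the "mutex", and each worker serves exactly one client and exits. The step where the factory process starts replacement backends is left out. It assumes a Unix-like system with `fork()`.

```python
import fcntl
import os
import socket
import tempfile


def worker(listener, lock_file):
    """Idle backend: wait for the accept lock, take one client, serve it, exit."""
    fcntl.lockf(lock_file, fcntl.LOCK_EX)      # only one idle backend may accept()
    try:
        conn, _addr = listener.accept()
    finally:
        fcntl.lockf(lock_file, fcntl.LOCK_UN)  # let the next idle backend accept()
    with conn:
        # Stand-in for "authentication handshake, then serve the client".
        conn.sendall(str(os.getpid()).encode())


def run_pool(n_workers=3):
    """Factory process: open the master socket, pre-fork workers, then act as clients."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))            # port 0: let the OS pick a free port
    listener.listen(n_workers)
    port = listener.getsockname()[1]
    lock_file = tempfile.TemporaryFile()       # inherited by every forked worker

    pids = []
    for _ in range(n_workers):
        pid = os.fork()
        if pid == 0:                           # child: become an idle backend
            try:
                worker(listener, lock_file)
            finally:
                os._exit(0)
        pids.append(pid)

    served_by = []
    for _ in range(n_workers):                 # demo clients, one per worker
        with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
            served_by.append(int(client.makefile("rb").read()))
    for pid in pids:
        os.waitpid(pid, 0)
    listener.close()
    lock_file.close()
    return pids, served_by
```

Calling `run_pool(3)` returns the worker PIDs and the PID each demo client was served by; since every worker handles one connection, the two lists contain the same PIDs, possibly in a different order.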
 From owner-pgsql-hackers@hub.org Mon May 11 11:35:55 1998
 Received: from hub.org (hub.org [209.47.148.200])
 	by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id LAA03043
 	for <maillist@candle.pha.pa.us>; Mon, 11 May 1998 11:35:53 -0400 (EDT)
 Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id LAA23494; Mon, 11 May 1998 11:27:10 -0400 (EDT)
 Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Mon, 11 May 1998 11:27:02 +0000 (EDT)
 Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id LAA23473 for pgsql-hackers-outgoing; Mon, 11 May 1998 11:27:01 -0400 (EDT)
 Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [206.210.65.6]) by hub.org (8.8.8/8.7.5) with ESMTP id LAA23462 for <hackers@postgreSQL.org>; Mon, 11 May 1998 11:26:56 -0400 (EDT)
 Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1])
 	by sss.sss.pgh.pa.us (8.8.5/8.8.5) with ESMTP id LAA25006;
 	Mon, 11 May 1998 11:26:44 -0400 (EDT)
 To: Brett McCormick <brett@work.chicken.org>
 cc: hackers@postgreSQL.org
 Subject: Re: [HACKERS] Re: [PATCHES] Try again: S_LOCK reduced contentionh] 
 In-reply-to: Your message of Mon, 11 May 1998 07:57:23 -0700 (PDT) 
             <13655.4384.345723.466046@abraxas.scene.com> 
 Date: Mon, 11 May 1998 11:26:44 -0400
 Message-ID: <25004.894900404@sss.pgh.pa.us>
 From: Tom Lane <tgl@sss.pgh.pa.us>
 Sender: owner-pgsql-hackers@hub.org
 Precedence: bulk
 Status: RO
 Meanwhile, *I* missed the point about Brett's second comment :-(
 Brett McCormick <brett@work.chicken.org> writes:
 > There will have to be some sort of arg parsing in any case,
 > considering that you can pass configurable arguments to the backend..
 If we do the sort of change David and I were just discussing, then the
 pre-spawned backend would become responsible for parsing and dealing
 with the PGOPTIONS portion of the client's connection request message.
 That's just part of shifting the authentication handshake code from
 postmaster to backend, so it shouldn't be too hard.
 BUT: the whole point is to be able to initialize the backend before it
 is connected to a client.  How much of the expensive backend startup
 work depends on having the client connection options available?
 Any work that needs to know the options will have to wait until after
 the client connects.  If that means most of the startup work can't
 happen in advance anyway, then we're out of luck; a pre-started backend
 won't save enough time to be worth the effort.  (Unless we are willing
 to eliminate or redefine the troublesome options...)
 			regards, tom lane
--- a/doc/TODO.detail/prepare
+++ b/doc/TODO.detail/prepare
--- a/doc/TODO.detail/replication
+++ b/doc/TODO.detail/replication
--- a/doc/TODO.detail/typeconv
+++ b/doc/TODO.detail/typeconv
@ -1,916 +0,0 @@
 From pgsql-hackers-owner+M1833@hub.org Sat May 13 22:49:26 2000
 Received: from news.tht.net (news.hub.org [216.126.91.242])
 	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA07394
 	for <pgman@candle.pha.pa.us>; Sat, 13 May 2000 22:49:24 -0400 (EDT)
 Received: from hub.org (majordom@hub.org [216.126.84.1])
 	by news.tht.net (8.9.3/8.9.3) with ESMTP id WAB99859;
 	Sat, 13 May 2000 22:44:15 -0400 (EDT)
 	(envelope-from pgsql-hackers-owner+M1833@hub.org)
 Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
 	by hub.org (8.9.3/8.9.3) with ESMTP id WAA51058
 	for <pgsql-hackers@postgreSQL.org>; Sat, 13 May 2000 22:41:16 -0400 (EDT)
 	(envelope-from tgl@sss.pgh.pa.us)
 Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
 	by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id WAA18343
 	for <pgsql-hackers@postgreSQL.org>; Sat, 13 May 2000 22:40:38 -0400 (EDT)
 To: pgsql-hackers@postgresql.org
 Subject: [HACKERS] Proposal for fixing numeric type-resolution issues
 Date: Sat, 13 May 2000 22:40:38 -0400
 Message-ID: <18340.958272038@sss.pgh.pa.us>
 From: Tom Lane <tgl@sss.pgh.pa.us>
 X-Mailing-List: pgsql-hackers@postgresql.org
 Precedence: bulk
 Sender: pgsql-hackers-owner@hub.org
 Status: ORr
 We've got a collection of problems that are related to the parser's
 inability to make good type-resolution choices for numeric constants.
 In some cases you get a hard error; for example "NumericVar + 4.4"
 yields
 ERROR:  Unable to identify an operator '+' for types 'numeric' and 'float8'
        You will have to retype this query using an explicit cast
 because "4.4" is initially typed as float8 and the system can't figure
 out whether to use numeric or float8 addition.  A more subtle problem
 is that a query like "... WHERE Int2Var < 42" is unable to make use of
 an index on the int2 column: 42 is resolved as int4, so the operator
 is int24lt, which works but is not in the opclass of an int2 index.
 Here is a proposal for fixing these problems.  I think we could get this
 done for 7.1 if people like it.
 The basic problem is that there's not enough smarts in the type resolver
 about the interrelationships of the numeric datatypes.  All it has is
 a concept of a most-preferred type within the category of numeric types.
 (We are abusing the most-preferred-type mechanism, BTW, because both
 FLOAT8 and NUMERIC claim to be the most-preferred type in the numeric
 category!  This is in fact why the resolver can't make a choice for
 "numeric+float8".)  We need more intelligence than that.
 I propose that we set up a strictly-ordered hierarchy of numeric
 datatypes, running from least preferred to most preferred:
 	int2, int4, int8, numeric, float4, float8.
 Rather than simply considering coercions to the most-preferred type,
 the type resolver should use the following rules:
 1. No value will be down-converted (eg int4 to int2) except by an
 explicit conversion.
 2. If there is not an exact matching operator, numeric values will be
 up-converted to the highest numeric datatype present among the operator
 or function's arguments.  For example, given "int2 + int8" we'd up-
 convert the int2 to int8 and apply int8 addition.
 The final piece of the puzzle is that the type initially assigned to
 an undecorated numeric constant should be NUMERIC if it contains a
 decimal point or exponent, and otherwise the smallest of int2, int4,
 int8, NUMERIC that will represent it.  This is a considerable change
 from the current lexer behavior, where you get either int4 or float8.
 For example, given "NumericVar + 4.4", the constant 4.4 will initially
 be assigned type NUMERIC, we will resolve the operator as numeric plus,
 and everything's fine.  Given "Float8Var + 4.4", the constant is still
 initially numeric, but will be up-converted to float8 so that float8
 addition can be used.  The end result is the same as in traditional
 Postgres: you get float8 addition.  Given "Int2Var < 42", the constant
 is initially typed as int2, since it fits, and we end up selecting
 int2lt, thereby allowing use of an int2 index.  (On the other hand,
 given "Int2Var < 100000", we'd end up using int4lt, which is correct
 to avoid overflow.)
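As a rough illustration of the two rules and the literal-typing rule above, here is a small sketch. It is not the parser's code: `literal_type` and `resolve_operand_type` are hypothetical names, and the sketch assumes that only same-type operators exist, so rule 2 always applies when the operand types differ.

```python
# Hierarchy from the proposal, least preferred first.
HIERARCHY = ["int2", "int4", "int8", "numeric", "float4", "float8"]

INT_RANGES = [
    ("int2", -2**15, 2**15 - 1),
    ("int4", -2**31, 2**31 - 1),
    ("int8", -2**63, 2**63 - 1),
]


def literal_type(text):
    """Initial type of an undecorated numeric constant."""
    if any(ch in text for ch in ".eE"):
        return "numeric"            # decimal point or exponent: NUMERIC
    value = int(text)
    for name, low, high in INT_RANGES:
        if low <= value <= high:
            return name             # smallest integer type that holds it
    return "numeric"                # too big even for int8


def resolve_operand_type(left, right):
    """Rule 2: up-convert to the highest type present; never down-convert."""
    return max(left, right, key=HIERARCHY.index)


print(resolve_operand_type("numeric", literal_type("4.4")))   # NumericVar + 4.4
print(resolve_operand_type("float8", literal_type("4.4")))    # Float8Var + 4.4
print(resolve_operand_type("int2", literal_type("42")))       # Int2Var < 42
print(resolve_operand_type("int2", literal_type("100000")))   # Int2Var < 100000
```

The four calls correspond to the four examples in the paragraph above and select numeric, float8, int2 and int4 operators respectively.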
 A couple of crucial subtleties here:
 1. We are assuming that the parser or optimizer will constant-fold
 any conversion functions that are introduced.  Thus, in the
 "Float8Var + 4.4" case, the 4.4 is represented as a float8 4.4 by the
 time execution begins, so there's no performance loss.
 2. We cannot lose precision by initially representing a constant as
 numeric and later converting it to float.  Nor can we exceed NUMERIC's
 range (the default 1000-digit limit is more than the range of IEEE
 float8 data).  It would not work as well to start out by representing
 a constant as float and then converting it to numeric.
 Presently, the pg_proc and pg_operator tables contain a pretty fair
 collection of cross-datatype numeric operators, such as int24lt,
 float48pl, etc.  We could perhaps leave these in, but I believe that
 it is better to remove them.  For example, if int42lt is left in place,
 then it would capture cases like "Int4Var < 42", whereas we need that
 to be translated to int4lt so that an int4 index can be used.  Removing
 these operators will eliminate some code bloat and system-catalog bloat
 to boot.
 As far as I can tell, this proposal is almost compatible with the rules
 given in SQL92: in particular, SQL92 specifies that an operator having
 both "approximate numeric" (float) and "exact numeric" (int or numeric)
 inputs should deliver an approximate-numeric result.  I propose
 deviating from SQL92 in a single respect: SQL92 specifies that a
 constant containing an exponent (eg 1.2E34) is approximate numeric,
 which implies that the result of an operator using it is approximate
 even if the other operand is exact.  I believe it's better to treat
 such a constant as exact (ie, type NUMERIC) and only convert it to
 float if the other operand is float.  Without doing that, an assignment
 like
 	UPDATE tab SET NumericVar = 1.234567890123456789012345E34;
 will not work as desired because the constant will be prematurely
 coerced to float, causing precision loss.
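The precision argument can be checked outside the database. The sketch below uses Python's `decimal.Decimal` as a stand-in for NUMERIC and Python's `float` as a stand-in for float8; that mapping is an assumption made for illustration, not something taken from the Postgres sources.

```python
from decimal import Decimal

literal = "1.234567890123456789012345E34"

# Exact first (NUMERIC-like): all 25 significant digits are kept.
exact = Decimal(literal)

# Float first, then exact: the constant was already rounded to a double,
# so converting it to an exact type afterwards cannot recover the digits.
float_first = Decimal(float(literal))

# Exact first, then float: same double as parsing the literal directly,
# so nothing is lost by starting out exact.
exact_then_float = float(exact)

print(exact == float_first)                 # False: digits were lost
print(exact_then_float == float(literal))   # True: no extra loss
```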
 Comments?
 			regards, tom lane
 From tgl@sss.pgh.pa.us Sun May 14 17:30:56 2000
 Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
 	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA05808
 	for <pgman@candle.pha.pa.us>; Sun, 14 May 2000 17:30:52 -0400 (EDT)
 Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.4 $) with ESMTP id RAA16657 for <pgman@candle.pha.pa.us>; Sun, 14 May 2000 17:29:52 -0400 (EDT)
 Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
 	by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id RAA20914;
 	Sun, 14 May 2000 17:29:30 -0400 (EDT)
 To: Bruce Momjian <pgman@candle.pha.pa.us>
 cc: PostgreSQL-development <pgsql-hackers@postgreSQL.org>
 Subject: Re: [HACKERS] type conversion discussion 
 In-reply-to: <200005141950.PAA04636@candle.pha.pa.us> 
 References: <200005141950.PAA04636@candle.pha.pa.us>
 Comments: In-reply-to Bruce Momjian <pgman@candle.pha.pa.us>
 	message dated "Sun, 14 May 2000 15:50:20 -0400"
 Date: Sun, 14 May 2000 17:29:30 -0400
 Message-ID: <20911.958339770@sss.pgh.pa.us>
 From: Tom Lane <tgl@sss.pgh.pa.us>
 Status: OR
 Bruce Momjian <pgman@candle.pha.pa.us> writes:
 > At some point, it seems we need to get all the PostgreSQL minds together
 > to discuss type conversion issues.  These problems continue to come up
 > from release to release.  We are getting better, but it seems a full
 > discussion could help solidify our strategy.
 OK, here are a few things that bug me about the current type-resolution
 code:
 1. Poor choice of type to attribute to numeric literals.  (A possible
   solution is sketched in my earlier message, but do we need similar
   mechanisms for other type categories?)
 2. Tensions between treating string literals as "unknown" type and
   as "text" type, per this thread so far.
 3. IS_BINARY_COMPATIBLE seems like a bogus concept.  Do we really want a
   fully symmetrical ring of types in each group?  I'd prefer to see a
   one-way equivalence, which allows eg. OID to be silently converted
   to INT4, but *not* vice versa (except perhaps by specific user cast).
   This'd be more like a traditional "is-a" or inheritance relationship
   between datatypes, which has well-understood semantics.
 4. I'm also concerned that the behavior of IS_BINARY_COMPATIBLE isn't
   very predictable because it will happily go either way.  For example,
   if I do 
 	select * from pg_class where oid = 1234;
   it's unclear whether I will get an oideq or an int4eq operator ---
   and that's a rather critical point since only one of them can exploit
   an index on the oid column.  Currently, there is some klugery in the
   planner that works around this by overriding the parser's choice of
   operator to substitute one that is compatible with an available index.
   That's a pretty ugly solution ... I'm not sure I know a better one,
   but as long as we're discussing type resolution issues ...
 5. Lack of extensibility.  There's way too much knowledge hard-wired
   into the parser about type categories, preferred types, binary
   compatibility, etc.  All of it falls down when faced with
   user-defined datatypes.  If we do something like I suggested with
   a hardwired hierarchy of numeric datatypes, it'll get even worse.
   All this stuff ought to be driven off fields in pg_type rather than
   be hardwired into the code, so that the same concepts can be extended
   to user-defined types.
 I don't have worked-out proposals for any of these but the first,
 but they've all been bothering me for a while.
 			regards, tom lane
 From tgl@sss.pgh.pa.us Sun May 14 21:02:31 2000
 Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
 	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id VAA07700
 	for <pgman@candle.pha.pa.us>; Sun, 14 May 2000 21:02:28 -0400 (EDT)
 Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
 	by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id VAA21261;
 	Sun, 14 May 2000 21:03:17 -0400 (EDT)
 To: Bruce Momjian <pgman@candle.pha.pa.us>
 cc: PostgreSQL-development <pgsql-hackers@postgreSQL.org>
 Subject: Re: [HACKERS] type conversion discussion 
 In-reply-to: <20911.958339770@sss.pgh.pa.us> 
 References: <200005141950.PAA04636@candle.pha.pa.us> <20911.958339770@sss.pgh.pa.us>
 Comments: In-reply-to Tom Lane <tgl@sss.pgh.pa.us>
 	message dated "Sun, 14 May 2000 17:29:30 -0400"
 Date: Sun, 14 May 2000 21:03:17 -0400
 Message-ID: <21258.958352597@sss.pgh.pa.us>
 From: Tom Lane <tgl@sss.pgh.pa.us>
 Status: OR
 Here are the results of some further thoughts about type-conversion
 issues.  This is not a complete proposal yet, but a sketch of an
 approach that might solve several of the gripes in my previous proposal.
 While thinking about this, I realized that my numeric-types proposal
 of yesterday would break at least a few cases that work nicely now.
 For example, I frequently do things like
 	select * from pg_class where oid = 1234;
 whilst poking around in system tables and querytree dumps.  If that
 constant is initially resolved as int2, as I suggested yesterday,
 then we have "oid = int2" for which there is no operator.  To succeed
 we must decide to promote the constant to int4 --- but with no int4
 visible among the operands of the "=", it will not work to just "promote
 numerics to the highest type seen in the operands" as I suggested
 yesterday.  So there has to be some more interaction in there.
 Anyway, I was complaining about the looseness of the concept of
 binary-compatible types and the fact that the parser's type conversion
 knowledge is mostly hardwired.  These might be resolved by generalizing
 the numeric type hierarchy idea into a "type promotion lattice", which
 would work like this:
 * Add a "typpromote" column to pg_type, which contains either zero or
  the OID of another type that the parser is allowed to promote this
  type to when searching for usable functions/operators.  For example,
  my numeric-types hierarchy of yesterday would be expressed by making
  int2 promote to int4, int4 to int8, int8 to numeric, numeric to
  float4, and float4 to float8.  The promotion idea also replaces the
  current concept of binary-compatible types: for example, OID would
  link to int4 and varchar would link to text (but not vice versa!).
 * Also add a "typpromotebin" boolean column to pg_type, which contains
  't' if the type conversion indicated by typpromote is "free", ie,
  no conversion function need be executed before regarding a value as
  belonging to the promoted type.  This distinguishes binary-compatible
  from non-binary-compatible cases.  If "typpromotebin" is 'f' and the
  parser decides it needs to apply the conversion, then it has to look
  up the appropriate conversion function in pg_proc.  (More about this
  below.)
 Now, if the parser fails to find an exact match for a given function
 or operator name and the exact set of input data types, it proceeds by
 chasing up the promotion chains for the input data types and trying to
 locate a set of types for which there is a matching function/operator.
 If there are multiple possibilities, we choose the one which is the
 "least promoted" by some yet-to-be-determined metric.  (This metric
 would probably favor "free" conversions over non-free ones, but other
 than that I'm not quite sure how it should work.  The metric would
 replace a whole bunch of ad-hoc heuristics that are currently applied
 in the type resolver, so even if it seems rather ad-hoc it'd still be
 cleaner than what we have ;-).)
 In a situation like the "oid = int2" example above, this mechanism would
 presumably settle on "int4 = int4" as being the least-promoted
 equivalent operator.  (It could not find "oid = oid" since there is
 no promotion path from int2 to oid.)  That looks bad since it isn't
 compatible with an oidops index --- but I have a solution for that!
 I don't think we need the oid opclass at all; why shouldn't indexes
 on oid be expressed as int4 indexes to begin with?  In general, if
 two types are considered binary-equivalent under the old scheme, then
 the one that is considered the subtype probably shouldn't have separate
 index operators under this new scheme.  Instead it should just rely on
 the index operators of the promoted type.
 The point of the proposed typpromotebin field is to save a pg_proc
 lookup when trying to determine whether a particular promotion is "free"
 or not.  We could save even more lookups if we didn't store the boolean
 but instead the actual OID of the conversion function, or zero if the
 promotion is "free".  The trouble with that is that it creates a
 circularity problem when trying to define a new user type --- you can't
 define the conversion function if its input type doesn't exist yet.
 In any case, we want the parser to do a function lookup if we've
 advanced more than one step in the promotion hierarchy: if we've decided
 to promote int4 to float8 (which will be a four-step chain through int8,
 numeric, float4) we sure want the thing to use a direct int4tofloat8
 conversion function if available, not a chain of four conversion
 functions.  So on balance I think we want to look in pg_proc once we've
 decided which conversion to perform.  The only reason for having
 typpromotebin is that the promotion metric will want to know which
 conversions are free, and we don't want to have to do a lookup in
 pg_proc for each alternative we consider, only the ones that are finally
 selected to be used.
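The lookup described above might be sketched like this. Everything here is hypothetical: the `TYPPROMOTE` table mimics the proposed pg_type column, the operator catalog is a plain set of type pairs, and the "least promoted" metric is simply the total number of promotion steps, chosen only because the message leaves the real metric undetermined.

```python
# Proposed typpromote links: each type names the one type it may promote to.
TYPPROMOTE = {
    "int2": "int4", "int4": "int8", "int8": "numeric",
    "numeric": "float4", "float4": "float8", "float8": None,
    "oid": "int4", "varchar": "text", "text": None,
}


def promotion_chain(type_name):
    """The type itself followed by everything it can be promoted to, in order."""
    chain = []
    while type_name is not None:
        chain.append(type_name)
        type_name = TYPPROMOTE[type_name]
    return chain


def resolve(operators, left, right):
    """Return the (left, right) operator signature reachable with the fewest
    promotion steps, or None if no promotion of the inputs matches."""
    best = None
    for i, lt in enumerate(promotion_chain(left)):
        for j, rt in enumerate(promotion_chain(right)):
            if (lt, rt) in operators and (best is None or i + j < best[0]):
                best = (i + j, lt, rt)
    return None if best is None else (best[1], best[2])


# A toy catalog of "=" operators, one per type.
EQ_OPERATORS = {("oid", "oid"), ("int2", "int2"), ("int4", "int4"),
                ("int8", "int8"), ("text", "text")}

print(resolve(EQ_OPERATORS, "oid", "int2"))   # ('int4', 'int4')
```

An exact match costs zero steps and therefore always wins under this metric. A real metric would presumably also weigh "free" (typpromotebin) conversions differently, which this sketch ignores.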
 I can think of at least one special case that still isn't cleanly
 handled under this scheme, and that is bpchar vs. varchar comparison.
 Currently, we have
 regression=# select 'a'::bpchar = 'a '::bpchar;
 ?column?
 ----------
 t
 (1 row)
 This is correct since trailing blanks are insignificant in bpchar land,
 so the two values should be considered equal.  If we try
 regression=# select 'a'::bpchar = 'a '::varchar;
 ERROR:  Unable to identify an operator '=' for types 'bpchar' and 'varchar'
        You will have to retype this query using an explicit cast
 which is pretty bogus but at least it saves the system from making some
 random choice about whether bpchar or varchar comparison rules apply.
 On the other hand,
 regression=# select 'a'::bpchar = 'a '::text;
 ?column?
 ----------
 f
 (1 row)
 Here the bpchar value has been promoted to text and then text comparison
 (where trailing blanks *are* significant) is applied.  I'm not sure that
 we can really justify doing this in this case when we reject the bpchar
 vs varchar case, but maybe someone wants to argue that that's correct.
 The natural setup in my type-promotion scheme would be that both bpchar
 and varchar link to 'text' as their promoted type.  If we do nothing
 special then text-style comparison would be used in a bpchar vs varchar
 comparison, which is arguably wrong.
 One way to deal with this without introducing kluges into the type
 resolver is to provide a full set of bpchar vs text and text vs bpchar
 operators, and make sure that the promotion metric is such that these
 will be used in place of text vs text operators if they apply (which
 should hold, I think, for any reasonable metric).  This is probably
 the only way to get the "right" behavior in any case --- I think that
 the "right" behavior for such comparisons is to strip trailing blanks
 from the bpchar side but not the text/varchar side.  (I haven't checked
 to see if SQL92 agrees, though.)
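To make the three comparison behaviours concrete, here is a toy model using plain Python strings. The function names are hypothetical, and `bpchar_text_eq` encodes the "strip the bpchar side only" rule suggested above as an assumption, not existing behaviour.

```python
def bpchar_eq(a, b):
    """bpchar vs bpchar: trailing blanks are insignificant on both sides."""
    return a.rstrip(" ") == b.rstrip(" ")


def text_eq(a, b):
    """text vs text: trailing blanks are significant."""
    return a == b


def bpchar_text_eq(bp, txt):
    """Suggested mixed operator: strip blanks from the bpchar side only."""
    return bp.rstrip(" ") == txt


print(bpchar_eq("a", "a "))        # True, like 'a'::bpchar = 'a '::bpchar
print(text_eq("a", "a "))          # False, like the bpchar-promoted-to-text case
print(bpchar_text_eq("a ", "a"))   # True: blank on the bpchar side is ignored
print(bpchar_text_eq("a", "a "))   # False: blank on the text side counts
```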
 Another issue is how to fit resolution of "unknown" literals into this
 scheme.  We could probably continue to handle them more or less as we
 do now, but they might complicate the promotion metric.
 I am not clear yet on whether we'd still need the concept of "type
 categories" as they presently exist in the resolver.  It's possible
 that we wouldn't, which would be a nice simplification.  (If we do
 still need them, we should have a column in pg_type that defines the
 category of a type, instead of hard-wiring category assignments.)
 			regards, tom lane
 From e99re41@DoCS.UU.SE Mon May 15 07:39:03 2000
 Received: from meryl.it.uu.se (root@meryl.it.uu.se [130.238.12.42])
 	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id HAA10251
 	for <pgman@candle.pha.pa.us>; Mon, 15 May 2000 07:39:01 -0400 (EDT)
 Received: from Zebra.DoCS.UU.SE (e99re41@Zebra.DoCS.UU.SE [130.238.9.158])
 	by meryl.it.uu.se (8.8.5/8.8.5) with ESMTP id NAA10849;
 	Mon, 15 May 2000 13:39:45 +0200 (MET DST)
 Received: from localhost (e99re41@localhost) by Zebra.DoCS.UU.SE (8.6.12/8.6.12) with ESMTP id NAA26523; Mon, 15 May 2000 13:39:44 +0200
 X-Authentication-Warning: Zebra.DoCS.UU.SE: e99re41 owned process doing -bs
 Date: Mon, 15 May 2000 13:39:44 +0200 (MET DST)
 From: Peter Eisentraut <e99re41@DoCS.UU.SE>
 Reply-To: Peter Eisentraut <peter_e@gmx.net>
 To: Tom Lane <tgl@sss.pgh.pa.us>
 cc: Bruce Momjian <pgman@candle.pha.pa.us>,
        PostgreSQL-development <pgsql-hackers@postgresql.org>
 Subject: Re: [HACKERS] type conversion discussion 
 In-Reply-To: <20911.958339770@sss.pgh.pa.us>
 Message-ID: <Pine.GSO.4.02A.10005151309020.26399-100000@Zebra.DoCS.UU.SE>
 MIME-Version: 1.0
 Content-Type: TEXT/PLAIN; charset=iso-8859-1
 Content-Transfer-Encoding: 8bit
 X-MIME-Autoconverted: from QUOTED-PRINTABLE to 8bit by candle.pha.pa.us id HAA10251
 Status: OR
 On Sun, 14 May 2000, Tom Lane wrote:
 > 1. Poor choice of type to attribute to numeric literals.  (A possible
 >    solution is sketched in my earlier message, but do we need similar
 >    mechanisms for other type categories?)
 I think your plan looks good for the numerical land. (I'll ponder the oid
 issues in a second.) For other type categories, perhaps not. Should a line
 be promoted to a polygon so you can check if it contains a point? Or a
 polygon to a box? Higher dimensions? :-)
 > 2. Tensions between treating string literals as "unknown" type and
 >    as "text" type, per this thread so far.
 Yes, while we're at it, let's look at this in detail. I claim that
 something of the form 'xxx' should always be text (or char or whatever),
 period. Let's consider the cases where this could potentially clash with
 the current behaviour:
 a) The target type is unambiguously clear, e.g., UPDATE ... SET. Then you
 cast text to the target type. The effect is identical.
 b) The target type is completely unspecified, e.g. CREATE TABLE AS SELECT
 'xxx'; This will currently create an "unknown" column. It should arguably
 create a "text" column.
 Function argument resolution:
 c) There is only one function and it has a "text" argument. No-brainer.
 d) There is only one function and it has an argument other than text. Try
 to cast text to that type. (This is what's done in general, isn't it?)
 e) The function is overloaded for many types, amongst which is text. Then
 call the text version. I believe this would currently fail, which I'd
 consider a deficiency.
 f) The function is overloaded for many types, none of which is text. In
 that case you have to cast anyway, so you don't lose anything.
 One thing to also keep in mind regarding required casting for (b) and (f)
 is that SQL never allowed literals of "fancy" types (e.g., DATE) to have
 undecorated 'yyyy-mm-dd' constants, you always have to say DATE
 'yyyy-mm-dd'. What Postgres allows is a convenience where DATE would be
 obvious or implied. In the end it's a win-win situation: you tell the
 system what you want, and your code is clearer.
 > 3. IS_BINARY_COMPATIBLE seems like a bogus concept.
 At least it's bogus when used for types which are not actually binary
 compatible, e.g. int4 and oid. The result of the current implementation is
 that you can perfectly happily insert and retrieve negative numbers from
 oid fields.
 I'm not so sure about the value of this particular equivalency anyway.
 AFAICS the only functions that make sense for oids are comparisons (incl.
 min, max), adding integers to them, subtracting one oid from another.
 Silent mangling with int4 means that you can multiply them, square them,
 add floating point numbers to them (doesn't really work in practice
 though), all things that have no business with oids.
 I'd say define the operators that are useful for oids explicitly for oids
 and require casts for all others, so the users know what they're doing.
 The fact that an oid is also a number should be an implementation detail.
 In my mind oids are like pointers in C. Indiscriminate mangling of
 pointers and integers in C has long been dismissed as questionable coding.
 Of course I'd be very willing to consider counterexamples to these
 theories ...
 -- 
 Peter Eisentraut                  Sernanders väg 10:115
 peter_e@gmx.net                   75262 Uppsala
 http://yi.org/peter-e/            Sweden
 From tgl@sss.pgh.pa.us Tue Jun 13 04:58:20 2000
 Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
 	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id DAA24281
 	for <pgman@candle.pha.pa.us>; Tue, 13 Jun 2000 03:58:18 -0400 (EDT)
 Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
 	by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id DAA02571;
 	Tue, 13 Jun 2000 03:58:43 -0400 (EDT)
 To: Bruce Momjian <pgman@candle.pha.pa.us>
 cc: pgsql-hackers@postgresql.org
 Subject: Re: [HACKERS] Proposal for fixing numeric type-resolution issues 
 In-reply-to: <200006130741.DAA23502@candle.pha.pa.us> 
 References: <200006130741.DAA23502@candle.pha.pa.us>
 Comments: In-reply-to Bruce Momjian <pgman@candle.pha.pa.us>
 	message dated "Tue, 13 Jun 2000 03:41:56 -0400"
 Date: Tue, 13 Jun 2000 03:58:43 -0400
 Message-ID: <2568.960883123@sss.pgh.pa.us>
 From: Tom Lane <tgl@sss.pgh.pa.us>
 Status: OR
 Bruce Momjian <pgman@candle.pha.pa.us> writes:
 > Again, anything to add to the TODO here?
 IIRC, there was some unhappiness with the proposal you quote, so I'm
 not sure we've quite agreed what to do... but clearly something must
 be done.
 			regards, tom lane
 >> We've got a collection of problems that are related to the parser's
 >> inability to make good type-resolution choices for numeric constants.
 >> In some cases you get a hard error; for example "NumericVar + 4.4"
 >> yields
 >> ERROR:  Unable to identify an operator '+' for types 'numeric' and 'float8'
 >> You will have to retype this query using an explicit cast
 >> because "4.4" is initially typed as float8 and the system can't figure
 >> out whether to use numeric or float8 addition.  A more subtle problem
 >> is that a query like "... WHERE Int2Var < 42" is unable to make use of
 >> an index on the int2 column: 42 is resolved as int4, so the operator
 >> is int24lt, which works but is not in the opclass of an int2 index.
 >> 
 >> Here is a proposal for fixing these problems.  I think we could get this
 >> done for 7.1 if people like it.
 >> 
 >> The basic problem is that there's not enough smarts in the type resolver
 >> about the interrelationships of the numeric datatypes.  All it has is
 >> a concept of a most-preferred type within the category of numeric types.
 >> (We are abusing the most-preferred-type mechanism, BTW, because both
 >> FLOAT8 and NUMERIC claim to be the most-preferred type in the numeric
 >> category!  This is in fact why the resolver can't make a choice for
 >> "numeric+float8".)  We need more intelligence than that.
 >> 
 >> I propose that we set up a strictly-ordered hierarchy of numeric
 >> datatypes, running from least preferred to most preferred:
 >> int2, int4, int8, numeric, float4, float8.
 >> Rather than simply considering coercions to the most-preferred type,
 >> the type resolver should use the following rules:
 >> 
 >> 1. No value will be down-converted (eg int4 to int2) except by an
 >> explicit conversion.
 >> 
 >> 2. If there is not an exact matching operator, numeric values will be
 >> up-converted to the highest numeric datatype present among the operator
 >> or function's arguments.  For example, given "int2 + int8" we'd up-
 >> convert the int2 to int8 and apply int8 addition.
 >> 
 >> The final piece of the puzzle is that the type initially assigned to
 >> an undecorated numeric constant should be NUMERIC if it contains a
 >> decimal point or exponent, and otherwise the smallest of int2, int4,
 >> int8, NUMERIC that will represent it.  This is a considerable change
 >> from the current lexer behavior, where you get either int4 or float8.
 >> 
 >> For example, given "NumericVar + 4.4", the constant 4.4 will initially
 >> be assigned type NUMERIC, we will resolve the operator as numeric plus,
 >> and everything's fine.  Given "Float8Var + 4.4", the constant is still
 >> initially numeric, but will be up-converted to float8 so that float8
 >> addition can be used.  The end result is the same as in traditional
 >> Postgres: you get float8 addition.  Given "Int2Var < 42", the constant
 >> is initially typed as int2, since it fits, and we end up selecting
 >> int2lt, thereby allowing use of an int2 index.  (On the other hand,
 >> given "Int2Var < 100000", we'd end up using int4lt, which is correct
 >> to avoid overflow.)
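The literal-typing rule and the up-conversion rule quoted above can be sketched as follows. This is a simplified Python model, not the parser's code: the function names are invented for the example, and the sketch skips the "exact matching operator" check and always up-converts to the higher-ranked operand type.

```python
# Least preferred to most preferred, in the order given in the proposal.
HIERARCHY = ["int2", "int4", "int8", "numeric", "float4", "float8"]
RANK = {name: i for i, name in enumerate(HIERARCHY)}

INT_RANGES = [
    ("int2", -2**15, 2**15 - 1),
    ("int4", -2**31, 2**31 - 1),
    ("int8", -2**63, 2**63 - 1),
]


def literal_type(text):
    """Initial type of an undecorated numeric constant (hypothetical helper)."""
    if any(ch in text for ch in ".eE"):
        return "numeric"  # decimal point or exponent: start out exact
    value = int(text)
    for name, lo, hi in INT_RANGES:
        if lo <= value <= hi:
            return name  # smallest integer type that can represent it
    return "numeric"


def resolve_operand_type(left, right):
    """Type both operands are up-converted to; nothing is ever down-converted."""
    return max(left, right, key=RANK.__getitem__)
```

For instance, `resolve_operand_type("int2", literal_type("42"))` gives `"int2"`, while the same call with `"100000"` gives `"int4"`, matching the two Int2Var cases described in the proposal.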
 >> 
 >> A couple of crucial subtleties here:
 >> 
 >> 1. We are assuming that the parser or optimizer will constant-fold
 >> any conversion functions that are introduced.  Thus, in the
 >> "Float8Var + 4.4" case, the 4.4 is represented as a float8 4.4 by the
 >> time execution begins, so there's no performance loss.
 >> 
 >> 2. We cannot lose precision by initially representing a constant as
 >> numeric and later converting it to float.  Nor can we exceed NUMERIC's
 >> range (the default 1000-digit limit is more than the range of IEEE
 >> float8 data).  It would not work as well to start out by representing
 >> a constant as float and then converting it to numeric.
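The asymmetry in the second point can be checked with Python's `decimal` module standing in for NUMERIC and the built-in `float` standing in for float8. The stand-ins are an assumption of this sketch; the literal is the one used later in the proposal.

```python
from decimal import Decimal

LITERAL = "1.234567890123456789012345E34"

# numeric first: every digit of the literal is kept.
as_numeric = Decimal(LITERAL)

# float first, then numeric: the binary rounding error is carried along.
via_float = Decimal(float(LITERAL))

# numeric first, then float: same result as parsing the literal as float directly.
numeric_then_float = float(as_numeric)
```

Here `via_float` differs from `as_numeric`, while `numeric_then_float` equals `float(LITERAL)`, so starting from the exact representation costs nothing when a float is eventually wanted.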
 >> 
 >> Presently, the pg_proc and pg_operator tables contain a pretty fair
 >> collection of cross-datatype numeric operators, such as int24lt,
 >> float48pl, etc.  We could perhaps leave these in, but I believe that
 >> it is better to remove them.  For example, if int42lt is left in place,
 >> then it would capture cases like "Int4Var < 42", whereas we need that
 >> to be translated to int4lt so that an int4 index can be used.  Removing
 >> these operators will eliminate some code bloat and system-catalog bloat
 >> to boot.
 >> 
 >> As far as I can tell, this proposal is almost compatible with the rules
 >> given in SQL92: in particular, SQL92 specifies that an operator having
 >> both "approximate numeric" (float) and "exact numeric" (int or numeric)
 >> inputs should deliver an approximate-numeric result.  I propose
 >> deviating from SQL92 in a single respect: SQL92 specifies that a
 >> constant containing an exponent (eg 1.2E34) is approximate numeric,
 >> which implies that the result of an operator using it is approximate
 >> even if the other operand is exact.  I believe it's better to treat
 >> such a constant as exact (ie, type NUMERIC) and only convert it to
 >> float if the other operand is float.  Without doing that, an assignment
 >> like
 >> UPDATE tab SET NumericVar = 1.234567890123456789012345E34;
 >> will not work as desired because the constant will be prematurely
 >> coerced to float, causing precision loss.
 >> 
 >> Comments?
 >> 
 >> regards, tom lane
 >> 
 > -- 
 >   Bruce Momjian                        |  http://www.op.net/~candle
 >   pgman@candle.pha.pa.us               |  (610) 853-3000
 >   +  If your life is a hard drive,     |  830 Blythe Avenue
 >   +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
 From tgl@sss.pgh.pa.us Mon Jun 12 14:09:45 2000
 Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
 	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA01993
 	for <pgman@candle.pha.pa.us>; Mon, 12 Jun 2000 13:09:43 -0400 (EDT)
 Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
 	by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id NAA01515;
 	Mon, 12 Jun 2000 13:10:01 -0400 (EDT)
 To: Peter Eisentraut <peter_e@gmx.net>
 cc: Bruce Momjian <pgman@candle.pha.pa.us>,
        "Thomas G. Lockhart" <lockhart@alumni.caltech.edu>,
        PostgreSQL-development <pgsql-hackers@postgresql.org>
 Subject: Re: [HACKERS] Adding time to DATE type 
 In-reply-to: <Pine.LNX.4.21.0006110322150.9195-100000@localhost.localdomain> 
 References: <Pine.LNX.4.21.0006110322150.9195-100000@localhost.localdomain>
 Comments: In-reply-to Peter Eisentraut <peter_e@gmx.net>
 	message dated "Sun, 11 Jun 2000 13:41:24 +0200"
 Date: Mon, 12 Jun 2000 13:10:00 -0400
 Message-ID: <1512.960829800@sss.pgh.pa.us>
 From: Tom Lane <tgl@sss.pgh.pa.us>
 Status: ORr
 Peter Eisentraut <peter_e@gmx.net> writes:
 > Bruce Momjian writes:
 >> Can someone give me a TODO summary for this issue?
 > * make 'text' constants default to text type (not unknown)
 > (I think not everyone's completely convinced on this issue, but I don't
 > recall anyone being firmly opposed to it.)
 It would be a mistake to eliminate the distinction between unknown and
 text.  See for example my just-posted response to John Cochran on
 pgsql-general about why 'BOULEVARD'::text behaves differently from
 'BOULEVARD'::char.  If string literals are immediately assigned type
 text then we will have serious problems with char(n) fields.
 I think it's fine to assign string literals a type of 'unknown'
 initially.  What we need to do is add a phase of type resolution that
 considers treating them as text, but only after the existing logic fails
 to deduce a type.
 (BTW it might be better to treat string literals as defaulting to char(n)
 instead of text, allowing the normal promotion rules to replace char(n)
 with text if necessary.  Not sure if that would make things more or less
 confusing for operations that intermix fixed- and variable-width char
 types.)
 			regards, tom lane
 From pgsql-hackers-owner+M1936@postgresql.org Sun Dec 10 13:17:54 2000
 Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
 	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA20676
 	for <pgman@candle.pha.pa.us>; Sun, 10 Dec 2000 13:17:54 -0500 (EST)
 Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
 	by mail.postgresql.org (8.11.1/8.11.1) with SMTP id eBAIGvZ40566;
 	Sun, 10 Dec 2000 13:16:57 -0500 (EST)
 	(envelope-from pgsql-hackers-owner+M1936@postgresql.org)
 Received: from sss.pgh.pa.us (sss.pgh.pa.us [209.114.132.154])
 	by mail.postgresql.org (8.11.1/8.11.1) with ESMTP id eBAI8HZ39820
 	for <pgsql-hackers@postgreSQL.org>; Sun, 10 Dec 2000 13:08:17 -0500 (EST)
 	(envelope-from tgl@sss.pgh.pa.us)
 Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
 	by sss.pgh.pa.us (8.11.1/8.11.1) with ESMTP id eBAI82o28682;
 	Sun, 10 Dec 2000 13:08:02 -0500 (EST)
 To: Thomas Lockhart <lockhart@alumni.caltech.edu>
 cc: pgsql-hackers@postgresql.org
 Subject: [HACKERS] Unknown-type resolution rules, redux
 Date: Sun, 10 Dec 2000 13:08:02 -0500
 Message-ID: <28679.976471682@sss.pgh.pa.us>
 From: Tom Lane <tgl@sss.pgh.pa.us>
 Precedence: bulk
 Sender: pgsql-hackers-owner@postgresql.org
 Status: OR
 parse_coerce.c contains the following conversation --- I believe the
 first XXX comment is from me and the second from you:
    /*
     * Still too many candidates? Try assigning types for the unknown
     * columns.
     *
     * We do this by examining each unknown argument position to see if all
     * the candidates agree on the type category of that slot.  If so, and
     * if some candidates accept the preferred type in that category,
     * eliminate the candidates with other input types.  If we are down to
     * one candidate at the end, we win.
     *
     * XXX It's kinda bogus to do this left-to-right, isn't it?  If we
     * eliminate some candidates because they are non-preferred at the
     * first slot, we won't notice that they didn't have the same type
     * category for a later slot.
     * XXX Hmm. How else would you do this? These candidates are here because
     * they all have the same number of matches on arguments with explicit
     * types, so from here on left-to-right resolution is as good as any.
     * Need a counterexample to see otherwise...
     */
 The comment is out of date anyway because it fails to mention the new
 rule about preferring STRING category.  But to answer your request for
 a counterexample: consider
 	SELECT foo('bar', 'baz')
 First, suppose the available candidates are
 	foo(float8, int4)
 	foo(float8, point)
 In this case, we examine the first argument position, see that all the
 candidates agree on NUMERIC category, so we consider resolving the first
 unknown input to float8.  That eliminates neither candidate so we move
 on to the second argument position.  Here there is a conflict of
 categories so we can't eliminate anything, and we decide the call is
 ambiguous.  That's correct (or at least Operating As Designed ;-)).
 But now suppose we have
 	foo(float8, int4)
 	foo(float4, point)
 Here, at the first position we will still see that all candidates agree
 on NUMERIC category, and then we will eliminate candidate 2 because it
 isn't the preferred type in that category.  Now when we come to the
 second argument position, there's only one candidate left so there's
 no category conflict.  Result: this call is considered non-ambiguous.
 This means there is a left-to-right bias in the algorithm.  For example,
 the exact same call *would* be considered ambiguous if the candidates'
 argument orders were reversed:
 	foo(int4, float8)
 	foo(point, float4)
 I do not like that.  You could maybe argue that earlier arguments are
 more important than later ones for functions, but it's harder to make
 that case for binary operators --- and in any case this behavior is
 extremely difficult to explain in prose.
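The order dependence can be reproduced with a small model of the left-to-right loop. This is a Python sketch, not the code in parse_coerce.c: it assumes every argument is of unknown type, treats float8 as the only preferred type, reports a category conflict at any slot as ambiguous, and leaves out the STRING rule. The category table is limited to the types used in the example.

```python
CATEGORY = {
    "float8": "NUMERIC",
    "float4": "NUMERIC",
    "int4": "NUMERIC",
    "point": "GEOMETRIC",
}
PREFERRED = {"float8"}  # assumption: only float8 is preferred in this model


def resolve_left_to_right(candidates, nargs):
    """Return the single surviving candidate, or None if the call is ambiguous."""
    remaining = list(candidates)
    for pos in range(nargs):
        categories = {CATEGORY[c[pos]] for c in remaining}
        if len(categories) != 1:
            return None  # category conflict at this slot
        preferred = [c for c in remaining if c[pos] in PREFERRED]
        if preferred:
            # Eliminating here hides conflicts that later slots would have shown.
            remaining = preferred
    return remaining[0] if len(remaining) == 1 else None
```

With candidates `("float8", "int4")` and `("float4", "point")` the function returns the first candidate; with the argument order of both candidates reversed it returns `None`.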
 To fix this, I think we need to split the loop into two passes.
 The first pass does *not* remove any candidates.  What it does is to
 look separately at each UNKNOWN-argument position and attempt to deduce
 a probable category for it, using the following rules:
 * If any candidate has an input type of STRING category, use STRING
 category; else if all candidates agree on the category, use that
 category; else fail because no resolution can be made.
 * The first pass must also remember whether any candidates are of a
 preferred type within the selected category.
 The probable categories and exists-preferred-type booleans are saved in
 local arrays.  (Note this has to be done this way because
 IsPreferredType currently allows more than one type to be considered
 preferred in a category ... so the first pass cannot try to determine a
 unique type, only a category.)
 If we find a category for every UNKNOWN arg, then we enter a second loop
 in which we discard candidates.  In this pass we discard a candidate if
 (a) it is of the wrong category, or (b) it is of the right category but
 is not of preferred type in that category, *and* we found candidate(s)
 of preferred type at this slot.
 If we end with exactly one candidate then we win.
 It is clear in this algorithm that there is no order dependency: the
 conditions for keeping or discarding a candidate are fixed before we
 start the second pass, and do not vary depending on which other
 candidates were discarded before it.
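A minimal sketch of the two-pass scheme described above, in Python rather than the backend's C. It assumes every argument position is of unknown type; the category table and the choice of preferred types are invented for the example.

```python
CATEGORY = {
    "float8": "NUMERIC",
    "float4": "NUMERIC",
    "int4": "NUMERIC",
    "point": "GEOMETRIC",
    "text": "STRING",
}
PREFERRED = {"float8", "text"}  # assumption made for this sketch


def resolve_two_pass(candidates, nargs):
    """Return the single surviving candidate, or None if no resolution is possible."""
    # Pass 1: pick a category per slot and note whether a preferred type exists.
    # No candidate is removed in this pass.
    slot_category = []
    slot_has_preferred = []
    for pos in range(nargs):
        categories = {CATEGORY[c[pos]] for c in candidates}
        if "STRING" in categories:
            chosen = "STRING"
        elif len(categories) == 1:
            chosen = next(iter(categories))
        else:
            return None
        slot_category.append(chosen)
        slot_has_preferred.append(
            any(CATEGORY[c[pos]] == chosen and c[pos] in PREFERRED for c in candidates)
        )

    # Pass 2: the keep/discard conditions are already fixed, so order cannot matter.
    def keep(candidate):
        for pos in range(nargs):
            if CATEGORY[candidate[pos]] != slot_category[pos]:
                return False
            if slot_has_preferred[pos] and candidate[pos] not in PREFERRED:
                return False
        return True

    survivors = [c for c in candidates if keep(c)]
    return survivors[0] if len(survivors) == 1 else None
```

Under this model the `foo(float8, int4)` / `foo(float4, point)` pair is ambiguous in both argument orders, because the category conflict is detected before anything is discarded.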
 Comments?
 			regards, tom lane
 From pgsql-general-owner+M18949=candle.pha.pa.us=pgman@postgresql.org Sat Dec 29 15:47:47 2001
 Return-path: <pgsql-general-owner+M18949=candle.pha.pa.us=pgman@postgresql.org>
 Received: from rs.postgresql.org (server1.pgsql.org [64.39.15.238] (may be forged))
 	by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id fBTKlkT05111
 	for <pgman@candle.pha.pa.us>; Sat, 29 Dec 2001 15:47:46 -0500 (EST)
 Received: from postgresql.org (postgresql.org [64.49.215.8])
 	by rs.postgresql.org (8.11.6/8.11.6) with ESMTP id fBTKhZN74322
 	for <pgman@candle.pha.pa.us>; Sat, 29 Dec 2001 14:43:35 -0600 (CST)
 	(envelope-from pgsql-general-owner+M18949=candle.pha.pa.us=pgman@postgresql.org)
 Received: from candle.pha.pa.us (216-55-132-35.dsl.san-diego.abac.net [216.55.132.35])
 	by postgresql.org (8.11.3/8.11.4) with ESMTP id fBTKaem38452
 	for <pgsql-general@postgresql.org>; Sat, 29 Dec 2001 15:36:40 -0500 (EST)
 	(envelope-from pgman@candle.pha.pa.us)
 Received: (from pgman@localhost)
 	by candle.pha.pa.us (8.11.6/8.10.1) id fBTKaTg04256;
 	Sat, 29 Dec 2001 15:36:29 -0500 (EST)
 From: Bruce Momjian <pgman@candle.pha.pa.us>
 Message-ID: <200112292036.fBTKaTg04256@candle.pha.pa.us>
 Subject: Re: [GENERAL] Casting Varchar to Numeric
 In-Reply-To: <20011206150158.O28880-100000@megazone23.bigpanda.com>
 To: Stephan Szabo <sszabo@megazone23.bigpanda.com>
 Date: Sat, 29 Dec 2001 15:36:29 -0500 (EST)
 cc: Andy Marden <amarden@usa.net>, pgsql-general@postgresql.org
 X-Mailer: ELM [version 2.4ME+ PL96 (25)]
 MIME-Version: 1.0
 Content-Transfer-Encoding: 7bit
 Content-Type: text/plain; charset=US-ASCII
 Precedence: bulk
 Sender: pgsql-general-owner@postgresql.org
 Status: OR
 > On Mon, 3 Dec 2001, Andy Marden wrote:
 > 
 > > Martijn,
 > >
 > > It does work (believe it or not). I've now tried the method you mention
 > > below - that also works and is much nicer. I can't believe that PostgreSQL
 > > can't work this out. Surely implementing an algorithm that understands that
 > > if you can go from a ->b and b->c then you can certainly go from a->c. If
 > 
 > It's more complicated than that (and postgres does some of this but not
 > all), for example the cast text->float8->numeric potentially loses
 > precision and should probably not be an automatic cast for that reason.
 > 
 > > this is viewed as too complex a task for the internals - at least a diagram
 > > or some way of understanding how you should go from a->c would be immensely
 > > helpful wouldn't it! Daunting for anyone picking up the database and trying
 > > to do something simple(!)
 > 
 > There may be a need for documentation on this.  Would you like to write
 > some ;)
 OK, I ran some tests:
 	test=> create table test (x text);
 	CREATE
 	test=> insert into test values ('323');
 	INSERT 5122745 1
 	test=> select cast (x as numeric) from test;
 	ERROR:  Cannot cast type 'text' to 'numeric'
 I can see problems with automatically casting numeric to text because
 you have to guess the desired format, but going from text to numeric
 seems quite easy to do.  Is there a reason we don't do it?
 I can cast to integer and float8 fine:
 	test=> select cast ( x as integer) from test;
 	 ?column? 
 	----------
 	      323
 	(1 row)
 	test=> select cast ( x as float8) from test;
 	 ?column? 
 	----------
 	      323
 	(1 row)
 -- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
 ---------------------------(end of broadcast)---------------------------
 TIP 2: you can get off all lists at once with the unregister command
    (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
 From pgsql-general-owner+M18951=candle.pha.pa.us=pgman@postgresql.org Sat Dec 29 19:10:38 2001
 Return-path: <pgsql-general-owner+M18951=candle.pha.pa.us=pgman@postgresql.org>
 Received: from west.navpoint.com (west.navpoint.com [207.106.42.13])
 	by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id fBU0AbT23972
 	for <pgman@candle.pha.pa.us>; Sat, 29 Dec 2001 19:10:37 -0500 (EST)
 Received: from rs.postgresql.org (server1.pgsql.org [64.39.15.238] (may be forged))
 	by west.navpoint.com (8.11.6/8.10.1) with ESMTP id fBTNVj008959
 	for <pgman@candle.pha.pa.us>; Sat, 29 Dec 2001 18:31:45 -0500 (EST)
 Received: from postgresql.org (postgresql.org [64.49.215.8])
 	by rs.postgresql.org (8.11.6/8.11.6) with ESMTP id fBTNQrN78655
 	for <pgman@candle.pha.pa.us>; Sat, 29 Dec 2001 17:26:53 -0600 (CST)
 	(envelope-from pgsql-general-owner+M18951=candle.pha.pa.us=pgman@postgresql.org)
 Received: from sss.pgh.pa.us ([192.204.191.242])
 	by postgresql.org (8.11.3/8.11.4) with ESMTP id fBTN8Fm47978
 	for <pgsql-general@postgresql.org>; Sat, 29 Dec 2001 18:08:15 -0500 (EST)
 	(envelope-from tgl@sss.pgh.pa.us)
 Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
 	by sss.pgh.pa.us (8.11.4/8.11.4) with ESMTP id fBTN7vg20245;
 	Sat, 29 Dec 2001 18:07:57 -0500 (EST)
 To: Bruce Momjian <pgman@candle.pha.pa.us>
 cc: Stephan Szabo <sszabo@megazone23.bigpanda.com>,
   Andy Marden <amarden@usa.net>, pgsql-general@postgresql.org
 Subject: Re: [GENERAL] Casting Varchar to Numeric 
 In-Reply-To: <200112292036.fBTKaTg04256@candle.pha.pa.us> 
 References: <200112292036.fBTKaTg04256@candle.pha.pa.us>
 Comments: In-reply-to Bruce Momjian <pgman@candle.pha.pa.us>
 	message dated "Sat, 29 Dec 2001 15:36:29 -0500"
 Date: Sat, 29 Dec 2001 18:07:57 -0500
 Message-ID: <20242.1009667277@sss.pgh.pa.us>
 From: Tom Lane <tgl@sss.pgh.pa.us>
 Precedence: bulk
 Sender: pgsql-general-owner@postgresql.org
 Status: OR
 Bruce Momjian <pgman@candle.pha.pa.us> writes:
 > I can see problems with automatically casting numeric to text because
 > you have to guess the desired format, but going from text to numeric
 > seems quite easy to do.  Is there a reason we don't do it?
 I do not think it's a good idea to have implicit casts between text and
 everything under the sun, because that essentially destroys the type
 checking system.  What we need (see previous discussion) is a flag in
 pg_proc that says whether a type conversion function may be invoked
 implicitly or not.  I've got no problem with offering text(numeric) and
 numeric(text) functions that are invoked by explicit function calls or
 casts --- I just don't want the system trying to use them to make
 sense of a bogus query.
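The proposed flag can be pictured as one boolean per conversion function. The sketch below is a Python model, not a catalog layout: the `Cast` record, the `CASTS` table, and the choice of which conversions are implicit are all assumptions made for illustration.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Callable


@dataclass(frozen=True)
class Cast:
    func: Callable
    implicit: bool  # may the system apply this conversion on its own?


# Hypothetical conversion table keyed by (source type, target type).
CASTS = {
    ("int4", "numeric"): Cast(Decimal, implicit=True),
    ("text", "numeric"): Cast(Decimal, implicit=False),
    ("numeric", "text"): Cast(str, implicit=False),
}


def coerce(value, source, target, explicit=False):
    """Convert value, refusing explicit-only casts unless the caller asked for one."""
    cast = CASTS.get((source, target))
    if cast is None:
        raise TypeError(f"cannot cast type '{source}' to '{target}'")
    if not cast.implicit and not explicit:
        raise TypeError(f"cast from '{source}' to '{target}' must be written explicitly")
    return cast.func(value)
```

`coerce("323", "text", "numeric", explicit=True)` succeeds, while the same call without `explicit=True` raises, which is the behavior wanted for a query that never mentioned a cast.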
 > I can cast to integer and float8 fine:
 I don't believe that those should be available as implicit casts either.
 They are, at the moment:
 regression=# select 33 || 44.0;
 ?column?
 ----------
 3344
 (1 row)
 Ugh.
 			regards, tom lane
 ---------------------------(end of broadcast)---------------------------
 TIP 6: Have you searched our list archives?
 http://archives.postgresql.org
--- a/doc/TODO.detail/vacuum
+++ b/doc/TODO.detail/vacuum
--- a/doc/TODO.detail/yacc
+++ b/doc/TODO.detail/yacc
@ -1,402 +0,0 @@
 From selkovjr@mcs.anl.gov Sat Jul 25 05:31:05 1998
 Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
 	by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id FAA16564
 	for <maillist@candle.pha.pa.us>; Sat, 25 Jul 1998 05:31:03 -0400 (EDT)
 Received: from antares.mcs.anl.gov (mcs.anl.gov [140.221.9.6]) by renoir.op.net (o1/$ Revision: 1.18 $) with SMTP id FAA01775 for <maillist@candle.pha.pa.us>; Sat, 25 Jul 1998 05:28:22 -0400 (EDT)
 Received: from mcs.anl.gov (wit.mcs.anl.gov [140.221.5.148]) by antares.mcs.anl.gov (8.6.10/8.6.10)  with ESMTP
 	id EAA28698 for <maillist@candle.pha.pa.us>; Sat, 25 Jul 1998 04:27:05 -0500
 Sender: selkovjr@mcs.anl.gov
 Message-ID: <35B9968D.21CF60A2@mcs.anl.gov>
 Date: Sat, 25 Jul 1998 08:25:49 +0000
 From: "Gene Selkov, Jr." <selkovjr@mcs.anl.gov>
 Organization: MCS, Argonne Natl. Lab
 X-Mailer: Mozilla 4.03 [en] (X11; I; Linux 2.0.32 i586)
 MIME-Version: 1.0
 To: Bruce Momjian <maillist@candle.pha.pa.us>
 Subject: position-aware scanners
 References: <199807250524.BAA07296@candle.pha.pa.us>
 Content-Type: text/plain; charset=us-ascii
 Content-Transfer-Encoding: 7bit
 Status: RO
 Bruce,
 I attached here (through the web links) a couple of examples, totally
 irrelevant to postgres but good enough to discuss token locations. I
 might as well try to patch the backend parser, though not sure how soon.
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 1. 
 The first C parser I wrote,
 http://wit.mcs.anl.gov/~selkovjr/unit-troff.tgz, is not very
 sophisticated, so token locations reported by yyerror() may be slightly
 incorrect (+/- one position, depending on the existence and type of the
 lookahead token). It is a filter used to typeset the units of measurement
 with eqn. To use it, unpack the tar file and run make. The Makefile is
 not too generic but I built it on various systems including linux,
 freebsd and sunos 4.3. The invocation can be something like this:
 ./check 0 parse "l**3/(mmoll*min)"
 parse error, expecting `BASIC_UNIT' or `INTEGER' or `POSITIVE_NUMBER' or
 `'(''
 l**3/(mmoll*min)
      ^^^^^
 Now to the guts. As far as I can imagine, the only way to consistently
 keep track of each character read by the scanner (regardless of the
 length of expressions it will match) is to redefine its YY_INPUT like
 this:
 #undef YY_INPUT
 #define YY_INPUT(buf,result,max_size) \
 { \
 	int c	= (int) buffer[pos++]; \
 	result = (c == '\0') ?	YY_NULL	: (buf[0] = c, 1); \
 }
 Here, buffer is the pointer to the origin of the string being scanned
 and pos is a global variable, similar in usage to a file pointer (you
 can both read and manipulate it at will). The buffer and the pointer are
 initialized by the function 
 void setString(char *s)
 {
   buffer = s;
   pos = 0;
 }
 each time the new string is to be parsed. This (exportable) function is
 part of the interface. 
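The same idea, one character per read with a position counter the caller can inspect, can be modeled outside flex. The class below is a Python stand-in written for illustration; `CharSource` and its method names are not part of the original scanner.

```python
class CharSource:
    """Hypothetical stand-in for the redefined YY_INPUT plus setString()."""

    def __init__(self):
        self.buffer = ""
        self.pos = 0

    def set_string(self, s):
        # Called each time a new string is to be parsed.
        self.buffer = s
        self.pos = 0

    def read(self):
        """Return the next character, or None at end of buffer (the YY_NULL case)."""
        if self.pos >= len(self.buffer):
            return None
        c = self.buffer[self.pos]
        self.pos += 1
        return c
```

Because every character passes through `read()`, `pos` always equals the number of characters consumed so far, whatever the length of the expressions being matched.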
 In this simplistic design, yyerror() is part of the scanner module and
 it uses the pos variable to report the location of unexpected tokens.
 The downside of this arrangement is that, when an error occurs, you
 can't easily tell whether the context is the current or the lookahead
 token; it just reports the position of the last token read (be it $ (end
 of buffer) or something else):
 ./check 0 convert "mol/foo"
 parse error, expecting `BASIC_UNIT' or `INTEGER' or `POSITIVE_NUMBER' or
 `'(''
 mol/foo
       ^^^
 (should be at the beginning of "foo")
 ./check 0 convert "mmol//l"        
 parse error, expecting `BASIC_UNIT' or `INTEGER' or `POSITIVE_NUMBER' or
 `'(''
 mmol//l
    ^
 (should be at the second '/')
 I believe this is why most simple parsers made with yacc would report
 parse errors being "at or near" some token, which is fair enough if the
 expression is not too complex.
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 2. The second version of the same scanner,
 http://wit.mcs.anl.gov/~selkovjr/scanner-example.tgz, addresses this
 problem by recording exact locations of the tokens in each instance of
 the token semantic data structure. The global,
 UNIT_YYSTYPE unit_yylval;
 would be normally used to export the token semantics (including its
 original or modified text and location data) to the parser.
 Unfortunately, I cannot show you the parser part in c, because that's
 about when I stopped writing parsers in c. Instead, I included a small
 test program, test.c, that mimics the parser's expectations for the
 scanner data pretty well. I am assuming here that you are not interested
 in digging through someone else's ugly guts for a relatively small bit of
 information; let me know if I am wrong and I will send you the complete
 perl code (also generated with bison).
 To run this example, unpack the tar file and run make. Then do
  gcc test.c scanner.o
 and run a.out
 Note the line
    yylval = unit_getyylval();
 in test.c. You will not normally need it in a c parser. It is enough to
 define yylval as an external variable and link it to yylval in yylex().
 In the bison-generated parser, yylval gets pushed into a stack (pointed
 to by yylsp) each time a new token is read. For each syntax rule, the
 bison macros @1, @2, ... are just shortcuts to locations in the stack 1,
 2, ... levels deep. In the following code fragment, @3 refers to the
 location info for the third term in the rule (INTEGER):
 (sorry about perl, but I think you can do the same things in c without
 significant changes to your existing parser)
 term:           base    {
                        $$ = $1;
                        $$->{'order'} = 1;
                }
        |       base EXP INTEGER {
                        $$ = $1;
                        $$->{'order'} = @3->{'text'};
                        $$->{'scale'} = $$->{'scale'} ** $$->{'order'};
                        if ( $$->{'order'} == 0 ) {
                        yyerror("Error: expecting a non-zero integer exponent");
                                YYERROR;
                        }
                }
 which translates to:
  ($yyn == 10)    && do {
          $yyval = $yyvsa[-1];
          $yyval->{'order'} = 1;
          last SWITCH;
  };
  ($yyn == 11)    && do {
          $yyval = $yyvsa[-3];
          $yyval->{'order'} = $yylsa[-1]->{'text'};
          $yyval->{'scale'} = $yyval->{'scale'} ** $yyval->{'order'};
          if ( $yyval->{'order'} == 0 ) {
                   yyerror("Error: expecting a non-zero integer exponent");
                   goto yyerrlab1 ;
          }
          last SWITCH;
  };
 In C, you will have a bit more complicated pointer arithmetic to address
 the stack, but the usage of objects will be the same. Note here that it
 is convenient to keep all information about the token in its location
 info, (yylsa, yylsp, yylval, @n), while everything relating to the value
 of the expression, or to the parse tree, is better placed in the
 semantic stack (yyvsa, yyvsp, yyval, $n). Also note that in some cases
 you can do semantic checks inside rules and report useful messages
 before or instead of invoking yyerror();
 Finally, it is useful to make the following wrapper function around
 external yylex() in order to maintain your own token stack. Unlike the
 parser's internal stack which is only as deep as the rule being reduced,
 this one can hold all tokens recognized during the current run, and that
 can be extremely helpful for error reporting and any transformations you
 may need. In this way, you can even scan (tokenize) the whole buffer
 before handing it off to the parser (who knows, you may need a token
 ahead of what is currently seen by the parser):
 sub tokenize {
    undef @tokenTable;
    my ($tok, $text, $name, $unit, $first_line, $first_column, $last_line, $last_column);
    # This is where the c-coded yylex is called;
    # UnitLex is the perl extension encapsulating it.
    while ( ($tok = &UnitLex::yylex()) > 0 ) {
       ( $text, $name, $unit, $first_line, $first_column, $last_line, $last_column ) = &UnitLex::getyylval;
       push(@tokenTable, 
           Unit::yyltype->new (
              'token'         => $tok,
              'text'          => $text,
              'name'          => $name,
              'unit'          => $unit,
              'first_line'    => $first_line,
              'first_column'  => $first_column,
              'last_line'     => $last_line,
              'last_column'   => $last_column,
           )
       )
    }
 }
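For readers who do not use perl, the token table built above can be sketched in Python. The token classes and the regular expression are invented for this example and are much cruder than the real unit scanner; only the idea of storing text and column range with every token is taken from the original.

```python
import re

# Hypothetical token classes for expressions such as "l**3/(mmol*min)".
TOKEN_RE = re.compile(r"(?P<INTEGER>\d+)|(?P<UNIT>[A-Za-z]+)|(?P<OP>\*\*|[*/()])")


def tokenize(text):
    """Scan the whole buffer up front, recording where every token came from."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at column {pos}")
        tokens.append({
            "token": m.lastgroup,         # token class name
            "text": m.group(),            # original text
            "first_column": m.start(),    # zero-based, inclusive
            "last_column": m.end() - 1,   # zero-based, inclusive
        })
        pos = m.end()
    return tokens
```

Since the table holds every token of the run, an error routine can look up the exact columns of the offending token instead of guessing from the scanner's read position.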
 It is now a lot easier to handle various state-related problems, such as
 backtracking and error reporting. The yylex() function as seen by the
 parser might be constructed somewhat like this:
 sub yylex {
    # $tokenNo is a global; now instead of a "file pointer", as in the
    # first example, we have a "token pointer"
    $yylloc = $tokenTable[$tokenNo];
    undef $yylval;
    # disregard this; name this block "computing semantic values"       
    if ( $yylloc->{'token'} == UNIT) {
        $yylval = Unit::Operand->new(
        'unit'  => Unit::Dict::unit($yylloc->{'unit'}),
        'base'  => Unit::Dict::base($yylloc->{'unit'}),
        'scale' => Unit::Dict::scale($yylloc->{'unit'}),
        'scaleToBase' => Unit::Dict::scaleToBase($yylloc->{'unit'}),
        'loc'   => $yylloc,
       );    
    }
    elsif ( ($yylloc->{'token'} == INTEGER ) || ($yylloc->{'token'} == POSITIVE_NUMBER) ) {
        $yylval = Unit::Operand->new(
          'unit' => '1',
          'base' => '1',
          'scale' => 1,
          'scaleToBase' => 1,
          'loc'   => $yylloc,
        );
    }
    $tokenNo++;
    # This is all the parser needs to know about this token.
    # But we already made sure we saved everything we need to know.
    return($yylloc->{'token'});
 }
 Now the most interesting part, the error reporting routine:
 sub yyerror {
    my ($str) = @_;
    my ($message, $start, $end, $loc);
    # This is the same as to say, "obtain the location info for the current token"
    $loc = $tokenTable[$tokenNo-1];
    # You may use this routine for your own purposes or let the parser use it
    if( $str ne 'parse error' ) {
        $message = "$str instead of `" . $loc->{'name'} . "' <" . $loc->{'text'} . ">,  at line " . $loc->{'first_line'} . ":\n\n";
    }
    else {
        $message = "unexpected token `" . $loc->{'name'} . "' <" . $loc->{'text'} . ">,  at line " . $loc->{'first_line'} . ":\n\n";
    }
    # $parseBuffer is the original string that was used to set the parser buffer
    $message .= $parseBuffer . "\n";
    $message .= ( ' ' x ($loc->{'first_column'} + 1) ) . ( '^' x length($loc->{'text'}) ). "\n";
    if( $str ne 'parse error' ) {
        print STDERR "$str instead of `", $loc->{'name'}, "' {", $loc->{'text'}, "},  at line ", $loc->{'first_line'}, ":\n\n";
    }
    else {
        print STDERR "unexpected token `", $loc->{'name'}, "' {", $loc->{'text'}, "},  at line ", $loc->{'first_line'}, ":\n\n";
    }
    print STDERR "$parseBuffer\n";
    print STDERR ' ' x ($loc->{'first_column'} + 1), '^' x length($loc->{'text'}), "\n";
 }
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Scanners used in these examples assume there is a single line of text on
 the input (the first_line and last_line elements of yylloc are simply
 ignored). If you want to be able to parse multi-line buffers, just add a
 lex rule for '\n' that will increment the line count and reset the pos
 variable to zero.
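
The newline rule described above can be sketched as follows. This is only an illustration in Python, not the Perl scanner from these examples: the token pattern and the name `scan` are assumptions, while the `first_line` and `first_column` keys mirror the yylloc fields used above.

```python
import re

# Hypothetical token pattern, for illustration only: a newline, a run of
# blanks, an integer, a word, or any other single character.
TOKEN_RE = re.compile(r"\n|[ \t]+|\d+|\w+|.")

def scan(text):
    """Return one location record per token, tracking line and column."""
    tokens = []
    line, pos = 1, 0  # pos is the column of the next character on this line
    for match in TOKEN_RE.finditer(text):
        lexeme = match.group()
        if lexeme == "\n":
            line += 1  # the '\n' rule: increment the line count ...
            pos = 0    # ... and reset the column position to zero
            continue
        if not lexeme.isspace():
            tokens.append({
                "text": lexeme,
                "first_line": line,
                "first_column": pos,
                "last_line": line,
                "last_column": pos + len(lexeme) - 1,
            })
        pos += len(lexeme)
    return tokens
```

With records like these, an error routine could select the offending line by `first_line` before drawing the caret under `first_column`.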
 Ugly as it may seem, I find this approach extremely liberating. If the
 grammar becomes too complicated for an LALR(1) parser, I can cascade
 multiple parsers. The token table can then be used to reassemble parts
 of the original expression for subordinate parsers, preserving the
 location info all the way down, so that subordinate parsers can report
 their problems consistently. You probably don't need this, as SQL is
 very well thought out and has a parsable grammar. But it may be of some
 help for error reporting.
 --Gene
 From pgsql-patches-owner+M1499@postgresql.org Sat Aug  4 13:11:53 2001
 Return-path: <pgsql-patches-owner+M1499@postgresql.org>
 Received: from postgresql.org (webmail.postgresql.org [216.126.85.28])
 	by candle.pha.pa.us (8.10.1/8.10.1) with ESMTP id f74HBrh11339
 	for <pgman@candle.pha.pa.us>; Sat, 4 Aug 2001 13:11:53 -0400 (EDT)
 Received: from postgresql.org.org (webmail.postgresql.org [216.126.85.28])
 	by postgresql.org (8.11.3/8.11.4) with SMTP id f74H89655183;
 	Sat, 4 Aug 2001 13:08:09 -0400 (EDT)
 	(envelope-from pgsql-patches-owner+M1499@postgresql.org)
 Received: from sss.pgh.pa.us ([192.204.191.242])
 	by postgresql.org (8.11.3/8.11.4) with ESMTP id f74Gxb653074
 	for <pgsql-patches@postgresql.org>; Sat, 4 Aug 2001 12:59:37 -0400 (EDT)
 	(envelope-from tgl@sss.pgh.pa.us)
 Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
 	by sss.pgh.pa.us (8.11.4/8.11.4) with ESMTP id f74GtPC29183;
 	Sat, 4 Aug 2001 12:55:25 -0400 (EDT)
 To: Dave Page <dpage@vale-housing.co.uk>
 cc: "'Fernando Nasser'" <fnasser@cygnus.com>,
   Bruce Momjian <pgman@candle.pha.pa.us>, Neil Padgett <npadgett@redhat.com>,
   pgsql-patches@postgresql.org
 Subject: Re: [PATCHES] Patch for Improved Syntax Error Reporting 
 In-Reply-To: <8568FC767B4AD311AC33006097BCD3D61A2D70@woody.vale-housing.co.uk> 
 References: <8568FC767B4AD311AC33006097BCD3D61A2D70@woody.vale-housing.co.uk>
 Comments: In-reply-to Dave Page <dpage@vale-housing.co.uk>
 	message dated "Sat, 04 Aug 2001 12:37:23 +0100"
 Date: Sat, 04 Aug 2001 12:55:24 -0400
 Message-ID: <29180.996944124@sss.pgh.pa.us>
 From: Tom Lane <tgl@sss.pgh.pa.us>
 Precedence: bulk
 Sender: pgsql-patches-owner@postgresql.org
 Status: OR
 Dave Page <dpage@vale-housing.co.uk> writes:
 > Oh, I quite agree. I'm not averse to updating my code, I just want to avoid
 > users getting misleading messages until I come up with those updates.
 Hmm ... if they were actively misleading then I'd share your concern.
 I guess what you're thinking is that the error offset reported by the
 backend won't correspond directly to what the user typed, and if the
 user tries to use the offset to manually count off characters, he may
 arrive at the wrong place?  Good point.  I'm not sure whether a message
 like
 	ERROR:  parser: parse error at or near 'frum';
 	POSITION: 42
 would be likely to encourage people to try that.  Thoughts?  (I do think
 this is a good argument for not embedding the position straight into the
 main error message though...)
 One possible compromise is to combine the straight character-offset
 approach with a simplistic context display:
 	ERROR:  parser: parse error at or near 'frum';
 	POSITION: 42  ... oid,relname FRUM ...
 The idea is to define the "POSITION" field as an integer offset possibly
 followed by whitespace and noise words.  An updated client would grab
 the offset, ignore the rest of the field, and do the right thing.  A
 not-updated client would display the entire message, and with any luck
 the user would read it correctly.
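
A minimal sketch of what an updated client might do with such a field, in Python purely for illustration. The names `parse_position` and `mark_error` are hypothetical and not part of any PostgreSQL client library, and the sketch assumes a 0-based offset into a single-line query, which the proposal above does not specify.

```python
import re

# Assumed field layout, taken from the proposal above: an integer offset,
# optionally followed by whitespace and free-form context ("noise words").
POSITION_RE = re.compile(r"\s*(\d+)")

def parse_position(field):
    """Return the integer offset from a POSITION field, or None if absent."""
    match = POSITION_RE.match(field)
    return int(match.group(1)) if match else None

def mark_error(query, offset, width=1):
    """Render the query with a caret line under the reported offset.

    Assumes a 0-based offset into a single-line query (an assumption of
    this sketch, not something the proposal defines).
    """
    return query + "\n" + " " * offset + "^" * width
```

A client along these lines would ignore everything after the integer, so the backend could change or drop the context display without breaking it.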
 			regards, tom lane