postgresql/doc/TODO

TODO list for PostgreSQL
========================
Last updated:		Mon Oct 28 15:05:02 EST 2002

Current maintainer:	Bruce Momjian (pgman@candle.pha.pa.us)

The most recent version of this document can be viewed at
the PostgreSQL web site, http://www.PostgreSQL.org.

A dash (-) marks changes that will appear in the upcoming 7.3 release.

Bracketed items "[]" have more detailed.


Urgent
======

* Add replication of distributed databases [replication]
	o automatic failover
	o load balancing
	o master/slave replication
	o multi-master replication
	o partition data across servers
	o sample implementation in contrib/rserv
	o queries across databases or servers (two-phase commit)
	o allow replication over unreliable or non-persistent links
	o http://gborg.postgresql.org/project/pgreplication/projdisplay.php
* Point-in-time data recovery using backup and write-ahead log


Reporting
=========

* Allow elog() to return error codes, module name, file name, line
  number, not just messages (Peter E)
* Add error codes (Peter E)
* Make error messages more consistent [error]
* -Change DEBUG startup tag to LOG (Bruce)
* Show location of syntax error in query [yacc]
* -Add pg_backend_pid() function to backend
* -Allow logging of query durations


Permissions
===========

* -Improve control over user privileges, including table creation
* -Allow user/group names to be specified directly in pg_hba.conf (Bruce)
* -Add ~/.pgpass to store passwords with user/host/password combinations
* -Allow permissions for functions (Peter E)
* -Allow object creation to be disabled for specific users


Administration
==============

* Incremental backups
* -Make it easier to create a database owned by someone who can't createdb,
  perhaps CREATE DATABASE dbname WITH OWNER = "user" (Gavin)
* -Make equals sign optional in CREATE DATABASE WITH param = 'val'
* Remove unreferenced table files and temp tables during database vacuum
  or postmaster startup (Bruce)
* Remove behavior of postmaster -o after making postmaster/postgres
  flags unique
* -Prevent SIGHUP and 'pg_ctl reload' from changing command line
  specified parameters to postgresql.conf defaults (Peter E)
* Allow easy display of usernames in a group
* Allow configuration files to be specified in a different directory
* -Reserve last few process slots for super-user if max_connections reached
* -Add GUC parameter to print queries that generate errors
* Add start time to pg_stat_activity
* Allow limits on per-db/user connections


Data Types
==========

* -Add domain capability (Rod Taylor)
* Add IPv6 capability to INET/CIDR types
* Remove Money type, add money formatting for decimal type
* -SELECT cash_out(2) crashes because of opaque
* -Declare typein/out functions in pg_proc with a special "C string" data type
* -Functions returning sets do not totally work
* Change factorial to return a numeric
* Change NUMERIC data type to use base 10,000 internally
* Change NUMERIC to enforce the maximum precision, and increase it
* Add function to return compressed length of TOAST data values (Tom)
* -Add GUC parameter for DATESTYLE
* Allow INET subnet tests using non-constants
* -Allow bytea to handle LIKE with non-TEXT patterns
* -to_char(0,'FM999.99') returns a period, to_char(1,'FM999.99') doesn't (Karel)
* -Add floor(float8) and other missing functions
* Add now("transaction|statement|clock") functionality

* CONVERSION
	o -Store binary-compatible type information in the system
	o Allow better handling of numeric constants, type conversion
	  [typeconv]

* ARRAYS
	o Allow nulls in arrays
	o Allow arrays to be ORDER'ed
	o -Ensure we have array-eq operators for every built-in array type
	o Support construction of array result values in expressions

* BINARY DATA
	o Improve vacuum of large objects, like /contrib/vacuumlo
	o Add security checking for large objects
	o Make file in/out interface for TOAST columns, similar to large object
	  interface (force out-of-line storage and no compression)
	o Auto-delete large objects when referencing row is deleted


Multi-Language Support
======================

* Add NCHAR (as distinguished from ordinary varchar),
* Allow LOCALE on a per-column basis, default to ASCII
* Support multiple simultaneous character sets, per SQL92
* Improve Unicode combined character handling
* Optimize locale to have minimal performance impact when not used (Peter E)
* Add octet_length_server() and octet_length_client() (Thomas, Tatsuo)
* Make octet_length_client the same as octet_length() (?)
* Prevent mismatch of frontend/backend encodings from converting bytea
  data from being interpreted as encoded strings
* Remove Cyrillic recode support


Views / Rules
=============

* Automatically create rules on views so they are updateable, per SQL92 [view]
* Add the functionality for WITH CHECK OPTION clause of CREATE VIEW
* Allow NOTIFY in rules involving conditionals
* Have views on temporary tables exist in the temporary namespace
* Move psql backslash information into views
* Allow RULE recompilation
* -Remove brackets as multi-statement rule grouping, must use parens (Bruce)
* Prevent aggregates from being used in rule WHERE clauses


Indexes
=======

* Allow CREATE INDEX zman_index ON test (date_trunc( 'day', zman ) datetime_ops)
  fails index can't store constant parameters
* Order duplicate index entries by tid for faster heap lookups
* Allow inherited tables to inherit index, UNIQUE constraint, and primary
  key, foreign key  [inheritance]
* UNIQUE INDEX on base column not honored on inserts from inherited table
  INSERT INTO inherit_table (unique_index_col) VALUES (dup) should fail
  [inheritance]
* -Allow UPDATE/DELETE on inherited table
* Add UNIQUE capability to non-btree indexes
* Add btree index support for reltime, tinterval, regproc
* Add rtree index support for line, lseg, path, point
* Certain indexes will not shrink, e.g. indexes on ever-increasing
  columns and indexes with many duplicate keys
* Use indexes for min() and max() or convert to SELECT col FROM tab ORDER
  BY col DESC LIMIT 1 if appropriate index exists and WHERE clause acceptible
* Allow LIKE indexing optimization for non-ASCII locales
* Use index to restrict rows returned by multi-key index when used with
  non-consecutive keys or OR clauses, so fewer heap accesses
* Be smarter about insertion of already-ordered data into btree index
* -Add deleted bit to index tuples to reduce heap access
* Prevent index uniqueness checks when UPDATE does not modifying column
* Use bitmaps to fetch heap pages in sequential order [performance]
* Use bitmaps to combine existing indexes [performance]
* Improve handling of index scans for NULL
* Allow SELECT * FROM tab WHERE int2col = 4 to use int2col index, int8,
  float4, numeric/decimal too [optimizer]
* Add FILLFACTOR to btree index creation
* Improve concurrency in GIST
* Improve concurrency of hash indexes (Neil)
* -Test hash index performance and discourage usage


Commands
========

* -Add SIMILAR TO to allow character classes, 'pg_[a-c]%'
* Add BETWEEN ASYMMETRIC/SYMMETRIC (Christopher)
* -Remove LIMIT #,# and force use LIMIT and OFFSET clauses in 7.3 (Bruce)
* Allow LIMIT/OFFSET to use expressions
* -Disallow TRUNCATE on tables that are involved in referential constraints
* -Add OR REPLACE clauses to non-FUNCTION object creation
* CREATE TABLE AS can not determine column lengths from expressions [atttypmod]
* Allow UPDATE to handle complex aggregates [update]
* -Prevent create/drop scripts from allowing extra args (Bruce)
* Allow command blocks to ignore certain types of errors
* Allow backslash handling in quoted strings to be disabled for portability
* Return proper effected tuple count from complex commands [return]
* Allow DELETE to handle table aliases for self-joins [delete]
* Add CORRESPONDING BY to UNION/INTERSECT/EXCEPT
* Allow REINDEX to rebuild all indexes, remove /contrib/reindex
* Make a transaction-safe TRUNCATE

* ALTER
	o ALTER TABLE ADD COLUMN does not honor DEFAULT and non-CHECK CONSTRAINT
	o -Add ALTER TABLE DROP COLUMN feature
	o -Add ALTER TABLE DROP non-CHECK CONSTRAINT
	o -ALTER TABLE ADD PRIMARY KEY (Tom)
	o -ALTER TABLE ADD UNIQUE (Tom)
	o -ALTER TABLE ALTER COLUMN SET/DROP NOT NULL (Christopher)
	o ALTER TABLE ADD COLUMN column DEFAULT should fill existing
	  rows with DEFAULT value
	o ALTER TABLE ADD COLUMN column SERIAL doesn't create sequence because
          of the item above
	o -Have ALTER TABLE OWNER change all dependant objects like indexes
	o Add ALTER TABLE tab SET WITHOUT OIDS

* CLUSTER
	o -Cluster all tables at once using pg_index.indisclustered set during
          previous CLUSTER
	o -Prevent loss of indexes, permissions, inheritance
	o Automatically maintain clustering on a table
	o Allow CLUSTER to cluster all tables, remove clusterdb

* COPY
	o -Allow specification of column names
	o Allow psql \copy to specify column names
	o Allow dump/load of CSV format
	o -Change syntax to WITH DELIMITER, (keep old syntax around?)
	o Allow COPY to report error lines and continue;  optionally
	  allow error codes to be specified; requires savepoints or can
	  not be run in a multi-statement transaction
	o -Generate failure on short COPY lines rather than pad NULLs
	o Allow copy to understand \x as hex

* CURSOR
	o Allow BINARY option to SELECT, just like DECLARE
	o MOVE 0 should not move to end of cursor
	o Allow UPDATE/DELETE WHERE CURRENT OF cursor using per-cursor tid
	  stored in the backend
	o Prevent DROP of table being referenced by our own open cursor
	o Allow cursors outside transactions [cursor]

* INSERT
	o Allow INSERT/UPDATE of system-generated oid value for a row
	o Allow INSERT INTO tab (col1, ..) VALUES (val1, ..), (val2, ..)
	o -Allow INSERT INTO my_table VALUES (a, b, c, DEFAULT, x, y, z, ...)
	o -Disallow missing columns in INSERT ... (col) VALUES, per ANSI
	o Allow INSERT/UPDATE ... RETURNING new.col or old.col; handle
	  RULE cases (Philip)

* SHOW/SET
	o -Add command to display locks
	o -Add SET or BEGIN timeout parameter to cancel query
	o Add SET REAL_FORMAT and SET DOUBLE_PRECISION_FORMAT using printf args
	o -Remove SET KSQO option now that OR processing is improved (Bruce)
	o Add SET PERFORMANCE_TIPS option to suggest INDEX, VACUUM, VACUUM
	  ANALYZE, and CLUSTER
	o -Add SHOW command to see locale
	o -Allow SHOW to output as a query result, like EXPLAIN
	o -Abort all SET changes made in an aborted transaction
	o Add SET SCHEMA
	o Allow EXPLAIN EXECUTE to see prepared plans
	o Allow SHOW of non-modifiable variables, like pg_controldata
	o Add GUC parameter to control the maximum number of rewrite cycles

* SERVER-SIDE LANGUAGES
	o Allow PL/PgSQL's RAISE function to take expressions
	o -Fix PL/PgSQL to handle quoted mixed-case identifiers
	o Change PL/PgSQL to use palloc() instead of malloc()
	o Add untrusted version of plpython
	o Allow Java server-side programming, http://pljava.sourceforge.net
	  [java]
	o Fix problems with complex temporary table creation/destruction
	  without using PL/PgSQL EXECUTE, needs cache prevention/invalidation
        o Fix PL/pgSQL RENAME to work on variables other than OLD/NEW
	o Improve PL/PgSQL exception handling
	o Allow parameters to be specified by name and type during
	  definition
	o Allow function parameters to be passed by name,
	  get_employee_salary(emp_id => 12345, tax_year => 2001)
	o Add PL/PgSQL packages
	o Allow array declarations and other data types in PL/PgSQL DECLARE
	o Add PL/PgSQL PROCEDURES that can return multiple values
	o Add table function support to pltcl, plperl, plpython
	o Make PL/PgSQL %TYPE schema-aware


Clients
=======

* -Have pg_dump use LEFT OUTER JOIN in multi-table SELECTs
  or multiple SELECTS to avoid bad system catalog entries
* -Have pg_dump -C dump database location and encoding information
* -Allow psql \d to show foreign keys
* -Allow psql \d to show temporary table structure (Tom)
* Allow psql to show transaction status if backend protocol changes made
* Add XML interface:  psql, pg_dump, COPY, separate server (?)
* -Have pg_dump use ADD PRIMARY KEY after COPY, for performance (Neil)
* Add schema, cast, and conversion backslash commands to psql
* Allow pg_dump to dump a specific schema

* JDBC
	o Comprehensive test suite. This may be available already.
	o -Updateable resultSet
	o JDBC-standard BLOB support
	o Error Codes (pending backend implementation)
	o Support both 'make' and 'ant'
	o Fix LargeObject API to handle OIDs as unsigned ints
	o -Implement cancel() method on Statement
	o Use cursors implicitly to avoid large results (see setCursorName())
        o -Add support for CallableStatements
	o Add LISTEN/NOTIFY support to the JDBC driver (Barry)
	o -Compile under jdk 1.4

* ECPG
	o Implement set descriptor, using descriptor
	o Make casts work in variable initializations
	o Implement SQLDA
	o Allow multi-threaded use of SQLCA
	o Solve cardinality > 1 for input descriptors / variables
	o Understand structure definitions outside a declare section
	o sqlwarn[6] should be 'W' if the PRECISION or SCALE value specified
	o Improve error handling
	o Allow :var[:index] or :var[<integer>] as cvariable for an array var
	o Add a semantic check level, e.g. check if a table really exists
	o Fix nested C comments
	o Add SQLSTATE
	o fix handling of DB attributes that are arrays


Referential Integrity
=====================

* Add MATCH PARTIAL referential integrity [foreign]
* Add deferred trigger queue file (Jan)
* -Allow oid to act as a foreign key
* Implement dirty reads and use them in RI triggers
* Enforce referential integrity for system tables
* -Allow user to control trigger firing order (Tom)
* -Add ALTER TRIGGER ... RENAME
* Change foreign key constraint for array -> element to mean element
  in array
* -Fix foreign key constraints to not error on intermediate db states (Stephan)
* Allow DEFERRABLE UNIQUE constraints
* Allow triggers to be disabled [trigger]


Dependency Checking
===================

* -Add pg_depend table for dependency recording; use sysrelid, oid,
  depend_sysrelid, depend_oid, name
* -Auto-destroy sequence on DROP of table with SERIAL; perhaps a separate
  SERIAL type
* -Prevent column dropping if column is used by foreign key
* -Propagate column or table renaming to foreign key constraints
* -Automatically drop constraints/functions when object is dropped
* -Make foreign key constraints clearer in dump file
* -Make other constraints clearer in dump file
* -Make foreign keys easier to identify
* Flush cached query plans when their underlying catalog data changes
* Use dependency information to dump data in proper order


Transactions
============

* -Allow autocommit so always in a transaction block
* Overhaul bufmgr/lockmgr/transaction manager
* Allow savepoints / nested transactions [transactions]


Exotic Features
===============

* Add sql3 recursive unions
* Add the concept of dataspaces/tablespaces [tablespaces]
* -Add SQL92 schemas (Tom)
* Allow queries across multiple databases [crossdb]
* Add pre-parsing phase that converts non-ANSI features to supported features
* Allow plug-in modules to emulate features from other databases
* SQL*Net listener that makes PostgreSQL appear as an Oracle database
  to clients


PERFORMANCE
===========


Fsync
=====

* Delay fsync() when other backends are about to commit too [fsync]
	o Determine optimal commit_delay value
* Determine optimal fdatasync/fsync, O_SYNC/O_DSYNC options
	o Allow multiple blocks to be written to WAL with one write()


Cache
=====
* -Cache most recent query plan(s) (Neil) [prepare]
* Shared catalog cache, reduce lseek()'s by caching table size in shared area
* Add free-behind capability for large sequential scans (Bruce)
* Allow binding query args over FE/BE protocol
* Consider use of open/fcntl(O_DIRECT) to minimize OS caching
* Make blind writes go through the file descriptor cache


Vacuum
======

* Improve speed with indexes (perhaps recreate index instead) [vacuum]
* Reduce lock time by moving tuples with read lock, then write
  lock and truncate table [vacuum]
* Provide automatic running of vacuum in the background (Tom) [vacuum]
* Allow free space map to be auto-sized or warn when it is too small


Locking
=======

* Make locking of shared data structures more fine-grained
* Add code to detect an SMP machine and handle spinlocks accordingly
  from distributted.net, http://www1.distributed.net/source,
  in client/common/cpucheck.cpp
* Research use of sched_yield() for spinlock acquisition failure


Startup Time
============

* Experiment with multi-threaded backend [thread]
* Add connection pooling [pool]
* Allow persistent backends [persistent]
* Create a transaction processor to aid in persistent connections and
  connection pooling
* Do listen() in postmaster and accept() in pre-forked backend
* Have pre-forked backend pre-connect to last requested database or pass
  file descriptor to backend pre-forked for matching database
* -Cache system catalog information in per-database files (Tom)


Write-Ahead Log
===============

* Have after-change WAL write()'s write only modified data to kernel
* Reduce number of after-change WAL writes; they exist only to gaurd against
  partial page writes [wal]
* Turn off after-change writes if fsync is disabled (?)
* Add WAL index reliability improvement to non-btree indexes
* -Reorder postgresql.conf WAL items in order of importance (Bruce)
* -Remove wal_files postgresql.conf option because WAL files are now recycled
* Find proper defaults for postgresql.conf WAL entries
* Add checkpoint_min_warning postgresql.conf option to warn about checkpoints
  that are too frequent
* Allow xlog directory location to be specified during initdb, perhaps
  using symlinks
* Allow pg_xlog to be moved without symlinks


Optimizer / Executor
====================

* Improve Subplan list handling
* Allow Subplans to use efficient joins(hash, merge) with upper variable
* -Improve dynamic memory allocation by introducing tuple-context memory
  allocation (Tom)
* Add hash for evaluating GROUP BY aggregates
* -Nested FULL OUTER JOINs don't work (Tom)
* Allow merge and hash joins on expressions not just simple variables (Tom)
* -Add new pg_proc cachable settings to specify whether function can be
  evaluated only once or once per query
* -Change FIXED_CHAR_SEL to 0.20 from 0.04 to give better selectivity (Bruce)
* Make IN/NOT IN have similar performance to EXISTS/NOT EXISTS [exists]
* Missing optimizer selectivities for date, r-tree, etc. [optimizer]
* Allow ORDER BY ... LIMIT to select top values without sort or index
  using a sequential scan for highest/lowest values (Oleg)
* Inline simple SQL functions to avoid overhead (Tom)
* Precompile SQL functions to avoid overhead (Neil)
* Add utility to compute accurate random_page_cost value
* Improve ability to display optimizer analysis using OPTIMIZER_DEBUG
* Use CHECK constraints to improve optimizer decisions
* Check GUC geqo_threshold to see if it is still accurate
* Allow sorting, temp files, temp tables to use multiple work directories


Miscellaneous
=============

* Do async I/O for faster random read-ahead of data
* Get faster regex() code from Henry Spencer <henry@zoo.utoronto.ca>
  when it is available
* Use mmap() rather than SYSV shared memory or to write WAL files (?) [mmap]
* Improve caching of attribute offsets when NULLs exist in the row
* -Add Intimate Shared Memory(ISM) for Solaris
* -Use faster flex flags for performance improvement (Peter E)
* -Add BSD-licensed qsort() for Solaris


Source Code
===========

* Add use of 'const' for variables in source tree
* -Fix problems with libpq non-blocking/async code
* -Make sure all block numbers are unsigned to increase maximum table size
* -Merge LockMethodCtl and LockMethodTable into one shared structure (Bruce)
* -HOLDER/HOLDERTAB rename to PROCLOCK/PROCLOCKTAG (Bruce)
* -Remove LockMethodTable.prio field, not used (Bruce)
* Rename some /contrib modules from pg* to pg_*
* Move some things from /contrib into main tree
* Remove warnings created by -Wcast-align
* Move platform-specific ps status display info from ps_status.c to ports
* -Make one version of simple_prompt() in code (Bruce, Tom)
* -Compile in syslog functionaility by default (Tatsuo)
* Modify regression tests to prevent failures do to minor numeric rounding
* Add OpenBSD's getpeereid() call for local socket authentication (Bruce)
* Improve access-permissions check on data directory in Cygwin (Tom)
* -Report failure to find readline or zlib at end of configure run
* Add --port flag to regression tests
* -Increase identifier length (NAMEDATALEN) if small performance hit,
* -Increase maximum number of function parameters if little wasted space
* Add documentation for perl, including mention of DBI/DBD perl location
* Add optional CRC checksum to heap and index pages
* Change representation of whole-tuple parameters to functions
* Clarify use of 'application' and 'command' tags in SGML docs
* Better document ability to build only certain interfaces (Marc)
* Remove or relicense modules that are not under the BSD license, if possible
* Remove memory/file descriptor freeing befor elog(ERROR)  (Bruce)
* Create native Win32 port [win32]
* -Fix glibc's mktime() to handle pre-1970's dates
* -Move /contrib/retep to gborg.postgresql.org

---------------------------------------------------------------------------


Developers who have claimed items are:
--------------------------------------
* Barry is Barry Lind <barry@xythos.com>
* Billy is Billy G. Allie <Bill.Allie@mug.org>
* Bruce is Bruce Momjian <pgman@candle.pha.pa.us>
* Christopher is Christopher Kings-Lynne <chriskl@familyhealth.com.au>
* D'Arcy is D'Arcy J.M. Cain <darcy@druid.net>
* Dave is Dave Cramer <dave@fastcrypt.com>
* Edmund is Edmund Mergl <E.Mergl@bawue.de>
* Gavin Sherry <swm@linuxworld.com.au>
* Hiroshi is Hiroshi Inoue <Inoue@tpf.co.jp>
* Karel is Karel Zak <zakkr@zf.jcu.cz>
* Jan is Jan Wieck <wieck@sapserv.debis.de>
* Liam is Liam Stewart <liams@redhat.com>
* Marc is Marc Fournier <scrappy@hub.org>
* Mark is Mark Hollomon <mhh@mindspring.com>
* Marko is Marko Kreen <marko@l-t.ee>
* Michael is Michael Meskes <meskes@postgresql.org>
* Neil is Neil Conway <neilc@samurai.com>
* Oleg is Oleg Bartunov <oleg@sai.msu.su>
* Peter M is Peter T Mount <peter@retep.org.uk>
* Peter E is Peter Eisentraut <peter_e@gmx.net>
* Philip is Philip Warner <pjw@rhyme.com.au>
* Rod is Rod Taylor <rbt@zort.ca>
* Ross is Ross J. Reedstrom <reedstrm@wallace.ece.rice.edu>
* Ryan is Ryan Bradetich <rbrad@hpb50023.boi.hp.com>
* Stephan is Stephan Szabo <sszabo@megazone23.bigpanda.com>
* Tatsuo is Tatsuo Ishii <t-ishii@sra.co.jp>
* Thomas is Thomas Lockhart <lockhart@fourpalms.org>
* Tom is Tom Lane <tgl@sss.pgh.pa.us>
* TomH is Tom I Helbekkmo <tih@Hamartun.Priv.no>
* Vadim is Vadim B. Mikheev <vadim4o@email.com>