Last updated: Wed Jun 11 10:44:40 EDT 1997
Version: 6.1
Current maintainer: Bruce Momjian (maillist@candle.pha.pa.us)
The most recent version of this document can be viewed at the PostgreSQL Web site, http://postgreSQL.org.
Linux-specific questions are answered in http://postgreSQL.org/docs/FAQ-Linux.phtml.
Irix-specific questions are answered in http://postgreSQL.org/docs/FAQ-Irix.phtml.
Changes in this version (* = modified, + = new):
PostgreSQL is an enhancement of the POSTGRES database management system, a next-generation DBMS research prototype. While PostgreSQL retains the powerful data model and rich data types of POSTGRES, it replaces the PostQuel query language with an extended subset of SQL. PostgreSQL is free and the complete source is available.
PostgreSQL development is being performed by a team of Internet developers who all subscribe to the PostgreSQL development mailing list. The current coordinator is Marc G. Fournier (scrappy@postgreSQL.org). (See below on how to join). This team is now responsible for all current and future development of PostgreSQL.
The authors of PostgreSQL 1.01 were Andrew Yu and Jolly Chen. Many others have contributed to the porting, testing, debugging and enhancement of the code. The original Postgres code, from which PostgreSQL is derived, was the effort of many graduate students, undergraduate students, and staff programmers working under the direction of Professor Michael Stonebraker at the University of California, Berkeley.
The original name of the software at Berkeley was Postgres. When SQL functionality was added in 1995, its name was changed to Postgres95. The name was changed at the end of 1996 to PostgreSQL.
The authors have compiled and tested PostgreSQL on the following platforms (some of these compiles require gcc 2.7.0):
The primary anonymous ftp site for PostgreSQL is:
A mirror site exists at:
PostgreSQL is subject to the following COPYRIGHT.
PostgreSQL Data Base Management System
Copyright (c) 1994-6 Regents of the University of California
Permission to use, copy, modify, and distribute this software and its documentation for any purpose, without fee, and without a written agreement is hereby granted, provided that the above copyright notice and this paragraph and the following two paragraphs appear in all copies.
IN NO EVENT SHALL THE UNIVERSITY OF CALIFORNIA BE LIABLE TO ANY PARTY FOR DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES, INCLUDING LOST PROFITS, ARISING OUT OF THE USE OF THIS SOFTWARE AND ITS DOCUMENTATION, EVEN IF THE UNIVERSITY OF CALIFORNIA HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
THE UNIVERSITY OF CALIFORNIA SPECIFICALLY DISCLAIMS ANY WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE SOFTWARE PROVIDED HEREUNDER IS ON AN "AS IS" BASIS, AND THE UNIVERSITY OF CALIFORNIA HAS NO OBLIGATIONS TO PROVIDE MAINTENANCE, SUPPORT, UPDATES, ENHANCEMENTS, OR MODIFICATIONS.
There is no official support for PostgreSQL from the original maintainers or from the University of California, Berkeley. It is maintained through volunteer effort only.
The main mailing list is: questions@postgreSQL.org. It is available for discussion of matters pertaining to PostgreSQL, including but not limited to bug reports and fixes. For information on how to subscribe, send mail with the following lines in the body (not the subject line)
subscribe
end
to questions-request@postgreSQL.org.
There is also a digest list available. To subscribe to this list, send email to: questions-digest-request@postgreSQL.org with a BODY of:
subscribe
end
Digests are sent out to members of this list whenever the main list has received around 30k of messages.
There is a bugs mailing list available. To subscribe to this list, send email to bugs-request@postgreSQL.org with a BODY of:
subscribe
end
There is also a developers discussion mailing list available. To subscribe to this list, send email to hackers-request@postgreSQL.org with a BODY of:
subscribe
end
Additional information about PostgreSQL can be found via the PostgreSQL WWW home page at:
http://postgreSQL.org
The latest release of PostgreSQL is version 6.0, which was released on January 31, 1997. 6.1 is scheduled for release soon. For information about what is new in 6.1, see our TODO list on our WWW page.
We expect a 7.0 release in several months that will remove time-travel and reduce by 50% the size of on-disk system columns maintained for each row in a table. This release will also require a dump and restore.
Illustra Information Technology (a wholly owned subsidiary of Informix Software, Inc.) sells an object-relational DBMS called Illustra that was originally based on Postgres. Illustra has cosmetic similarities to PostgreSQL but has more features, is more robust, performs better, and offers real documentation and support. On the flip side, it costs money. For more information, contact sales@illustra.com.
A user manual, manual pages, and some small test examples are included in the distribution. The sql and built-in manual pages are particularly important.
The WWW page contains pointers to an implementation guide and five papers written about Postgres design concepts and features.
PostgreSQL supports a subset of SQL-92. It has most of the important constructs but lacks some of the functionality. The most visible differences are:
On the other hand, you get to create user-defined types, functions, inheritance etc. If you're willing to help with PostgreSQL coding, eventually we can also add the missing features listed above.
PostgreSQL v1.09 is compatible with databases created with v1.01. Those upgrading from 1.0 should read the directions in the MIGRATION_1.0_TO_1.02 directory.
Upgrading to 6.0 requires a dump and restore from previous releases.
Upgrading to 6.1 requires a dump and restore from previous releases.
Those upgrading from versions earlier than 1.09 must upgrade to 1.09 first without a dump/reload, then dump the data from 1.09, and then load it into 6.0 or 6.1.
Since we don't have any licensing or registration scheme, it's impossible to tell. We do know that hundreds of copies of PostgreSQL v1.* have been downloaded, and that there are many hundreds of subscribers to the mailing lists.
You probably do not have the right path set up. The 'postgres' executable needs to be in your path.
Check your locale configuration. PostgreSQL uses the locale settings of the user that ran the postmaster process. Set those accordingly for your operating environment.
You need to edit Makefile.global and change POSTGRESDIR accordingly, or create a Makefile.custom and define POSTGRESDIR there.
It could be a variety of problems, but first check to see that you have System V extensions installed in your kernel. PostgreSQL requires kernel support for shared memory.
Either shared memory is not configured properly in your kernel, or you need to enlarge the shared memory available in the kernel. The exact amount you need depends on your architecture and how many buffers you configure the postmaster to run with. For most systems, with default buffer sizes, you need a minimum of ~760K.
The Makefiles do not have the proper dependencies for include files. You have to do a 'make clean' and then another 'make'.
Column constraints are not supported in PostgreSQL. As a consequence, the system does not check for duplicates.
Under 6.0, create a unique index on the column. Attempts to insert a duplicate value into that column will report an error.
Subqueries are not implemented, but they can be simulated using SQL functions.
PostgreSQL 6.0 supports unique indices.
Currently, the rule system in PostgreSQL is mostly broken. It works enough to support the view mechanism, but that's about it. Use PostgreSQL rules at your own peril.
The Inversion large object system in PostgreSQL is also mostly broken. It works well enough for storing large wads of data and reading them back out, but the implementation has some underlying problems. Use PostgreSQL large objects at your own peril.
No. No. No. Not in the official distribution at least. Some users have reported some success at using 'pgbrowse' and 'onyx' as frontends to PostgreSQL. Several contributors are working on Tk-based frontend tools. Ask on the mailing list.
PostgreSQL supports a C-callable library interface called libpq as well as a Tcl-based library interface called libtcl.
Others have contributed a perl interface and a WWW gateway to PostgreSQL. See the PostgreSQL home pages for more details.
Use host-based authentication by modifying the file $PGDATA/pg_hba accordingly.
Currently, there is no easy interface to set up user groups. You have to explicitly insert/update the pg_group table. For example:
jolly=> insert into pg_group (groname, grosysid, grolist)
jolly=> values ('posthackers', '1234', '{5443, 8261}');
INSERT 548224
jolly=> grant insert on foo to group posthackers;
CHANGE
jolly=>
The fields in pg_group are:
Normal cursors return data in ASCII format. Since data is stored natively in binary format, the system must do a conversion to produce the ASCII format. In addition, ASCII formats are often larger than binary formats. Once the attributes come back in ASCII, the client application often has to convert them back to a binary format to manipulate them anyway.
Binary cursors give you back the data in the native binary representation. Thus, binary cursors will tend to be a little faster since there's less overhead of conversion.
However, ASCII is architecture-neutral, whereas binary representation can differ between machine architectures. Thus, if your client machine uses a different representation than your server machine, getting back attributes in binary format is probably not what you want. Also, if your main purpose is displaying the data in ASCII, then getting it back in ASCII will save you some effort on the client side.
SQL specifies <> as the inequality operator, and that is what we have defined for the built-in types.
In 6.0, != is equivalent to <>.
An R-tree index is used for indexing spatial data. A hash index can't handle range searches. A B-tree index only handles range searches in a single dimension. R-trees can handle multi-dimensional data. For example, if an R-tree index is built on an attribute of type 'point', the system can more efficiently answer queries such as selecting all points within a bounding rectangle.
The canonical paper that describes the original R-Tree design is:
Guttman, A. "R-Trees: A Dynamic Index Structure for Spatial Searching." Proc of the 1984 ACM SIGMOD Int'l Conf on Mgmt of Data, 45-57.
You can also find this paper in Stonebraker's "Readings in Database Systems"
Tuples are limited to 8K bytes. Taking into account system attributes and other overhead, one should stay well shy of 8,000 bytes to be on the safe side. To use attributes larger than 8K, try using the large objects interface.
Tuples do not cross 8k boundaries so a 5k tuple will require 8k of storage.
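The arithmetic behind the 5k example can be sketched as follows. This is a rough Python illustration, not PostgreSQL code: the helper name tuples_per_page is made up for this note, and it ignores page-header and per-tuple overhead, so real capacities are somewhat lower.

```python
PAGE_SIZE = 8 * 1024  # 8K block size, as stated above


def tuples_per_page(tuple_bytes, page_size=PAGE_SIZE):
    """Return (tuples that fit in one page, bytes left unused).

    Hypothetical helper: ignores page header and per-tuple overhead.
    """
    if tuple_bytes <= 0 or tuple_bytes > page_size:
        raise ValueError("tuple must be between 1 byte and one page")
    count = page_size // tuple_bytes
    return count, page_size - count * tuple_bytes


# A 5K tuple occupies a whole page on its own; two 4K tuples share one.
for size in (5 * 1024, 4 * 1024, 100):
    count, unused = tuples_per_page(size)
    print(size, count, unused)
```

Under these assumptions, a table of 5K tuples wastes roughly 3K per page, while 4K tuples pack two to a page with nothing left over.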
PostgreSQL does not automatically maintain statistics. One has to make an explicit 'vacuum' call to update the statistics. After statistics are updated, the optimizer has a better shot at using indices. Note that the optimizer is limited and does not use indices in some circumstances (such as OR clauses).
If the system still does not see the index, it is probably because you have created an index on a field with the improper *_ops type. For example, you have created a CHAR(4) field, but have specified a char_ops index type_class.
See the create_index manual page for information on what type classes are available. It must match the field type.
Postgres does not warn the user when the improper index is created.
Indexes are not used for ORDER BY operations.
There are two ODBC drivers available, PostODBC and OpenLink ODBC.
For those interested in PostODBC, there are now two mailing lists devoted to its discussion. The mailing lists are:
These lists are ordinary Majordomo mailing lists. You can subscribe by sending mail to:
OpenLink ODBC is currently in beta under Linux. You can get it from http://www.openlinksw.com/postgres.html. It works with our standard ODBC client software so you'll have Postgres ODBC available on every client platform we support (Win, Mac, Unix, VMS).
We will probably be selling this product to people who need commercial-quality support, but a freeware version will always be available. Questions to postgres95@openlink.co.uk.
Built-in R-trees can handle polygons and boxes. In theory, R-trees can be extended to handle a higher number of dimensions. In practice, extending R-trees requires a bit of work and we don't currently have any documentation on how to do it.
PostgreSQL supports the SQL LIKE syntax as well as more general regular expression searching with the ~ operator. The !~ is the negated regexp operator. ~* and !~* are the case-insensitive regular expression operators.
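The difference between LIKE and the four regular expression operators can be illustrated outside the database. The following Python sketch only mimics their semantics: the function names are invented for this note, Python's re syntax is not identical to the regular expression syntax used by the backend, and LIKE escape characters are not handled.

```python
import re


def like(value, pattern):
    """SQL LIKE: % matches any run of characters, _ matches exactly one.

    The whole value must match, unlike the regexp operators.
    """
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.fullmatch("".join(parts), value, re.DOTALL) is not None


def regex_match(value, pattern, ignore_case=False):
    """The ~ operator (or ~* when ignore_case is true): match anywhere in value."""
    flags = re.IGNORECASE if ignore_case else 0
    return re.search(pattern, value, flags) is not None


name = "Stonebraker"
print(like(name, "Stone%"))                           # name LIKE 'Stone%'
print(regex_match(name, "^stone"))                    # name ~ '^stone'
print(regex_match(name, "^stone", ignore_case=True))  # name ~* '^stone'
print(not regex_match(name, "^stone"))                # name !~ '^stone'
```

Note that LIKE must match the entire value, while the regexp operators succeed if the pattern matches anywhere unless it is anchored with ^ or $.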
You should not create database users with user id 0 (root). They will be unable to access the database. This is a security precaution because of the ability of any user to dynamically link object modules into the database engine.
If the server crashes during a vacuum command, chances are it will leave a lock file hanging around. Attempts to re-run the vacuum command result in:
WARN:can't create lock file -- another vacuum cleaner running?
If you are sure that no vacuum is actually running, you can remove the file called "pg_vlock" in your database directory (which is $PGDATA/base/<dbName>).
Type            Internal Name   Notes
--------------------------------------------------
CHAR            char            1 character   }
CHAR2           char2           2 characters  }
CHAR4           char4           4 characters  } optimized for a fixed length
CHAR8           char8           8 characters  }
CHAR16          char16          16 characters }
CHAR(#)         bpchar          blank padded to the specified fixed length
VARCHAR(#)      varchar         size specifies maximum length, no padding
TEXT            text            length limited only by maximum tuple length
BYTEA           bytea           variable-length array of bytes
Remember, you need to use the internal name when creating indexes on these fields or when doing other internal operations.
The last four types above are "varlena" types (i.e. the first four bytes are the length, followed by the data). CHAR(#) and VARCHAR(#) allocate the maximum number of bytes no matter how much data is stored in the field. TEXT and BYTEA are the only character types that have variable length on the disk.
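The "length word followed by data" layout can be sketched in Python as below. This is an illustration of the idea only, not the backend's code: the function names are invented, native byte order is assumed, and the sketch assumes the length word counts its own four bytes as well as the data.

```python
import struct

# 4-byte length word in native byte order (an assumption of this sketch)
HEADER = struct.Struct("=i")


def pack_varlena(data: bytes) -> bytes:
    """Prefix data with a length word that counts header plus data."""
    return HEADER.pack(HEADER.size + len(data)) + data


def unpack_varlena(buf: bytes) -> bytes:
    """Read the length word, then return just the data bytes."""
    (total,) = HEADER.unpack_from(buf)
    return buf[HEADER.size:total]


packed = pack_varlena(b"hello")
print(len(packed))             # 4 header bytes + 5 data bytes
print(unpack_varlena(packed))
```

Because the length is stored with the value, a reader can skip over a field without scanning it for a terminator.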
PostgreSQL has two builtin keywords, "isnull" and "notnull" (note no spaces). Version 1.05 and later and 6.* understand IS NULL and IS NOT NULL.
Place the word 'EXPLAIN' at the beginning of the query, for example:
EXPLAIN SELECT * FROM table1 WHERE age = 23;
Postgres does not allow the user to specify a user column as type SERIAL. Instead, you can use each row's oid field as a unique value. However, if you need to dump and reload the database, you need to be using Postgres version 1.07 or later or 6.* with pg_dump's -o option or COPY's WITH OIDS option to preserve the oids.
Another valid way of doing this is to create a function:
create table my_oids (f1 int4);
insert into my_oids values (1);
create function new_oid () returns int4 as
'update my_oids set f1 = f1 + 1; select f1 from my_oids; '
language 'sql';
then:
create table my_stuff (my_key int4, value text);
insert into my_stuff values (new_oid(), 'hello');
However, keep in mind there is a race condition here: one backend could do its update, then another could do its update, and both could then select the same new id. This statement should be performed within a transaction.
Sequences are implemented in 6.1.
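The interleaving that causes the duplicate id can be modelled with a small Python simulation. It only models the order in which the two statements inside new_oid() run for two callers; it does not model real locking or transaction behaviour in the backend, and the class and function names are invented for this sketch.

```python
class Counter:
    """Stands in for the one-row my_oids table."""

    def __init__(self, start=1):
        self.f1 = start

    def update(self):
        # update my_oids set f1 = f1 + 1
        self.f1 += 1

    def select(self):
        # select f1 from my_oids
        return self.f1


def interleaved(counter):
    """Both callers run their update before either runs its select."""
    counter.update()                 # caller A
    counter.update()                 # caller B
    return counter.select(), counter.select()


def serialized(counter):
    """Each caller finishes update + select before the next one starts."""
    counter.update()
    a = counter.select()
    counter.update()
    b = counter.select()
    return a, b


print(interleaved(Counter()))   # both callers see the same id
print(serialized(Counter()))    # each caller sees its own id
```

In the interleaved case the id 2 is never handed out and 3 is handed out twice, which is exactly the failure described above.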
In 6.0, you can not directly create a multi-column index using create index. You need to define a function which acts on the multiple columns, then use create index with that function.
In 6.1, this feature is available.
They are temp_ files generated by the query executor. For example, if a sort needs to be done to satisfy an ORDER BY, some temp files are generated as a result of the sort.
If you have no transactions or sorts running at the time, it is safe to delete the temp_ files.
If you run vacuum in pre-6.0, unused rows will be marked for reuse, but the file blocks are not released.
In 6.0, vacuum properly shrinks tables.
The default configuration allows only connections from tcp/ip host localhost. You need to add a host entry to the file pgsql/data/pg_hba.
You probably used:
create index idx1 on person using btree (name);
PostgreSQL indexes are extensible, and therefore in pre-6.0, you must specify a class_type when creating an index. Read the manual page for create index (called create_index).
In version 6.0, if you do not specify a class_type, it defaults to the proper type for the column.
You have probably defined an incorrect *_ops type class for the field you are indexing.
Run the file pgsql/src/tutorial/syscat.source. It illustrates many of the 'select's needed to get information out of the database system tables.
You have compiled Postgres with flex version 2.5.3. There is a bug in this version of flex. Use flex version 2.5.2 or flex 2.5.4 instead. There is a doc/README.flex file which will properly patch the flex 2.5.3 source code.
This problem can be caused by a kernel that is not configured to support semaphores.
For web integration, PHP/FI is an excellent interface. The URL for that is http://www.vex.net/php/
PHP is great for simple stuff, but for more complex stuff, some still use the Perl interface and CGI.pm.
An example of using WWW with C to talk to Postgres can be tried at:
A WWW gateway based on WDB using Perl can be downloaded from:
PostgreSQL handles data changes differently than most database systems. When a row is changed in a table, the original row is marked with the time it was changed, and a new row is created with the current data. By default, only current rows are used in a table. If you specify a date/time after the table name in a FROM clause, you can access the data that was current at that time. For example,
SELECT *
FROM employees ['July 24, 1996 09:00:00']
displays employee rows in the table at the specified time. You can specify intervals like [date,date], [date,], [,date], or [,]. This last option accesses all rows that ever existed.
INSERTed rows get a timestamp too, so rows that were not in the table at the desired time will not appear.
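The visibility rule described above can be sketched in Python. This is a toy model, not the backend's implementation: the RowVersion class, its field names, and the sample data are all invented for illustration.

```python
from datetime import datetime


class RowVersion:
    """One stored version of a row, with the times it was valid."""

    def __init__(self, data, created, expired=None):
        self.data = data
        self.created = created   # when this version was inserted
        self.expired = expired   # when it was changed or deleted; None if current


def visible_at(versions, when):
    """Versions that were current at the instant `when`."""
    return [v.data for v in versions
            if v.created <= when and (v.expired is None or when < v.expired)]


def current(versions):
    """The default view: only versions that have not been superseded."""
    return [v.data for v in versions if v.expired is None]


history = [
    RowVersion("salary=100", datetime(1996, 1, 1), datetime(1996, 8, 1)),
    RowVersion("salary=120", datetime(1996, 8, 1)),
    RowVersion("new hire", datetime(1996, 9, 1)),
]
print(visible_at(history, datetime(1996, 7, 24, 9, 0)))
print(current(history))
```

The row inserted in September does not appear in the July snapshot, and the superseded salary does not appear in the current view.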
Vacuum removes rows that are no longer current. This time-warp feature is used by the engine for rollback and crash recovery. Expiration times can be set with purge.
In 6.0, once a table is vacuumed, the creation time of a row may be incorrect, causing time-travel to fail.
The time-travel feature will be removed in 7.0.
There are two things that can be done. You can use Openlink's option to disable fsync() by starting the postmaster with a '-o -F' option. This will prevent fsync()'s from flushing to disk after every transaction.
You can also use the postmaster -B option to increase the number of shared memory buffers shared among the backend processes. If you make this parameter too high, the postmaster may fail to start or may crash unexpectedly. Each buffer is 8K and the default is 64 buffers.
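To estimate how much memory a given -B setting asks for, the buffer arithmetic can be sketched in Python. The helper name is made up, and the figure covers only the buffers themselves; the rest of the shared memory segment is not modelled, so the kernel limit must be somewhat larger than the result.

```python
BUFFER_SIZE = 8 * 1024   # each buffer is 8K
DEFAULT_BUFFERS = 64     # default number of buffers


def buffer_pool_bytes(n_buffers=DEFAULT_BUFFERS):
    """Bytes of shared memory taken by the buffers alone (hypothetical helper)."""
    if n_buffers < 1:
        raise ValueError("need at least one buffer")
    return n_buffers * BUFFER_SIZE


print(buffer_pool_bytes() // 1024)      # default setting, in K
print(buffer_pool_bytes(256) // 1024)   # e.g. postmaster -B 256, in K
```

With the defaults the buffers take 512K, and each doubling of -B doubles that figure.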
PostgreSQL has several features that report status information that can be valuable for debugging purposes.
First, by compiling with DEBUG defined, many assert()'s monitor the progress of the backend and halt the program when something unexpected occurs.
Both postmaster and postgres have several debug options available. First, whenever you start the postmaster, make sure you send the standard output and error to a log file, like:
cd /usr/local/pgsql
./bin/postmaster >server.log 2>&1 &
This will put a server.log file in the top-level PostgreSQL directory. This file can contain useful information about problems or errors encountered by the server. Postmaster has a -d option that allows even more detailed information to be reported. The -d option takes a number 1-3 that specifies the debug level. The query plans in a verbose debug file can be formatted using the 'indent' program. (You may need to remove the '====' lines in 1.* releases.) Be warned that a debug level greater than one generates large log files in 1.* releases.
You can actually run the postgres backend from the command line, and type your SQL statement directly. This is recommended ONLY for debugging purposes. Note that a newline terminates the query, not a semicolon. If you have compiled with debugging symbols, you can use a debugger to see what is happening. Because the backend was not started from the postmaster, it is not running in an identical environment and locking/backend interaction problems may not be duplicated. On some operating systems, a debugger can attach to a running backend directly to diagnose problems.
The postgres program has -s, -A, and -t options that can be very useful for debugging and performance measurements.
The EXPLAIN command (see this FAQ) allows you to see how PostgreSQL is interpreting your query.
Oids are Postgres's answer to unique row ids or serial columns. Every row that is created in Postgres gets a unique oid. All oids generated by initdb are less than 16384 (from backend/access/transam.h). All post-initdb (user-created) oids are equal to or greater than this. All these oids are unique not only within a table or database, but within the entire Postgres installation.
Postgres uses oids in its internal system tables to link rows in separate tables. These oids can be used to identify specific user rows and used in joins. It is recommended you use column type oid to store oid values. See the sql(l) manual page to see the other internal columns.
Tids are used to identify specific physical rows with block and offset values. Tids change after rows are modified or reloaded. They are used by index entries to point to physical rows. They can not be accessed through SQL.
Some of the source code and older documentation use terms that have more common usage. Here are some:
Please let me know if you think of any more.
The GEQO module in PostgreSQL is intended to solve the query optimization problem of joining many tables by means of a Genetic Algorithm (GA). It allows the handling of large join queries through non-exhaustive search.
For further information see README.GEQO <utesch@aut.tu-freiberg.de>.
There was a bug in 6.0 that caused this problem under Solaris with -O2 optimization. Upgrade to 6.1.
Edit include/storage/sinvaladt.h, and change the value of MaxBackendId. In the future, we plan to make this a configurable parameter.
The problem could be a number of things. Try testing your user-defined function in a stand-alone test program first. Also, make sure you are not sending elog NOTICES when the front-end is expecting data, such as during type_in() or type_out() functions.
You are pfree'ing something that was not palloc'ed. When writing user-defined functions, do not include the file "libpq-fe.h". Doing so will cause your palloc to be a malloc instead of the backend's palloc. Then, when the backend pfrees the storage, you get the notice message.
Please share them with other PostgreSQL users. Send your extensions to the mailing list, and they will eventually end up in the contrib/ subdirectory.
This requires extreme wizardry, so extreme that the authors have not ever tried it, though in principle it can be done. The short answer is ... you can't. This capability is forthcoming in the future.
Check the current FAQ at http://postgreSQL.org
Also check out our ftp site ftp://ftp.postgreSQL.org/pub to see if there is a more recent PostgreSQL version.
You can also fill out the "bug-template" file and send it to:
This is the address of the developers mailing list.