diff --git a/doc/FAQ b/doc/FAQ index 7d9d3e2fd9..fd99d36088 100644 --- a/doc/FAQ +++ b/doc/FAQ @@ -1,7 +1,7 @@ Frequently Asked Questions (FAQ) for PostgreSQL - Last updated: Tue Feb 26 23:52:13 EST 2002 + Last updated: Sun Mar 3 11:02:16 EST 2002 Current maintainer: Bruce Momjian (pgman@candle.pha.pa.us) @@ -706,28 +706,30 @@ 4.8) My queries are slow or don't make use of the indexes. Why? - PostgreSQL does not automatically maintain statistics. VACUUM must be - run to update the statistics. After statistics are updated, the - optimizer knows how many rows in the table, and can better decide if - it should use indexes. Note that the optimizer does not use indexes in - cases when the table is small because a sequential scan would be - faster. + Indexes are not automatically used by every query. Indexes are only + used if the table is larger than a minimum size, and the index selects + only a small percentage of the rows in the table. This is because the + random disk access caused by an index scan is sometimes slower than a + straight read through the table, or sequential scan. - For column-specific optimization statistics, use VACUUM ANALYZE. - VACUUM ANALYZE is important for complex multijoin queries, so the - optimizer can estimate the number of rows returned from each table, - and choose the proper join order. The backend does not keep track of - column statistics on its own, so VACUUM ANALYZE must be run to collect - them periodically. + To determine if an index should be used, PostgreSQL must have + statistics about the table. These statistics are collected using + VACUUM ANALYZE, or simply ANALYZE. Using statistics, the optimizer + knows how many rows are in the table, and can better determine if + indexes should be used. Statistics are also valuable in determining + optimal join order and join methods. Statistics collection should be + performed periodically as the contents of the table change. - Indexes are usually not used for ORDER BY or joins. A sequential scan - followed by an explicit sort is faster than an indexscan of all tuples - of a large table. This is because random disk access is very slow. + Indexes are normally not used for ORDER BY or to perform joins. A + sequential scan followed by an explicit sort is usually faster than an + index scan of a large table. + However, LIMIT combined with ORDER BY often will use an index because + only a small portion of the table is returned. When using wild-card operators such as LIKE or ~, indexes can only be used if the beginning of the search is anchored to the start of the - string. So, to use indexes, LIKE searches should not begin with %, and - ~(regular expression searches) should start with ^. + string. Therefore, to use indexes, LIKE patterns must not start with + %, and ~(regular expression) patterns must start with ^. 4.9) How do I see how the query optimizer is evaluating my query? diff --git a/doc/src/FAQ/FAQ.html b/doc/src/FAQ/FAQ.html index 9a7a09a097..dbb656686e 100644 --- a/doc/src/FAQ/FAQ.html +++ b/doc/src/FAQ/FAQ.html @@ -14,7 +14,7 @@ alink="#0000ff">

Frequently Asked Questions (FAQ) for PostgreSQL

-

Last updated: Tue Feb 26 23:52:13 EST 2002

+

Last updated: Sun Mar 3 11:02:16 EST 2002

Current maintainer: Bruce Momjian (pgman@candle.pha.pa.us)
@@ -72,7 +72,8 @@ get IpcMemoryCreate errors. Why?
3.4) When I try to start postmaster, I get IpcSemaphoreCreate errors. Why?
- 3.5) How do I control connections from other hosts?
+ 3.5) How do I control connections from other + hosts?
3.6) How do I tune the database engine for better performance?
3.7) What debugging features are available?
@@ -116,9 +117,9 @@ SERIAL insert?
4.15.3) Don't currval() and nextval() lead to a race condition with other users?
- 4.15.4) Why aren't my sequence numbers reused - on transaction abort? Why are there gaps in the numbering of my - sequence/SERIAL column?
+ 4.15.4) Why aren't my sequence numbers + reused on transaction abort? Why are there gaps in the numbering of + my sequence/SERIAL column?
4.16) What is an OID? What is a TID?
4.17) What is the meaning of some of the terms @@ -213,9 +214,9 @@ UNIVERSITY OF CALIFORNIA HAS NO OBLIGATIONS TO PROVIDE MAINTENANCE, SUPPORT, UPDATES, ENHANCEMENTS, OR MODIFICATIONS.

-

The above is the BSD license, the classic open-source license. It - has no restrictions on how the source code may be used. We like it - and have no intention of changing it.

+

The above is the BSD license, the classic open-source license. + It has no restrictions on how the source code may be used. We like + it and have no intention of changing it.

1.3) What Unix platforms does PostgreSQL run on?

@@ -326,9 +327,11 @@ "http://www.PostgreSQL.org/docs/awbook.html">http://www.PostgreSQL.org/docs/awbook.html and http://www.commandprompt.com/ppbook/. - There is a list of PostgreSQL books available for purchase at http://www.postgresql.org/books/. - There is also a collection of PostgreSQL technical articles at http://techdocs.postgresql.org/.

psql has some nice \d commands to show information about @@ -348,9 +351,9 @@

The PostgreSQL book at http://www.PostgreSQL.org/docs/awbook.html - teaches SQL. There is another PostgreSQL book at - - http://www.commandprompt.com/ppbook. + teaches SQL. There is another PostgreSQL book at http://www.commandprompt.com/ppbook. There is a nice tutorial at http://www.intermedia.net/support/sql/sqltut.shtm, at 4.6) How much database disk space is required to store data from a typical text file? -

A PostgreSQL database may require up to five times the disk space - to store data from a text file.

+

A PostgreSQL database may require up to five times the disk + space to store data from a text file.

As an example, consider a file of 100,000 lines with an integer - and text description on each line. Suppose the text string avergages - twenty bytes in length. The flat file would be 2.8 MB. The size - of the PostgreSQL database file containing this data can be - estimated as 6.4 MB:

+ and text description on each line. Suppose the text string + avergages twenty bytes in length. The flat file would be 2.8 MB. + The size of the PostgreSQL database file containing this data can + be estimated as 6.4 MB:

     36 bytes: each row header (approximate)
     24 bytes: one int field and one text filed
@@ -899,33 +902,33 @@
 
     

4.8) My queries are slow or don't make use of the indexes. Why?

+ Indexes are not automatically used by every query. Indexes are only + used if the table is larger than a minimum size, and the index + selects only a small percentage of the rows in the table. This is + because the random disk access caused by an index scan is sometimes + slower than a straight read through the table, or sequential scan. -

PostgreSQL does not automatically maintain statistics. - VACUUM must be run to update the statistics. After - statistics are updated, the optimizer knows how many rows in the - table, and can better decide if it should use indexes. Note that - the optimizer does not use indexes in cases when the table is small - because a sequential scan would be faster.

+

To determine if an index should be used, PostgreSQL must have + statistics about the table. These statistics are collected using + VACUUM ANALYZE, or simply ANALYZE. + Using statistics, the optimizer knows how many rows are in the + table, and can better determine if indexes should be used. + Statistics are also valuable in determining optimal join order and + join methods. Statistics collection should be performed + periodically as the contents of the table change.

-

For column-specific optimization statistics, use VACUUM - ANALYZE. VACUUM ANALYZE is important for - complex multijoin queries, so the optimizer can estimate the number - of rows returned from each table, and choose the proper join order. - The backend does not keep track of column statistics on its own, so - VACUUM ANALYZE must be run to collect them - periodically.

- -

Indexes are usually not used for ORDER BY or - joins. A sequential scan followed by an explicit sort is faster - than an indexscan of all tuples of a large table. This is because - random disk access is very slow.

+

Indexes are normally not used for ORDER BY or to + perform joins. A sequential scan followed by an explicit sort is + usually faster than an index scan of a large table.

+ However, LIMIT combined with ORDER BY + often will use an index because only a small portion of the table + is returned.

When using wild-card operators such as LIKE or ~, indexes can only be used if the beginning of the search - is anchored to the start of the string. So, to use indexes, - LIKE searches should not begin with %, and - ~(regular expression searches) should start with - ^.

+ is anchored to the start of the string. Therefore, to use indexes, + LIKE patterns must not start with %, and + ~(regular expression) patterns must start with ^.

4.9) How do I see how the query optimizer is evaluating my query?

@@ -1091,13 +1094,14 @@ BYTEA bytea variable-length byte array (null-byte safe)

No. Currval() returns the current value assigned by your backend, not by all users.

-

4.15.4) Why aren't my sequence numbers reused - on transaction abort? Why are there gaps in the numbering of my - sequence/SERIAL column?

+

4.15.4) Why aren't my sequence numbers + reused on transaction abort? Why are there gaps in the numbering of + my sequence/SERIAL column?

To improve concurrency, sequence values are given out to running transactions as needed and are not locked until the transaction - completes. This causes gaps in numbering from aborted transactions. + completes. This causes gaps in numbering from aborted + transactions.

4.16) What is an OID? What is a TID?