Make an editorial pass over the newly SGML-ified contrib documentation.

Fix lots of bad markup, bad English, bad explanations.

Second round of commits.  pgcrypto and pgstandby still to go...
This commit is contained in:
Tom Lane 2007-12-10 05:32:51 +00:00
parent a3102ce1ef
commit 8828689ae9
12 changed files with 2351 additions and 2003 deletions

View File

@ -1,82 +1,145 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/intagg.sgml,v 1.3 2007/12/10 05:32:51 tgl Exp $ -->
<sect1 id="intagg">
<title>intagg</title>
<indexterm zone="intagg">
<primary>intagg</primary>
</indexterm>
<para>
This section describes the <literal>intagg</literal> module which provides an integer aggregator and an enumerator.
The <filename>intagg</filename> module provides an integer aggregator and an
enumerator.
</para>
<para>
Many database systems have the notion of a one to many table. Such a table usually sits between two indexed tables, as:
</para>
<programlisting>
CREATE TABLE one_to_many(left INT, right INT) ;
</programlisting>
<sect2>
<title>Functions</title>
<para>
And it is used like this:
The aggregator is an aggregate function
<function>int_array_aggregate(integer)</>
that produces an integer array
containing exactly the integers it is fed.
Here is a not-tremendously-useful example:
</para>
<programlisting>
SELECT right.* from right JOIN one_to_many ON (right.id = one_to_many.right)
WHERE one_to_many.left = item;
test=# select int_array_aggregate(i) from
test-# generate_series(1,10,2) i;
int_array_aggregate
---------------------
{1,3,5,7,9}
(1 row)
</programlisting>
<para>
This will return all the items in the right hand table for an entry
The enumerator is a function
<function>int_array_enum(integer[])</>
that returns <type>setof integer</>. It is essentially the reverse
operation of the aggregator: given an array of integers, expand it
into a set of rows. For example,
</para>
<programlisting>
test=# select * from int_array_enum(array[1,3,5,7,9]);
int_array_enum
----------------
1
3
5
7
9
(5 rows)
</programlisting>
</sect2>
<sect2>
<title>Sample Uses</title>
<para>
Many database systems have the notion of a one to many table. Such a table
usually sits between two indexed tables, for example:
</para>
<programlisting>
CREATE TABLE left (id INT PRIMARY KEY, ...);
CREATE TABLE right (id INT PRIMARY KEY, ...);
CREATE TABLE one_to_many(left INT REFERENCES left, right INT REFERENCES right);
</programlisting>
<para>
It is typically used like this:
</para>
<programlisting>
SELECT right.* from right JOIN one_to_many ON (right.id = one_to_many.right)
WHERE one_to_many.left = <replaceable>item</>;
</programlisting>
<para>
This will return all the items in the right hand table for an entry
in the left hand table. This is a very common construct in SQL.
</para>
<para>
Now, this methodology can be cumbersome with a very large number of
entries in the one_to_many table. Depending on the order in which
data was entered, a join like this could result in an index scan
entries in the <structname>one_to_many</> table. Often,
a join like this would result in an index scan
and a fetch for each right hand entry in the table for a particular
left hand entry. If you have a very dynamic system, there is not much you
left hand entry. If you have a very dynamic system, there is not much you
can do. However, if you have some data which is fairly static, you can
create a summary table with the aggregator.
</para>
<programlisting>
CREATE TABLE summary as SELECT left, int_array_aggregate(right)
AS right FROM one_to_many GROUP BY left;
CREATE TABLE summary as
SELECT left, int_array_aggregate(right) AS right
FROM one_to_many
GROUP BY left;
</programlisting>
<para>
This will create a table with one row per left item, and an array
of right items. Now this is pretty useless without some way of using
the array, thats why there is an array enumerator.
the array; that's why there is an array enumerator. You can do
</para>
<programlisting>
SELECT left, int_array_enum(right) FROM summary WHERE left = item;
SELECT left, int_array_enum(right) FROM summary WHERE left = <replaceable>item</>;
</programlisting>
<para>
The above query using int_array_enum, produces the same results as:
The above query using <function>int_array_enum</> produces the same results
as
</para>
<programlisting>
SELECT left, right FROM one_to_many WHERE left = item;
SELECT left, right FROM one_to_many WHERE left = <replaceable>item</>;
</programlisting>
<para>
The difference is that the query against the summary table has to get
only one row from the table, where as the query against "one_to_many"
must index scan and fetch a row for each entry.
only one row from the table, whereas the direct query against
<structname>one_to_many</> must index scan and fetch a row for each entry.
</para>
<para>
On our system, an EXPLAIN shows a query with a cost of 8488 gets reduced
to a cost of 329. The query is a join between the one_to_many table,
</para>
<programlisting>
SELECT right, count(right) FROM
(
SELECT left, int_array_enum(right) AS right FROM summary JOIN
(SELECT left FROM left_table WHERE left = item) AS lefts
ON (summary.left = lefts.left )
) AS list GROUP BY right ORDER BY count DESC ;
</programlisting>
</sect1>
<para>
On one system, an <command>EXPLAIN</> showed a query with a cost of 8488 was
reduced to a cost of 329. The original query was a join involving the
<structname>one_to_many</> table, which was replaced by:
</para>
<programlisting>
SELECT right, count(right) FROM
( SELECT left, int_array_enum(right) AS right
FROM summary JOIN (SELECT left FROM left_table WHERE left = <replaceable>item</>) AS lefts
ON (summary.left = lefts.left)
) AS list
GROUP BY right
ORDER BY count DESC;
</programlisting>
</sect2>
</sect1>

View File

@ -1,196 +1,207 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/intarray.sgml,v 1.5 2007/12/10 05:32:51 tgl Exp $ -->
<sect1 id="intarray">
<title>intarray</title>
<indexterm zone="intarray">
<primary>intarray</primary>
</indexterm>
<para>
This is an implementation of RD-tree data structure using GiST interface
of PostgreSQL. It has built-in lossy compression.
</para>
<para>
Current implementation provides index support for one-dimensional array of
integers: gist__int_ops, suitable for small and medium size of arrays (used by
default), and gist__intbig_ops for indexing large arrays (we use superimposed
signature with length of 4096 bits to represent sets). There is also a
non-default gin__int_ops for GIN indexes on integer arrays.
The <filename>intarray</> module provides a number of useful functions
and operators for manipulating one-dimensional arrays of integers.
There is also support for indexed searches using some of the operators.
</para>
<sect2>
<title>Functions</title>
<title><filename>intarray</> Functions and Operators</title>
<itemizedlist>
<table id="intarray-func-table">
<title><filename>intarray</> Functions</title>
<listitem>
<para>
<literal>int icount(int[])</literal> - the number of elements in intarray
</para>
<programlisting>
test=# select icount('{1,2,3}'::int[]);
icount
--------
3
(1 row)
</programlisting>
</listitem>
<tgroup cols="5">
<thead>
<row>
<entry>Function</entry>
<entry>Return Type</entry>
<entry>Description</entry>
<entry>Example</entry>
<entry>Result</entry>
</row>
</thead>
<listitem>
<para>
<literal>int[] sort(int[], 'asc' | 'desc')</literal> - sort intarray
</para>
<programlisting>
test=# select sort('{1,2,3}'::int[],'desc');
sort
---------
{3,2,1}
(1 row)
</programlisting>
</listitem>
<tbody>
<row>
<entry><function>icount(int[])</function></entry>
<entry><type>int</type></entry>
<entry>number of elements in array</entry>
<entry><literal>icount('{1,2,3}'::int[])</literal></entry>
<entry><literal>3</literal></entry>
</row>
<listitem>
<para>
<literal>int[] sort(int[])</literal> - sort in ascending order
</para>
</listitem>
<row>
<entry><function>sort(int[], text dir)</function></entry>
<entry><type>int[]</type></entry>
<entry>sort array &mdash; <parameter>dir</> must be <literal>asc</> or <literal>desc</></entry>
<entry><literal>sort('{1,2,3}'::int[], 'desc')</literal></entry>
<entry><literal>{3,2,1}</literal></entry>
</row>
<listitem>
<para>
<literal>int[] sort_asc(int[]),sort_desc(int[])</literal> - shortcuts for sort
</para>
</listitem>
<row>
<entry><function>sort(int[])</function></entry>
<entry><type>int[]</type></entry>
<entry>sort in ascending order</entry>
<entry><literal>sort(array[11,77,44])</literal></entry>
<entry><literal>{11,44,77}</literal></entry>
</row>
<listitem>
<para>
<literal>int[] uniq(int[])</literal> - returns unique elements
</para>
<programlisting>
test=# select uniq(sort('{1,2,3,2,1}'::int[]));
uniq
---------
{1,2,3}
(1 row)
</programlisting>
</listitem>
<row>
<entry><function>sort_asc(int[])</function></entry>
<entry><type>int[]</type></entry>
<entry>sort in ascending order</entry>
<entry><literal></literal></entry>
<entry><literal></literal></entry>
</row>
<listitem>
<para>
<literal>int idx(int[], int item)</literal> - returns index of first
intarray matching element to item, or '0' if matching failed.
</para>
<programlisting>
test=# select idx('{1,2,3,2,1}'::int[],2);
idx
-----
2
(1 row)
</programlisting>
</listitem>
<row>
<entry><function>sort_desc(int[])</function></entry>
<entry><type>int[]</type></entry>
<entry>sort in descending order</entry>
<entry><literal></literal></entry>
<entry><literal></literal></entry>
</row>
<listitem>
<para>
<literal>int[] subarray(int[],int START [, int LEN])</literal> - returns
part of intarray starting from element number START (from 1) and length LEN.
</para>
<programlisting>
test=# select subarray('{1,2,3,2,1}'::int[],2,3);
subarray
----------
{2,3,2}
(1 row)
</programlisting>
</listitem>
<row>
<entry><function>uniq(int[])</function></entry>
<entry><type>int[]</type></entry>
<entry>remove adjacent duplicates</entry>
<entry><literal>uniq(sort('{1,2,3,2,1}'::int[]))</literal></entry>
<entry><literal>{1,2,3}</literal></entry>
</row>
<listitem>
<para>
<literal>int[] intset(int4)</literal> - casting int4 to int[]
</para>
<programlisting>
test=# select intset(1);
intset
--------
{1}
(1 row)
</programlisting>
</listitem>
<row>
<entry><function>idx(int[], int item)</function></entry>
<entry><type>int</type></entry>
<entry>index of first element matching <parameter>item</> (0 if none)</entry>
<entry><literal>idx(array[11,22,33,22,11], 22)</literal></entry>
<entry><literal>2</literal></entry>
</row>
</itemizedlist>
</sect2>
<row>
<entry><function>subarray(int[], int start, int len)</function></entry>
<entry><type>int[]</type></entry>
<entry>portion of array starting at position <parameter>start</>, len <parameter>elements</></entry>
<entry><literal>subarray('{1,2,3,2,1}'::int[], 2, 3)</literal></entry>
<entry><literal>{2,3,2}</literal></entry>
</row>
<sect2>
<title>Operations</title>
<table>
<title>Operations</title>
<tgroup cols="2">
<row>
<entry><function>subarray(int[], int start)</function></entry>
<entry><type>int[]</type></entry>
<entry>portion of array starting at position <parameter>start</></entry>
<entry><literal>subarray('{1,2,3,2,1}'::int[], 2)</literal></entry>
<entry><literal>{2,3,2,1}</literal></entry>
</row>
<row>
<entry><function>intset(int)</function></entry>
<entry><type>int[]</type></entry>
<entry>make single-element array</entry>
<entry><literal>intset(42)</literal></entry>
<entry><literal>{42}</literal></entry>
</row>
</tbody>
</tgroup>
</table>
<table id="intarray-op-table">
<title><filename>intarray</> Operators</title>
<tgroup cols="3">
<thead>
<row>
<entry>Operator</entry>
<entry>Returns</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry><literal>int[] &amp;&amp; int[]</literal></entry>
<entry>overlap - returns TRUE if arrays have at least one common element</entry>
<entry><type>boolean</type></entry>
<entry>overlap &mdash; <literal>true</> if arrays have at least one common element</entry>
</row>
<row>
<entry><literal>int[] @&gt; int[]</literal></entry>
<entry>contains - returns TRUE if left array contains right array</entry>
<entry><type>boolean</type></entry>
<entry>contains &mdash; <literal>true</> if left array contains right array</entry>
</row>
<row>
<entry><literal>int[] &lt;@ int[]</literal></entry>
<entry>contained - returns TRUE if left array is contained in right array</entry>
<entry><type>boolean</type></entry>
<entry>contained &mdash; <literal>true</> if left array is contained in right array</entry>
</row>
<row>
<entry><literal># int[]</literal></entry>
<entry>returns the number of elements in array</entry>
<entry><type>int</type></entry>
<entry>number of elements in array</entry>
</row>
<row>
<entry><literal>int[] # int</literal></entry>
<entry><type>int</type></entry>
<entry>index (same as <function>idx</> function)</entry>
</row>
<row>
<entry><literal>int[] + int</literal></entry>
<entry>push element to array ( add to end of array)</entry>
<entry><type>int[]</type></entry>
<entry>push element onto array (add it to end of array)</entry>
</row>
<row>
<entry><literal>int[] + int[] </literal></entry>
<entry>merge of arrays (right array added to the end of left one)</entry>
<entry><type>int[]</type></entry>
<entry>array concatenation (right array added to the end of left one)</entry>
</row>
<row>
<entry><literal>int[] - int</literal></entry>
<entry>remove entries matched by right argument from array</entry>
<entry><type>int[]</type></entry>
<entry>remove entries matching right argument from array</entry>
</row>
<row>
<entry><literal>int[] - int[]</literal></entry>
<entry>remove right array from left</entry>
<entry><type>int[]</type></entry>
<entry>remove elements of right array from left</entry>
</row>
<row>
<entry><literal>int[] | int</literal></entry>
<entry>returns intarray - union of arguments</entry>
<entry><type>int[]</type></entry>
<entry>union of arguments</entry>
</row>
<row>
<entry><literal>int[] | int[]</literal></entry>
<entry>returns intarray as a union of two arrays</entry>
<entry><type>int[]</type></entry>
<entry>union of arrays</entry>
</row>
<row>
<entry><literal>int[] &amp; int[]</literal></entry>
<entry>returns intersection of arrays</entry>
<entry><type>int[]</type></entry>
<entry>intersection of arrays</entry>
</row>
<row>
<entry><literal>int[] @@ query_int</literal></entry>
<entry>
returns TRUE if array satisfies query (like
<literal>'1&amp;(2|3)'</literal>)
</entry>
<entry><type>boolean</type></entry>
<entry><literal>true</> if array satisfies query (see below)</entry>
</row>
<row>
<entry><literal>query_int ~~ int[]</literal></entry>
<entry>returns TRUE if array satisfies query (commutator of @@)</entry>
<entry><type>boolean</type></entry>
<entry><literal>true</> if array satisfies query (commutator of <literal>@@</>)</entry>
</row>
</tbody>
</tgroup>
</table>
<para>
(Before PostgreSQL 8.2, the containment operators @&gt; and &lt;@ were
respectively called @ and ~. These names are still available, but are
@ -198,85 +209,102 @@ test=# select intset(1);
are reversed from the convention formerly followed by the core geometric
datatypes!)
</para>
<para>
The <literal>@@</> and <literal>~~</> operators test whether an array
satisfies a <firstterm>query</>, which is expressed as a value of a
specialized data type <type>query_int</>. A <firstterm>query</>
consists of integer values that are checked against the elements of
the array, possibly combined using the operators <literal>&amp;</>
(AND), <literal>|</> (OR), and <literal>!</> (NOT). Parentheses
can be used as needed. For example,
the query <literal>1&amp;(2|3)</> matches arrays that contain 1
and also contain either 2 or 3.
</para>
</sect2>
<sect2>
<title>Index Support</title>
<para>
<filename>intarray</> provides index support for the
<literal>&amp;&amp;</>, <literal>@&gt;</>, <literal>&lt;@</>,
and <literal>@@</> operators, as well as regular array equality.
The implementation uses an RD-tree data structure with
built-in lossy compression.
</para>
<para>
Two GiST index operator classes are provided:
<literal>gist__int_ops</> (used by default) is suitable for
small and medium-size arrays, while
<literal>gist__intbig_ops</> uses a larger signature and is more
suitable for indexing large arrays.
</para>
<para>
There is also a non-default GIN operator class
<literal>gin__int_ops</>.
</para>
<para>
The choice between GiST and GIN indexing depends on the relative
performance characteristics of GiST and GIN, which are discussed elsewhere.
As a rule of thumb, a GIN index is faster to search than a GiST index, but
slower to build or update; so GIN is better suited for static data and GiST
for often-updated data.
</para>
</sect2>
<sect2>
<title>Example</title>
<programlisting>
CREATE TABLE message (mid INT NOT NULL,sections INT[]);
CREATE TABLE message_section_map (mid INT NOT NULL,sid INT NOT NULL);
-- a message can be in one or more <quote>sections</>
CREATE TABLE message (mid INT PRIMARY KEY, sections INT[], ...);
-- create indices
CREATE unique index message_key ON message ( mid );
CREATE unique index message_section_map_key2 ON message_section_map (sid, mid );
CREATE INDEX message_rdtree_idx ON message USING GIST ( sections gist__int_ops);
-- create specialized index
CREATE INDEX message_rdtree_idx ON message USING GIST (sections gist__int_ops);
-- select some messages with section in 1 OR 2 - OVERLAP operator
SELECT message.mid FROM message WHERE message.sections &amp;&amp; '{1,2}';
-- select messages in section 1 OR 2 - OVERLAP operator
SELECT message.mid FROM message WHERE message.sections &amp;&amp; '{1,2}';
-- select messages contains in sections 1 AND 2 - CONTAINS operator
-- select messages in sections 1 AND 2 - CONTAINS operator
SELECT message.mid FROM message WHERE message.sections @&gt; '{1,2}';
-- the same, CONTAINED operator
SELECT message.mid FROM message WHERE '{1,2}' &lt;@ message.sections;
-- the same, using QUERY operator
SELECT message.mid FROM message WHERE message.sections @@ '1&amp;2'::query_int;
</programlisting>
</sect2>
<sect2>
<title>Benchmark</title>
<para>
subdirectory bench contains benchmark suite.
The source directory <filename>contrib/intarray/bench</> contains a
benchmark test suite. To run:
</para>
<programlisting>
cd ./bench
1. createdb TEST
2. psql TEST &lt; ../_int.sql
3. ./create_test.pl | psql TEST
4. ./bench.pl - perl script to benchmark queries, supports OR, AND queries
with/without RD-Tree. Run script without arguments to
see availbale options.
a)test without RD-Tree (OR)
./bench.pl -d TEST -c -s 1,2 -v
b)test with RD-Tree
./bench.pl -d TEST -c -s 1,2 -v -r
BENCHMARKS:
Size of table &lt;message>: 200000
Size of table &lt;message_section_map>: 269133
Distribution of messages by sections:
section 0: 74377 messages
section 1: 16284 messages
section 50: 1229 messages
section 99: 683 messages
old - without RD-Tree support,
new - with RD-Tree
+----------+---------------+----------------+
|Search set|OR, time in sec|AND, time in sec|
| +-------+-------+--------+-------+
| | old | new | old | new |
+----------+-------+-------+--------+-------+
| 1| 0.625| 0.101| -| -|
+----------+-------+-------+--------+-------+
| 99| 0.018| 0.017| -| -|
+----------+-------+-------+--------+-------+
| 1,2| 0.766| 0.133| 0.628| 0.045|
+----------+-------+-------+--------+-------+
| 1,2,50,65| 0.794| 0.141| 0.030| 0.006|
+----------+-------+-------+--------+-------+
cd .../bench
createdb TEST
psql TEST &lt; ../_int.sql
./create_test.pl | psql TEST
./bench.pl
</programlisting>
<para>
The <filename>bench.pl</> script has numerous options, which
are displayed when it is run without any arguments.
</para>
</sect2>
<sect2>
<title>Authors</title>
<para>
All work was done by Teodor Sigaev (<email>teodor@stack.net</email>) and Oleg
Bartunov (<email>oleg@sai.msu.su</email>). See
All work was done by Teodor Sigaev (<email>teodor@stack.net</email>) and
Oleg Bartunov (<email>oleg@sai.msu.su</email>). See
<ulink url="http://www.sai.msu.su/~megera/postgres/gist"></ulink> for
additional information. Andrey Oktyabrski did a great work on adding new
functions and operations.
@ -284,4 +312,3 @@ SELECT message.mid FROM message WHERE '{1,2}' &lt;@ message.sections;
</sect2>
</sect1>

View File

@ -1,429 +1,299 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/isn.sgml,v 1.4 2007/12/10 05:32:51 tgl Exp $ -->
<sect1 id="isn">
<title>isn</title>
<indexterm zone="isn">
<primary>isn</primary>
</indexterm>
<para>
The <literal>isn</literal> module adds data types for the following
international-standard namespaces: EAN13, UPC, ISBN (books), ISMN (music),
and ISSN (serials). This module is inspired by Garrett A. Wollman's
isbn_issn code.
The <filename>isn</filename> module provides data types for the following
international product numbering standards: EAN13, UPC, ISBN (books), ISMN
(music), and ISSN (serials). Numbers are validated on input, and correctly
hyphenated on output.
</para>
<para>
This module validates, and automatically adds the correct
hyphenations to the numbers. Also, it supports the new ISBN-13
numbers to be used starting in January 2007.
</para>
<para>
Premises:
</para>
<orderedlist>
<listitem>
<para>ISBN13, ISMN13, ISSN13 numbers are all EAN13 numbers</para>
</listitem>
<listitem>
<para>EAN13 numbers aren't always ISBN13, ISMN13 or ISSN13 (some are)</para>
</listitem>
<listitem>
<para>some ISBN13 numbers can be displayed as ISBN</para>
</listitem>
<listitem>
<para>some ISMN13 numbers can be displayed as ISMN</para>
</listitem>
<listitem>
<para>some ISSN13 numbers can be displayed as ISSN</para>
</listitem>
<listitem>
<para>all UPC, ISBN, ISMN and ISSN can be represented as EAN13 numbers</para>
</listitem>
</orderedlist>
<note>
<para>
All types are internally represented as 64 bit integers,
and internally all are consistently interchangeable.
</para>
</note>
<note>
<para>
We have two operator classes (for btree and for hash) so each data type
can be indexed for faster access.
</para>
</note>
<sect2>
<title>Data types</title>
<para>
We have the following data types:
</para>
<table>
<title>Data types</title>
<title><filename>isn</filename> data types</title>
<tgroup cols="2">
<thead>
<row>
<entry><para>Data type</para></entry>
<entry><para>Description</para></entry>
</row>
</thead>
<tbody>
<row>
<entry><para><literal>EAN13</literal></para></entry>
<entry>
<para>
European Article Numbers. This type will always show the EAN13-display
format. Te output function for this is <literal>ean13_out()</literal>
</para>
</entry>
</row>
<row>
<entry><para><literal>ISBN13</literal></para></entry>
<entry>
<para>
For International Standard Book Numbers to be displayed in
the new EAN13-display format.
</para>
</entry>
</row>
<row>
<entry><para><literal>ISMN13</literal></para></entry>
<entry>
<para>
For International Standard Music Numbers to be displayed in
the new EAN13-display format.
</para>
</entry>
</row>
<row>
<entry><para><literal>ISSN13</literal></para></entry>
<entry>
<para>
For International Standard Serial Numbers to be displayed in the new
EAN13-display format.
</para>
</entry>
</row>
<row>
<entry><para><literal>ISBN</literal></para></entry>
<entry>
<para>
For International Standard Book Numbers to be displayed in the current
short-display format.
</para>
</entry>
</row>
<row>
<entry><para><literal>ISMN</literal></para></entry>
<entry>
<para>
For International Standard Music Numbers to be displayed in the
current short-display format.
</para>
</entry>
</row>
<row>
<entry><para><literal>ISSN</literal></para></entry>
<entry>
<para>
For International Standard Serial Numbers to be displayed in the
current short-display format. These types will display the short
version of the ISxN (ISxN 10) whenever it's possible, and it will
show ISxN 13 when it's impossible to show the short version. The
output function to do this is <literal>isn_out()</literal>
</para>
</entry>
</row>
<row>
<entry><para><literal>UPC</literal></para></entry>
<entry>
<para>
For Universal Product Codes. UPC numbers are a subset of the EAN13
numbers (they are basically EAN13 without the first '0' digit.)
The output function to do this is also <literal>isn_out()</literal>
</para>
</entry>
</row>
</tbody>
</tgroup>
</table>
<note>
<para>
<literal>EAN13</literal>, <literal>ISBN13</literal>,
<literal>ISMN13</literal> and <literal>ISSN13</literal> types will always
display the long version of the ISxN (EAN13). The output function to do
this is <literal>ean13_out()</literal>.
</para>
<para>
The need for these types is just for displaying in different ways the same
data: <literal>ISBN13</literal> is actually the same as
<literal>ISBN</literal>, <literal>ISMN13=ISMN</literal> and
<literal>ISSN13=ISSN</literal>.
</para>
</note>
</sect2>
<sect2>
<title>Input functions</title>
<para>
We have the following input functions:
</para>
<table>
<title>Input functions</title>
<tgroup cols="2">
<thead>
<row>
<entry>Function</entry>
<entry>Data type</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry><para><literal>ean13_in()</literal></para></entry>
<entry><type>EAN13</type></entry>
<entry>
<para>
To take a string and return an EAN13.
</para>
European Article Numbers, always displayed in the EAN13 display format
</entry>
</row>
<row>
<entry><para><literal>isbn_in()</literal></para></entry>
<entry><type>ISBN13</type></entry>
<entry>
<para>
To take a string and return valid ISBN or ISBN13 numbers.
</para>
International Standard Book Numbers to be displayed in
the new EAN13 display format
</entry>
</row>
<row>
<entry><para><literal>ismn_in()</literal></para></entry>
<entry><type>ISMN13</type></entry>
<entry>
<para>
To take a string and return valid ISMN or ISMN13 numbers.
</para>
</entry>
</row>
<row>
<entry><para><literal>issn_in()</literal></para></entry>
<entry>
<para>
To take a string and return valid ISSN or ISSN13 numbers.
</para>
International Standard Music Numbers to be displayed in
the new EAN13 display format
</entry>
</row>
<row>
<entry><para><literal>upc_in()</literal></para></entry>
<entry><type>ISSN13</type></entry>
<entry>
<para>
To take a string and return an UPC codes.
</para>
International Standard Serial Numbers to be displayed in the new
EAN13 display format
</entry>
</row>
<row>
<entry><type>ISBN</type></entry>
<entry>
International Standard Book Numbers to be displayed in the old
short display format
</entry>
</row>
<row>
<entry><type>ISMN</type></entry>
<entry>
International Standard Music Numbers to be displayed in the
old short display format
</entry>
</row>
<row>
<entry><type>ISSN</type></entry>
<entry>
International Standard Serial Numbers to be displayed in the
old short display format
</entry>
</row>
<row>
<entry><type>UPC</type></entry>
<entry>
Universal Product Codes
</entry>
</row>
</tbody>
</tgroup>
</table>
<para>
Some notes:
</para>
<orderedlist>
<listitem>
<para>ISBN13, ISMN13, ISSN13 numbers are all EAN13 numbers.</para>
</listitem>
<listitem>
<para>EAN13 numbers aren't always ISBN13, ISMN13 or ISSN13 (some
are).</para>
</listitem>
<listitem>
<para>Some ISBN13 numbers can be displayed as ISBN.</para>
</listitem>
<listitem>
<para>Some ISMN13 numbers can be displayed as ISMN.</para>
</listitem>
<listitem>
<para>Some ISSN13 numbers can be displayed as ISSN.</para>
</listitem>
<listitem>
<para>UPC numbers are a subset of the EAN13 numbers (they are basically
EAN13 without the first <literal>0</> digit).</para>
</listitem>
<listitem>
<para>All UPC, ISBN, ISMN and ISSN numbers can be represented as EAN13
numbers.</para>
</listitem>
</orderedlist>
<para>
Internally, all these types use the same representation (a 64-bit
integer), and all are interchangeable. Multiple types are provided
to control display formatting and to permit tighter validity checking
of input that is supposed to denote one particular type of number.
</para>
<para>
The <type>ISBN</>, <type>ISMN</>, and <type>ISSN</> types will display the
short version of the number (ISxN 10) whenever it's possible, and will show
ISxN 13 format for numbers that do not fit in the short version.
The <type>EAN13</type>, <type>ISBN13</type>, <type>ISMN13</type> and
<type>ISSN13</type> types will always display the long version of the ISxN
(EAN13).
</para>
</sect2>
<sect2>
<title>Casts</title>
<para>
We are able to cast from:
The <filename>isn</> module provides the following pairs of type casts:
</para>
<itemizedlist>
<listitem>
<para>
ISBN13 -&gt; EAN13
ISBN13 &lt;=&gt; EAN13
</para>
</listitem>
<listitem>
<para>
ISMN13 -&gt; EAN13
ISMN13 &lt;=&gt; EAN13
</para>
</listitem>
<listitem>
<para>
ISSN13 -&gt; EAN13
ISSN13 &lt;=&gt; EAN13
</para>
</listitem>
<listitem>
<para>
ISBN -&gt; EAN13
ISBN &lt;=&gt; EAN13
</para>
</listitem>
<listitem>
<para>
ISMN -&gt; EAN13
ISMN &lt;=&gt; EAN13
</para>
</listitem>
<listitem>
<para>
ISSN -&gt; EAN13
ISSN &lt;=&gt; EAN13
</para>
</listitem>
<listitem>
<para>
UPC -&gt; EAN13
UPC &lt;=&gt; EAN13
</para>
</listitem>
<listitem>
<para>
ISBN &lt;-&gt; ISBN13
ISBN &lt;=&gt; ISBN13
</para>
</listitem>
<listitem>
<para>
ISMN &lt;-&gt; ISMN13
ISMN &lt;=&gt; ISMN13
</para>
</listitem>
<listitem>
<para>
ISSN &lt;-&gt; ISSN13
</para>
</listitem>
</itemizedlist>
</sect2>
<sect2>
<title>C API</title>
<para>
The C API is implemented as:
</para>
<programlisting>
extern Datum isn_out(PG_FUNCTION_ARGS);
extern Datum ean13_out(PG_FUNCTION_ARGS);
extern Datum ean13_in(PG_FUNCTION_ARGS);
extern Datum isbn_in(PG_FUNCTION_ARGS);
extern Datum ismn_in(PG_FUNCTION_ARGS);
extern Datum issn_in(PG_FUNCTION_ARGS);
extern Datum upc_in(PG_FUNCTION_ARGS);
</programlisting>
<para>
On success:
</para>
<itemizedlist>
<listitem>
<para>
<literal>isn_out()</literal> takes any of our types and returns a string containing
the shortes possible representation of the number.
</para>
</listitem>
<listitem>
<para>
<literal>ean13_out()</literal> takes any of our types and returns the
EAN13 (long) representation of the number.
</para>
</listitem>
<listitem>
<para>
<literal>ean13_in()</literal> takes a string and return a EAN13. Which, as stated in (2)
could or could not be any of our types, but it certainly is an EAN13
number. Only if the string is a valid EAN13 number, otherwise it fails.
</para>
</listitem>
<listitem>
<para>
<literal>isbn_in()</literal> takes a string and return an ISBN/ISBN13. Only if the string
is really a ISBN/ISBN13, otherwise it fails.
</para>
</listitem>
<listitem>
<para>
<literal>ismn_in()</literal> takes a string and return an ISMN/ISMN13. Only if the string
is really a ISMN/ISMN13, otherwise it fails.
</para>
</listitem>
<listitem>
<para>
<literal>issn_in()</literal> takes a string and return an ISSN/ISSN13. Only if the string
is really a ISSN/ISSN13, otherwise it fails.
</para>
</listitem>
<listitem>
<para>
<literal>upc_in()</literal> takes a string and return an UPC. Only if the string is
really a UPC, otherwise it fails.
ISSN &lt;=&gt; ISSN13
</para>
</listitem>
</itemizedlist>
<para>
(on failure, the functions 'ereport' the error)
When casting from <type>EAN13</> to another type, there is a run-time
check that the value is within the domain of the other type, and an error
is thrown if not. The other casts are simply relabelings that will
always succeed.
</para>
</sect2>
<sect2>
<title>Testing functions</title>
<title>Functions and Operators</title>
<para>
The <filename>isn</> module provides the standard comparison operators,
plus btree and hash indexing support for all these datatypes. In
addition there are several specialized functions. In this table,
<type>isn</> means any one of the module's data types:
</para>
<table>
<title>Testing functions</title>
<tgroup cols="2">
<title><filename>isn</> functions</title>
<tgroup cols="3">
<thead>
<row>
<entry><para>Function</para></entry>
<entry><para>Description</para></entry>
<entry>Function</entry>
<entry>Returns</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry><para><literal>isn_weak(boolean)</literal></para></entry>
<entry><para>Sets the weak input mode.</para></entry>
<entry><function>isn_weak(boolean)</function></entry>
<entry><type>boolean</type></entry>
<entry>Sets the weak input mode (returns new setting)</entry>
</row>
<row>
<entry><para><literal>isn_weak()</literal></para></entry>
<entry><para>Gets the current status of the weak mode.</para></entry>
<entry><function>isn_weak()</function></entry>
<entry><type>boolean</type></entry>
<entry>Gets the current status of the weak mode</entry>
</row>
<row>
<entry><para><literal>make_valid()</literal></para></entry>
<entry><para>Validates an invalid number (deleting the invalid flag).</para></entry>
<entry><function>make_valid(isn)</function></entry>
<entry><type>isn</type></entry>
<entry>Validates an invalid number (clears the invalid flag)</entry>
</row>
<row>
<entry><para><literal>is_valid()</literal></para></entry>
<entry><para>Checks for the invalid flag prsence.</para></entry>
<entry><function>is_valid(isn)</function></entry>
<entry><type>boolean</type></entry>
<entry>Checks for the presence of the invalid flag</entry>
</row>
</tbody>
</tgroup>
</table>
<para>
<literal>Weak</literal> mode is used to be able to insert invalid data to
a table. Invalid as in the check digit being wrong, not missing numbers.
<firstterm>Weak</firstterm> mode is used to be able to insert invalid data
into a table. Invalid means the check digit is wrong, not that there are
missing numbers.
</para>
<para>
Why would you want to use the weak mode? Well, it could be that
you have a huge collection of ISBN numbers, and that there are so many of
them that for weird reasons some have the wrong check digit (perhaps the
numbers where scanned from a printed list and the OCR got the numbers wrong,
perhaps the numbers were manually captured... who knows.) Anyway, the thing
is you might want to clean the mess up, but you still want to be able to have
all the numbers in your database and maybe use an external tool to access
the invalid numbers in the database so you can verify the information and
validate it more easily; as selecting all the invalid numbers in the table.
numbers were scanned from a printed list and the OCR got the numbers wrong,
perhaps the numbers were manually captured... who knows). Anyway, the point
is you might want to clean the mess up, but you still want to be able to
have all the numbers in your database and maybe use an external tool to
locate the invalid numbers in the database so you can verify the
information and validate it more easily; so for example you'd want to
select all the invalid numbers in the table.
</para>
<para>
When you insert invalid numbers in a table using the weak mode, the number
will be inserted with the corrected check digit, but it will be flagged
with an exclamation mark ('!') at the end (i.e. 0-11-000322-5!)
will be inserted with the corrected check digit, but it will be displayed
with an exclamation mark (<literal>!</>) at the end, for example
<literal>0-11-000322-5!</>. This invalid marker can be checked with
the <function>is_valid</> function and cleared with the
<function>make_valid</> function.
</para>
<para>
You can also force the insertion of invalid numbers even not in the weak mode,
appending the '!' character at the end of the number.
You can also force the insertion of invalid numbers even when not in the
weak mode, by appending the <literal>!</> character at the end of the
number.
</para>
<para>
Another special feature is that during input, you can write
<literal>?</> in place of the check digit, and the correct check digit
will be inserted automatically.
</para>
</sect2>
<sect2>
<title>Examples</title>
<programlisting>
--Using the types directly:
SELECT isbn('978-0-393-04002-9');
@ -431,18 +301,18 @@ SELECT isbn13('0901690546');
SELECT issn('1436-4522');
--Casting types:
-- note that you can only cast from ean13 to other type when the casted
-- number would be valid in the realm of the casted type;
-- thus, the following will NOT work: select isbn(ean13('0220356483481'));
-- but these will:
SELECT upc(ean13('0220356483481'));
SELECT ean13(upc('220356483481'));
-- note that you can only cast from ean13 to another type when the
-- number would be valid in the realm of the target type;
-- thus, the following will NOT work: select isbn(ean13('0220356483481'));
-- but these will:
SELECT upc(ean13('0220356483481'));
SELECT ean13(upc('220356483481'));
--Create a table with a single column to hold ISBN numbers:
CREATE TABLE test ( id isbn );
CREATE TABLE test (id isbn);
INSERT INTO test VALUES('9780393040029');
--Automatically calculating check digits (observe the '?'):
--Automatically calculate check digits (observe the '?'):
INSERT INTO test VALUES('220500896?');
INSERT INTO test VALUES('978055215372?');
@ -455,10 +325,10 @@ INSERT INTO test VALUES('978-0-11-000533-4');
INSERT INTO test VALUES('9780141219307');
INSERT INTO test VALUES('2-205-00876-X');
SELECT isn_weak(false);
SELECT id FROM test WHERE NOT is_valid(id);
UPDATE test SET id=make_valid(id) WHERE id = '2-205-00876-X!';
UPDATE test SET id = make_valid(id) WHERE id = '2-205-00876-X!';
SELECT * FROM test;
SELECT isbn13(id) FROM test;
@ -467,8 +337,9 @@ SELECT isbn13(id) FROM test;
<sect2>
<title>Bibliography</title>
<para>
The information to implement this module was collected through
The information to implement this module was collected from
several sites, including:
</para>
<programlisting>
@ -477,26 +348,33 @@ SELECT isbn13(id) FROM test;
http://www.ismn-international.org/
http://www.wikipedia.org/
</programlisting>
<para>
the prefixes used for hyphenation where also compiled from:
The prefixes used for hyphenation were also compiled from:
</para>
<programlisting>
http://www.gs1.org/productssolutions/idkeys/support/prefix_list.html
http://www.isbn-international.org/en/identifiers.html
http://www.ismn-international.org/ranges.html
</programlisting>
<para>
Care was taken during the creation of the algorithms and they
were meticulously verified against the suggested algorithms
in the official ISBN, ISMN, ISSN User Manuals.
</para>
</sect2>
<sect2>
<title>Author</title>
<para>
Germán Méndez Bravo (Kronuz), 2004 - 2006
</para>
</sect2>
</sect1>
<para>
This module was inspired by Garrett A. Wollman's
isbn_issn code.
</para>
</sect2>
</sect1>

File diff suppressed because it is too large Load Diff

View File

@ -1,70 +1,157 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/oid2name.sgml,v 1.3 2007/12/10 05:32:51 tgl Exp $ -->
<sect1 id="oid2name">
<title>oid2name</title>
<indexterm zone="oid2name">
<primary>oid2name</primary>
</indexterm>
<para>
This utility allows administrators to examine the file structure used by
PostgreSQL. To make use of it, you need to be familiar with the file
structure, which is described in <xref linkend="storage">.
<application>oid2name</> is a utility program that helps administrators to
examine the file structure used by PostgreSQL. To make use of it, you need
to be familiar with the database file structure, which is described in
<xref linkend="storage">.
</para>
<note>
<para>
The name <quote>oid2name</> is historical, and is actually rather
misleading, since most of the time when you use it, you will really
be concerned with tables' filenode numbers (which are the file names
visible in the database directories). Be sure you understand the
difference between table OIDs and table filenodes!
</para>
</note>
<sect2>
<title>Overview</title>
<para>
<literal>oid2name</literal> connects to the database and extracts OID,
filenode, and table name information. You can also have it show database
OIDs and tablespace OIDs.
<application>oid2name</application> connects to a target database and
extracts OID, filenode, and/or table name information. You can also have
it show database OIDs or tablespace OIDs. The program is controlled by
a large number of command-line switches, as shown in
<xref linkend="oid2name-switches">.
</para>
<para>
When displaying specific tables, you can select which tables to show by
using -o, -f and -t. The first switch takes an OID, the second takes
a filenode, and the third takes a tablename (actually, it's a LIKE
pattern, so you can use things like "foo%"). Note that you can use as many
of these switches as you like, and the listing will include all objects
matched by any of the switches. Also note that these switches can only
show objects in the database given in -d.
</para>
<para>
If you don't give any of -o, -f or -t it will dump all the tables in the
database given in -d. If you don't give -d, it will show a database
listing. Alternatively you can give -s to get a tablespace listing.
</para>
<table>
<title>Additional switches</title>
<table id="oid2name-switches">
<title><application>oid2name</> switches</title>
<tgroup cols="2">
<thead>
<row>
<entry>Switch</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry><literal>-i</literal></entry>
<entry>include indexes and sequences in the database listing.</entry>
<entry><literal>-o</literal> <replaceable>oid</></entry>
<entry>show info for table with OID <replaceable>oid</></entry>
</row>
<row>
<entry><literal>-x</literal></entry>
<entry>display more information about each object shown: tablespace name,
schema name, OID.
</entry>
<entry><literal>-f</literal> <replaceable>filenode</></entry>
<entry>show info for table with filenode <replaceable>filenode</></entry>
</row>
<row>
<entry><literal>-t</literal> <replaceable>tablename_pattern</></entry>
<entry>show info for table(s) matching <replaceable>tablename_pattern</></entry>
</row>
<row>
<entry><literal>-s</literal></entry>
<entry>show tablespace OIDs</entry>
</row>
<row>
<entry><literal>-S</literal></entry>
<entry>also show system objects (those in information_schema, pg_toast
and pg_catalog schemas)
<entry>include system objects (those in
<literal>information_schema</literal>, <literal>pg_toast</literal>
and <literal>pg_catalog</literal> schemas)
</entry>
</row>
<row>
<entry><literal>-i</literal></entry>
<entry>include indexes and sequences in the listing</entry>
</row>
<row>
<entry><literal>-x</literal></entry>
<entry>display more information about each object shown: tablespace name,
schema name, and OID
</entry>
</row>
<row>
<entry><literal>-q</literal></entry>
<entry>don't display headers(useful for scripting)</entry>
<entry>omit headers (useful for scripting)</entry>
</row>
<row>
<entry><literal>-d</literal> <replaceable>database</></entry>
<entry>database to connect to</entry>
</row>
<row>
<entry><literal>-H</literal> <replaceable>host</></entry>
<entry>database server's host</entry>
</row>
<row>
<entry><literal>-p</literal> <replaceable>port</></entry>
<entry>database server's port</entry>
</row>
<row>
<entry><literal>-U</literal> <replaceable>username</></entry>
<entry>username to connect as</entry>
</row>
<row>
<entry><literal>-P</literal> <replaceable>password</></entry>
<entry>password (deprecated &mdash; putting this on the command line
is a security hazard)</entry>
</row>
</tbody>
</tgroup>
</table>
<para>
To display specific tables, select which tables to show by
using <literal>-o</>, <literal>-f</> and/or <literal>-t</>.
<literal>-o</> takes an OID,
<literal>-f</> takes a filenode,
and <literal>-t</> takes a tablename (actually, it's a LIKE
pattern, so you can use things like <literal>foo%</>).
You can use as many
of these switches as you like, and the listing will include all objects
matched by any of the switches. But note that these switches can only
show objects in the database given by <literal>-d</>.
</para>
<para>
If you don't give any of <literal>-o</>, <literal>-f</> or <literal>-t</>,
but do give <literal>-d</>, it will list all tables in the database
named by <literal>-d</>. In this mode, the <literal>-S</> and
<literal>-i</> switches control what gets listed.
</para>
<para>
If you don't give <literal>-d</> either, it will show a listing of database
OIDs. Alternatively you can give <literal>-s</> to get a tablespace
listing.
</para>
</sect2>
<sect2>
<title>Examples</title>
<programlisting>
$ # what's in this database server, anyway?
$ oid2name
All databases:
Oid Database Name Tablespace
@ -83,7 +170,8 @@ All tablespaces:
155151 fastdisk
155152 bigdisk
$ cd $PGDATA/17228
$ # OK, let's look into database alvherre
$ cd $PGDATA/base/17228
$ # get top 10 db objects in the default tablespace, ordered by size
$ ls -lS * | head -10
@ -98,6 +186,7 @@ $ ls -lS * | head -10
-rw------- 1 alvherre alvherre 163840 sep 14 09:50 16699
-rw------- 1 alvherre alvherre 122880 sep 6 17:51 16751
$ # I wonder what file 155173 is ...
$ oid2name -d alvherre -f 155173
From database "alvherre":
Filenode Table Name
@ -112,7 +201,7 @@ From database "alvherre":
155173 accounts
1155291 accounts_pkey
$ # you can also mix the options, and have more details
$ # you can mix the options, and get more details with -x
$ oid2name -d alvherre -t accounts -f 1155291 -x
From database "alvherre":
Filenode Table Name Oid Schema Tablespace
@ -157,7 +246,7 @@ $ ls -d 155151/*
155151/17228/ 155151/PG_VERSION
$ # Oh, what was database 17228 again?
$ oid2name
$ oid2name
All databases:
Oid Database Name Tablespace
----------------------------------
@ -178,28 +267,25 @@ From database "alvherre":
Filenode Table Name
----------------------
155156 foo
$ # end of sample session.
</programlisting>
</sect2>
<sect2>
<title>Limitations</title>
<para>
You can also get approximate size data for each object using psql. For
example,
</para>
<programlisting>
SELECT relpages, relfilenode, relname FROM pg_class ORDER BY relpages DESC;
</programlisting>
<para>
Each page is typically 8k. Relpages is updated by VACUUM.
<application>oid2name</> requires a running database server with
non-corrupt system catalogs. It is therefore of only limited use
for recovering from catastrophic database corruption situations.
</para>
</sect2>
<sect2>
<title>Author</title>
<para>
b. palmer, <email>bpalmer@crimelabs.net</email>
B. Palmer <email>bpalmer@crimelabs.net</email>
</para>
</sect2>
</sect1>

View File

@ -1,124 +1,170 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/pageinspect.sgml,v 1.3 2007/12/10 05:32:51 tgl Exp $ -->
<sect1 id="pageinspect">
<title>pageinspect</title>
<indexterm zone="pageinspect">
<primary>pageinspect</primary>
</indexterm>
<para>
The functions in this module allow you to inspect the contents of data pages
at a low level, for debugging purposes.
The <filename>pageinspect</> module provides functions that allow you to
inspect the contents of database pages at a low level, which is useful for
debugging purposes. All of these functions may be used only by superusers.
</para>
<sect2>
<title>Functions included</title>
<title>Functions</title>
<itemizedlist>
<listitem>
<para>
<literal>get_raw_page</literal> reads one block of the named table and returns a copy as a
bytea field. This allows a single time-consistent copy of the block to be
made. Use of this functions is restricted to superusers.
</para>
</listitem>
<variablelist>
<varlistentry>
<term>
<function>get_raw_page(text, int) returns bytea</function>
</term>
<listitem>
<para>
<literal>page_header</literal> shows fields which are common to all PostgreSQL heap and index
pages. Use of this function is restricted to superusers.
</para>
<para>
A page image obtained with <literal>get_raw_page</literal> should be passed as argument:
</para>
<programlisting>
regression=# SELECT * FROM page_header(get_raw_page('pg_class',0));
lsn | tli | flags | lower | upper | special | pagesize | version | prune_xid
-----------+-----+-------+-------+-------+---------+----------+---------+-----------
0/24A1B50 | 1 | 1 | 232 | 368 | 8192 | 8192 | 4 | 0
</programlisting>
<para>
The returned columns correspond to the fields in the PageHeaderData struct.
See src/include/storage/bufpage.h for more details.
</para>
</listitem>
<listitem>
<para>
<function>get_raw_page</function> reads the specified block of the named
table and returns a copy as a <type>bytea</> value. This allows a
single time-consistent copy of the block to be obtained.
</para>
</listitem>
</varlistentry>
<listitem>
<para>
<literal>heap_page_items</literal> shows all line pointers on a heap page. For those line
pointers that are in use, tuple headers are also shown. All tuples are
shown, whether or not the tuples were visible to an MVCC snapshot at the
time the raw page was copied. Use of this function is restricted to
superusers.
</para>
<para>
A heap page image obtained with <literal>get_raw_page</literal> should be passed as argument:
</para>
<programlisting>
test=# SELECT * FROM heap_page_items(get_raw_page('pg_class',0));
</programlisting>
<para>
See src/include/storage/itemid.h and src/include/access/htup.h for
explanations of the fields returned.
</para>
</listitem>
<varlistentry>
<term>
<function>page_header(bytea) returns record</function>
</term>
<listitem>
<para>
<literal>bt_metap()</literal> returns information about the btree index metapage:
</para>
<programlisting>
test=> SELECT * FROM bt_metap('pg_cast_oid_index');
-[ RECORD 1 ]-----
magic | 340322
version | 2
root | 1
level | 0
fastroot | 1
fastlevel | 0
</programlisting>
</listitem>
<listitem>
<para>
<function>page_header</function> shows fields that are common to all
<productname>PostgreSQL</> heap and index pages.
</para>
<listitem>
<para>
<literal>bt_page_stats()</literal> shows information about single btree pages:
</para>
<programlisting>
test=> SELECT * FROM bt_page_stats('pg_cast_oid_index', 1);
-[ RECORD 1 ]-+-----
blkno | 1
type | l
live_items | 256
dead_items | 0
avg_item_size | 12
page_size | 8192
free_size | 4056
btpo_prev | 0
btpo_next | 0
btpo | 0
btpo_flags | 3
</programlisting>
</listitem>
<para>
A page image obtained with <function>get_raw_page</function> should be
passed as argument. For example:
</para>
<programlisting>
test=# SELECT * FROM page_header(get_raw_page('pg_class', 0));
lsn | tli | flags | lower | upper | special | pagesize | version | prune_xid
-----------+-----+-------+-------+-------+---------+----------+---------+-----------
0/24A1B50 | 1 | 1 | 232 | 368 | 8192 | 8192 | 4 | 0
</programlisting>
<listitem>
<para>
<literal>bt_page_items()</literal> returns information about specific items on btree pages:
<para>
The returned columns correspond to the fields in the
<structname>PageHeaderData</> struct.
See <filename>src/include/storage/bufpage.h</> for details.
</para>
<programlisting>
test=> SELECT * FROM bt_page_items('pg_cast_oid_index', 1);
itemoffset | ctid | itemlen | nulls | vars | data
------------+---------+---------+-------+------+-------------
1 | (0,1) | 12 | f | f | 23 27 00 00
2 | (0,2) | 12 | f | f | 24 27 00 00
3 | (0,3) | 12 | f | f | 25 27 00 00
4 | (0,4) | 12 | f | f | 26 27 00 00
5 | (0,5) | 12 | f | f | 27 27 00 00
6 | (0,6) | 12 | f | f | 28 27 00 00
7 | (0,7) | 12 | f | f | 29 27 00 00
8 | (0,8) | 12 | f | f | 2a 27 00 00
</programlisting>
</listitem>
</itemizedlist>
</listitem>
</varlistentry>
<varlistentry>
<term>
<function>heap_page_items(bytea) returns setof record</function>
</term>
<listitem>
<para>
<function>heap_page_items</function> shows all line pointers on a heap
page. For those line pointers that are in use, tuple headers are also
shown. All tuples are shown, whether or not the tuples were visible to
an MVCC snapshot at the time the raw page was copied.
</para>
<para>
A heap page image obtained with <function>get_raw_page</function> should
be passed as argument. For example:
</para>
<programlisting>
test=# SELECT * FROM heap_page_items(get_raw_page('pg_class', 0));
</programlisting>
<para>
See <filename>src/include/storage/itemid.h</> and
<filename>src/include/access/htup.h</> for explanations of the fields
returned.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>
<function>bt_metap(text) returns record</function>
</term>
<listitem>
<para>
<function>bt_metap</function> returns information about a btree
index's metapage. For example:
</para>
<programlisting>
test=# SELECT * FROM bt_metap('pg_cast_oid_index');
-[ RECORD 1 ]-----
magic | 340322
version | 2
root | 1
level | 0
fastroot | 1
fastlevel | 0
</programlisting>
</listitem>
</varlistentry>
<varlistentry>
<term>
<function>bt_page_stats(text, int) returns record</function>
</term>
<listitem>
<para>
<function>bt_page_stats</function> returns summary information about
single pages of btree indexes. For example:
</para>
<programlisting>
test=# SELECT * FROM bt_page_stats('pg_cast_oid_index', 1);
-[ RECORD 1 ]-+-----
blkno | 1
type | l
live_items | 256
dead_items | 0
avg_item_size | 12
page_size | 8192
free_size | 4056
btpo_prev | 0
btpo_next | 0
btpo | 0
btpo_flags | 3
</programlisting>
</listitem>
</varlistentry>
<varlistentry>
<term>
<function>bt_page_items(text, int) returns setof record</function>
</term>
<listitem>
<para>
<function>bt_page_items</function> returns detailed information about
all of the items on a btree index page. For example:
</para>
<programlisting>
test=# SELECT * FROM bt_page_items('pg_cast_oid_index', 1);
itemoffset | ctid | itemlen | nulls | vars | data
------------+---------+---------+-------+------+-------------
1 | (0,1) | 12 | f | f | 23 27 00 00
2 | (0,2) | 12 | f | f | 24 27 00 00
3 | (0,3) | 12 | f | f | 25 27 00 00
4 | (0,4) | 12 | f | f | 26 27 00 00
5 | (0,5) | 12 | f | f | 27 27 00 00
6 | (0,6) | 12 | f | f | 28 27 00 00
7 | (0,7) | 12 | f | f | 29 27 00 00
8 | (0,8) | 12 | f | f | 2a 27 00 00
</programlisting>
</listitem>
</varlistentry>
</variablelist>
</sect2>
</sect1>
</sect1>

View File

@ -1,436 +1,544 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/pgbench.sgml,v 1.4 2007/12/10 05:32:51 tgl Exp $ -->
<sect1 id="pgbench">
<title>pgbench</title>
<indexterm zone="pgbench">
<primary>pgbench</primary>
</indexterm>
<para>
<literal>pgbench</literal> is a simple program to run a benchmark test.
<literal>pgbench</literal> is a client application of PostgreSQL and runs
with PostgreSQL only. It performs lots of small and simple transactions
including SELECT/UPDATE/INSERT operations then calculates number of
transactions successfully completed within a second (transactions
per second, tps). Targeting data includes a table with at least 100k
tuples.
</para>
<para>
Example outputs from pgbench look like:
</para>
<programlisting>
number of clients: 4
number of transactions per client: 100
number of processed transactions: 400/400
tps = 19.875015(including connections establishing)
tps = 20.098827(excluding connections establishing)
</programlisting>
<para> Similar program called "JDBCBench" already exists, but it requires
Java that may not be available on every platform. Moreover some
people concerned about the overhead of Java that might lead
inaccurate results. So I decided to write in pure C, and named
it "pgbench."
<application>pgbench</application> is a simple program for running benchmark
tests on <productname>PostgreSQL</>. It runs the same sequence of SQL
commands over and over, possibly in multiple concurrent database sessions,
and then calculates the average transaction rate (transactions per second).
By default, <application>pgbench</application> tests a scenario that is
loosely based on TPC-B, involving five <command>SELECT</>,
<command>UPDATE</>, and <command>INSERT</> commands per transaction.
However, it is easy to test other cases by writing your own transaction
script files.
</para>
<para>
Features of pgbench:
Typical output from pgbench looks like:
<programlisting>
transaction type: TPC-B (sort of)
scaling factor: 10
number of clients: 10
number of transactions per client: 1000
number of transactions actually processed: 10000/10000
tps = 85.184871 (including connections establishing)
tps = 85.296346 (excluding connections establishing)
</programlisting>
The first four lines just report some of the most important parameter
settings. The next line reports the number of transactions completed
and intended (the latter being just the product of number of clients
and number of transactions); these will be equal unless the run
failed before completion. The last two lines report the TPS rate,
figured with and without counting the time to start database sessions.
</para>
<itemizedlist>
<listitem>
<para>
pgbench is written in C using libpq only. So it is very portable
and easy to install.
</para>
</listitem>
<listitem>
<para>
pgbench can simulate concurrent connections using asynchronous
capability of libpq. No threading is required.
</para>
</listitem>
</itemizedlist>
<sect2>
<title>Overview</title>
<orderedlist>
<listitem>
<para>(optional)Initialize database by:</para>
<programlisting>
pgbench -i &lt;dbname&gt;
</programlisting>
<para>
where &lt;dbname&gt; is the name of database. pgbench uses four tables
accounts, branches, history and tellers. These tables will be
destroyed. Be very careful if you have tables having same
names. Default test data contains:
</para>
<programlisting>
table # of tuples
<para>
The default TPC-B-like transaction test requires specific tables to be
set up beforehand. <application>pgbench</> should be invoked with
the <literal>-i</> (initialize) option to create and populate these
tables. (When you are testing a custom script, you don't need this
step, but will instead need to do whatever setup your test needs.)
Initialization looks like:
<programlisting>
pgbench -i <optional> <replaceable>other-options</> </optional> <replaceable>dbname</>
</programlisting>
where <replaceable>dbname</> is the name of the already-created
database to test in. (You may also need <literal>-h</>,
<literal>-p</>, and/or <literal>-U</> options to specify how to
connect to the database server.)
</para>
<caution>
<para>
<literal>pgbench -i</> creates four tables <structname>accounts</>,
<structname>branches</>, <structname>history</>, and
<structname>tellers</>, destroying any existing tables of these names.
Be very careful to use another database if you have tables having these
names!
</para>
</caution>
<para>
At the default <quote>scale factor</> of 1, the tables initially
contain this many rows:
</para>
<programlisting>
table # of rows
-------------------------
branches 1
tellers 10
accounts 100000
history 0
</programlisting>
<para>
You can increase the number of tuples by using -s option. branches,
tellers and accounts tables are created with a fillfactor which is
set using -F option. See below.
</para>
</listitem>
<listitem>
<para>Run the benchmark test</para>
<programlisting>
pgbench &lt;dbname&gt;
</programlisting>
<para>
The default configuration is:
</para>
<programlisting>
number of clients: 1
number of transactions per client: 10
</programlisting>
</listitem>
</orderedlist>
</programlisting>
<para>
You can (and, for most purposes, probably should) increase the number
of rows by using the <literal>-s</> (scale factor) option. The
<literal>-F</> (fillfactor) option might also be used at this point.
</para>
<table>
<title><literal>pgbench</literal> options</title>
<para>
Once you have done the necessary setup, you can run your benchmark
with a command that doesn't include <literal>-i</>, that is
<programlisting>
pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
</programlisting>
In nearly all cases, you'll need some options to make a useful test.
The most important options are <literal>-c</> (number of clients),
<literal>-t</> (number of transactions), and <literal>-f</> (specify
a custom script file). See below for a full list.
</para>
<para>
<xref linkend="pgbench-init-options"> shows options that are used
during database initialization, while
<xref linkend="pgbench-run-options"> shows options that are used
while running benchmarks, and
<xref linkend="pgbench-common-options"> shows options that are useful
in both cases.
</para>
<table id="pgbench-init-options">
<title><application>pgbench</application> initialization options</title>
<tgroup cols="2">
<thead>
<row>
<entry>Parameter</entry>
<entry>Option</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry><literal>-h hostname</literal></entry>
<entry><literal>-i</literal></entry>
<entry>
<para>
hostname where the backend is running. If this option
is omitted, pgbench will connect to the localhost via
Unix domain socket.
</para>
Required to invoke initialization mode.
</entry>
</row>
<row>
<entry><literal>-p port</literal></entry>
<entry><literal>-s</literal> <replaceable>scale_factor</></entry>
<entry>
<para>
the port number that the backend is accepting. default is
libpq's default, usually 5432.
</para>
Multiply the number of rows generated by the scale factor.
For example, <literal>-s 100</> will imply 10,000,000 rows
in the <structname>accounts</> table. Default is 1.
</entry>
</row>
<row>
<entry><literal>-c number_of_clients</literal></entry>
<entry><literal>-F</literal> <replaceable>fillfactor</></entry>
<entry>
<para>
Number of clients simulated. default is 1.
</para>
Create the <structname>accounts</>, <structname>tellers</> and
<structname>branches</> tables with the given fillfactor.
Default is 100.
</entry>
</row>
</tbody>
</tgroup>
</table>
<table id="pgbench-run-options">
<title><application>pgbench</application> benchmarking options</title>
<tgroup cols="2">
<thead>
<row>
<entry>Option</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry><literal>-c</literal> <replaceable>clients</></entry>
<entry>
Number of clients simulated, that is, number of concurrent database
sessions. Default is 1.
</entry>
</row>
<row>
<entry><literal>-t number_of_transactions</literal></entry>
<entry><literal>-t</literal> <replaceable>transactions</></entry>
<entry>
<para>
Number of transactions each client runs. default is 10.
</para>
</entry>
</row>
<row>
<entry><literal>-s scaling_factor</literal></entry>
<entry>
<para>
this should be used with -i (initialize) option.
number of tuples generated will be multiple of the
scaling factor. For example, -s 100 will imply 10M
(10,000,000) tuples in the accounts table.
default is 1.
</para>
<para>
NOTE: scaling factor should be at least
as large as the largest number of clients you intend
to test; else you'll mostly be measuring update contention.
Regular (not initializing) runs using one of the
built-in tests will detect scale based on the number of
branches in the database. For custom (-f) runs it can
be manually specified with this parameter.
</para>
</entry>
</row>
<row>
<entry><literal>-D varname=value</literal></entry>
<entry>
<para>
Define a variable. It can be refered to by a script
provided by using -f option. Multiple -D options are allowed.
</para>
</entry>
</row>
<row>
<entry><literal>-U login</literal></entry>
<entry>
<para>
Specify db user's login name if it is different from
the Unix login name.
</para>
</entry>
</row>
<row>
<entry><literal>-P password</literal></entry>
<entry>
<para>
Specify the db password. CAUTION: using this option
might be a security hole since ps command will
show the password. Use this for TESTING PURPOSE ONLY.
</para>
</entry>
</row>
<row>
<entry><literal>-n</literal></entry>
<entry>
<para>
No vacuuming and cleaning the history table prior to the
test is performed.
</para>
</entry>
</row>
<row>
<entry><literal>-v</literal></entry>
<entry>
<para>
Do vacuuming before testing. This will take some time.
With neither -n nor -v, pgbench will vacuum tellers and
branches tables only.
</para>
</entry>
</row>
<row>
<entry><literal>-S</literal></entry>
<entry>
<para>
Perform select only transactions instead of TPC-B.
</para>
Number of transactions each client runs. Default is 10.
</entry>
</row>
<row>
<entry><literal>-N</literal></entry>
<entry>
<para>
Do not update "branches" and "tellers". This will
avoid heavy update contention on branches and tellers,
while it will not make pgbench supporting TPC-B like
transactions.
</para>
Do not update <structname>tellers</> and <structname>branches</>.
This will avoid update contention on these tables, but
it makes the test case even less like TPC-B.
</entry>
</row>
<row>
<entry><literal>-f filename</literal></entry>
<entry><literal>-S</literal></entry>
<entry>
<para>
Read transaction script from file. Detailed
explanation will appear later.
</para>
Perform select-only transactions instead of TPC-B-like test.
</entry>
</row>
<row>
<entry><literal>-f</literal> <replaceable>filename</></entry>
<entry>
Read transaction script from <replaceable>filename</>.
See below for details.
<literal>-N</literal>, <literal>-S</literal>, and <literal>-f</literal>
are mutually exclusive.
</entry>
</row>
<row>
<entry><literal>-n</literal></entry>
<entry>
No vacuuming is performed before running the test.
This option is <emphasis>necessary</>
if you are running a custom test scenario that does not include
the standard tables <structname>accounts</>,
<structname>branches</>, <structname>history</>, and
<structname>tellers</>.
</entry>
</row>
<row>
<entry><literal>-v</literal></entry>
<entry>
Vacuum all four standard tables before running the test.
With neither <literal>-n</> nor <literal>-v</>, pgbench will vacuum
<structname>tellers</> and <structname>branches</> tables, and
will remove all entries in <structname>history</>.
</entry>
</row>
<row>
<entry><literal>-D</literal> <replaceable>varname</><literal>=</><replaceable>value</></entry>
<entry>
Define a variable for use by a custom script (see below).
Multiple <literal>-D</> options are allowed.
</entry>
</row>
<row>
<entry><literal>-C</literal></entry>
<entry>
<para>
Establish connection for each transaction, rather than
doing it just once at beginning of pgbench in the normal
mode. This is useful to measure the connection overhead.
</para>
Establish a new connection for each transaction, rather than
doing it just once per client thread.
This is useful to measure the connection overhead.
</entry>
</row>
<row>
<entry><literal>-l</literal></entry>
<entry>
<para>
Write the time taken by each transaction to a logfile,
with the name "pgbench_log.xxx", where xxx is the PID
of the pgbench process. The format of the log is:
</para>
<programlisting>
client_id transaction_no time file_no time-epoch time-us
</programlisting>
<para>
where time is measured in microseconds, , the file_no is
which test file was used (useful when multiple were
specified with -f), and time-epoch/time-us are a
UNIX epoch format timestamp followed by an offset
in microseconds (suitable for creating a ISO 8601
timestamp with a fraction of a second) of when
the transaction completed.
</para>
<para>
Here are example outputs:
</para>
<programlisting>
0 199 2241 0 1175850568 995598
0 200 2465 0 1175850568 998079
0 201 2513 0 1175850569 608
0 202 2038 0 1175850569 2663
</programlisting>
Write the time taken by each transaction to a logfile.
See below for details.
</entry>
</row>
<row>
<entry><literal>-F fillfactor</literal></entry>
<entry><literal>-s</literal> <replaceable>scale_factor</></entry>
<entry>
<para>
Create tables(accounts, tellers and branches) with the given
fillfactor. Default is 100. This should be used with -i
(initialize) option.
</para>
Report the specified scale factor in <application>pgbench</>'s
output. With the built-in tests, this is not necessary; the
correct scale factor will be detected by counting the number of
rows in the <structname>branches</> table. However, when testing
custom benchmarks (<literal>-f</> option), the scale factor
will be reported as 1 unless this option is used.
</entry>
</row>
<row>
<entry><literal>-d</literal></entry>
<entry>
<para>
debug option.
</para>
Print debugging output.
</entry>
</row>
</tbody>
</tgroup>
</table>
<table id="pgbench-common-options">
<title><application>pgbench</application> common options</title>
<tgroup cols="2">
<thead>
<row>
<entry>Option</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry><literal>-h</literal> <replaceable>hostname</></entry>
<entry>database server's host</entry>
</row>
<row>
<entry><literal>-p</literal> <replaceable>port</></entry>
<entry>database server's port</entry>
</row>
<row>
<entry><literal>-U</literal> <replaceable>login</></entry>
<entry>username to connect as</entry>
</row>
<row>
<entry><literal>-P</literal> <replaceable>password</></entry>
<entry>password (deprecated &mdash; putting this on the command line
is a security hazard)</entry>
</row>
</tbody>
</tgroup>
</table>
</sect2>
<sect2>
<title>What is the "transaction" actually performed in pgbench?</title>
<title>What is the <quote>transaction</> actually performed in pgbench?</title>
<para>
The default transaction script issues seven commands per transaction:
</para>
<orderedlist>
<listitem><para><literal>begin;</literal></para></listitem>
<listitem><para><literal>update accounts set abalance = abalance + :delta where aid = :aid;</literal></para></listitem>
<listitem><para><literal>select abalance from accounts where aid = :aid;</literal></para></listitem>
<listitem><para><literal>update tellers set tbalance = tbalance + :delta where tid = :tid;</literal></para></listitem>
<listitem><para><literal>update branches set bbalance = bbalance + :delta where bid = :bid;</literal></para></listitem>
<listitem><para><literal>insert into history(tid,bid,aid,delta) values(:tid,:bid,:aid,:delta);</literal></para></listitem>
<listitem><para><literal>end;</literal></para></listitem>
<listitem><para><literal>BEGIN;</literal></para></listitem>
<listitem><para><literal>UPDATE accounts SET abalance = abalance + :delta WHERE aid = :aid;</literal></para></listitem>
<listitem><para><literal>SELECT abalance FROM accounts WHERE aid = :aid;</literal></para></listitem>
<listitem><para><literal>UPDATE tellers SET tbalance = tbalance + :delta WHERE tid = :tid;</literal></para></listitem>
<listitem><para><literal>UPDATE branches SET bbalance = bbalance + :delta WHERE bid = :bid;</literal></para></listitem>
<listitem><para><literal>INSERT INTO history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);</literal></para></listitem>
<listitem><para><literal>END;</literal></para></listitem>
</orderedlist>
<para>
If you specify -N, (4) and (5) aren't included in the transaction.
If you specify <literal>-N</>, steps 4 and 5 aren't included in the
transaction. If you specify <literal>-S</>, only the <command>SELECT</> is
issued.
</para>
</sect2>
<sect2>
<title>Script file</title>
<title>Custom Scripts</title>
<para>
<literal>pgbench</literal> has support for reading a transaction script
from a specified file (<literal>-f</literal> option). This file should
include SQL commands in each line. SQL command consists of multiple lines
are not supported. Empty lines and lines begging with "--" will be ignored.
</para>
<para>
Multiple <literal>-f</literal> options are allowed. In this case each
transaction is assigned randomly chosen script.
</para>
<para>
SQL commands can include "meta command" which begins with "\" (back
slash). A meta command takes some arguments separted by white
spaces. Currently following meta command is supported:
<application>pgbench</application> has support for running custom
benchmark scenarios by replacing the default transaction script
(described above) with a transaction script read from a file
(<literal>-f</literal> option). In this case a <quote>transaction</>
counts as one execution of a script file. You can even specify
multiple scripts (multiple <literal>-f</literal> options), in which
case a random one of the scripts is chosen each time a client session
starts a new transaction.
</para>
<itemizedlist>
<listitem>
<para>
<literal>\set name operand1 [ operator operand2 ]</literal>
- Sets the calculated value using "operand1" "operator"
"operand2" to variable "name". If "operator" and "operand2"
are omitted, the value of operand1 is set to variable "name".
</para>
<para>
Example:
</para>
<programlisting>
<para>
The format of a script file is one SQL command per line; multi-line
SQL commands are not supported. Empty lines and lines beginning with
<literal>--</> are ignored. Script file lines can also be
<quote>meta commands</>, which are interpreted by <application>pgbench</>
itself, as described below.
</para>
<para>
There is a simple variable-substitution facility for script files.
Variables can be set by the command-line <literal>-D</> option,
explained above, or by the meta commands explained below.
In addition to any variables preset by <literal>-D</> command-line options,
the variable <literal>scale</> is preset to the current scale factor.
Once set, a variable's
value can be inserted into a SQL command by writing
<literal>:</><replaceable>variablename</>. When running more than
one client session, each session has its own set of variables.
</para>
<para>
Script file meta commands begin with a backslash (<literal>\</>).
Arguments to a meta command are separated by white space.
These meta commands are supported:
</para>
<variablelist>
<varlistentry>
<term>
<literal>\set <replaceable>varname</> <replaceable>operand1</> [ <replaceable>operator</> <replaceable>operand2</> ]</literal>
</term>
<listitem>
<para>
Sets variable <replaceable>varname</> to a calculated integer value.
Each <replaceable>operand</> is either an integer constant or a
<literal>:</><replaceable>variablename</> reference to a variable
having an integer value. The <replaceable>operator</> can be
<literal>+</>, <literal>-</>, <literal>*</>, or <literal>/</>.
</para>
<para>
Example:
<programlisting>
\set ntellers 10 * :scale
</programlisting>
</listitem>
<listitem>
<para>
<literal>\setrandom name min max</literal>
- Assigns random integer to name between min and max
</para>
<para>
Example:
</para>
<programlisting>
\setrandom aid 1 100000
</programlisting>
</listitem>
<listitem>
<para>
Variables can be referred to in SQL comands by adding ":" in front
of the varible name.
</para>
<para>
Example:
</para>
<programlisting>
SELECT abalance FROM accounts WHERE aid = :aid
</programlisting>
<para>
Variables can also be defined by using -D option.
</para>
</listitem>
<listitem>
<para>
<literal>\sleep num [us|ms|s]</> - Causes script execution to sleep for the
specified duration of microseconds (us), milliseconds (ms) or the default
seconds (s).
</para>
<para>
Example:
</para>
<programlisting>
\setrandom millisec 1000 2500
\sleep : millisec ms
</programlisting>
</listitem>
</itemizedlist>
</sect2>
</programlisting>
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>
<literal>\setrandom <replaceable>varname</> <replaceable>min</> <replaceable>max</></literal>
</term>
<listitem>
<para>
Sets variable <replaceable>varname</> to a random integer value
between the limits <replaceable>min</> and <replaceable>max</>.
Each limit can be either an integer constant or a
<literal>:</><replaceable>variablename</> reference to a variable
having an integer value.
</para>
<para>
Example:
<programlisting>
\setrandom aid 1 :naccounts
</programlisting>
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>
<literal>\sleep <replaceable>number</> [ us | ms | s ]</literal>
</term>
<listitem>
<para>
Causes script execution to sleep for the specified duration in
microseconds (<literal>us</>), milliseconds (<literal>ms</>) or seconds
(<literal>s</>). If the unit is omitted then seconds are the default.
<replaceable>number</> can be either an integer constant or a
<literal>:</><replaceable>variablename</> reference to a variable
having an integer value.
</para>
<para>
Example:
<programlisting>
\sleep 10 ms
</programlisting>
</para>
</listitem>
</varlistentry>
</variablelist>
<sect2>
<title>Examples</title>
<para>
Example, TPC-B like benchmark can be defined as follows(scaling
factor = 1):
</para>
<programlisting>
As an example, the full definition of the built-in TPC-B-like
transaction is:
<programlisting>
\set nbranches :scale
\set ntellers 10 * :scale
\set naccounts 100000 * :scale
\setrandom aid 1 :naccounts
\setrandom bid 1 :nbranches
\setrandom tid 1 :ntellers
\setrandom delta 1 10000
BEGIN
UPDATE accounts SET abalance = abalance + :delta WHERE aid = :aid
SELECT abalance FROM accounts WHERE aid = :aid
UPDATE tellers SET tbalance = tbalance + :delta WHERE tid = :tid
UPDATE branches SET bbalance = bbalance + :delta WHERE bid = :bid
INSERT INTO history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, 'now')
END
</programlisting>
<para>
If you want to automatically set the scaling factor from the number of
tuples in branches table, use -s option and shell command like this:
\setrandom delta -5000 5000
BEGIN;
UPDATE accounts SET abalance = abalance + :delta WHERE aid = :aid;
SELECT abalance FROM accounts WHERE aid = :aid;
UPDATE tellers SET tbalance = tbalance + :delta WHERE tid = :tid;
UPDATE branches SET bbalance = bbalance + :delta WHERE bid = :bid;
INSERT INTO history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
END;
</programlisting>
This script allows each iteration of the transaction to reference
different, randomly-chosen rows. (This example also shows why it's
important for each client session to have its own variables &mdash;
otherwise they'd not be independently touching different rows.)
</para>
<programlisting>
pgbench -s $(psql -At -c "SELECT count(*) FROM branches") -f tpc_b.sql
</programlisting>
</sect2>
<sect2>
<title>Per-transaction logging</title>
<para>
Notice that -f option does not execute vacuum and clearing history
table before starting benchmark.
With the <literal>-l</> option, <application>pgbench</> writes the time
taken by each transaction to a logfile. The logfile will be named
<filename>pgbench_log.<replaceable>nnn</></filename>, where
<replaceable>nnn</> is the PID of the pgbench process.
The format of the log is:
<programlisting>
<replaceable>client_id</> <replaceable>transaction_no</> <replaceable>time</> <replaceable>file_no</> <replaceable>time_epoch</> <replaceable>time_us</>
</programlisting>
where <replaceable>time</> is the elapsed transaction time in microseconds,
<replaceable>file_no</> identifies which script file was used
(useful when multiple scripts were specified with <literal>-f</>),
and <replaceable>time_epoch</>/<replaceable>time_us</> are a
UNIX epoch format timestamp and an offset
in microseconds (suitable for creating a ISO 8601
timestamp with fractional seconds) showing when
the transaction completed.
</para>
<para>
Here are example outputs:
<programlisting>
0 199 2241 0 1175850568 995598
0 200 2465 0 1175850568 998079
0 201 2513 0 1175850569 608
0 202 2038 0 1175850569 2663
</programlisting>
</para>
</sect2>
<sect2>
<title>Good Practices</title>
<para>
It is very easy to use <application>pgbench</> to produce completely
meaningless numbers. Here are some guidelines to help you get useful
results.
</para>
<para>
In the first place, <emphasis>never</> believe any test that runs
for only a few seconds. Increase the <literal>-t</> setting enough
to make the run last at least a few minutes, so as to average out noise.
In some cases you could need hours to get numbers that are reproducible.
It's a good idea to try the test run a few times, to find out if your
numbers are reproducible or not.
</para>
<para>
For the default TPC-B-like test scenario, the initialization scale factor
(<literal>-s</>) should be at least as large as the largest number of
clients you intend to test (<literal>-c</>); else you'll mostly be
measuring update contention. There are only <literal>-s</> rows in
the <structname>branches</> table, and every transaction wants to
update one of them, so <literal>-c</> values in excess of <literal>-s</>
will undoubtedly result in lots of transactions blocked waiting for
other transactions.
</para>
<para>
The default test scenario is also quite sensitive to how long it's been
since the tables were initialized: accumulation of dead rows and dead space
in the tables changes the results. To understand the results you must keep
track of the total number of updates and when vacuuming happens. If
autovacuum is enabled it can result in unpredictable changes in measured
performance.
</para>
<para>
A limitation of <application>pgbench</> is that it can itself become
the bottleneck when trying to test a large number of client sessions.
This can be alleviated by running <application>pgbench</> on a different
machine from the database server, although low network latency will be
essential. It might even be useful to run several <application>pgbench</>
instances concurrently, on several client machines, against the same
database server.
</para>
</sect2>
</sect1>

View File

@ -1,85 +1,138 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/pgbuffercache.sgml,v 2.2 2007/12/10 05:32:51 tgl Exp $ -->
<sect1 id="pgbuffercache">
<title>pg_buffercache</title>
<indexterm zone="pgbuffercache">
<primary>pg_buffercache</primary>
</indexterm>
<para>
The <literal>pg_buffercache</literal> module provides a means for examining
what's happening to the buffercache at any given time without having to
restart or rebuild the server with debugging code added. The intent is to
do for the buffercache what pg_locks does for locks.
The <filename>pg_buffercache</filename> module provides a means for
examining what's happening in the shared buffer cache in real time.
</para>
<para>
This module consists of a C function <literal>pg_buffercache_pages()</literal>
that returns a set of records, plus a view <literal>pg_buffercache</literal>
to wrapper the function.
The module provides a C function <function>pg_buffercache_pages</function>
that returns a set of records, plus a view
<structname>pg_buffercache</structname> that wraps the function for
convenient use.
</para>
<para>
By default public access is REVOKED from both of these, just in case there
By default public access is revoked from both of these, just in case there
are security issues lurking.
</para>
<sect2>
<title>Notes</title>
<title>The <structname>pg_buffercache</structname> view</title>
<para>
The definition of the columns exposed in the view is:
The definitions of the columns exposed by the view are:
</para>
<programlisting>
Column | references | Description
----------------+----------------------+------------------------------------
bufferid | | Id, 1..shared_buffers.
relfilenode | pg_class.relfilenode | Refilenode of the relation.
reltablespace | pg_tablespace.oid | Tablespace oid of the relation.
reldatabase | pg_database.oid | Database for the relation.
relblocknumber | | Offset of the page in the relation.
isdirty | | Is the page dirty?
usagecount | | Page LRU count
</programlisting>
<table>
<title><structname>pg_buffercache</> Columns</title>
<tgroup cols="4">
<thead>
<row>
<entry>Name</entry>
<entry>Type</entry>
<entry>References</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry><structfield>bufferid</structfield></entry>
<entry><type>integer</type></entry>
<entry></entry>
<entry>ID, in the range 1..<varname>shared_buffers</></entry>
</row>
<row>
<entry><structfield>relfilenode</structfield></entry>
<entry><type>oid</type></entry>
<entry><literal>pg_class.relfilenode</literal></entry>
<entry>Relfilenode of the relation</entry>
</row>
<row>
<entry><structfield>reltablespace</structfield></entry>
<entry><type>oid</type></entry>
<entry><literal>pg_tablespace.oid</literal></entry>
<entry>Tablespace OID of the relation</entry>
</row>
<row>
<entry><structfield>reldatabase</structfield></entry>
<entry><type>oid</type></entry>
<entry><literal>pg_database.oid</literal></entry>
<entry>Database OID of the relation</entry>
</row>
<row>
<entry><structfield>relblocknumber</structfield></entry>
<entry><type>bigint</type></entry>
<entry></entry>
<entry>Page number within the relation</entry>
</row>
<row>
<entry><structfield>isdirty</structfield></entry>
<entry><type>boolean</type></entry>
<entry></entry>
<entry>Is the page dirty?</entry>
</row>
<row>
<entry><structfield>usagecount</structfield></entry>
<entry><type>smallint</type></entry>
<entry></entry>
<entry>Page LRU count</entry>
</row>
</tbody>
</tgroup>
</table>
<para>
There is one row for each buffer in the shared cache. Unused buffers are
shown with all fields null except bufferid.
shown with all fields null except <structfield>bufferid</>. Shared system
catalogs are shown as belonging to database zero.
</para>
<para>
Because the cache is shared by all the databases, there are pages from
relations not belonging to the current database.
Because the cache is shared by all the databases, there will normally be
pages from relations not belonging to the current database. This means
that there may not be matching join rows in <structname>pg_class</> for
some rows, or that there could even be incorrect joins. If you are
trying to join against <structname>pg_class</>, it's a good idea to
restrict the join to rows having <structfield>reldatabase</> equal to
the current database's OID or zero.
</para>
<para>
When the pg_buffercache view is accessed, internal buffer manager locks are
taken, and a copy of the buffer cache data is made for the view to display.
This ensures that the view produces a consistent set of results, while not
blocking normal buffer activity longer than necessary. Nonetheless there
When the <structname>pg_buffercache</> view is accessed, internal buffer
manager locks are taken for long enough to copy all the buffer state
data that the view will display.
This ensures that the view produces a consistent set of results, while not
blocking normal buffer activity longer than necessary. Nonetheless there
could be some impact on database performance if this view is read often.
</para>
</sect2>
<sect2>
<title>Sample output</title>
<programlisting>
regression=# \d pg_buffercache;
View "public.pg_buffercache"
Column | Type | Modifiers
----------------+----------+-----------
bufferid | integer |
relfilenode | oid |
reltablespace | oid |
reldatabase | oid |
relblocknumber | bigint |
isdirty | boolean |
usagecount | smallint |
View definition:
SELECT p.bufferid, p.relfilenode, p.reltablespace, p.reldatabase,
p.relblocknumber, p.isdirty, p.usagecount
FROM pg_buffercache_pages() p(bufferid integer, relfilenode oid,
reltablespace oid, reldatabase oid, relblocknumber bigint,
isdirty boolean, usagecount smallint);
regression=# SELECT c.relname, count(*) AS buffers
FROM pg_class c INNER JOIN pg_buffercache b
ON b.relfilenode = c.relfilenode INNER JOIN pg_database d
ON (b.reldatabase = d.oid AND d.datname = current_database())
FROM pg_buffercache b INNER JOIN pg_class c
ON b.relfilenode = c.relfilenode AND
b.reldatabase IN (0, (SELECT oid FROM pg_database
WHERE datname = current_database()))
GROUP BY c.relname
ORDER BY 2 DESC LIMIT 10;
relname | buffers
@ -95,26 +148,23 @@
pg_depend | 22
pg_depend_reference_index | 20
(10 rows)
regression=#
</programlisting>
</sect2>
<sect2>
<title>Authors</title>
<itemizedlist>
<listitem>
<para>
Mark Kirkwood <email>markir@paradise.net.nz</email>
</para>
</listitem>
<listitem>
<para>Design suggestions: Neil Conway <email>neilc@samurai.com</email></para>
</listitem>
<listitem>
<para>Debugging advice: Tom Lane <email>tgl@sss.pgh.pa.us</email></para>
</listitem>
</itemizedlist>
<para>
Mark Kirkwood <email>markir@paradise.net.nz</email>
</para>
<para>
Design suggestions: Neil Conway <email>neilc@samurai.com</email>
</para>
<para>
Debugging advice: Tom Lane <email>tgl@sss.pgh.pa.us</email>
</para>
</sect2>
</sect1>

View File

@ -1,184 +1,203 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/pgfreespacemap.sgml,v 2.2 2007/12/10 05:32:51 tgl Exp $ -->
<sect1 id="pgfreespacemap">
<title>pg_freespacemap</title>
<indexterm zone="pgfreespacemap">
<primary>pg_freespacemap</primary>
</indexterm>
<para>
This module provides a means for examining the free space map (FSM). It
consists of two C functions: <literal>pg_freespacemap_relations()</literal>
and <literal>pg_freespacemap_pages()</literal> that return a set
of records, plus two views <literal>pg_freespacemap_relations</literal> and
<literal>pg_freespacemap_pages</literal> for more user-friendly access to
the functions.
The <filename>pg_freespacemap</> module provides a means for examining the
free space map (FSM). It provides two C functions:
<function>pg_freespacemap_relations</function> and
<function>pg_freespacemap_pages</function> that each return a set of
records, plus two views <structname>pg_freespacemap_relations</structname>
and <structname>pg_freespacemap_pages</structname> that wrap the functions
for convenient use.
</para>
<para>
The module provides the ability to examine the contents of the free space
map, without having to restart or rebuild the server with additional
debugging code.
</para>
<para>
By default public access is REVOKED from the functions and views, just in
case there are security issues present in the code.
By default public access is revoked from the functions and views, just in
case there are security issues lurking.
</para>
<sect2>
<title>Notes</title>
<title>The <filename>pg_freespacemap</> views</title>
<para>
The definitions for the columns exposed in the views are:
The definitions of the columns exposed by the views are:
</para>
<table>
<title>pg_freespacemap_relations</title>
<tgroup cols="3">
<title><structname>pg_freespacemap_relations</> Columns</title>
<tgroup cols="4">
<thead>
<row>
<entry>Column</entry>
<entry>references</entry>
<entry>Name</entry>
<entry>Type</entry>
<entry>References</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry>reltablespace</entry>
<entry>pg_tablespace.oid</entry>
<entry>Tablespace oid of the relation.</entry>
<entry><structfield>reltablespace</structfield></entry>
<entry><type>oid</type></entry>
<entry><literal>pg_tablespace.oid</literal></entry>
<entry>Tablespace OID of the relation</entry>
</row>
<row>
<entry>reldatabase</entry>
<entry>pg_database.oid</entry>
<entry>Database oid of the relation.</entry>
<entry><structfield>reldatabase</structfield></entry>
<entry><type>oid</type></entry>
<entry><literal>pg_database.oid</literal></entry>
<entry>Database OID of the relation</entry>
</row>
<row>
<entry>relfilenode</entry>
<entry>pg_class.relfilenode</entry>
<entry>Relfilenode of the relation.</entry>
<entry><structfield>relfilenode</structfield></entry>
<entry><type>oid</type></entry>
<entry><literal>pg_class.relfilenode</literal></entry>
<entry>Relfilenode of the relation</entry>
</row>
<row>
<entry>avgrequest</entry>
<entry><structfield>avgrequest</structfield></entry>
<entry><type>integer</type></entry>
<entry></entry>
<entry>Moving average of free space requests (NULL for indexes)</entry>
</row>
<row>
<entry>interestingpages</entry>
<entry><structfield>interestingpages</structfield></entry>
<entry><type>integer</type></entry>
<entry></entry>
<entry>Count of pages last reported as containing useful free space.</entry>
<entry>Count of pages last reported as containing useful free space</entry>
</row>
<row>
<entry>storedpages</entry>
<entry><structfield>storedpages</structfield></entry>
<entry><type>integer</type></entry>
<entry></entry>
<entry>Count of pages actually stored in free space map.</entry>
<entry>Count of pages actually stored in free space map</entry>
</row>
<row>
<entry>nextpage</entry>
<entry><structfield>nextpage</structfield></entry>
<entry><type>integer</type></entry>
<entry></entry>
<entry>Page index (from 0) to start next search at.</entry>
<entry>Page index (from 0) to start next search at</entry>
</row>
</tbody>
</tgroup>
</table>
<table>
<title>pg_freespacemap_pages</title>
<tgroup cols="3">
<title><structname>pg_freespacemap_pages</> Columns</title>
<tgroup cols="4">
<thead>
<row>
<entry>Column</entry>
<entry> references</entry>
<entry>Name</entry>
<entry>Type</entry>
<entry>References</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry>reltablespace</entry>
<entry>pg_tablespace.oid</entry>
<entry>Tablespace oid of the relation.</entry>
<entry><structfield>reltablespace</structfield></entry>
<entry><type>oid</type></entry>
<entry><literal>pg_tablespace.oid</literal></entry>
<entry>Tablespace OID of the relation</entry>
</row>
<row>
<entry>reldatabase</entry>
<entry>pg_database.oid</entry>
<entry>Database oid of the relation.</entry>
<entry><structfield>reldatabase</structfield></entry>
<entry><type>oid</type></entry>
<entry><literal>pg_database.oid</literal></entry>
<entry>Database OID of the relation</entry>
</row>
<row>
<entry>relfilenode</entry>
<entry>pg_class.relfilenode</entry>
<entry>Relfilenode of the relation.</entry>
<entry><structfield>relfilenode</structfield></entry>
<entry><type>oid</type></entry>
<entry><literal>pg_class.relfilenode</literal></entry>
<entry>Relfilenode of the relation</entry>
</row>
<row>
<entry>relblocknumber</entry>
<entry><structfield>relblocknumber</structfield></entry>
<entry><type>bigint</type></entry>
<entry></entry>
<entry>Page number in the relation.</entry>
<entry>Page number within the relation</entry>
</row>
<row>
<entry>bytes</entry>
<entry><structfield>bytes</structfield></entry>
<entry><type>integer</type></entry>
<entry></entry>
<entry>Free bytes in the page, or NULL for an index page (see below).</entry>
<entry>Free bytes in the page, or NULL for an index page (see below)</entry>
</row>
</tbody>
</tgroup>
</table>
<para>
For <literal>pg_freespacemap_relations</literal>, there is one row for each
relation in the free space map. <literal>storedpages</literal> is the
number of pages actually stored in the map, while
<literal>interestingpages</literal> is the number of pages the last VACUUM
thought had useful amounts of free space.
For <structname>pg_freespacemap_relations</structname>, there is one row
for each relation in the free space map.
<structfield>storedpages</structfield> is the number of pages actually
stored in the map, while <structfield>interestingpages</structfield> is the
number of pages the last <command>VACUUM</> thought had useful amounts of
free space.
</para>
<para>
If <literal>storedpages</literal> is consistently less than interestingpages
then it'd be a good idea to increase <literal>max_fsm_pages</literal>. Also,
if the number of rows in <literal>pg_freespacemap_relations</literal> is
close to <literal>max_fsm_relations</literal>, then you should consider
increasing <literal>max_fsm_relations</literal>.
If <structfield>storedpages</structfield> is consistently less than
<structfield>interestingpages</> then it'd be a good idea to increase
<varname>max_fsm_pages</varname>. Also, if the number of rows in
<structname>pg_freespacemap_relations</structname> is close to
<varname>max_fsm_relations</varname>, then you should consider increasing
<varname>max_fsm_relations</varname>.
</para>
<para>
For <literal>pg_freespacemap_pages</literal>, there is one row for each page
in the free space map. The number of rows for a relation will match the
<literal>storedpages</literal> column in
<literal>pg_freespacemap_relations</literal>.
For <structname>pg_freespacemap_pages</structname>, there is one row for
each page in the free space map. The number of rows for a relation will
match the <structfield>storedpages</structfield> column in
<structname>pg_freespacemap_relations</structname>.
</para>
<para>
For indexes, what is tracked is entirely-unused pages, rather than free
space within pages. Therefore, the average request size and free bytes
within a page are not meaningful, and are shown as NULL.
</para>
<para>
Because the map is shared by all the databases, it will include relations
not belonging to the current database.
Because the map is shared by all the databases, there will normally be
entries for relations not belonging to the current database. This means
that there may not be matching join rows in <structname>pg_class</> for
some rows, or that there could even be incorrect joins. If you are
trying to join against <structname>pg_class</>, it's a good idea to
restrict the join to rows having <structfield>reldatabase</> equal to
the current database's OID or zero.
</para>
<para>
When either of the views are accessed, internal free space map locks are
taken, and a copy of the map data is made for them to display.
This ensures that the views produce a consistent set of results, while not
blocking normal activity longer than necessary. Nonetheless there
When either of the views is accessed, internal free space map locks are
taken for long enough to copy all the state data that the view will display.
This ensures that the views produce a consistent set of results, while not
blocking normal activity longer than necessary. Nonetheless there
could be some impact on database performance if they are read often.
</para>
</sect2>
<sect2>
<title>Sample output - pg_freespacemap_relations</title>
<programlisting>
regression=# \d pg_freespacemap_relations
View "public.pg_freespacemap_relations"
Column | Type | Modifiers
------------------+---------+-----------
reltablespace | oid |
reldatabase | oid |
relfilenode | oid |
avgrequest | integer |
interestingpages | integer |
storedpages | integer |
nextpage | integer |
View definition:
SELECT p.reltablespace, p.reldatabase, p.relfilenode, p.avgrequest, p.interestingpages, p.storedpages, p.nextpage
FROM pg_freespacemap_relations() p(reltablespace oid, reldatabase oid, relfilenode oid, avgrequest integer, interestingpages integer, storedpages integer, nextpage integer);
<title>Sample output</title>
<programlisting>
regression=# SELECT c.relname, r.avgrequest, r.interestingpages, r.storedpages
FROM pg_freespacemap_relations r INNER JOIN pg_class c
ON c.relfilenode = r.relfilenode INNER JOIN pg_database d
ON r.reldatabase = d.oid AND (d.datname = current_database())
ON r.relfilenode = c.relfilenode AND
r.reldatabase IN (0, (SELECT oid FROM pg_database
WHERE datname = current_database()))
ORDER BY r.storedpages DESC LIMIT 10;
relname | avgrequest | interestingpages | storedpages
---------------------------------+------------+------------------+-------------
@ -193,31 +212,14 @@ regression=# SELECT c.relname, r.avgrequest, r.interestingpages, r.storedpages
pg_class_relname_nsp_index | | 10 | 10
pg_proc | 302 | 8 | 8
(10 rows)
</programlisting>
</sect2>
<sect2>
<title>Sample output - pg_freespacemap_pages</title>
<programlisting>
regression=# \d pg_freespacemap_pages
View "public.pg_freespacemap_pages"
Column | Type | Modifiers
----------------+---------+-----------
reltablespace | oid |
reldatabase | oid |
relfilenode | oid |
relblocknumber | bigint |
bytes | integer |
View definition:
SELECT p.reltablespace, p.reldatabase, p.relfilenode, p.relblocknumber, p.bytes
FROM pg_freespacemap_pages() p(reltablespace oid, reldatabase oid, relfilenode oid, relblocknumber bigint, bytes integer);
regression=# SELECT c.relname, p.relblocknumber, p.bytes
FROM pg_freespacemap_pages p INNER JOIN pg_class c
ON c.relfilenode = p.relfilenode INNER JOIN pg_database d
ON (p.reldatabase = d.oid AND d.datname = current_database())
ON p.relfilenode = c.relfilenode AND
p.reldatabase IN (0, (SELECT oid FROM pg_database
WHERE datname = current_database()))
ORDER BY c.relname LIMIT 10;
relname | relblocknumber | bytes
relname | relblocknumber | bytes
--------------+----------------+-------
a_star | 0 | 8040
abstime_tbl | 0 | 7908
@ -235,8 +237,10 @@ regression=# SELECT c.relname, p.relblocknumber, p.bytes
<sect2>
<title>Author</title>
<para>
Mark Kirkwood <email>markir@paradise.net.nz</email>
</para>
</sect2>
</sect1>

View File

@ -1,115 +1,122 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/pgrowlocks.sgml,v 1.4 2007/12/10 05:32:51 tgl Exp $ -->
<sect1 id="pgrowlocks">
<title>pgrowlocks</title>
<indexterm zone="pgrowlocks">
<primary>pgrowlocks</primary>
</indexterm>
<para>
The <literal>pgrowlocks</literal> module provides a function to show row
The <filename>pgrowlocks</filename> module provides a function to show row
locking information for a specified table.
</para>
<sect2>
<title>Overview</title>
<programlisting>
pgrowlocks(text) RETURNS pgrowlocks_type
</programlisting>
<synopsis>
pgrowlocks(text) returns setof record
</synopsis>
<para>
The parameter is a name of table. And <literal>pgrowlocks_type</literal> is
defined as:
The parameter is the name of a table. The result is a set of records,
with one row for each locked row within the table. The output columns
are:
</para>
<programlisting>
CREATE TYPE pgrowlocks_type AS (
locked_row TID, -- row TID
lock_type TEXT, -- lock type
locker XID, -- locking XID
multi bool, -- multi XID?
xids xid[], -- multi XIDs
pids INTEGER[] -- locker's process id
);
</programlisting>
<table>
<title>pgrowlocks_type</title>
<tgroup cols="2">
<title><function>pgrowlocks</> output columns</title>
<tgroup cols="3">
<thead>
<row>
<entry>Name</entry>
<entry>Type</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry>locked_row</entry>
<entry>tuple ID(TID) of each locked rows</entry>
<entry><structfield>locked_row</structfield></entry>
<entry><type>tid</type></entry>
<entry>Tuple ID (TID) of locked row</entry>
</row>
<row>
<entry>lock_type</entry>
<entry>"Shared" for shared lock, "Exclusive" for exclusive lock</entry>
<entry><structfield>lock_type</structfield></entry>
<entry><type>text</type></entry>
<entry><literal>Shared</> for shared lock, or
<literal>Exclusive</> for exclusive lock</entry>
</row>
<row>
<entry>locker</entry>
<entry>transaction ID of locker (Note 1)</entry>
<entry><structfield>locker</structfield></entry>
<entry><type>xid</type></entry>
<entry>Transaction ID of locker, or multixact ID if multi-transaction</entry>
</row>
<row>
<entry>multi</entry>
<entry>"t" if locker is a multi transaction, otherwise "f"</entry>
<entry><structfield>multi</structfield></entry>
<entry><type>boolean</type></entry>
<entry>True if locker is a multi-transaction</entry>
</row>
<row>
<entry>xids</entry>
<entry>XIDs of lockers (Note 2)</entry>
<entry><structfield>xids</structfield></entry>
<entry><type>xid[]</type></entry>
<entry>Transaction IDs of lockers (more than one if multi-transaction)</entry>
</row>
<row>
<entry>pids</entry>
<entry>process ids of locking backends</entry>
<entry><structfield>pids</structfield></entry>
<entry><type>integer[]</type></entry>
<entry>Process IDs of locking backends (more than one if multi-transaction)</entry>
</row>
</tbody>
</tgroup>
</table>
<para>
Note1: If the locker is multi transaction, it represents the multi ID.
</para>
<para>
Note2: If the locker is multi, multiple data are shown.
</para>
<para>
The calling sequence for <literal>pgrowlocks</literal> is as follows:
<literal>pgrowlocks</literal> grabs AccessShareLock for the target table and
reads each row one by one to get the row locking information. You should
notice that:
<function>pgrowlocks</function> takes <literal>AccessShareLock</> for the
target table and reads each row one by one to collect the row locking
information. This is not very speedy for a large table. Note that:
</para>
<orderedlist>
<listitem>
<para>
if the table is exclusive locked by someone else,
<literal>pgrowlocks</literal> will be blocked.
If the table as a whole is exclusive-locked by someone else,
<function>pgrowlocks</function> will be blocked.
</para>
</listitem>
<listitem>
<para>
<literal>pgrowlocks</literal> may show incorrect information if there's a
new lock or a lock is freeed while its execution.
<function>pgrowlocks</function> is not guaranteed to produce a
self-consistent snapshot. It is possible that a new row lock is taken,
or an old lock is freed, during its execution.
</para>
</listitem>
</orderedlist>
<para>
<literal>pgrowlocks</literal> does not show the contents of locked rows. If
you want to take a look at the row contents at the same time, you could do
something like this:
</para>
<function>pgrowlocks</function> does not show the contents of locked
rows. If you want to take a look at the row contents at the same time, you
could do something like this:
<programlisting>
SELECT * FROM accounts AS a, pgrowlocks('accounts') AS p WHERE p.locked_ row = a.ctid;
SELECT * FROM accounts AS a, pgrowlocks('accounts') AS p
WHERE p.locked_row = a.ctid;
</programlisting>
Be aware however that (as of <productname>PostgreSQL</> 8.3) such a
query will be very inefficient.
</para>
</sect2>
<sect2>
<title>Example</title>
<para>
<literal>pgrowlocks</literal> returns the following columns:
</para>
<para>
Here is a sample execution of pgrowlocks:
</para>
<title>Sample output</title>
<programlisting>
test=# SELECT * FROM pgrowlocks('t1');
locked_row | lock_type | locker | multi | xids | pids
locked_row | lock_type | locker | multi | xids | pids
------------+-----------+--------+-------+-----------+---------------
(0,1) | Shared | 19 | t | {804,805} | {29066,29068}
(0,2) | Shared | 19 | t | {804,805} | {29066,29068}
@ -117,7 +124,14 @@ test=# SELECT * FROM pgrowlocks('t1');
(0,4) | Exclusive | 804 | f | {804} | {29066}
(4 rows)
</programlisting>
</sect2>
</sect1>
<sect2>
<title>Author</title>
<para>
Tatsuo Ishii
</para>
</sect2>
</sect1>

View File

@ -1,29 +1,35 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/pgstattuple.sgml,v 1.3 2007/12/10 05:32:51 tgl Exp $ -->
<sect1 id="pgstattuple">
<title>pgstattuple</title>
<indexterm zone="pgstattuple">
<primary>pgstattuple</primary>
</indexterm>
<para>
<literal>pgstattuple</literal> modules provides various functions to obtain
tuple statistics.
The <filename>pgstattuple</filename> module provides various functions to
obtain tuple-level statistics.
</para>
<sect2>
<title>Functions</title>
<itemizedlist>
<listitem>
<para>
<literal>pgstattuple()</literal> returns the relation length, percentage
of the "dead" tuples of a relation and other info. This may help users to
determine whether vacuum is necessary or not. Here is an example session:
</para>
<programlisting>
test=> \x
Expanded display is on.
<variablelist>
<varlistentry>
<term>
<function>pgstattuple(text) returns record</>
</term>
<listitem>
<para>
<function>pgstattuple</function> returns a relation's physical length,
percentage of <quote>dead</> tuples, and other info. This may help users
to determine whether vacuum is necessary or not. The argument is the
target relation's name (optionally schema-qualified).
For example:
</para>
<programlisting>
test=> SELECT * FROM pgstattuple('pg_catalog.pg_proc');
-[ RECORD 1 ]------+-------
table_len | 458752
@ -35,86 +41,111 @@ dead_tuple_len | 3157
dead_tuple_percent | 0.69
free_space | 8932
free_percent | 1.95
</programlisting>
</programlisting>
<para>
Here are explanations for each column:
The output columns are:
</para>
<table>
<title><literal>pgstattuple()</literal> column descriptions</title>
<tgroup cols="2">
<title><function>pgstattuple</function> output columns</title>
<tgroup cols="3">
<thead>
<row>
<entry>Column</entry>
<entry>Type</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry>table_len</entry>
<entry>physical relation length in bytes</entry>
<entry><structfield>table_len</structfield></entry>
<entry><type>bigint</type></entry>
<entry>Physical relation length in bytes</entry>
</row>
<row>
<entry>tuple_count</entry>
<entry>number of live tuples</entry>
<entry><structfield>tuple_count</structfield></entry>
<entry><type>bigint</type></entry>
<entry>Number of live tuples</entry>
</row>
<row>
<entry>tuple_len</entry>
<entry>total tuples length in bytes</entry>
<entry><structfield>tuple_len</structfield></entry>
<entry><type>bigint</type></entry>
<entry>Total length of live tuples in bytes</entry>
</row>
<row>
<entry>tuple_percent</entry>
<entry>live tuples in %</entry>
<entry><structfield>tuple_percent</structfield></entry>
<entry><type>float8</type></entry>
<entry>Percentage of live tuples</entry>
</row>
<row>
<entry>dead_tuple_len</entry>
<entry>total dead tuples length in bytes</entry>
<entry><structfield>dead_tuple_count</structfield></entry>
<entry><type>bigint</type></entry>
<entry>Number of dead tuples</entry>
</row>
<row>
<entry>dead_tuple_percent</entry>
<entry>dead tuples in %</entry>
<entry><structfield>dead_tuple_len</structfield></entry>
<entry><type>bigint</type></entry>
<entry>Total length of dead tuples in bytes</entry>
</row>
<row>
<entry>free_space</entry>
<entry>free space in bytes</entry>
<entry><structfield>dead_tuple_percent</structfield></entry>
<entry><type>float8</type></entry>
<entry>Percentage of dead tuples</entry>
</row>
<row>
<entry>free_percent</entry>
<entry>free space in %</entry>
<entry><structfield>free_space</structfield></entry>
<entry><type>bigint</type></entry>
<entry>Total free space in bytes</entry>
</row>
<row>
<entry><structfield>free_percent</structfield></entry>
<entry><type>float8</type></entry>
<entry>Percentage of free space</entry>
</row>
</tbody>
</tgroup>
</table>
<para>
<note>
<para>
<literal>pgstattuple</literal> acquires only a read lock on the relation. So
concurrent update may affect the result.
</para>
</note>
<note>
<para>
<literal>pgstattuple</literal> judges a tuple is "dead" if HeapTupleSatisfiesNow()
returns false.
</para>
</note>
</para>
</listitem>
<listitem>
<para>
<literal>pg_relpages()</literal> returns the number of pages in the relation.
<function>pgstattuple</function> acquires only a read lock on the
relation. So the results do not reflect an instantaneous snapshot;
concurrent updates will affect them.
</para>
</listitem>
<listitem>
<para>
<literal>pgstatindex()</literal> returns an array showing the information about an index:
<function>pgstattuple</function> judges a tuple is <quote>dead</> if
<function>HeapTupleSatisfiesNow</> returns false.
</para>
<programlisting>
test=> \x
Expanded display is on.
</listitem>
</varlistentry>
<varlistentry>
<term>
<function>pgstattuple(oid) returns record</>
</term>
<listitem>
<para>
This is the same as <function>pgstattuple(text)</function>, except
that the target relation is specified by OID.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>
<function>pgstatindex(text) returns record</>
</term>
<listitem>
<para>
<function>pgstatindex</function> returns a record showing information
about a btree index. For example:
</para>
<programlisting>
test=> SELECT * FROM pgstatindex('pg_cast_oid_index');
-[ RECORD 1 ]------+------
version | 2
@ -128,31 +159,116 @@ deleted_pages | 0
avg_leaf_density | 50.27
leaf_fragmentation | 0
</programlisting>
</listitem>
</itemizedlist>
<para>
The output columns are:
</para>
<table>
<title><function>pgstatindex</function> output columns</title>
<tgroup cols="3">
<thead>
<row>
<entry>Column</entry>
<entry>Type</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry><structfield>version</structfield></entry>
<entry><type>integer</type></entry>
<entry>Btree version number</entry>
</row>
<row>
<entry><structfield>tree_level</structfield></entry>
<entry><type>integer</type></entry>
<entry>Tree level of the root page</entry>
</row>
<row>
<entry><structfield>index_size</structfield></entry>
<entry><type>integer</type></entry>
<entry>Total number of pages in index</entry>
</row>
<row>
<entry><structfield>root_block_no</structfield></entry>
<entry><type>integer</type></entry>
<entry>Location of root block</entry>
</row>
<row>
<entry><structfield>internal_pages</structfield></entry>
<entry><type>integer</type></entry>
<entry>Number of <quote>internal</> (upper-level) pages</entry>
</row>
<row>
<entry><structfield>leaf_pages</structfield></entry>
<entry><type>integer</type></entry>
<entry>Number of leaf pages</entry>
</row>
<row>
<entry><structfield>empty_pages</structfield></entry>
<entry><type>integer</type></entry>
<entry>Number of empty pages</entry>
</row>
<row>
<entry><structfield>deleted_pages</structfield></entry>
<entry><type>integer</type></entry>
<entry>Number of deleted pages</entry>
</row>
<row>
<entry><structfield>avg_leaf_density</structfield></entry>
<entry><type>float8</type></entry>
<entry>Average density of leaf pages</entry>
</row>
<row>
<entry><structfield>leaf_fragmentation</structfield></entry>
<entry><type>float8</type></entry>
<entry>Leaf page fragmentation</entry>
</row>
</tbody>
</tgroup>
</table>
<para>
As with <function>pgstattuple</>, the results are accumulated
page-by-page, and should not be expected to represent an
instantaneous snapshot of the whole index.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>
<function>pg_relpages(text) returns integer</>
</term>
<listitem>
<para>
<function>pg_relpages</function> returns the number of pages in the
relation.
</para>
</listitem>
</varlistentry>
</variablelist>
</sect2>
<sect2>
<title>Usage</title>
<para>
<literal>pgstattuple</literal> may be called as a relation function and is
defined as follows:
</para>
<programlisting>
CREATE OR REPLACE FUNCTION pgstattuple(text) RETURNS pgstattuple_type
AS 'MODULE_PATHNAME', 'pgstattuple'
LANGUAGE C STRICT;
<title>Author</title>
CREATE OR REPLACE FUNCTION pgstattuple(oid) RETURNS pgstattuple_type
AS 'MODULE_PATHNAME', 'pgstattuplebyid'
LANGUAGE C STRICT;
</programlisting>
<para>
The argument is the relation name (optionally it may be qualified)
or the OID of the relation. Note that pgstattuple only returns
one row.
Tatsuo Ishii
</para>
</sect2>
</sect1>

View File

@ -1,90 +1,120 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/pgtrgm.sgml,v 2.2 2007/12/10 05:32:51 tgl Exp $ -->
<sect1 id="pgtrgm">
<title>pg_trgm</title>
<indexterm zone="pgtrgm">
<primary>pg_trgm</primary>
</indexterm>
<para>
The <literal>pg_trgm</literal> module provides functions and index classes
for determining the similarity of text based on trigram matching.
The <filename>pg_trgm</filename> module provides functions and operators
for determining the similarity of text based on trigram matching, as
well as index operator classes that support fast searching for similar
strings.
</para>
<sect2>
<title>Trigram (or Trigraph)</title>
<title>Trigram (or Trigraph) Concepts</title>
<para>
A trigram is a set of three consecutive characters taken
from a string. A string is considered to have two spaces
prefixed and one space suffixed when determining the set
of trigrams that comprise the string.
</para>
<para>
eg. The set of trigrams in the word "cat" is " c", " ca",
"at " and "cat".
A trigram is a group of three consecutive characters taken
from a string. We can measure the similarity of two strings by
counting the number of trigrams they share. This simple idea
turns out to be very effective for measuring the similarity of
words in many natural languages.
</para>
<note>
<para>
A string is considered to have two spaces
prefixed and one space suffixed when determining the set
of trigrams contained in the string.
For example, the set of trigrams in the string
<quote><literal>cat</literal></quote> is
<quote><literal> c</literal></quote>,
<quote><literal> ca</literal></quote>,
<quote><literal>cat</literal></quote>, and
<quote><literal>at </literal></quote>.
</para>
</note>
</sect2>
<sect2>
<title>Public Functions</title>
<table>
<title><literal>pg_trgm</literal> functions</title>
<tgroup cols="2">
<title>Functions and Operators</title>
<table id="pgtrgm-func-table">
<title><filename>pg_trgm</filename> functions</title>
<tgroup cols="3">
<thead>
<row>
<entry>Function</entry>
<entry>Returns</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry><literal>real similarity(text, text)</literal></entry>
<entry><function>similarity(text, text)</function></entry>
<entry><type>real</type></entry>
<entry>
<para>
Returns a number that indicates how closely matches the two
arguments are. A zero result indicates that the two words
are completely dissimilar, and a result of one indicates that
the two words are identical.
</para>
Returns a number that indicates how similar the two arguments are.
The range of the result is zero (indicating that the two strings are
completely dissimilar) to one (indicating that the two strings are
identical).
</entry>
</row>
<row>
<entry><literal>real show_limit()</literal></entry>
<entry><function>show_trgm(text)</function></entry>
<entry><type>text[]</type></entry>
<entry>
<para>
Returns the current similarity threshold used by the '%'
operator. This in effect sets the minimum similarity between
two words in order that they be considered similar enough to
be misspellings of each other, for example.
</para>
Returns an array of all the trigrams in the given string.
(In practice this is seldom useful except for debugging.)
</entry>
</row>
<row>
<entry><literal>real set_limit(real)</literal></entry>
<entry><function>show_limit()</function></entry>
<entry><type>real</type></entry>
<entry>
<para>
Sets the current similarity threshold that is used by the '%'
operator, and is returned by the show_limit() function.
</para>
Returns the current similarity threshold used by the <literal>%</>
operator. This sets the minimum similarity between
two words for them to be considered similar enough to
be misspellings of each other, for example.
</entry>
</row>
<row>
<entry><literal>text[] show_trgm(text)</literal></entry>
<entry><function>set_limit(real)</function></entry>
<entry><type>real</type></entry>
<entry>
<para>
Returns an array of all the trigrams of the supplied text
parameter.
</para>
Sets the current similarity threshold that is used by the <literal>%</>
operator. The threshold must be between 0 and 1 (default is 0.3).
Returns the same value passed in.
</entry>
</row>
</tbody>
</tgroup>
</table>
<table id="pgtrgm-op-table">
<title><filename>pg_trgm</filename> operators</title>
<tgroup cols="3">
<thead>
<row>
<entry>Operator: <literal>text % text (returns boolean)</literal></entry>
<entry>Operator</entry>
<entry>Returns</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry><type>text</> <literal>%</literal> <type>text</></entry>
<entry><type>boolean</type></entry>
<entry>
<para>
The '%' operator returns TRUE if its two arguments have a similarity
that is greater than the similarity threshold set by set_limit(). It
will return FALSE if the similarity is less than the current
threshold.
</para>
Returns <literal>true</> if its arguments have a similarity that is
greater than the current similarity threshold set by
<function>set_limit</>.
</entry>
</row>
</tbody>
@ -93,97 +123,111 @@
</sect2>
<sect2>
<title>Public Index Operator Class</title>
<title>Index Support</title>
<para>
The <literal>pg_trgm</literal> module comes with the
<literal>gist_trgm_ops</literal> index operator class that allows a
developer to create an index over a text column for the purpose
of very fast similarity searches.
The <filename>pg_trgm</filename> module provides GiST and GIN index
operator classes that allow you to create an index over a text column for
the purpose of very fast similarity searches. These index types support
the <literal>%</> similarity operator (and no other operators, so you may
want a regular btree index too).
</para>
<para>
To use this index, the '%' operator must be used and an appropriate
similarity threshold for the application must be set. Example:
</para>
Example:
<programlisting>
CREATE TABLE test_trgm (t text);
CREATE INDEX trgm_idx ON test_trgm USING gist (t gist_trgm_ops);
</programlisting>
or
<programlisting>
CREATE INDEX trgm_idx ON test_trgm USING gin (t gin_trgm_ops);
</programlisting>
</para>
<para>
At this point, you will have an index on the t text column that you
can use for similarity searching. Example:
At this point, you will have an index on the <structfield>t</> column that
you can use for similarity searching. A typical query is
</para>
<programlisting>
SELECT
t,
similarity(t, 'word') AS sml
FROM
test_trgm
WHERE
t % 'word'
ORDER BY
sml DESC, t;
SELECT t, similarity(t, '<replaceable>word</>') AS sml
FROM test_trgm
WHERE t % '<replaceable>word</>'
ORDER BY sml DESC, t;
</programlisting>
<para>
This will return all values in the text column that are sufficiently
similar to 'word', sorted from best match to worst. The index will
be used to make this a fast operation over very large data sets.
similar to <replaceable>word</>, sorted from best match to worst. The
index will be used to make this a fast operation even over very large data
sets.
</para>
<para>
The choice between GiST and GIN indexing depends on the relative
performance characteristics of GiST and GIN, which are discussed elsewhere.
As a rule of thumb, a GIN index is faster to search than a GiST index, but
slower to build or update; so GIN is better suited for static data and GiST
for often-updated data.
</para>
</sect2>
<sect2>
<title>Text Search Integration</title>
<para>
Trigram matching is a very useful tool when used in conjunction
with a full text index.
with a full text index. In particular it can help to recognize
misspelled input words that will not be matched directly by the
full text search mechanism.
</para>
<para>
The first step is to generate an auxiliary table containing all
the unique words in the documents:
</para>
<programlisting>
CREATE TABLE words AS SELECT word FROM
stat('SELECT to_tsvector(''simple'', bodytext) FROM documents');
CREATE TABLE words AS SELECT word FROM
ts_stat('SELECT to_tsvector(''simple'', bodytext) FROM documents');
</programlisting>
<para>
where <structname>documents</> is a table that has a text field
<structfield>bodytext</> that we wish to search. The use of the
<literal>simple</> configuration with the <function>to_tsvector</>
function, instead of just using the already
existing vector is to avoid creating a list of already stemmed
words. This way, only the original, unstemmed words are added
to the word list.
<structfield>bodytext</> that we wish to search. The reason for using
the <literal>simple</> configuration with the <function>to_tsvector</>
function, instead of using a language-specific configuration,
is that we want a list of the original (unstemmed) words.
</para>
<para>
Next, create a trigram index on the word column:
</para>
<programlisting>
CREATE INDEX words_idx ON words USING gist(word gist_trgm_ops);
CREATE INDEX words_idx ON words USING gin(word gin_trgm_ops);
</programlisting>
<para>
or
</para>
<programlisting>
CREATE INDEX words_idx ON words USING gin(word gist_trgm_ops);
</programlisting>
<para>
Now, a <literal>SELECT</literal> query similar to the example above can be
used to suggest spellings for misspelled words in user search terms. A
useful extra clause is to ensure that the similar words are also
of similar length to the misspelled word.
</para>
<para>
<note>
<para>
Since the <structname>words</> table has been generated as a separate,
static table, it will need to be periodically regenerated so that
it remains up to date with the document collection.
</para>
</note>
Now, a <command>SELECT</command> query similar to the previous example can
be used to suggest spellings for misspelled words in user search terms.
A useful extra test is to require that the selected words are also of
similar length to the misspelled word.
</para>
<note>
<para>
Since the <structname>words</> table has been generated as a separate,
static table, it will need to be periodically regenerated so that
it remains reasonably up-to-date with the document collection.
Keeping it exactly current is usually unnecessary.
</para>
</note>
</sect2>
<sect2>
<title>References</title>
<para>
GiST Development Site
<ulink url="http://www.sai.msu.su/~megera/postgres/gist/"></ulink>
@ -196,6 +240,7 @@ CREATE INDEX words_idx ON words USING gin(word gist_trgm_ops);
<sect2>
<title>Authors</title>
<para>
Oleg Bartunov <email>oleg@sai.msu.su</email>, Moscow, Moscow University, Russia
</para>
@ -203,7 +248,7 @@ CREATE INDEX words_idx ON words USING gin(word gist_trgm_ops);
Teodor Sigaev <email>teodor@sigaev.ru</email>, Moscow, Delta-Soft Ltd.,Russia
</para>
<para>
Documentation: Christopher Kings-Lynne
Documentation: Christopher Kings-Lynne
</para>
<para>
This module is sponsored by Delta-Soft Ltd., Moscow, Russia.