diff --git a/doc/src/sgml/intagg.sgml b/doc/src/sgml/intagg.sgml index f92af5485a..b6f37f752c 100644 --- a/doc/src/sgml/intagg.sgml +++ b/doc/src/sgml/intagg.sgml @@ -1,82 +1,145 @@ + intagg - + intagg - This section describes the intagg module which provides an integer aggregator and an enumerator. + The intagg module provides an integer aggregator and an + enumerator. - - Many database systems have the notion of a one to many table. Such a table usually sits between two indexed tables, as: - - -CREATE TABLE one_to_many(left INT, right INT) ; - + + + Functions - And it is used like this: + The aggregator is an aggregate function + int_array_aggregate(integer) + that produces an integer array + containing exactly the integers it is fed. + Here is a not-tremendously-useful example: - SELECT right.* from right JOIN one_to_many ON (right.id = one_to_many.right) - WHERE one_to_many.left = item; +test=# select int_array_aggregate(i) from +test-# generate_series(1,10,2) i; + int_array_aggregate +--------------------- + {1,3,5,7,9} +(1 row) - This will return all the items in the right hand table for an entry + The enumerator is a function + int_array_enum(integer[]) + that returns setof integer. It is essentially the reverse + operation of the aggregator: given an array of integers, expand it + into a set of rows. For example, + + + +test=# select * from int_array_enum(array[1,3,5,7,9]); + int_array_enum +---------------- + 1 + 3 + 5 + 7 + 9 +(5 rows) + + + + + + Sample Uses + + + Many database systems have the notion of a one to many table. 
Such a table + usually sits between two indexed tables, for example: + + + +CREATE TABLE left (id INT PRIMARY KEY, ...); +CREATE TABLE right (id INT PRIMARY KEY, ...); +CREATE TABLE one_to_many(left INT REFERENCES left, right INT REFERENCES right); + + + + It is typically used like this: + + + + SELECT right.* from right JOIN one_to_many ON (right.id = one_to_many.right) + WHERE one_to_many.left = item; + + + + This will return all the items in the right hand table for an entry in the left hand table. This is a very common construct in SQL. Now, this methodology can be cumbersome with a very large number of - entries in the one_to_many table. Depending on the order in which - data was entered, a join like this could result in an index scan + entries in the one_to_many table. Often, + a join like this would result in an index scan and a fetch for each right hand entry in the table for a particular - left hand entry. If you have a very dynamic system, there is not much you + left hand entry. If you have a very dynamic system, there is not much you can do. However, if you have some data which is fairly static, you can create a summary table with the aggregator. -CREATE TABLE summary as SELECT left, int_array_aggregate(right) - AS right FROM one_to_many GROUP BY left; +CREATE TABLE summary as + SELECT left, int_array_aggregate(right) AS right + FROM one_to_many + GROUP BY left; This will create a table with one row per left item, and an array of right items. Now this is pretty useless without some way of using - the array, thats why there is an array enumerator. + the array; that's why there is an array enumerator. 
You can do + -SELECT left, int_array_enum(right) FROM summary WHERE left = item; +SELECT left, int_array_enum(right) FROM summary WHERE left = item; - The above query using int_array_enum, produces the same results as: + The above query using int_array_enum produces the same results + as + -SELECT left, right FROM one_to_many WHERE left = item; +SELECT left, right FROM one_to_many WHERE left = item; - + The difference is that the query against the summary table has to get - only one row from the table, where as the query against "one_to_many" - must index scan and fetch a row for each entry. + only one row from the table, whereas the direct query against + one_to_many must index scan and fetch a row for each entry. - - On our system, an EXPLAIN shows a query with a cost of 8488 gets reduced - to a cost of 329. The query is a join between the one_to_many table, - - -SELECT right, count(right) FROM -( - SELECT left, int_array_enum(right) AS right FROM summary JOIN - (SELECT left FROM left_table WHERE left = item) AS lefts - ON (summary.left = lefts.left ) -) AS list GROUP BY right ORDER BY count DESC ; - - + + On one system, an EXPLAIN showed a query with a cost of 8488 was + reduced to a cost of 329. The original query was a join involving the + one_to_many table, which was replaced by: + + + +SELECT right, count(right) FROM + ( SELECT left, int_array_enum(right) AS right + FROM summary JOIN (SELECT left FROM left_table WHERE left = item) AS lefts + ON (summary.left = lefts.left) + ) AS list + GROUP BY right + ORDER BY count DESC; + + + + + diff --git a/doc/src/sgml/intarray.sgml b/doc/src/sgml/intarray.sgml index a7f29980e5..95121c1e9b 100644 --- a/doc/src/sgml/intarray.sgml +++ b/doc/src/sgml/intarray.sgml @@ -1,196 +1,207 @@ + + intarray - + intarray - This is an implementation of RD-tree data structure using GiST interface - of PostgreSQL. It has built-in lossy compression. 
- - - - Current implementation provides index support for one-dimensional array of - integers: gist__int_ops, suitable for small and medium size of arrays (used by - default), and gist__intbig_ops for indexing large arrays (we use superimposed - signature with length of 4096 bits to represent sets). There is also a - non-default gin__int_ops for GIN indexes on integer arrays. + The intarray module provides a number of useful functions + and operators for manipulating one-dimensional arrays of integers. + There is also support for indexed searches using some of the operators. - Functions + <filename>intarray</> Functions and Operators - + + <filename>intarray</> Functions - - - int icount(int[]) - the number of elements in intarray - - -test=# select icount('{1,2,3}'::int[]); - icount --------- - 3 -(1 row) - - + + + + Function + Return Type + Description + Example + Result + + - - - int[] sort(int[], 'asc' | 'desc') - sort intarray - - -test=# select sort('{1,2,3}'::int[],'desc'); - sort ---------- - {3,2,1} -(1 row) - - + + + icount(int[]) + int + number of elements in array + icount('{1,2,3}'::int[]) + 3 + - - - int[] sort(int[]) - sort in ascending order - - + + sort(int[], text dir) + int[] + sort array — dir must be asc or desc + sort('{1,2,3}'::int[], 'desc') + {3,2,1} + - - - int[] sort_asc(int[]),sort_desc(int[]) - shortcuts for sort - - + + sort(int[]) + int[] + sort in ascending order + sort(array[11,77,44]) + {11,44,77} + - - - int[] uniq(int[]) - returns unique elements - - -test=# select uniq(sort('{1,2,3,2,1}'::int[])); - uniq ---------- - {1,2,3} -(1 row) - - + + sort_asc(int[]) + int[] + sort in ascending order + + + - - - int idx(int[], int item) - returns index of first - intarray matching element to item, or '0' if matching failed. 
- - -test=# select idx('{1,2,3,2,1}'::int[],2); - idx ------ - 2 -(1 row) - - + + sort_desc(int[]) + int[] + sort in descending order + + + - - - int[] subarray(int[],int START [, int LEN]) - returns - part of intarray starting from element number START (from 1) and length LEN. - - -test=# select subarray('{1,2,3,2,1}'::int[],2,3); - subarray ----------- - {2,3,2} -(1 row) - - + + uniq(int[]) + int[] + remove adjacent duplicates + uniq(sort('{1,2,3,2,1}'::int[])) + {1,2,3} + - - - int[] intset(int4) - casting int4 to int[] - - -test=# select intset(1); - intset --------- - {1} -(1 row) - - + + idx(int[], int item) + int + index of first element matching item (0 if none) + idx(array[11,22,33,22,11], 22) + 2 + - - + + subarray(int[], int start, int len) + int[] + portion of array starting at position start, len elements + subarray('{1,2,3,2,1}'::int[], 2, 3) + {2,3,2} + - - Operations -
- Operations - + + subarray(int[], int start) + int[] + portion of array starting at position start + subarray('{1,2,3,2,1}'::int[], 2) + {2,3,2,1} + + + + intset(int) + int[] + make single-element array + intset(42) + {42} + + + + +
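Several of the functions above combine naturally. In particular, uniq removes only adjacent duplicates, so sorting first is the usual way to reduce an array to a set of distinct values. A brief illustrative session (assuming the intarray objects are installed; output shown as psql would display it):

```
test=# SELECT uniq('{1,2,3,2,1}'::int[]);
    uniq
-------------
 {1,2,3,2,1}
(1 row)

test=# SELECT uniq(sort('{1,2,3,2,1}'::int[]));
  uniq
---------
 {1,2,3}
(1 row)
```

The first call changes nothing because no two equal elements are adjacent; sorting first brings duplicates together so uniq can remove them.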
+ + + <filename>intarray</> Operators + + Operator + Returns Description + int[] && int[] - overlap - returns TRUE if arrays have at least one common element + boolean + overlap — true if arrays have at least one common element int[] @> int[] - contains - returns TRUE if left array contains right array + boolean + contains — true if left array contains right array int[] <@ int[] - contained - returns TRUE if left array is contained in right array + boolean + contained — true if left array is contained in right array # int[] - returns the number of elements in array + int + number of elements in array + + + int[] # int + int + index (same as idx function) int[] + int - push element to array ( add to end of array) + int[] + push element onto array (add it to end of array) int[] + int[] - merge of arrays (right array added to the end of left one) + int[] + array concatenation (right array added to the end of left one) int[] - int - remove entries matched by right argument from array + int[] + remove entries matching right argument from array int[] - int[] - remove right array from left + int[] + remove elements of right array from left int[] | int - returns intarray - union of arguments + int[] + union of arguments int[] | int[] - returns intarray as a union of two arrays + int[] + union of arrays - int[] & int[] - returns intersection of arrays + int[] + intersection of arrays - int[] @@ query_int - - returns TRUE if array satisfies query (like - '1&(2|3)') - + boolean + true if array satisfies query (see below) - query_int ~~ int[] - returns TRUE if array satisfies query (commutator of @@) + boolean + true if array satisfies query (commutator of @@)
+ (Before PostgreSQL 8.2, the containment operators @> and <@ were respectively called @ and ~. These names are still available, but are @@ -198,85 +209,102 @@ test=# select intset(1); are reversed from the convention formerly followed by the core geometric datatypes!) + + + The @@ and ~~ operators test whether an array + satisfies a query, which is expressed as a value of a + specialized data type query_int. A query + consists of integer values that are checked against the elements of + the array, possibly combined using the operators & + (AND), | (OR), and ! (NOT). Parentheses + can be used as needed. For example, + the query 1&(2|3) matches arrays that contain 1 + and also contain either 2 or 3. + +
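As a sketch of how the query syntax behaves in practice (assuming the intarray module is installed), the query_int operators can be exercised directly on literals:

```sql
-- 1&(2|3): the array must contain 1, plus either 2 or 3
SELECT '{1,2}'::int[] @@ '1&(2|3)'::query_int;   -- true
SELECT '{1,4}'::int[] @@ '1&(2|3)'::query_int;   -- false: neither 2 nor 3 present
SELECT '{3,1}'::int[] @@ '1&(2|3)'::query_int;   -- true: element order is irrelevant
```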
+ + + Index Support + + + intarray provides index support for the + &&, @>, <@, + and @@ operators, as well as regular array equality. + The implementation uses an RD-tree data structure with + built-in lossy compression. + + + + Two GiST index operator classes are provided: + gist__int_ops (used by default) is suitable for + small and medium-size arrays, while + gist__intbig_ops uses a larger signature and is more + suitable for indexing large arrays. + + + + There is also a non-default GIN operator class + gin__int_ops. + + + + The choice between GiST and GIN indexing depends on the relative + performance characteristics of GiST and GIN, which are discussed elsewhere. + As a rule of thumb, a GIN index is faster to search than a GiST index, but + slower to build or update; so GIN is better suited for static data and GiST + for often-updated data. + Example -CREATE TABLE message (mid INT NOT NULL,sections INT[]); -CREATE TABLE message_section_map (mid INT NOT NULL,sid INT NOT NULL); +-- a message can be in one or more sections +CREATE TABLE message (mid INT PRIMARY KEY, sections INT[], ...); --- create indices -CREATE unique index message_key ON message ( mid ); -CREATE unique index message_section_map_key2 ON message_section_map (sid, mid ); -CREATE INDEX message_rdtree_idx ON message USING GIST ( sections gist__int_ops); +-- create specialized index +CREATE INDEX message_rdtree_idx ON message USING GIST (sections gist__int_ops); --- select some messages with section in 1 OR 2 - OVERLAP operator -SELECT message.mid FROM message WHERE message.sections && '{1,2}'; +-- select messages in section 1 OR 2 - OVERLAP operator +SELECT message.mid FROM message WHERE message.sections && '{1,2}'; --- select messages contains in sections 1 AND 2 - CONTAINS operator +-- select messages in sections 1 AND 2 - CONTAINS operator SELECT message.mid FROM message WHERE message.sections @> '{1,2}'; --- the same, CONTAINED operator -SELECT message.mid FROM message WHERE '{1,2}' <@ 
message.sections; + +-- the same, using QUERY operator +SELECT message.mid FROM message WHERE message.sections @@ '1&2'::query_int; Benchmark + - subdirectory bench contains benchmark suite. + The source directory contrib/intarray/bench contains a + benchmark test suite. To run: + - cd ./bench - 1. createdb TEST - 2. psql TEST < ../_int.sql - 3. ./create_test.pl | psql TEST - 4. ./bench.pl - perl script to benchmark queries, supports OR, AND queries - with/without RD-Tree. Run script without arguments to - see availbale options. - - a)test without RD-Tree (OR) - ./bench.pl -d TEST -c -s 1,2 -v - b)test with RD-Tree - ./bench.pl -d TEST -c -s 1,2 -v -r - - BENCHMARKS: - - Size of table <message>: 200000 - Size of table <message_section_map>: 269133 - - Distribution of messages by sections: - - section 0: 74377 messages - section 1: 16284 messages - section 50: 1229 messages - section 99: 683 messages - - old - without RD-Tree support, - new - with RD-Tree - - +----------+---------------+----------------+ - |Search set|OR, time in sec|AND, time in sec| - | +-------+-------+--------+-------+ - | | old | new | old | new | - +----------+-------+-------+--------+-------+ - | 1| 0.625| 0.101| -| -| - +----------+-------+-------+--------+-------+ - | 99| 0.018| 0.017| -| -| - +----------+-------+-------+--------+-------+ - | 1,2| 0.766| 0.133| 0.628| 0.045| - +----------+-------+-------+--------+-------+ - | 1,2,50,65| 0.794| 0.141| 0.030| 0.006| - +----------+-------+-------+--------+-------+ + cd .../bench + createdb TEST + psql TEST < ../_int.sql + ./create_test.pl | psql TEST + ./bench.pl + + + The bench.pl script has numerous options, which + are displayed when it is run without any arguments. + Authors + - All work was done by Teodor Sigaev (teodor@stack.net) and Oleg - Bartunov (oleg@sai.msu.su). See + All work was done by Teodor Sigaev (teodor@stack.net) and + Oleg Bartunov (oleg@sai.msu.su). See for additional information. 
Andrey Oktyabrski did great work adding new functions and operations. @@ -284,4 +312,3 @@ SELECT message.mid FROM message WHERE '{1,2}' <@ message.sections;
- diff --git a/doc/src/sgml/isn.sgml b/doc/src/sgml/isn.sgml index 81b3296818..8d4c8fd3d7 100644 --- a/doc/src/sgml/isn.sgml +++ b/doc/src/sgml/isn.sgml @@ -1,429 +1,299 @@ + + isn - + isn - The isn module adds data types for the following - international-standard namespaces: EAN13, UPC, ISBN (books), ISMN (music), - and ISSN (serials). This module is inspired by Garrett A. Wollman's - isbn_issn code. + The isn module provides data types for the following + international product numbering standards: EAN13, UPC, ISBN (books), ISMN + (music), and ISSN (serials). Numbers are validated on input, and correctly + hyphenated on output. - - This module validates, and automatically adds the correct - hyphenations to the numbers. Also, it supports the new ISBN-13 - numbers to be used starting in January 2007. - - - - Premises: - - - - - ISBN13, ISMN13, ISSN13 numbers are all EAN13 numbers - - - EAN13 numbers aren't always ISBN13, ISMN13 or ISSN13 (some are) - - - some ISBN13 numbers can be displayed as ISBN - - - some ISMN13 numbers can be displayed as ISMN - - - some ISSN13 numbers can be displayed as ISSN - - - all UPC, ISBN, ISMN and ISSN can be represented as EAN13 numbers - - - - - - All types are internally represented as 64 bit integers, - and internally all are consistently interchangeable. - - - - - We have two operator classes (for btree and for hash) so each data type - can be indexed for faster access. - - Data types - - - We have the following data types: - - Data types + <filename>isn</filename> data types - Data type - Description - - - - - EAN13 - - - European Article Numbers. This type will always show the EAN13-display - format. Te output function for this is ean13_out() - - - - - - ISBN13 - - - For International Standard Book Numbers to be displayed in - the new EAN13-display format. - - - - - - ISMN13 - - - For International Standard Music Numbers to be displayed in - the new EAN13-display format. 
- - - - - ISSN13 - - - For International Standard Serial Numbers to be displayed in the new - EAN13-display format. - - - - - ISBN - - - For International Standard Book Numbers to be displayed in the current - short-display format. - - - - - ISMN - - - For International Standard Music Numbers to be displayed in the - current short-display format. - - - - - ISSN - - - For International Standard Serial Numbers to be displayed in the - current short-display format. These types will display the short - version of the ISxN (ISxN 10) whenever it's possible, and it will - show ISxN 13 when it's impossible to show the short version. The - output function to do this is isn_out() - - - - - UPC - - - For Universal Product Codes. UPC numbers are a subset of the EAN13 - numbers (they are basically EAN13 without the first '0' digit.) - The output function to do this is also isn_out() - - - - - -
- - - - EAN13, ISBN13, - ISMN13 and ISSN13 types will always - display the long version of the ISxN (EAN13). The output function to do - this is ean13_out(). - - - The need for these types is just for displaying in different ways the same - data: ISBN13 is actually the same as - ISBN, ISMN13=ISMN and - ISSN13=ISSN. - - -
- - - Input functions - - - We have the following input functions: - - - - Input functions - - - - Function + Data type Description + - ean13_in() + EAN13 - - To take a string and return an EAN13. - + European Article Numbers, always displayed in the EAN13 display format - isbn_in() + ISBN13 - - To take a string and return valid ISBN or ISBN13 numbers. - + International Standard Book Numbers to be displayed in + the new EAN13 display format - ismn_in() + ISMN13 - - To take a string and return valid ISMN or ISMN13 numbers. - - - - - - issn_in() - - - To take a string and return valid ISSN or ISSN13 numbers. - + International Standard Music Numbers to be displayed in + the new EAN13 display format - upc_in() + ISSN13 - - To take a string and return an UPC codes. - + International Standard Serial Numbers to be displayed in the new + EAN13 display format + + + + ISBN + + International Standard Book Numbers to be displayed in the old + short display format + + + + ISMN + + International Standard Music Numbers to be displayed in the + old short display format + + + + ISSN + + International Standard Serial Numbers to be displayed in the + old short display format + + + + UPC + + Universal Product Codes
+ + + Some notes: + + + + + ISBN13, ISMN13, ISSN13 numbers are all EAN13 numbers. + + + EAN13 numbers aren't always ISBN13, ISMN13 or ISSN13 (some + are). + + + Some ISBN13 numbers can be displayed as ISBN. + + + Some ISMN13 numbers can be displayed as ISMN. + + + Some ISSN13 numbers can be displayed as ISSN. + + + UPC numbers are a subset of the EAN13 numbers (they are basically + EAN13 without the first 0 digit). + + + All UPC, ISBN, ISMN and ISSN numbers can be represented as EAN13 + numbers. + + + + + Internally, all these types use the same representation (a 64-bit + integer), and all are interchangeable. Multiple types are provided + to control display formatting and to permit tighter validity checking + of input that is supposed to denote one particular type of number. + + + + The ISBN, ISMN, and ISSN types will display the + short version of the number (ISxN 10) whenever it's possible, and will show + ISxN 13 format for numbers that do not fit in the short version. + The EAN13, ISBN13, ISMN13 and + ISSN13 types will always display the long version of the ISxN + (EAN13). +
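The effect of the paired display formats can be seen by reading the same number into a short-format and a long-format type. A sketch (the exact hyphen positions are computed from the registered prefix ranges, so they are illustrative here):

```sql
SELECT '9780393040029'::isbn;     -- short display format, e.g. 0-393-04002-9
SELECT '9780393040029'::isbn13;   -- same number in EAN13 format, e.g. 978-0-393-04002-9
```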
Casts - We are able to cast from: + The isn module provides the following pairs of type casts: + - ISBN13 -> EAN13 + ISBN13 <=> EAN13 - ISMN13 -> EAN13 + ISMN13 <=> EAN13 - ISSN13 -> EAN13 + ISSN13 <=> EAN13 - ISBN -> EAN13 + ISBN <=> EAN13 - ISMN -> EAN13 + ISMN <=> EAN13 - ISSN -> EAN13 + ISSN <=> EAN13 - UPC -> EAN13 + UPC <=> EAN13 - ISBN <-> ISBN13 + ISBN <=> ISBN13 - ISMN <-> ISMN13 + ISMN <=> ISMN13 - ISSN <-> ISSN13 - - - - - - - C API - - The C API is implemented as: - - - extern Datum isn_out(PG_FUNCTION_ARGS); - extern Datum ean13_out(PG_FUNCTION_ARGS); - extern Datum ean13_in(PG_FUNCTION_ARGS); - extern Datum isbn_in(PG_FUNCTION_ARGS); - extern Datum ismn_in(PG_FUNCTION_ARGS); - extern Datum issn_in(PG_FUNCTION_ARGS); - extern Datum upc_in(PG_FUNCTION_ARGS); - - - - On success: - - - - - isn_out() takes any of our types and returns a string containing - the shortes possible representation of the number. - - - - - ean13_out() takes any of our types and returns the - EAN13 (long) representation of the number. - - - - - ean13_in() takes a string and return a EAN13. Which, as stated in (2) - could or could not be any of our types, but it certainly is an EAN13 - number. Only if the string is a valid EAN13 number, otherwise it fails. - - - - - isbn_in() takes a string and return an ISBN/ISBN13. Only if the string - is really a ISBN/ISBN13, otherwise it fails. - - - - - ismn_in() takes a string and return an ISMN/ISMN13. Only if the string - is really a ISMN/ISMN13, otherwise it fails. - - - - - issn_in() takes a string and return an ISSN/ISSN13. Only if the string - is really a ISSN/ISSN13, otherwise it fails. - - - - - upc_in() takes a string and return an UPC. Only if the string is - really a UPC, otherwise it fails. + ISSN <=> ISSN13 - (on failure, the functions 'ereport' the error) + When casting from EAN13 to another type, there is a run-time + check that the value is within the domain of the other type, and an error + is thrown if not. 
The other casts are simply relabelings that will + always succeed. - Testing functions + Functions and Operators + + + The isn module provides the standard comparison operators, + plus btree and hash indexing support for all these datatypes. In + addition there are several specialized functions. In this table, + isn means any one of the module's data types: + + - Testing functions - + <filename>isn</> functions + - Function - Description + Function + Returns + Description + - isn_weak(boolean) - Sets the weak input mode. + isn_weak(boolean) + boolean + Sets the weak input mode (returns new setting) - isn_weak() - Gets the current status of the weak mode. + isn_weak() + boolean + Gets the current status of the weak mode - make_valid() - Validates an invalid number (deleting the invalid flag). + make_valid(isn) + isn + Validates an invalid number (clears the invalid flag) - is_valid() - Checks for the invalid flag prsence. + is_valid(isn) + boolean + Checks for the presence of the invalid flag
- Weak mode is used to be able to insert invalid data to - a table. Invalid as in the check digit being wrong, not missing numbers. + Weak mode is used to be able to insert invalid data + into a table. Invalid means the check digit is wrong, not that there are + missing numbers. + Why would you want to use the weak mode? Well, it could be that you have a huge collection of ISBN numbers, and that there are so many of them that for weird reasons some have the wrong check digit (perhaps the - numbers where scanned from a printed list and the OCR got the numbers wrong, - perhaps the numbers were manually captured... who knows.) Anyway, the thing - is you might want to clean the mess up, but you still want to be able to have - all the numbers in your database and maybe use an external tool to access - the invalid numbers in the database so you can verify the information and - validate it more easily; as selecting all the invalid numbers in the table. + numbers were scanned from a printed list and the OCR got the numbers wrong, + perhaps the numbers were manually captured... who knows). Anyway, the point + is you might want to clean the mess up, but you still want to be able to + have all the numbers in your database and maybe use an external tool to + locate the invalid numbers in the database so you can verify the + information and validate it more easily; so for example you'd want to + select all the invalid numbers in the table. + When you insert invalid numbers in a table using the weak mode, the number - will be inserted with the corrected check digit, but it will be flagged - with an exclamation mark ('!') at the end (i.e. 0-11-000322-5!) + will be inserted with the corrected check digit, but it will be displayed + with an exclamation mark (!) at the end, for example + 0-11-000322-5!. This invalid marker can be checked with + the is_valid function and cleared with the + make_valid function. 
+ - You can also force the insertion of invalid numbers even not in the weak mode, - appending the '!' character at the end of the number. + You can also force the insertion of invalid numbers even when not in the + weak mode, by appending the ! character at the end of the + number. + + + + Another special feature is that during input, you can write + ? in place of the check digit, and the correct check digit + will be inserted automatically.
Examples + --Using the types directly: SELECT isbn('978-0-393-04002-9'); @@ -431,18 +301,18 @@ SELECT isbn13('0901690546'); SELECT issn('1436-4522'); --Casting types: --- note that you can only cast from ean13 to other type when the casted --- number would be valid in the realm of the casted type; --- thus, the following will NOT work: select isbn(ean13('0220356483481')); --- but these will: -SELECT upc(ean13('0220356483481')); -SELECT ean13(upc('220356483481')); +-- note that you can only cast from ean13 to another type when the +-- number would be valid in the realm of the target type; +-- thus, the following will NOT work: select isbn(ean13('0220356483481')); +-- but these will: +SELECT upc(ean13('0220356483481')); +SELECT ean13(upc('220356483481')); --Create a table with a single column to hold ISBN numbers: -CREATE TABLE test ( id isbn ); +CREATE TABLE test (id isbn); INSERT INTO test VALUES('9780393040029'); ---Automatically calculating check digits (observe the '?'): +--Automatically calculate check digits (observe the '?'): INSERT INTO test VALUES('220500896?'); INSERT INTO test VALUES('978055215372?'); @@ -455,10 +325,10 @@ INSERT INTO test VALUES('978-0-11-000533-4'); INSERT INTO test VALUES('9780141219307'); INSERT INTO test VALUES('2-205-00876-X'); SELECT isn_weak(false); - + SELECT id FROM test WHERE NOT is_valid(id); -UPDATE test SET id=make_valid(id) WHERE id = '2-205-00876-X!'; - +UPDATE test SET id = make_valid(id) WHERE id = '2-205-00876-X!'; + SELECT * FROM test; SELECT isbn13(id) FROM test; @@ -467,8 +337,9 @@ SELECT isbn13(id) FROM test; Bibliography + - The information to implement this module was collected through + The information to implement this module was collected from several sites, including: @@ -477,26 +348,33 @@ SELECT isbn13(id) FROM test; http://www.ismn-international.org/ http://www.wikipedia.org/ + - the prefixes used for hyphenation where also compiled from: + The prefixes used for hyphenation were also compiled from: 
http://www.gs1.org/productssolutions/idkeys/support/prefix_list.html http://www.isbn-international.org/en/identifiers.html http://www.ismn-international.org/ranges.html + Care was taken during the creation of the algorithms and they were meticulously verified against the suggested algorithms in the official ISBN, ISMN, ISSN User Manuals. - + Author Germán Méndez Bravo (Kronuz), 2004 - 2006 - -
+ + This module was inspired by Garrett A. Wollman's + isbn_issn code. + + + + diff --git a/doc/src/sgml/ltree.sgml b/doc/src/sgml/ltree.sgml index f98555c31a..22bd019c7d 100644 --- a/doc/src/sgml/ltree.sgml +++ b/doc/src/sgml/ltree.sgml @@ -1,438 +1,568 @@ + ltree - + ltree - ltree is a PostgreSQL module that contains implementation - of data types, indexed access methods and queries for data organized as a - tree-like structures. + This module implements a data type ltree for representing + labels of data stored in a hierarchical tree-like structure. + Extensive facilities for searching through label trees are provided. Definitions + - A label of a node is a sequence of one or more words - separated by blank character '_' and containing letters and digits ( for - example, [a-zA-Z0-9] for C locale). The length of a label is limited by 256 - bytes. + A label is a sequence of alphanumeric characters + and underscores (for example, in C locale the characters + A-Za-z0-9_ are allowed). Labels must be less than 256 bytes + long. + - Example: 'Countries', 'Personal_Services' + Examples: 42, Personal_Services + - A label path of a node is a sequence of one or more - dot-separated labels l1.l2...ln, represents path from root to the node. The - length of a label path is limited by 65Kb, but size <= 2Kb is preferrable. - We consider it's not a strict limitation (maximal size of label path for - DMOZ catalogue - , is about 240 - bytes!) + A label path is a sequence of zero or more + labels separated by dots, for example L1.L2.L3, representing + a path from the root of a hierarchical tree to a particular node. The + length of a label path must be less than 65Kb, but keeping it under 2Kb is + preferable. In practice this is not a major limitation; for example, + the longest label path in the DMOZ catalogue () is about 240 bytes. 
+ - Example: 'Top.Countries.Europe.Russia' + Example: Top.Countries.Europe.Russia + - We introduce several datatypes: + The ltree module provides several datatypes: + - ltree - is a datatype for label path. - - - - - ltree[] - is a datatype for arrays of ltree. + ltree stores a label path. + - lquery - - is a path expression that has regular expression in the label path and - used for ltree matching. Star symbol (*) is used to specify any number of - labels (levels) and could be used at the beginning and the end of lquery, - for example, '*.Europe.*'. + lquery represents a regular-expression-like pattern + for matching ltree values. A simple word matches that + label within a path. A star symbol (*) matches zero + or more labels. For example: + +foo Match the exact label path foo +*.foo.* Match any label path containing the label foo +*.foo Match any label path whose last label is foo + + - The following quantifiers are recognized for '*' (like in Perl): + Star symbols can also be quantified to restrict how many labels + they can match: + +*{n} Match exactly n labels +*{n,} Match at least n labels +*{n,m} Match at least n but not more than m labels +*{,m} Match at most m labels — same as *{0,m} + - - - {n} Match exactly n levels - - - {n,} Match at least n levels - - - {n,m} Match at least n but not more than m levels - - - {,m} Match at maximum m levels (eq. to {0,m}) - - + - It is possible to use several modifiers at the end of a label: + There are several modifiers that can be put at the end of a non-star + label in lquery to make it match more than just the exact match: + +@ Match case-insensitively, for example a@ matches A +* Match any label with this prefix, for example foo* matches foobar +% Match initial underscore-separated words + + The behavior of % is a bit complicated. It tries to match + words rather than the entire label. For example + foo_bar% matches foo_bar_baz but not + foo_barbaz. 
If combined with *, prefix + matching applies to each word separately, for example + foo_bar%* matches foo1_bar2_baz but + not foo1_br2_baz. - - - @ Do case-insensitive label matching - - - * Do prefix matching for a label - - - % Don't account word separator '_' in label matching, that is - 'Russian%' would match 'Russian_nations', but not 'Russian' - - - - + - lquery can contain logical '!' (NOT) at the beginning - of the label and '|' (OR) to specify possible alternatives for label - matching. + Also, you can write several possibly-modified labels separated with + | (OR) to match any of those labels, and you can put + ! (NOT) at the start to match any label that doesn't + match any of the alternatives. - - Example of lquery: - - + + + Here's an annotated example of lquery: + Top.*{0,2}.sport*@.!football|tennis.Russ*|Spain - a) b) c) d) e) + a. b. c. d. e. - - A label path should + This query will match any label path that: - begin from a node with label 'Top' + begins with the label Top - and following zero or 2 labels until + and next has zero to two labels before - a node with label beginning from case-insensitive prefix 'sport' + a label beginning with the case-insensitive prefix sport - following node with label not matched 'football' or 'tennis' and + then a label not matching football nor + tennis - end on node with label beginning from 'Russ' or strictly matched - 'Spain'. + and then ends with a label beginning with Russ or + exactly matching Spain. - - ltxtquery - - is a datatype for label searching (like type 'query' for full text - searching, see contrib/tsearch). It's possible to use modifiers @,%,* at - the end of word. The meaning of modifiers are the same as for lquery. + ltxtquery represents a full-text-search-like + pattern for matching ltree values. An + ltxtquery value contains words, possibly with the + modifiers @, *, % at the end; + the modifiers have the same meanings as in lquery. + Words can be combined with & (AND), + | (OR), ! 
(NOT), and parentheses. + The key difference from + lquery is that ltxtquery matches words without + regard to their position in the label path. + - Example: 'Europe & Russia*@ & !Transportation' - - - Search paths contain words 'Europe' and 'Russia*' (case-insensitive) and - not 'Transportation'. Notice, the order of words as they appear in label - path is not important ! + Here's an example ltxtquery: + + Europe & Russia*@ & !Transportation + + This will match paths that contain the label Europe and + any label beginning with Russia (case-insensitive), + but not paths containing the label Transportation. + The location of these words within the path is not important. + Also, when % is used, the word can be matched to any + underscore-separated word within a label, regardless of position. + + + Note: ltxtquery allows whitespace between symbols, but + ltree and lquery do not. + - Operations + Operators and Functions + - The following operations are defined for type ltree: + Type ltree has the usual comparison operators + =, <>, + <, >, <=, >=. + Comparison sorts in the order of a tree traversal, with the children + of a node sorted by label text. In addition, there are the following + specialized operators: + + + + <type>ltree</> Operators + + + + + Operator + Returns + Description + + + + + + ltree @> ltree + boolean + is left argument an ancestor of right (or equal)? + + + + ltree <@ ltree + boolean + is left argument a descendant of right (or equal)? + + + + ltree ~ lquery + boolean + does ltree match lquery? + + + + lquery ~ ltree + boolean + does ltree match lquery? + + + + ltree ? lquery[] + boolean + does ltree match any lquery in array? + + + + lquery[] ? ltree + boolean + does ltree match any lquery in array? + + + + ltree @ ltxtquery + boolean + does ltree match ltxtquery? + + + + ltxtquery @ ltree + boolean + does ltree match ltxtquery? 
+ + + + ltree || ltree + ltree + concatenate ltree paths + + + + ltree || text + ltree + convert text to ltree and concatenate + + + + text || ltree + ltree + convert text to ltree and concatenate + + + + ltree[] @> ltree + boolean + does array contain an ancestor of ltree? + + + + ltree <@ ltree[] + boolean + does array contain an ancestor of ltree? + + + + ltree[] <@ ltree + boolean + does array contain a descendant of ltree? + + + + ltree @> ltree[] + boolean + does array contain a descendant of ltree? + + + + ltree[] ~ lquery + boolean + does array contain any path matching lquery? + + + + lquery ~ ltree[] + boolean + does array contain any path matching lquery? + + + + ltree[] ? lquery[] + boolean + does ltree array contain any path matching any lquery? + + + + lquery[] ? ltree[] + boolean + does ltree array contain any path matching any lquery? + + + + ltree[] @ ltxtquery + boolean + does array contain any path matching ltxtquery? + + + + ltxtquery @ ltree[] + boolean + does array contain any path matching ltxtquery? + + + + ltree[] ?@> ltree + ltree + first array entry that is an ancestor of ltree; NULL if none + + + + ltree[] ?<@ ltree + ltree + first array entry that is a descendant of ltree; NULL if none + + + + ltree[] ?~ lquery + ltree + first array entry that matches lquery; NULL if none + + + + ltree[] ?@ ltxtquery + ltree + first array entry that matches ltxtquery; NULL if none + + + + +
+ + + The operators <@, @>, + @ and ~ have analogues + ^<@, ^@>, ^@, + ^~, which are the same except they do not use + indexes. These are useful only for testing purposes. + + + + The following functions are available: + + + + <type>ltree</> Functions + + + + + Function + Return Type + Description + Example + Result + + + + + + subltree(ltree, int start, int end) + ltree + subpath of ltree from position start to + position end-1 (counting from 0) + subltree('Top.Child1.Child2',1,2) + Child1 + + + + subpath(ltree, int offset, int len) + ltree + subpath of ltree starting at position + offset, length len. + If offset is negative, subpath starts that far from the + end of the path. If len is negative, leaves that many + labels off the end of the path. + subpath('Top.Child1.Child2',0,2) + Top.Child1 + + + + subpath(ltree, int offset) + ltree + subpath of ltree starting at position + offset, extending to end of path. + If offset is negative, subpath starts that far from the + end of the path. + subpath('Top.Child1.Child2',1) + Child1.Child2 + + + + nlevel(ltree) + integer + number of labels in path + nlevel('Top.Child1.Child2') + 3 + + + + index(ltree a, ltree b) + integer + position of first occurrence of b in + a; -1 if not found + index('0.1.2.3.5.4.5.6.8.5.6.8','5.6') + 6 + + + + index(ltree a, ltree b, int offset) + integer + position of first occurrence of b in + a, searching starting at offset; + negative offset means start -offset + labels from the end of the path + index('0.1.2.3.5.4.5.6.8.5.6.8','5.6',-4) + 9 + + + + text2ltree(text) + ltree + cast text to ltree + + + + + + ltree2text(ltree) + text + cast ltree to text + + + + + + lca(ltree, ltree, ...) + ltree + lowest common ancestor, i.e., longest common prefix of paths + (up to 8 arguments supported) + lca('1.2.2.3','1.2.3.4.5.6') + 1.2 + + + + lca(ltree[]) + ltree + lowest common ancestor, i.e., longest common prefix of paths + lca(array['1.2.2.3'::ltree,'1.2.3']) + 1.2 + + + + +
+
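+   
+    To tie the operators and functions together, here are a few
+    illustrative queries. They use only ltree literals, so
+    no table setup is needed; the results noted in the comments follow
+    directly from the definitions above (assuming the ltree
+    module is installed):
+   

```sql
-- ancestry test with <@ (descendant-or-equal)
SELECT 'Top.Science.Astronomy'::ltree <@ 'Top.Science'::ltree;   -- true

-- lquery match: last label must be Astronomy
SELECT 'Top.Science.Astronomy'::ltree ~ '*.Astronomy'::lquery;   -- true

-- path inspection and surgery
SELECT nlevel('Top.Science.Astronomy');                          -- 3
SELECT subpath('Top.Science.Astronomy', 0, 2);                   -- Top.Science
SELECT lca('Top.Science.Astronomy', 'Top.Science.Biology');      -- Top.Science
```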
+ + + Indexes + + ltree supports several types of indexes that can speed + up the indicated operators: - <,>,<=,>=,=, <> - - Have their usual meanings. Comparison is doing in the order of direct - tree traversing, children of a node are sorted lexicographic. + B-tree index over ltree: + <, <=, =, + >=, > - ltree @> ltree - - returns TRUE if left argument is an ancestor of right argument (or - equal). - - - - - ltree <@ ltree - - returns TRUE if left argument is a descendant of right argument (or - equal). - - - - - ltree ~ lquery, lquery ~ ltree - - return TRUE if node represented by ltree satisfies lquery. - - - - - ltree ? lquery[], lquery ? ltree[] - - return TRUE if node represented by ltree satisfies at least one lquery - from array. - - - - - ltree @ ltxtquery, ltxtquery @ ltree - - return TRUE if node represented by ltree satisfies ltxtquery. - - - - - ltree || ltree, ltree || text, text || ltree - - return concatenated ltree. - - - - - - Operations for arrays of ltree (ltree[]): - - - - - ltree[] @> ltree, ltree <@ ltree[] - - returns TRUE if array ltree[] contains an ancestor of ltree. - - - - - ltree @> ltree[], ltree[] <@ ltree - - returns TRUE if array ltree[] contains a descendant of ltree. - - - - - ltree[] ~ lquery, lquery ~ ltree[] - - returns TRUE if array ltree[] contains label paths matched lquery. - - - - - ltree[] ? lquery[], lquery[] ? ltree[] - - returns TRUE if array ltree[] contains label paths matched atleaset one - lquery from array. - - - - - ltree[] @ ltxtquery, ltxtquery @ ltree[] - - returns TRUE if array ltree[] contains label paths matched ltxtquery - (full text search). - - - - - ltree[] ?@> ltree, ltree ?<@ ltree[], ltree[] ?~ lquery, ltree[] ?@ ltxtquery - - - returns first element of array ltree[] satisfies corresponding condition - and NULL in vice versa. - - - - - - - Remark - - - Operations <@, @>, @ and - ~ have analogues - ^<@, ^@>, ^@, ^~, which don't use - indices! 
- - - - - Indices - - Various indices could be created to speed up execution of operations: - - - - - - B-tree index over ltree: <, <=, =, >=, > - - - - - GiST index over ltree: <, <=, =, >=, >, @>, <@, @, ~, ? + GiST index over ltree: + <, <=, =, + >=, >, + @>, <@, + @, ~, ? - Example: + Example of creating such an index: CREATE INDEX path_gist_idx ON test USING GIST (path); - GiST index over ltree[]: - ltree[]<@ ltree, ltree @> ltree[], @, ~, ?. + + GiST index over ltree[]: + ltree[] <@ ltree, ltree @> ltree[], + @, ~, ? - Example: + Example of creating such an index: - CREATE INDEX path_gist_idx ON test USING GIST (array_path); + CREATE INDEX path_gist_idx ON test USING GIST (array_path); - Notices: This index is lossy. + Note: This index type is lossy. - - Functions - - - - - ltree subltree(ltree, start, end) - returns subpath of ltree from start (inclusive) until the end. - - - # select subltree('Top.Child1.Child2',1,2); - subltree - -------- - Child1 - - - - - ltree subpath(ltree, OFFSET,LEN) and - ltree subpath(ltree, OFFSET) - returns subpath of ltree from OFFSET (inclusive) with length LEN. - If OFFSET is negative returns subpath starts that far from the end - of the path. If LENGTH is omitted, returns everything to the end - of the path. If LENGTH is negative, leaves that many labels off - the end of the path. - - - # select subpath('Top.Child1.Child2',1,2); - subpath - ------- - Child1.Child2 - - # select subpath('Top.Child1.Child2',-2,1); - subpath - --------- - Child1 - - - - - int4 nlevel(ltree) - returns level of the node. - - - # select nlevel('Top.Child1.Child2'); - nlevel - -------- - 3 - - - Note, that arguments start, end, OFFSET, LEN have meaning of level of the - node ! - - - - - int4 index(ltree,ltree) and - int4 index(ltree,ltree,OFFSET) - returns number of level of the first occurence of second argument in first - one beginning from OFFSET. if OFFSET is negative, than search begins from | - OFFSET| levels from the end of the path. 
- - - SELECT index('0.1.2.3.5.4.5.6.8.5.6.8','5.6',3); - index - ------- - 6 - SELECT index('0.1.2.3.5.4.5.6.8.5.6.8','5.6',-4); - index - ------- - 9 - - - - - ltree text2ltree(text) and - text ltree2text(text) cast functions for ltree and text. - - - - - ltree lca(ltree,ltree,...) (up to 8 arguments) and - ltree lca(ltree[]) Returns Lowest Common Ancestor (lca). - - - # select lca('1.2.2.3','1.2.3.4.5.6'); - lca - ----- - 1.2 - # select lca('{la.2.3,1.2.3.4.5.6}') is null; - ?column? - ---------- - f - - - - - - - Installation - - cd contrib/ltree - make - make install - make installcheck - - - Example - - createdb ltreetest - psql ltreetest < /usr/local/pgsql/share/contrib/ltree.sql - psql ltreetest < ltreetest.sql - -Now, we have a database ltreetest populated with a data describing hierarchy -shown below: + This example uses the following data (also available in file + contrib/ltree/ltreetest.sql in the source distribution): +CREATE TABLE test (path ltree); +INSERT INTO test VALUES ('Top'); +INSERT INTO test VALUES ('Top.Science'); +INSERT INTO test VALUES ('Top.Science.Astronomy'); +INSERT INTO test VALUES ('Top.Science.Astronomy.Astrophysics'); +INSERT INTO test VALUES ('Top.Science.Astronomy.Cosmology'); +INSERT INTO test VALUES ('Top.Hobbies'); +INSERT INTO test VALUES ('Top.Hobbies.Amateurs_Astronomy'); +INSERT INTO test VALUES ('Top.Collections'); +INSERT INTO test VALUES ('Top.Collections.Pictures'); +INSERT INTO test VALUES ('Top.Collections.Pictures.Astronomy'); +INSERT INTO test VALUES ('Top.Collections.Pictures.Astronomy.Stars'); +INSERT INTO test VALUES ('Top.Collections.Pictures.Astronomy.Galaxies'); +INSERT INTO test VALUES ('Top.Collections.Pictures.Astronomy.Astronauts'); +CREATE INDEX path_gist_idx ON test USING gist(path); +CREATE INDEX path_idx ON test USING btree(path); + - - TOP - / | \ - Science Hobbies Collections + + Now, we have a table test populated with data describing + the hierarchy shown below: + + + + Top + / | \ + Science 
Hobbies Collections / | \ Astronomy Amateurs_Astronomy Pictures / \ | @@ -440,13 +570,14 @@ shown below: / | \ Galaxies Stars Astronauts + - Inheritance: + We can do inheritance: ltreetest=# select path from test where path <@ 'Top.Science'; - path + path ------------------------------------ Top.Science Top.Science.Astronomy @@ -454,12 +585,14 @@ ltreetest=# select path from test where path <@ 'Top.Science'; Top.Science.Astronomy.Cosmology (4 rows) + - Matching: + Here are some examples of path matching: + ltreetest=# select path from test where path ~ '*.Astronomy.*'; - path + path ----------------------------------------------- Top.Science.Astronomy Top.Science.Astronomy.Astrophysics @@ -469,20 +602,22 @@ ltreetest=# select path from test where path ~ '*.Astronomy.*'; Top.Collections.Pictures.Astronomy.Galaxies Top.Collections.Pictures.Astronomy.Astronauts (7 rows) + ltreetest=# select path from test where path ~ '*.!pictures@.*.Astronomy.*'; - path + path ------------------------------------ Top.Science.Astronomy Top.Science.Astronomy.Astrophysics Top.Science.Astronomy.Cosmology (3 rows) + - Full text search: + Here are some examples of full text search: ltreetest=# select path from test where path @ 'Astro*% & !pictures@'; - path + path ------------------------------------ Top.Science.Astronomy Top.Science.Astronomy.Astrophysics @@ -491,54 +626,38 @@ ltreetest=# select path from test where path @ 'Astro*% & !pictures@'; (4 rows) ltreetest=# select path from test where path @ 'Astro* & !pictures@'; - path + path ------------------------------------ Top.Science.Astronomy Top.Science.Astronomy.Astrophysics Top.Science.Astronomy.Cosmology (3 rows) + - Using Functions: + Path construction using functions: ltreetest=# select subpath(path,0,2)||'Space'||subpath(path,2) from test where path <@ 'Top.Science.Astronomy'; - ?column? + ?column? 
------------------------------------------ Top.Science.Space.Astronomy Top.Science.Space.Astronomy.Astrophysics Top.Science.Space.Astronomy.Cosmology (3 rows) -We could create SQL-function: -CREATE FUNCTION ins_label(ltree, int4, text) RETURNS ltree + + + + We could simplify this by creating a SQL function that inserts a label + at a specified position in a path: + + +CREATE FUNCTION ins_label(ltree, int, text) RETURNS ltree AS 'select subpath($1,0,$2) || $3 || subpath($1,$2);' LANGUAGE SQL IMMUTABLE; - - - and previous select could be rewritten as: - - ltreetest=# select ins_label(path,2,'Space') from test where path <@ 'Top.Science.Astronomy'; - ins_label ------------------------------------------- - Top.Science.Space.Astronomy - Top.Science.Space.Astronomy.Astrophysics - Top.Science.Space.Astronomy.Cosmology -(3 rows) - - - - Or with another arguments: - - - -CREATE FUNCTION ins_label(ltree, ltree, text) RETURNS ltree -AS 'select subpath($1,0,nlevel($2)) || $3 || subpath($1,nlevel($2));' -LANGUAGE SQL IMMUTABLE; - -ltreetest=# select ins_label(path,'Top.Science'::ltree,'Space') from test where path <@ 'Top.Science.Astronomy'; - ins_label + ins_label ------------------------------------------ Top.Science.Space.Astronomy Top.Science.Space.Astronomy.Astrophysics @@ -547,224 +666,16 @@ ltreetest=# select ins_label(path,'Top.Science'::ltree,'Space') from test where - - Additional data - - To get more feeling from our ltree module you could download - dmozltree-eng.sql.gz (about 3Mb tar.gz archive containing 300,274 nodes), - available from - - dmozltree-eng.sql.gz, which is DMOZ catalogue, prepared for use with ltree. - Setup your test database (dmoz), load ltree module and issue command: - - - zcat dmozltree-eng.sql.gz| psql dmoz - - - Data will be loaded into database dmoz and all indices will be created. 
- - - - - Benchmarks - - All runs were performed on my IBM ThinkPad T21 (256 MB RAM, 750Mhz) using DMOZ - data, containing 300,274 nodes (see above for download link). We used some - basic queries typical for walking through catalog. - - - - Queries - - - - Q0: Count all rows (sort of base time for comparison) - - - select count(*) from dmoz; - count - -------- - 300274 - (1 row) - - - - - Q1: Get direct children (without inheritance) - - - select path from dmoz where path ~ 'Top.Adult.Arts.Animation.*{1}'; - path - ----------------------------------- - Top.Adult.Arts.Animation.Cartoons - Top.Adult.Arts.Animation.Anime - (2 rows) - - - - - Q2: The same as Q1 but with counting of successors - - - select path as parentpath , (select count(*)-1 from dmoz where path <@ - p.path) as count from dmoz p where path ~ 'Top.Adult.Arts.Animation.*{1}'; - parentpath | count - -----------------------------------+------- - Top.Adult.Arts.Animation.Cartoons | 2 - Top.Adult.Arts.Animation.Anime | 61 - (2 rows) - - - - - Q3: Get all parents - - - select path from dmoz where path @> 'Top.Adult.Arts.Animation' order by - path asc; - path - -------------------------- - Top - Top.Adult - Top.Adult.Arts - Top.Adult.Arts.Animation - (4 rows) - - - - - Q4: Get all parents with counting of children - - - select path, (select count(*)-1 from dmoz where path <@ p.path) as count - from dmoz p where path @> 'Top.Adult.Arts.Animation' order by path asc; - path | count - --------------------------+-------- - Top | 300273 - Top.Adult | 4913 - Top.Adult.Arts | 339 - Top.Adult.Arts.Animation | 65 - (4 rows) - - - - - Q5: Get all children with levels - - - select path, nlevel(path) - nlevel('Top.Adult.Arts.Animation') as level - from dmoz where path ~ 'Top.Adult.Arts.Animation.*{1,2}' order by path asc; - path | level - ------------------------------------------------+------- - Top.Adult.Arts.Animation.Anime | 1 - Top.Adult.Arts.Animation.Anime.Fan_Works | 2 - Top.Adult.Arts.Animation.Anime.Games | 2 
- Top.Adult.Arts.Animation.Anime.Genres | 2 - Top.Adult.Arts.Animation.Anime.Image_Galleries | 2 - Top.Adult.Arts.Animation.Anime.Multimedia | 2 - Top.Adult.Arts.Animation.Anime.Resources | 2 - Top.Adult.Arts.Animation.Anime.Titles | 2 - Top.Adult.Arts.Animation.Cartoons | 1 - Top.Adult.Arts.Animation.Cartoons.AVS | 2 - Top.Adult.Arts.Animation.Cartoons.Members | 2 - (11 rows) - - - - - - - Timings - -+---------------------------------------------+ -|Query|Rows|Time (ms) index|Time (ms) no index| -|-----+----+---------------+------------------| -| Q0| 1| NA| 1453.44| -|-----+----+---------------+------------------| -| Q1| 2| 0.49| 1001.54| -|-----+----+---------------+------------------| -| Q2| 2| 1.48| 3009.39| -|-----+----+---------------+------------------| -| Q3| 4| 0.55| 906.98| -|-----+----+---------------+------------------| -| Q4| 4| 24385.07| 4951.91| -|-----+----+---------------+------------------| -| Q5| 11| 0.85| 1003.23| -+---------------------------------------------+ - - - Timings without indices were obtained using operations which doesn't use - indices (see above) - - - - - Remarks - - We didn't run full-scale tests, also we didn't present (yet) data for - operations with arrays of ltree (ltree[]) and full text searching. We'll - appreciate your input. So far, below some (rather obvious) results: - - - - - Indices does help execution of queries - - - - - Q4 performs bad because one needs to read almost all data from the HDD - - - - - - - Some Backgrounds - - The approach we use for ltree is much like one we used in our other GiST based - contrib modules (intarray, tsearch, tree, btree_gist, rtree_gist). Theoretical - background is available in papers referenced from our GiST development page - (). - - - A hierarchical data structure (tree) is a set of nodes. Each node has a - signature (LPS) of a fixed size, which is a hashed label path of that node. 
- Traversing a tree we could *certainly* prune branches if - - - LQS (bitwise AND) LPS != LQS - - - where LQS is a signature of lquery or ltxtquery, obtained in the same way as - LPS. - - - ltree[]: - - - For array of ltree LPS is a bitwise OR-ed signatures of *ALL* children - reachable from that node. Signatures are stored in RD-tree, implemented using - GiST, which provides indexed access. - - - ltree: - - - For ltree we store LPS in a B-tree, implemented using GiST. Each node entry is - represented by (left_bound, signature, right_bound), so that we could speedup - operations <, <=, =, >=, > using left_bound, right_bound and prune branches of - a tree using signature. - - Authors + - All work was done by Teodor Sigaev (teodor@stack.net) and - Oleg Bartunov (oleg@sai.msu.su). See + All work was done by Teodor Sigaev (teodor@stack.net) and + Oleg Bartunov (oleg@sai.msu.su). See for - additional information. Authors would like to thank Eugeny Rodichev for + additional information. Authors would like to thank Eugeny Rodichev for helpful discussions. Comments and bug reports are welcome. +
diff --git a/doc/src/sgml/oid2name.sgml b/doc/src/sgml/oid2name.sgml index 6fb4c181c4..eb91827c7b 100644 --- a/doc/src/sgml/oid2name.sgml +++ b/doc/src/sgml/oid2name.sgml @@ -1,70 +1,157 @@ + + oid2name - + oid2name - This utility allows administrators to examine the file structure used by - PostgreSQL. To make use of it, you need to be familiar with the file - structure, which is described in . + oid2name is a utility program that helps administrators to + examine the file structure used by PostgreSQL. To make use of it, you need + to be familiar with the database file structure, which is described in + . - + + + + The name oid2name is historical, and is actually rather + misleading, since most of the time when you use it, you will really + be concerned with tables' filenode numbers (which are the file names + visible in the database directories). Be sure you understand the + difference between table OIDs and table filenodes! + + + Overview + - oid2name connects to the database and extracts OID, - filenode, and table name information. You can also have it show database - OIDs and tablespace OIDs. + oid2name connects to a target database and + extracts OID, filenode, and/or table name information. You can also have + it show database OIDs or tablespace OIDs. The program is controlled by + a large number of command-line switches, as shown in + . - - When displaying specific tables, you can select which tables to show by - using -o, -f and -t. The first switch takes an OID, the second takes - a filenode, and the third takes a tablename (actually, it's a LIKE - pattern, so you can use things like "foo%"). Note that you can use as many - of these switches as you like, and the listing will include all objects - matched by any of the switches. Also note that these switches can only - show objects in the database given in -d. - - - If you don't give any of -o, -f or -t it will dump all the tables in the - database given in -d. 
If you don't give -d, it will show a database - listing. Alternatively you can give -s to get a tablespace listing. - - - Additional switches + +
+ <application>oid2name</> switches + + + Switch + Description + + + - -i - include indexes and sequences in the database listing. + -o oid + show info for table with OID oid + - -x - display more information about each object shown: tablespace name, - schema name, OID. - + -f filenode + show info for table with filenode filenode + + + -t tablename_pattern + show info for table(s) matching tablename_pattern + + + + -s + show tablespace OIDs + + -S - also show system objects (those in information_schema, pg_toast - and pg_catalog schemas) + include system objects (those in + information_schema, pg_toast + and pg_catalog schemas) + + + -i + include indexes and sequences in the listing + + + + -x + display more information about each object shown: tablespace name, + schema name, and OID + + + -q - don't display headers(useful for scripting) + omit headers (useful for scripting) + + + + -d database + database to connect to + + + + -H host + database server's host + + + + -p port + database server's port + + + + -U username + username to connect as + + + + -P password + password (deprecated — putting this on the command line + is a security hazard)
+ + + To display specific tables, select which tables to show by + using -o, -f and/or -t. + -o takes an OID, + -f takes a filenode, + and -t takes a tablename (actually, it's a LIKE + pattern, so you can use things like foo%). + You can use as many + of these switches as you like, and the listing will include all objects + matched by any of the switches. But note that these switches can only + show objects in the database given by -d. + + + + If you don't give any of -o, -f or -t, + but do give -d, it will list all tables in the database + named by -d. In this mode, the -S and + -i switches control what gets listed. + + + + If you don't give -d either, it will show a listing of database + OIDs. Alternatively you can give -s to get a tablespace + listing. +
- + Examples - + +$ # what's in this database server, anyway? $ oid2name All databases: Oid Database Name Tablespace @@ -83,7 +170,8 @@ All tablespaces: 155151 fastdisk 155152 bigdisk -$ cd $PGDATA/17228 +$ # OK, let's look into database alvherre +$ cd $PGDATA/base/17228 $ # get top 10 db objects in the default tablespace, ordered by size $ ls -lS * | head -10 @@ -98,6 +186,7 @@ $ ls -lS * | head -10 -rw------- 1 alvherre alvherre 163840 sep 14 09:50 16699 -rw------- 1 alvherre alvherre 122880 sep 6 17:51 16751 +$ # I wonder what file 155173 is ... $ oid2name -d alvherre -f 155173 From database "alvherre": Filenode Table Name @@ -112,7 +201,7 @@ From database "alvherre": 155173 accounts 1155291 accounts_pkey -$ # you can also mix the options, and have more details +$ # you can mix the options, and get more details with -x $ oid2name -d alvherre -t accounts -f 1155291 -x From database "alvherre": Filenode Table Name Oid Schema Tablespace @@ -157,7 +246,7 @@ $ ls -d 155151/* 155151/17228/ 155151/PG_VERSION $ # Oh, what was database 17228 again? -$ oid2name +$ oid2name All databases: Oid Database Name Tablespace ---------------------------------- @@ -178,28 +267,25 @@ From database "alvherre": Filenode Table Name ---------------------- 155156 foo - -$ # end of sample session. + + + + Limitations - You can also get approximate size data for each object using psql. For - example, - - - SELECT relpages, relfilenode, relname FROM pg_class ORDER BY relpages DESC; - - - Each page is typically 8k. Relpages is updated by VACUUM. + oid2name requires a running database server with + non-corrupt system catalogs. It is therefore of only limited use + for recovering from catastrophic database corruption situations. - + Author + - b. palmer, bpalmer@crimelabs.net + B. Palmer bpalmer@crimelabs.net
- diff --git a/doc/src/sgml/pageinspect.sgml b/doc/src/sgml/pageinspect.sgml index c7ebe1f730..e398733d01 100644 --- a/doc/src/sgml/pageinspect.sgml +++ b/doc/src/sgml/pageinspect.sgml @@ -1,124 +1,170 @@ + pageinspect - + pageinspect - The functions in this module allow you to inspect the contents of data pages - at a low level, for debugging purposes. + The pageinspect module provides functions that allow you to + inspect the contents of database pages at a low level, which is useful for + debugging purposes. All of these functions may be used only by superusers. - Functions included + Functions - - - - get_raw_page reads one block of the named table and returns a copy as a - bytea field. This allows a single time-consistent copy of the block to be - made. Use of this functions is restricted to superusers. - - + + + + get_raw_page(text, int) returns bytea + - - - page_header shows fields which are common to all PostgreSQL heap and index - pages. Use of this function is restricted to superusers. - - - A page image obtained with get_raw_page should be passed as argument: - - -regression=# SELECT * FROM page_header(get_raw_page('pg_class',0)); - lsn | tli | flags | lower | upper | special | pagesize | version | prune_xid - -----------+-----+-------+-------+-------+---------+----------+---------+----------- - 0/24A1B50 | 1 | 1 | 232 | 368 | 8192 | 8192 | 4 | 0 - - - The returned columns correspond to the fields in the PageHeaderData struct. - See src/include/storage/bufpage.h for more details. - - + + + get_raw_page reads the specified block of the named + table and returns a copy as a bytea value. This allows a + single time-consistent copy of the block to be obtained. + + + - - - heap_page_items shows all line pointers on a heap page. For those line - pointers that are in use, tuple headers are also shown. All tuples are - shown, whether or not the tuples were visible to an MVCC snapshot at the - time the raw page was copied. 
Use of this function is restricted to - superusers. - - - A heap page image obtained with get_raw_page should be passed as argument: - - - test=# SELECT * FROM heap_page_items(get_raw_page('pg_class',0)); - - - See src/include/storage/itemid.h and src/include/access/htup.h for - explanations of the fields returned. - - + + + page_header(bytea) returns record + - - - bt_metap() returns information about the btree index metapage: - - - test=> SELECT * FROM bt_metap('pg_cast_oid_index'); - -[ RECORD 1 ]----- - magic | 340322 - version | 2 - root | 1 - level | 0 - fastroot | 1 - fastlevel | 0 - - + + + page_header shows fields that are common to all + PostgreSQL heap and index pages. + - - - bt_page_stats() shows information about single btree pages: - - - test=> SELECT * FROM bt_page_stats('pg_cast_oid_index', 1); - -[ RECORD 1 ]-+----- - blkno | 1 - type | l - live_items | 256 - dead_items | 0 - avg_item_size | 12 - page_size | 8192 - free_size | 4056 - btpo_prev | 0 - btpo_next | 0 - btpo | 0 - btpo_flags | 3 - - + + A page image obtained with get_raw_page should be + passed as argument. For example: + + +test=# SELECT * FROM page_header(get_raw_page('pg_class', 0)); + lsn | tli | flags | lower | upper | special | pagesize | version | prune_xid +-----------+-----+-------+-------+-------+---------+----------+---------+----------- + 0/24A1B50 | 1 | 1 | 232 | 368 | 8192 | 8192 | 4 | 0 + - - - bt_page_items() returns information about specific items on btree pages: + + The returned columns correspond to the fields in the + PageHeaderData struct. + See src/include/storage/bufpage.h for details. 
- - test=> SELECT * FROM bt_page_items('pg_cast_oid_index', 1); - itemoffset | ctid | itemlen | nulls | vars | data - ------------+---------+---------+-------+------+------------- - 1 | (0,1) | 12 | f | f | 23 27 00 00 - 2 | (0,2) | 12 | f | f | 24 27 00 00 - 3 | (0,3) | 12 | f | f | 25 27 00 00 - 4 | (0,4) | 12 | f | f | 26 27 00 00 - 5 | (0,5) | 12 | f | f | 27 27 00 00 - 6 | (0,6) | 12 | f | f | 28 27 00 00 - 7 | (0,7) | 12 | f | f | 29 27 00 00 - 8 | (0,8) | 12 | f | f | 2a 27 00 00 - - - + + + + + + heap_page_items(bytea) returns setof record + + + + + heap_page_items shows all line pointers on a heap + page. For those line pointers that are in use, tuple headers are also + shown. All tuples are shown, whether or not the tuples were visible to + an MVCC snapshot at the time the raw page was copied. + + + A heap page image obtained with get_raw_page should + be passed as argument. For example: + + +test=# SELECT * FROM heap_page_items(get_raw_page('pg_class', 0)); + + + See src/include/storage/itemid.h and + src/include/access/htup.h for explanations of the fields + returned. + + + + + + + bt_metap(text) returns record + + + + + bt_metap returns information about a btree + index's metapage. For example: + + +test=# SELECT * FROM bt_metap('pg_cast_oid_index'); +-[ RECORD 1 ]----- +magic | 340322 +version | 2 +root | 1 +level | 0 +fastroot | 1 +fastlevel | 0 + + + + + + + bt_page_stats(text, int) returns record + + + + + bt_page_stats returns summary information about + single pages of btree indexes. For example: + + +test=# SELECT * FROM bt_page_stats('pg_cast_oid_index', 1); +-[ RECORD 1 ]-+----- +blkno | 1 +type | l +live_items | 256 +dead_items | 0 +avg_item_size | 12 +page_size | 8192 +free_size | 4056 +btpo_prev | 0 +btpo_next | 0 +btpo | 0 +btpo_flags | 3 + + + + + + + bt_page_items(text, int) returns setof record + + + + + bt_page_items returns detailed information about + all of the items on a btree index page. 
For example: + + +test=# SELECT * FROM bt_page_items('pg_cast_oid_index', 1); + itemoffset | ctid | itemlen | nulls | vars | data +------------+---------+---------+-------+------+------------- + 1 | (0,1) | 12 | f | f | 23 27 00 00 + 2 | (0,2) | 12 | f | f | 24 27 00 00 + 3 | (0,3) | 12 | f | f | 25 27 00 00 + 4 | (0,4) | 12 | f | f | 26 27 00 00 + 5 | (0,5) | 12 | f | f | 27 27 00 00 + 6 | (0,6) | 12 | f | f | 28 27 00 00 + 7 | (0,7) | 12 | f | f | 29 27 00 00 + 8 | (0,8) | 12 | f | f | 2a 27 00 00 + + + + - + diff --git a/doc/src/sgml/pgbench.sgml b/doc/src/sgml/pgbench.sgml index 0e9dcfab44..be089f8836 100644 --- a/doc/src/sgml/pgbench.sgml +++ b/doc/src/sgml/pgbench.sgml @@ -1,436 +1,544 @@ + pgbench - + pgbench - pgbench is a simple program to run a benchmark test. - pgbench is a client application of PostgreSQL and runs - with PostgreSQL only. It performs lots of small and simple transactions - including SELECT/UPDATE/INSERT operations then calculates number of - transactions successfully completed within a second (transactions - per second, tps). Targeting data includes a table with at least 100k - tuples. - - - Example outputs from pgbench look like: - - -number of clients: 4 -number of transactions per client: 100 -number of processed transactions: 400/400 -tps = 19.875015(including connections establishing) -tps = 20.098827(excluding connections establishing) - - Similar program called "JDBCBench" already exists, but it requires - Java that may not be available on every platform. Moreover some - people concerned about the overhead of Java that might lead - inaccurate results. So I decided to write in pure C, and named - it "pgbench." + pgbench is a simple program for running benchmark + tests on PostgreSQL. It runs the same sequence of SQL + commands over and over, possibly in multiple concurrent database sessions, + and then calculates the average transaction rate (transactions per second). 
+ By default, pgbench tests a scenario that is + loosely based on TPC-B, involving five SELECT, + UPDATE, and INSERT commands per transaction. + However, it is easy to test other cases by writing your own transaction + script files. - Features of pgbench: + Typical output from pgbench looks like: + + +transaction type: TPC-B (sort of) +scaling factor: 10 +number of clients: 10 +number of transactions per client: 1000 +number of transactions actually processed: 10000/10000 +tps = 85.184871 (including connections establishing) +tps = 85.296346 (excluding connections establishing) + + + The first four lines just report some of the most important parameter + settings. The next line reports the number of transactions completed + and intended (the latter being just the product of number of clients + and number of transactions); these will be equal unless the run + failed before completion. The last two lines report the TPS rate, + figured with and without counting the time to start database sessions. - - - - pgbench is written in C using libpq only. So it is very portable - and easy to install. - - - - - pgbench can simulate concurrent connections using asynchronous - capability of libpq. No threading is required. - - - Overview - - - (optional)Initialize database by: - -pgbench -i <dbname> - - - where <dbname> is the name of database. pgbench uses four tables - accounts, branches, history and tellers. These tables will be - destroyed. Be very careful if you have tables having same - names. Default test data contains: - - -table # of tuples + + + The default TPC-B-like transaction test requires specific tables to be + set up beforehand. pgbench should be invoked with + the -i (initialize) option to create and populate these + tables. (When you are testing a custom script, you don't need this + step, but will instead need to do whatever setup your test needs.) 
+ Initialization looks like: + + +pgbench -i other-options dbname + + + where dbname is the name of the already-created + database to test in. (You may also need -h, + -p, and/or -U options to specify how to + connect to the database server.) + + + + + pgbench -i creates four tables accounts, + branches, history, and + tellers, destroying any existing tables of these names. + Be very careful to use another database if you have tables having these + names! + + + + + At the default scale factor of 1, the tables initially + contain this many rows: + + +table # of rows ------------------------- branches 1 tellers 10 accounts 100000 history 0 - - - You can increase the number of tuples by using -s option. branches, - tellers and accounts tables are created with a fillfactor which is - set using -F option. See below. - - - - Run the benchmark test - -pgbench <dbname> - - - The default configuration is: - - - number of clients: 1 - number of transactions per client: 10 - - - + + + You can (and, for most purposes, probably should) increase the number + of rows by using the -s (scale factor) option. The + -F (fillfactor) option might also be used at this point. + - - <literal>pgbench</literal> options + + Once you have done the necessary setup, you can run your benchmark + with a command that doesn't include -i, that is + + +pgbench options dbname + + + In nearly all cases, you'll need some options to make a useful test. + The most important options are -c (number of clients), + -t (number of transactions), and -f (specify + a custom script file). See below for a full list. + + + + shows options that are used + during database initialization, while + shows options that are used + while running benchmarks, and + shows options that are useful + in both cases. + + +
+ <application>pgbench</application> initialization options - Parameter + Option Description + - -h hostname + -i - - hostname where the backend is running. If this option - is omitted, pgbench will connect to the localhost via - Unix domain socket. - + Required to invoke initialization mode. - -p port + -s scale_factor - - the port number that the backend is accepting. default is - libpq's default, usually 5432. - + Multiply the number of rows generated by the scale factor. + For example, -s 100 will imply 10,000,000 rows + in the accounts table. Default is 1. - -c number_of_clients + -F fillfactor - - Number of clients simulated. default is 1. - + Create the accounts, tellers and + branches tables with the given fillfactor. + Default is 100. + + + + +
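+  
+   Per the -s description above, the scale factor linearly
+   multiplies the row counts created at initialization. As an
+   illustrative sketch (this helper is hypothetical, not part of
+   pgbench), the expected initial table sizes can be
+   computed as:

```python
# Hypothetical helper: expected row counts after "pgbench -i -s N",
# per the scale-factor rules described in the table above.
def initial_row_counts(scale_factor=1):
    return {
        "branches": 1 * scale_factor,
        "tellers": 10 * scale_factor,
        "accounts": 100_000 * scale_factor,
        "history": 0,  # history starts empty regardless of scale
    }
```

+   For example, initial_row_counts(100)["accounts"]
+   gives 10,000,000, matching the -s 100 example above.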
+ + + <application>pgbench</application> benchmarking options + + + + Option + Description + + + + + + -c clients + + Number of clients simulated, that is, number of concurrent database + sessions. Default is 1. - -t number_of_transactions + -t transactions - - Number of transactions each client runs. default is 10. - - - - - -s scaling_factor - - - this should be used with -i (initialize) option. - number of tuples generated will be multiple of the - scaling factor. For example, -s 100 will imply 10M - (10,000,000) tuples in the accounts table. - default is 1. - - - NOTE: scaling factor should be at least - as large as the largest number of clients you intend - to test; else you'll mostly be measuring update contention. - Regular (not initializing) runs using one of the - built-in tests will detect scale based on the number of - branches in the database. For custom (-f) runs it can - be manually specified with this parameter. - - - - - -D varname=value - - - Define a variable. It can be refered to by a script - provided by using -f option. Multiple -D options are allowed. - - - - - -U login - - - Specify db user's login name if it is different from - the Unix login name. - - - - - -P password - - - Specify the db password. CAUTION: using this option - might be a security hole since ps command will - show the password. Use this for TESTING PURPOSE ONLY. - - - - - -n - - - No vacuuming and cleaning the history table prior to the - test is performed. - - - - - -v - - - Do vacuuming before testing. This will take some time. - With neither -n nor -v, pgbench will vacuum tellers and - branches tables only. - - - - - -S - - - Perform select only transactions instead of TPC-B. - + Number of transactions each client runs. Default is 10. -N - - Do not update "branches" and "tellers". This will - avoid heavy update contention on branches and tellers, - while it will not make pgbench supporting TPC-B like - transactions. - + Do not update tellers and branches. 
+ This will avoid update contention on these tables, but + it makes the test case even less like TPC-B. - -f filename + -S - - Read transaction script from file. Detailed - explanation will appear later. - + Perform select-only transactions instead of TPC-B-like test. + + + + -f filename + + Read transaction script from filename. + See below for details. + -N, -S, and -f + are mutually exclusive. + + + + -n + + No vacuuming is performed before running the test. + This option is necessary + if you are running a custom test scenario that does not include + the standard tables accounts, + branches, history, and + tellers. + + + + -v + + Vacuum all four standard tables before running the test. + With neither -n nor -v, pgbench will vacuum + tellers and branches tables, and + will remove all entries in history. + + + + -D varname=value + + Define a variable for use by a custom script (see below). + Multiple -D options are allowed. -C - - Establish connection for each transaction, rather than - doing it just once at beginning of pgbench in the normal - mode. This is useful to measure the connection overhead. - + Establish a new connection for each transaction, rather than + doing it just once per client thread. + This is useful to measure the connection overhead. -l - - Write the time taken by each transaction to a logfile, - with the name "pgbench_log.xxx", where xxx is the PID - of the pgbench process. The format of the log is: - - - client_id transaction_no time file_no time-epoch time-us - - - where time is measured in microseconds, , the file_no is - which test file was used (useful when multiple were - specified with -f), and time-epoch/time-us are a - UNIX epoch format timestamp followed by an offset - in microseconds (suitable for creating a ISO 8601 - timestamp with a fraction of a second) of when - the transaction completed. 
- - - Here are example outputs: - - - 0 199 2241 0 1175850568 995598 - 0 200 2465 0 1175850568 998079 - 0 201 2513 0 1175850569 608 - 0 202 2038 0 1175850569 2663 - + Write the time taken by each transaction to a logfile. + See below for details. - -F fillfactor + -s scale_factor - - Create tables(accounts, tellers and branches) with the given - fillfactor. Default is 100. This should be used with -i - (initialize) option. - + Report the specified scale factor in pgbench's + output. With the built-in tests, this is not necessary; the + correct scale factor will be detected by counting the number of + rows in the branches table. However, when testing + custom benchmarks (-f option), the scale factor + will be reported as 1 unless this option is used. -d - - debug option. - + Print debugging output.
+ + + <application>pgbench</application> common options + + + + Option + Description + + + + + + -h hostname + database server's host + + + -p port + database server's port + + + -U login + username to connect as + + + -P password + password (deprecated — putting this on the command line + is a security hazard) + + + +
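+  
+   As a concrete illustration of combining options from the tables
+   above, a benchmarking invocation can be assembled mechanically.
+   The helper below is a hypothetical sketch (not part of
+   pgbench), and the option values shown are arbitrary:

```python
# Hypothetical helper: build a pgbench command line from the
# benchmarking and common options documented above.
def pgbench_command(dbname, clients=1, transactions=10,
                    host=None, port=None, script=None):
    cmd = ["pgbench", "-c", str(clients), "-t", str(transactions)]
    if host is not None:
        cmd += ["-h", host]          # database server's host
    if port is not None:
        cmd += ["-p", str(port)]     # database server's port
    if script is not None:
        cmd += ["-f", script]        # custom transaction script
    cmd.append(dbname)               # dbname is always the last argument
    return cmd
```

+   For instance, pgbench_command("test", clients=10, transactions=1000)
+   yields the argument list for a ten-client, thousand-transaction run.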
- What is the "transaction" actually performed in pgbench? + What is the <quote>transaction</> actually performed in pgbench? + + + The default transaction script issues seven commands per transaction: + + - begin; - - update accounts set abalance = abalance + :delta where aid = :aid; - - select abalance from accounts where aid = :aid; - - update tellers set tbalance = tbalance + :delta where tid = :tid; - - update branches set bbalance = bbalance + :delta where bid = :bid; - - insert into history(tid,bid,aid,delta) values(:tid,:bid,:aid,:delta); - - end; + BEGIN; + UPDATE accounts SET abalance = abalance + :delta WHERE aid = :aid; + SELECT abalance FROM accounts WHERE aid = :aid; + UPDATE tellers SET tbalance = tbalance + :delta WHERE tid = :tid; + UPDATE branches SET bbalance = bbalance + :delta WHERE bid = :bid; + INSERT INTO history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP); + END; + - If you specify -N, (4) and (5) aren't included in the transaction. + If you specify -N, steps 4 and 5 aren't included in the + transaction. If you specify -S, only the SELECT is + issued. - Script file + Custom Scripts + - pgbench has support for reading a transaction script - from a specified file (-f option). This file should - include SQL commands in each line. SQL command consists of multiple lines - are not supported. Empty lines and lines begging with "--" will be ignored. - - - Multiple -f options are allowed. In this case each - transaction is assigned randomly chosen script. - - - SQL commands can include "meta command" which begins with "\" (back - slash). A meta command takes some arguments separted by white - spaces. Currently following meta command is supported: + pgbench has support for running custom + benchmark scenarios by replacing the default transaction script + (described above) with a transaction script read from a file + (-f option). In this case a transaction + counts as one execution of a script file. 
You can even specify + multiple scripts (multiple -f options), in which + case a random one of the scripts is chosen each time a client session + starts a new transaction. - - - - \set name operand1 [ operator operand2 ] - - Sets the calculated value using "operand1" "operator" - "operand2" to variable "name". If "operator" and "operand2" - are omitted, the value of operand1 is set to variable "name". - - - Example: - - + + The format of a script file is one SQL command per line; multi-line + SQL commands are not supported. Empty lines and lines beginning with + -- are ignored. Script file lines can also be + meta commands, which are interpreted by pgbench + itself, as described below. + + + + There is a simple variable-substitution facility for script files. + Variables can be set by the command-line -D option, + explained above, or by the meta commands explained below. + In addition to any variables preset by -D command-line options, + the variable scale is preset to the current scale factor. + Once set, a variable's + value can be inserted into a SQL command by writing + :variablename. When running more than + one client session, each session has its own set of variables. + + + + Script file meta commands begin with a backslash (\). + Arguments to a meta command are separated by white space. + These meta commands are supported: + + + + + + \set varname operand1 [ operator operand2 ] + + + + + Sets variable varname to a calculated integer value. + Each operand is either an integer constant or a + :variablename reference to a variable + having an integer value. The operator can be + +, -, *, or /. + + + + Example: + \set ntellers 10 * :scale - - - - - \setrandom name min max - - Assigns random integer to name between min and max - - - Example: - - -\setrandom aid 1 100000 - - - - - Variables can be referred to in SQL comands by adding ":" in front - of the varible name. 
- - - Example: - - -SELECT abalance FROM accounts WHERE aid = :aid - - - Variables can also be defined by using -D option. - - - - - \sleep num [us|ms|s] - Causes script execution to sleep for the - specified duration of microseconds (us), milliseconds (ms) or the default - seconds (s). - - - Example: - - -\setrandom millisec 1000 2500 -\sleep : millisec ms - - - - + + + + + + + + \setrandom varname min max + + + + + Sets variable varname to a random integer value + between the limits min and max. + Each limit can be either an integer constant or a + :variablename reference to a variable + having an integer value. + + + + Example: + +\setrandom aid 1 :naccounts + + + + + + + + \sleep number [ us | ms | s ] + + + + + Causes script execution to sleep for the specified duration in + microseconds (us), milliseconds (ms) or seconds + (s). If the unit is omitted then seconds are the default. + number can be either an integer constant or a + :variablename reference to a variable + having an integer value. 
+ + + + Example: + +\sleep 10 ms + + + + + - - Examples - Example, TPC-B like benchmark can be defined as follows(scaling - factor = 1): - - + As an example, the full definition of the built-in TPC-B-like + transaction is: + + \set nbranches :scale \set ntellers 10 * :scale \set naccounts 100000 * :scale \setrandom aid 1 :naccounts \setrandom bid 1 :nbranches \setrandom tid 1 :ntellers -\setrandom delta 1 10000 -BEGIN -UPDATE accounts SET abalance = abalance + :delta WHERE aid = :aid -SELECT abalance FROM accounts WHERE aid = :aid -UPDATE tellers SET tbalance = tbalance + :delta WHERE tid = :tid -UPDATE branches SET bbalance = bbalance + :delta WHERE bid = :bid -INSERT INTO history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, 'now') -END - - - If you want to automatically set the scaling factor from the number of - tuples in branches table, use -s option and shell command like this: +\setrandom delta -5000 5000 +BEGIN; +UPDATE accounts SET abalance = abalance + :delta WHERE aid = :aid; +SELECT abalance FROM accounts WHERE aid = :aid; +UPDATE tellers SET tbalance = tbalance + :delta WHERE tid = :tid; +UPDATE branches SET bbalance = bbalance + :delta WHERE bid = :bid; +INSERT INTO history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP); +END; + + + This script allows each iteration of the transaction to reference + different, randomly-chosen rows. (This example also shows why it's + important for each client session to have its own variables — + otherwise they'd not be independently touching different rows.) - -pgbench -s $(psql -At -c "SELECT count(*) FROM branches") -f tpc_b.sql - + + + + + Per-transaction logging + - Notice that -f option does not execute vacuum and clearing history - table before starting benchmark. + With the -l option, pgbench writes the time + taken by each transaction to a logfile. The logfile will be named + pgbench_log.nnn, where + nnn is the PID of the pgbench process. 
+ The format of the log is: + + + client_id transaction_no time file_no time_epoch time_us + + + where time is the elapsed transaction time in microseconds, + file_no identifies which script file was used + (useful when multiple scripts were specified with -f), + and time_epoch/time_us are a + UNIX epoch format timestamp and an offset + in microseconds (suitable for creating a ISO 8601 + timestamp with fractional seconds) showing when + the transaction completed. + + + + Here are example outputs: + + 0 199 2241 0 1175850568 995598 + 0 200 2465 0 1175850568 998079 + 0 201 2513 0 1175850569 608 + 0 202 2038 0 1175850569 2663 + + + + + + Good Practices + + + It is very easy to use pgbench to produce completely + meaningless numbers. Here are some guidelines to help you get useful + results. + + + + In the first place, never believe any test that runs + for only a few seconds. Increase the -t setting enough + to make the run last at least a few minutes, so as to average out noise. + In some cases you could need hours to get numbers that are reproducible. + It's a good idea to try the test run a few times, to find out if your + numbers are reproducible or not. + + + + For the default TPC-B-like test scenario, the initialization scale factor + (-s) should be at least as large as the largest number of + clients you intend to test (-c); else you'll mostly be + measuring update contention. There are only -s rows in + the branches table, and every transaction wants to + update one of them, so -c values in excess of -s + will undoubtedly result in lots of transactions blocked waiting for + other transactions. + + + + The default test scenario is also quite sensitive to how long it's been + since the tables were initialized: accumulation of dead rows and dead space + in the tables changes the results. To understand the results you must keep + track of the total number of updates and when vacuuming happens. 
If + autovacuum is enabled it can result in unpredictable changes in measured + performance. + + + + A limitation of pgbench is that it can itself become + the bottleneck when trying to test a large number of client sessions. + This can be alleviated by running pgbench on a different + machine from the database server, although low network latency will be + essential. It might even be useful to run several pgbench + instances concurrently, on several client machines, against the same + database server.
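+  
+   The per-transaction log format described earlier (client_id,
+   transaction_no, time, file_no, time_epoch, time_us) lends itself to
+   simple post-processing. The following parser is a hypothetical
+   sketch, not part of pgbench:

```python
from datetime import datetime, timezone

# Hypothetical parser for one "pgbench -l" log line:
#   client_id transaction_no time file_no time_epoch time_us
# where time is the transaction's elapsed time in microseconds and
# time_epoch/time_us give the completion timestamp.
def parse_log_line(line):
    client_id, txn_no, elapsed_us, file_no, epoch, us = map(int, line.split())
    # Combine the epoch seconds and microsecond offset into an
    # ISO 8601 timestamp with fractional seconds.
    completed = datetime.fromtimestamp(epoch, tz=timezone.utc).replace(microsecond=us)
    return {
        "client_id": client_id,
        "transaction_no": txn_no,
        "elapsed_us": elapsed_us,
        "file_no": file_no,
        "completed_at": completed.isoformat(),
    }
```

+   Applied to the first example log line shown earlier
+   ("0 199 2241 0 1175850568 995598"), this reports a 2241-microsecond
+   transaction completed at 2007-04-06T09:09:28.995598 UTC.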
- diff --git a/doc/src/sgml/pgbuffercache.sgml b/doc/src/sgml/pgbuffercache.sgml index e4ecc8f78c..f2507df124 100644 --- a/doc/src/sgml/pgbuffercache.sgml +++ b/doc/src/sgml/pgbuffercache.sgml @@ -1,85 +1,138 @@ + + pg_buffercache - + pg_buffercache - The pg_buffercache module provides a means for examining - what's happening to the buffercache at any given time without having to - restart or rebuild the server with debugging code added. The intent is to - do for the buffercache what pg_locks does for locks. + The pg_buffercache module provides a means for + examining what's happening in the shared buffer cache in real time. + - This module consists of a C function pg_buffercache_pages() - that returns a set of records, plus a view pg_buffercache - to wrapper the function. + The module provides a C function pg_buffercache_pages + that returns a set of records, plus a view + pg_buffercache that wraps the function for + convenient use. + - By default public access is REVOKED from both of these, just in case there + By default public access is revoked from both of these, just in case there are security issues lurking. - Notes + The <structname>pg_buffercache</structname> view + - The definition of the columns exposed in the view is: + The definitions of the columns exposed by the view are: - - Column | references | Description - ----------------+----------------------+------------------------------------ - bufferid | | Id, 1..shared_buffers. - relfilenode | pg_class.relfilenode | Refilenode of the relation. - reltablespace | pg_tablespace.oid | Tablespace oid of the relation. - reldatabase | pg_database.oid | Database for the relation. - relblocknumber | | Offset of the page in the relation. - isdirty | | Is the page dirty? 
- usagecount | | Page LRU count - + + + <structname>pg_buffercache</> Columns + + + + + Name + Type + References + Description + + + + + + bufferid + integer + + ID, in the range 1..shared_buffers + + + + relfilenode + oid + pg_class.relfilenode + Relfilenode of the relation + + + + reltablespace + oid + pg_tablespace.oid + Tablespace OID of the relation + + + + reldatabase + oid + pg_database.oid + Database OID of the relation + + + + relblocknumber + bigint + + Page number within the relation + + + + isdirty + boolean + + Is the page dirty? + + + + usagecount + smallint + + Page LRU count + + + + +
+ There is one row for each buffer in the shared cache. Unused buffers are - shown with all fields null except bufferid. + shown with all fields null except bufferid. Shared system + catalogs are shown as belonging to database zero. + - Because the cache is shared by all the databases, there are pages from - relations not belonging to the current database. + Because the cache is shared by all the databases, there will normally be + pages from relations not belonging to the current database. This means + that there may not be matching join rows in pg_class for + some rows, or that there could even be incorrect joins. If you are + trying to join against pg_class, it's a good idea to + restrict the join to rows having reldatabase equal to + the current database's OID or zero. + - When the pg_buffercache view is accessed, internal buffer manager locks are - taken, and a copy of the buffer cache data is made for the view to display. - This ensures that the view produces a consistent set of results, while not - blocking normal buffer activity longer than necessary. Nonetheless there + When the pg_buffercache view is accessed, internal buffer + manager locks are taken for long enough to copy all the buffer state + data that the view will display. + This ensures that the view produces a consistent set of results, while not + blocking normal buffer activity longer than necessary. Nonetheless there could be some impact on database performance if this view is read often.
Sample output + - regression=# \d pg_buffercache; - View "public.pg_buffercache" - Column | Type | Modifiers - ----------------+----------+----------- - bufferid | integer | - relfilenode | oid | - reltablespace | oid | - reldatabase | oid | - relblocknumber | bigint | - isdirty | boolean | - usagecount | smallint | - - View definition: - SELECT p.bufferid, p.relfilenode, p.reltablespace, p.reldatabase, - p.relblocknumber, p.isdirty, p.usagecount - FROM pg_buffercache_pages() p(bufferid integer, relfilenode oid, - reltablespace oid, reldatabase oid, relblocknumber bigint, - isdirty boolean, usagecount smallint); - regression=# SELECT c.relname, count(*) AS buffers - FROM pg_class c INNER JOIN pg_buffercache b - ON b.relfilenode = c.relfilenode INNER JOIN pg_database d - ON (b.reldatabase = d.oid AND d.datname = current_database()) + FROM pg_buffercache b INNER JOIN pg_class c + ON b.relfilenode = c.relfilenode AND + b.reldatabase IN (0, (SELECT oid FROM pg_database + WHERE datname = current_database())) GROUP BY c.relname ORDER BY 2 DESC LIMIT 10; relname | buffers @@ -95,26 +148,23 @@ pg_depend | 22 pg_depend_reference_index | 20 (10 rows) - - regression=# Authors - - - - Mark Kirkwood markir@paradise.net.nz - - - - Design suggestions: Neil Conway neilc@samurai.com - - - Debugging advice: Tom Lane tgl@sss.pgh.pa.us - - + + + Mark Kirkwood markir@paradise.net.nz + + + + Design suggestions: Neil Conway neilc@samurai.com + + + + Debugging advice: Tom Lane tgl@sss.pgh.pa.us +
diff --git a/doc/src/sgml/pgfreespacemap.sgml b/doc/src/sgml/pgfreespacemap.sgml index 66748c3ef4..bc821ead6b 100644 --- a/doc/src/sgml/pgfreespacemap.sgml +++ b/doc/src/sgml/pgfreespacemap.sgml @@ -1,184 +1,203 @@ + + pg_freespacemap - + pg_freespacemap - This module provides a means for examining the free space map (FSM). It - consists of two C functions: pg_freespacemap_relations() - and pg_freespacemap_pages() that return a set - of records, plus two views pg_freespacemap_relations and - pg_freespacemap_pages for more user-friendly access to - the functions. + The pg_freespacemap module provides a means for examining the + free space map (FSM). It provides two C functions: + pg_freespacemap_relations and + pg_freespacemap_pages that each return a set of + records, plus two views pg_freespacemap_relations + and pg_freespacemap_pages that wrap the functions + for convenient use. + - The module provides the ability to examine the contents of the free space - map, without having to restart or rebuild the server with additional - debugging code. - - - By default public access is REVOKED from the functions and views, just in - case there are security issues present in the code. + By default public access is revoked from the functions and views, just in + case there are security issues lurking. - Notes + The <filename>pg_freespacemap</> views + - The definitions for the columns exposed in the views are: + The definitions of the columns exposed by the views are: - pg_freespacemap_relations - + <structname>pg_freespacemap_relations</> Columns + + - Column - references + Name + Type + References Description + - reltablespace - pg_tablespace.oid - Tablespace oid of the relation. + reltablespace + oid + pg_tablespace.oid + Tablespace OID of the relation - reldatabase - pg_database.oid - Database oid of the relation. + reldatabase + oid + pg_database.oid + Database OID of the relation - relfilenode - pg_class.relfilenode - Relfilenode of the relation. 
+ relfilenode + oid + pg_class.relfilenode + Relfilenode of the relation - avgrequest + avgrequest + integer Moving average of free space requests (NULL for indexes) - interestingpages + interestingpages + integer - Count of pages last reported as containing useful free space. + Count of pages last reported as containing useful free space - storedpages + storedpages + integer - Count of pages actually stored in free space map. + Count of pages actually stored in free space map - nextpage + nextpage + integer - Page index (from 0) to start next search at. + Page index (from 0) to start next search at +
- pg_freespacemap_pages - + <structname>pg_freespacemap_pages</> Columns + + - Column - references + Name + Type + References Description + - reltablespace - pg_tablespace.oid - Tablespace oid of the relation. + reltablespace + oid + pg_tablespace.oid + Tablespace OID of the relation - reldatabase - pg_database.oid - Database oid of the relation. + reldatabase + oid + pg_database.oid + Database OID of the relation - relfilenode - pg_class.relfilenode - Relfilenode of the relation. + relfilenode + oid + pg_class.relfilenode + Relfilenode of the relation - relblocknumber + relblocknumber + bigint - Page number in the relation. + Page number within the relation - bytes + bytes + integer - Free bytes in the page, or NULL for an index page (see below). + Free bytes in the page, or NULL for an index page (see below) +
- For pg_freespacemap_relations, there is one row for each - relation in the free space map. storedpages is the - number of pages actually stored in the map, while - interestingpages is the number of pages the last VACUUM - thought had useful amounts of free space. + For pg_freespacemap_relations, there is one row + for each relation in the free space map. + storedpages is the number of pages actually + stored in the map, while interestingpages is the + number of pages the last VACUUM thought had useful amounts of + free space. + - If storedpages is consistently less than interestingpages - then it'd be a good idea to increase max_fsm_pages. Also, - if the number of rows in pg_freespacemap_relations is - close to max_fsm_relations, then you should consider - increasing max_fsm_relations. + If storedpages is consistently less than + interestingpages then it'd be a good idea to increase + max_fsm_pages. Also, if the number of rows in + pg_freespacemap_relations is close to + max_fsm_relations, then you should consider increasing + max_fsm_relations. + - For pg_freespacemap_pages, there is one row for each page - in the free space map. The number of rows for a relation will match the - storedpages column in - pg_freespacemap_relations. + For pg_freespacemap_pages, there is one row for + each page in the free space map. The number of rows for a relation will + match the storedpages column in + pg_freespacemap_relations. + For indexes, what is tracked is entirely-unused pages, rather than free space within pages. Therefore, the average request size and free bytes within a page are not meaningful, and are shown as NULL. + - Because the map is shared by all the databases, it will include relations - not belonging to the current database. + Because the map is shared by all the databases, there will normally be + entries for relations not belonging to the current database. 
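+  
+   The storedpages versus interestingpages
+   comparison above can be mechanized. The following toy sketch
+   (hypothetical, not part of the module) applies that heuristic to
+   (storedpages, interestingpages) pairs sampled over time from
+   pg_freespacemap_relations:

```python
# Hypothetical sketch of the tuning heuristic described above:
# if storedpages is consistently less than interestingpages,
# max_fsm_pages is probably too small.
def fsm_looks_undersized(samples):
    """samples: iterable of (storedpages, interestingpages) pairs
    taken from pg_freespacemap_relations over time."""
    samples = list(samples)
    # "Consistently less" is read here as: every sample stored fewer
    # pages than the last VACUUM considered interesting.
    return bool(samples) and all(stored < interesting
                                 for stored, interesting in samples)
```

+   This is only a rough aid; increase max_fsm_pages
+   based on repeated observations, not a single reading.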
This means + that there may not be matching join rows in pg_class for + some rows, or that there could even be incorrect joins. If you are + trying to join against pg_class, it's a good idea to + restrict the join to rows having reldatabase equal to + the current database's OID or zero. + - When either of the views are accessed, internal free space map locks are - taken, and a copy of the map data is made for them to display. - This ensures that the views produce a consistent set of results, while not - blocking normal activity longer than necessary. Nonetheless there + When either of the views is accessed, internal free space map locks are + taken for long enough to copy all the state data that the view will display. + This ensures that the views produce a consistent set of results, while not + blocking normal activity longer than necessary. Nonetheless there could be some impact on database performance if they are read often.
- Sample output - pg_freespacemap_relations - -regression=# \d pg_freespacemap_relations -View "public.pg_freespacemap_relations" - Column | Type | Modifiers -------------------+---------+----------- - reltablespace | oid | - reldatabase | oid | - relfilenode | oid | - avgrequest | integer | - interestingpages | integer | - storedpages | integer | - nextpage | integer | -View definition: - SELECT p.reltablespace, p.reldatabase, p.relfilenode, p.avgrequest, p.interestingpages, p.storedpages, p.nextpage - FROM pg_freespacemap_relations() p(reltablespace oid, reldatabase oid, relfilenode oid, avgrequest integer, interestingpages integer, storedpages integer, nextpage integer); + Sample output + regression=# SELECT c.relname, r.avgrequest, r.interestingpages, r.storedpages FROM pg_freespacemap_relations r INNER JOIN pg_class c - ON c.relfilenode = r.relfilenode INNER JOIN pg_database d - ON r.reldatabase = d.oid AND (d.datname = current_database()) + ON r.relfilenode = c.relfilenode AND + r.reldatabase IN (0, (SELECT oid FROM pg_database + WHERE datname = current_database())) ORDER BY r.storedpages DESC LIMIT 10; relname | avgrequest | interestingpages | storedpages ---------------------------------+------------+------------------+------------- @@ -193,31 +212,14 @@ regression=# SELECT c.relname, r.avgrequest, r.interestingpages, r.storedpages pg_class_relname_nsp_index | | 10 | 10 pg_proc | 302 | 8 | 8 (10 rows) - - - - - Sample output - pg_freespacemap_pages - -regression=# \d pg_freespacemap_pages - View "public.pg_freespacemap_pages" - Column | Type | Modifiers -----------------+---------+----------- - reltablespace | oid | - reldatabase | oid | - relfilenode | oid | - relblocknumber | bigint | - bytes | integer | -View definition: - SELECT p.reltablespace, p.reldatabase, p.relfilenode, p.relblocknumber, p.bytes - FROM pg_freespacemap_pages() p(reltablespace oid, reldatabase oid, relfilenode oid, relblocknumber bigint, bytes integer); regression=# SELECT c.relname, 
p.relblocknumber, p.bytes FROM pg_freespacemap_pages p INNER JOIN pg_class c - ON c.relfilenode = p.relfilenode INNER JOIN pg_database d - ON (p.reldatabase = d.oid AND d.datname = current_database()) + ON p.relfilenode = c.relfilenode AND + p.reldatabase IN (0, (SELECT oid FROM pg_database + WHERE datname = current_database())) ORDER BY c.relname LIMIT 10; - relname | relblocknumber | bytes + relname | relblocknumber | bytes --------------+----------------+------- a_star | 0 | 8040 abstime_tbl | 0 | 7908 @@ -235,8 +237,10 @@ regression=# SELECT c.relname, p.relblocknumber, p.bytes Author + Mark Kirkwood markir@paradise.net.nz +
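The sample queries above join the free space map views to pg_class via relfilenode. Building on that same pattern, here is a hedged sketch (assuming the 8.3-era pg_freespacemap_pages view shown above; relfilenode collisions across tablespaces are ignored for simplicity) that totals the free space the FSM is tracking for each relation:

```sql
-- Sum the FSM-tracked free space per relation in the current database
-- (reldatabase = 0 covers shared catalogs).  Illustrative sketch only.
SELECT c.relname,
       count(*)     AS stored_pages,
       sum(p.bytes) AS tracked_free_bytes
  FROM pg_freespacemap_pages p
  JOIN pg_class c ON c.relfilenode = p.relfilenode
 WHERE p.reldatabase IN (0, (SELECT oid FROM pg_database
                             WHERE datname = current_database()))
 GROUP BY c.relname
 ORDER BY tracked_free_bytes DESC
 LIMIT 10;
```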
diff --git a/doc/src/sgml/pgrowlocks.sgml b/doc/src/sgml/pgrowlocks.sgml index 140b33387f..6fe56aa4df 100644 --- a/doc/src/sgml/pgrowlocks.sgml +++ b/doc/src/sgml/pgrowlocks.sgml @@ -1,115 +1,122 @@ + pgrowlocks - + pgrowlocks - The pgrowlocks module provides a function to show row + The pgrowlocks module provides a function to show row locking information for a specified table. Overview - -pgrowlocks(text) RETURNS pgrowlocks_type - + + +pgrowlocks(text) returns setof record + + - The parameter is a name of table. And pgrowlocks_type is - defined as: + The parameter is the name of a table. The result is a set of records, + with one row for each locked row within the table. The output columns + are: - -CREATE TYPE pgrowlocks_type AS ( - locked_row TID, -- row TID - lock_type TEXT, -- lock type - locker XID, -- locking XID - multi bool, -- multi XID? - xids xid[], -- multi XIDs - pids INTEGER[] -- locker's process id -); - - pgrowlocks_type - + <function>pgrowlocks</> output columns + + + + + Name + Type + Description + + + - locked_row - tuple ID(TID) of each locked rows + locked_row + tid + Tuple ID (TID) of locked row - lock_type - "Shared" for shared lock, "Exclusive" for exclusive lock + lock_type + text + Shared for shared lock, or + Exclusive for exclusive lock - locker - transaction ID of locker (Note 1) + locker + xid + Transaction ID of locker, or multixact ID if multi-transaction - multi - "t" if locker is a multi transaction, otherwise "f" + multi + boolean + True if locker is a multi-transaction - xids - XIDs of lockers (Note 2) + xids + xid[] + Transaction IDs of lockers (more than one if multi-transaction) - pids - process ids of locking backends + pids + integer[] + Process IDs of locking backends (more than one if multi-transaction) +
- - Note1: If the locker is multi transaction, it represents the multi ID. - - - Note2: If the locker is multi, multiple data are shown. - - The calling sequence for pgrowlocks is as follows: - pgrowlocks grabs AccessShareLock for the target table and - reads each row one by one to get the row locking information. You should - notice that: + pgrowlocks takes AccessShareLock for the + target table and reads each row one by one to collect the row locking + information. This is not very speedy for a large table. Note that: + - if the table is exclusive locked by someone else, - pgrowlocks will be blocked. + If the table as a whole is exclusive-locked by someone else, + pgrowlocks will be blocked. - pgrowlocks may show incorrect information if there's a - new lock or a lock is freeed while its execution. + pgrowlocks is not guaranteed to produce a + self-consistent snapshot. It is possible that a new row lock is taken, + or an old lock is freed, during its execution. + - pgrowlocks does not show the contents of locked rows. If - you want to take a look at the row contents at the same time, you could do - something like this: - + pgrowlocks does not show the contents of locked + rows. If you want to take a look at the row contents at the same time, you + could do something like this: + -SELECT * FROM accounts AS a, pgrowlocks('accounts') AS p WHERE p.locked_ row = a.ctid; +SELECT * FROM accounts AS a, pgrowlocks('accounts') AS p + WHERE p.locked_row = a.ctid; + + Be aware however that (as of PostgreSQL 8.3) such a + query will be very inefficient. +
- Example - - pgrowlocks returns the following columns: - - - Here is a sample execution of pgrowlocks: - + Sample output + test=# SELECT * FROM pgrowlocks('t1'); - locked_row | lock_type | locker | multi | xids | pids + locked_row | lock_type | locker | multi | xids | pids ------------+-----------+--------+-------+-----------+--------------- (0,1) | Shared | 19 | t | {804,805} | {29066,29068} (0,2) | Shared | 19 | t | {804,805} | {29066,29068} @@ -117,7 +124,14 @@ test=# SELECT * FROM pgrowlocks('t1'); (0,4) | Exclusive | 804 | f | {804} | {29066} (4 rows) - -
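Since the pids column holds the process IDs of the locking backends, the result of pgrowlocks can also be joined against pg_stat_activity to see who is holding each row lock. A sketch, assuming the t1 table from the sample above and the 8.3-era pg_stat_activity column names procpid and current_query:

```sql
-- Match each locked row in t1 to the sessions holding the lock.
-- procpid / current_query are the pre-9.2 pg_stat_activity names.
SELECT p.locked_row, p.lock_type, a.procpid, a.current_query
  FROM pgrowlocks('t1') AS p,
       pg_stat_activity AS a
 WHERE a.procpid = ANY (p.pids);
```

As with pgrowlocks itself, the result is only a momentary picture; locks can come and go while the query runs.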
+ + Author + + + Tatsuo Ishii + + + + diff --git a/doc/src/sgml/pgstattuple.sgml b/doc/src/sgml/pgstattuple.sgml index eaa3a54703..ad52dcd442 100644 --- a/doc/src/sgml/pgstattuple.sgml +++ b/doc/src/sgml/pgstattuple.sgml @@ -1,29 +1,35 @@ + pgstattuple - + pgstattuple - pgstattuple modules provides various functions to obtain - tuple statistics. + The pgstattuple module provides various functions to + obtain tuple-level statistics. Functions - - - - pgstattuple() returns the relation length, percentage - of the "dead" tuples of a relation and other info. This may help users to - determine whether vacuum is necessary or not. Here is an example session: - - -test=> \x -Expanded display is on. + + + + pgstattuple(text) returns record + + + + + pgstattuple returns a relation's physical length, + percentage of dead tuples, and other info. This may help users + to determine whether vacuum is necessary or not. The argument is the + target relation's name (optionally schema-qualified). + For example: + + test=> SELECT * FROM pgstattuple('pg_catalog.pg_proc'); -[ RECORD 1 ]------+------- table_len | 458752 @@ -35,86 +41,111 @@ dead_tuple_len | 3157 dead_tuple_percent | 0.69 free_space | 8932 free_percent | 1.95 - + + - Here are explanations for each column: + The output columns are: - + - <literal>pgstattuple()</literal> column descriptions - + <function>pgstattuple</function> output columns + Column + Type Description + - table_len - physical relation length in bytes + table_len + bigint + Physical relation length in bytes - tuple_count - number of live tuples + tuple_count + bigint + Number of live tuples - tuple_len - total tuples length in bytes + tuple_len + bigint + Total length of live tuples in bytes - tuple_percent - live tuples in % + tuple_percent + float8 + Percentage of live tuples - dead_tuple_len - total dead tuples length in bytes + dead_tuple_count + bigint + Number of dead tuples - dead_tuple_percent - dead tuples in % + dead_tuple_len + bigint + Total 
length of dead tuples in bytes - free_space - free space in bytes + dead_tuple_percent + float8 + Percentage of dead tuples - free_percent - free space in % + free_space + bigint + Total free space in bytes + + free_percent + float8 + Percentage of free space + +
- - - - pgstattuple acquires only a read lock on the relation. So - concurrent update may affect the result. - - - - - pgstattuple judges a tuple is "dead" if HeapTupleSatisfiesNow() - returns false. - - - -
- - - pg_relpages() returns the number of pages in the relation. + pgstattuple acquires only a read lock on the + relation. So the results do not reflect an instantaneous snapshot; + concurrent updates will affect them. - - - pgstatindex() returns an array showing the information about an index: + pgstattuple judges a tuple is dead if + HeapTupleSatisfiesNow returns false. - -test=> \x -Expanded display is on. + + + + + + pgstattuple(oid) returns record + + + + + This is the same as pgstattuple(text), except + that the target relation is specified by OID. + + + + + + + pgstatindex(text) returns record + + + + + pgstatindex returns a record showing information + about a btree index. For example: + + test=> SELECT * FROM pgstatindex('pg_cast_oid_index'); -[ RECORD 1 ]------+------ version | 2 @@ -128,31 +159,116 @@ deleted_pages | 0 avg_leaf_density | 50.27 leaf_fragmentation | 0 - -
+ + + The output columns are: + + + + <function>pgstatindex</function> output columns + + + + Column + Type + Description + + + + + + version + integer + Btree version number + + + + tree_level + integer + Tree level of the root page + + + + index_size + integer + Total number of pages in index + + + + root_block_no + integer + Location of root block + + + + internal_pages + integer + Number of internal (upper-level) pages + + + + leaf_pages + integer + Number of leaf pages + + + + empty_pages + integer + Number of empty pages + + + + deleted_pages + integer + Number of deleted pages + + + + avg_leaf_density + float8 + Average density of leaf pages + + + + leaf_fragmentation + float8 + Leaf page fragmentation + + + + +
+ + + As with pgstattuple, the results are accumulated + page-by-page, and should not be expected to represent an + instantaneous snapshot of the whole index. + + + + + + + pg_relpages(text) returns integer + + + + + pg_relpages returns the number of pages in the + relation. + + + +
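Since a relation's physical size is simply its page count times the block size, pg_relpages offers a cheap cross-check against pgstattuple's table_len. A small sketch, assuming the block_size setting is readable via current_setting:

```sql
-- pages * block size should equal pgstattuple's table_len for the
-- same relation.  Illustrative sketch only.
SELECT pg_relpages('pg_catalog.pg_proc')
       * current_setting('block_size')::integer AS bytes_from_pages;
```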
- Usage - - pgstattuple may be called as a relation function and is - defined as follows: - - - CREATE OR REPLACE FUNCTION pgstattuple(text) RETURNS pgstattuple_type - AS 'MODULE_PATHNAME', 'pgstattuple' - LANGUAGE C STRICT; + Author - CREATE OR REPLACE FUNCTION pgstattuple(oid) RETURNS pgstattuple_type - AS 'MODULE_PATHNAME', 'pgstattuplebyid' - LANGUAGE C STRICT; - - The argument is the relation name (optionally it may be qualified) - or the OID of the relation. Note that pgstattuple only returns - one row. + Tatsuo Ishii
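Because there is a pgstattuple(oid) variant, the function can be applied across many tables in a single query, for example to rank vacuum candidates by dead-tuple percentage. A hedged sketch; note that this scans every listed table in full, so it is expensive, and it calls pgstattuple twice per row purely for readability:

```sql
-- Rank ordinary tables in schema 'public' by dead-tuple percentage.
-- (pgstattuple(oid)).field expands one field of the result record.
SELECT relname,
       (pgstattuple(oid)).dead_tuple_percent AS dead_pct,
       (pgstattuple(oid)).free_percent       AS free_pct
  FROM pg_class
 WHERE relkind = 'r'
   AND relnamespace = (SELECT oid FROM pg_namespace
                        WHERE nspname = 'public')
 ORDER BY dead_pct DESC
 LIMIT 5;
```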
- diff --git a/doc/src/sgml/pgtrgm.sgml b/doc/src/sgml/pgtrgm.sgml index b7061481a9..5b630f203d 100644 --- a/doc/src/sgml/pgtrgm.sgml +++ b/doc/src/sgml/pgtrgm.sgml @@ -1,90 +1,120 @@ + + pg_trgm - + pg_trgm - The pg_trgm module provides functions and index classes - for determining the similarity of text based on trigram matching. + The pg_trgm module provides functions and operators + for determining the similarity of text based on trigram matching, as + well as index operator classes that support fast searching for similar + strings. - Trigram (or Trigraph) + Trigram (or Trigraph) Concepts + - A trigram is a set of three consecutive characters taken - from a string. A string is considered to have two spaces - prefixed and one space suffixed when determining the set - of trigrams that comprise the string. - - - eg. The set of trigrams in the word "cat" is " c", " ca", - "at " and "cat". + A trigram is a group of three consecutive characters taken + from a string. We can measure the similarity of two strings by + counting the number of trigrams they share. This simple idea + turns out to be very effective for measuring the similarity of + words in many natural languages. + + + + A string is considered to have two spaces + prefixed and one space suffixed when determining the set + of trigrams contained in the string. + For example, the set of trigrams in the string + cat is + c, + ca, + cat, and + at . + + - Public Functions - - <literal>pg_trgm</literal> functions - + Functions and Operators + +
+ <filename>pg_trgm</filename> functions + Function + Returns Description + - real similarity(text, text) + similarity(text, text) + real - - Returns a number that indicates how closely matches the two - arguments are. A zero result indicates that the two words - are completely dissimilar, and a result of one indicates that - the two words are identical. - + Returns a number that indicates how similar the two arguments are. + The range of the result is zero (indicating that the two strings are + completely dissimilar) to one (indicating that the two strings are + identical). - real show_limit() + show_trgm(text) + text[] - - Returns the current similarity threshold used by the '%' - operator. This in effect sets the minimum similarity between - two words in order that they be considered similar enough to - be misspellings of each other, for example. - + Returns an array of all the trigrams in the given string. + (In practice this is seldom useful except for debugging.) - real set_limit(real) + show_limit() + real - - Sets the current similarity threshold that is used by the '%' - operator, and is returned by the show_limit() function. - + Returns the current similarity threshold used by the % + operator. This sets the minimum similarity between + two words for them to be considered similar enough to + be misspellings of each other, for example. - text[] show_trgm(text) + set_limit(real) + real - - Returns an array of all the trigrams of the supplied text - parameter. - + Sets the current similarity threshold that is used by the % + operator. The threshold must be between 0 and 1 (default is 0.3). + Returns the same value passed in. + + +
+ + + <filename>pg_trgm</filename> operators + + - Operator: text % text (returns boolean) + Operator + Returns + Description + + + + + + text % text + boolean - - The '%' operator returns TRUE if its two arguments have a similarity - that is greater than the similarity threshold set by set_limit(). It - will return FALSE if the similarity is less than the current - threshold. - + Returns true if its arguments have a similarity that is + greater than the current similarity threshold set by + set_limit. @@ -93,97 +123,111 @@ - Public Index Operator Class + Index Support + - The pg_trgm module comes with the - gist_trgm_ops index operator class that allows a - developer to create an index over a text column for the purpose - of very fast similarity searches. + The pg_trgm module provides GiST and GIN index + operator classes that allow you to create an index over a text column for + the purpose of very fast similarity searches. These index types support + the % similarity operator (and no other operators, so you may + want a regular btree index too). + - To use this index, the '%' operator must be used and an appropriate - similarity threshold for the application must be set. Example: - + Example: + CREATE TABLE test_trgm (t text); CREATE INDEX trgm_idx ON test_trgm USING gist (t gist_trgm_ops); +or + +CREATE INDEX trgm_idx ON test_trgm USING gin (t gin_trgm_ops); + + + - At this point, you will have an index on the t text column that you - can use for similarity searching. Example: + At this point, you will have an index on the t column that + you can use for similarity searching. A typical query is -SELECT - t, - similarity(t, 'word') AS sml -FROM - test_trgm -WHERE - t % 'word' -ORDER BY - sml DESC, t; +SELECT t, similarity(t, 'word') AS sml + FROM test_trgm + WHERE t % 'word' + ORDER BY sml DESC, t; This will return all values in the text column that are sufficiently - similar to 'word', sorted from best match to worst. 
The index will - be used to make this a fast operation over very large data sets. + similar to word, sorted from best match to worst. The + index will be used to make this a fast operation even over very large data + sets. + + + + The choice between GiST and GIN indexing depends on the relative + performance characteristics of GiST and GIN, which are discussed elsewhere. + As a rule of thumb, a GIN index is faster to search than a GiST index, but + slower to build or update; so GIN is better suited for static data and GiST + for often-updated data. Text Search Integration + Trigram matching is a very useful tool when used in conjunction - with a full text index. + with a full text index. In particular it can help to recognize + misspelled input words that will not be matched directly by the + full text search mechanism. + The first step is to generate an auxiliary table containing all the unique words in the documents: + -CREATE TABLE words AS SELECT word FROM - stat('SELECT to_tsvector(''simple'', bodytext) FROM documents'); +CREATE TABLE words AS SELECT word FROM + ts_stat('SELECT to_tsvector(''simple'', bodytext) FROM documents'); + where documents is a table that has a text field - bodytext that we wish to search. The use of the - simple configuration with the to_tsvector - function, instead of just using the already - existing vector is to avoid creating a list of already stemmed - words. This way, only the original, unstemmed words are added - to the word list. + bodytext that we wish to search. The reason for using + the simple configuration with the to_tsvector + function, instead of using a language-specific configuration, + is that we want a list of the original (unstemmed) words. 
+ Next, create a trigram index on the word column: + -CREATE INDEX words_idx ON words USING gist(word gist_trgm_ops); +CREATE INDEX words_idx ON words USING gin(word gin_trgm_ops); + - or - - -CREATE INDEX words_idx ON words USING gin(word gist_trgm_ops); - - - Now, a SELECT query similar to the example above can be - used to suggest spellings for misspelled words in user search terms. A - useful extra clause is to ensure that the similar words are also - of similar length to the misspelled word. - - - - - Since the words table has been generated as a separate, - static table, it will need to be periodically regenerated so that - it remains up to date with the document collection. - - + Now, a SELECT query similar to the previous example can + be used to suggest spellings for misspelled words in user search terms. + A useful extra test is to require that the selected words are also of + similar length to the misspelled word. + + + + Since the words table has been generated as a separate, + static table, it will need to be periodically regenerated so that + it remains reasonably up-to-date with the document collection. + Keeping it exactly current is usually unnecessary. + + References + GiST Development Site @@ -196,6 +240,7 @@ CREATE INDEX words_idx ON words USING gin(word gist_trgm_ops); Authors + Oleg Bartunov oleg@sai.msu.su, Moscow, Moscow University, Russia @@ -203,7 +248,7 @@ CREATE INDEX words_idx ON words USING gin(word gist_trgm_ops); Teodor Sigaev teodor@sigaev.ru, Moscow, Delta-Soft Ltd.,Russia - Documentation: Christopher Kings-Lynne + Documentation: Christopher Kings-Lynne This module is sponsored by Delta-Soft Ltd., Moscow, Russia.
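The "similar length" test suggested for the spelling-suggestion query can be written directly in SQL. A sketch, assuming the words table and trigram index built in the Text Search Integration section; 'caat' stands in for an arbitrary misspelled user term:

```sql
-- Suggest corrections for a misspelled term, keeping only candidate
-- words within two characters of the input's length.
SELECT word, similarity(word, 'caat') AS sml
  FROM words
 WHERE word % 'caat'                            -- uses the trigram index
   AND abs(length(word) - length('caat')) <= 2  -- similar-length filter
 ORDER BY sml DESC, word
 LIMIT 10;
```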