postgresql/contrib/pageinspect/expected/gist.out

Ignoring revisions in .git-blame-ignore-revs. Click here to bypass and see the normal blame view.

134 lines
5.4 KiB
Plaintext
Raw Normal View History

-- The gist_page_opaque_info() function prints the page's LSN. Normally,
-- that's constant 1 (GistBuildLSN) on every page of a freshly built GiST
-- index. But with wal_level=minimal, the whole relation is dumped to WAL at
-- the end of the transaction if it's smaller than wal_skip_threshold, which
-- updates the LSNs. Wrap the tests on gist_page_opaque_info() in the
-- same transaction with the CREATE INDEX so that we see the LSNs before
-- they are possibly overwritten at end of transaction.
BEGIN;
-- Create a test table and GiST index.
CREATE TABLE test_gist AS SELECT point(i,i) p, i::text t FROM
generate_series(1,1000) i;
CREATE INDEX test_gist_idx ON test_gist USING gist (p);
-- Page 0 is the root, the rest are leaf pages
SELECT * FROM gist_page_opaque_info(get_raw_page('test_gist_idx', 0));
lsn | nsn | rightlink | flags
-----+-----+------------+-------
0/1 | 0/0 | 4294967295 | {}
(1 row)
SELECT * FROM gist_page_opaque_info(get_raw_page('test_gist_idx', 1));
lsn | nsn | rightlink | flags
-----+-----+------------+--------
0/1 | 0/0 | 4294967295 | {leaf}
(1 row)
SELECT * FROM gist_page_opaque_info(get_raw_page('test_gist_idx', 2));
lsn | nsn | rightlink | flags
-----+-----+-----------+--------
0/1 | 0/0 | 1 | {leaf}
(1 row)
COMMIT;
SELECT * FROM gist_page_items(get_raw_page('test_gist_idx', 0), 'test_gist_idx');
pageinspect: Fix gist_page_items() with included columns Non-leaf pages of GiST indexes contain key attributes, leaf pages contain both key and non-key attributes, and gist_page_items() ignored the handling of non-key attributes. This caused a few problems when using gist_page_items() on a GiST index with INCLUDE: - On a non-leaf page, the function would crash. - On a leaf page, the function would work, but miss to display all the values for included attributes. This commit fixes gist_page_items() to handle such cases in a more appropriate way, and now displays the values of key and non-key attributes for each item separately in a style consistent with what ruleutils.c would generate for the attribute list, depending on the page type dealt with. In a way similar to how a record is displayed, values would be double-quoted for key or non-key attributes if required. ruleutils.c did not provide a routine able to control if non-key attributes should be displayed, so an extended() routine for index definitions is added to work around the leaf and non-leaf page differences. While on it, this commit fixes a third problem related to the amount of data reported for key attributes. The code originally relied on BuildIndexValueDescription() (used for error reports on constraints) that would not print all the data stored in the index but the index opclass's input type, so this limited the amount of information available. This switch makes gist_page_items() much cheaper as there is no need to run ACL checks for each item printed, which is not an issue anyway as superuser rights are required to execute the functions of pageinspect. Opclasses whose data cannot be displayed can rely on gist_page_items_bytea(). The documentation of this function was slightly incorrect for the output results generated on HEAD and v15, so adjust it on these branches. Author: Alexander Lakhin, Michael Paquier Discussion: https://postgr.es/m/17884-cb8c326522977acb@postgresql.org Backpatch-through: 14
2023-05-19 05:37:58 +02:00
itemoffset | ctid | itemlen | dead | keys
------------+-----------+---------+------+-------------------------------
1 | (1,65535) | 40 | f | (p)=("(185,185),(1,1)")
2 | (2,65535) | 40 | f | (p)=("(370,370),(186,186)")
3 | (3,65535) | 40 | f | (p)=("(555,555),(371,371)")
4 | (4,65535) | 40 | f | (p)=("(740,740),(556,556)")
5 | (5,65535) | 40 | f | (p)=("(870,870),(741,741)")
6 | (6,65535) | 40 | f | (p)=("(1000,1000),(871,871)")
(6 rows)
SELECT * FROM gist_page_items(get_raw_page('test_gist_idx', 1), 'test_gist_idx') LIMIT 5;
pageinspect: Fix gist_page_items() with included columns Non-leaf pages of GiST indexes contain key attributes, leaf pages contain both key and non-key attributes, and gist_page_items() ignored the handling of non-key attributes. This caused a few problems when using gist_page_items() on a GiST index with INCLUDE: - On a non-leaf page, the function would crash. - On a leaf page, the function would work, but miss to display all the values for included attributes. This commit fixes gist_page_items() to handle such cases in a more appropriate way, and now displays the values of key and non-key attributes for each item separately in a style consistent with what ruleutils.c would generate for the attribute list, depending on the page type dealt with. In a way similar to how a record is displayed, values would be double-quoted for key or non-key attributes if required. ruleutils.c did not provide a routine able to control if non-key attributes should be displayed, so an extended() routine for index definitions is added to work around the leaf and non-leaf page differences. While on it, this commit fixes a third problem related to the amount of data reported for key attributes. The code originally relied on BuildIndexValueDescription() (used for error reports on constraints) that would not print all the data stored in the index but the index opclass's input type, so this limited the amount of information available. This switch makes gist_page_items() much cheaper as there is no need to run ACL checks for each item printed, which is not an issue anyway as superuser rights are required to execute the functions of pageinspect. Opclasses whose data cannot be displayed can rely on gist_page_items_bytea(). The documentation of this function was slightly incorrect for the output results generated on HEAD and v15, so adjust it on these branches. Author: Alexander Lakhin, Michael Paquier Discussion: https://postgr.es/m/17884-cb8c326522977acb@postgresql.org Backpatch-through: 14
2023-05-19 05:37:58 +02:00
itemoffset | ctid | itemlen | dead | keys
------------+-------+---------+------+---------------------
1 | (0,1) | 40 | f | (p)=("(1,1),(1,1)")
2 | (0,2) | 40 | f | (p)=("(2,2),(2,2)")
3 | (0,3) | 40 | f | (p)=("(3,3),(3,3)")
4 | (0,4) | 40 | f | (p)=("(4,4),(4,4)")
5 | (0,5) | 40 | f | (p)=("(5,5),(5,5)")
(5 rows)
-- gist_page_items_bytea prints the raw key data as a bytea. The output of that is
-- platform-dependent (endianness), so omit the actual key data from the output.
SELECT itemoffset, ctid, itemlen FROM gist_page_items_bytea(get_raw_page('test_gist_idx', 0));
itemoffset | ctid | itemlen
------------+-----------+---------
1 | (1,65535) | 40
2 | (2,65535) | 40
3 | (3,65535) | 40
4 | (4,65535) | 40
5 | (5,65535) | 40
6 | (6,65535) | 40
(6 rows)
-- Suppress the DETAIL message, to allow the tests to work across various
-- page sizes and architectures.
\set VERBOSITY terse
-- Failures with non-GiST index.
2022-03-16 03:19:39 +01:00
CREATE INDEX test_gist_btree on test_gist(t);
SELECT gist_page_items(get_raw_page('test_gist_btree', 0), 'test_gist_btree');
ERROR: "test_gist_btree" is not a GiST index
SELECT gist_page_items(get_raw_page('test_gist_btree', 0), 'test_gist_idx');
ERROR: input page is not a valid GiST page
pageinspect: Add more sanity checks to prevent out-of-bound reads A couple of code paths use the special area on the page passed by the function caller, expecting to find some data in it. However, feeding an incorrect page can lead to out-of-bound reads when trying to access the page special area (like a heap page that has no special area, leading PageGetSpecialPointer() to grab a pointer outside the allocated page). The functions used for hash and btree indexes have some protection already against that, while some other functions using a relation OID as argument would make sure that the access method involved is correct, but functions taking in input a raw page without knowing the relation the page is attached to would run into problems. This commit improves the set of checks used in the code paths of BRIN, btree (including one check if a leaf page is found with a non-zero level), GIN and GiST to verify that the page given in input has a special area size that fits with each access method, which is done though PageGetSpecialSize(), becore calling PageGetSpecialPointer(). The scope of the checks done is limited to work with pages that one would pass after getting a block with get_raw_page(), as it is possible to craft byteas that could bypass existing code paths. Having too many checks would also impact the usability of pageinspect, as the existing code is very useful to look at the content details in a corrupted page, so the focus is really to avoid out-of-bound reads as this is never a good thing even with functions whose execution is limited to superusers. The safest approach could be to rework the functions so as these fetch a block using a relation OID and a block number, but there are also cases where using a raw page is useful. Tests are added to cover all the code paths that needed such checks, and an error message for hash indexes is reworded to fit better with what this commit adds. Reported-By: Alexander Lakhin Author: Julien Rouhaud, Michael Paquier Discussion: https://postgr.es/m/16527-ef7606186f0610a1@postgresql.org Discussion: https://postgr.es/m/561e187b-3549-c8d5-03f5-525c14e65bd0@postgrespro.ru Backpatch-through: 10
2022-03-27 10:53:40 +02:00
-- Failure with various modes.
-- invalid page size
2022-03-16 03:19:39 +01:00
SELECT gist_page_items_bytea('aaa'::bytea);
ERROR: invalid page size
SELECT gist_page_items('aaa'::bytea, 'test_gist_idx'::regclass);
ERROR: invalid page size
SELECT gist_page_opaque_info('aaa'::bytea);
ERROR: invalid page size
pageinspect: Add more sanity checks to prevent out-of-bound reads A couple of code paths use the special area on the page passed by the function caller, expecting to find some data in it. However, feeding an incorrect page can lead to out-of-bound reads when trying to access the page special area (like a heap page that has no special area, leading PageGetSpecialPointer() to grab a pointer outside the allocated page). The functions used for hash and btree indexes have some protection already against that, while some other functions using a relation OID as argument would make sure that the access method involved is correct, but functions taking in input a raw page without knowing the relation the page is attached to would run into problems. This commit improves the set of checks used in the code paths of BRIN, btree (including one check if a leaf page is found with a non-zero level), GIN and GiST to verify that the page given in input has a special area size that fits with each access method, which is done though PageGetSpecialSize(), becore calling PageGetSpecialPointer(). The scope of the checks done is limited to work with pages that one would pass after getting a block with get_raw_page(), as it is possible to craft byteas that could bypass existing code paths. Having too many checks would also impact the usability of pageinspect, as the existing code is very useful to look at the content details in a corrupted page, so the focus is really to avoid out-of-bound reads as this is never a good thing even with functions whose execution is limited to superusers. The safest approach could be to rework the functions so as these fetch a block using a relation OID and a block number, but there are also cases where using a raw page is useful. Tests are added to cover all the code paths that needed such checks, and an error message for hash indexes is reworded to fit better with what this commit adds. Reported-By: Alexander Lakhin Author: Julien Rouhaud, Michael Paquier Discussion: https://postgr.es/m/16527-ef7606186f0610a1@postgresql.org Discussion: https://postgr.es/m/561e187b-3549-c8d5-03f5-525c14e65bd0@postgrespro.ru Backpatch-through: 10
2022-03-27 10:53:40 +02:00
-- invalid special area size
SELECT * FROM gist_page_opaque_info(get_raw_page('test_gist', 0));
ERROR: input page is not a valid GiST page
SELECT gist_page_items_bytea(get_raw_page('test_gist', 0));
ERROR: input page is not a valid GiST page
SELECT gist_page_items_bytea(get_raw_page('test_gist_btree', 0));
ERROR: input page is not a valid GiST page
2022-03-16 03:19:39 +01:00
\set VERBOSITY default
-- Tests with all-zero pages.
SHOW block_size \gset
SELECT gist_page_items_bytea(decode(repeat('00', :block_size), 'hex'));
gist_page_items_bytea
-----------------------
(0 rows)
SELECT gist_page_items(decode(repeat('00', :block_size), 'hex'), 'test_gist_idx'::regclass);
gist_page_items
-----------------
(0 rows)
SELECT gist_page_opaque_info(decode(repeat('00', :block_size), 'hex'));
gist_page_opaque_info
-----------------------
(1 row)
pageinspect: Fix gist_page_items() with included columns Non-leaf pages of GiST indexes contain key attributes, leaf pages contain both key and non-key attributes, and gist_page_items() ignored the handling of non-key attributes. This caused a few problems when using gist_page_items() on a GiST index with INCLUDE: - On a non-leaf page, the function would crash. - On a leaf page, the function would work, but miss to display all the values for included attributes. This commit fixes gist_page_items() to handle such cases in a more appropriate way, and now displays the values of key and non-key attributes for each item separately in a style consistent with what ruleutils.c would generate for the attribute list, depending on the page type dealt with. In a way similar to how a record is displayed, values would be double-quoted for key or non-key attributes if required. ruleutils.c did not provide a routine able to control if non-key attributes should be displayed, so an extended() routine for index definitions is added to work around the leaf and non-leaf page differences. While on it, this commit fixes a third problem related to the amount of data reported for key attributes. The code originally relied on BuildIndexValueDescription() (used for error reports on constraints) that would not print all the data stored in the index but the index opclass's input type, so this limited the amount of information available. This switch makes gist_page_items() much cheaper as there is no need to run ACL checks for each item printed, which is not an issue anyway as superuser rights are required to execute the functions of pageinspect. Opclasses whose data cannot be displayed can rely on gist_page_items_bytea(). The documentation of this function was slightly incorrect for the output results generated on HEAD and v15, so adjust it on these branches. Author: Alexander Lakhin, Michael Paquier Discussion: https://postgr.es/m/17884-cb8c326522977acb@postgresql.org Backpatch-through: 14
2023-05-19 05:37:58 +02:00
-- Test gist_page_items with included columns.
-- Non-leaf pages contain only the key attributes, and leaf pages contain
-- the included attributes.
ALTER TABLE test_gist ADD COLUMN i int DEFAULT NULL;
CREATE INDEX test_gist_idx_inc ON test_gist
USING gist (p) INCLUDE (t, i);
-- Mask the value of the key attribute to avoid alignment issues.
SELECT regexp_replace(keys, '\(p\)=\("(.*?)"\)', '(p)=("<val>")') AS keys_nonleaf_1
FROM gist_page_items(get_raw_page('test_gist_idx_inc', 0), 'test_gist_idx_inc')
WHERE itemoffset = 1;
keys_nonleaf_1
----------------
(p)=("<val>")
(1 row)
SELECT keys AS keys_leaf_1
FROM gist_page_items(get_raw_page('test_gist_idx_inc', 1), 'test_gist_idx_inc')
WHERE itemoffset = 1;
keys_leaf_1
------------------------------------------------------
(p) INCLUDE (t, i)=("(1,1),(1,1)") INCLUDE (1, null)
(1 row)
DROP TABLE test_gist;