postgresql

Commit Graph

Author	SHA1	Message	Date
Tom Lane	2040bb4a0b	Clean up manipulations of hash indexes' hasho_flag field. Standardize on testing a hash index page's type by doing (opaque->hasho_flag & LH_PAGE_TYPE) == LH_xxx_PAGE Various places were taking shortcuts like opaque->hasho_flag & LH_BUCKET_PAGE which while not actually wrong, is still bad practice because it encourages use of opaque->hasho_flag & LH_UNUSED_PAGE which is wrong (LH_UNUSED_PAGE == 0, so the above is constant false). hash_xlog.c's hash_mask() contained such an incorrect test. This also ensures that we mask out the additional flag bits that hasho_flag has accreted since 9.6. pgstattuple's pgstat_hash_page(), for one, was failing to do that and was thus actively broken. Also fix assorted comments that hadn't been updated to reflect the extended usage of hasho_flag, and fix some macros that were testing just "(hasho_flag & bit)" to use the less dangerous, project-approved form "((hasho_flag & bit) != 0)". Coverity found the bug in hash_mask(); I noted the one in pgstat_hash_page() through code reading.	2017-04-14 17:04:25 -04:00
Alvaro Herrera	8bf74967da	Reduce the number of pallocs() in BRIN Instead of allocating memory in brin_deform_tuple and brin_copy_tuple over and over during a scan, allow reuse of previously allocated memory. This is said to make for a measurable performance improvement. Author: Jinyu Zhang, Álvaro Herrera Reviewed by: Tomas Vondra Discussion: https://postgr.es/m/495deb78.4186.1500dacaa63.Coremail.beijing_pg@163.com	2017-04-07 19:08:43 -03:00
Robert Haas	633e15ea0f	Fix pageinspect failures on hash indexes. Make every page in a hash index which isn't all-zeroes have a valid special space, so that tools like pageinspect don't error out. Also, make pageinspect cope with all-zeroes pages, because _hash_alloc_buckets can leave behind large numbers of those until they're consumed by splits. Ashutosh Sharma and Robert Haas, reviewed by Amit Kapila. Original trouble report from Jeff Janes. Discussion: http://postgr.es/m/CAMkU=1y6NjKmqbJ8wLMhr=F74WzcMALYWcVFhEpm7i=mV=XsOg@mail.gmail.com	2017-04-05 14:18:15 -04:00
Peter Eisentraut	193f5f9e91	pageinspect: Add bt_page_items function with bytea argument Author: Tomas Vondra <tomas.vondra@2ndquadrant.com> Reviewed-by: Ashutosh Sharma <ashu.coek88@gmail.com>	2017-04-04 23:52:55 -04:00
Robert Haas	ea69a0dead	Expand hash indexes more gradually. Since hash indexes typically have very few overflow pages, adding a new splitpoint essentially doubles the on-disk size of the index, which can lead to large and abrupt increases in disk usage (and perhaps long delays on occasion). To mitigate this problem to some degree, divide larger splitpoints into four equal phases. This means that, for example, instead of growing from 4GB to 8GB all at once, a hash index will now grow from 4GB to 5GB to 6GB to 7GB to 8GB, which is perhaps still not as smooth as we'd like but certainly an improvement. This changes the on-disk format of the metapage, so bump HASH_VERSION from 2 to 3. This will force a REINDEX of all existing hash indexes, but that's probably a good idea anyway. First, hash indexes from pre-10 versions of PostgreSQL could easily be corrupted, and we don't want to confuse corruption carried over from an older release with any corruption caused despite the new write-ahead logging in v10. Second, it will let us remove some backward-compatibility code added by commit `293e24e507`. Mithun Cy, reviewed by Amit Kapila, Jesper Pedersen and me. Regression test outputs updated by me. Discussion: http://postgr.es/m/CAD__OuhG6F1gQLCgMQNnMNgoCvOLQZz9zKYJQNYvYmmJoM42gA@mail.gmail.com Discussion: http://postgr.es/m/CA+TgmoYty0jCf-pa+m+vYUJ716+AxM7nv_syvyanyf5O-L_i2A@mail.gmail.com	2017-04-03 23:46:33 -04:00
Alvaro Herrera	ce96ce60ca	Remove direct uses of ItemPointer.{ip_blkid,ip_posid} There are no functional changes here; this simply encapsulates knowledge of the ItemPointerData struct so that a future patch can change things without more breakage. All direct users of ip_blkid and ip_posid are changed to use existing macros ItemPointerGetBlockNumber and ItemPointerGetOffsetNumber respectively. For callers where that's inappropriate (because they Assert that the itempointer is is valid-looking), add ItemPointerGetBlockNumberNoCheck and ItemPointerGetOffsetNumberNoCheck, which lack the assertion but are otherwise identical. Author: Pavan Deolasee Discussion: https://postgr.es/m/CABOikdNnFon4cJiL=h1mZH3bgUeU+sWHuU4Yr8AB=j3A2p1GiA@mail.gmail.com	2017-03-28 19:02:23 -03:00
Peter Eisentraut	fef2bcdcba	pageinspect: Add page_checksum function Author: Tomas Vondra <tomas.vondra@2ndquadrant.com> Reviewed-by: Ashutosh Sharma <ashu.coek88@gmail.com>	2017-03-17 10:55:17 -04:00
Peter Eisentraut	a02731cb10	pageinspect: Add test for page_header function	2017-03-17 09:23:39 -04:00
Robert Haas	c11453ce0a	hash: Add write-ahead logging support. The warning about hash indexes not being write-ahead logged and their use being discouraged has been removed. "snapshot too old" is now supported for tables with hash indexes. Most importantly, barring bugs, hash indexes will now be crash-safe and usable on standbys. This commit doesn't yet add WAL consistency checking for hash indexes, as we now have for other index types; a separate patch has been submitted to cure that lack. Amit Kapila, reviewed and slightly modified by me. The larger patch series of which this is a part has been reviewed and tested by Álvaro Herrera, Ashutosh Sharma, Mark Kirkwood, Jeff Janes, and Jesper Pedersen. Discussion: http://postgr.es/m/CAA4eK1JOBX=YU33631Qh-XivYXtPSALh514+jR8XeD7v+K3r_Q@mail.gmail.com	2017-03-14 13:27:02 -04:00
Noah Misch	3a0d473192	Use wrappers of PG_DETOAST_DATUM_PACKED() more. This makes almost all core code follow the policy introduced in the previous commit. Specific decisions: - Text search support functions with char* and length arguments, such as prsstart and lexize, may receive unaligned strings. I doubt maintainers of non-core text search code will notice. - Use plain VARDATA() on values detoasted or synthesized earlier in the same function. Use VARDATA_ANY() on varlenas sourced outside the function, even if they happen to always have four-byte headers. As an exception, retain the universal practice of using VARDATA() on return values of SendFunctionCall(). - Retain PG_GETARG_BYTEA_P() in pageinspect. (Page images are too large for a one-byte header, so this misses no optimization.) Sites that do not call get_page_from_raw() typically need the four-byte alignment. - For now, do not change btree_gist. Its use of four-byte headers in memory is partly entangled with storage of 4-byte headers inside GBT_VARKEY, on disk. - For now, do not change gtrgm_consistent() or gtrgm_distance(). They incorporate the varlena header into a cache, and there are multiple credible implementation strategies to consider.	2017-03-12 19:35:34 -04:00
Stephen Frost	c08d82f38e	Add relkind checks to certain contrib modules The contrib extensions pageinspect, pg_visibility and pgstattuple only work against regular relations which have storage. They don't work against foreign tables, partitioned (parent) tables, views, et al. Add checks to the user-callable functions to return a useful error message to the user if they mistakenly pass an invalid relation to a function which doesn't accept that kind of relation. In passing, improve some of the existing checks to use ereport() instead of elog(), add a function to consolidate common checks where appropriate, and add some regression tests. Author: Amit Langote, with various changes by me Reviewed by: Michael Paquier and Corey Huinker Discussion: https://postgr.es/m/ab91fd9d-4751-ee77-c87b-4dd704c1e59c@lab.ntt.co.jp	2017-03-09 16:34:25 -05:00
Robert Haas	b4316928d5	Fix incorrect typecast. Ashutosh Sharma, per a report from Mithun Cy. Discussion: http://postgr.es/m/CAD__OujgqNNnCujeFTmKpjNu+W4smS8Hbi=RcWAhf1ZUs3H4WA@mail.gmail.com	2017-02-22 12:05:42 +05:30
Robert Haas	fc8219dc54	pageinspect: Fix hash_bitmap_info not to read the underlying page. It did that to verify that the page was an overflow page rather than anything else, but that means that checking the status of all the overflow bits requires reading the entire index. So don't do that. The new code validates that the page is not a primary bucket page or bitmap page by looking at the metapage, so that using this on large numbers of pages can be reasonably efficient. Ashutosh Sharma, per a complaint from me, and with further modifications by me.	2017-02-09 14:34:34 -05:00
Robert Haas	293e24e507	Cache hash index's metapage in rel->rd_amcache. This avoids a very significant amount of buffer manager traffic and contention when scanning hash indexes, because it's no longer necessary to lock and pin the metapage for every scan. We do need some way of figuring out when the cache is too stale to use any more, so that when we lock the primary bucket page to which the cached metapage points us, we can tell whether a split has occurred since we cached the metapage data. To do that, we use the hash_prevblkno field in the primary bucket page, which would otherwise always be set to InvalidBuffer. This patch contains code so that it will continue working (although less efficiently) with hash indexes built before this change, but perhaps we should consider bumping the hash version and ripping out the compatibility code. That decision can be made later, though. Mithun Cy, reviewed by Jesper Pedersen, Amit Kapila, and by me. Before committing, I made a number of cosmetic changes to the last posted version of the patch, adjusted _hash_getcachedmetap to be more careful about order of operation, and made some necessary updates to the pageinspect documentation and regression tests.	2017-02-07 12:35:45 -05:00
Robert Haas	871ec0e336	pageinspect: More type-sanity surgery on the new hash index code. Uniformly expose unsigned quantities using the next-wider signed integer type (since we have no unsigned types at the SQL level). At the SQL level, this results a change to report itemoffset as int4 rather than int2. Also at the SQL level, report one value that is an OID as type oid. Under the hood, uniformly use macros that match the SQL output type as to both width and signedness.	2017-02-03 16:28:13 -05:00
Tom Lane	14e9b18fed	In pageinspect/hashfuncs.c, avoid crashes on alignment-picky machines. On machines with MAXALIGN = 8, the payload of a bytea is not maxaligned, since it will start 4 bytes into a palloc'd value. On alignment-picky hardware, this will cause failures in accesses to 8-byte-wide values within the page. We already encountered this problem when we introduced GIN index inspection functions, and fixed it in commit `84ad68d64`. Make use of the same function for hash indexes. A small difficulty is that up to now contrib/pageinspect has not shared any functions at all across files. To support that, introduce a common header file "pageinspect.h" for the module. Also, move get_page_from_raw() out of ginfuncs.c, where it didn't especially belong, and put it in rawpage.c which seems a more natural home. Per buildfarm. Discussion: https://postgr.es/m/17311.1486134714@sss.pgh.pa.us	2017-02-03 11:34:47 -05:00
Robert Haas	29e312bc13	pageinspect: Remove platform-dependent values from hash tests. Per a report from Tom Lane, the ffactor reported by hash_metapage_info and the free_size reported by hash_page_stats vary by platform. Ashutosh Sharma and Robert Haas	2017-02-03 11:06:41 -05:00
Tom Lane	c6eeb67dcc	Fix a bunch more portability bugs in commit `08bf6e529`. It seems like somebody used a dartboard while choosing integer widths for the various values taken and returned by these functions ... and then threw a fresh set of darts while writing the SQL declarations. This patch brings the C code into line with what the SQL declarations say, which is enough to make it not dump core on the particular 32-bit machine I'm testing on. But I think we could do with another round of looking at what the datum widths should be. For instance, it's not all that sensible that hash_bitmap_info decided to use int64 to represent a BlockNumber input when get_raw_page doesn't do it that way. There's also a remaining problem that the expected outputs from the test script are platform-dependent, but I'll leave that issue for somebody else. Per buildfarm.	2017-02-02 23:11:08 -05:00
Robert Haas	ed807fda6d	pageinspect: Try to fix some bugs in previous commit. Commit `08bf6e5295` seems not to have used the correct GetDatum and PG_GETARG_ macros for the SQL types in some cases, and some of the SQL types seem to have been poorly chosen, too. Try to fix it. I'm not sure if this is the reason why the buildfarm is currently unhappy with this code, but it seems like a good place to start. Buildfarm unhappiness reported by Tom Lane.	2017-02-02 22:32:06 -05:00
Robert Haas	08bf6e5295	pageinspect: Support hash indexes. Patch by Jesper Pedersen and Ashutosh Sharma, with some error handling improvements by me. Tests from Peter Eisentraut. Reviewed by Álvaro Herrera, Michael Paquier, Jesper Pedersen, Jeff Janes, Peter Eisentraut, Amit Kapila, Mithun Cy, and me. Discussion: http://postgr.es/m/e2ac6c58-b93f-9dd9-f4e6-d6d30add7fdf@redhat.com	2017-02-02 14:19:32 -05:00
Peter Eisentraut	f21a563d25	Move some things from builtins.h to new header files This avoids that builtins.h has to include additional header files.	2017-01-20 20:29:53 -05:00
Bruce Momjian	1d25779284	Update copyright via script for 2017	2017-01-03 13:48:53 -05:00
Tom Lane	367b99bbb1	Fix gin_leafpage_items(). On closer inspection, commit `84ad68d64` broke gin_leafpage_items(), because the aligned copy of the page got palloc'd in a short-lived context whereas it needs to be in the SRF's multi_call_memory_ctx. This was not exposed by the regression test, because the regression test doesn't actually exercise the function in a meaningful way. Fix the code bug, and extend the test in what I hope is a portable fashion.	2016-11-04 12:11:54 -04:00
Peter Eisentraut	84ad68d645	pageinspect: Fix unaligned struct access in GIN functions The raw page data that is passed into the functions will not be aligned at 8-byte boundaries. Casting that to a struct and accessing int64 fields will result in unaligned access. On most platforms, you get away with it, but it will result on a crash on pickier platforms such as ia64 and sparc64.	2016-11-04 10:05:37 -04:00
Peter Eisentraut	00a86856c1	pageinspect: Make page test more portable Choose test data that makes the output independent of endianness.	2016-11-02 08:45:17 -04:00
Tom Lane	14ee35799f	Fix portability bug in gin_page_opaque_info(). Somebody apparently thought that "if Int32GetDatum is good, Int64GetDatum must be better". Per buildfarm failures now that Peter has added some regression tests here.	2016-11-02 00:09:27 -04:00
Peter Eisentraut	f7c9a6e083	pageinspect: Make btree test more portable Choose test data that makes the output independent of endianness and alignment.	2016-11-01 22:02:39 -04:00
Peter Eisentraut	adfb81d9e1	pageinspect: Add tests	2016-11-01 14:02:16 -04:00
Heikki Linnakangas	8a2f08fbea	Fix typo in comment. Daniel Gustafsson	2016-10-26 11:10:13 +03:00
Robert Haas	4bc424b968	pgindent run for 9.6	2016-06-09 18:02:36 -04:00
Robert Haas	e3b607cd7a	Update pageinspect extension for parallel query. All functions provided by this extension are PARALLEL SAFE. Andreas Karlsson	2016-06-09 17:18:09 -04:00
Robert Haas	8826d85078	Tweak a few more things in preparation for upcoming pgindent run. These adjustments adjust code and comments in minor ways to prevent pgindent from mangling them. Among other things, I tried to avoid situations where pgindent would emit "a +b" instead of "a + b", and I tried to avoid having it break up inline comments across multiple lines.	2016-05-03 10:52:25 -04:00
Heikki Linnakangas	d22b85fbd4	Remove unused macros. CHECK_PAGE_OFFSET_RANGE() has been unused forever. CHECK_RELATION_BLOCK_RANGE() has been unused in pgstatindex.c ever since bt_page_stats() and bt_page_items() functions were moved from pgstattuple to pageinspect module. It still exists in pageinspect/btreefuncs.c. Daniel Gustafsson	2016-05-02 10:07:49 +03:00
Kevin Grittner	a343e223a5	Revert no-op changes to BufferGetPage() The reverted changes were intended to force a choice of whether any newly-added BufferGetPage() calls needed to be accompanied by a test of the snapshot age, to support the "snapshot too old" feature. Such an accompanying test is needed in about 7% of the cases, where the page is being used as part of a scan rather than positioning for other purposes (such as DML or vacuuming). The additional effort required for back-patching, and the doubt whether the intended benefit would really be there, have indicated it is best just to rely on developers to do the right thing based on comments and existing usage, as we do with many other conventions. This change should have little or no effect on generated executable code. Motivated by the back-patching pain of Tom Lane and Robert Haas	2016-04-20 08:31:19 -05:00
Kevin Grittner	8b65cf4c5e	Modify BufferGetPage() to prepare for "snapshot too old" feature This patch is a no-op patch which is intended to reduce the chances of failures of omission once the functional part of the "snapshot too old" patch goes in. It adds parameters for snapshot, relation, and an enum to specify whether the snapshot age check needs to be done for the page at this point. This initial patch passes NULL for the first two new parameters and BGP_NO_SNAPSHOT_TEST for the third. The follow-on patch will change the places where the test needs to be made.	2016-04-08 14:30:10 -05:00
Peter Eisentraut	339025c68f	Replace printf format %i by %d see also `ce8d7bb644`	2016-04-08 12:42:58 -04:00
Peter Eisentraut	8b737f9084	Fix printf format	2016-04-08 12:34:33 -04:00
Alvaro Herrera	3e1338475f	Add missing checks to some of pageinspect's BRIN functions brin_page_type() and brin_metapage_info() did not enforce being called by superuser, like other pageinspect functions that take bytea do. Since they don't verify the passed page thoroughly, it is possible to use them to read the server memory with a carefully crafted bytea value, up to a file kilobytes from where the input bytea is located. Have them throw errors if called by a non-superuser. Report and initial patch: Andreas Seltenreich Security: CVE-2016-3065	2016-03-28 10:57:42 -03:00
Tom Lane	65c5fcd353	Restructure index access method API to hide most of it at the C level. This patch reduces pg_am to just two columns, a name and a handler function. All the data formerly obtained from pg_am is now provided in a C struct returned by the handler function. This is similar to the designs we've adopted for FDWs and tablesample methods. There are multiple advantages. For one, the index AM's support functions are now simple C functions, making them faster to call and much less error-prone, since the C compiler can now check function signatures. For another, this will make it far more practical to define index access methods in installable extensions. A disadvantage is that SQL-level code can no longer see attributes of index AMs; in particular, some of the crosschecks in the opr_sanity regression test are no longer possible from SQL. We've addressed that by adding a facility for the index AM to perform such checks instead. (Much more could be done in that line, but for now we're content if the amvalidate functions more or less replace what opr_sanity used to do.) We might also want to expose some sort of reporting functionality, but this patch doesn't do that. Alexander Korotkov, reviewed by Petr Jelínek, and rather heavily editorialized on by me.	2016-01-17 19:36:59 -05:00
Bruce Momjian	ee94300446	Update copyright for 2016 Backpatch certain files through 9.1	2016-01-02 13:33:40 -05:00
Teodor Sigaev	0271e27c10	Add forgotten file in commit `d6061f83a1`	2015-11-25 16:59:07 +03:00
Teodor Sigaev	d6061f83a1	Improve pageinspect module Now pageinspect can show data stored in the heap tuple. Nikolay Shaplov	2015-11-25 16:31:55 +03:00
Alvaro Herrera	94d626ff5a	Use materialize SRF mode in brin_page_items This function was using the single-value-per-call mechanism, but the code relied on a relcache entry that wasn't kept open across calls. This manifested as weird errors in buildfarm during the short time that the "brin-1" isolation test lived. Backpatch to 9.5, where it was introduced.	2015-08-13 13:02:10 -03:00
Bruce Momjian	807b9e0dff	pgindent run for 9.5	2015-05-23 21:35:49 -04:00
Alvaro Herrera	db5f98ab4f	Improve BRIN infra, minmax opclass and regression test The minmax opclass was using the wrong support functions when cross-datatypes queries were run. Instead of trying to fix the pg_amproc definitions (which apparently is not possible), use the already correct pg_amop entries instead. This requires jumping through more hoops (read: extra syscache lookups) to obtain the underlying functions to execute, but it is necessary for correctness. Author: Emre Hasegeli, tweaked by Álvaro Review: Andreas Karlsson Also change BrinOpcInfo to record each stored type's typecache entry instead of just the OID. Turns out that the full type cache is necessary in brin_deform_tuple: the original code used the indexed type's byval and typlen properties to extract the stored tuple, which is correct in Minmax; but in other implementations that want to store something different, that's wrong. The realization that this is a bug comes from Emre also, but I did not use his patch. I also adopted Emre's regression test code (with smallish changes), which is more complete.	2015-05-07 13:02:22 -03:00
Alvaro Herrera	e491bd2ee3	Move BRIN page type to page's last two bytes ... which is the usual convention among AMs, so that pg_filedump and similar utilities can tell apart pages of different AMs. It was also the intent of the original code, but I failed to realize that alignment considerations would move the whole thing to the previous-to-last word in the page. The new definition of the associated macro makes surrounding code a bit leaner, too. Per note from Heikki at http://www.postgresql.org/message-id/546A16EF.9070005@vmware.com	2015-03-10 12:27:15 -03:00
Tom Lane	e1a11d9311	Use FLEXIBLE_ARRAY_MEMBER for HeapTupleHeaderData.t_bits[]. This requires changing quite a few places that were depending on sizeof(HeapTupleHeaderData), but it seems for the best. Michael Paquier, some adjustments by me	2015-02-21 15:13:06 -05:00
Tom Lane	09d8d110a6	Use FLEXIBLE_ARRAY_MEMBER in a bunch more places. Replace some bogus "x[1]" declarations with "x[FLEXIBLE_ARRAY_MEMBER]". Aside from being more self-documenting, this should help prevent bogus warnings from static code analyzers and perhaps compiler misoptimizations. This patch is just a down payment on eliminating the whole problem, but it gets rid of a lot of easy-to-fix cases. Note that the main problem with doing this is that one must no longer rely on computing sizeof(the containing struct), since the result would be compiler-dependent. Instead use offsetof(struct, lastfield). Autoconf also warns against spelling that offsetof(struct, lastfield[0]). Michael Paquier, review and additional fixes by me.	2015-02-20 00:11:42 -05:00
Bruce Momjian	4baaf863ec	Update copyright for 2015 Backpatch certain files through 9.0	2015-01-06 11:43:47 -05:00
Alvaro Herrera	b52cb4690e	pageinspect/BRIN: minor tweaks Michael Paquier Double-dash additions suggested by Peter Geoghegan	2014-12-02 12:20:50 -03:00

1 2 3

112 Commits