postgresql

Commit Graph

Author	SHA1	Message	Date
Michael Paquier	c96581abe4	Fix inconsistencies and typos in the tree, take 11 This fixes various typos in docs and comments, and removes some orphaned definitions. Author: Alexander Lakhin Discussion: https://postgr.es/m/5da8e325-c665-da95-21e0-c8a99ea61fbf@gmail.com	2019-08-19 16:21:39 +09:00
Andres Freund	fb3b098fe8	Remove fmgr.h includes from headers that don't really need it. Most of the fmgr.h includes were obsoleted by `352a24a1f9`. A few others can be obsoleted using the underlying struct type in an implementation detail. Author: Andres Freund Discussion: https://postgr.es/m/20190803193733.g3l3x3o42uv4qj7l@alap3.anarazel.de	2019-08-16 10:35:31 -07:00
Peter Geoghegan	9c02cf5661	Remove block number field from nbtree stack. The initial value of the nbtree stack downlink block number field recorded during an initial descent of the tree wasn't actually used. Both _bt_getstackbuf() callers overwrote the value with their own value. Remove the block number field from the stack struct, and add a child block number argument to _bt_getstackbuf() in its place. This makes the overall design of _bt_getstackbuf() clearer. Author: Peter Geoghegan Reviewed-By: Anastasia Lubennikova Discussion: https://postgr.es/m/CAH2-Wzmx+UbXt2YNOUCZ-a04VdXU=S=OHuAuD7Z8uQq-PXTYUg@mail.gmail.com	2019-08-14 11:32:35 -07:00
Peter Geoghegan	68ef887842	Remove obsolete nbtree README commentary. Commit `d2086b08b0` removed almost all cases where nbtree must release a read buffer lock and acquire a write buffer lock instead, so remaining cases in which that's still necessary are not notable enough to appear in the nbtree README. More importantly, holding on to a buffer pin in cases where nbtree must trade a read lock for a write lock is very unlikely to save any I/O. This seems to have been a long overlooked throwback to a time when nbtree cared about write-ordering dependencies, and performed synchronous buffer writes. It hasn't worked that way in many years.	2019-08-13 17:16:44 -07:00
Peter Geoghegan	af0ba49809	Use PageIndexTupleOverwrite() within nbtree. Use the PageIndexTupleOverwrite() bufpage.c routine within nbtree instead of deleting a tuple and re-inserting its replacement. This makes the intent of affected code slightly clearer. It also makes CREATE INDEX slightly faster, since there is no longer a need to shift every leaf page's line pointer array back and forth during index builds. Author: Peter Geoghegan, Anastasia Lubennikova Reviewed-By: Anastasia Lubennikova Discussion: https://postgr.es/m/CAH2-Wz=Zk=B9+Vwm376WuO7YTjFc2SSskifQm4Nme3RRRPtOSQ@mail.gmail.com	2019-08-13 11:54:26 -07:00
Michael Paquier	66bde49d96	Fix inconsistencies and typos in the tree, take 10 This addresses some issues with unnecessary code comments, fixes various typos in docs and comments, and removes some orphaned structures and definitions. Author: Alexander Lakhin Discussion: https://postgr.es/m/9aabc775-5494-b372-8bcb-4dfc0bd37c68@gmail.com	2019-08-13 13:53:41 +09:00
Heikki Linnakangas	1169fcf129	Fix predicate-locking of HOT updated rows. In serializable mode, heap_hot_search_buffer() incorrectly acquired a predicate lock on the root tuple, not the returned tuple that satisfied the visibility checks. As explained in README-SSI, the predicate lock does not need to be copied or extended to other tuple versions, but for that to work, the correct, visible, tuple version must be locked in the first place. The original SSI commit had this bug in it, but it was fixed back in 2013, in commit `81fbbfe335`. But unfortunately, it was reintroduced a few months later in commit `b89e151054`. Wising up from that, add a regression test to cover this, so that it doesn't get reintroduced again. Also, move the code that sets 't_self', so that it happens at the same time that the other HeapTuple fields are set, to make it more clear that all the code in the loop operate on the "current" tuple in the chain, not the root tuple. Bug spotted by Andres Freund, analysis and original fix by Thomas Munro, test case and some additional changes to the fix by Heikki Linnakangas. Backpatch to all supported versions (9.4). Discussion: https://www.postgresql.org/message-id/20190731210630.nqhszuktygwftjty%40alap3.anarazel.de	2019-08-07 12:40:49 +03:00
Michael Paquier	8548ddc61b	Fix inconsistencies and typos in the tree, take 9 This addresses more issues with code comments, variable names and unreferenced variables. Author: Alexander Lakhin Discussion: https://postgr.es/m/7ab243e0-116d-3e44-d120-76b3df7abefd@gmail.com	2019-08-05 12:14:58 +09:00
Peter Eisentraut	fd6ec93bf8	Add error codes to some corruption log messages In some cases we have elog(ERROR) while corruption is certain and we can give a clear error code ERRCODE_DATA_CORRUPTED or ERRCODE_INDEX_CORRUPTED. Author: Andrey Borodin <x4mmm@yandex-team.ru> Discussion: https://www.postgresql.org/message-id/flat/25F6C686-6442-4A6B-BAF8-A6F7B84B16DE@yandex-team.ru	2019-08-01 11:15:26 +02:00
Andres Freund	6384e87be2	Remove superfluous semicolon. Author: Andres Freund	2019-07-30 22:09:53 -07:00
Michael Paquier	eb43f3d193	Fix inconsistencies and typos in the tree This is numbered take 8, and addresses again a set of issues with code comments, variable names and unreferenced variables. Author: Alexander Lakhin Discussion: https://postgr.es/m/b137b5eb-9c95-9c2f-586e-38aba7d59788@gmail.com	2019-07-29 12:28:30 +09:00
Heikki Linnakangas	6655a7299d	Use full 64-bit XID for checking if a deleted GiST page is old enough. Otherwise, after a deleted page gets even older, it becomes unrecyclable again. B-tree has the same problem, and has had since time immemorial, but let's at least fix this in GiST, where this is new. Backpatch to v12, where GiST page deletion was introduced. Reviewed-by: Andrey Borodin Discussion: https://www.postgresql.org/message-id/835A15A5-F1B4-4446-A711-BF48357EB602%40yandex-team.ru	2019-07-24 20:24:07 +03:00
Heikki Linnakangas	9eb5607e69	Refactor checks for deleted GiST pages. The explicit check in gistScanPage() isn't currently really necessary, as a deleted page is always empty, so the loop would fall through without doing anything, anyway. But it's a marginal optimization, and it gives a nice place to attach a comment to explain how it works. Backpatch to v12, where GiST page deletion was introduced. Reviewed-by: Andrey Borodin Discussion: https://www.postgresql.org/message-id/835A15A5-F1B4-4446-A711-BF48357EB602%40yandex-team.ru	2019-07-24 20:24:05 +03:00
Michael Paquier	23bccc823d	Fix inconsistencies and typos in the tree This is numbered take 7, and addresses a set of issues with code comments, variable names and unreferenced variables. Author: Alexander Lakhin Discussion: https://postgr.es/m/dff75442-2468-f74f-568c-6006e141062f@gmail.com	2019-07-22 10:01:50 +09:00
Peter Geoghegan	d004147eb3	Fix nbtree metapage cache upgrade bug. Commit `857f9c36cd`, which taught nbtree VACUUM to avoid unnecessary index scans, bumped the nbtree version number from 2 to 3, while adding the ability for nbtree indexes to be upgraded on-the-fly. Various assertions that assumed that an nbtree index was always on version 2 had to be changed to accept any supported version (version 2 or 3 on Postgres 11). However, a few assertions were missed in the initial commit, all of which were in code paths that cache a local copy of the metapage metadata, where the index had been expected to be on the current version (no longer version 2) as a generic sanity check. Rather than simply update the assertions, follow-up commit `0a64b45152` intentionally made the metapage caching code update the per-backend cached metadata version without changing the on-disk version at the same time. This could even happen when the planner needed to determine the height of a B-Tree for costing purposes. The assertions only fail on Postgres v12 when upgrading from v10, because they were adjusted to use the authoritative shared memory metapage by v12's commit `dd299df8`. To fix, remove the cache-only upgrade mechanism entirely, and update the assertions themselves to accept any supported version (go back to using the cached version in v12). The fix is almost a full revert of commit `0a64b45152` on the v11 branch. VACUUM only considers the authoritative metapage, and never bothers with a locally cached version, whereas everywhere else isn't interested in the metapage fields that were added by commit `857f9c36cd`. It seems unlikely that this bug has affected any user on v11. Reported-By: Christoph Berg Bug: #15896 Discussion: https://postgr.es/m/15896-5b25e260fdb0b081%40postgresql.org Backpatch: 11-, where VACUUM was taught to avoid unnecessary index scans.	2019-07-18 13:22:56 -07:00
Tom Lane	d97b714a21	Avoid using lcons and list_delete_first where it's easy to do so. Formerly, lcons was about the same speed as lappend, but with the new List implementation, that's not so; with a long List, data movement imposes an O(N) cost on lcons and list_delete_first, but not lappend. Hence, invent list_delete_last with semantics parallel to list_delete_first (but O(1) cost), and change various places to use lappend and list_delete_last where this can be done without much violence to the code logic. There are quite a few places that construct result lists using lcons not lappend. Some have semantic rationales for that; I added comments about it to a couple that didn't have them already. In many such places though, I think the coding is that way only because back in the dark ages lcons was faster than lappend. Hence, switch to lappend where this can be done without causing semantic changes. In ExecInitExprRec(), this results in aggregates and window functions that are in the same plan node being executed in a different order than before. Generally, the executions of such functions ought to be independent of each other, so this shouldn't result in visibly different query results. But if you push it, as one regression test case does, you can show that the order is different. The new order seems saner; it's closer to the order of the functions in the query text. And we never documented or promised anything about this, anyway. Also, in gistfinishsplit(), don't bother building a reverse-order list; it's easy now to iterate backwards through the original list. It'd be possible to go further towards removing uses of lcons and list_delete_first, but it'd require more extensive logic changes, and I'm not convinced it's worth it. Most of the remaining uses deal with queues that probably never get long enough to be worth sweating over. (Actually, I doubt that any of the changes in this patch will have measurable performance effects either. But better to have good examples than bad ones in the code base.) Patch by me, thanks to David Rowley and Daniel Gustafsson for review. Discussion: https://postgr.es/m/21272.1563318411@sss.pgh.pa.us	2019-07-17 11:15:34 -04:00
Michael Paquier	0896ae561b	Fix inconsistencies and typos in the tree This is numbered take 7, and addresses a set of issues around: - Fixes for typos and incorrect reference names. - Removal of unneeded comments. - Removal of unreferenced functions and structures. - Fixes regarding variable name consistency. Author: Alexander Lakhin Discussion: https://postgr.es/m/10bfd4ac-3e7c-40ab-2b2e-355ed15495e8@gmail.com	2019-07-16 13:23:53 +09:00
Peter Geoghegan	bfdbac2ab3	Correct nbtsplitloc.c comment. The logic just added by commit `e3899ffd` falls back on a 50:50 page split in the event of a new item that's just to the right of our provisional "many duplicates" split point. Fix a comment that incorrectly claimed that the new item had to be just to the left of our provisional split point. Backpatch: 12-, just like commit `e3899ffd`.	2019-07-15 14:35:06 -07:00
Peter Geoghegan	e3899ffd8b	Fix pathological nbtree split point choice issue. Specific ever-decreasing insertion patterns could cause successive unbalanced nbtree page splits. Problem cases involve a large group of duplicates to the left, and ever-decreasing insertions to the right. To fix, detect the situation by considering the newitem offset before performing a split using nbtsplitloc.c's "many duplicates" strategy. If the new item was inserted just to the right of our provisional "many duplicates" split point, infer ever-decreasing insertions and fall back on a 50:50 (space delta optimal) split. This seems to barely affect cases that already had acceptable space utilization. An alternative fix also seems possible. Instead of changing nbtsplitloc.c split choice logic, we could instead teach _bt_truncate() to generate a new value for new high keys by interpolating from the lastleft and firstright key values. That would certainly be a more elegant fix, but it isn't suitable for backpatching. Discussion: https://postgr.es/m/CAH2-WznCNvhZpxa__GqAa1fgQ9uYdVc=_apArkW2nc-K3O7_NA@mail.gmail.com Backpatch: 12-, where the nbtree page split enhancements were introduced.	2019-07-15 13:19:13 -07:00
Tom Lane	1cff1b95ab	Represent Lists as expansible arrays, not chains of cons-cells. Originally, Postgres Lists were a more or less exact reimplementation of Lisp lists, which consist of chains of separately-allocated cons cells, each having a value and a next-cell link. We'd hacked that once before (commit `d0b4399d8`) to add a separate List header, but the data was still in cons cells. That makes some operations -- notably list_nth() -- O(N), and it's bulky because of the next-cell pointers and per-cell palloc overhead, and it's very cache-unfriendly if the cons cells end up scattered around rather than being adjacent. In this rewrite, we still have List headers, but the data is in a resizable array of values, with no next-cell links. Now we need at most two palloc's per List, and often only one, since we can allocate some values in the same palloc call as the List header. (Of course, extending an existing List may require repalloc's to enlarge the array. But this involves just O(log N) allocations not O(N).) Of course this is not without downsides. The key difficulty is that addition or deletion of a list entry may now cause other entries to move, which it did not before. For example, that breaks foreach() and sister macros, which historically used a pointer to the current cons-cell as loop state. We can repair those macros transparently by making their actual loop state be an integer list index; the exposed "ListCell " pointer is no longer state carried across loop iterations, but is just a derived value. (In practice, modern compilers can optimize things back to having just one loop state value, at least for simple cases with inline loop bodies.) In principle, this is a semantics change for cases where the loop body inserts or deletes list entries ahead of the current loop index; but I found no such cases in the Postgres code. The change is not at all transparent for code that doesn't use foreach() but chases lists "by hand" using lnext(). The largest share of such code in the backend is in loops that were maintaining "prev" and "next" variables in addition to the current-cell pointer, in order to delete list cells efficiently using list_delete_cell(). However, we no longer need a previous-cell pointer to delete a list cell efficiently. Keeping a next-cell pointer doesn't work, as explained above, but we can improve matters by changing such code to use a regular foreach() loop and then using the new macro foreach_delete_current() to delete the current cell. (This macro knows how to update the associated foreach loop's state so that no cells will be missed in the traversal.) There remains a nontrivial risk of code assuming that a ListCell pointer will remain good over an operation that could now move the list contents. To help catch such errors, list.c can be compiled with a new define symbol DEBUG_LIST_MEMORY_USAGE that forcibly moves list contents whenever that could possibly happen. This makes list operations significantly more expensive so it's not normally turned on (though it is on by default if USE_VALGRIND is on). There are two notable API differences from the previous code: * lnext() now requires the List's header pointer in addition to the current cell's address. * list_delete_cell() no longer requires a previous-cell argument. These changes are somewhat unfortunate, but on the other hand code using either function needs inspection to see if it is assuming anything it shouldn't, so it's not all bad. Programmers should be aware of these significant performance changes: * list_nth() and related functions are now O(1); so there's no major access-speed difference between a list and an array. * Inserting or deleting a list element now takes time proportional to the distance to the end of the list, due to moving the array elements. (However, it typically doesn't require palloc or pfree, so except in long lists it's probably still faster than before.) Notably, lcons() used to be about the same cost as lappend(), but that's no longer true if the list is long. Code that uses lcons() and list_delete_first() to maintain a stack might usefully be rewritten to push and pop at the end of the list rather than the beginning. * There are now list_insert_nth...() and list_delete_nth...() functions that add or remove a list cell identified by index. These have the data-movement penalty explained above, but there's no search penalty. * list_concat() and variants now copy the second list's data into storage belonging to the first list, so there is no longer any sharing of cells between the input lists. The second argument is now declared "const List *" to reflect that it isn't changed. This patch just does the minimum needed to get the new implementation in place and fix bugs exposed by the regression tests. As suggested by the foregoing, there's a fair amount of followup work remaining to do. Also, the ENABLE_LIST_COMPAT macros are finally removed in this commit. Code using those should have been gone a dozen years ago. Patch by me; thanks to David Rowley, Jesper Pedersen, and others for review. Discussion: https://postgr.es/m/11587.1550975080@sss.pgh.pa.us	2019-07-15 13:41:58 -04:00
Thomas Munro	67b9b3ca32	Provide XLogRecGetFullXid(). In order to be able to work with FullTransactionId values during replay without increasing the size of the WAL, infer the epoch. In general we can't do that safely, but during replay we can because we know that nextFullXid can't advance concurrently. Prevent frontend code from seeing this new function, due to the above restriction. Perhaps in future it will be possible to extract the value entirely from independent WAL records, and then this restriction can be lifted. Author: Thomas Munro, based on earlier code from Andres Freund Discussion: https://postgr.es/m/CA%2BhUKG%2BmLmuDjMi6o1dxkKvGRL56Y2Rz%2BiXAcrZV03G9ZuFQ8Q%40mail.gmail.com	2019-07-15 17:04:29 +12:00
Alexander Korotkov	c085e1c1cb	Add support for <-> (box, point) operator to GiST box_ops Index-based calculation of this operator is exact. So, signature of gist_bbox_distance() function is changes so that caller is responsible for setting *recheck flag. Discussion: https://postgr.es/m/f71ba19d-d989-63b6-f04a-abf02ad9345d%40postgrespro.ru Author: Nikita Glukhov Reviewed-by: Tom Lane, Alexander Korotkov	2019-07-14 15:09:15 +03:00
Michael Paquier	fa19a08d71	Fix variable initialization when using buffering build with GiST This can cause valgrind to complain, as the flag marking a buffer as a temporary copy was not getting initialized. While on it, fill in with zeros newly-created buffer pages. This does not matter when loading a block from a temporary file, but it makes the push of an index tuple into a new buffer page safer. This has been introduced by `1d27dcf`, so backpatch all the way down to 9.4. Author: Alexander Lakhin Discussion: https://postgr.es/m/15899-0d24fb273b3dd90c@postgresql.org Backpatch-through: 9.4	2019-07-10 15:14:54 +09:00
Robert Haas	554106b116	tableam: Provide helper functions for relation sizing. Most block-based table AMs will need the exact same implementation of the relation_size callback as the heap, and if they use a standard page layout, they will likely need an implementation of the relation_estimate_size callback that is very similar to that of the heap. Rearrange to facilitate code reuse. Patch by me, reviewed by Michael Paquier, Daniel Gustafsson, and Álvaro Herrera. Discussion: http://postgr.es/m/CA+TgmoZ6DBPnP1E-vRpQZUJQijJFD54F+SR_pxGiAAS-MyrigA@mail.gmail.com	2019-07-08 14:51:53 -04:00
Michael Paquier	6b8548964b	Fix inconsistencies in the code This addresses a couple of issues in the code: - Typos and inconsistencies in comments and function declarations. - Removal of unreferenced function declarations. - Removal of unnecessary compile flags. - A cleanup error in regressplans.sh. Author: Alexander Lakhin Discussion: https://postgr.es/m/0c991fdf-2670-1997-c027-772a420c4604@gmail.com	2019-07-08 13:15:09 +09:00
Peter Eisentraut	7e9a4c5c3d	Use consistent style for checking return from system calls Use if (something() != 0) error ... instead of just if (something) error ... The latter is not incorrect, but it's a bit confusing and not the common style. Discussion: https://www.postgresql.org/message-id/flat/5de61b6b-8be9-7771-0048-860328efe027%402ndquadrant.com	2019-07-07 15:28:49 +02:00
Amit Kapila	78d41f6c9b	Add missing assertions for required table am callbacks. Reported-by: Ashwin Agrawal Author: Ashwin Agrawal Reviewed-by: Amit Kapila Backpatch-through: 12, where it was introduced Discussion: https://postgr.es/m/CALfoeisgdZhYDrJOukaBzvXfJOK2FQ0szVMK7dzmcy6w93iDUA@mail.gmail.com	2019-07-06 11:41:23 +05:30
Peter Eisentraut	6a1cd8b923	Unwind some workarounds for lack of portable int64 format specifier Because there is no portable int64/uint64 format specifier and we can't stick macros like INT64_FORMAT into the middle of a translatable string, we have been using various workarounds that put the number to be printed into a string buffer first. Now that we always use our own sprintf(), we can rely on %lld and %llu to work, so we can use those. This patch undoes this workaround in a few places where it was egregiously verbose. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/CAH2-Wz%3DWbNxc5ob5NJ9yqo2RMJ0q4HXDS30GVCobeCvC9A1L9A%40mail.gmail.com	2019-07-04 17:01:43 +02:00
David Rowley	8abc13a889	Use appendStringInfoString and appendPQExpBufferStr where possible This changes various places where appendPQExpBuffer was used in places where it was possible to use appendPQExpBufferStr, and likewise for appendStringInfo and appendStringInfoString. This is really just a stylistic improvement, but there are also small performance gains to be had from doing this. Discussion: http://postgr.es/m/CAKJS1f9P=M-3ULmPvr8iCno8yvfDViHibJjpriHU8+SXUgeZ=w@mail.gmail.com	2019-07-04 13:01:13 +12:00
Peter Geoghegan	66c5bd3a6f	Remove obsolete nbtree "get root" comment. Remove a very old Berkeley era comment that doesn't seem to have anything to do with the current locking considerations within _bt_getroot(). Discussion: https://postgr.es/m/CAH2-WzmA2H+rL-xxF5o6QhMD+9x6cJTnz2Mr3Li_pbPBmqoTBQ@mail.gmail.com	2019-07-01 22:28:08 -07:00
Michael Paquier	c74d49d41c	Fix many typos and inconsistencies Author: Alexander Lakhin Discussion: https://postgr.es/m/af27d1b3-a128-9d62-46e0-88f424397f44@gmail.com	2019-07-01 10:00:23 +09:00
Peter Eisentraut	21f428ebde	Don't call data type input functions in GUC check hooks Instead of calling pg_lsn_in() in check_recovery_target_lsn and timestamptz_in() in check_recovery_target_time, reorganize the respective code so that we don't raise any errors in the check hooks. The previous code tried to use PG_TRY/PG_CATCH to handle errors in a way that is not safe, so now the code contains no ereport() calls and can operate safely within the GUC error handling system. Moreover, since the interpretation of the recovery_target_time string may depend on the time zone, we cannot do the final processing of that string until all the GUC processing is done. Instead, check_recovery_target_time() now does some parsing for syntax checking, but the actual conversion to a timestamptz value is done later in the recovery code that uses it. Reported-by: Andres Freund <andres@anarazel.de> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://www.postgresql.org/message-id/flat/20190611061115.njjwkagvxp4qujhp%40alap3.anarazel.de	2019-06-30 10:27:43 +02:00
Peter Eisentraut	f2f0082ef5	Update comment Function was renamed/replaced in `c2fe139c20` but the header comment was not updated.	2019-06-27 15:57:14 +02:00
Michael Paquier	ce59b75d44	Add toast-level reloption for vacuum_index_cleanup `a96c41f` has introduced the option for heap, but it still lacked the variant to control the behavior for toast relations. While on it, refactor the tests so as they stress more scenarios with the various values that vacuum_index_cleanup can use. It would be useful to couple those tests with pageinspect to check that pages are actually cleaned up, but this is left for later. Author: Masahiko Sawada, Michael Paquier Reviewed-by: Peter Geoghegan Discussion: https://postgr.es/m/CAD21AoCqs8iN04RX=i1KtLSaX5RrTEM04b7NHYps4+rqtpWNEg@mail.gmail.com	2019-06-25 09:09:27 +09:00
Thomas Munro	89ff7c08ee	Remove unnecessary comment. Author: Vik Fearing Discussion: https://postgr.es/m/150d3e9f-c7ec-3fb3-4fdb-def47c4144af%402ndquadrant.com	2019-06-23 22:19:59 +12:00
Magnus Hagander	66013fe730	Fix typo Author: Daniel Gustafsson	2019-06-19 14:59:26 +02:00
Michael Paquier	3c28fd2281	Fix description of WAL record XLOG_BTREE_META_CLEANUP This record uses one metadata buffer and registers some data associated to the buffer, but when parsing the record for its description a direct access to the record data was done, but there is none. This leads usually to an incorrect description, but can also cause crashes like in pg_waldump. Instead, fix things so as the parsing uses the data associated to the metadata block. This is an oversight from `3d92796`, so backpatch down to 11. Author: Michael Paquier Description: https://postgr.es/m/20190617013059.GA3153@paquier.xyz Backpatch-through: 11	2019-06-19 11:02:19 +09:00
Andres Freund	23224563d9	Fix memory corruption/crash in ANALYZE. This fixes an embarrassing oversight I (Andres) made in `737a292b`, namely missing two place where liverows/deadrows were used when converting those variables to pointers, leading to incrementing the pointer, rather than the value. It's not that actually that easy to trigger a crash: One needs tuples deleted by the current transaction, followed by a tuple deleted in another session, all in one page. Which is presumably why this hasn't been noticed before. Reported-By: Steve Singer Author: Steve Singer Discussion: https://postgr.es/m/c7988239-d42c-ddc4-41db-171b23b35e4f@ssinger.info	2019-06-18 15:51:04 -07:00
Alvaro Herrera	8b21b416ed	Avoid spurious deadlocks when upgrading a tuple lock This puts back reverted commit `de87a084c0`, with some bug fixes. When two (or more) transactions are waiting for transaction T1 to release a tuple-level lock, and transaction T1 upgrades its lock to a higher level, a spurious deadlock can be reported among the waiting transactions when T1 finishes. The simplest example case seems to be: T1: select id from job where name = 'a' for key share; Y: select id from job where name = 'a' for update; -- starts waiting for T1 Z: select id from job where name = 'a' for key share; T1: update job set name = 'b' where id = 1; Z: update job set name = 'c' where id = 1; -- starts waiting for T1 T1: rollback; At this point, transaction Y is rolled back on account of a deadlock: Y holds the heavyweight tuple lock and is waiting for the Xmax to be released, while Z holds part of the multixact and tries to acquire the heavyweight lock (per protocol) and goes to sleep; once T1 releases its part of the multixact, Z is awakened only to be put back to sleep on the heavyweight lock that Y is holding while sleeping. Kaboom. This can be avoided by having Z skip the heavyweight lock acquisition. As far as I can see, the biggest downside is that if there are multiple Z transactions, the order in which they resume after T1 finishes is not guaranteed. Backpatch to 9.6. The patch applies cleanly on 9.5, but the new tests don't work there (because isolationtester is not smart enough), so I'm not going to risk it. Author: Oleksii Kliukin Discussion: https://postgr.es/m/B9C9D7CD-EB94-4635-91B6-E558ACEC0EC3@hintbits.com Discussion: https://postgr.es/m/2815.1560521451@sss.pgh.pa.us	2019-06-18 18:23:16 -04:00
Michael Paquier	3412030205	Fix more typos and inconsistencies in the tree Author: Alexander Lakhin Discussion: https://postgr.es/m/0a5419ea-1452-a4e6-72ff-545b1a5a8076@gmail.com	2019-06-17 16:13:16 +09:00
Alvaro Herrera	9d20b0ec8f	Revert "Avoid spurious deadlocks when upgrading a tuple lock" This reverts commits `3da73d6839` and `de87a084c0`. This code has some tricky corner cases that I'm not sure are correct and not properly tested anyway, so I'm reverting the whole thing for next week's releases (reintroducing the deadlock bug that we set to fix). I'll try again afterwards. Discussion: https://postgr.es/m/E1hbXKQ-0003g1-0C@gemulon.postgresql.org	2019-06-16 22:24:21 -04:00
Alvaro Herrera	3da73d6839	Silence compiler warning Introduced in `de87a084c0`.	2019-06-14 11:33:40 -04:00
Michael Paquier	f43608bda2	Fix typos and inconsistencies in code comments Author: Alexander Lakhin Discussion: https://postgr.es/m/dec6aae8-2d63-639f-4d50-20e229fb83e3@gmail.com	2019-06-14 09:34:34 +09:00
Alvaro Herrera	de87a084c0	Avoid spurious deadlocks when upgrading a tuple lock When two (or more) transactions are waiting for transaction T1 to release a tuple-level lock, and transaction T1 upgrades its lock to a higher level, a spurious deadlock can be reported among the waiting transactions when T1 finishes. The simplest example case seems to be: T1: select id from job where name = 'a' for key share; Y: select id from job where name = 'a' for update; -- starts waiting for X Z: select id from job where name = 'a' for key share; T1: update job set name = 'b' where id = 1; Z: update job set name = 'c' where id = 1; -- starts waiting for X T1: rollback; At this point, transaction Y is rolled back on account of a deadlock: Y holds the heavyweight tuple lock and is waiting for the Xmax to be released, while Z holds part of the multixact and tries to acquire the heavyweight lock (per protocol) and goes to sleep; once X releases its part of the multixact, Z is awakened only to be put back to sleep on the heavyweight lock that Y is holding while sleeping. Kaboom. This can be avoided by having Z skip the heavyweight lock acquisition. As far as I can see, the biggest downside is that if there are multiple Z transactions, the order in which they resume after X finishes is not guaranteed. Backpatch to 9.6. The patch applies cleanly on 9.5, but the new tests don't work there (because isolationtester is not smart enough), so I'm not going to risk it. Author: Oleksii Kliukin Discussion: https://postgr.es/m/B9C9D7CD-EB94-4635-91B6-E558ACEC0EC3@hintbits.com	2019-06-13 17:28:24 -04:00
Andres Freund	fff2a7d7bd	Don't access catalogs to validate GUCs when not connected to a DB. Vignesh found this bug in the check function for default_table_access_method's check hook, but that was just copied from older GUCs. Investigation by Michael and me then found the bug in further places. When not connected to a database (e.g. in a walsender connection), we cannot perform (most) GUC checks that need database access. Even when only shared tables are needed, unless they're nailed (c.f. RelationCacheInitializePhase2()), they cannot be accessed without pg_class etc. being present. Fix by extending the existing IsTransactionState() checks to also check for MyDatabaseOid. Reported-By: Vignesh C, Michael Paquier, Andres Freund Author: Vignesh C, Andres Freund Discussion: https://postgr.es/m/CALDaNm1KXK9gbZfY-p_peRFm_XrBh1OwQO1Kk6Gig0c0fVZ2uw%40mail.gmail.com Backpatch: 9.4-	2019-06-10 23:34:50 -07:00
Noah Misch	31d250e049	Update stale comments, and fix comment typos.	2019-06-08 10:12:26 -07:00
Amit Kapila	92c4abc736	Fix assorted inconsistencies. There were a number of issues in the recent commits which include typos, code and comments mismatch, leftover function declarations. Fix them. Reported-by: Alexander Lakhin Author: Alexander Lakhin, Amit Kapila and Amit Langote Reviewed-by: Amit Kapila Discussion: https://postgr.es/m/ef0c0232-0c1d-3a35-63d4-0ebd06e31387@gmail.com	2019-06-08 08:16:38 +05:30
Alvaro Herrera	e8bdea58f9	Fix message style Mark one message not for translation, and prefer "cannot" over "may not", per commentary from Robert Haas. Discussion: https://postgr.es/m/20190430145813.GA29872@alvherre.pgsql	2019-06-06 12:57:57 -04:00
Michael Paquier	1fb6f62a84	Fix typos in various places Author: Andrea Gelmini Reviewed-by: Michael Paquier, Justin Pryzby Discussion: https://postgr.es/m/20190528181718.GA39034@glet	2019-06-03 13:44:03 +09:00
Alvaro Herrera	d22f885f89	Fix double-phrase typo in message New in `147e3722f7`.	2019-05-31 10:08:37 -04:00

1 2 3 4 5 ...

3561 Commits