postgresql

mirror of https://git.postgresql.org/git/postgresql.git synced 2024-07-16 04:41:08 +02:00

Author	SHA1	Message	Date
Noah Misch	6dd690be36	Process EXTRA_INSTALL serially, during the first temp-install. This closes a race condition in "make -j check-world"; the symptom was EEXIST errors. Back-patch to v10, before which parallel check-world had worse problems. Discussion: https://postgr.es/m/20181224221601.GA3227827@rfd.leadboat.com	2018-12-31 13:55:01 -08:00
Noah Misch	95fae739af	Send EXTRA_INSTALL errors to install.log, not stderr. We already redirected other temp-install stderr and all temp-install stdout in this way. Back-patch to v10, like the next commit. Discussion: https://postgr.es/m/20181224221601.GA3227827@rfd.leadboat.com	2018-12-31 13:54:59 -08:00
Noah Misch	ca01a6748d	pg_regress: Promptly detect failed postmaster startup. Detect it the way pg_ctl's wait_for_postmaster() does. When pg_regress spawned a postmaster that failed startup, we were detecting that only with "pg_regress: postmaster did not respond within 60 seconds". Back-patch to 9.4 (all supported versions). Reviewed by Tom Lane. Discussion: https://postgr.es/m/20181231172922.GA199150@gust.leadboat.com	2018-12-31 13:51:15 -08:00
Peter Eisentraut	025cc86b77	pg_rewind: Add missing newline to error message	2018-12-29 13:03:10 +01:00
Michael Paquier	6a19384e74	Improve description of DEFAULT_XLOG_SEG_SIZE in pg_config.h This was incorrectly referring to --walsegsize, and its description is rewritten in a clearer way. Author: Ian Barwick, Tom Lane Reviewed-by: Álvaro Herrera, Michael Paquier Discussion: https://postgr.es/m/08534fc6-119a-c498-254e-d5acc4e6bf85@2ndquadrant.com	2018-12-29 08:24:47 +09:00
Tom Lane	d58e01f8ab	Fix latent problem with pg_jrand48(). POSIX specifies that jrand48() returns a signed 32-bit value (in the range [-2^31, 2^31)), but our code was returning an unsigned 32-bit value (in the range [0, 2^32)). This doesn't actually matter to any existing call site, because they all cast the "long" result to int32 or uint32; but it will doubtless bite somebody in the future. To fix, cast the arithmetic result to int32 explicitly before the compiler widens it to long (if widening is needed). While at it, upgrade this file's far-short-of-project-style comments. Had there been some peer pressure to document pg_jrand48() properly, maybe this thinko wouldn't have gotten committed to begin with. Backpatch to v10 where pg_jrand48() was added, just in case somebody back-patches a fix that uses it and depends on the standard behavior. Discussion: https://postgr.es/m/17235.1545951602@sss.pgh.pa.us	2018-12-28 14:08:30 -05:00
Alexander Korotkov	fcdda202bc	Reduce length of GIN predicate locking isolation test suite Isolation test suite of GIN predicate locking was criticized for being too slow, especially under Valgrind. This commit is intended to accelerate it. Tests are simplified in the following ways. 1) Amount of data is reduced. We're now close to the minimal amount of data, which produces at least one posting tree and at least two pages of entry tree. 2) Three isolation tests are merged into one. 3) Only one tuple is queried from posting tree. So, locking of index is the same, but tuple locks are not propagated to relation lock. Also, it is faster. 4) Test cases itself are simplified. Now each test case run just one INSERT and one SELECT involving GIN, which either conflict or not. Discussion: https://postgr.es/m/20181204000740.ok2q53nvkftwu43a%40alap3.anarazel.de Reported-by: Andres Freund Tested-by: Andrew Dunstan Author: Alexander Korotkov Backpatch-through: 11	2018-12-28 03:33:32 +03:00
Alexander Korotkov	fd7c081955	Remove entry tree root conflict checking from GIN predicate locking According to README we acquire predicate locks on entry tree leafs and posting tree roots. However, when ginFindLeafPage() is going to lock leaf in exclusive mode, then it checks root for conflicts regardless whether it's a entry or posting tree. Assuming that we never place predicate lock on entry tree root (excluding corner case when root is leaf), this check is redundant. This commit removes this check. Now, root conflict checking is controlled by separate argument of ginFindLeafPage(). Discussion: https://postgr.es/m/CAPpHfdv7rrDyy%3DMgsaK-L9kk0AH7az0B-mdC3w3p0FSb9uoyEg%40mail.gmail.com Author: Alexander Korotkov Backpatch-through: 11	2018-12-27 04:20:21 +03:00
Michael Paquier	b30b9dce1f	Ignore inherited temp relations from other sessions when truncating Inheritance trees can include temporary tables if the parent is permanent, which makes possible the presence of multiple temporary children from different sessions. Trying to issue a TRUNCATE on the parent in this scenario causes a failure, so similarly to any other queries just ignore such cases, which makes TRUNCATE work transparently. This makes truncation behave similarly to any other DML query working on the parent table with queries which need to be issues on children. A set of isolation tests is added to cover basic cases. Reported-by: Zhou Digoal Author: Amit Langote, Michael Paquier Discussion: https://postgr.es/m/15565-ce67a48d0244436a@postgresql.org Backpatch-through: 9.4	2018-12-27 10:17:09 +09:00
Tom Lane	47c93ace9f	Fix portability failure introduced in commits `d2b0b60e7` et al. I made a frontend fprintf() format use %m, forgetting that that's only safe in HEAD not the back branches; prior to `96bf88d52` and `d6c55de1f`, it would work on glibc platforms but not elsewhere. Revert to using %s ... strerror(errno) as the code did before. We could have left HEAD as-is, but for code consistency across branches, I chose to apply this patch there too. Per Coverity and a few buildfarm members.	2018-12-26 15:30:22 -05:00
Michael Paquier	a016f59d59	Prioritize history files when archiving At the end of recovery for the post-promotion process, a new history file is created followed by the last partial segment of the previous timeline. Based on the timing, the archiver would first try to archive the last partial segment and then the history file. This can delay the detection of a new timeline taken, particularly depending on the time it takes to transfer the last partial segment as it delays the moment the history file of the new timeline gets archived. This can cause promoted standbys to use the same timeline as one already taken depending on the circumstances if multiple instances look at archives at the same location. This commit changes the order of archiving so as history files are archived in priority over other file types, which reduces the likelihood of the same timeline being taken (still not reducing the window to zero), and it makes the archiver behave more consistently with the startup process doing its post-promotion business. Author: David Steele Reviewed-by: Michael Paquier, Kyotaro Horiguchi Discussion: https://postgr.es/m/929068cf-69e1-bba2-9dc0-e05986aed471@pgmasters.net Backpatch-through: 9.5	2018-12-24 20:25:49 +09:00
Michael Paquier	ff9c222669	Disable WAL-skipping optimization for COPY on views and foreign tables COPY can skip writing WAL when loading data on a table which has been created in the same transaction as the one loading the data, however this cannot work on views or foreign table as this would result in trying to flush relation files which do not exist. So disable the optimization so as commands are able to work the same way with any configuration of wal_level. Tests are added to cover the different cases, which need to have wal_level set to minimal to allow the problem to show up, and that is not the default configuration. Reported-by: Luis M. Carril, Etsuro Fujita Author: Amit Langote, Michael Paquier Reviewed-by: Etsuro Fujita Discussion: https://postgr.es/m/15552-c64aa14c5c22f63c@postgresql.org Backpatch-through: 10, where support for COPY on views has been added, while v11 has added support for COPY on foreign tables.	2018-12-23 16:43:47 +09:00
Peter Eisentraut	018923ccc1	Fix ancient compiler warnings and typos in !HAVE_SYMLINK code This has never been correct since this code was introduced.	2018-12-22 07:25:20 +01:00
Alexander Korotkov	8193d64072	Check for conflicting queries during replay of gistvacuumpage() `013ebc0a7b` implements so-called GiST microvacuum. That is gistgettuple() marks index tuples as dead when kill_prior_tuple is set. Later, when new tuple insertion claims page space, those dead index tuples are physically deleted from page. When this deletion is replayed on standby, it might conflict with read-only queries. But `013ebc0a7b` doesn't handle this. That may lead to disappearance of some tuples from read-only snapshots on standby. This commit implements resolving of conflicts between replay of GiST microvacuum and standby queries. On the master we implement new WAL record type XLOG_GIST_DELETE, which comprises necessary information. On stable releases we've to be tricky to keep WAL compatibility. Information required for conflict processing is just appended to data of XLOG_GIST_PAGE_UPDATE record. So, PostgreSQL version, which doesn't know about conflict processing, will just ignore that. Reported-by: Andres Freund Diagnosed-by: Andres Freund Discussion: https://postgr.es/m/20181212224524.scafnlyjindmrbe6%40alap3.anarazel.de Author: Alexander Korotkov Backpatch-through: 9.6	2018-12-21 02:37:31 +03:00
Alvaro Herrera	053ad56d27	Fix lock level used for partition when detaching it For probably bogus reasons, we acquire only AccessShareLock on the partition when we try to detach it from its parent partitioned table. This can cause ugly things to happen if another transaction is doing any sort of DDL to the partition concurrently. Upgrade that lock to ShareUpdateExclusiveLock, which per discussion seems to be the minimum needed. Reported by Robert Haas. Discussion: https://postgr.es/m/CA+TgmoYruJQ+2qnFLtF1xQtr71pdwgfxy3Ziy-TxV28M6pEmyA@mail.gmail.com	2018-12-20 16:42:13 -03:00
Alvaro Herrera	9bb2ce5ec7	DETACH PARTITION: hold locks on indexes until end of transaction When a partition is detached from its parent, we acquire locks on all attached indexes to also detach them ... but we release those locks immediately. This is a violation of the policy of keeping locks on user objects to the end of the transaction. Bug introduced in `8b08f7d482`. It's unclear that there are any ill effects possible, but it's clearly wrong nonetheless. It's likely that bad behavior is possible, but mostly because the relation that the index is for is only locked with AccessShareLock, which is an older bug that shall be fixed separately. While touching that line of code, close the index opened with index_open() using index_close() instead of relation_close(). No difference in practice, but let's be consistent. Unearthed by Robert Haas. Discussion: https://postgr.es/m/CA+TgmoYruJQ+2qnFLtF1xQtr71pdwgfxy3Ziy-TxV28M6pEmyA@mail.gmail.com	2018-12-20 11:11:13 -03:00
Greg Stark	128ce8e1a1	Fix ADD IF NOT EXISTS used in conjunction with ALTER TABLE ONLY The flag for IF NOT EXISTS was only being passed down in the normal recursing case. It's been this way since originally added in 9.6 in commit `2cd40adb85` so backpatch back to 9.6.	2018-12-19 19:40:25 -05:00
Peter Geoghegan	741c4c4142	Correct obsolete nbtree recovery comments. Commit `40dae7ec53`, which made the handling of interrupted nbtree page splits more robust, removed an nbtree-specific end-of-recovery cleanup step. This meant that it was no longer possible to complete an interrupted page split during recovery. However, a reference to recovery as a reason for using a NULL stack while inserting into a parent page was missed. Remove the reference. Remove a similar obsolete reference to recovery that was introduced much more recently, as part of the btree fastpath optimization enhancement that made it into Postgres 11 (commit `2b272734`, and follow-up commits). Backpatch: 11-, where the fastpath optimization was introduced.	2018-12-18 16:59:49 -08:00
Tom Lane	ad425aaf06	Fix ancient thinko in mergejoin cost estimation. "rescanratio" was computed as 1 + rescanned-tuples / total-inner-tuples, which is sensible if it's to be multiplied by total-inner-tuples or a cost value corresponding to scanning all the inner tuples. But in reality it was (mostly) multiplied by inner_rows or a related cost, numbers that take into account the possibility of stopping short of scanning the whole inner relation thanks to a limited key range in the outer relation. This'd still make sense if we could expect that stopping short would result in a proportional decrease in the number of tuples that have to be rescanned. It does not, however. The argument that establishes the validity of our estimate for that number is independent of whether we scan all of the inner relation or stop short, and experimentation also shows that stopping short doesn't reduce the number of rescanned tuples. So the correct calculation is 1 + rescanned-tuples / inner_rows, and we should be sure to multiply that by inner_rows or a corresponding cost value. Most of the time this doesn't make much difference, but if we have both a high rescan rate (due to lots of duplicate values) and an outer key range much smaller than the inner key range, then the error can be significant, leading to a large underestimate of the cost associated with rescanning. Per report from Vijaykumar Jain. This thinko appears to go all the way back to the introduction of the rescan estimation logic in commit `70fba7043`, so back-patch to all supported branches. Discussion: https://postgr.es/m/CAE7uO5hMb_TZYJcZmLAgO6iD68AkEK6qCe7i=vZUkCpoKns+EQ@mail.gmail.com	2018-12-18 11:19:38 -05:00
Michael Paquier	d0960eb8ab	Include ALTER INDEX SET STATISTICS in pg_dump The new grammar pattern of ALTER INDEX SET STATISTICS able to use column numbers on top of the existing column names introduced by commit `5b6d13e` forgot to add support for the feature in pg_dump, so defining statistics on index columns was missing from the dumps, potentially causing silent planning problems with a subsequent restore. pg_dump ought to not use column names in what it generates as these are automatically generated by the server and could conflict with real relation attributes with matching patterns. "expr" and "exprN", N incremented automatically after the creation of the first one, are used as default attribute names for index expressions, and that could easily match what is defined in other relations, causing the dumps to fail if some of those attributes are renamed at some point. So to avoid any problems, the new grammar with column numbers gets used. Reported-by: Ronan Dunklau Author: Michael Paquier Reviewed-by: Tom Lane, Adrien Nayrat, Amul Sul Discussion: https://postgr.es/m/CAARsnT3UQ4V=yDNW468w8RqHfYiY9mpn2r_c5UkBJ97NAApUEw@mail.gmail.com Backpatch-through: 11, where the new syntax has been introduced.	2018-12-18 09:28:56 +09:00
Amit Kapila	f88dd4fa43	Remove extra semicolons. Reported-by: David Rowley Author: David Rowley Reviewed-by: Amit Kapila Backpatch-through: 10 Discussion: https://postgr.es/m/CAKJS1f8EneeYyzzvdjahVZ6gbAHFkHbSFB5m_C0Y6TUJs9Dgdg@mail.gmail.com	2018-12-17 14:31:50 +05:30
Michael Paquier	e83e0988dc	Fix use-after-free bug when renaming constraints This is an oversight from recent commit `b13fd344`. While on it, tweak the previous test with a better name for the renamed primary key. Detected by buildfarm member prion which forces relation cache release with -DRELCACHE_FORCE_RELEASE. Back-patch down to 9.4 as the previous commit.	2018-12-17 12:43:39 +09:00
Michael Paquier	25b8094d33	Make constraint rename issue relcache invalidation on target relation When a constraint gets renamed, it may have associated with it a target relation (for example domain constraints don't have one). Not invalidating the target relation cache when issuing the renaming can result in issues with subsequent commands that refer to the old constraint name using the relation cache, causing various failures. One pattern spotted was using CREATE TABLE LIKE after a constraint renaming. Reported-by: Stuart <sfbarbee@gmail.com> Author: Amit Langote Reviewed-by: Michael Paquier Discussion: https://postgr.es/m/2047094.V130LYfLq4@station53.ousa.org	2018-12-17 10:36:03 +09:00
Tom Lane	b054498c74	Make error handling in parallel pg_upgrade less bogus. reap_child() basically ignored the possibility of either an error in waitpid() itself or a child process failure on signal. We don't really need to do more than report and crash hard, but proceeding as though nothing is wrong is definitely Not Acceptable. The error report for nonzero child exit status was pretty off-point, as well. Noted while fooling around with child-process failure detection logic elsewhere. It's been like this a long time, so back-patch to all supported branches.	2018-12-16 14:51:47 -05:00
Tom Lane	b1894a6076	Improve detection of child-process SIGPIPE failures. Commit `ffa4cbd62` added logic to detect SIGPIPE failure of a COPY child process, but it only worked correctly if the SIGPIPE occurred in the immediate child process. Depending on the shell in use and the complexity of the shell command string, we might instead get back an exit code of 128 + SIGPIPE, representing a shell error exit reporting SIGPIPE in the child process. We could just hack up ClosePipeToProgram() to add the extra case, but it seems like this is a fairly general issue deserving a more general and better-documented solution. I chose to add a couple of functions in src/common/wait_error.c, which is a natural place to know about wait-result encodings, that will test for either a specific child-process signal type or any child-process signal failure. Then, adjust other places that were doing ad-hoc tests of this type to use the common functions. In RestoreArchivedFile, this fixes a race condition affecting whether the process will report an error or just silently proc_exit(1): before, that depended on whether the intermediate shell got SIGTERM'd itself or reported a child process failing on SIGTERM. Like the previous patch, back-patch to v10; we could go further but there seems no real need to. Per report from Erik Rijkers. Discussion: https://postgr.es/m/f3683f87ab1701bea5d86a7742b22432@xs4all.nl	2018-12-16 14:32:14 -05:00
Tom Lane	171cae1579	Fix bogus logic for skipping unnecessary partcollation dependencies. The idea here is to not call recordDependencyOn for the default collation, since we know that's pinned. But what the code actually did was to record the partition key's dependency on the opclass twice, instead. Evidently introduced by sloppy coding in commit `2186b608b`. Back-patch to v10 where that came in.	2018-12-13 15:11:16 -05:00
Alexander Korotkov	dd951dc34a	Prevent GIN deleted pages from being reclaimed too early When GIN vacuum deletes a posting tree page, it assumes that no concurrent searchers can access it, thanks to ginStepRight() locking two pages at once. However, since 9.4 searches can skip parts of posting trees descending from the root. That leads to the risk that page is deleted and reclaimed before concurrent search can access it. This commit prevents the risk of above by waiting for every transaction, which might wait to reference this page, to finish. Due to binary compatibility we can't change GinPageOpaqueData to store corresponding transaction id. Instead we reuse page header pd_prune_xid field, which is unused in index pages. Discussion: https://postgr.es/m/31a702a.14dd.166c1366ac1.Coremail.chjischj%40163.com Author: Andrey Borodin, Alexander Korotkov Reviewed-by: Alexander Korotkov Backpatch-through: 9.4	2018-12-13 06:39:53 +03:00
Alexander Korotkov	225b5c9c48	Prevent deadlock in ginRedoDeletePage() On standby ginRedoDeletePage() can work concurrently with read-only queries. Those queries can traverse posting tree in two ways. 1) Using rightlinks by ginStepRight(), which locks the next page before unlocking its left sibling. 2) Using downlinks by ginFindLeafPage(), which locks at most one page at time. Original lock order was: page, parent, left sibling. That lock order can deadlock with ginStepRight(). In order to prevent deadlock this commit changes lock order to: left sibling, page, parent. Note, that position of parent in locking order seems insignificant, because we only lock one page at time while traversing downlinks. Reported-by: Chen Huajun Diagnosed-by: Chen Huajun, Peter Geoghegan, Andrey Borodin Discussion: https://postgr.es/m/31a702a.14dd.166c1366ac1.Coremail.chjischj%40163.com Author: Alexander Korotkov Backpatch-through: 9.4	2018-12-13 06:15:23 +03:00
Alexander Korotkov	9aa94d8536	Fix deadlock in GIN vacuum introduced by `218f51584d` Before `218f51584d` if posting tree page is about to be deleted, then the whole posting tree is locked by LockBufferForCleanup() on root preventing all the concurrent inserts. `218f51584d` reduced locking to the subtree containing page to be deleted. However, due to concurrent parent split, inserter doesn't always holds pins on all the pages constituting path from root to the target leaf page. That could cause a deadlock between GIN vacuum process and GIN inserter. And we didn't find non-invasive way to fix this. This commit reverts VACUUM behavior to lock the whole posting tree before delete any page. However, we keep another useful change by `218f51584d`: the tree is locked only if there are pages to be deleted. Reported-by: Chen Huajun Diagnosed-by: Chen Huajun, Andrey Borodin, Peter Geoghegan Discussion: https://postgr.es/m/31a702a.14dd.166c1366ac1.Coremail.chjischj%40163.com Author: Alexander Korotkov, based on ideas from Andrey Borodin and Peter Geoghegan Reviewed-by: Andrey Borodin Backpatch-through: 10	2018-12-13 06:15:23 +03:00
Tom Lane	7465871879	Repair bogus EPQ plans generated for postgres_fdw foreign joins. postgres_fdw's postgresGetForeignPlan() assumes without checking that the outer_plan it's given for a join relation must have a NestLoop, MergeJoin, or HashJoin node at the top. That's been wrong at least since commit `4bbf6edfb` (which could cause insertion of a Sort node on top) and it seems like a pretty unsafe thing to Just Assume even without that. Through blind good fortune, this doesn't seem to have any worse consequences today than strange EXPLAIN output, but it's clearly trouble waiting to happen. To fix, test the node type explicitly before touching Join-specific fields, and avoid jamming the new tlist into a node type that can't do projection. Export a new support function from createplan.c to avoid building low-level knowledge about the latter into FDWs. Back-patch to 9.6 where the faulty coding was added. Note that the associated regression test cases don't show any changes before v11, apparently because the tests back-patched with `4bbf6edfb` don't actually exercise the problem case before then (there's no top-level Sort in those plans). Discussion: https://postgr.es/m/8946.1544644803@sss.pgh.pa.us	2018-12-12 16:08:30 -05:00
Tom Lane	302d4eee93	Repair bogus handling of multi-assignment Params in upper plan levels. Our support for multiple-set-clauses in UPDATE assumes that the Params referencing a MULTIEXPR_SUBLINK SubPlan will appear before that SubPlan in the targetlist of the plan node that calculates the updated row. (Yeah, it's a hack...) In some PG branches it's possible that a Result node gets inserted between the primary calculation of the update tlist and the ModifyTable node. setrefs.c did the wrong thing in this case and left the upper-level Params as Params, causing a crash at runtime. What it should do is replace them with "outer" Vars referencing the child plan node's output. That's a result of careless ordering of operations in fix_upper_expr_mutator, so we can fix it just by reordering the code. Fix fix_join_expr_mutator similarly for consistency, even though join nodes could never appear in such a context. (In general, it seems likely to be a bit cheaper to use Vars than Params in such situations anyway, so this patch might offer a tiny performance improvement.) The hazard extends back to 9.5 where the MULTIEXPR_SUBLINK stuff was introduced, so back-patch that far. However, this may be a live bug only in 9.6.x and 10.x, as the other branches don't seem to want to calculate the final tlist below the Result node. (That plan shape change between branches might be a mini-bug in itself, but I'm not really interested in digging into the reasons for that right now. Still, add a regression test memorializing what we expect there, so we'll notice if it changes again.) Per bug report from Eduards Bezverhijs. Discussion: https://postgr.es/m/b6cd572a-3e44-8785-75e9-c512a5a17a73@tieto.com	2018-12-12 13:49:41 -05:00
Tom Lane	4e33da5f0a	Fix test_rls_hooks to assign expression collations properly. This module overlooked this necessary fixup step on the results of transformWhereClause(). It accidentally worked anyway, because the constructed expression involved type "name" which is not collatable, but it fell over while I was experimenting with changing "name" to be collatable. Back-patch, not because there's any live bug here in back branches, but because somebody might use this code as a model for some real application and then not understand why it doesn't work.	2018-12-11 11:48:00 -05:00
Noah Misch	73822b8c97	Raise some timeouts to 180s, in test code. Slow runs of buildfarm members chipmunk, hornet and mandrill saw the shorter timeouts expire. The 180s timeout in poll_query_until has been trouble-free since `2a0f89cd71` introduced it two years ago, so use 180s more widely. Back-patch to 9.6, where the first of these timeouts was introduced. Reviewed by Michael Paquier. Discussion: https://postgr.es/m/20181209001601.GC2973271@rfd.leadboat.com	2018-12-10 20:15:55 -08:00
Tom Lane	62999b9325	Add stack depth checks to key recursive functions in backend/nodes/*.c. Although copyfuncs.c has a check_stack_depth call in its recursion, equalfuncs.c, outfuncs.c, and readfuncs.c lacked one. This seems unwise. Likewise fix planstate_tree_walker(), in branches where that exists. Discussion: https://postgr.es/m/30253.1544286631@sss.pgh.pa.us	2018-12-10 11:12:43 -05:00
Tom Lane	a628e0c5b4	Make TupleDescInitBuiltinEntry throw error for unsupported types. Previously, it would just pass back a partially-uninitialized tupdesc, which doesn't seem like a safe or useful behavior. Backpatch to v10 where this code came in. Discussion: https://postgr.es/m/30830.1544384975@sss.pgh.pa.us	2018-12-10 10:38:49 -05:00
Tom Lane	aedd3d4dbd	Fix misapplication of pgstat_count_truncate to wrong relation. The stanza of ExecuteTruncate[Guts] that truncates a target table's toast relation re-used the loop local variable "rel" to reference the toast rel. This was safe enough when written, but commit `d42358efb` added code below that that supposed "rel" still pointed to the parent table. Therefore, the stats counter update was applied to the wrong relcache entry (the toast rel not the user rel); and if we were unlucky and that relcache entry had been flushed during reindex_relation, very bad things could ensue. (I'm surprised that CLOBBER_CACHE_ALWAYS testing hasn't found this. I'm even more surprised that the problem wasn't detected during the development of d42358efb; it must not have been tested in any case with a toast table, as the incorrect stats counts are very obvious.) To fix, replace use of "rel" in that code branch with a more local variable. Adjust test cases added by `d42358efb` so that some of them use tables with toast tables. Per bug #15540 from Pan Bian. Back-patch to 9.5 where `d42358efb` came in. Discussion: https://postgr.es/m/15540-01078812338195c0@postgresql.org	2018-12-07 12:12:00 -05:00
Tom Lane	6bc8193193	Clean up sloppy coding in publicationcmds.c's OpenTableList(). Remove dead code (which would be incorrect if it weren't dead), per report from Pan Bian. Add a CHECK_FOR_INTERRUPTS in the inner loop over child relations, because there's little point in having one in the outer loop if there's not one here too. Minor stylistic adjustments and comment improvements. Seems to be aboriginal to this code (cf commit `665d1fad9`). Back-patch to v10 where that came in, not because any of this is significant, but just to keep the branches looking similar. Discussion: https://postgr.es/m/15539-06d00ef6b1e2e1bb@postgresql.org	2018-12-07 11:02:39 -05:00
Tom Lane	d8e1de899c	Improve our response to invalid format strings, and detect more cases. Places that are testing for *printf failure ought to include the format string in their error reports, since bad-format-string is one of the more likely causes of such failure. This both makes it easier to find and repair the mistake, and provides at least some useful info to the user who stumbles across such a problem. Also, tighten snprintf.c to report EINVAL for an invalid flag or final character in a format %-spec (including the case where the %-spec is missing a final character altogether). This seems like better project policy, and it also allows removing an instruction or two from the hot code path. Back-patch the error reporting change in pvsnprintf, since it should be harmless and may be helpful; but not the snprintf.c change. Per discussion of bug #15511 from Ertuğrul Kahveci, which reported an invalid translated format string. These changes don't fix that error, but they should improve matters next time we make such a mistake. Discussion: https://postgr.es/m/15511-1d8b6a0bc874112f@postgresql.org	2018-12-06 15:08:44 -05:00
Alvaro Herrera	37798a8e83	Don't mark partitioned indexes invalid unnecessarily When an indexes is created on a partitioned table using ONLY (don't recurse to partitions), it gets marked invalid until index partitions are attached for each table partition. But there's no reason to do this if there are no partitions ... and moreover, there's no way to get the index to become valid afterwards, because all partitions that get created/attached get their own index partition already attached to the parent index, so there's no chance to do ALTER INDEX ... ATTACH PARTITION that would make the parent index valid. Fix by not marking the index as invalid to begin with. This is very similar to `9139aa1942`, but the pg_dump aspect does not appear to be relevant until we add FKs that can point to PKs on partitioned tables. (I tried to cause the pg_upgrade test to break by leaving some of these bogus tables around, but wasn't able to.) Making this change means that an index that was supposed to be invalid in the insert_conflict regression test is no longer invalid; reorder the DDL so that the test continues to verify the behavior we want it to. Author: Álvaro Herrera Reviewed-by: Amit Langote Discussion: https://postgr.es/m/20181203225019.2vvdef2ybnkxt364@alvherre.pgsql	2018-12-05 13:31:55 -03:00
Michael Paquier	19516afdf1	Fix various checksum check problems for pg_verify_checksums and base backups Three issues are fixed in this patch: - Base backups forgot to ignore files specific to EXEC_BACKEND, leading to spurious warnings when checksums are enabled, per analysis from me. - pg_verify_checksums forgot about files specific to EXEC_BACKEND, leading to failures of the tool on any such build, particularly Windows. This error was originally found by newly-introduced TAP tests in various buildfarm members using EXEC_BACKEND. - pg_verify_checksums forgot to count for temporary files and temporary paths, which could be valid relation files, without checksums, per report from Andres Freund. More tests are added to cover this case. A new test case which emulates corruption for a file in a different tablespace is added, coming from from Michael Banck, while I have coded the main code and refactored the test code. Author: Michael Banck, Michael Paquier Reviewed-by: Stephen Frost, David Steele Discussion: https://postgr.es/m/20181021134206.GA14282@paquier.xyz	2018-11-30 10:34:56 +09:00
Michael Paquier	85036308dc	Switch pg_verify_checksums back to a blacklist This basically reverts commit `d55241af70`, leaving around a portion of the regression tests still adapted with empty relation files, and corrupted cases. This is also proving to be failing to check properly relation files located in a non-default tablespace path. Per discussion with various folks, including Stephen Frost, David Steele, Andres Freund, Michael Banck and myself. Reported-by: Michael Banck Discussion: https://postgr.es/m/20181021134206.GA14282@paquier.xyz Backpatch-through: 11	2018-11-30 10:15:06 +09:00
Tom Lane	0ff56de46e	Ensure static libraries have correct mod time even if ranlib messes it up. In at least Apple's version of ranlib, the output file is updated to have a mod time equal to the max of the timestamps of its components, and that data only has seconds precision. On a filesystem with sub-second file timestamp precision --- say, APFS --- this can result in the finished static library appearing older than its input files, which causes useless rebuilds and possible outright failures in parallel makes. We've only seen this reported in the field from people using Apple's ranlib with a non-Apple make, because Apple's make doesn't know about sub-second timestamps either so it doesn't decide rebuilds are needed. But Apple's ranlib presumably shares code with at least some BSDen, so it's not that unlikely that the same problem could arise elsewhere. To fix, just "touch" the output file after ranlib finishes. We seem to need this in only one place. There are other calls of ranlib in our makefiles, but they are working on intermediate files whose timestamps are not actually important, or else on an installed static library for which sub-second timestamp precision is unlikely to matter either. (Also, so far as I can tell, Apple's ranlib doesn't mess up the file timestamp in the latter usage anyhow.) In passing, change "ranlib" to "$(RANLIB)" in one place that was bypassing the make macro for no good reason. Per bug #15525 from Jack Kelly (via Alyssa Ross). Back-patch to all supported branches. Discussion: https://postgr.es/m/15525-a30da084f17a1faa@postgresql.org	2018-11-29 15:53:44 -05:00
Thomas Munro	f7001b00a0	Fix minor typo in dsa.c. Author: Takeshi Ideriha Discussion: https://postgr.es/m/4E72940DA2BF16479384A86D54D0988A6F3BF22D%40G01JPEXMBKW04	2018-11-29 14:14:52 +13:00
Michael Paquier	bad41764a4	Fix handling of synchronous replication for stopping WAL senders This fixes an oversight from `c6c3334` which forgot that if a subset of WAL senders are stopping and in a sync state, other WAL senders could still be waiting for a WAL position to be synced while committing a transaction. However the subset of stopping senders would not release waiters, potentially breaking synchronous replication guarantees. This commit makes sure that even WAL senders stopping are able to release waiters and are tracked properly. On 9.4, this can also trigger an assertion failure when setting for example max_wal_senders to 1 where a WAL sender is not able to find itself as in synchronous state when the instance stops. Reported-by: Paul Guo Author: Paul Guo, Michael Paquier Discussion: https://postgr.es/m/CAEET0ZEv8VFqT3C-cQm6byOB4r4VYWcef1J21dOX-gcVhCSpmA@mail.gmail.com Backpatch-through: 9.4	2018-11-29 09:12:40 +09:00
Peter Geoghegan	95c4571812	Have BufFileSize() ereport() on FileSize() failure. Move the responsibility for checking for and reporting a failure from the only current BufFileSize() caller, logtape.c, to BufFileSize() itself. Code within buffile.c is generally responsible for interfacing with fd.c to report irrecoverable failures. This seems like a convention that's worth sticking to. Reorganizing things this way makes it easy to make the error message raised in the event of BufFileSize() failure descriptive of the underlying problem. We're now clear on the distinction between temporary file name and BufFile name, and can show errno, confident that its value actually relates to the error being reported. In passing, an existing, similar buffile.c ereport() + errcode_for_file_access() site is changed to follow the same conventions. The API of the function BufFileSize() is changed by this commit, despite already being in a stable release (Postgres 11). This seems acceptable, since the BufFileSize() ABI was changed by commit `aa55183042`, which hasn't made it into a point release yet. Besides, it's difficult to imagine a third party BufFileSize() caller not just raising an error anyway, since BufFile state should be considered corrupt when BufFileSize() fails. Per complaint from Tom Lane. Discussion: https://postgr.es/m/26974.1540826748@sss.pgh.pa.us Backpatch: 11-, where shared BufFiles were introduced.	2018-11-28 14:42:52 -08:00
Thomas Munro	0640d9517e	Don't set PAM_RHOST for Unix sockets. Since commit `2f1d2b7a` we have set PAM_RHOST to "[local]" for Unix sockets. This caused Linux PAM's libaudit integration to make DNS requests for that name. It's not exactly clear what value PAM_RHOST should have in that case, but it seems clear that we shouldn't set it to an unresolvable name, so don't do that. Back-patch to 9.6. Bug #15520. Author: Thomas Munro Reviewed-by: Peter Eisentraut Reported-by: Albert Schabhuetl Discussion: https://postgr.es/m/15520-4c266f986998e1c5%40postgresql.org	2018-11-28 14:14:40 +13:00
Tomas Vondra	f8397c955e	Do not decode TOAST data for table rewrites During table rewrites (VACUUM FULL and CLUSTER), the main heap is logged using XLOG / FPI records, and thus (correctly) ignored in decoding. But the associated TOAST table is WAL-logged as plain INSERT records, and so was logically decoded and passed to reorder buffer. That has severe consequences with TOAST tables of non-trivial size. Firstly, reorder buffer has to keep all those changes, possibly spilling them to a file, incurring I/O costs and disk space. Secondly, ReoderBufferCommit() was stashing all those TOAST chunks into a hash table, which got discarded only after processing the row from the main heap. But as the main heap is not decoded for rewrites, this never happened, so all the TOAST data accumulated in memory, resulting either in excessive memory consumption or OOM. The fix is simple, as commit `e9edc1ba` already introduced infrastructure (namely HEAP_INSERT_NO_LOGICAL flag) to skip logical decoding of TOAST tables, but it only applied it to system tables. So simply use it for all TOAST data in raw_heap_insert(). That would however solve only the memory consumption issue - the TOAST changes would still be decoded and added to the reorder buffer, and spilled to disk (although without TOAST tuple data, so much smaller). But we can solve that by tweaking DecodeInsert() to just ignore such INSERT records altogether, using XLH_INSERT_CONTAINS_NEW_TUPLE flag, instead of skipping them later in ReorderBufferCommit(). Review: Masahiko Sawada Discussion: https://www.postgresql.org/message-id/flat/1a17c643-e9af-3dba-486b-fbe31bc1823a%402ndquadrant.com Backpatch: 9.4-, where logical decoding was introduced	2018-11-28 01:43:53 +01:00
Andres Freund	aee085bc01	Fix jit compilation bug on wide tables. The function generated to perform JIT compiled tuple deforming failed when HeapTupleHeader's t_hoff was bigger than a signed int8. I'd failed to realize that LLVM's getelementptr would treat an int8 index argument as signed, rather than unsigned. That means that a hoff larger than 127 would result in a negative offset being applied. Fix that by widening the index to 32bit. Add a testcase with a wide table. Don't drop it, as it seems useful to verify other tools deal properly with wide tables. Thanks to Justin Pryzby for both reporting a bug and then reducing it to a reproducible testcase! Reported-By: Justin Pryzby Author: Andres Freund Discussion: https://postgr.es/m/20181115223959.GB10913@telsasoft.com Backpatch: 11, just as jit compilation was	2018-11-27 10:07:43 -08:00
Andres Freund	5ef8f08b54	Fix `ac218aa4f6` to work on versions before 9.5. Unfortunately `ac218aa4f6` missed the fact that a reference to 'pg_catalog.regnamespace'::regclass wouldn't work before that type is known. Fix that, by replacing the regtype usage with a join to pg_type. Reported-By: Tom Lane Author: Andres Freund Discussion: https://postgr.es/m/8863.1543297423@sss.pgh.pa.us Backpatch: 9.5-, like `ac218aa4f6`	2018-11-26 23:26:15 -08:00
Andres Freund	203a909fd2	Update pg_upgrade test for reg* to include regrole and regnamespace. When the regrole (`0c90f6769`) and regnamespace (`cb9fa802b`) types were added in 9.5, pg_upgrade's check for reg* types wasn't updated. While regrole currently is safe, regnamespace is not. It seems unlikely that anybody uses regnamespace inside catalog tables across a pg_upgrade, but the tests should be correct nevertheless. While at it, reorder the types checked in the query to be alphabetical. Otherwise it's annoying to compare existing and tested for types. Author: Andres Freund Discussion: https://postgr.es/m/037e152a-cb25-3bcb-4f35-bdc9988f8204@2ndQuadrant.com Backpatch: 9.5-, as regrole/regnamespace	2018-11-26 17:00:59 -08:00

1 2 3 4 5 ...

32562 Commits