postgresql

Commit Graph

Author	SHA1	Message	Date
Alvaro Herrera	50e3ed8e91	Fix replay of create database records on standby Crash recovery on standby may encounter missing directories when replaying create database WAL records. Prior to this patch, the standby would fail to recover in such a case. However, the directories could be legitimately missing. Consider a sequence of WAL records as follows: CREATE DATABASE DROP DATABASE DROP TABLESPACE If, after replaying the last WAL record and removing the tablespace directory, the standby crashes and has to replay the create database record again, the crash recovery must be able to move on. This patch adds a mechanism similar to invalid-page tracking, to keep a tally of missing directories during crash recovery. If all the missing directory references are matched with corresponding drop records at the end of crash recovery, the standby can safely continue following the primary. Backpatch to 13, at least for now. The bug is older, but fixing it in older branches requires more careful study of the interactions with commit `e6d8069522`, which appeared in 13. A new TAP test file is added to verify the condition. However, because it depends on commit `d6d317dbf6`, it can only be added to branch master. I (Álvaro) manually verified that the code behaves as expected in branch 14. It's a bit nervous-making to leave the code uncovered by tests in older branches, but leaving the bug unfixed is even worse. Also, the main reason this fix took so long is precisely that we couldn't agree on a good strategy to approach testing for the bug, so perhaps this is the best we can do. Diagnosed-by: Paul Guo <paulguo@gmail.com> Author: Paul Guo <paulguo@gmail.com> Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Author: Asim R Praveen <apraveen@pivotal.io> Discussion: https://postgr.es/m/CAEET0ZGx9AvioViLf7nbR_8tH9-=27DN5xWJ2P9-ROH16e4JUA@mail.gmail.com	2022-03-25 13:16:21 +01:00
Robert Haas	1ce14b6b2f	Fix possible recovery trouble if TRUNCATE overlaps a checkpoint. If TRUNCATE causes some buffers to be invalidated and thus the checkpoint does not flush them, TRUNCATE must also ensure that the corresponding files are truncated on disk. Otherwise, a replay from the checkpoint might find that the buffers exist but have the wrong contents, which may cause replay to fail. Report by Teja Mupparti. Patch by Kyotaro Horiguchi, per a design suggestion from Heikki Linnakangas, with some changes to the comments by me. Review of this and a prior patch that approached the issue differently by Heikki Linnakangas, Andres Freund, Álvaro Herrera, Masahiko Sawada, and Tom Lane. Discussion: http://postgr.es/m/BYAPR06MB6373BF50B469CA393C614257ABF00@BYAPR06MB6373.namprd06.prod.outlook.com	2022-03-24 14:36:06 -04:00
Andres Freund	c0f99bb520	Don't try to translate NULL in GetConfigOptionByNum(). Noticed via -fsanitize=undefined. Introduced when a few columns in GetConfigOptionByNum() / pg_settings started to be translated in `72be8c29a` / PG 12. Backpatch to all affected branches, for the same reasons as `46ab07ffda`. Discussion: https://postgr.es/m/20220323173537.ll7klrglnp4gn2um@alap3.anarazel.de Backpatch: 12-	2022-03-23 13:18:00 -07:00
Andres Freund	8014c61ebd	Don't call fwrite() with len == 0 when writing out relcache init file. Noticed via -fsanitize=undefined. Backpatch to all branches, for the same reasons as `46ab07ffda`. Discussion: https://postgr.es/m/20220323173537.ll7klrglnp4gn2um@alap3.anarazel.de Backpatch: 10-	2022-03-23 13:13:20 -07:00
Andres Freund	7c163aa93f	configure: check for dlsym instead of dlopen. When building with sanitizers the sanitizer library provides dlopen, but not dlsym(), making configure think that -ldl isn't needed. Just checking for dlsym() ought to suffice, hard to see dlsym() being provided without dlopen() also being provided. Backpatch to all branches, for the same reasons as `46ab07ffda`. Reviewed-By: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/20220323173537.ll7klrglnp4gn2um@alap3.anarazel.de Backpatch: 10-	2022-03-23 12:43:40 -07:00
Alvaro Herrera	ca26c64581	pg_upgrade: Upgrade an Assert to a real 'if' test It seems possible for the condition being tested to be true in production, and nobody would never know (except when some data eventually becomes corrupt?). Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://postgr.es/m//202109040001.zky3wgv2qeqg@alvherre.pgsql	2022-03-23 19:23:51 +01:00
Alvaro Herrera	98eb3e06ce	Fix "missing continuation record" after standby promotion Invalidate abortedRecPtr and missingContrecPtr after a missing continuation record is successfully skipped on a standby. This fixes a PANIC caused when a recently promoted standby attempts to write an OVERWRITE_RECORD with an LSN of the previously read aborted record. Backpatch to 10 (all stable versions). Author: Sami Imseih <simseih@amazon.com> Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://postgr.es/m/44D259DE-7542-49C4-8A52-2AB01534DCA9@amazon.com	2022-03-23 18:22:10 +01:00
Thomas Munro	f784fcdc44	Try to stabilize vacuum test. As commits `b700f96c` and `3414099c` did for the reloptions test, make sure VACUUM can always truncate the table as expected. Back-patch to 12, where vacuum_truncate arrived. Discussion: https://postgr.es/m/CAD21AoCNoWjYkdEtr%2BVDoF9v__V905AedKZ9iF%3DArgCtrbxZqw%40mail.gmail.com	2022-03-23 15:07:46 +13:00
Andres Freund	f183e23cc8	Add missing dependency of pg_dumpall to WIN32RES. When cross-building to windows, or building with mingw on windows, the build could fail with x86_64-w64-mingw32-gcc: error: win32ver.o: No such file or director because pg_dumpall didn't depend on WIN32RES, but it's recipe references it. The build nevertheless succeeded most of the time, due to pg_dump/pg_restore having the required dependency, causing win32ver.o to be built. Reported-By: Thomas Munro <thomas.munro@gmail.com> Discussion: https://postgr.es/m/CA+hUKGJeekpUPWW6yCVdf9=oBAcCp86RrBivo4Y4cwazAzGPng@mail.gmail.com Backpatch: 10-, omission present on all live branches	2022-03-22 08:28:52 -07:00
Michael Paquier	f32be9938c	Fix failures in SSL tests caused by out-of-tree keys and certificates This issue is environment-sensitive, where the SSL tests could fail in various way by feeding on defaults provided by sslcert, sslkey, sslrootkey, sslrootcert, sslcrl and sslcrldir coming from a local setup, as of ~/.postgresql/ by default. Horiguchi-san has reported two failures, but more advanced testing from me (aka inclusion of garbage SSL configuration in ~/.postgresql/ for all the configuration parameters) has showed dozens of failures that can be triggered in the whole test suite. History has showed that we are not good when it comes to address such issues, fixing them locally like in `dd87799`, and such problems keep appearing. This commit strengthens the entire test suite to put an end to this set of problems by embedding invalid default values in all the connection strings used in the tests. The invalid values are prefixed in each connection string, relying on the follow-up values passed in the connection string to enforce any invalid value previously set. Note that two tests related to CRLs are required to fail with certain pre-set configurations, but we can rely on enforcing an empty value instead after the invalid set of values. Reported-by: Kyotaro Horiguchi Reviewed-by: Andrew Dunstan, Daniel Gustafsson, Kyotaro Horiguchi Discussion: https://postgr.es/m/20220316.163658.1122740600489097632.horikyota.ntt@gmail.com backpatch-through: 10	2022-03-22 13:21:39 +09:00
Tom Lane	dfefe38fbe	Fix assorted missing logic for GroupingFunc nodes. The planner needs to treat GroupingFunc like Aggref for many purposes, in particular with respect to processing of the argument expressions, which are not to be evaluated at runtime. A few places hadn't gotten that memo, notably including subselect.c's processing of outer-level aggregates. This resulted in assertion failures or wrong plans for cases in which a GROUPING() construct references an outer aggregation level. Also fix missing special cases for GroupingFunc in cost_qual_eval (resulting in wrong cost estimates for GROUPING(), although it's not clear that that would affect plan shapes in practice) and in ruleutils.c (resulting in excess parentheses in pretty-print mode). Per bug #17088 from Yaoguang Chen. Back-patch to all supported branches. Richard Guo, Tom Lane Discussion: https://postgr.es/m/17088-e33882b387de7f5c@postgresql.org	2022-03-21 17:44:29 -04:00
Tom Lane	2241e5ceda	Fix risk of deadlock failure while dropping a partitioned index. DROP INDEX needs to lock the index's table before the index itself, else it will deadlock against ordinary queries that acquire the relation locks in that order. This is correctly mechanized for plain indexes by RangeVarCallbackForDropRelation; but in the case of a partitioned index, we neglected to lock the child tables in advance of locking the child indexes. We can fix that by traversing the inheritance tree and acquiring the needed locks in RemoveRelations, after we have acquired our locks on the parent partitioned table and index. While at it, do some refactoring to eliminate confusion between the actual and expected relkind in RangeVarCallbackForDropRelation. We can save a couple of syscache lookups too, by having that function pass back info that RemoveRelations will need. Back-patch to v11 where partitioned indexes were added. Jimmy Yih, Gaurab Dey, Tom Lane Discussion: https://postgr.es/m/BYAPR05MB645402330042E17D91A70C12BD5F9@BYAPR05MB6454.namprd05.prod.outlook.com	2022-03-21 12:22:13 -04:00
Tom Lane	36c3acb397	Doc: fix our example systemd script. The example used "TimeoutSec=0", but systemd's documented way to get the desired effect is "TimeoutSec=infinity". Discussion: https://postgr.es/m/164770078557.670.5467111518383664377@wrigleys.postgresql.org	2022-03-20 12:39:53 -04:00
Michael Paquier	fe14b0dd45	doc: Mention SET TABLESPACE clause for ALTER MATERIALIZED VIEW This command flavor is supported, but there was nothing in the documentation about it. Author: Yugo Nagata Discussion: https://postgr.es/m/20220316133337.5dc9740abfa24c25ec9f67f5@sraoss.co.jp Backpatch-through: 10	2022-03-19 16:37:43 +09:00
Tom Lane	88ae77588d	Fix incorrect xmlschema output for types timetz and timestamptz. The output of table_to_xmlschema() and allied functions includes a regex describing valid values for these types ... but the regex was itself invalid, as it failed to escape a literal "+" sign. Report and fix by Renan Soares Lopes. Back-patch to all supported branches. Discussion: https://postgr.es/m/7f6fabaa-3f8f-49ab-89ca-59fbfe633105@me.com	2022-03-18 16:01:42 -04:00
Tom Lane	5e144cc89b	Revert applying column aliases to the output of whole-row Vars. In commit `bf7ca1587`, I had the bright idea that we could make the result of a whole-row Var (that is, foo.) track any column aliases that had been applied to the FROM entry the Var refers to. However, that's not terribly logically consistent, because now the output of the Var is no longer of the named composite type that the Var claims to emit. `bf7ca1587` tried to handle that by changing the output tuple values to be labeled with a blessed RECORD type, but that's really pretty disastrous: we can wind up storing such tuples onto disk, whereupon they're not readable by other sessions. The only practical fix I can see is to give up on what `bf7ca1587` tried to do, and say that the column names of tuples produced by a whole-row Var are always those of the underlying named composite type, query aliases or no. While this introduces some inconsistencies, it removes others, so it's not that awful in the abstract. What is* kind of awful is to make such a behavioral change in a back-patched bug fix. But corrupt data is worse, so back-patched it will be. (A workaround available to anyone who's unhappy about this is to introduce an extra level of sub-SELECT, so that the whole-row Var is referring to the sub-SELECT's output and not to a named table type. Then the Var is of type RECORD to begin with and there's no issue.) Per report from Miles Delahunty. The faulty commit dates to 9.5, so back-patch to all supported branches. Discussion: https://postgr.es/m/2950001.1638729947@sss.pgh.pa.us	2022-03-17 18:18:05 -04:00
Tomas Vondra	27fafee72d	Fix publish_as_relid with multiple publications Commit `83fd4532a7` allowed publishing of changes via ancestors, for publications defined with publish_via_partition_root. But the way the ancestor was determined in get_rel_sync_entry() was incorrect, simply updating the same variable. So with multiple publications, replicating different ancestors, the outcome depended on the order of publications in the list - the value from the last loop was used, even if it wasn't the top-most ancestor. This is a probably rare situation, as in most cases publications do not overlap, so each partition has exactly one candidate ancestor to replicate as and there's no ambiguity. Fixed by tracking the "ancestor level" for each publication, and picking the top-most ancestor. Adds a test case, verifying the correct ancestor is used for publishing the changes and that this does not depend on order of publications in the list. Older releases have another bug in this loop - once all actions are replicated, the loop is terminated, on the assumption that inspecting additional publications is unecessary. But that misses the fact that those additional applications may replicate different ancestors. Fixed by removal of this break condition. We might still terminate the loop in some cases (e.g. when replicating all actions and the ancestor is the partition root). Backpatch to 13, where publish_via_partition_root was introduced. Initial report and fix by me, test added by Hou zj. Reviews and improvements by Amit Kapila. Author: Tomas Vondra, Hou zj, Amit Kapila Reviewed-by: Amit Kapila, Hou zj Discussion: https://postgr.es/m/d26d24dd-2fab-3c48-0162-2b7f84a9c893%40enterprisedb.com	2022-03-16 18:14:33 +01:00
Alexander Korotkov	bad202c615	Fix default signature length for gist_ltree_ops `911e702077` implemented operator class parameters including the signature length in ltree. Previously, the signature length for gist_ltree_ops was 8. Because of bug `911e702077` the default signature length for gist_ltree_ops became 28 for ltree 1.1 (where options method is NOT provided) and 8 for ltree 1.2 (where options method is provided). This commit changes the default signature length for ltree 1.1 to 8. Existing gist_ltree_ops indexes might be corrupted in various scenarios. Thus, we have to recommend reindexing all the gist_ltree_ops indexes after the upgrade. Reported-by: Victor Yegorov Reviewed-by: Tomas Vondra, Tom Lane, Andres Freund, Nikita Glukhov Reviewed-by: Andrew Dunstan Author: Tomas Vondra, Alexander Korotkov Discussion: https://postgr.es/m/17406-71e02820ae79bb40%40postgresql.org Discussion: https://postgr.es/m/d80e0a55-6c3e-5b26-53e3-3c4f973f737c%40enterprisedb.com	2022-03-16 11:41:34 +03:00
Thomas Munro	51e760e5a6	Fix race between DROP TABLESPACE and checkpointing. Commands like ALTER TABLE SET TABLESPACE may leave files for the next checkpoint to clean up. If such files are not removed by the time DROP TABLESPACE is called, we request a checkpoint so that they are deleted. However, there is presently a window before checkpoint start where new unlink requests won't be scheduled until the following checkpoint. This means that the checkpoint forced by DROP TABLESPACE might not remove the files we expect it to remove, and the following ERROR will be emitted: ERROR: tablespace "mytblspc" is not empty To fix, add a call to AbsorbSyncRequests() just before advancing the unlink cycle counter. This ensures that any unlink requests forwarded prior to checkpoint start (i.e., when ckpt_started is incremented) will be processed by the current checkpoint. Since AbsorbSyncRequests() performs memory allocations, it cannot be called within a critical section, so we also need to move SyncPreCheckpoint() to before CreateCheckPoint()'s critical section. This is an old bug, so back-patch to all supported versions. Author: Nathan Bossart <nathandbossart@gmail.com> Reported-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Thomas Munro <thomas.munro@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20220215235845.GA2665318%40nathanxps13	2022-03-16 17:21:19 +13:00
Michael Paquier	028a3c6b1c	pageinspect: Fix memory context allocation of page in brin_revmap_data() This caused the function to fail, as the aligned copy of the raw page given by the function caller was not saved in the correct memory context, which needs to be multi_call_memory_ctx in this case. Issue introduced by `076f4d9`. Per buildfarm members sifika, mylodon and longfin. I have reproduced that locally with macos. Discussion: https://postgr.es/m/YjFPOtfCW6yLXUeM@paquier.xyz Backpatch-through: 10	2022-03-16 12:29:55 +09:00
Thomas Munro	cfdb303be7	Fix waiting in RegisterSyncRequest(). If we run out of space in the checkpointer sync request queue (which is hopefully rare on real systems, but common with very small buffer pool), we wait for it to drain. While waiting, we should report that as a wait event so that users know what is going on, and also handle postmaster death, since otherwise the loop might never terminate if the checkpointer has exited. Back-patch to 12. Although the problem exists in earlier releases too, the code is structured differently before 12 so I haven't gone any further for now, in the absence of field complaints. Reported-by: Andres Freund <andres@anarazel.de> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20220226213942.nb7uvb2pamyu26dj%40alap3.anarazel.de	2022-03-16 15:37:15 +13:00
Michael Paquier	d3a9b83c30	pageinspect: Fix handling of page sizes and AM types This commit fixes a set of issues related to the use of the SQL functions in this module when the caller is able to pass down raw page data as input argument: - The page size check was fuzzy in a couple of places, sometimes looking after only a sub-range, but what we are looking for is an exact match on BLCKSZ. After considering a few options here, I have settled down to do a generalization of get_page_from_raw(). Most of the SQL functions already used that, and this is not strictly required if not accessing an 8-byte-wide value from a raw page, but this feels safer in the long run for alignment-picky environment, particularly if a code path begins to access such values. This also reduces the number of strings that need to be translated. - The BRIN function brin_page_items() uses a Relation but it did not check the access method of the opened index, potentially leading to crashes. All the other functions in need of a Relation already did that. - Some code paths could fail on elog(), but we should to use ereport() for failures that can be triggered by the user. Tests are added to stress all the cases that are fixed as of this commit, with some junk raw pages (\set VERBOSITY ensures that this works across all page sizes) and unexpected index types when functions open relations. Author: Michael Paquier, Justin Prysby Discussion: https://postgr.es/m/20220218030020.GA1137@telsasoft.com Backpatch-through: 10	2022-03-16 11:20:51 +09:00
Thomas Munro	5610411ac7	Back-patch LLVM 14 API changes. Since LLVM 14 has stopped changing and is about to be released, back-patch the following changes from the master branch: `e6a7600202` `807fee1a39` `a56e7b6601` Back-patch to 11, where LLVM JIT support came in.	2022-03-16 11:41:13 +13:00
Michael Paquier	ef00502a5a	doc: Add ALTER/DROP ROUTINE to the event trigger matrix ALTER ROUTINE triggers the events ddl_command_start and ddl_command_end, and DROP ROUTINE triggers sql_drop, ddl_command_start and ddl_command_end, but this was not mention on the matrix table. Reported-by: Leslie Lemaire Discussion: https://postgr.es/m/164647533363.646.5802968483136493025@wrigleys.postgresql.org Backpatch-through: 11	2022-03-09 14:59:22 +09:00
Noah Misch	29ec94efd0	Introduce PG_TEST_TIMEOUT_DEFAULT for TAP suite non-elapsing timeouts. Slow hosts may avoid load-induced, spurious failures by setting environment variable PG_TEST_TIMEOUT_DEFAULT to some number of seconds greater than 180. Developers may see faster failures by setting that environment variable to some lesser number of seconds. In tests, write $PostgreSQL::Test::Utils::timeout_default wherever the convention has been to write 180. This change raises the default for some briefer timeouts. Back-patch to v10 (all supported versions). Discussion: https://postgr.es/m/20220218052842.GA3627003@rfd.leadboat.com	2022-03-04 18:53:17 -08:00
Tom Lane	59392746cb	Fix pg_regress to print the correct postmaster address on Windows. pg_regress reported "Unix socket" as the default location whenever HAVE_UNIX_SOCKETS is defined. However, that's not been accurate on Windows since `8f3ec75de`. Update this logic to match what libpq actually does now. This is just cosmetic, but still it's potentially misleading. Back-patch to v13 where `8f3ec75de` came in. Discussion: https://postgr.es/m/3894060.1646415641@sss.pgh.pa.us	2022-03-04 13:23:58 -05:00
Tom Lane	97031f4404	Fix bogus casting in BlockIdGetBlockNumber(). This macro cast the result to BlockNumber after shifting, not before, which is the wrong thing. Per the C spec, the uint16 fields would promote to int not unsigned int, so that (for 32-bit int) the shift potentially shifts a nonzero bit into the sign position. I doubt there are any production systems where this would actually end with the wrong answer, but it is undefined behavior per the C spec, and clang's -fsanitize=undefined option reputedly warns about it on some platforms. (I can't reproduce that right now, but the code is undeniably wrong per spec.) It's easy to fix by casting to BlockNumber (uint32) in the proper places. It's been wrong for ages, so back-patch to all supported branches. Report and patch by Zhihong Yu (cosmetic tweaking by me) Discussion: https://postgr.es/m/CALNJ-vT9r0DSsAOw9OXVJFxLENoVS_68kJ5x0p44atoYH+H4dg@mail.gmail.com	2022-03-03 19:03:42 -05:00
Tom Lane	1a027e6b7b	Clean up assorted failures under clang's -fsanitize=undefined checks. Most of these are cases where we could call memcpy() or other libc functions with a NULL pointer and a zero count, which is forbidden by POSIX even though every production version of libc allows it. We've fixed such things before in a piecemeal way, but apparently never made an effort to try to get them all. I don't claim that this patch does so either, but it gets every failure I observe in check-world, using clang 12.0.1 on current RHEL8. numeric.c has a different issue that the sanitizer doesn't like: "ln(-1.0)" will compute log10(0) and then try to assign the resulting -Inf to an integer variable. We don't actually use the result in such a case, so there's no live bug. Back-patch to all supported branches, with the idea that we might start running a buildfarm member that tests this case. This includes back-patching `c1132aae3` (Check the size in COPY_POINTER_FIELD), which previously silenced some of these issues in copyfuncs.c. Discussion: https://postgr.es/m/CALNJ-vT9r0DSsAOw9OXVJFxLENoVS_68kJ5x0p44atoYH+H4dg@mail.gmail.com	2022-03-03 18:13:24 -05:00
Tom Lane	6599d8f126	Allow root-owned SSL private keys in libpq, not only the backend. This change makes libpq apply the same private-key-file ownership and permissions checks that we have used in the backend since commit `9a83564c5`. Namely, that the private key can be owned by either the current user or root (with different file permissions allowed in the two cases). This allows system-wide management of key files, which is just as sensible on the client side as the server, particularly when the client is itself some application daemon. Sync the comments about this between libpq and the backend, too. Back-patch of `a59c79564` and `50f03473e` into all supported branches. David Steele Discussion: https://postgr.es/m/f4b7bc55-97ac-9e69-7398-335e212f7743@pgmasters.net	2022-03-02 11:57:02 -05:00
Tom Lane	9b2d762a28	Disallow execution of SPI functions during plperl function compilation. Perl can be convinced to execute user-defined code during compilation of a plperl function (or at least a plperlu function). That's not such a big problem as long as the activity is confined within the Perl interpreter, and it's not clear we could do anything about that anyway. However, if such code tries to use plperl's SPI functions, we have a bigger problem. In the first place, those functions are likely to crash because current_call_data->prodesc isn't set up yet. In the second place, because it isn't set up, we lack critical info such as whether the function is supposed to be read-only. And in the third place, this path allows code execution during function validation, which is strongly discouraged because of the potential for security exploits. Hence, reject execution of the SPI functions until compilation is finished. While here, add check_spi_usage_allowed() calls to various functions that hadn't gotten the memo about checking that. I think that perhaps plperl_sv_to_literal may have been intentionally omitted on the grounds that it was safe at the time; but if so, the addition of transforms functionality changed that. The others are more recently added and seem to be flat-out oversights. Per report from Mark Murawski. Back-patch to all supported branches. Discussion: https://postgr.es/m/9acdf918-7fff-4f40-f750-2ffa84f083d2@intellasoft.net	2022-02-25 17:40:44 -05:00
Andres Freund	0b1020a96d	pg_waldump: Fix error message for WAL files smaller than XLOG_BLCKSZ. When opening a WAL file smaller than XLOG_BLCKSZ (e.g. 0 bytes long) while determining the wal_segment_size, pg_waldump checked errno, despite errno not being set by the short read. Resulting in a bogus error message. Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://postgr.es/m/20220214.181847.775024684568733277.horikyota.ntt@gmail.com Backpatch: 11-, the bug was introducedin `fc49e24fa`	2022-02-25 10:32:38 -08:00
Andres Freund	c2551483e0	Fix temporary object cleanup failing due to toast access without snapshot. When cleaning up temporary objects during process exit the cleanup could fail with: FATAL: cannot fetch toast data without an active snapshot The bug is caused by RemoveTempRelationsCallback() not setting up a snapshot. If an object with toasted catalog data needs to be cleaned up, init_toast_snapshot() could fail with the above error. Most of the time however the the problem is masked due to cached catalog snapshots being returned by GetOldestSnapshot(). But dropping an object can cause catalog invalidations to be emitted. If no further catalog accesses are necessary between the invalidation processing and the next toast datum deletion, the bug becomes visible. It's easy to miss this bug because it typically happens after clients disconnect and the FATAL error just ends up in the log. Luckily temporary table cleanup at the next use of the same temporary schema or during DISCARD ALL does not have the same problem. Fix the bug by pushing a snapshot in RemoveTempRelationsCallback(). Also add isolation tests for temporary object cleanup, including objects with toasted catalog data. A future HEAD only commit will add more assertions. Reported-By: Miles Delahunty Author: Andres Freund Discussion: https://postgr.es/m/CAOFAq3BU5Mf2TTvu8D9n_ZOoFAeQswuzk7yziAb7xuw_qyw5gw@mail.gmail.com Backpatch: 10-	2022-02-21 08:59:30 -08:00
Andrew Dunstan	a5716baac0	Remove most msys special processing in TAP tests Following migration of Windows buildfarm members running TAP tests to use of ucrt64 perl for those tests, special processing for msys perl is no longer necessary and so is removed. Backpatch to release 10 Discussion: https://postgr.es/m/c65a8781-77ac-ea95-d185-6db291e1baeb@dunslane.net	2022-02-20 11:49:33 -05:00
Andrew Dunstan	c27aa21e0d	Remove PostgreSQL::Test::Utils::perl2host completely Commit `f1ac4a74de` disabled this processing, and as nothing has broken (as expected) here we proceed to remove the routine and adjust all the call sites. Backpatch to release 10 Discussion: https://postgr.es/m/0ba775a2-8aa0-0d56-d780-69427cf6f33d@dunslane.net Discussion: https://postgr.es/m/20220125023609.5ohu3nslxgoygihl@alap3.anarazel.de	2022-02-20 10:35:49 -05:00
Tom Lane	46b738ffc2	Suppress warning about stack_base_ptr with late-model GCC. GCC 12 complains that set_stack_base is storing the address of a local variable in a long-lived pointer. This is an entirely reasonable warning (indeed, it just helped us find a bug); but that behavior is intentional here. We can work around it by using __builtin_frame_address(0) instead of a specific local variable; that produces an address a dozen or so bytes different, in my testing, but we don't care about such a small difference. Maybe someday a compiler lacking that function will start to issue a similar warning, but we'll worry about that when it happens. Patch by me, per a suggestion from Andres Freund. Back-patch to v12, which is as far back as the patch will go without some pain. (Recently-established project policy would permit a back-patch as far as 9.2, but I'm disinclined to expend the work until GCC 12 is much more widespread.) Discussion: https://postgr.es/m/3773792.1645141467@sss.pgh.pa.us	2022-02-17 22:45:34 -05:00
Etsuro Fujita	0dc0284ade	Doc: Update documentation for modifying postgres_fdw foreign tables. Document that they can be modified using COPY as well. Back-patch to v11 where commit `3d956d956` added support for COPY in postgres_fdw.	2022-02-16 15:15:04 +09:00
Amit Kapila	caa231be97	WAL log unchanged toasted replica identity key attributes. Currently, during UPDATE, the unchanged replica identity key attributes are not logged separately because they are getting logged as part of the new tuple. But if they are stored externally then the untoasted values are not getting logged as part of the new tuple and logical replication won't be able to replicate such UPDATEs. So we need to log such attributes as part of the old_key_tuple during UPDATE. Reported-by: Haiying Tang Author: Dilip Kumar and Amit Kapila Reviewed-by: Alvaro Herrera, Haiying Tang, Andres Freund Backpatch-through: 10 Discussion: https://postgr.es/m/OS0PR01MB611342D0A92D4F4BF26C0F47FB229@OS0PR01MB6113.jpnprd01.prod.outlook.com	2022-02-14 08:24:44 +05:30
Alexander Korotkov	ac2303aa03	Fix memory leak in IndexScan node with reordering Fix ExecReScanIndexScan() to free the referenced tuples while emptying the priority queue. Backpatch to all supported versions. Discussion: https://postgr.es/m/CAHqSB9gECMENBQmpbv5rvmT3HTaORmMK3Ukg73DsX5H7EJV7jw%40mail.gmail.com Author: Aliaksandr Kalenik Reviewed-by: Tom Lane, Alexander Korotkov Backpatch-through: 10	2022-02-14 03:32:34 +03:00
Tom Lane	51ee561f56	Fix thinko in PQisBusy(). In commit `1f39a1c06` I made PQisBusy consider conn->write_failed, but that is now looking like complete brain fade. In the first place, the logic is quite wrong: it ought to be like "and not" rather than "or". This meant that once we'd gotten into a write_failed state, PQisBusy would always return true, probably causing the calling application to iterate its loop until PQconsumeInput returns a hard failure thanks to connection loss. That's not what we want: the intended behavior is to return an error PGresult, which the application probably has much cleaner support for. But in the second place, checking write_failed here seems like the wrong thing anyway. The idea of the write_failed mechanism is to postpone handling of a write failure until we've read all we can from the server; so that flag should not interfere with input-processing behavior. (Compare 7247e243a.) What we should check for is status = CONNECTION_BAD, ie, socket already closed. (Most places that close the socket don't touch asyncStatus, but they do reset status.) This primarily ensures that if PQisBusy() returns true then there is an open socket, which is assumed by several call sites in our own code, and probably other applications too. While at it, fix a nearby thinko in libpq's my_sock_write: we should only consult errno for res < 0, not res == 0. This is harmless since pqsecure_raw_write would force errno to zero in such a case, but it still could confuse readers. Noted by Andres Freund. Backpatch to v12 where `1f39a1c06` came in. Discussion: https://postgr.es/m/20220211011025.ek7exh6owpzjyudn@alap3.anarazel.de	2022-02-12 13:23:20 -05:00
Tom Lane	0778b24ced	Don't use_physical_tlist for an IOS with non-returnable columns. createplan.c tries to save a runtime projection step by specifying a scan plan node's output as being exactly the table's columns, or index's columns in the case of an index-only scan, if there is not a reason to do otherwise. This logic did not previously pay attention to whether an index's columns are returnable. That worked, sort of accidentally, until commit `9a3ddeb51` taught setrefs.c to reject plans that try to read a non-returnable column. I have no desire to loosen setrefs.c's new check, so instead adjust use_physical_tlist() to not try to optimize this way when there are non-returnable column(s). Per report from Ryan Kelly. Like the previous patch, back-patch to all supported branches. Discussion: https://postgr.es/m/CAHUie24ddN+pDNw7fkhNrjrwAX=fXXfGZZEHhRuofV_N_ftaSg@mail.gmail.com	2022-02-11 15:23:52 -05:00
Tom Lane	d0e1fd9587	Make pg_ctl stop/restart/promote recheck postmaster aliveness. "pg_ctl stop/restart" checked that the postmaster PID is valid just once, as a side-effect of sending the stop signal, and then would wait-till-timeout for the postmaster.pid file to go away. This neglects the case wherein the postmaster dies uncleanly after we signal it. Similarly, once "pg_ctl promote" has sent the signal, it'd wait for the corresponding on-disk state change to occur even if the postmaster dies. I'm not sure how we've managed not to notice this problem, but it seems to explain slow execution of the 017_shm.pl test script on AIX since commit `4fdbf9af5`, which added a speculative "pg_ctl stop" with the idea of making real sure that the postmaster isn't there. In the test steps that kill-9 and then restart the postmaster, it's possible to get past the initial signal attempt before kill() stops working for the doomed postmaster. If that happens, pg_ctl waited till PGCTLTIMEOUT before giving up ... and the buildfarm's AIX members have that set very high. To fix, include a "kill(pid, 0)" test (similar to what postmaster_is_alive uses) in these wait loops, so that we'll give up immediately if the postmaster PID disappears. While here, I chose to refactor those loops out of where they were. do_stop() and do_restart() can perfectly well share one copy of the wait-for-stop loop, and it seems desirable to put a similar function beside that for wait-for-promote. Back-patch to all supported versions, since pg_ctl's wait logic is substantially identical in all, and we're seeing the slow test behavior in all branches. Discussion: https://postgr.es/m/20220210023537.GA3222837@rfd.leadboat.com	2022-02-10 16:49:39 -05:00
Andrew Dunstan	eec7c640fe	Use gendef instead of pexports for building windows .def files Modern msys systems lack pexports but have gendef instead, so use that. Discussion: https://postgr.es/m/3ccde7a9-e4f9-e194-30e0-0936e6ad68ba@dunslane.net Backpatch to release 9.4 to enable building with perl on older branches. Before that pexports is not used for plperl.	2022-02-10 13:51:40 -05:00
Noah Misch	b3558cc96c	Use Test::Builder::todo_start(), replacing $::TODO. Some pre-2017 Test::More versions need perfect $Test::Builder::Level maintenance to find the variable. Buildfarm member snapper reported an overall failure that the file intended to hide via the TODO construct. That trouble was reachable in v11 and v10. For later branches, this serves as defense in depth. Back-patch to v10 (all supported versions). Discussion: https://postgr.es/m/20220202055556.GB2745933@rfd.leadboat.com	2022-02-09 18:17:03 -08:00
Noah Misch	1b8462bf1d	Fix back-patch of "Avoid race in RelationBuildDesc() ..." The back-patch of commit `fdd965d074` broke CLOBBER_CACHE_ALWAYS for v9.6 through v13. It updated the InvalidateSystemCaches() call for CLOBBER_CACHE_RECURSIVELY, neglecting the one for CLOBBER_CACHE_ALWAYS. Back-patch to v13, v12, v11, and v10. Reviewed by Tomas Vondra. Reported by Tomas Vondra. Discussion: https://postgr.es/m/df7b4c0b-7d92-f03f-75c4-9e08b269a716@enterprisedb.com	2022-02-09 18:16:56 -08:00
Tom Lane	5ea3b99de0	Remove ppport.h's broken re-implementation of eval_pv(). Recent versions of Devel::PPPort try to redefine eval_pv() to dodge a bug in pre-5.31 Perl versions. Unfortunately the redefinition fails on compilers that don't support statements nested within expressions. However, we aren't actually interested in this bug fix, since we always call eval_pv() with croak_on_error = FALSE. So, until there's an upstream fix for this breakage, just comment out the macro to revert to the older behavior. Per report from Wei Sun, as well as previous buildfarm failure on pademelon (which I'd unfortunately not looked at carefully enough to understand the cause). Back-patch to all supported versions, since we're using the same ppport.h in all. Discussion: https://postgr.es/m/tencent_2EFCC8BA0107B6EC0F97179E019A8A43C806@qq.com Report: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=pademelon&dt=2022-02-02%2001%3A22%3A58	2022-02-08 19:26:17 -05:00
Tom Lane	d7db957207	Stamp 13.6.	2022-02-07 16:17:41 -05:00
Peter Eisentraut	7f7148b72e	Translation updates Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: 9aa8bc576f9af5c61de4a6fc8119abfa36493d01	2022-02-07 13:36:22 +01:00
Tom Lane	763c7aac57	Release notes for 14.2, 13.6, 12.10, 11.15, 10.20.	2022-02-06 14:24:55 -05:00
Tom Lane	7d74ff8243	Doc: be clearer that foreign-table partitions need user-added constraints. A very well-informed user might deduce this from what we said already, but I'd bet against it. Lay it out explicitly. While here, rewrite the comment about tuple routing to be more intelligible to an average SQL user. Per bug #17395 from Alexander Lakhin. Back-patch to v11. (The text in this area is different in v10 and I'm not sufficiently excited about this point to adapt the patch.) Discussion: https://postgr.es/m/17395-8c326292078d1a57@postgresql.org	2022-02-05 12:55:44 -05:00
Tom Lane	9c7cf083f2	Test, don't just Assert, that mergejoin's inputs are in order. There are two Asserts in nodeMergejoin.c that are reachable if the input data is not in the expected order. This seems way too fragile. Alexander Lakhin reported a case where the assertions could be triggered with misconfigured foreign-table partitions, and bitter experience with unstable operating system collation definitions suggests another easy route to hitting them. Neither Assert is in a place where we can't afford one more test-and-branch, so replace 'em with plain test-and-elog logic. Per bug #17395. While the reported symptom is relatively recent, collation changes could happen anytime, so back-patch to all supported branches. Discussion: https://postgr.es/m/17395-8c326292078d1a57@postgresql.org	2022-02-05 11:59:30 -05:00

1 2 3 4 5 ...

50484 Commits All Branches Search

50484 Commits

All Branches