postgresql

mirror of https://git.postgresql.org/git/postgresql.git synced 2024-07-18 09:41:08 +02:00

Author	SHA1	Message	Date
Peter Eisentraut	8b2638fdd4	Add missing fields to _outConstraint() As of `897795240c`, check constraints can be declared invalid. But that patch didn't update _outConstraint() to also show the relevant struct fields (which were only applicable to foreign keys before that). This currently only affects debugging output, so no impact in practice.	2022-08-13 10:37:53 +02:00
Peter Eisentraut	8e40d16e95	pg_upgrade: Fix some minor code issues `96ef3b8ff1` accidentally copied a not applicable comment from the float8_pass_by_value code to the data_checksums code. Remove that. `87d3b35a1c` changed pg_upgrade to checking the checksum version rather than just the Boolean presence of checksums, but didn't change the field type in its ControlData struct from bool. So this would not work correctly if there ever is a checksum version larger than 1.	2022-08-13 00:15:37 +02:00
Peter Eisentraut	f70dfbf36f	Fix _outConstraint() for "identity" constraints The set of fields printed by _outConstraint() in the CONSTR_IDENTITY case didn't match the set of fields actually used in that case. (The code was probably uncarefully copied from the CONSTR_DEFAULT case.) Fix that by using the right set of fields. Since there is no read support for this node type, this is really just for debugging output right now, so it doesn't affect anything important.	2022-08-12 08:52:38 +02:00
Amit Kapila	5afa63f0ae	Back-Patch "Add wait_for_subscription_sync for TAP tests." This was originally done in commit `0c20dd33db` for 16 only, to eliminate duplicate code and as an infrastructure that makes it easier to write future tests. However, it has been suggested that it would be good to back-patch this testing infrastructure to aid future tests in back-branches. Backpatch to all supported versions. Author: Masahiko Sawada Reviewed by: Amit Kapila, Shi yu Discussion: https://postgr.es/m/CAD21AoC-fvAkaKHa4t1urupwL8xbAcWRePeETvshvy80f6WV1A@mail.gmail.com Discussion: https://postgr.es/m/E1oJBIf-0006sw-SA@gemulon.postgresql.org	2022-08-12 11:04:46 +05:30
Amit Kapila	547b963683	Fix catalog lookup with the wrong snapshot during logical decoding. Previously, we relied on HEAP2_NEW_CID records and XACT_INVALIDATION records to know if the transaction has modified the catalog, and that information is not serialized to snapshot. Therefore, after the restart, if the logical decoding decodes only the commit record of the transaction that has actually modified a catalog, we will miss adding its XID to the snapshot. Thus, we will end up looking at catalogs with the wrong snapshot. To fix this problem, this changes the snapshot builder so that it remembers the last-running-xacts list of the decoded RUNNING_XACTS record after restoring the previously serialized snapshot. Then, we mark the transaction as containing catalog changes if it's in the list of initial running transactions and its commit record has XACT_XINFO_HAS_INVALS. To avoid ABI breakage, we store the array of the initial running transactions in the static variables InitialRunningXacts and NInitialRunningXacts, instead of storing those in SnapBuild or ReorderBuffer. This approach has a false positive; we could end up adding the transaction that didn't change catalog to the snapshot since we cannot distinguish whether the transaction has catalog changes only by checking the COMMIT record. It doesn't have the information on which (sub) transaction has catalog changes, and XACT_XINFO_HAS_INVALS doesn't necessarily indicate that the transaction has catalog change. But that won't be a problem since we use snapshot built during decoding only to read system catalogs. On the master branch, we took a more future-proof approach by writing catalog modifying transactions to the serialized snapshot which avoids the above false positive. But we cannot backpatch it because of a change in the SnapBuild. Reported-by: Mike Oh Author: Masahiko Sawada Reviewed-by: Amit Kapila, Shi yu, Takamichi Osumi, Kyotaro Horiguchi, Bertrand Drouvot, Ahsan Hadi Backpatch-through: 10 Discussion: https://postgr.es/m/81D0D8B0-E7C4-4999-B616-1E5004DBDCD2%40amazon.com	2022-08-11 09:30:55 +05:30
Tom Lane	71caf3c4da	Fix handling of R/W expanded datums that are passed to SQL functions. fmgr_sql must make expanded-datum arguments read-only, because it's possible that the function body will pass the argument to more than one callee function. If one of those functions takes the datum's R/W property as license to scribble on it, then later callees will see an unexpected value, leading to wrong answers. From a performance standpoint, it'd be nice to skip this in the common case that the argument value is passed to only one callee. However, detecting that seems fairly hard, and certainly not something that I care to attempt in a back-patched bug fix. Per report from Adam Mackler. This has been broken since we invented expanded datums, so back-patch to all supported branches. Discussion: https://postgr.es/m/WScDU5qfoZ7PB2gXwNqwGGgDPmWzz08VdydcPFLhOwUKZcdWbblbo-0Lku-qhuEiZoXJ82jpiQU4hOjOcrevYEDeoAvz6nR0IU4IHhXnaCA=@mackler.email Discussion: https://postgr.es/m/187436.1660143060@sss.pgh.pa.us	2022-08-10 13:37:25 -04:00
Tom Lane	22b205cbbd	Stabilize output of new regression test. Per buildfarm, the output order of \dx+ isn't consistent across locales. Apply NO_LOCALE to force C locale. There might be a more localized way, but I'm not seeing it offhand, and anyway there is nothing in this test module that particularly cares about locales. Security: CVE-2022-2625	2022-08-08 12:16:01 -04:00
Tom Lane	7e92f78abe	In extensions, don't replace objects not belonging to the extension. Previously, if an extension script did CREATE OR REPLACE and there was an existing object not belonging to the extension, it would overwrite the object and adopt it into the extension. This is problematic, first because the overwrite is probably unintentional, and second because we didn't change the object's ownership. Thus a hostile user could create an object in advance of an expected CREATE EXTENSION command, and would then have ownership rights on an extension object, which could be modified for trojan-horse-type attacks. Hence, forbid CREATE OR REPLACE of an existing object unless it already belongs to the extension. (Note that we've always forbidden replacing an object that belongs to some other extension; only the behavior for previously-free-standing objects changes here.) For the same reason, also fail CREATE IF NOT EXISTS when there is an existing object that doesn't belong to the extension. Our thanks to Sven Klemm for reporting this problem. Security: CVE-2022-2625	2022-08-08 11:12:31 -04:00
Alvaro Herrera	330c48b284	Translation updates Source-Git-URL: ssh://git@git.postgresql.org/pgtranslation/messages.git Source-Git-Hash: 8ee19d25e0753a690bea62ddcbbfaf2e0d093c1d	2022-08-08 12:39:52 +02:00
Alvaro Herrera	1626590f2e	Remove unportable use of timezone in recent test Per buildfarm member snapper Discussion: https://postgr.es/m/129951.1659812518@sss.pgh.pa.us	2022-08-07 10:19:40 +02:00
Alvaro Herrera	8c5d9ccca9	Improve recently-added test reliability Commit `59be1c942a` already tried to make src/test/recovery/t/033_replay_tsp_drops more reliable, but it wasn't enough. Try to improve on that by making this use of a replication slot to be more like others. Also, don't drop the slot. Make a few other stylistic changes while at it. It's still quite slow, which is another thing that we need to fix in this script. Backpatch to all supported branches. Discussion: https://postgr.es/m/349302.1659191875@sss.pgh.pa.us	2022-08-06 15:52:10 +02:00
Tom Lane	476f9d5330	Partially undo commit `94da73281`. On closer inspection, mcv.c isn't as broken for ScalarArrayOpExpr as I thought. The Var-on-right issue is real enough, but actually it does cope fine with a NULL array constant --- I was misled by an XXX comment suggesting it didn't. Undo that part of the code change, and replace the XXX comment with something less misleading.	2022-08-05 15:57:46 -04:00
Tom Lane	c102d11067	Fix non-bulletproof ScalarArrayOpExpr code for extended statistics. statext_is_compatible_clause_internal() checked that the arguments of a ScalarArrayOpExpr are one Var and one Const, but it would allow cases where the Const was on the left. Subsequent uses of the clause are not expecting that and would suffer assertion failures or core dumps. mcv.c also had not bothered to cope with the case of a NULL array constant, which seems really unacceptably sloppy of somebody. (Although our tools failed us there too, since AFAIK neither Coverity nor any compiler warned of the obvious use-of-uninitialized-variable condition.) It seems best to handle that by having statext_is_compatible_clause_internal() reject it. Noted while fixing bug #17570. Back-patch to v13 where the extended stats code grew some awareness of ScalarArrayOpExpr.	2022-08-05 13:58:49 -04:00
Alvaro Herrera	de31e6f81e	BRIN: mask BRIN_EVACUATE_PAGE for WAL consistency checking That bit is unlogged and therefore it's wrong to consider it in WAL page comparison. Add a test that tickles the case, as branch testing technology allows. This has been a problem ever since wal consistency checking was introduced (commit `a507b86900` for pg10), so backpatch to all supported branches. Author: 王海洋 (Haiyang Wang) <wanghaiyang.001@bytedance.com> Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://postgr.es/m/CACciXAD2UvLMOhc4jX9VvOKt7DtYLr3OYRBhvOZ-jRxtzc_7Jg@mail.gmail.com Discussion: https://postgr.es/m/CACciXADOfErX9Bx0nzE_SkdfXr6Bbpo5R=v_B6MUTEYW4ya+cg@mail.gmail.com	2022-08-05 18:00:17 +02:00
Noah Misch	ad8ebcfe96	Add HINT for restartpoint race with KeepFileRestoredFromArchive(). The five commits ending at `cc2c7d65fc` closed this race condition for v15+. For v14 through v10, add a HINT to discourage studying the cosmetic problem. Reviewed by Kyotaro Horiguchi and David Steele. Discussion: https://postgr.es/m/20220731061747.GA3692882@rfd.leadboat.com	2022-08-05 08:31:01 -07:00
Alvaro Herrera	d2a74621ed	regress: fix test instability Having additional triggers in a test table made the ORDER BY clauses in old queries underspecified. Add another column there for stability. Per sporadic buildfarm pink.	2022-08-05 11:55:52 +02:00
Alvaro Herrera	ab85566301	Fix ENABLE/DISABLE TRIGGER to handle recursion correctly Using ATSimpleRecursion() in ATPrepCmd() to do so as `bbb927b4db` did is not correct, because ATPrepCmd() can't distinguish between triggers that may be cloned and those that may not, so would wrongly try to recurse for the latter category of triggers. So this commit restores the code in EnableDisableTrigger() that `86f575948c` had added to do the recursion, which would do it only for triggers that may be cloned, that is, row-level triggers. This also changes tablecmds.c such that ATExecCmd() is able to pass the value of ONLY flag down to EnableDisableTrigger() using its new 'recurse' parameter. This also fixes what seems like an oversight of `86f575948c` that the recursion to partition triggers would only occur if EnableDisableTrigger() had actually changed the trigger. It is more apt to recurse to inspect partition triggers even if the parent's trigger didn't need to be changed: only then can we be certain that all descendants share the same state afterwards. Backpatch all the way back to 11, like `bbb927b4db`. Care is taken not to break ABI compatibility (and that no catversion bump is needed.) Co-authored-by: Amit Langote <amitlangote09@gmail.com> Reviewed-by: Dmitry Koval <d.koval@postgrespro.ru> Discussion: https://postgr.es/m/CA+HiwqG-cZT3XzGAnEgZQLoQbyfJApVwOTQaCaas1mhpf+4V5A@mail.gmail.com	2022-08-05 09:47:06 +02:00
Tom Lane	23edf0e8b4	Add CHECK_FOR_INTERRUPTS in ExecInsert's speculative insertion loop. Ordinarily the functions called in this loop ought to have plenty of CFIs themselves; but we've now seen a case where no such CFI is reached, making the loop uninterruptible. Even though that's from a recently-introduced bug, it seems prudent to install a CFI at the loop level in all branches. Per discussion of bug #17558 from Andrew Kesper (an actual fix for that bug will follow). Discussion: https://postgr.es/m/17558-3f6599ffcf52fd4a@postgresql.org	2022-08-04 14:10:06 -04:00
Tom Lane	8d38ccafca	Add proper regression test for the recent SRFs-in-pathkeys problem. Remove the test case added by commit `fac1b470a`, which never actually worked to expose the problem it claimed to test. Replace it with a case that does expose the problem, and also covers the SRF-not- at-the-top deficiency repaired in `1aa8dad41`. Richard Guo, with some editorialization by me Discussion: https://postgr.es/m/17564-c7472c2f90ef2da3@postgresql.org	2022-08-04 11:11:22 -04:00
Tom Lane	da4ed75881	Fix incorrect tests for SRFs in relation_can_be_sorted_early(). Commit `fac1b470a` thought we could check for set-returning functions by testing only the top-level node in an expression tree. This is wrong in itself, and to make matters worse it encouraged others to make the same mistake, by exporting tlist.c's special-purpose IS_SRF_CALL() as a widely-visible macro. I can't find any evidence that anyone's taken the bait, but it was only a matter of time. Use expression_returns_set() instead, and stuff the IS_SRF_CALL() genie back in its bottle, this time with a warning label. I also added a couple of cross-reference comments. After a fair amount of fooling around, I've despaired of making a robust test case that exposes the bug reliably, so no test case here. (Note that the test case added by `fac1b470a` is itself broken, in that it doesn't notice if you remove the code change. The repro given by the bug submitter currently doesn't fail either in v15 or HEAD, though I suspect that may indicate an unrelated bug.) Per bug #17564 from Martijn van Oosterhout. Back-patch to v13, as the faulty patch was. Discussion: https://postgr.es/m/17564-c7472c2f90ef2da3@postgresql.org	2022-08-03 17:33:42 -04:00
Tom Lane	b2694aebe3	Reduce test runtime of src/test/modules/snapshot_too_old. The sto_using_cursor and sto_using_select tests were coded to exercise every permutation of their test steps, but AFAICS there is no value in exercising more than one. This matters because each permutation costs about six seconds, thanks to the "pg_sleep(6)". Perhaps we could reduce that, but the useless permutations seem worth getting rid of in any case. (Note that sto_using_hash_index got it right already.) While here, clean up some other sloppiness such as an unused table. This doesn't make too much difference in interactive testing, since the wasted time is typically masked by parallelization with other tests. However, the buildfarm runs this as a serial step, which means we can expect to shave ~40 seconds from every buildfarm run. That makes it worth back-patching. Discussion: https://postgr.es/m/2515192.1659454702@sss.pgh.pa.us	2022-08-03 11:14:55 -04:00
Tom Lane	331f8b851c	Check maximum number of columns in function RTEs, too. I thought commit `fd96d14d9` had plugged all the holes of this sort, but no, function RTEs could produce oversize tuples too, either via long coldeflists or just from multiple functions in one RTE. (I'm pretty sure the other variants of base RTEs aren't a problem, because they ultimately refer to either a table or a sub-SELECT, whose widths are enforced elsewhere. But we explicitly allow join RTEs to be overwidth, as long as you don't try to form their tuple result.) Per further discussion of bug #17561. As before, patch all branches. Discussion: https://postgr.es/m/17561-80350151b9ad2ad4@postgresql.org	2022-08-01 12:22:35 -04:00
Michael Paquier	aadaaeff4c	Fix error reporting after ioctl() call with pg_upgrade --clone errno was not reported correctly after attempting to clone a file, leading to incorrect error reports. While scanning through the code, I have not noticed any similar mistakes. Error introduced in `3a769d8`. Author: Justin Pryzby Discussion: https://postgr.es/m/20220731134135.GY15006@telsasoft.com Backpatch-through: 12	2022-08-01 16:39:30 +09:00
Andrew Dunstan	b76e136ceb	Fix new recovery test for log_error_verbosity=verbose case The new test is from commit `9e4f914b5e`. With this setting messages have SQL error numbers included, so that needs to be provided for in the pattern looked for. Backpatch to all live branches like the original.	2022-07-29 18:17:36 -04:00
Tom Lane	ba2002d02c	In transformRowExpr(), check for too many columns in the row. A RowExpr with more than MaxTupleAttributeNumber columns would fail at execution anyway, since we cannot form a tuple datum with more than that many columns. While heap_form_tuple() has a check for too many columns, it emerges that there are some intermediate bits of code that don't check and can be driven to failure with sufficiently many columns. Checking this at parse time seems like the most appropriate place to install a defense, since we already check SELECT list length there. While at it, make the SELECT-list-length error use the same errcode (TOO_MANY_COLUMNS) as heap_form_tuple does, rather than the generic PROGRAM_LIMIT_EXCEEDED. Per bug #17561 from Egor Chindyaskin. The given test case crashes in all supported branches (and probably a lot further back), so patch all. Discussion: https://postgr.es/m/17561-80350151b9ad2ad4@postgresql.org	2022-07-29 13:30:50 -04:00
Alvaro Herrera	7cfe688dee	Fix test instability On FreeBSD, the new test fails due to a WAL file being removed before the standby has had the chance to copy it. Fix by adding a replication slot to prevent the removal until after the standby has connected. Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Reported-by: Matthias van de Meent <boekewurm+postgres@gmail.com> Discussion: https://postgr.es/m/CAEze2Wj5nau_qpjbwihvmXLfkAWOZ5TKdbnqOc6nKSiRJEoPyQ@mail.gmail.com	2022-07-29 12:50:47 +02:00
Alvaro Herrera	9a7e26b9c2	Fix replay of create database records on standby Crash recovery on standby may encounter missing directories when replaying database-creation WAL records. Prior to this patch, the standby would fail to recover in such a case; however, the directories could be legitimately missing. Consider the following sequence of commands: CREATE DATABASE DROP DATABASE DROP TABLESPACE If, after replaying the last WAL record and removing the tablespace directory, the standby crashes and has to replay the create database record again, crash recovery must be able to continue. A fix for this problem was already attempted in `49d9cfc68b`, but it was reverted because of design issues. This new version is based on Robert Haas' proposal: any missing tablespaces are created during recovery before reaching consistency. Tablespaces are created as real directories, and should be deleted by later replay. CheckRecoveryConsistency ensures they have disappeared. The problems detected by this new code are reported as PANIC, except when allow_in_place_tablespaces is set to ON, in which case they are WARNING. Apart from making tests possible, this gives users an escape hatch in case things don't go as planned. Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Author: Asim R Praveen <apraveen@pivotal.io> Author: Paul Guo <paulguo@gmail.com> Reviewed-by: Anastasia Lubennikova <lubennikovaav@gmail.com> (older versions) Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com> (older versions) Reviewed-by: Michaël Paquier <michael@paquier.xyz> Diagnosed-by: Paul Guo <paulguo@gmail.com> Discussion: https://postgr.es/m/CAEET0ZGx9AvioViLf7nbR_8tH9-=27DN5xWJ2P9-ROH16e4JUA@mail.gmail.com	2022-07-28 08:26:05 +02:00
Alvaro Herrera	16e7a8fd8e	Allow "in place" tablespaces. This is a backpatch to branches 10-14 of the following commits: `7170f2159f` Allow "in place" tablespaces. `c6f2f01611` Fix pg_basebackup with in-place tablespaces. `f6f0db4d62` Fix pg_tablespace_location() with in-place tablespaces `7a7cd84893` doc: Remove mention to in-place tablespaces for pg_tablespace_location() `5344723755` Remove unnecessary Windows-specific basebackup code. In-place tablespaces were introduced as a testing helper mechanism, but they are going to be used for a bugfix in WAL replay to be backpatched to all stable branches. I (Álvaro) had to adjust some code to account for lack of get_dirent_type() in branches prior to 14. Author: Thomas Munro <thomas.munro@gmail.com> Author: Michaël Paquier <michael@paquier.xyz> Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://postgr.es/m/20220722081858.omhn2in5zt3g4nek@alvherre.pgsql	2022-07-27 07:55:13 +02:00
Tom Lane	6c193c2ace	Force immediate commit after CREATE DATABASE etc in extended protocol. We have a few commands that "can't run in a transaction block", meaning that if they complete their processing but then we fail to COMMIT, we'll be left with inconsistent on-disk state. However, the existing defenses for this are only watertight for simple query protocol. In extended protocol, we didn't commit until receiving a Sync message. Since the client is allowed to issue another command instead of Sync, we're in trouble if that command fails or is an explicit ROLLBACK. In any case, sitting in an inconsistent state while waiting for a client message that might not come seems pretty risky. This case wasn't reachable via libpq before we introduced pipeline mode, but it's always been an intended aspect of extended query protocol, and likely there are other clients that could reach it before. To fix, set a flag in PreventInTransactionBlock that tells exec_execute_message to force an immediate commit. This seems to be the approach that does least damage to existing working cases while still preventing the undesirable outcomes. While here, add some documentation to protocol.sgml that explicitly says how to use pipelining. That's latent in the existing docs if you know what to look for, but it's better to spell it out; and it provides a place to document this new behavior. Per bug #17434 from Yugo Nagata. It's been wrong for ages, so back-patch to all supported branches. Discussion: https://postgr.es/m/17434-d9f7a064ce2a88a3@postgresql.org	2022-07-26 13:07:03 -04:00
Tom Lane	5b5d435139	Fix ruleutils issues with dropped cols in functions-returning-composite. Due to lack of concern for the case in the dependency code, it's possible to drop a column of a composite type even though stored queries have references to the dropped column via functions-in-FROM that return the composite type. There are "soft" references, namely FROM-clause aliases for such columns, and "hard" references, that is actual Vars referring to them. The right fix for hard references is to add dependencies preventing the drop; something we've known for many years and not done (and this commit still doesn't address it). A "soft" reference shouldn't prevent a drop though. We've been around on this before (cf. `9b35ddce9`, `2c4debbd0`), but nobody had noticed that the current behavior can result in dump/reload failures, because ruleutils.c can print more column aliases than the underlying composite type now has. So we need to rejigger the column-alias-handling code to treat such columns as dropped and not print aliases for them. Rather than writing new code for this, I used expandRTE() which already knows how to figure out which function result columns are dropped. I'd initially thought maybe we could use expandRTE() in all cases, but that fails for EXPLAIN's purposes, because the planner strips a lot of RTE infrastructure that expandRTE() needs. So this patch just uses it for unplanned function RTEs and otherwise does things the old way. If there is a hard reference (Var), then removing the column alias causes us to fail to print the Var, since there's no longer a name to print. Failing seems less desirable than printing a made-up name, so I made it print "?dropped?column?" instead. Per report from Timo Stolz. Back-patch to all supported branches. Discussion: https://postgr.es/m/5c91267e-3b6d-5795-189c-d15a55d61dbb@nullachtvierzehn.de	2022-07-21 13:56:02 -04:00
Fujii Masao	162ade612f	Fix assertion failure and segmentation fault in backup code. When a non-exclusive backup is canceled, do_pg_abort_backup() is called and resets some variables set by pg_backup_start (pg_start_backup in v14 or before). But previously it forgot to reset the session state indicating whether a non-exclusive backup is in progress or not in this session. This issue could cause an assertion failure when the session running BASE_BACKUP is terminated after it executed pg_backup_start and pg_backup_stop (pg_stop_backup in v14 or before). Also it could cause a segmentation fault when pg_backup_stop is called after BASE_BACKUP in the same session is canceled. This commit fixes the issue by making do_pg_abort_backup reset that session state. Back-patch to all supported branches. Author: Fujii Masao Reviewed-by: Kyotaro Horiguchi, Masahiko Sawada, Michael Paquier, Robert Haas Discussion: https://postgr.es/m/3374718f-9fbf-a950-6d66-d973e027f44c@oss.nttdata.com	2022-07-20 09:53:53 +09:00
Fujii Masao	5630f39b31	Prevent BASE_BACKUP in the middle of another backup in the same session. Multiple non-exclusive backups are able to be run conrrently in different sessions. But, in the same session, only one non-exclusive backup can be run at the same moment. If pg_backup_start (pg_start_backup in v14 or before) is called in the middle of another non-exclusive backup in the same session, an error is thrown. However, previously, in logical replication walsender mode, even if that walsender session had already called pg_backup_start and started a non-exclusive backup, it could execute BASE_BACKUP command and start another non-exclusive backup. Which caused subsequent pg_backup_stop to throw an error because BASE_BACKUP unexpectedly reset the session state marked by pg_backup_start. This commit prevents BASE_BACKUP command in the middle of another non-exclusive backup in the same session. Back-patch to all supported branches. Author: Fujii Masao Reviewed-by: Kyotaro Horiguchi, Masahiko Sawada, Michael Paquier, Robert Haas Discussion: https://postgr.es/m/3374718f-9fbf-a950-6d66-d973e027f44c@oss.nttdata.com	2022-07-20 09:52:23 +09:00
Peter Eisentraut	b2c8d56618	Re-add SPICleanup for ABI compatibility in stable branch This fixes an ABI break introduced by `cfc86f9873`. Author: Markus Wanner <markus.wanner@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/defd749a-8410-841d-1126-21398686d63d@enterprisedb.com	2022-07-18 19:20:07 +02:00
Tom Lane	36ccca3dba	Fix omissions in support for the "regcollation" type. The patch that added regcollation doesn't seem to have been too thorough about supporting it everywhere that other reg* types are supported. Fix that. (The find_expr_references omission is moderately serious, since it could result in missing expression dependencies. The others are less exciting.) Noted while fixing bug #17483. Back-patch to v13 where regcollation was added. Discussion: https://postgr.es/m/1423433.1652722406@sss.pgh.pa.us	2022-07-17 17:43:28 -04:00
Thomas Munro	c75b6b454e	Make dsm_impl_posix_resize more future-proof. Commit `4518c798` blocks signals for a short region of code, but it assumed that whatever called it had the signal mask set to UnBlockSig on entry. That may be true today (or may even not be, in extensions in the wild), but it would be better not to make that assumption. We should save-and-restore the caller's signal mask. The PG_SETMASK() portability macro couldn't be used for that, which is why it wasn't done before. But... considering that commit `a65e0864` established back in 9.6 that supported POSIX systems have sigprocmask(), and that this is POSIX-only code, there is no reason not to use standard sigprocmask() directly to achieve that. Back-patch to all supported releases, like `4518c798` and `80845b7c`. Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CA%2BhUKGKx6Biq7_UuV0kn9DW%2B8QWcpJC1qwhizdtD9tN-fn0H0g%40mail.gmail.com	2022-07-16 12:23:24 +12:00
Thomas Munro	17aa39da50	Don't clobber postmaster sigmask in dsm_impl_resize. Commit `4518c798` intended to block signals in regular backends that allocate DSM segments, but dsm_impl_resize() is also reached by dsm_postmaster_startup(). It's not OK to clobber the postmaster's signal mask, so only manipulate the signal mask when under the postmaster. Back-patch to all releases, like `4518c798`. Discussion: https://postgr.es/m/CA%2BhUKGKNpK%3D2OMeea_AZwpLg7Bm4%3DgYWk7eDjZ5F6YbozfOf8w%40mail.gmail.com	2022-07-15 02:04:53 +12:00
Thomas Munro	e73fe6e828	Block signals while allocating DSM memory. On Linux, we call posix_fallocate() on shm_open()'d memory to avoid later potential SIGBUS (see commit `899bd785`). Based on field reports of systems stuck in an EINTR retry loop there, there, we made it possible to break out of that loop via slightly odd coding where the CHECK_FOR_INTERRUPTS() call was somewhat removed from the loop (see commit `422952ee`). On further reflection, that was not a great choice for at least two reasons: 1. If interrupts were held, the CHECK_FOR_INTERRUPTS() would do nothing and the EINTR error would be surfaced to the user. 2. If EINTR was reported but neither QueryCancelPending nor ProcDiePending was set, then we'd dutifully retry, but with a bit more understanding of how posix_fallocate() works, it's now clear that you can get into a loop that never terminates. posix_fallocate() is not a function that can do some of the job and tell you about progress if it's interrupted, it has to undo what it's done so far and report EINTR, and if signals keep arriving faster than it can complete (cf recovery conflict signals), you're stuck. Therefore, for now, we'll simply block most signals to guarantee progress. SIGQUIT is not blocked (see InitPostmasterChild()), because its expected handler doesn't return, and unblockable signals like SIGCONT are not expected to arrive at a high rate. For good measure, we'll include the ftruncate() call in the blocked region, and add a retry loop. Back-patch to all supported releases. Reported-by: Alvaro Herrera <alvherre@alvh.no-ip.org> Reported-by: Nicola Contu <nicola.contu@gmail.com> Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20220701154105.jjfutmngoedgiad3%40alvherre.pgsql	2022-07-14 14:32:48 +12:00
Thomas Munro	7cdd0c2d7c	Fix lock assertions in dshash.c. dshash.c previously maintained flags to be able to assert that you didn't hold any partition lock. These flags could get out of sync with reality in error scenarios. Get rid of all that, and make assertions about the locks themselves instead. Since LWLockHeldByMe() loops internally, we don't want to put that inside another loop over all partition locks. Introduce a new debugging-only interface LWLockAnyHeldByMe() to avoid that. This problem was noted by Tom and Andres while reviewing changes to support the new shared memory stats system, and later showed up in reality while working on commit `389869af`. Back-patch to 11, where dshash.c arrived. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Reported-by: Andres Freund <andres@anarazel.de> Reviewed-by: Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> Reviewed-by: Zhihong Yu <zyu@yugabyte.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20220311012712.botrpsikaufzteyt@alap3.anarazel.de Discussion: https://postgr.es/m/CA%2BhUKGJ31Wce6HJ7xnVTKWjFUWQZPBngxfJVx4q0E98pDr3kAw%40mail.gmail.com	2022-07-11 15:48:54 +12:00
Thomas Munro	e5b5b4448c	Fix \watch's interaction with libedit on ^C. When you hit ^C, the terminal driver in Unix-like systems echoes "^C" as well as sending an interrupt signal (depending on stty settings). At least libedit (but maybe also libreadline) is then confused about the current cursor location, and corrupts the display if you try to scroll back. Fix, by moving to a new line before the next prompt is displayed. Back-patch to all supported released. Author: Pavel Stehule <pavel.stehule@gmail.com> Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/3278793.1626198638%40sss.pgh.pa.us	2022-07-10 16:53:39 +12:00
Dean Rasheed	f890223bc3	Fix alias matching in transformLockingClause(). When locking a specific named relation for a FOR [KEY] UPDATE/SHARE clause, transformLockingClause() finds the relation to lock by scanning the rangetable for an RTE with a matching eref->aliasname. However, it failed to account for the visibility rules of a join RTE. If a join RTE doesn't have a user-supplied alias, it will have a generated eref->aliasname of "unnamed_join" that is not visible as a relation name in the parse namespace. Such an RTE needs to be skipped, otherwise it might be found in preference to a regular base relation with a user-supplied alias of "unnamed_join", preventing it from being locked. In addition, if a join RTE doesn't have a user-supplied alias, but does have a join_using_alias, then the RTE needs to be matched using that alias rather than the generated eref->aliasname, otherwise a misleading "relation not found" error will be reported rather than a "join cannot be locked" error. Backpatch all the way, except for the second part which only goes back to 14, where JOIN USING aliases were added. Dean Rasheed, reviewed by Tom Lane. Discussion: https://postgr.es/m/CAEZATCUY_KOBnqxbTSPf=7fz9HWPnZ5Xgb9SwYzZ8rFXe7nb=w@mail.gmail.com	2022-07-07 13:08:00 +01:00
Andrew Dunstan	03cefe8148	Remove %error-verbose directive from jsonpath parser None of the other bison parsers contains this directive, and it gives rise to some unfortunate and impenetrable messages, so just remove it. Backpatch to release 12, where it was introduced. Per gripe from Erik Rijkers Discussion: https://postgr.es/m/ba069ce2-a98f-dc70-dc17-2ccf2a9bf7c7@xs4all.nl	2022-07-03 17:16:58 -04:00
Noah Misch	97b005f3fb	Fix previous commit's ecpg_clocale for ppc Darwin. Per buildfarm member prairiedog, this platform rejects uninitialized global variables in shared libraries. Back-patch to v10, like the addition of the variable. Reviewed by Tom Lane. Discussion: https://postgr.es/m/20220703030619.GB2378460@rfd.leadboat.com	2022-07-02 21:03:23 -07:00
Noah Misch	b4d7e92bd5	ecpglib: call newlocale() once per process. ecpglib has been calling it once per SQL query and once per EXEC SQL GET DESCRIPTOR. Instead, if newlocale() has not succeeded before, call it while establishing a connection. This mitigates three problems: - If newlocale() failed in EXEC SQL GET DESCRIPTOR, the command silently proceeded without the intended locale change. - On AIX, each newlocale()+freelocale() cycle leaked memory. - newlocale() CPU usage may have been nontrivial. Fail the connection attempt if newlocale() fails. Rearrange ecpg_do_prologue() to validate the connection before its uselocale(). The sort of program that may regress is one running in an environment where newlocale() fails. If that program establishes connections without running SQL statements, it will stop working in response to this change. I'm betting against the importance of such an ECPG use case. Most SQL execution (any using ECPGdo()) has long required newlocale() success, so there's little a connection could do without newlocale(). Back-patch to v10 (all supported versions). Reviewed by Tom Lane. Reported by Guillaume Lelarge. Discussion: https://postgr.es/m/20220101074055.GA54621@rfd.leadboat.com	2022-07-02 13:00:34 -07:00
Thomas Munro	b436047dc6	Harden dsm_impl.c against unexpected EEXIST. Previously, we trusted the OS not to report EEXIST unless we'd passed in IPC_CREAT \| IPC_EXCL or O_CREAT \| O_EXCL, as appropriate. Solaris's shm_open() can in fact do that, causing us to crash because we didn't ereport and then we blithely assumed the mapping was successful. Let's treat EEXIST just like any other error, unless we're actually trying to create a new segment. This applies to shm_open(), where this behavior has been seen, and also to the equivalent operations for our sysv and mmap modes just on principle. Based on the underlying reason for the error, namely contention on a lock file managed by Solaris librt for each distinct name, this problem is only likely to happen on 15 and later, because the new shared memory stats system produces shm_open() calls for the same path from potentially large numbers of backends concurrently during authentication. Earlier releases only shared memory segments between a small number of parallel workers under one Gather node. You could probably hit it if you tried hard enough though, and we should have been more defensive in the first place. Therefore, back-patch to all supported releases. Per build farm animal margay. This isn't the end of the story, though, it just changes random crashes into random "File exists" errors; more work needed for a green build farm. Reviewed-by: Robert Haas <robertmhaas@gmail.com> Discussion: https://postgr.es/m/CA%2BhUKGKqKrCV5xKWfh9rnm%3Do%3DDwZLTLtnsj_XpUi9g5%3DV%2B9oyg%40mail.gmail.com	2022-07-01 14:03:48 +12:00
Heikki Linnakangas	7ba325fd7f	Fix visibility check when XID is committed in CLOG but not in procarray. TransactionIdIsInProgress had a fast path to return 'false' if the single-item CLOG cache said that the transaction was known to be committed. However, that was wrong, because a transaction is first marked as committed in the CLOG but doesn't become visible to others until it has removed its XID from the proc array. That could lead to an error: ERROR: t_xmin is uncommitted in tuple to be updated or for an UPDATE to go ahead without blocking, before the previous UPDATE on the same row was made visible. The window is usually very short, but synchronous replication makes it much wider, because the wait for synchronous replica happens in that window. Another thing that makes it hard to hit is that it's hard to get such a commit-in-progress transaction into the single item CLOG cache. Normally, if you call TransactionIdIsInProgress on such a transaction, it determines that the XID is in progress without checking the CLOG and without populating the cache. One way to prime the cache is to explicitly call pg_xact_status() on the XID. Another way is to use a lot of subtransactions, so that the subxid cache in the proc array is overflown, making TransactionIdIsInProgress rely on pg_subtrans and CLOG checks. This has been broken ever since it was introduced in 2008, but the race condition is very hard to hit, especially without synchronous replication. There were a couple of reports of the error starting from summer 2021, but no one was able to find the root cause then. TransactionIdIsKnownCompleted() is now unused. In 'master', remove it, but I left it in place in backbranches in case it's used by extensions. Also change pg_xact_status() to check TransactionIdIsInProgress(). Previously, it only checked the CLOG, and returned "committed" before the transaction was actually made visible to other queries. Note that this also means that you cannot use pg_xact_status() to reproduce the bug anymore, even if the code wasn't fixed. Report and analysis by Konstantin Knizhnik. Patch by Simon Riggs, with the pg_xact_status() change added by me. Author: Simon Riggs Reviewed-by: Andres Freund Discussion: https://www.postgresql.org/message-id/flat/4da7913d-398c-e2ad-d777-f752cf7f0bbb%40garret.ru	2022-06-27 08:24:35 +03:00
Noah Misch	aa1845cdd6	Fix PostgreSQL::Test aliasing for Perl v5.10.1. This Perl segfaults if a declaration of the to-be-aliased package precedes the aliasing itself. Per buildfarm members lapwing and wrasse. Like commit `20911775de`, back-patch to v10 (all supported versions). Discussion: https://postgr.es/m/20220625171533.GA2012493@rfd.leadboat.com	2022-06-25 14:15:59 -07:00
Noah Misch	8782ce49e4	CREATE INDEX: use the original userid for more ACL checks. Commit `a117cebd63` used the original userid for ACL checks located directly in DefineIndex(), but it still adopted the table owner userid for more ACL checks than intended. That broke dump/reload of indexes that refer to an operator class, collation, or exclusion operator in a schema other than "public" or "pg_catalog". Back-patch to v10 (all supported versions), like the earlier commit. Nathan Bossart and Noah Misch Discussion: https://postgr.es/m/f8a4105f076544c180a87ef0c4822352@stmuk.bayern.de	2022-06-25 09:07:45 -07:00
Noah Misch	e8f037a2df	For PostgreSQL::Test compatibility, alias entire package symbol tables. Remove the need to edit back-branch-specific code sites when back-patching the addition of a PostgreSQL::Test::Utils symbol. Replace per-symbol, incomplete alias lists. Give old and new package names the same EXPORT and EXPORT_OK semantics. Back-patch to v10 (all supported versions). Reviewed by Andrew Dunstan. Discussion: https://postgr.es/m/20220622072144.GD4167527@rfd.leadboat.com	2022-06-25 09:07:45 -07:00
Amit Kapila	3a6ef0cdf3	Fix memory leak due to LogicalRepRelMapEntry.attrmap. When rebuilding the relation mapping on subscribers, we were not releasing the attribute mapping's memory which was no longer required. The attribute mapping used in logical tuple conversion was refactored in PG13 (by commit `e1551f96e6`) but we forgot to update the related code that frees the attribute map. Author: Hou Zhijie Reviewed-by: Amit Langote, Amit Kapila, Shi yu Backpatch-through: 10, where it was introduced Discussion: https://postgr.es/m/OSZPR01MB6310F46CD425A967E4AEF736FDA49@OSZPR01MB6310.jpnprd01.prod.outlook.com	2022-06-23 09:02:16 +05:30
Tom Lane	cfc86f9873	Fix SPI's handling of errors during transaction commit. SPI_commit previously left it up to the caller to recover from any error occurring during commit. Since that's complicated and requires use of low-level xact.c facilities, it's not too surprising that no caller got it right. Let's move the responsibility for cleanup into spi.c. Doing that requires redefining SPI_commit as starting a new transaction, so that it becomes equivalent to SPI_commit_and_chain except that you get default transaction characteristics instead of preserving the prior transaction's characteristics. We can make this pretty transparent API-wise by redefining SPI_start_transaction() as a no-op. Callers that expect to do something in between might be surprised, but available evidence is that no callers do so. Having made that API redefinition, we can fix this mess by having SPI_commit[_and_chain] trap errors and start a new, clean transaction before re-throwing the error. Likewise for SPI_rollback[_and_chain]. Some cleanup is also needed in AtEOXact_SPI, which was nowhere near smart enough to deal with SPI contexts nested inside a committing context. While plperl and pltcl need no changes beyond removing their now-useless SPI_start_transaction() calls, plpython needs some more work because it hadn't gotten the memo about catching commit/rollback errors in the first place. Such an error resulted in longjmp'ing out of the Python interpreter, which leaks Python stack entries at present and is reported to crash Python 3.11 altogether. Add the missing logic to catch such errors and convert them into Python exceptions. This is a back-patch of commit `2e517818f`. That's now aged long enough to reduce the concerns about whether it will break something, and we do need to ensure that supported branches will work with Python 3.11. Peter Eisentraut and Tom Lane Discussion: https://postgr.es/m/3375ffd8-d71c-2565-e348-a597d6e739e3@enterprisedb.com Discussion: https://postgr.es/m/17416-ed8fe5d7213d6c25@postgresql.org	2022-06-22 12:12:00 -04:00
Amit Kapila	419c727151	Fix stale values in partition map entries on subscribers. We build the partition map entries on subscribers while applying the changes for update/delete on partitions. The component relation in each entry is closed after its use so we need to update it on successive use of cache entries. This problem was there since the original commit `f1ac27bfda` that introduced this code but we didn't notice it till the recent commit `26b3455afa` started to use the component relation of partition map cache entry. Reported-by: Tom Lane, as per buildfarm Author: Amit Langote, Hou Zhijie Reviewed-by: Amit Kapila, Shi Yu Backpatch-through: 13, where it was introduced Discussion: https://postgr.es/m/OSZPR01MB6310F46CD425A967E4AEF736FDA49@OSZPR01MB6310.jpnprd01.prod.outlook.com	2022-06-21 15:12:52 +05:30
Amit Kapila	5f113d60e9	Fix partition table's REPLICA IDENTITY checking on the subscriber. In logical replication, we will check if the target table on the subscriber is updatable by comparing the replica identity of the table on the publisher with the table on the subscriber. When the target table is a partitioned table, we only check its replica identity but not for the partition tables. This leads to assertion failure while applying changes for update/delete as we expect those to succeed only when the corresponding partition table has a primary key or has a replica identity defined. Fix it by checking the replica identity of the partition table while applying changes. Reported-by: Shi Yu Author: Shi Yu, Hou Zhijie Reviewed-by: Amit Langote, Amit Kapila Backpatch-through: 13, where it was introduced Discussion: https://postgr.es/m/OSZPR01MB6310F46CD425A967E4AEF736FDA49@OSZPR01MB6310.jpnprd01.prod.outlook.com	2022-06-21 08:05:31 +05:30
Amit Kapila	1f9a7738eb	Fix data inconsistency between publisher and subscriber. We were not updating the partition map cache in the subscriber even when the corresponding remote rel is changed. Due to this data was getting incorrectly replicated for partition tables after the publisher has changed the table schema. Fix it by resetting the required entries in the partition map cache after receiving a new relation mapping from the publisher. Reported-by: Shi Yu Author: Shi Yu, Hou Zhijie Reviewed-by: Amit Langote, Amit Kapila Backpatch-through: 13, where it was introduced Discussion: https://postgr.es/m/OSZPR01MB6310F46CD425A967E4AEF736FDA49@OSZPR01MB6310.jpnprd01.prod.outlook.com	2022-06-16 08:24:22 +05:30
Amit Kapila	16f5a8da76	Fix cache look-up failures while applying changes in logical replication. While building a new attrmap which maps partition attribute numbers to remoterel's, we incorrectly update the map for dropped column attributes. Later, it caused cache look-up failure when we tried to use the map to fetch the information about attributes. This also fixes the partition map cache invalidation which was using the wrong type cast to fetch the entry. We were using stale partition map entry after invalidation which leads to the assertion or cache look-up failure. Reported-by: Shi Yu Author: Hou Zhijie, Shi Yu Reviewed-by: Amit Langote, Amit Kapila Backpatch-through: 13, where it was introduced Discussion: https://postgr.es/m/OSZPR01MB6310F46CD425A967E4AEF736FDA49@OSZPR01MB6310.jpnprd01.prod.outlook.com	2022-06-15 10:16:35 +05:30
Tom Lane	12b8fb34a9	Avoid ecpglib core dump with out-of-order operations. If an application executed operations like EXEC SQL PREPARE without having first established a database connection, it could get a core dump instead of the expected clean failure. This occurred because we did "pthread_getspecific(actual_connection_key)" without ever having initialized the TSD key actual_connection_key. The results of that are probably platform-specific, but at least on Linux it often leads to a crash. To fix, add calls to ecpg_pthreads_init() in the code paths that might use actual_connection_key uninitialized. It's harmless (and hopefully inexpensive) to do that more than once. Per bug #17514 from Okano Naoki. The problem's ancient, so back-patch to all supported branches. Discussion: https://postgr.es/m/17514-edd4fad547c5692c@postgresql.org	2022-06-14 18:16:46 -04:00
Tom Lane	3f7f067385	Revert "Fix psql's single transaction mode on client-side errors with -c/-f switches". This reverts commits `a04ccf6df` et al. in the back branches only. There was some disagreement already over whether to back-patch `157f8739a`, on the grounds that it is the sort of behavioral change that we don't like to back-patch. Furthermore, it now looks like the logic needs some more work, which we don't have time for before the upcoming 14.4 release. Revert for now, and perhaps reconsider later. Discussion: https://postgr.es/m/17504-76b68018e130415e@postgresql.org	2022-06-10 16:34:25 -04:00
Tom Lane	254cd7f31f	Un-break whole-row Vars referencing domain-over-composite types. In commit `ec62cb0aa`, I foolishly replaced ExecEvalWholeRowVar's lookup_rowtype_tupdesc_domain call with just lookup_rowtype_tupdesc, because I didn't see how a domain could be involved there, and there were no regression test cases to jog my memory. But the existing code was correct, so revert that change and add a test case showing why it's necessary. (Note: per comment in struct DatumTupleFields, it is correct to produce an output tuple that's labeled with the base composite type, not the domain; hence just blindly looking through the domain is correct here.) Per bug #17515 from Dan Kubb. Back-patch to v11 where domains over composites became a thing. Discussion: https://postgr.es/m/17515-a24737438363aca0@postgresql.org	2022-06-10 10:35:57 -04:00
Peter Eisentraut	925816684a	Fix whitespace	2022-06-08 13:42:39 +02:00
Tom Lane	a36196972b	Fix off-by-one loop termination condition in pg_stat_get_subscription(). pg_stat_get_subscription scanned one more LogicalRepWorker array entry than is really allocated. In the worst case this could lead to SIGSEGV, if the LogicalRepCtx data structure is near the end of shared memory. That seems quite unlikely though (thanks to the ordering of calls in CreateSharedMemoryAndSemaphores) and we've heard no field reports of it. A more likely misbehavior is one row of garbage data in the function's result, but even that is not real likely because of the check that the pid field matches some live backend. Report and fix by Kuntal Ghosh. This bug is old, so back-patch to all supported branches. Discussion: https://postgr.es/m/CAGz5QCJykEDzW6jQK6Yz7Qh_PMtD=95de_7QoocbVR2Qy8hWZA@mail.gmail.com	2022-06-07 15:34:30 -04:00
Tom Lane	16d68007cd	Don't fail on libpq-generated error reports in ecpg_raise_backend(). An error PGresult generated by libpq itself, such as a report of connection loss, won't have broken-down error fields. ecpg_raise_backend() blithely assumed that PG_DIAG_MESSAGE_PRIMARY would always be present, and would end up passing a NULL string pointer to snprintf when it isn't. That would typically crash before `3779ac62d`, and it would fail to provide a useful error report in any case. Best practice is to substitute PQerrorMessage(conn) in such cases, so do that. Per bug #17421 from Masayuki Hirose. Back-patch to all supported branches. Discussion: https://postgr.es/m/17421-790ff887e3188874@postgresql.org	2022-06-06 11:20:36 -04:00
Michael Paquier	b364cfdfaf	Fix psql's single transaction mode on client-side errors with -c/-f switches psql --single-transaction is able to handle multiple -c and -f switches in a single transaction since `d5563d7d`, but this had the surprising behavior of forcing a transaction COMMIT even if psql failed with an error in the client (for example incorrect path given to \copy), which would generate an error, but still commit any changes that were already applied in the backend. This commit makes the behavior more consistent, by enforcing a transaction ROLLBACK if any commands fail, both client-side and backend-side, so as no changes are applied if one error happens in any of them. Some tests are added on HEAD to provide some coverage about all that. Backend-side errors are unreliable as IPC::Run can complain on SIGPIPE if psql quits before reading a query result, but that should work properly in the case where any errors come from psql itself, which is what the original report is about. Reported-by: Christoph Berg Author: Kyotaro Horiguchi, Michael Paquier Discussion: https://postgr.es/m/17504-76b68018e130415e@postgresql.org Backpatch-through: 10	2022-06-06 11:07:27 +09:00
Tom Lane	60ca2e8418	Silence compiler warnings from some older compilers. Since `a117cebd6`, some older gcc versions issue "variable may be used uninitialized in this function" complaints for brin_summarize_range. Silence that using the same coding pattern as in bt_index_check_internal; arguably, `a117cebd6` had too narrow a view of which compilers might give trouble. Nathan Bossart and Tom Lane. Back-patch as the previous commit was. Discussion: https://postgr.es/m/20220601163537.GA2331988@nathanxps13	2022-06-01 17:21:45 -04:00
Tom Lane	eeac7dd9ff	Fix pl/perl test case so it will still work under Perl 5.36. Perl 5.36 has reclassified the warning condition that this test case used, so that the expected error fails to appear. Tweak the test so it instead exercises a case that's handled the same way in all Perl versions of interest. This appears to meet our standards for back-patching into out-of-support branches: it changes no user-visible behavior but enables testing of old branches with newer tools. Hence, back-patch as far as 9.2. Dagfinn Ilmari Mannsåker, per report from Jitka Plesníková. Discussion: https://postgr.es/m/564579.1654093326@sss.pgh.pa.us	2022-06-01 16:15:47 -04:00
Tom Lane	c73748b68a	Ensure ParseTzFile() closes the input file after failing. We hadn't noticed this because (a) few people feed invalid timezone abbreviation files to the server, and (b) in typical scenarios guc.c would throw ereport(ERROR) and then transaction abort handling would silently clean up the leaked file reference. However, it was possible to observe file leakage warnings if one breaks an already-active abbreviation file, because guc.c does not throw ERROR when loading supposedly-validated settings during session start or SIGHUP processing. Report and fix by Kyotaro Horiguchi (cosmetic adjustments by me) Discussion: https://postgr.es/m/20220530.173740.748502979257582392.horikyota.ntt@gmail.com	2022-05-31 14:47:44 -04:00
Michael Paquier	1e6802990c	Handle NULL for short descriptions of custom GUC variables If a short description is specified as NULL in one of the various DefineCustomXXXVariable() functions available to external modules to define a custom parameter, SHOW ALL would crash. This change teaches SHOW ALL to properly handle NULL short descriptions, as well as any code paths that manipulate it, to gain in flexibility. Note that help_config.c was already able to do that, when describing a set of GUCs for postgres --describe-config. Author: Steve Chavez Reviewed by: Nathan Bossart, Andres Freund, Michael Paquier, Tom Lane Discussion: https://postgr.es/m/CAGRrpzY6hO-Kmykna_XvsTv8P2DshGiU6G3j8yGao4mk0CqjHA%40mail.gmail.com Backpatch-through: 10	2022-05-28 12:12:51 +09:00
Tom Lane	9e3dbc6fd9	Remove misguided SSL key file ownership check in libpq. Commits `a59c79564` et al. tried to sync libpq's SSL key file permissions checks with what we've used for years in the backend. We did not intend to create any new failure cases, but it turns out we did: restricting the key file's ownership breaks cases where the client is allowed to read a key file despite not having the identical UID. In particular a client running as root used to be able to read someone else's key file; and having seen that I suspect that there are other, less-dubious use cases that this restriction breaks on some platforms. We don't really need an ownership check, since if we can read the key file despite its having restricted permissions, it must have the right ownership --- under normal conditions anyway, and the point of this patch is that any additional corner cases where that works should be deemed allowable, as they have been historically. Hence, just drop the ownership check, and rearrange the permissions check to get rid of its faulty assumption that geteuid() can't be zero. (Note that the comparable backend-side code doesn't have to cater for geteuid() == 0, since the server rejects that very early on.) This does have the end result that the permissions safety check used for a root user's private key file is weaker than that used for anyone else's. While odd, root really ought to know what she's doing with file permissions, so I think this is acceptable. Per report from Yogendra Suralkar. Like the previous patch, back-patch to all supported branches. Discussion: https://postgr.es/m/MW3PR15MB3931DF96896DC36D21AFD47CA3D39@MW3PR15MB3931.namprd15.prod.outlook.com	2022-05-26 14:14:05 -04:00
Tom Lane	fefd546317	Show 'AS "?column?"' explicitly when it's important. ruleutils.c was coded to suppress the AS label for a SELECT output expression if the column name is "?column?", which is the parser's fallback if it can't think of something better. This is fine, and avoids ugly clutter, so long as (1) nothing further up in the parse tree relies on that column name or (2) the same fallback would be assigned when the rule or view definition is reloaded. Unfortunately (2) is far from certain, both because ruleutils.c might print the expression in a different form from how it was originally written and because FigureColname's rules might change in future releases. So we shouldn't rely on that. Detecting exactly whether there is any outer-level use of a SELECT column name would be rather expensive. This patch takes the simpler approach of just passing down a flag indicating whether there could be any outer use; for example, the output column names of a SubLink are not referenceable, and we also do not care about the names exposed by the right-hand side of a setop. This is sufficient to suppress unwanted clutter in all but one case in the regression tests. That seems like reasonable evidence that it won't be too much in users' faces, while still fixing the cases we need to fix. Per bug #17486 from Nicolas Lutic. This issue is ancient, so back-patch to all supported branches. Discussion: https://postgr.es/m/17486-1ad6fd786728b8af@postgresql.org	2022-05-21 14:45:58 -04:00
Alvaro Herrera	3753a169e1	Fix DDL deparse of CREATE OPERATOR CLASS When an implicit operator family is created, it wasn't getting reported. Make it do so. This has always been missing. Backpatch to 10. Author: Masahiko Sawada <sawada.mshk@gmail.com> Reported-by: Leslie LEMAIRE <leslie.lemaire@developpement-durable.gouv.fr> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Michael Paquiër <michael@paquier.xyz> Discussion: https://postgr.es/m/f74d69e151b22171e8829551b1159e77@developpement-durable.gouv.fr	2022-05-20 18:52:55 +02:00
Alvaro Herrera	99867e7277	Backpatch regression tests added by `2d689babe3` A new plpgsql test function was added in 14 and up to cover for a bugfix that was not backpatchable. We can add it to older versions as a way to cover other bits of DDL event triggers, with an exception clause to avoid the problematic corner case. Originally authored by Michaël Paquier. Backpatch: 10 through 13. Discussion: https://postgr.es/m/202205201523.7m5jbfvyanmj@alvherre.pgsql	2022-05-20 17:52:16 +02:00
Alvaro Herrera	5fd0cccc11	Repurpose PROC_COPYABLE_FLAGS as PROC_XMIN_FLAGS This is a slight, convenient semantics change from what commit `0f0cfb4940` ("Fix parallel operations that prevent oldest xmin from advancing") introduced that lets us simplify the coding in the one place where it is used. Backpatch to 13. This is related to commit `6fea65508a` ("Tighten ComputeXidHorizons' handling of walsenders") rewriting the code site where this is used, which has not yet been backpatched, but it may well be in the future. Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Discussion: https://postgr.es/m/202204191637.eldwa2exvguw@alvherre.pgsql	2022-05-19 16:20:32 +02:00
Alvaro Herrera	5139db5563	Update xml_1.out and xml_2.out Commit `0fbf011200` should have updated them but didn't.	2022-05-18 23:19:53 +02:00
Alvaro Herrera	80656f00f8	Check column list length in XMLTABLE/JSON_TABLE alias We weren't checking the length of the column list in the alias clause of an XMLTABLE or JSON_TABLE function (a "tablefunc" RTE), and it was possible to make the server crash by passing an overly long one. Fix it by throwing an error in that case, like the other places that deal with alias lists. In passing, modify the equivalent test used for join RTEs to look like the other ones, which was different for no apparent reason. This bug came in when XMLTABLE was born in version 10; backpatch to all stable versions. Reported-by: Wang Ke <krking@zju.edu.cn> Discussion: https://postgr.es/m/17480-1c9d73565bb28e90@postgresql.org	2022-05-18 20:28:31 +02:00
Michael Paquier	2e9559b302	Fix control file update done in restartpoints still running after promotion If a cluster is promoted (aka the control file shows a state different than DB_IN_ARCHIVE_RECOVERY) while CreateRestartPoint() is still processing, this function could miss an update of the control file for "checkPoint" and "checkPointCopy" but still do the recycling and/or removal of the past WAL segments, assuming that the to-be-updated LSN values should be used as reference points for the cleanup. This causes a follow-up restart attempting crash recovery to fail with a PANIC on a missing checkpoint record if the end-of-recovery checkpoint triggered by the promotion did not complete while the cluster abruptly stopped or crashed before the completion of this checkpoint. The PANIC would be caused by the redo LSN referred in the control file as located in a segment already gone, recycled by the previous restartpoint with "checkPoint" out-of-sync in the control file. This commit fixes the update of the control file during restartpoints so as "checkPoint" and "checkPointCopy" are updated even if the cluster has been promoted while a restartpoint is running, to be on par with the set of WAL segments actually recycled in the end of CreateRestartPoint(). `7863ee4` has fixed this problem already on master, but the release timing of the latest point versions did not let me enough time to study and fix that on all the stable branches. Reported-by: Fujii Masao, Rui Zhao Author: Kyotaro Horiguchi Reviewed-by: Nathan Bossart, Michael Paquier Discussion: https://postgr.es/m/20220316.102444.2193181487576617583.horikyota.ntt@gmail.com Backpatch-through: 10	2022-05-16 11:26:26 +09:00
Tom Lane	b7579b25c8	Make pull_var_clause() handle GroupingFuncs exactly like Aggrefs. This follows in the footsteps of commit `2591ee8ec` by removing one more ill-advised shortcut from planning of GroupingFuncs. It's true that we don't intend to execute the argument expression(s) at runtime, but we still have to process any Vars appearing within them, or we risk failure at setrefs.c time (or more fundamentally, in EXPLAIN trying to print such an expression). Vars in upper plan nodes have to have referents in the next plan level, whether we ever execute 'em or not. Per bug #17479 from Michael J. Sullivan. Back-patch to all supported branches. Richard Guo Discussion: https://postgr.es/m/17479-6260deceaf0ad304@postgresql.org	2022-05-12 11:31:46 -04:00
Amit Kapila	55558df237	Fix the logical replication timeout during large transactions. The problem is that we don't send keep-alive messages for a long time while processing large transactions during logical replication where we don't send any data of such transactions. This can happen when the table modified in the transaction is not published or because all the changes got filtered. We do try to send the keep_alive if necessary at the end of the transaction (via WalSndWriteData()) but by that time the subscriber-side can timeout and exit. To fix this we try to send the keepalive message if required after processing certain threshold of changes. Reported-by: Fabrice Chapuis Author: Wang wei and Amit Kapila Reviewed By: Masahiko Sawada, Euler Taveira, Hou Zhijie, Hayato Kuroda Backpatch-through: 10 Discussion: https://postgr.es/m/CAA5-nLARN7-3SLU_QUxfy510pmrYK6JJb=bk3hcgemAM_pAv+w@mail.gmail.com	2022-05-11 10:41:24 +05:30
Michael Paquier	b9d70ef34b	Improve setup of environment values for commands in MSVC's vcregress.pl The current setup assumes that commands for lz4, zstd and gzip always exist by default if not enforced by a user's environment. However, vcpkg, as one example, installs libraries but no binaries, so this default setup to assume that a command should always be present would cause failures. This commit improves the detection of such external commands as follows: * If a ENV value is available, trust the environment/user and use it. * If a ENV value is not available, check its execution by looking in the current PATH, by launching a simple "$command --version" (that should be portable enough). On execution failure, ignore ENV{command}. On execution success, set ENV{command} = "$command". Note that this new rule applies to gzip, lz4 and zstd but not tar that we assume will always exist. Those commands are set up in the environment only when using bincheck and taptest. The CI includes all those commands and I have checked that their setup is correct there. I have also tested this change in a MSVC environment where we have none of those commands. While on it, remove the references to lz4 from the documentation and vcregress.pl in ~v13. --with-lz4 has been added in v14~ so there is no point to have this information in these older branches. Reported-by: Andrew Dunstan Reviewed-by: Andrew Dunstan Discussion: https://postgr.es/m/14402151-376b-a57a-6d0c-10ad12608e12@dunslane.net Backpatch-through: 10	2022-05-11 10:22:34 +09:00
Tom Lane	91a3a74c65	Fix core dump in transformValuesClause when there are no columns. The parser code that transformed VALUES from row-oriented to column-oriented lists failed if there were zero columns. You can't write that straightforwardly (though probably you should be able to), but the case can be reached by expanding a "tab.*" reference to a zero-column table. Per bug #17477 from Wang Ke. Back-patch to all supported branches. Discussion: https://postgr.es/m/17477-0af3c6ac6b0a6ae0@postgresql.org	2022-05-09 14:15:37 -04:00
Tom Lane	7f0754bc4c	Revert "Disallow infinite endpoints in generate_series() for timestamps." This reverts commit `eafdf9de06` and its back-branch counterparts. Corey Huinker pointed out that we'd discussed this exact change back in 2016 and rejected it, on the grounds that there's at least one usage pattern with LIMIT where an infinite endpoint can usefully be used. Perhaps that argument needs to be re-litigated, but there's no time left before our back-branch releases. To keep our options open, restore the status quo ante; if we do end up deciding to change things, waiting one more quarter won't hurt anything. Rather than just doing a straight revert, I added a new test case demonstrating the usage with LIMIT. That'll at least remind us of the issue if we forget again. Discussion: https://postgr.es/m/3603504.1652068977@sss.pgh.pa.us Discussion: https://postgr.es/m/CADkLM=dzw0Pvdqp5yWKxMd+VmNkAMhG=4ku7GnCZxebWnzmz3Q@mail.gmail.com	2022-05-09 11:40:41 -04:00
Noah Misch	88743d581e	In REFRESH MATERIALIZED VIEW, set user ID before running user code. It intended to, but did not, achieve this. Adopt the new standard of setting user ID just after locking the relation. Back-patch to v10 (all supported versions). Reviewed by Simon Riggs. Reported by Alvaro Herrera. Security: CVE-2022-1552	2022-05-09 08:35:12 -07:00
Noah Misch	35edcc0cee	Make relation-enumerating operations be security-restricted operations. When a feature enumerates relations and runs functions associated with all found relations, the feature's user shall not need to trust every user having permission to create objects. BRIN-specific functionality in autovacuum neglected to account for this, as did pg_amcheck and CLUSTER. An attacker having permission to create non-temp objects in at least one schema could execute arbitrary SQL functions under the identity of the bootstrap superuser. CREATE INDEX (not a relation-enumerating operation) and REINDEX protected themselves too late. This change extends to the non-enumerating amcheck interface. Back-patch to v10 (all supported versions). Sergey Shinderuk, reviewed (in earlier versions) by Alexander Lakhin. Reported by Alexander Lakhin. Security: CVE-2022-1552	2022-05-09 08:35:12 -07:00
Peter Eisentraut	916463773c	Translation updates Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: 7c8dcb6669ccc6ae33090d02ed92f0bad6ada742	2022-05-09 12:27:22 +02:00
Andres Freund	6348e46850	Disable 031_recovery_conflict.pl until after minor releases. `f40d362a66` disabled part of 031_recovery_conflict.pl due to instability that's not trivial to fix in the back branches. That fixed most of the issues. But there was one more failure (on lapwing / REL_10_STABLE). That failure looks like it might be caused by a genuine problem. Disable the test until after the set of releases, to avoid packagers etc potentially having to fight with a test failure they can't do anything about. Discussion: https://postgr.es/m/3447060.1652032749@sss.pgh.pa.us Backpatch: 10-14	2022-05-08 18:06:36 -07:00
Andres Freund	4e39957e6e	Temporarily skip recovery deadlock test in back branches. The recovery deadlock test has a timing issue that was fixed in `5136967f1e` in HEAD. Unfortunately the same fix doesn't quite work in the back branches: 1) adjust_conf() doesn't exist, which is easy enough to work around 2) a restart cleares the recovery conflict stats < 15. These issues can be worked around, but given the upcoming set of minor releases, skip the problematic test for now. The buildfarm doesn't show failures in other parts of 031_recovery_conflict.pl. Discussion: https://postgr.es/m/20220506155827.dfnaheq6ufylwrqf@alap3.anarazel.de Backpatch: 10-14	2022-05-06 09:07:39 -07:00
Andres Freund	6ab90d215a	Backpatch addition of pump_until() more completely. In `a2ab9c06ea` I just backpatched the introduction of pump_until(), without changing the existing local definitions (as `6da65a3f9a`). The necessary changes seemed more verbose than desirable. However, that leads to warnings, as I failed to realize... Backpatch to all versions containing pump_until() calls before `f74496dd61` (there's none in 10). Discussion: https://postgr.es/m/2808491.1651802860@sss.pgh.pa.us Discussion: https://postgr.es/m/18b37361-b482-b9d8-f30d-6115cd5ce25c@enterprisedb.com Backpatch: 11-14	2022-05-06 08:39:07 -07:00
Tom Lane	e9735d1af1	Update time zone data files to tzdata release 2022a. DST law changes in Palestine. Historical corrections for Chile and Ukraine.	2022-05-05 14:55:10 -04:00
Andres Freund	27e97d0747	Revert "Fix timing issue in deadlock recovery conflict test." This reverts commit `8a3a8d9f10`.	2022-05-04 14:16:41 -07:00
Andres Freund	8a3a8d9f10	Fix timing issue in deadlock recovery conflict test. Per buildfarm members longfin and skink. Discussion: https://postgr.es/m/20220413002626.udl7lll7f3o7nre7@alap3.anarazel.de Backpatch: 10-	2022-05-04 12:57:20 -07:00
Andres Freund	0446d3bf39	Backpatch 031_recovery_conflict.pl. The prior commit showed that the introduction of recovery conflict tests was a good idea. Without these tests it's hard to know that the fix didn't break something... 031_recovery_conflict.pl was introduced in `9f8a050f68` and extended in `21e184403b`. Discussion: https://postgr.es/m/20220413002626.udl7lll7f3o7nre7@alap3.anarazel.de Backpatch: 10-14	2022-05-02 18:29:35 -07:00
Andres Freund	57c5ad168b	Fix possibility of self-deadlock in ResolveRecoveryConflictWithBufferPin(). The tests added in `9f8a050f68` failed nearly reliably on FreeBSD in CI, and occasionally on the buildfarm. That turns out to be caused not by a bug in the test, but by a longstanding bug in recovery conflict handling. The standby timeout handler, used by ResolveRecoveryConflictWithBufferPin(), executed SendRecoveryConflictWithBufferPin() inside a signal handler. A bad idea, because the deadlock timeout handler (or a spurious latch set) could have interrupted ProcWaitForSignal(). If unlucky that could cause a self-deadlock on ProcArrayLock, if the deadlock check is in SendRecoveryConflictWithBufferPin()->CancelDBBackends(). To fix, set a flag in StandbyTimeoutHandler(), and check the flag in ResolveRecoveryConflictWithBufferPin(). Subsequently the recovery conflict tests will be backpatched. Discussion: https://postgr.es/m/20220413002626.udl7lll7f3o7nre7@alap3.anarazel.de Backpatch: 10-	2022-05-02 18:28:10 -07:00
Andres Freund	90abe1e17f	Backpatch addition of wait_for_log(), pump_until(). These were originally introduced in `a2ab9c06ea` and `a2ab9c06ea`, as they are needed by a about-to-be-backpatched test. Discussion: https://postgr.es/m/20220413002626.udl7lll7f3o7nre7@alap3.anarazel.de Backpatch: 10-14	2022-05-02 18:09:43 -07:00
Etsuro Fujita	d85f2bfa09	Fix typo in comment.	2022-05-02 16:45:04 +09:00
Andrew Dunstan	d9cede2c3b	Inhibit mingw CRT's auto-globbing of command line arguments For some reason by default the mingw C Runtime takes it upon itself to expand program arguments that look like shell globbing characters. That has caused much scratching of heads and mis-attribution of the causes of some TAP test failures, so stop doing that. This removes an inconsistency with Windows binaries built with MSVC, which have no such behaviour. Per suggestion from Noah Misch. Backpatch to all live branches. Discussion: https://postgr.es/m/20220423025927.GA1274057@rfd.leadboat.com	2022-04-25 15:50:01 -04:00
Tom Lane	e6440d5007	Remove inadequate assertion check in CTE inlining. inline_cte() expected to find exactly as many references to the target CTE as its cterefcount indicates. While that should be accurate for the tree as emitted by the parser, there are some optimizations that occur upstream of here that could falsify it, notably removal of unused subquery output expressions. Trying to make the accounting 100% accurate seems expensive and doomed to future breakage. It's not really worth it, because all this code is protecting is downstream assumptions that every referenced CTE has a plan. Let's convert those assertions to regular test-and-elog just in case there's some actual problem, and then drop the failing assertion. Per report from Tomas Vondra (thanks also to Richard Guo for analysis). Back-patch to v12 where the faulty code came in. Discussion: https://postgr.es/m/29196a1e-ed47-c7ca-9be2-b1c636816183@enterprisedb.com	2022-04-21 17:58:52 -04:00
Andrew Dunstan	1c60a99dd9	Support new perl module namespace in stable branches Commit `b3b4d8e68a` moved our perl test modules to a better namespace structure, but this has made life hard for people wishing to backpatch improvements in the TAP tests. Here we alleviate much of that difficulty by implementing the new module names on top of the old modules, mostly by using a little perl typeglob aliasing magic, so that we don't have a dual maintenance burden. This should work both for the case where a new test is backpatched and the case where a fix to an existing test that uses the new namespace is backpatched. Reviewed by Michael Paquier Per complaint from Andres Freund Discussion: https://postgr.es/m/20220418141530.nfxtkohefvwnzncl@alap3.anarazel.de Applied to branches 10 through 14	2022-04-21 09:20:00 -04:00
Peter Geoghegan	1272630a24	Fix CLUSTER tuplesorts on abbreviated expressions. CLUSTER sort won't use the datum1 SortTuple field when clustering against an index whose leading key is an expression. This makes it unsafe to use the abbreviated keys optimization, which was missed by the logic that sets up SortSupport state. Affected tuplesorts output tuples in a completely bogus order as a result (the wrong SortSupport based comparator was used for the leading attribute). This issue is similar to the bug fixed on the master branch by recent commit `cc58eecc5d`. But it's a far older issue, that dates back to the introduction of the abbreviated keys optimization by commit `4ea51cdfe8`. Backpatch to all supported versions. Author: Peter Geoghegan <pg@bowt.ie> Author: Thomas Munro <thomas.munro@gmail.com> Discussion: https://postgr.es/m/CA+hUKG+bA+bmwD36_oDxAoLrCwZjVtST2fqe=b4=qZcmU7u89A@mail.gmail.com Backpatch: 10-	2022-04-20 17:17:39 -07:00
Tom Lane	8275ba773d	Disallow infinite endpoints in generate_series() for timestamps. Such cases will lead to infinite loops, so they're of no practical value. The numeric variant of generate_series() already threw error for this, so borrow its message wording. Per report from Richard Wesley. Back-patch to all supported branches. Discussion: https://postgr.es/m/91B44E7B-68D5-448F-95C8-B4B3B0F5DEAF@duckdblabs.com	2022-04-20 18:08:15 -04:00
Tom Lane	f583633bc1	Fix breakage in AlterFunction(). An ALTER FUNCTION command that tried to update both the function's proparallel property and its proconfig list failed to do the former, because it stored the new proparallel value into a tuple that was no longer the interesting one. Carelessness in `7aea8e4f2`. (I did not bother with a regression test, because the only likely future breakage would be for someone to ignore the comment I added and add some other field update after the heap_modify_tuple step. A test using existing function properties could not catch that.) Per report from Bryn Llewellyn. Back-patch to all supported branches. Discussion: https://postgr.es/m/8AC9A37F-99BD-446F-A2F7-B89AD0022774@yugabyte.com	2022-04-19 23:03:59 -04:00
Amit Kapila	82d4a17a17	Fix the check to limit sync workers. We don't allow to invoke more sync workers once we have reached the sync worker limit per subscription. But the check to enforce this also doesn't allow to launch an apply worker if it gets restarted. This code was introduced by commit `de43897122` but we caught the problem only with the test added by recent commit `c91f71b9dc` which started failing occasionally in the buildfarm. As per buildfarm. Diagnosed-by: Amit Kapila, Masahiko Sawada, Tomas Vondra Author: Amit Kapila Backpatch-through: 10 Discussion: https://postgr.es/m/CAH2L28vddB_NFdRVpuyRBJEBWjz4BSyTB=_ektNRH8NJ1jf95g@mail.gmail.com https://postgr.es/m/f90d2b03-4462-ce95-a524-d91464e797c8@enterprisedb.com	2022-04-19 09:08:05 +05:30
Tom Lane	69cefb3fb8	Avoid invalid array reference in transformAlterTableStmt(). Don't try to look at the attidentity field of system attributes, because they're not there in the TupleDescAttr array. Sometimes this is harmless because we accidentally pick up a zero, but otherwise we'll report "no owned sequence found" from an attempt to alter a system attribute. (It seems possible that a SIGSEGV could occur, too, though I've not seen it in testing.) It's not in this function's charter to complain that you can't alter a system column, so instead just hard-wire an assumption that system attributes aren't identities. I didn't bother with a regression test because the appearance of the bug is very erratic. Per bug #17465 from Roman Zharkov. Back-patch to all supported branches. (There's not actually a live bug before v12, because before that get_attidentity() did the right thing anyway. But for consistency I changed the test in the older branches too.) Discussion: https://postgr.es/m/17465-f2a554a6cb5740d3@postgresql.org	2022-04-18 12:16:45 -04:00
Michael Paquier	a6fc64b9a8	Fix race in TAP test 002_archiving.pl when restoring history file This test, introduced in `df86e52`, uses a second standby to check that it is able to remove correctly RECOVERYHISTORY and RECOVERYXLOG at the end of recovery. This standby uses the archives of the primary to restore its contents, with some of the archive's contents coming from the first standby previously promoted. In slow environments, it was possible that the test did not check what it should, as the history file generated by the promotion of the first standby may not be stored yet on the archives the second standby feeds on. So, it could be possible that the second standby selects an incorrect timeline, without restoring a history file at all. This commits adds a wait phase to make sure that the history file required by the second standby is archived before this cluster is created. This relies on poll_query_until() with pg_stat_file() and an absolute path, something not supported in REL_10_STABLE. While on it, this adds a new test to check that the history file has been restored by looking at the logs of the second standby. This ensures that a RECOVERYHISTORY, whose removal needs to be checked, is created in the first place. This should make the test more robust. This test has been introduced by `df86e52`, but it came in light as an effect of the bug fixed by `acf1dd42`, where the extra restore_command calls made the test much slower. Reported-by: Andres Freund Discussion: https://postgr.es/m/YlT23IvsXkGuLzFi@paquier.xyz Backpatch-through: 11	2022-04-18 11:40:18 +09:00
Noah Misch	b404513de8	Add a temp-install prerequisite to src/interfaces/ecpg "checktcp". The target failed, tested $PATH binaries, or tested a stale temporary installation. Commit `c66b438db6` missed this. Back-patch to v10 (all supported versions).	2022-04-16 17:43:57 -07:00
Robert Haas	d18c913b78	Rethink the delay-checkpoint-end mechanism in the back-branches. The back-patch of commit `bbace5697d` had the unfortunate effect of changing the layout of PGPROC in the back-branches, which could break extensions. This happened because it changed the delayChkpt from type bool to type int. So, change it back, and add a new bool delayChkptEnd field instead. The new field should fall within what used to be padding space within the struct, and so hopefully won't cause any extensions to break. Per report from Markus Wanner and discussion with Tom Lane and others. Patch originally by me, somewhat revised by Markus Wanner per a suggestion from Michael Paquier. A very similar patch was developed by Kyotaro Horiguchi, but I failed to see the email in which that was posted before writing one of my own. Discussion: http://postgr.es/m/CA+Tgmoao-kUD9c5nG5sub3F7tbo39+cdr8jKaOVEs_1aBWcJ3Q@mail.gmail.com Discussion: http://postgr.es/m/20220406.164521.17171257901083417.horikyota.ntt@gmail.com	2022-04-14 11:10:11 -04:00
Tom Lane	44096c31ea	Prevent access to no-longer-pinned buffer in heapam_tuple_lock(). heap_fetch() used to have a "keep_buf" parameter that told it to return ownership of the buffer pin to the caller after finding that the requested tuple TID exists but is invisible to the specified snapshot. This was thoughtlessly removed in commit `5db6df0c0`, which broke heapam_tuple_lock() (formerly EvalPlanQualFetch) because that function needs to do more accesses to the tuple even if it's invisible. The net effect is that we would continue to touch the page for a microsecond or two after releasing pin on the buffer. Usually no harm would result; but if a different session decided to defragment the page concurrently, we could see garbage data and mistakenly conclude that there's no newer tuple version to chain up to. (It's hard to say whether this has happened in the field. The bug was actually found thanks to a later change that allowed valgrind to detect accesses to non-pinned buffers.) The most reasonable way to fix this is to reintroduce keep_buf, although I made it behave slightly differently: buffer ownership is passed back only if there is a valid tuple at the requested TID. In HEAD, we can just add the parameter back to heap_fetch(). To avoid an API break in the back branches, introduce an additional function heap_fetch_extended() in those branches. In HEAD there is an additional, less obvious API change: tuple->t_data will be set to NULL in all cases where buffer ownership is not returned, in particular when the tuple exists but fails the time qual (and !keep_buf). This is to defend against any other callers attempting to access non-pinned buffers. We concluded that making that change in back branches would be more likely to introduce problems than cure any. In passing, remove a comment about heap_fetch that was obsoleted by `9a8ee1dc6`. Per bug #17462 from Daniil Anisimov. Back-patch to v12 where the bug was introduced. Discussion: https://postgr.es/m/17462-9c98a0f00df9bd36@postgresql.org	2022-04-13 13:35:02 -04:00
Tom Lane	87166d25a4	Suppress "variable 'pagesaving' set but not used" warning. With asserts disabled, late-model clang notices that this variable is incremented but never otherwise read. Discussion: https://postgr.es/m/3171401.1649275153@sss.pgh.pa.us	2022-04-06 17:03:35 -04:00
Peter Eisentraut	8f4b5b0909	Remove obsolete comment accidentally left behind by `4cb658af70`	2022-04-02 07:28:23 +02:00
Tom Lane	fb1d7f4519	Add missing newline in one libpq error message. Oversight in commit `a59c79564`. Back-patch, as that was. Noted by Peter Eisentraut. Discussion: https://postgr.es/m/7f85ef6d-250b-f5ec-9867-89f0b16d019f@enterprisedb.com	2022-03-31 11:24:26 -04:00
Etsuro Fujita	f042dc6a41	Fix typo in comment.	2022-03-30 19:00:04 +09:00
Alvaro Herrera	a54ed2de80	Revert "Fix replay of create database records on standby" This reverts commit `49d9cfc68b`. The approach taken by this patch has problems, so we'll come up with a radically different fix. Discussion: https://postgr.es/m/CA+TgmoYcUPL+WOJL2ZzhH=zmrhj0iOQ=iCFM0SuYqBbqZEamEg@mail.gmail.com	2022-03-29 15:36:21 +02:00
Andres Freund	344d89abf3	waldump: fix use-after-free in search_directory(). After closedir() dirent->d_name is not valid anymore. As there alerady are a few places relying on the limited lifetime of pg_waldump, do so here as well, and just pg_strdup() the string. The bug was introduced in `fc49e24fa6`. Found by UBSan, run locally. Backpatch: 11-, like `fc49e24fa6` itself.	2022-03-27 18:15:14 -07:00
Tom Lane	9016a2a3dc	Fix breakage of get_ps_display() in the PS_USE_NONE case. Commit `8c6d30f21` caused this function to fail to set *displen in the PS_USE_NONE code path. If the variable's previous value had been negative, that'd lead to a memory clobber at some call sites. We'd managed not to notice due to very thin test coverage of such configurations, but this appears to explain buildfarm member lorikeet's recent struggles. Credit to Andrew Dunstan for spotting the problem. Back-patch to v13 where the bug was introduced. Discussion: https://postgr.es/m/136102.1648320427@sss.pgh.pa.us	2022-03-27 12:57:57 -04:00
Tom Lane	4fbbea2c19	Suppress compiler warning in relptr_store(). clang 13 with -Wextra warns that "performing pointer subtraction with a null pointer has undefined behavior" in the places where freepage.c tries to set a relptr variable to constant NULL. This appears to be a compiler bug, but it's unlikely to get fixed instantly. Fortunately, we can work around it by introducing an inline support function, which seems like a good change anyway because it removes the macro's existing double-evaluation hazard. Backpatch to v10 where this code was introduced. Patch by me, based on an idea of Andres Freund's. Discussion: https://postgr.es/m/48826.1648310694@sss.pgh.pa.us	2022-03-26 14:29:48 -04:00
Tom Lane	6e9ffcf13a	Harden TAP tests that intentionally corrupt page checksums. The previous method for doing that was to write zeroes into a predetermined set of page locations. However, there's a roughly 1-in-64K chance that the existing checksum will match by chance, and yesterday several buildfarm animals started to reproducibly see that, resulting in test failures because no checksum mismatch was reported. Since the checksum includes the page LSN, test success depends on the length of the installation's WAL history, which is affected by (at least) the initial catalog contents, the set of locales installed on the system, and the length of the pathname of the test directory. Sooner or later we were going to hit a chance match, and today is that day. Harden these tests by specifically inverting the checksum field and leaving all else alone, thereby guaranteeing that the checksum is incorrect. In passing, fix places that were using seek() to set up for syswrite(), a combination that the Perl docs very explicitly warn against. We've probably escaped problems because no regular buffered I/O is done on these filehandles; but if it ever breaks, we wouldn't deserve or get much sympathy. Although we've only seen problems in HEAD, now that we recognize the environmental dependencies it seems like it might be just a matter of time until someone manages to hit this in back-branch testing. Hence, back-patch to v11 where we started doing this kind of test. Discussion: https://postgr.es/m/3192026.1648185780@sss.pgh.pa.us	2022-03-25 14:23:26 -04:00
Alvaro Herrera	50e3ed8e91	Fix replay of create database records on standby Crash recovery on standby may encounter missing directories when replaying create database WAL records. Prior to this patch, the standby would fail to recover in such a case. However, the directories could be legitimately missing. Consider a sequence of WAL records as follows: CREATE DATABASE DROP DATABASE DROP TABLESPACE If, after replaying the last WAL record and removing the tablespace directory, the standby crashes and has to replay the create database record again, the crash recovery must be able to move on. This patch adds a mechanism similar to invalid-page tracking, to keep a tally of missing directories during crash recovery. If all the missing directory references are matched with corresponding drop records at the end of crash recovery, the standby can safely continue following the primary. Backpatch to 13, at least for now. The bug is older, but fixing it in older branches requires more careful study of the interactions with commit `e6d8069522`, which appeared in 13. A new TAP test file is added to verify the condition. However, because it depends on commit `d6d317dbf6`, it can only be added to branch master. I (Álvaro) manually verified that the code behaves as expected in branch 14. It's a bit nervous-making to leave the code uncovered by tests in older branches, but leaving the bug unfixed is even worse. Also, the main reason this fix took so long is precisely that we couldn't agree on a good strategy to approach testing for the bug, so perhaps this is the best we can do. Diagnosed-by: Paul Guo <paulguo@gmail.com> Author: Paul Guo <paulguo@gmail.com> Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Author: Asim R Praveen <apraveen@pivotal.io> Discussion: https://postgr.es/m/CAEET0ZGx9AvioViLf7nbR_8tH9-=27DN5xWJ2P9-ROH16e4JUA@mail.gmail.com	2022-03-25 13:16:21 +01:00
Robert Haas	1ce14b6b2f	Fix possible recovery trouble if TRUNCATE overlaps a checkpoint. If TRUNCATE causes some buffers to be invalidated and thus the checkpoint does not flush them, TRUNCATE must also ensure that the corresponding files are truncated on disk. Otherwise, a replay from the checkpoint might find that the buffers exist but have the wrong contents, which may cause replay to fail. Report by Teja Mupparti. Patch by Kyotaro Horiguchi, per a design suggestion from Heikki Linnakangas, with some changes to the comments by me. Review of this and a prior patch that approached the issue differently by Heikki Linnakangas, Andres Freund, Álvaro Herrera, Masahiko Sawada, and Tom Lane. Discussion: http://postgr.es/m/BYAPR06MB6373BF50B469CA393C614257ABF00@BYAPR06MB6373.namprd06.prod.outlook.com	2022-03-24 14:36:06 -04:00
Andres Freund	c0f99bb520	Don't try to translate NULL in GetConfigOptionByNum(). Noticed via -fsanitize=undefined. Introduced when a few columns in GetConfigOptionByNum() / pg_settings started to be translated in `72be8c29a` / PG 12. Backpatch to all affected branches, for the same reasons as `46ab07ffda`. Discussion: https://postgr.es/m/20220323173537.ll7klrglnp4gn2um@alap3.anarazel.de Backpatch: 12-	2022-03-23 13:18:00 -07:00
Andres Freund	8014c61ebd	Don't call fwrite() with len == 0 when writing out relcache init file. Noticed via -fsanitize=undefined. Backpatch to all branches, for the same reasons as `46ab07ffda`. Discussion: https://postgr.es/m/20220323173537.ll7klrglnp4gn2um@alap3.anarazel.de Backpatch: 10-	2022-03-23 13:13:20 -07:00
Alvaro Herrera	ca26c64581	pg_upgrade: Upgrade an Assert to a real 'if' test It seems possible for the condition being tested to be true in production, and nobody would never know (except when some data eventually becomes corrupt?). Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://postgr.es/m//202109040001.zky3wgv2qeqg@alvherre.pgsql	2022-03-23 19:23:51 +01:00
Alvaro Herrera	98eb3e06ce	Fix "missing continuation record" after standby promotion Invalidate abortedRecPtr and missingContrecPtr after a missing continuation record is successfully skipped on a standby. This fixes a PANIC caused when a recently promoted standby attempts to write an OVERWRITE_RECORD with an LSN of the previously read aborted record. Backpatch to 10 (all stable versions). Author: Sami Imseih <simseih@amazon.com> Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://postgr.es/m/44D259DE-7542-49C4-8A52-2AB01534DCA9@amazon.com	2022-03-23 18:22:10 +01:00
Thomas Munro	f784fcdc44	Try to stabilize vacuum test. As commits `b700f96c` and `3414099c` did for the reloptions test, make sure VACUUM can always truncate the table as expected. Back-patch to 12, where vacuum_truncate arrived. Discussion: https://postgr.es/m/CAD21AoCNoWjYkdEtr%2BVDoF9v__V905AedKZ9iF%3DArgCtrbxZqw%40mail.gmail.com	2022-03-23 15:07:46 +13:00
Andres Freund	f183e23cc8	Add missing dependency of pg_dumpall to WIN32RES. When cross-building to windows, or building with mingw on windows, the build could fail with x86_64-w64-mingw32-gcc: error: win32ver.o: No such file or director because pg_dumpall didn't depend on WIN32RES, but it's recipe references it. The build nevertheless succeeded most of the time, due to pg_dump/pg_restore having the required dependency, causing win32ver.o to be built. Reported-By: Thomas Munro <thomas.munro@gmail.com> Discussion: https://postgr.es/m/CA+hUKGJeekpUPWW6yCVdf9=oBAcCp86RrBivo4Y4cwazAzGPng@mail.gmail.com Backpatch: 10-, omission present on all live branches	2022-03-22 08:28:52 -07:00
Michael Paquier	f32be9938c	Fix failures in SSL tests caused by out-of-tree keys and certificates This issue is environment-sensitive, where the SSL tests could fail in various way by feeding on defaults provided by sslcert, sslkey, sslrootkey, sslrootcert, sslcrl and sslcrldir coming from a local setup, as of ~/.postgresql/ by default. Horiguchi-san has reported two failures, but more advanced testing from me (aka inclusion of garbage SSL configuration in ~/.postgresql/ for all the configuration parameters) has showed dozens of failures that can be triggered in the whole test suite. History has showed that we are not good when it comes to address such issues, fixing them locally like in `dd87799`, and such problems keep appearing. This commit strengthens the entire test suite to put an end to this set of problems by embedding invalid default values in all the connection strings used in the tests. The invalid values are prefixed in each connection string, relying on the follow-up values passed in the connection string to enforce any invalid value previously set. Note that two tests related to CRLs are required to fail with certain pre-set configurations, but we can rely on enforcing an empty value instead after the invalid set of values. Reported-by: Kyotaro Horiguchi Reviewed-by: Andrew Dunstan, Daniel Gustafsson, Kyotaro Horiguchi Discussion: https://postgr.es/m/20220316.163658.1122740600489097632.horikyota.ntt@gmail.com backpatch-through: 10	2022-03-22 13:21:39 +09:00
Tom Lane	dfefe38fbe	Fix assorted missing logic for GroupingFunc nodes. The planner needs to treat GroupingFunc like Aggref for many purposes, in particular with respect to processing of the argument expressions, which are not to be evaluated at runtime. A few places hadn't gotten that memo, notably including subselect.c's processing of outer-level aggregates. This resulted in assertion failures or wrong plans for cases in which a GROUPING() construct references an outer aggregation level. Also fix missing special cases for GroupingFunc in cost_qual_eval (resulting in wrong cost estimates for GROUPING(), although it's not clear that that would affect plan shapes in practice) and in ruleutils.c (resulting in excess parentheses in pretty-print mode). Per bug #17088 from Yaoguang Chen. Back-patch to all supported branches. Richard Guo, Tom Lane Discussion: https://postgr.es/m/17088-e33882b387de7f5c@postgresql.org	2022-03-21 17:44:29 -04:00
Tom Lane	2241e5ceda	Fix risk of deadlock failure while dropping a partitioned index. DROP INDEX needs to lock the index's table before the index itself, else it will deadlock against ordinary queries that acquire the relation locks in that order. This is correctly mechanized for plain indexes by RangeVarCallbackForDropRelation; but in the case of a partitioned index, we neglected to lock the child tables in advance of locking the child indexes. We can fix that by traversing the inheritance tree and acquiring the needed locks in RemoveRelations, after we have acquired our locks on the parent partitioned table and index. While at it, do some refactoring to eliminate confusion between the actual and expected relkind in RangeVarCallbackForDropRelation. We can save a couple of syscache lookups too, by having that function pass back info that RemoveRelations will need. Back-patch to v11 where partitioned indexes were added. Jimmy Yih, Gaurab Dey, Tom Lane Discussion: https://postgr.es/m/BYAPR05MB645402330042E17D91A70C12BD5F9@BYAPR05MB6454.namprd05.prod.outlook.com	2022-03-21 12:22:13 -04:00
Tom Lane	88ae77588d	Fix incorrect xmlschema output for types timetz and timestamptz. The output of table_to_xmlschema() and allied functions includes a regex describing valid values for these types ... but the regex was itself invalid, as it failed to escape a literal "+" sign. Report and fix by Renan Soares Lopes. Back-patch to all supported branches. Discussion: https://postgr.es/m/7f6fabaa-3f8f-49ab-89ca-59fbfe633105@me.com	2022-03-18 16:01:42 -04:00
Tom Lane	5e144cc89b	Revert applying column aliases to the output of whole-row Vars. In commit `bf7ca1587`, I had the bright idea that we could make the result of a whole-row Var (that is, foo.) track any column aliases that had been applied to the FROM entry the Var refers to. However, that's not terribly logically consistent, because now the output of the Var is no longer of the named composite type that the Var claims to emit. `bf7ca1587` tried to handle that by changing the output tuple values to be labeled with a blessed RECORD type, but that's really pretty disastrous: we can wind up storing such tuples onto disk, whereupon they're not readable by other sessions. The only practical fix I can see is to give up on what `bf7ca1587` tried to do, and say that the column names of tuples produced by a whole-row Var are always those of the underlying named composite type, query aliases or no. While this introduces some inconsistencies, it removes others, so it's not that awful in the abstract. What is* kind of awful is to make such a behavioral change in a back-patched bug fix. But corrupt data is worse, so back-patched it will be. (A workaround available to anyone who's unhappy about this is to introduce an extra level of sub-SELECT, so that the whole-row Var is referring to the sub-SELECT's output and not to a named table type. Then the Var is of type RECORD to begin with and there's no issue.) Per report from Miles Delahunty. The faulty commit dates to 9.5, so back-patch to all supported branches. Discussion: https://postgr.es/m/2950001.1638729947@sss.pgh.pa.us	2022-03-17 18:18:05 -04:00
Tomas Vondra	27fafee72d	Fix publish_as_relid with multiple publications Commit `83fd4532a7` allowed publishing of changes via ancestors, for publications defined with publish_via_partition_root. But the way the ancestor was determined in get_rel_sync_entry() was incorrect, simply updating the same variable. So with multiple publications, replicating different ancestors, the outcome depended on the order of publications in the list - the value from the last loop was used, even if it wasn't the top-most ancestor. This is a probably rare situation, as in most cases publications do not overlap, so each partition has exactly one candidate ancestor to replicate as and there's no ambiguity. Fixed by tracking the "ancestor level" for each publication, and picking the top-most ancestor. Adds a test case, verifying the correct ancestor is used for publishing the changes and that this does not depend on order of publications in the list. Older releases have another bug in this loop - once all actions are replicated, the loop is terminated, on the assumption that inspecting additional publications is unecessary. But that misses the fact that those additional applications may replicate different ancestors. Fixed by removal of this break condition. We might still terminate the loop in some cases (e.g. when replicating all actions and the ancestor is the partition root). Backpatch to 13, where publish_via_partition_root was introduced. Initial report and fix by me, test added by Hou zj. Reviews and improvements by Amit Kapila. Author: Tomas Vondra, Hou zj, Amit Kapila Reviewed-by: Amit Kapila, Hou zj Discussion: https://postgr.es/m/d26d24dd-2fab-3c48-0162-2b7f84a9c893%40enterprisedb.com	2022-03-16 18:14:33 +01:00
Thomas Munro	51e760e5a6	Fix race between DROP TABLESPACE and checkpointing. Commands like ALTER TABLE SET TABLESPACE may leave files for the next checkpoint to clean up. If such files are not removed by the time DROP TABLESPACE is called, we request a checkpoint so that they are deleted. However, there is presently a window before checkpoint start where new unlink requests won't be scheduled until the following checkpoint. This means that the checkpoint forced by DROP TABLESPACE might not remove the files we expect it to remove, and the following ERROR will be emitted: ERROR: tablespace "mytblspc" is not empty To fix, add a call to AbsorbSyncRequests() just before advancing the unlink cycle counter. This ensures that any unlink requests forwarded prior to checkpoint start (i.e., when ckpt_started is incremented) will be processed by the current checkpoint. Since AbsorbSyncRequests() performs memory allocations, it cannot be called within a critical section, so we also need to move SyncPreCheckpoint() to before CreateCheckPoint()'s critical section. This is an old bug, so back-patch to all supported versions. Author: Nathan Bossart <nathandbossart@gmail.com> Reported-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Thomas Munro <thomas.munro@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20220215235845.GA2665318%40nathanxps13	2022-03-16 17:21:19 +13:00
Thomas Munro	cfdb303be7	Fix waiting in RegisterSyncRequest(). If we run out of space in the checkpointer sync request queue (which is hopefully rare on real systems, but common with very small buffer pool), we wait for it to drain. While waiting, we should report that as a wait event so that users know what is going on, and also handle postmaster death, since otherwise the loop might never terminate if the checkpointer has exited. Back-patch to 12. Although the problem exists in earlier releases too, the code is structured differently before 12 so I haven't gone any further for now, in the absence of field complaints. Reported-by: Andres Freund <andres@anarazel.de> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20220226213942.nb7uvb2pamyu26dj%40alap3.anarazel.de	2022-03-16 15:37:15 +13:00
Thomas Munro	5610411ac7	Back-patch LLVM 14 API changes. Since LLVM 14 has stopped changing and is about to be released, back-patch the following changes from the master branch: `e6a7600202` `807fee1a39` `a56e7b6601` Back-patch to 11, where LLVM JIT support came in.	2022-03-16 11:41:13 +13:00
Noah Misch	29ec94efd0	Introduce PG_TEST_TIMEOUT_DEFAULT for TAP suite non-elapsing timeouts. Slow hosts may avoid load-induced, spurious failures by setting environment variable PG_TEST_TIMEOUT_DEFAULT to some number of seconds greater than 180. Developers may see faster failures by setting that environment variable to some lesser number of seconds. In tests, write $PostgreSQL::Test::Utils::timeout_default wherever the convention has been to write 180. This change raises the default for some briefer timeouts. Back-patch to v10 (all supported versions). Discussion: https://postgr.es/m/20220218052842.GA3627003@rfd.leadboat.com	2022-03-04 18:53:17 -08:00
Tom Lane	59392746cb	Fix pg_regress to print the correct postmaster address on Windows. pg_regress reported "Unix socket" as the default location whenever HAVE_UNIX_SOCKETS is defined. However, that's not been accurate on Windows since `8f3ec75de`. Update this logic to match what libpq actually does now. This is just cosmetic, but still it's potentially misleading. Back-patch to v13 where `8f3ec75de` came in. Discussion: https://postgr.es/m/3894060.1646415641@sss.pgh.pa.us	2022-03-04 13:23:58 -05:00
Tom Lane	97031f4404	Fix bogus casting in BlockIdGetBlockNumber(). This macro cast the result to BlockNumber after shifting, not before, which is the wrong thing. Per the C spec, the uint16 fields would promote to int not unsigned int, so that (for 32-bit int) the shift potentially shifts a nonzero bit into the sign position. I doubt there are any production systems where this would actually end with the wrong answer, but it is undefined behavior per the C spec, and clang's -fsanitize=undefined option reputedly warns about it on some platforms. (I can't reproduce that right now, but the code is undeniably wrong per spec.) It's easy to fix by casting to BlockNumber (uint32) in the proper places. It's been wrong for ages, so back-patch to all supported branches. Report and patch by Zhihong Yu (cosmetic tweaking by me) Discussion: https://postgr.es/m/CALNJ-vT9r0DSsAOw9OXVJFxLENoVS_68kJ5x0p44atoYH+H4dg@mail.gmail.com	2022-03-03 19:03:42 -05:00
Tom Lane	1a027e6b7b	Clean up assorted failures under clang's -fsanitize=undefined checks. Most of these are cases where we could call memcpy() or other libc functions with a NULL pointer and a zero count, which is forbidden by POSIX even though every production version of libc allows it. We've fixed such things before in a piecemeal way, but apparently never made an effort to try to get them all. I don't claim that this patch does so either, but it gets every failure I observe in check-world, using clang 12.0.1 on current RHEL8. numeric.c has a different issue that the sanitizer doesn't like: "ln(-1.0)" will compute log10(0) and then try to assign the resulting -Inf to an integer variable. We don't actually use the result in such a case, so there's no live bug. Back-patch to all supported branches, with the idea that we might start running a buildfarm member that tests this case. This includes back-patching `c1132aae3` (Check the size in COPY_POINTER_FIELD), which previously silenced some of these issues in copyfuncs.c. Discussion: https://postgr.es/m/CALNJ-vT9r0DSsAOw9OXVJFxLENoVS_68kJ5x0p44atoYH+H4dg@mail.gmail.com	2022-03-03 18:13:24 -05:00
Tom Lane	6599d8f126	Allow root-owned SSL private keys in libpq, not only the backend. This change makes libpq apply the same private-key-file ownership and permissions checks that we have used in the backend since commit `9a83564c5`. Namely, that the private key can be owned by either the current user or root (with different file permissions allowed in the two cases). This allows system-wide management of key files, which is just as sensible on the client side as the server, particularly when the client is itself some application daemon. Sync the comments about this between libpq and the backend, too. Back-patch of `a59c79564` and `50f03473e` into all supported branches. David Steele Discussion: https://postgr.es/m/f4b7bc55-97ac-9e69-7398-335e212f7743@pgmasters.net	2022-03-02 11:57:02 -05:00
Tom Lane	9b2d762a28	Disallow execution of SPI functions during plperl function compilation. Perl can be convinced to execute user-defined code during compilation of a plperl function (or at least a plperlu function). That's not such a big problem as long as the activity is confined within the Perl interpreter, and it's not clear we could do anything about that anyway. However, if such code tries to use plperl's SPI functions, we have a bigger problem. In the first place, those functions are likely to crash because current_call_data->prodesc isn't set up yet. In the second place, because it isn't set up, we lack critical info such as whether the function is supposed to be read-only. And in the third place, this path allows code execution during function validation, which is strongly discouraged because of the potential for security exploits. Hence, reject execution of the SPI functions until compilation is finished. While here, add check_spi_usage_allowed() calls to various functions that hadn't gotten the memo about checking that. I think that perhaps plperl_sv_to_literal may have been intentionally omitted on the grounds that it was safe at the time; but if so, the addition of transforms functionality changed that. The others are more recently added and seem to be flat-out oversights. Per report from Mark Murawski. Back-patch to all supported branches. Discussion: https://postgr.es/m/9acdf918-7fff-4f40-f750-2ffa84f083d2@intellasoft.net	2022-02-25 17:40:44 -05:00
Andres Freund	0b1020a96d	pg_waldump: Fix error message for WAL files smaller than XLOG_BLCKSZ. When opening a WAL file smaller than XLOG_BLCKSZ (e.g. 0 bytes long) while determining the wal_segment_size, pg_waldump checked errno, despite errno not being set by the short read. Resulting in a bogus error message. Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://postgr.es/m/20220214.181847.775024684568733277.horikyota.ntt@gmail.com Backpatch: 11-, the bug was introducedin `fc49e24fa`	2022-02-25 10:32:38 -08:00
Andres Freund	c2551483e0	Fix temporary object cleanup failing due to toast access without snapshot. When cleaning up temporary objects during process exit the cleanup could fail with: FATAL: cannot fetch toast data without an active snapshot The bug is caused by RemoveTempRelationsCallback() not setting up a snapshot. If an object with toasted catalog data needs to be cleaned up, init_toast_snapshot() could fail with the above error. Most of the time however the the problem is masked due to cached catalog snapshots being returned by GetOldestSnapshot(). But dropping an object can cause catalog invalidations to be emitted. If no further catalog accesses are necessary between the invalidation processing and the next toast datum deletion, the bug becomes visible. It's easy to miss this bug because it typically happens after clients disconnect and the FATAL error just ends up in the log. Luckily temporary table cleanup at the next use of the same temporary schema or during DISCARD ALL does not have the same problem. Fix the bug by pushing a snapshot in RemoveTempRelationsCallback(). Also add isolation tests for temporary object cleanup, including objects with toasted catalog data. A future HEAD only commit will add more assertions. Reported-By: Miles Delahunty Author: Andres Freund Discussion: https://postgr.es/m/CAOFAq3BU5Mf2TTvu8D9n_ZOoFAeQswuzk7yziAb7xuw_qyw5gw@mail.gmail.com Backpatch: 10-	2022-02-21 08:59:30 -08:00
Andrew Dunstan	a5716baac0	Remove most msys special processing in TAP tests Following migration of Windows buildfarm members running TAP tests to use of ucrt64 perl for those tests, special processing for msys perl is no longer necessary and so is removed. Backpatch to release 10 Discussion: https://postgr.es/m/c65a8781-77ac-ea95-d185-6db291e1baeb@dunslane.net	2022-02-20 11:49:33 -05:00
Andrew Dunstan	c27aa21e0d	Remove PostgreSQL::Test::Utils::perl2host completely Commit `f1ac4a74de` disabled this processing, and as nothing has broken (as expected) here we proceed to remove the routine and adjust all the call sites. Backpatch to release 10 Discussion: https://postgr.es/m/0ba775a2-8aa0-0d56-d780-69427cf6f33d@dunslane.net Discussion: https://postgr.es/m/20220125023609.5ohu3nslxgoygihl@alap3.anarazel.de	2022-02-20 10:35:49 -05:00
Tom Lane	46b738ffc2	Suppress warning about stack_base_ptr with late-model GCC. GCC 12 complains that set_stack_base is storing the address of a local variable in a long-lived pointer. This is an entirely reasonable warning (indeed, it just helped us find a bug); but that behavior is intentional here. We can work around it by using __builtin_frame_address(0) instead of a specific local variable; that produces an address a dozen or so bytes different, in my testing, but we don't care about such a small difference. Maybe someday a compiler lacking that function will start to issue a similar warning, but we'll worry about that when it happens. Patch by me, per a suggestion from Andres Freund. Back-patch to v12, which is as far back as the patch will go without some pain. (Recently-established project policy would permit a back-patch as far as 9.2, but I'm disinclined to expend the work until GCC 12 is much more widespread.) Discussion: https://postgr.es/m/3773792.1645141467@sss.pgh.pa.us	2022-02-17 22:45:34 -05:00
Amit Kapila	caa231be97	WAL log unchanged toasted replica identity key attributes. Currently, during UPDATE, the unchanged replica identity key attributes are not logged separately because they are getting logged as part of the new tuple. But if they are stored externally then the untoasted values are not getting logged as part of the new tuple and logical replication won't be able to replicate such UPDATEs. So we need to log such attributes as part of the old_key_tuple during UPDATE. Reported-by: Haiying Tang Author: Dilip Kumar and Amit Kapila Reviewed-by: Alvaro Herrera, Haiying Tang, Andres Freund Backpatch-through: 10 Discussion: https://postgr.es/m/OS0PR01MB611342D0A92D4F4BF26C0F47FB229@OS0PR01MB6113.jpnprd01.prod.outlook.com	2022-02-14 08:24:44 +05:30
Alexander Korotkov	ac2303aa03	Fix memory leak in IndexScan node with reordering Fix ExecReScanIndexScan() to free the referenced tuples while emptying the priority queue. Backpatch to all supported versions. Discussion: https://postgr.es/m/CAHqSB9gECMENBQmpbv5rvmT3HTaORmMK3Ukg73DsX5H7EJV7jw%40mail.gmail.com Author: Aliaksandr Kalenik Reviewed-by: Tom Lane, Alexander Korotkov Backpatch-through: 10	2022-02-14 03:32:34 +03:00
Tom Lane	51ee561f56	Fix thinko in PQisBusy(). In commit `1f39a1c06` I made PQisBusy consider conn->write_failed, but that is now looking like complete brain fade. In the first place, the logic is quite wrong: it ought to be like "and not" rather than "or". This meant that once we'd gotten into a write_failed state, PQisBusy would always return true, probably causing the calling application to iterate its loop until PQconsumeInput returns a hard failure thanks to connection loss. That's not what we want: the intended behavior is to return an error PGresult, which the application probably has much cleaner support for. But in the second place, checking write_failed here seems like the wrong thing anyway. The idea of the write_failed mechanism is to postpone handling of a write failure until we've read all we can from the server; so that flag should not interfere with input-processing behavior. (Compare 7247e243a.) What we should check for is status = CONNECTION_BAD, ie, socket already closed. (Most places that close the socket don't touch asyncStatus, but they do reset status.) This primarily ensures that if PQisBusy() returns true then there is an open socket, which is assumed by several call sites in our own code, and probably other applications too. While at it, fix a nearby thinko in libpq's my_sock_write: we should only consult errno for res < 0, not res == 0. This is harmless since pqsecure_raw_write would force errno to zero in such a case, but it still could confuse readers. Noted by Andres Freund. Backpatch to v12 where `1f39a1c06` came in. Discussion: https://postgr.es/m/20220211011025.ek7exh6owpzjyudn@alap3.anarazel.de	2022-02-12 13:23:20 -05:00
Tom Lane	0778b24ced	Don't use_physical_tlist for an IOS with non-returnable columns. createplan.c tries to save a runtime projection step by specifying a scan plan node's output as being exactly the table's columns, or index's columns in the case of an index-only scan, if there is not a reason to do otherwise. This logic did not previously pay attention to whether an index's columns are returnable. That worked, sort of accidentally, until commit `9a3ddeb51` taught setrefs.c to reject plans that try to read a non-returnable column. I have no desire to loosen setrefs.c's new check, so instead adjust use_physical_tlist() to not try to optimize this way when there are non-returnable column(s). Per report from Ryan Kelly. Like the previous patch, back-patch to all supported branches. Discussion: https://postgr.es/m/CAHUie24ddN+pDNw7fkhNrjrwAX=fXXfGZZEHhRuofV_N_ftaSg@mail.gmail.com	2022-02-11 15:23:52 -05:00
Tom Lane	d0e1fd9587	Make pg_ctl stop/restart/promote recheck postmaster aliveness. "pg_ctl stop/restart" checked that the postmaster PID is valid just once, as a side-effect of sending the stop signal, and then would wait-till-timeout for the postmaster.pid file to go away. This neglects the case wherein the postmaster dies uncleanly after we signal it. Similarly, once "pg_ctl promote" has sent the signal, it'd wait for the corresponding on-disk state change to occur even if the postmaster dies. I'm not sure how we've managed not to notice this problem, but it seems to explain slow execution of the 017_shm.pl test script on AIX since commit `4fdbf9af5`, which added a speculative "pg_ctl stop" with the idea of making real sure that the postmaster isn't there. In the test steps that kill-9 and then restart the postmaster, it's possible to get past the initial signal attempt before kill() stops working for the doomed postmaster. If that happens, pg_ctl waited till PGCTLTIMEOUT before giving up ... and the buildfarm's AIX members have that set very high. To fix, include a "kill(pid, 0)" test (similar to what postmaster_is_alive uses) in these wait loops, so that we'll give up immediately if the postmaster PID disappears. While here, I chose to refactor those loops out of where they were. do_stop() and do_restart() can perfectly well share one copy of the wait-for-stop loop, and it seems desirable to put a similar function beside that for wait-for-promote. Back-patch to all supported versions, since pg_ctl's wait logic is substantially identical in all, and we're seeing the slow test behavior in all branches. Discussion: https://postgr.es/m/20220210023537.GA3222837@rfd.leadboat.com	2022-02-10 16:49:39 -05:00
Andrew Dunstan	eec7c640fe	Use gendef instead of pexports for building windows .def files Modern msys systems lack pexports but have gendef instead, so use that. Discussion: https://postgr.es/m/3ccde7a9-e4f9-e194-30e0-0936e6ad68ba@dunslane.net Backpatch to release 9.4 to enable building with perl on older branches. Before that pexports is not used for plperl.	2022-02-10 13:51:40 -05:00
Noah Misch	1b8462bf1d	Fix back-patch of "Avoid race in RelationBuildDesc() ..." The back-patch of commit `fdd965d074` broke CLOBBER_CACHE_ALWAYS for v9.6 through v13. It updated the InvalidateSystemCaches() call for CLOBBER_CACHE_RECURSIVELY, neglecting the one for CLOBBER_CACHE_ALWAYS. Back-patch to v13, v12, v11, and v10. Reviewed by Tomas Vondra. Reported by Tomas Vondra. Discussion: https://postgr.es/m/df7b4c0b-7d92-f03f-75c4-9e08b269a716@enterprisedb.com	2022-02-09 18:16:56 -08:00
Tom Lane	5ea3b99de0	Remove ppport.h's broken re-implementation of eval_pv(). Recent versions of Devel::PPPort try to redefine eval_pv() to dodge a bug in pre-5.31 Perl versions. Unfortunately the redefinition fails on compilers that don't support statements nested within expressions. However, we aren't actually interested in this bug fix, since we always call eval_pv() with croak_on_error = FALSE. So, until there's an upstream fix for this breakage, just comment out the macro to revert to the older behavior. Per report from Wei Sun, as well as previous buildfarm failure on pademelon (which I'd unfortunately not looked at carefully enough to understand the cause). Back-patch to all supported versions, since we're using the same ppport.h in all. Discussion: https://postgr.es/m/tencent_2EFCC8BA0107B6EC0F97179E019A8A43C806@qq.com Report: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=pademelon&dt=2022-02-02%2001%3A22%3A58	2022-02-08 19:26:17 -05:00
Peter Eisentraut	7f7148b72e	Translation updates Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: 9aa8bc576f9af5c61de4a6fc8119abfa36493d01	2022-02-07 13:36:22 +01:00
Tom Lane	9c7cf083f2	Test, don't just Assert, that mergejoin's inputs are in order. There are two Asserts in nodeMergejoin.c that are reachable if the input data is not in the expected order. This seems way too fragile. Alexander Lakhin reported a case where the assertions could be triggered with misconfigured foreign-table partitions, and bitter experience with unstable operating system collation definitions suggests another easy route to hitting them. Neither Assert is in a place where we can't afford one more test-and-branch, so replace 'em with plain test-and-elog logic. Per bug #17395. While the reported symptom is relatively recent, collation changes could happen anytime, so back-patch to all supported branches. Discussion: https://postgr.es/m/17395-8c326292078d1a57@postgresql.org	2022-02-05 11:59:30 -05:00
Tom Lane	ff86ee3d21	Revert "plperl: Fix breakage of `c89f409749` in back branches." This reverts commits `d81cac47a` et al. We shouldn't need that hack after the preceding commits. Discussion: https://postgr.es/m/20220131015130.shn6wr2fzuymerf6@alap3.anarazel.de	2022-01-31 15:07:39 -05:00
Tom Lane	bc89af248e	plperl: update ppport.h to Perl 5.34.0. Also apply the changes suggested by running perl ppport.h --compat-version=5.8.0 And remove some no-longer-required NEED_foo declarations. Dagfinn Ilmari Mannsåker Back-patch of commit `05798c9f7` into all supported branches. At the time we thought this update was mostly cosmetic, but the lack of it has caused trouble, while the patch itself hasn't. Discussion: https://postgr.es/m/87y278s6iq.fsf@wibble.ilmari.org Discussion: https://postgr.es/m/20220131015130.shn6wr2fzuymerf6@alap3.anarazel.de	2022-01-31 15:01:05 -05:00
Andres Freund	4afb5aeb9f	plperl: Fix breakage of `c89f409749` in back branches. ppport.h was only updated in `05798c9f7f` (master). Unfortunately my commit `c89f409749` uses PERL_VERSION_LT which came in with that update. Breaking most buildfarm animals. I should have noticed that... We might want to backpatch the ppport update instead, but for now lets get the buildfarm green again. Discussion: https://postgr.es/m/20220131015130.shn6wr2fzuymerf6@alap3.anarazel.de Backpatch: 10-14, master doesn't need it	2022-01-30 18:00:57 -08:00
Andres Freund	0dc0fe7b6c	plperl: windows: Use Perl_setlocale on 5.28+, fixing compile failure. For older versions we need our own copy of perl's setlocale(), because it was not exposed (why we need the setlocale in the first place is explained in plperl_init_interp) . The copy stopped working in 5.28, as some of the used macros are not public anymore. But Perl_setlocale is available in 5.28, so use that. Author: Victor Wagner <vitus@wagner.pp.ru> Reviewed-By: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Discussion: https://postgr.es/m/20200501134711.08750c5f@antares.wagner.home Backpatch: all versions	2022-01-30 16:42:45 -08:00
Tom Lane	5ad70564f4	Fix failure to validate the result of select_common_type(). Although select_common_type() has a failure-return convention, an apparent successful return just provides a type OID that might work as a common supertype; we've not validated that the required casts actually exist. In the mainstream use-cases that doesn't matter, because we'll proceed to invoke coerce_to_common_type() on each input, which will fail appropriately if the proposed common type doesn't actually work. However, a few callers didn't read the (nonexistent) fine print, and thought that if they got back a nonzero OID then the coercions were sure to work. This affects in particular the recently-added "anycompatible" polymorphic types; we might think that a function/operator using such types matches cases it really doesn't. A likely end result of that is unexpected "ambiguous operator" errors, as for example in bug #17387 from James Inform. Another, much older, case is that the parser might try to transform an "x IN (list)" construct to a ScalarArrayOpExpr even when the list elements don't actually have a common supertype. It doesn't seem desirable to add more checking to select_common_type itself, as that'd just slow down the mainstream use-cases. Instead, write a separate function verify_common_type that performs the missing checks, and add a call to that where necessary. Likewise add verify_common_type_from_oids to go with select_common_type_from_oids. Back-patch to v13 where the "anycompatible" types came in. (The symptom complained of in bug #17387 doesn't appear till v14, but that's just because we didn't get around to converting \|\| to use anycompatible till then.) In principle the "x IN (list)" fix could go back all the way, but I'm not currently convinced that it makes much difference in real-world cases, so I won't bother for now. Discussion: https://postgr.es/m/17387-5dfe54b988444963@postgresql.org	2022-01-29 11:41:12 -05:00
Tomas Vondra	e90f258aca	Fix ordering of XIDs in ProcArrayApplyRecoveryInfo Commit `8431e296ea` reworked ProcArrayApplyRecoveryInfo to sort XIDs before adding them to KnownAssignedXids. But the XIDs are sorted using xidComparator, which compares the XIDs simply as uint32 values, not logically. KnownAssignedXidsAdd() however expects XIDs in logical order, and calls TransactionIdFollowsOrEquals() to enforce that. If there are XIDs for which the two orderings disagree, an error is raised and the recovery fails/restarts. Hitting this issue is fairly easy - you just need two transactions, one started before the 4B limit (e.g. XID 4294967290), the other sometime after it (e.g. XID 1000). Logically (4294967290 <= 1000) but when compared using xidComparator we try to add them in the opposite order. Which makes KnownAssignedXidsAdd() fail with an error like this: ERROR: out-of-order XID insertion in KnownAssignedXids This only happens during replica startup, while processing RUNNING_XACTS records to build the snapshot. Once we reach STANDBY_SNAPSHOT_READY, we skip these records. So this does not affect already running replicas, but if you restart (or create) a replica while there are transactions with XIDs for which the two orderings disagree, you may hit this. Long-running transactions and frequent replica restarts increase the likelihood of hitting this issue. Once the replica gets into this state, it can't be started (even if the old transactions are terminated). Fixed by sorting the XIDs logically - this is fine because we're dealing with normal XIDs (because it's XIDs assigned to backends) and from the same wraparound epoch (otherwise the backends could not be running at the same time on the primary node). So there are no problems with the triangle inequality, which is why xidComparator compares raw values. Investigation and root cause analysis by Abhijit Menon-Sen. Patch by me. This issue is present in all releases since 9.4, however releases up to 9.6 are EOL already so backpatch to 10 only. Reviewed-by: Abhijit Menon-Sen Reviewed-by: Alvaro Herrera Backpatch-through: 10 Discussion: https://postgr.es/m/36b8a501-5d73-277c-4972-f58a4dce088a%40enterprisedb.com	2022-01-27 20:16:39 +01:00
Noah Misch	785a28e422	On sparc64+ext4, suppress test failures from known WAL read failure. Buildfarm members kittiwake, tadarida and snapper began to fail frequently when commits `3cd9c3b921` and `f47ed79cc8` added tests of concurrency, but the problem was reachable before those commits. Back-patch to v10 (all supported versions). Discussion: https://postgr.es/m/20220116210241.GC756210@rfd.leadboat.com	2022-01-26 18:06:23 -08:00
Magnus Hagander	81596645ca	Fix pg_hba_file_rules for authentication method cert For authentication method cert, clientcert=verify-full is implied. But the pg_hba_file_rules entry would incorrectly show clientcert=verify-ca. Per bug #17354 Reported-By: Feike Steenbergen Reviewed-By: Jonathan Katz Backpatch-through: 12	2022-01-26 09:59:19 +01:00
Tom Lane	213c5aa3bd	Revert "graceful shutdown" changes for Windows, in back branches only. This reverts commits `6051857fc` and `ed52c3707`, but only in the back branches. Further testing has shown that while those changes do fix some things, they also break others; in particular, it looks like walreceivers fail to detect walsender-initiated connection close reliably if the walsender shuts down this way. We'll keep trying to improve matters in HEAD, but it now seems unwise to push these changes into stable releases. Discussion: https://postgr.es/m/CA+hUKG+OeoETZQ=Qw5Ub5h3tmwQhBmDA=nuNO3KG=zWfUypFAw@mail.gmail.com	2022-01-25 12:17:40 -05:00
David Rowley	f8807e7742	Consider parallel awareness when removing single-child Appends `8edd0e794` added some code to remove Append and MergeAppend nodes when they contained a single child node. As it turned out, this was unsafe to do when the Append/MergeAppend was parallel_aware and the child node was not. Removing the Append/MergeAppend, in this case, could lead to the child plan being called multiple times by parallel workers when it was unsafe to do so. Here we fix this by just not removing the Append/MergeAppend when the parallel_aware flag of the parent and child node don't match. Reported-by: Yura Sokolov Bug: #17335 Discussion: https://postgr.es/m/b59605fecb20ba9ea94e70ab60098c237c870628.camel%40postgrespro.ru Backpatch-through: 12, where `8edd0e794` was first introduced	2022-01-25 21:15:00 +13:00
Tom Lane	d67354d870	Fix limitations on what SQL commands can be issued to a walsender. In logical replication mode, a WalSender is supposed to be able to execute any regular SQL command, as well as the special replication commands. Poor design of the replication-command parser caused it to fail in various cases, notably: * semicolons embedded in a command, or multiple SQL commands sent in a single message; * dollar-quoted literals containing odd numbers of single or double quote marks; * commands starting with a comment. The basic problem here is that we're trying to run repl_scanner.l across the entire input string even when it's not a replication command. Since repl_scanner.l does not understand all of the token types known to the core lexer, this is doomed to have failure modes. We certainly don't want to make repl_scanner.l as big as scan.l, so instead rejigger stuff so that we only lex the first token of a non-replication command. That will usually look like an IDENT to repl_scanner.l, though a comment would end up getting reported as a '-' or '/' single-character token. If the token is a replication command keyword, we push it back and proceed normally with repl_gram.y parsing. Otherwise, we can drop out of exec_replication_command() without examining the rest of the string. (It's still theoretically possible for repl_scanner.l to fail on the first token; but that could only happen if it's an unterminated single- or double-quoted string, in which case you'd have gotten largely the same error from the core lexer too.) In this way, repl_gram.y isn't involved at all in handling general SQL commands, so we can get rid of the SQLCmd node type. (In the back branches, we can't remove it because renumbering enum NodeTag would be an ABI break; so just leave it sit there unused.) I failed to resist the temptation to clean up some other sloppy coding in repl_scanner.l while at it. The only externally-visible behavior change from that is it now accepts \r and \f as whitespace, same as the core lexer. Per bug #17379 from Greg Rychlewski. Back-patch to all supported branches. Discussion: https://postgr.es/m/17379-6a5c6cfb3f1f5e77@postgresql.org	2022-01-24 15:33:34 -05:00
Tom Lane	c94c6612da	Remember to reset yy_start state when firing up repl_scanner.l. Without this, we get odd behavior when the previous cycle of lexing exited in a non-default exclusive state. Every other copy of this code is aware that it has to do BEGIN(INITIAL), but repl_scanner.l did not get that memo. The real-world impact of this is probably limited, since most replication clients would abandon their connection after getting a syntax error. Still, it's a bug. This mistake is old, so back-patch to all supported branches. Discussion: https://postgr.es/m/1874781.1643035952@sss.pgh.pa.us	2022-01-24 12:09:46 -05:00
Tom Lane	0cbc507378	Suppress variable-set-but-not-used warning from clang 13. In the normal configuration where GEQO_DEBUG isn't defined, recent clang versions have started to complain that geqo_main.c accumulates the edge_failures count but never does anything with it. As a minimal back-patchable fix, insert a void cast to silence this warning. (I'd speculated about ripping out the GEQO_DEBUG logic altogether, but I don't think we'd wish to back-patch that.) Per recently-established project policy, this is a candidate for back-patching into out-of-support branches: it suppresses an annoying compiler warning but changes no behavior. Hence, back-patch all the way to 9.2. Discussion: https://postgr.es/m/CA+hUKGLTSZQwES8VNPmWO9AO0wSeLt36OCPDAZTccT1h7Q7kTQ@mail.gmail.com	2022-01-23 11:09:25 -05:00
Tomas Vondra	4d8c4d2061	Correct type of front_pathkey to PathKey In sort_inner_and_outer we iterate a list of PathKey elements, but the variable is declared as (List *). This mistake is benign, because we only pass the pointer to lcons() and never dereference it. This exists since ~2004, but it's confusing. So fix and backpatch to all supported branches. Backpatch-through: 10 Discussion: https://postgr.es/m/bf3a6ea1-a7d8-7211-0669-189d5c169374%40enterprisedb.com	2022-01-23 04:02:22 +01:00
Tomas Vondra	267ccc38ba	Check syscache result in AlterStatistics The syscache lookup may return NULL even for valid OID, for example due to a concurrent DROP STATISTICS, so a HeapTupleIsValid is necessary. Without it, it may fail with a segfault. Reported by Alexander Lakhin, patch by me. Backpatch to 13, where ALTER STATISTICS ... SET STATISTICS was introduced. Backpatch-through: 13 Discussion: https://postgr.es/m/17372-bf3b6e947e35ae77%40postgresql.org	2022-01-23 03:20:32 +01:00
Tom Lane	31b7b4d26e	Flush table's relcache during ALTER TABLE ADD PRIMARY KEY USING INDEX. Previously, unless we had to add a NOT NULL constraint to the column, this command resulted in updating only the index's relcache entry. That's problematic when replication behavior is being driven off the existence of a primary key: other sessions (and ours too for that matter) failed to recalculate their opinion of whether the table can be replicated. Add a relcache invalidation to fix it. This has been broken since pg_class.relhaspkey was removed in v11. Before that, updating the table's relhaspkey value sufficed to cause a cache flush. Hence, backpatch to v11. Report and patch by Hou Zhijie Discussion: https://postgr.es/m/OS0PR01MB5716EBE01F112C62F8F9B786947B9@OS0PR01MB5716.jpnprd01.prod.outlook.com	2022-01-22 13:32:40 -05:00
Tom Lane	64ebb43df0	Fix race condition in gettext() initialization in libpq and ecpglib. In libpq and ecpglib, multiple threads can concurrently enter the initialization logic for message localization. Since we set the its-done flag before actually doing the work, it'd be possible for some threads to reach gettext() before anyone has called bindtextdomain(). Barring bugs in libintl itself, this would not result in anything worse than failure to localize some early messages. Nonetheless, it's a bug, and an easy one to fix. Noted while investigating bug #17299 from Clemens Zeidler (much thanks to Liam Bowen for followup investigation on that). It currently appears that that actually is a bug in libintl itself, but that doesn't let us off the hook for this bit. Back-patch to all supported versions. Discussion: https://postgr.es/m/17299-7270741958c0b1ab@postgresql.org Discussion: https://postgr.es/m/CAE7q7Eit4Eq2=bxce=Fm8HAStECjaXUE=WBQc-sDDcgJQ7s7eg@mail.gmail.com	2022-01-21 15:36:28 -05:00
Andres Freund	fd48e5f5d3	fsync pg_logical/mappings in CheckPointLogicalRewriteHeap(). While individual logical rewrite files were synced to disk, the directory was not. On some filesystems that could lead to loosing directory entries after a crash. Reported-By: Tom Lane <tgl@sss.pgh.pa.us> Author: Nathan Bossart <bossartn@amazon.com> Discussion: https://postgr.es/m/867F2E29-2782-4869-970E-B984C6D35A8F@amazon.com Backpatch: 10-	2022-01-21 11:24:12 -08:00
Michael Paquier	b5f634116e	Fix one-off bug causing missing commit timestamps for subtransactions The logic in charge of writing commit timestamps (enabled with track_commit_timestamp) for subtransactions had a one-bug bug, where it would be possible that commit timestamps go missing for the last subtransaction committed. While on it, simplify a bit the iteration logic in the loop writing the commit timestamps, as per suggestions from Kyotaro Horiguchi and Tom Lane, so as some variable initializations are not part of the loop itself. Issue introduced in `73c986a`. Analyzed-by: Alex Kingsborough Author: Alex Kingsborough, Kyotaro Horiguchi Discussion: https://postgr.es/m/73A66172-4050-4F2A-B7F1-13508EDA2144@amazon.com Backpatch-through: 10	2022-01-21 14:54:51 +09:00
Tom Lane	4828a80069	Tighten TAP tests' tracking of postmaster state some more. Commits `6c4a8903b` et al. had a couple of deficiencies: * The logic I added to Cluster::start to see if a PID file is present could be fooled by a stale PID file left over from a previous postmaster. To fix, if we're not sure whether we expect to find a running postmaster or not, validate the PID using "kill 0". * 017_shm.pl has a loop in which it just issues repeated Cluster::start calls; this will fail if some invocation fails but leaves self->_pid set. Per buildfarm results, the above fix is not enough to make this safe: we might have "validated" a PID for a postmaster that exits immediately after we look. Hence, match each failed start call with a stop call that will get us back to the self->_pid == undef state. Add a fail_ok option to Cluster::stop to make this work. Discussion: https://postgr.es/m/CA+hUKGKV6fOHvfiPt8=dOKzvswjAyLoFoJF1iQXMNpi7+hD1JQ@mail.gmail.com	2022-01-20 17:28:07 -05:00
Andrew Dunstan	31680730ef	Allow clean.bat to be run from anywhere This was omitted from `c3879a7b4c` which modified the other msvc .bat files. Per request from Juan José Santamaría Flecha Discussion: https://postgr.es/m/CAC+AXB0_fxYGbQoaYjCA8um7TTbOVP4L9aXnVmHwK8WzaT4gdA@mail.gmail.com Backpatch to all live branches.	2022-01-20 10:20:51 -05:00
Thomas Munro	8e8e253a7b	Try to stabilize reloptions test, again. Since the test requires reproducible behavior from VACUUM, and since DISABLE_PAGE_SKIPPING doesn't actually disable all forms of page skipping, let's use a temporary table to avoid contention. Back-patch to 12, like commit `3414099c`. Discussion: https://postgr.es/m/20220120052404.sonrhq3f3qgplpzj%40alap3.anarazel.de	2022-01-20 23:31:01 +13:00
Tom Lane	6eec809fbc	TAP tests: check for postmaster.pid anyway when "pg_ctl start" fails. "pg_ctl start" might start a new postmaster and then return failure anyway, for example if PGCTLTIMEOUT is exceeded. If there is a postmaster there, it's still incumbent on us to shut it down at script end, so check for the PID file even though we are about to fail. This has been broken all along, so back-patch to all supported branches. Discussion: https://postgr.es/m/647439.1642622744@sss.pgh.pa.us	2022-01-19 16:29:09 -05:00
Thomas Munro	639a9293d4	Try to stabilize the reloptions test. Where we test vacuum_truncate's effects, sometimes this is failing to truncate as expected on the build farm. That could be explained by page skipping, so disable it explicitly, with the theory that commit `fe246d1c` didn't go far enough. Back-patch to 12, where the vacuum_truncate tests were added. Discussion: https://postgr.es/m/CA%2BhUKGLT2UL5_JhmBzUgkdyKfc%3D5J-gJSQJLysMs4rqLUKLAzw%40mail.gmail.com	2022-01-19 07:34:38 +13:00
Tom Lane	90e0f9fd8c	Fix psql \d's query for identifying parent triggers. The original coding (from `c33869cc3`) failed with "more than one row returned by a subquery used as an expression" if there were unrelated triggers of the same tgname on parent partitioned tables. (That's possible because statement-level triggers don't get inherited.) Fix by applying LIMIT 1 after sorting the candidates by inheritance level. Also, wrap the subquery in a CASE so that we don't have to execute it at all when the trigger is visibly non-inherited. Aside from saving some cycles, this avoids the need for a confusing and undocumented NULLIF(). While here, tweak the format of the emitted query to look a bit nicer for "psql -E", and add some explanation of this subquery, because it badly needs it. Report and patch by Justin Pryzby (with some editing by me). Back-patch to v13 where the faulty code came in. Discussion: https://postgr.es/m/20211217154356.GJ17618@telsasoft.com	2022-01-17 21:18:49 -05:00
Tom Lane	d18ec312f9	Avoid calling gettext() in signal handlers. It seems highly unlikely that gettext() can be relied on to be async-signal-safe. psql used to understand that, but someone got it wrong long ago in the src/bin/scripts/ version of handle_sigint, and then the bad idea was perpetuated when those two versions were unified into src/fe_utils/cancel.c. I'm unsure why there have not been field complaints about this ... maybe gettext() is signal-safe once it's translated at least one message? But we have no business assuming any such thing. In cancel.c (v13 and up), I preserved our ability to localize "Cancel request sent" messages by invoking gettext() before the signal handler is set up. In earlier branches I just made src/bin/scripts/ not localize those messages, as psql did then. (Just for extra unsafety, the src/bin/scripts/ version was invoking fprintf() from a signal handler. Sigh.) Noted while fixing signal-safety issues in PQcancel() itself. Back-patch to all supported branches. Discussion: https://postgr.es/m/2937814.1641960929@sss.pgh.pa.us	2022-01-17 13:30:04 -05:00
Tom Lane	f27af7b880	Avoid calling strerror[_r] in PQcancel(). PQcancel() is supposed to be safe to call from a signal handler, and indeed psql uses it that way. All of the library functions it uses are specified to be async-signal-safe by POSIX ... except for strerror. Neither plain strerror nor strerror_r are considered safe. When this code was written, back in the dark ages, we probably figured "oh, strerror will just index into a constant array of strings" ... but in any locale except C, that's unlikely to be true. Probably the reason we've not heard complaints is that (a) this error-handling code is unlikely to be reached in normal use, and (b) in many scenarios, localized error strings would already have been loaded, after which maybe it's safe to call strerror here. Still, this is clearly unacceptable. The best we can do without relying on strerror is to print the decimal value of errno, so make it do that instead. (This is probably not much loss of user-friendliness, given that it is hard to get a failure here.) Back-patch to all supported branches. Discussion: https://postgr.es/m/2937814.1641960929@sss.pgh.pa.us	2022-01-17 12:52:44 -05:00
Tom Lane	90a847e6dc	Fix psql's tab-completion of enum label values. Since enum labels have to be single-quoted, this part of the tab completion machinery got side-swiped by commit `cd69ec66c`. A side-effect of that commit is that (at least with some versions of Readline) the text string passed for completion will omit the leading quote mark of the enum label literal. Libedit still acts the same as before, though, so adapt COMPLETE_WITH_ENUM_VALUE so that it can cope with either convention. Also, when we fail to find any valid completion, set rl_completion_suppress_quote = 1. Otherwise readline will go ahead and append a closing quote, which is unwanted. Per report from Peter Eisentraut. Back-patch to v13 where `cd69ec66c` came in. Discussion: https://postgr.es/m/8ca82d89-ec3d-8b28-8291-500efaf23b25@enterprisedb.com	2022-01-16 14:59:20 -05:00
Tomas Vondra	d6817032d2	Build inherited extended stats on partitioned tables Commit `859b3003de` disabled building of extended stats for inheritance trees, to prevent updating the same catalog row twice. While that resolved the issue, it also means there are no extended stats for declaratively partitioned tables, because there are no data in the non-leaf relations. That also means declaratively partitioned tables were not affected by the issue `859b3003de` addressed, which means this is a regression affecting queries that calculate estimates for the whole inheritance tree as a whole (which includes e.g. GROUP BY queries). But because partitioned tables are empty, we can invert the condition and build statistics only for the case with inheritance, without losing anything. And we can consider them when calculating estimates. It may be necessary to run ANALYZE on partitioned tables, to collect proper statistics. For declarative partitioning there should no prior statistics, and it might take time before autoanalyze is triggered. For tables partitioned by inheritance the statistics may include data from child relations (if built `859b3003de`), contradicting the current code. Report and patch by Justin Pryzby, minor fixes and cleanup by me. Backpatch all the way back to PostgreSQL 10, where extended statistics were introduced (same as `859b3003de`). Author: Justin Pryzby Reported-by: Justin Pryzby Backpatch-through: 10 Discussion: https://postgr.es/m/20210923212624.GI831%40telsasoft.com	2022-01-15 19:14:00 +01:00
Tomas Vondra	acfde7c583	Ignore extended statistics for inheritance trees Since commit `859b3003de` we only build extended statistics for individual relations, ignoring the child relations. This resolved the issue with updating catalog tuple twice, but we still tried to use the statistics when calculating estimates for the whole inheritance tree. When the relations contain very distinct data, it may produce bogus estimates. This is roughly the same issue `427c6b5b9` addressed ~15 years ago, and we fix it the same way - by ignoring extended statistics when calculating estimates for the inheritance tree as a whole. We still consider extended statistics when calculating estimates for individual child relations, of course. This may result in plan changes due to different estimates, but if the old statistics were not describing the inheritance tree particularly well it's quite likely the new plans is actually better. Report and patch by Justin Pryzby, minor fixes and cleanup by me. Backpatch all the way back to PostgreSQL 10, where extended statistics were introduced (same as `859b3003de`). Author: Justin Pryzby Reported-by: Justin Pryzby Backpatch-through: 10 Discussion: https://postgr.es/m/20210923212624.GI831%40telsasoft.com	2022-01-15 02:30:06 +01:00
Tom Lane	ca14c4184b	Fix ruleutils.c's dumping of whole-row Vars in more contexts. Commit `7745bc352` intended to ensure that whole-row Vars would be printed with "::type" decoration in all contexts where plain "var.*" notation would result in star-expansion, notably in ROW() and VALUES() constructs. However, it missed the case of INSERT with a single-row VALUES, as reported by Timur Khanjanov. Nosing around ruleutils.c, I found a second oversight: the code for RowCompareExpr generates ROW() notation without benefit of an actual RowExpr, and naturally it wasn't in sync :-(. (The code for FieldStore also does this, but we don't expect that to generate strictly parsable SQL anyway, so I left it alone.) Back-patch to all supported branches. Discussion: https://postgr.es/m/efaba6f9-4190-56be-8ff2-7a1674f9194f@intrans.baku.az	2022-01-13 17:49:26 -05:00
Andrew Dunstan	32cd4264cc	Avoid warning about uninitialized value in MSVC python3 tests Juan José Santamaría Flecha Backpatch to all live branches	2022-01-10 10:12:31 -05:00
Andrew Dunstan	f3ded9c460	Allow MSVC .bat wrappers to be called from anywhere Instead of using a hardcoded or default path to the perl file the .bat file is a wrapper for, we use a path that means the file is found in the same directory as the .bat file. Patch by Anton Voloshin, slightly tweaked by me. Backpatch to all live branches Discussion: https://postgr.es/m/2b7a674b-5fb0-d264-75ef-ecc7a31e54f8@postgrespro.ru	2022-01-07 16:14:16 -05:00
Tom Lane	86d4bbb56a	Prevent altering partitioned table's rowtype, if it's used elsewhere. We disallow altering a column datatype within a regular table, if the table's rowtype is used as a column type elsewhere, because we lack code to go around and rewrite the other tables. This restriction should apply to partitioned tables as well, but it was not checked because ATRewriteTables and ATPrepAlterColumnType were not on the same page about who should do it for which relkinds. Per bug #17351 from Alexander Lakhin. Back-patch to all supported branches. Discussion: https://postgr.es/m/17351-6db1870f3f4f612a@postgresql.org	2022-01-06 16:46:46 -05:00
Michael Paquier	3f8062bcf7	Reduce relcache access in WAL sender streaming logical changes get_rel_sync_entry(), which is called each time a change needs to be logically replicated, is a rather hot code path in the WAL sender sending logical changes. This code path was doing a relcache access on relkind and relpartition for each logical change, but we only need to know this information when building or re-building the cached information for a relation. Some measurements prove that this is noticeable in perf profiles, particularly when attempting to replicate changes from relations that are not published as these cause less overhead in the WAL sender, delaying further the replication of changes for relations that are published. Issue introduced in `83fd453`. Author: Hou Zhijie Reviewed-by: Kyotaro Horiguchi, Euler Taveira Discussion: https://postgr.es/m/OS0PR01MB5716E863AA9E591C1F010F7A947D9@OS0PR01MB5716.jpnprd01.prod.outlook.com Backpatch-through: 13	2022-01-05 10:27:53 +09:00
Alvaro Herrera	33fdd9f854	Fix silly mistake in Assert	2022-01-04 13:21:23 -03:00
Alvaro Herrera	29f9fb8fe8	Allow special SKIP LOCKED condition in Assert() Under concurrency, it is possible for two sessions to be merrily locking and releasing a tuple and marking it again as HEAP_XMAX_INVALID all the while a third session attempts to lock it, miserably fails at it, and then contemplates life, the universe and everything only to eventually fail an assertion that said bit is not set. Before SKIP LOCKED that was indeed a reasonable expectation, but alas! commit `df630b0dd5` falsified it. This bug is as old as time itself, and even older, if you think time begins with the oldest supported branch. Therefore, backpatch to all supported branches. Author: Simon Riggs <simon.riggs@enterprisedb.com> Discussion: https://postgr.es/m/CANbhV-FeEwMnN8yuMyss7if1ZKjOKfjcgqB26n8pqu1e=q0ebg@mail.gmail.com	2022-01-04 13:01:05 -03:00
Tom Lane	20d08b2c61	Fix index-only scan plans, take 2. Commit `4ace45677` failed to fix the problem fully, because the same issue of attempting to fetch a non-returnable index column can occur when rechecking the indexqual after using a lossy index operator. Moreover, it broke EXPLAIN for such indexquals (which indicates a gap in our test cases :-(). Revert the code changes of `4ace45677` in favor of adding a new field to struct IndexOnlyScan, containing a version of the indexqual that can be executed against the index-returned tuple without using any non-returnable columns. (The restrictions imposed by check_index_only guarantee this is possible, although we may have to recompute indexed expressions.) Support construction of that during setrefs.c processing by marking IndexOnlyScan.indextlist entries as resjunk if they can't be returned, rather than removing them entirely. (We could alternatively require setrefs.c to look up the IndexOptInfo again, but abusing resjunk this way seems like a reasonably safe way to avoid needing to do that.) This solution isn't great from an API-stability standpoint: if there are any extensions out there that build IndexOnlyScan structs directly, they'll be broken in the next minor releases. However, only a very invasive extension would be likely to do such a thing. There's no change in the Path representation, so typical planner extensions shouldn't have a problem. As before, back-patch to all supported branches. Discussion: https://postgr.es/m/3179992.1641150853@sss.pgh.pa.us Discussion: https://postgr.es/m/17350-b5bdcf476e5badbb@postgresql.org	2022-01-03 15:42:27 -05:00
Tom Lane	45ae427141	Fix index-only scan plans when not all index columns can be returned. If an index has both returnable and non-returnable columns, and one of the non-returnable columns is an expression using a Var that is in a returnable column, then a query returning that expression could result in an index-only scan plan that attempts to read the non-returnable column, instead of recomputing the expression from the returnable column as intended. To fix, redefine the "indextlist" list of an IndexOnlyScan plan node as containing null Consts in place of any non-returnable columns. This solves the problem by preventing setrefs.c from falsely matching to such entries. The executor is happy since it only cares about the exposed types of the entries, and ruleutils.c doesn't care because a correct plan won't reference those entries. I considered some other ways to prevent setrefs.c from doing the wrong thing, but this way seems good since (a) it allows a very localized fix, (b) it makes the indextlist structure more compact in many cases, and (c) the indextlist is now a more faithful representation of what the index AM will actually produce, viz. nulls for any non-returnable columns. This is easier to hit since we introduced included columns, but it's possible to construct failing examples without that, as per the added regression test. Hence, back-patch to all supported branches. Per bug #17350 from Louis Jachiet. Discussion: https://postgr.es/m/17350-b5bdcf476e5badbb@postgresql.org	2022-01-01 16:12:03 -05:00
Thomas Munro	cadd98cd30	Fix overly generic name in with.sql test. Avoid the name "test". In the 10 branch, this could clash with alter_table.sql, as seen in the build farm. That other instance was already renamed in later branches by commit `2cf8c7aa`, but it's good to future-proof the name here too. Back-patch to 10. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CA%2BhUKGJf4RAXUyAYVUcQawcptX%3DnhEco3SYpuPK5cCbA-F1eLA%40mail.gmail.com	2021-12-30 17:11:20 +13:00
Michael Paquier	28e1e5c2a9	Correct comment and some documentation about REPLICA_IDENTITY_INDEX catalog/pg_class.h was stating that REPLICA_IDENTITY_INDEX with a dropped index is equivalent to REPLICA_IDENTITY_DEFAULT. The code tells a different story, as it is equivalent to REPLICA_IDENTITY_NOTHING. The behavior exists since the introduction of replica identities, and `fe7fd4e` even added tests for this case but I somewhat forgot to fix this comment. While on it, this commit reorganizes the documentation about replica identities on the ALTER TABLE page, and a note is added about the case of dropped indexes with REPLICA_IDENTITY_INDEX. Author: Michael Paquier, Wei Wang Reviewed-by: Euler Taveira Discussion: https://postgr.es/m/OS3PR01MB6275464AD0A681A0793F56879E759@OS3PR01MB6275.jpnprd01.prod.outlook.com Backpatch-through: 10	2021-12-22 16:38:42 +09:00
Tom Lane	da0d8a4545	Ensure casting to typmod -1 generates a RelabelType. Fix the code changed by commit `5c056b0c2` so that we always generate RelabelType, not something else, for a cast to unspecified typmod. Otherwise planner optimizations might not happen. It appears we missed this point because the previous experiments were done on type numeric: the parser undesirably generates a call on the numeric() length-coercion function, but then numeric_support() optimizes that down to a RelabelType, so that everything seems fine. It misbehaves for types that have a non-optimized length coercion function, such as bpchar. Per report from John Naylor. Back-patch to all supported branches, as the previous patch eventually was. Unfortunately, that no longer includes 9.6 ... we really shouldn't put this type of change into a nearly-EOL branch. Discussion: https://postgr.es/m/CAFBsxsEfbFHEkouc+FSj+3K1sHipLPbEC67L0SAe-9-da8QtYg@mail.gmail.com	2021-12-16 15:36:02 -05:00
Michael Paquier	aadbf825b7	Adjust behavior of some env settings for the TAP tests of MSVC `edc2332` has introduced in vcregress.pl some control on the environment variables LZ4, TAR and GZIP_PROGRAM to allow any TAP tests to be able use those commands. This makes the settings more consistent with src/Makefile.global.in, as the same default gets used for Make and MSVC builds. Each parameter can be changed in buildenv.pl, but as a default gets assigned after loading buldenv.pl, it is not possible to unset any of these, and using an empty value would not work with "\|\|=" either. As some environments may not have a compatible command in their PATH (tar coming from MinGW is an issue, for one), this could break tests without an exit path to bypass any failing test. This commit changes things so as the default values for LZ4, TAR and GZIP_PROGRAM are assigned before loading buildenv.pl, not after. This way, we keep the same amount of compatibility as a GNU build with the same defaults, and it becomes possible to unset any of those values. While on it, this adds some documentation about those three variables in the section dedicated to the TAP tests for MSVC. Per discussion with Andrew Dunstan. Discussion: https://postgr.es/m/YbGYe483803il3X7@paquier.xyz Backpatch-through: 10	2021-12-15 10:40:12 +09:00
Tom Lane	0a5682041d	Fix datatype confusion in logtape.c's right_offset(). This could only matter if (a) long is wider than int, and (b) the heap of free blocks exceeds UINT_MAX entries, which seems pretty unlikely. Still, it's a theoretical bug, so backpatch to v13 where the typo came in (in commit `c02fdc922`). In passing, also make swap_nodes() use consistent datatypes. Ma Liangzhu Discussion: https://postgr.es/m/17336-fc4e522d26a750fd@postgresql.org	2021-12-14 11:46:48 -05:00
Michael Paquier	3f710fc2b4	Remove assertion for replication origins in PREPARE TRANSACTION When using replication origins, pg_replication_origin_xact_setup() is an optional choice to be able to set a LSN and a timestamp to mark the origin, which would be additionally added to WAL for transaction commits or aborts (including 2PC transactions). An assertion in the code path of PREPARE TRANSACTION assumed that this data should always be set, so it would trigger when using replication origins without setting up an origin LSN. Some tests are added to cover more this kind of scenario. Oversight in commit `1eb6d65`. Per discussion with Amit Kapila and Masahiko Sawada. Discussion: https://postgr.es/m/YbbBfNSvMm5nIINV@paquier.xyz Backpatch-through: 11	2021-12-14 10:58:29 +09:00
Andres Freund	65f860e78a	isolationtester: append session name to application_name. When writing / debugging an isolation test it sometimes is useful to see which session holds what lock etc. To make it easier, both as part of spec files and interactively, append the session name to application_name. Since `b1907d688` application_name already contains the test name, this appends the session's name to that. insert-conflict-specconflict did something like this manually, which can now be removed. As we have done lately with other test infrastructure improvements, backpatch this change, to make it easier to backpatch tests. Author: Andres Freund <andres@anarazel.de> Reviewed-By: Michael Paquier <michael@paquier.xyz> Reviewed-By: Andrew Dunstan <andrew@dunslane.net> Discussion: https://postgr.es/m/20211211012052.2blmzcmxnxqawd2z@alap3.anarazel.de Backpatch: 10-, to make backpatching of tests easier.	2021-12-13 12:02:45 -08:00
Amit Kapila	3f06c00cf6	Fix double publish of child table's data. We publish the child table's data twice for a publication that has both child and parent tables and is published with publish_via_partition_root as true. This happens because subscribers will initiate synchronization using both parent and child tables, since it gets both as separate tables in the initial table list. Ensure that pg_publication_tables returns only parent tables in such cases. Author: Hou Zhijie Reviewed-by: Greg Nancarrow, Amit Langote, Vignesh C, Amit Kapila Backpatch-through: 13 Discussion: https://postgr.es/m/OS0PR01MB57167F45D481F78CDC5986F794B99@OS0PR01MB5716.jpnprd01.prod.outlook.com	2021-12-09 09:00:35 +05:30
Michael Paquier	9acea52ea3	Fix corruption of toast indexes with REINDEX CONCURRENTLY REINDEX CONCURRENTLY run on a toast index or a toast relation could corrupt the target indexes rebuilt, as a backend running in parallel that manipulates toast values would directly release the lock on the toast relation when its local operation is done, rather than releasing the lock once the transaction that manipulated the toast values committed. The fix done here is simple: we now hold a ROW EXCLUSIVE lock on the toast relation when saving or deleting a toast value until the transaction working on them is committed, so as a concurrent reindex happening in parallel would be able to wait for any activity and see any new rows inserted (or deleted). An isolation test is added to check after the case fixed here, which is a bit fancy by design as it relies on allow_system_table_mods to rename the toast table and its index to fixed names. This way, it is possible to reindex them directly without any dependency on the OID of the underlying relation. Note that this could not use a DO block either, as REINDEX CONCURRENTLY cannot be run in a transaction block. The test is backpatched down to 13, where it is possible, thanks to `c4a7a39`, to use allow_system_table_mods in a test suite. Reported-by: Alexey Ermakov Analyzed-by: Andres Freund, Noah Misch Author: Michael Paquier Reviewed-by: Nathan Bossart Discussion: https://postgr.es/m/17268-d2fb426e0895abd4@postgresql.org Backpatch-through: 12	2021-12-08 11:01:19 +09:00
Andrew Dunstan	14c54e40f5	Enable settings used in TAP tests for MSVC builds Certain settings from configuration or the Makefile infrastructure are used by the TAP tests, but were not being set up by vcregress.pl. This remedies those omissions. This should increase test coverage, especially on the buildfarm. Reviewed by Noah Misch Discussion: https://postgr.es/m/17093da5-e40d-8335-d53a-2bd803fc38b0@dunslane.net Backpatch to all live branches.	2021-12-07 15:05:33 -05:00
Tom Lane	a8a983e829	On Windows, also call shutdown() while closing the client socket. Further experimentation shows that commit `6051857fc` is not sufficient when using (some versions of?) OpenSSL. The reason is obscure, but calling shutdown(socket, SD_SEND) improves matters. Per testing by Andrew Dunstan and Alexander Lakhin. Back-patch as before. Discussion: https://postgr.es/m/af5e0bf3-6a61-bb97-6cba-061ddf22ff6b@dunslane.net	2021-12-07 13:34:19 -05:00

... 2 3 4 5 6 ...

36668 Commits