postgresql

mirror of https://git.postgresql.org/git/postgresql.git synced 2024-09-28 02:21:50 +02:00

Author	SHA1	Message	Date
Magnus Hagander	3f16cb505d	Fix out of date comment	2020-11-10 13:15:44 +01:00
Magnus Hagander	d2e4bf688e	Remove -o option to postmaster This option was declared obsolete many years ago. Reviewed-By: Tom Lane Discussion: https://postgr.es/m/CABUevEyOE=9CQwZm2j=vwP5+6OLCSoxn9pBjK8gyRdkTzMfqtQ@mail.gmail.com	2020-11-10 13:15:01 +01:00
Andres Freund	6c57f2ed16	jit: Add support for LLVM 12. LLVM 12, to be released in a few months, made some breaking changes to the Orc JIT interface. OrcV2 eventually will make it easier to support features like concurrent JIT compilation, but this commit only allows to compile against LLVM 12. This commit is a bit bigger than desirable. That partially is because the V2 interface is more granular than V1 interface, but also because I chose to make some minor changes to < LLVM 12 code to keep the code somewhat readable. The LLVM 12 support will need to be backpatched. I plan to do so after the patch stewed on the buildfarm for a few days. Author: Andres Freund Discussion: https://postgr.es/m/20201016011244.pmyvr3ee2gbzplq4@alap3.anarazel.de	2020-11-09 20:01:33 -08:00
Peter Geoghegan	180cf876d4	Remove ineffective heapam CHECK_FOR_INTERRUPTS(). Remove a CHECK_FOR_INTERRUPTS() call that could never actually handle an interrupt. We always have a heap page buffer lock at this point. Having a useless CHECK_FOR_INTERRUPTS() call is harmless but misleading. It is probably possible to work around the immediate problem by moving the CHECK_FOR_INTERRUPTS() to before the heap page buffer lock is acquired. That isn't enough to make the function responsive to interrupts, though. The index AM caller will still hold an exclusive buffer lock of its own.	2020-11-09 09:00:12 -08:00
Noah Misch	0c3185e963	In security-restricted operations, block enqueue of at-commit user code. Specifically, this blocks DECLARE ... WITH HOLD and firing of deferred triggers within index expressions and materialized view queries. An attacker having permission to create non-temp objects in at least one schema could execute arbitrary SQL functions under the identity of the bootstrap superuser. One can work around the vulnerability by disabling autovacuum and not manually running ANALYZE, CLUSTER, REINDEX, CREATE INDEX, VACUUM FULL, or REFRESH MATERIALIZED VIEW. (Don't restore from pg_dump, since it runs some of those commands.) Plain VACUUM (without FULL) is safe, and all commands are fine when a trusted user owns the target object. Performance may degrade quickly under this workaround, however. Back-patch to 9.5 (all supported versions). Reviewed by Robert Haas. Reported by Etienne Stalmans. Security: CVE-2020-25695	2020-11-09 07:32:09 -08:00
Thomas Munro	d50e3b1f8d	Fix assertion in collation version lookup. Commit `257836a7` included an assertion that a version lookup routine is not trying to look up "C" or "POSIX", but that case is reachable with the user-facing SQL function pg_collation_actual_version(). Remove the assertion.	2020-11-08 20:45:29 +13:00
Peter Geoghegan	5a2f154a2e	Improve nbtree README's LP_DEAD section. The description of how LP_DEAD bit setting by index scans works following commit `2ed5b87f` was rather unclear. Clean that up a bit. Also refer to LP_DEAD bit setting within _bt_check_unique() at the start of the same section. This mechanism may actually be more important than the generic kill_prior_tuple mechanism that the section focuses on, so it at least deserves to be mentioned in passing.	2020-11-07 18:51:12 -08:00
Alvaro Herrera	52eec1c53a	Message style improvements * Avoid pointlessly highlighting that an index vacuum was executed by a parallel worker; user doesn't care. * Don't give the impression that a non-concurrent reindex of an invalid index on a TOAST table would work, because it wouldn't. * Add a "translator:" comment for a mysterious message. Discussion: https://postgr.es/m/20201107034943.GA16596@alvherre.pgsql Reviewed-by: Michael Paquier <michael@paquier.xyz>	2020-11-07 19:33:43 -03:00
Peter Eisentraut	bdc4edbea6	Move catalog index declarations Move the system catalog index declarations from catalog/indexing.h to the respective parent tables' catalog/pg_*.h files. The original reason for having it split was that the old genbki system produced the output in the order of the catalog files it read, so all the indexing stuff needed to come separately. But this is no longer the case, and keeping it together makes more sense. Reviewed-by: John Naylor <john.naylor@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/flat/c7cc82d6-f976-75d6-2e3e-b03d2cab26bb@2ndquadrant.com	2020-11-07 12:26:24 +01:00
Peter Eisentraut	b4c9695e79	Move catalog toast table declarations Move the system catalog toast table declarations from catalog/toasting.h to the respective parent tables' catalog/pg_*.h files. The original reason for having it split was that the old genbki system produced the output in the order of the catalog files it read, so all the toasting stuff needed to come separately. But this is no longer the case, and keeping it together makes more sense. Reviewed-by: John Naylor <john.naylor@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/flat/c7cc82d6-f976-75d6-2e3e-b03d2cab26bb@2ndquadrant.com	2020-11-07 12:26:24 +01:00
Alvaro Herrera	623644f02c	Plug memory leak in index_get_partition The list of indexes was being leaked when asked for an index that doesn't have an index partition in the table partition. Not a common case admittedly --and in most cases where it occurs, caller throws an error anyway-- but worth fixing for cleanliness and in case any third-party code is calling this function. While at it, remove use of lfirst_oid() to obtain a value we already have. Author: Justin Pryzby <pryzby@telsasoft.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/20201105203606.GF22691@telsasoft.com	2020-11-06 22:52:16 -03:00
Michael Paquier	a05dbf477b	Add GUC_LIST_INPUT and GUC_LIST_QUOTE to unix_socket_directories This should have been done in the initial commit that made unix_socket_directories a list as of `c9b0cbe`. This change allows to support correctly the case of ALTER SYSTEM, where it is possible to specify multiple paths as a list, like the following pattern where flattening is applied to each item: ALTER SYSTEM SET unix_socket_directories = '/path1', '/path2'; Any parameters specified in postgresql.conf are parsed the same way, so there is no compatibility change. pg_dump has a hardcoded list of parameters marked with GUC_LIST_QUOTE, that gets its routine update. These are reordered alphabetically for clarity. Author: Ian Lawrence Barwick Reviewed-by: Peter Eisentraunt, Tom Lane, Michael Paquier Discussion: https://postgr.es/m/CAB8KJ=iMOtNY6_sUwV=LQVCJ2zgYHBDyNzVfvE5GN3WQ3v9kQg@mail.gmail.com	2020-11-07 10:30:22 +09:00
Tomas Vondra	7577dd8480	Properly detoast data in brin_form_tuple brin_form_tuple failed to consider the values may be toasted, inserting the toast pointer into the index. This may easily result in index corruption, as the toast data may be deleted and cleaned up by vacuum. The cleanup however does not care about indexes, leaving invalid toast pointers behind, which triggers errors like this: ERROR: missing chunk number 0 for toast value 16433 in pg_toast_16426 A less severe consequence are inconsistent failures due to the index row being too large, depending on whether brin_form_tuple operated on plain or toasted version of the row. For example CREATE TABLE t (val TEXT); INSERT INTO t VALUES ('... long value ...') CREATE INDEX idx ON t USING brin (val); would likely succeed, as the row would likely include toast pointer. Switching the order of INSERT and CREATE INDEX would likely fail: ERROR: index row size 8712 exceeds maximum 8152 for index "idx" because this happens before the row values are toasted. The bug exists since PostgreSQL 9.5 where BRIN indexes were introduced. So backpatch all the way back. Author: Tomas Vondra Reviewed-by: Alvaro Herrera Backpatch-through: 9.5 Discussion: https://postgr.es/m/20201001184133.oq5uq75sb45pu3aw@development Discussion: https://postgr.es/m/20201104010544.zexj52mlldagzowv%40development	2020-11-07 00:39:19 +01:00
Tom Lane	eeda7f6338	Revert "Accept relations of any kind in LOCK TABLE". Revert `59ab4ac32`, as well as the followup fix `33862cb9c`, in all branches. We need to think a bit harder about what the behavior of LOCK TABLE on views should be, and there's no time for that before next week's releases. We'll take another crack at this later. Discussion: https://postgr.es/m/16703-e348f58aab3cf6cc@postgresql.org	2020-11-06 16:17:56 -05:00
Magnus Hagander	5ee180a394	Add pg_strong_random_init function to initialize random number generator Currently only OpenSSL requires this initialization, but in the future other SSL implementations are likely to need it as well. Abstracting this functionality out into a separate function makes this cleaner and more clear, and also removes the dependency on OpenSSL headers from fork_process.c. OpenSSL is special in that we need to initialize this random number generator even if we're not going to use it directly, until we drop support for everything prior to OpenSSL 1.1.1. (And of course also if we actually use it). All other implementations are left empty at this time, but more are expected to be added in the future. Author: Daniel Gustafsson <daniel@yesql.se>, Michael Paquier <michael@paquier.xyz> Reviewed-By: Magnus Hagander <magnus@hagander.net> Discussion: https://postgr.es/m/F6291C3C-747C-4C93-BCE0-28BB420B1FF5@yesql.se	2020-11-06 13:21:28 +01:00
Amit Kapila	4f841ce3f7	Use strlcpy instead of memcpy for copying the slot name in pgstat.c. There is no outright bug here but it is better to be consistent with the usage at other places in the same file. In the passing, fix a wrong assertion in pgstat_recv_replslot. Author: Kyotaro Horiguchi Reviewed-by: Sawada Masahiko and Amit Kapila Discussion: https://postgr.es/m/20201104.175523.1704166915688949637.horikyota.ntt@gmail.com	2020-11-06 08:12:48 +05:30
Peter Geoghegan	efc5dcfd8a	Fix wal_consistency_checking nbtree bug. wal_consistency_checking indicated an inconsistency in certain cases involving nbtree page deletion. The underlying issue is that there was a minor difference between the page image produced after a REDO routine ran and the corresponding page image following original execution. This harmless inconsistency has been around forever. We more or less expect total consistency among even deleted nbtree pages these days, though, so this won't do anymore. To fix, tweak the REDO routine to match original execution. Oversight in commit `f47b5e13`.	2020-11-05 15:01:40 -08:00
Tom Lane	5b7bfc3972	Don't throw an error for LOCK TABLE on a self-referential view. LOCK TABLE has complained about "infinite recursion" when applied to a self-referential view, ever since we made it recurse into views in v11. However, that breaks pg_dump's new assumption that it's okay to lock every relation. There doesn't seem to be any good reason to throw an error: if we just abandon the recursion, we've still satisfied the requirement of locking every referenced relation. Per bug #16703 from Andrew Bille (via Alexander Lakhin). Discussion: https://postgr.es/m/16703-e348f58aab3cf6cc@postgresql.org	2020-11-05 11:44:32 -05:00
Peter Geoghegan	48e1291342	Fix nbtree cleanup-only VACUUM stats inaccuracies. Logic for counting heap TIDs from posting list tuples (added by commit `0d861bbb`) was faulty. It didn't count any TIDs/index tuples in the event of no callback being set. This meant that we incorrectly counted no index tuples in clean-up only VACUUMs, which could lead to pg_class.reltuples being spuriously set to 0 in affected indexes. To fix, go back to counting items from the page in cases where there is no callback. This approach isn't very accurate, but it works well enough in practice while avoiding the expense of accessing every index tuple during cleanup-only VACUUMs. Author: Peter Geoghegan <pg@bowt.ie> Reported-By: Jehan-Guillaume de Rorthais <jgdr@dalibo.com> https://postgr.es/m/20201023174451.69e358f1@firost Backpatch: 13-, where nbtree deduplication was introduced	2020-11-04 18:42:27 -08:00
Thomas Munro	c732c3f8c1	Fix unlinking of SLRU segments. Commit `dee663f7` intended to drop any queued up fsync requests before unlinking segment files, but missed a code path. Fix, by centralizing the forget-and-unlink code into a single function. Reported-by: Tomas Vondra <tomas.vondra@2ndquadrant.com> Discussion: https://postgr.es/m/20201104013205.icogbi773przyny5%40development	2020-11-05 13:49:49 +13:00
Tom Lane	40c24bfef9	Improve our ability to regurgitate SQL-syntax function calls. The SQL spec calls out nonstandard syntax for certain function calls, for example substring() with numeric position info is supposed to be spelled "SUBSTRING(string FROM start FOR count)". We accept many of these things, but up to now would not print them in the same format, instead simplifying down to "substring"(string, start, count). That's long annoyed me because it creates an interoperability problem: we're gratuitously injecting Postgres-specific syntax into what might otherwise be a perfectly spec-compliant view definition. However, the real reason for addressing it right now is to support a planned change in the semantics of EXTRACT() a/k/a date_part(). When we switch that to returning numeric, we'll have the parser translate EXTRACT() to some new function name (might as well be "extract" if you ask me) and then teach ruleutils.c to reverse-list that per SQL spec. In this way existing calls to date_part() will continue to have the old semantics. To implement this, invent a new CoercionForm value COERCE_SQL_SYNTAX, and make the parser insert that rather than COERCE_EXPLICIT_CALL when the input has SQL-spec decoration. (But if the input has the form of a plain function call, continue to mark it COERCE_EXPLICIT_CALL, even if it's calling one of these functions.) Then ruleutils.c recognizes COERCE_SQL_SYNTAX as a cue to emit SQL call syntax. It can know which decoration to emit using hard-wired knowledge about the functions that could be called this way. (While this solution isn't extensible without manual additions, neither is the grammar, so this doesn't seem unmaintainable.) Notice that this solution will reverse-list a function call with SQL decoration only if it was entered that way; so dump-and-reload will not by itself produce any changes in the appearance of views. This requires adding a CoercionForm field to struct FuncCall. (I couldn't resist the temptation to rearrange that struct's field order a tad while I was at it.) FuncCall doesn't appear in stored rules, so that change isn't a reason for a catversion bump, but I did one anyway because the new enum value for CoercionForm fields could confuse old backend code. Possible future work: * Perhaps CoercionForm should now be renamed to DisplayForm, or something like that, to reflect its more general meaning. This'd require touching a couple hundred places, so it's not clear it's worth the code churn. * The SQLValueFunction node type, which was invented partly for the same goal of improving SQL-compatibility of view output, could perhaps be replaced with regular function calls marked with COERCE_SQL_SYNTAX. It's unclear if this would be a net code savings, however. Discussion: https://postgr.es/m/42b73d2d-da12-ba9f-570a-420e0cce19d9@phystech.edu	2020-11-04 12:34:50 -05:00
Tom Lane	f21636e5d5	Remove useless entries for aggregate functions from fmgrtab.c. Gen_fmgrtab.pl treated aggregate functions the same as other built-in functions, which is wasteful because there is no real need to have entries for them in the fmgr_builtins[] table. Suppressing those entries saves about 3KB in the compiled table on my machine; which is not a lot but it's not nothing either, considering that that table is pretty "hot". The only outside code change needed is that ExecInitWindowAgg() can't be allowed to call fmgr_info_cxt() on a plain aggregate function. But that saves a few cycles anyway. Having done that, the aggregate_dummy() function is unreferenced and might as well be dropped. Using "aggregate_dummy" as the prosrc value for an aggregate is now just a documentation convention not something that matters. There was some discussion of using NULL instead to save a few bytes in pg_proc, but we'd have to remove prosrc's BKI_FORCE_NOT_NULL marking which doesn't seem a great idea. Anyway, it's possible there's client-side code that expects to see "aggregate_dummy" there, so I'm loath to change it without a strong reason. Discussion: https://postgr.es/m/533989.1604263665@sss.pgh.pa.us	2020-11-04 11:25:56 -05:00
Fujii Masao	113d3591b8	Fix segmentation fault that commit `ac22929a26` caused. Commit `ac22929a26` changed recoveryWakeupLatch so that it's reset to NULL at the end of recovery. This change could cause a segmentation fault in the buildfarm member 'elver'. Previously the latch was reset to NULL after calling ShutdownWalRcv(). But there could be a window between ShutdownWalRcv() and the actual exit of walreceiver. If walreceiver set the latch during that window, the segmentation fault could happen. To fix the issue, this commit changes walreceiver so that it sets the latch only when the latch has not been reset to NULL yet. Author: Fujii Masao Discussion: https://postgr.es/m/5c1f8a85-747c-7bf9-241e-dd467d8a3586@iki.fi	2020-11-04 21:49:00 +09:00
Peter Eisentraut	560564d3ad	Enable hash partitioning of text arrays hash_array_extended() needs to pass PG_GET_COLLATION() to the hash function of the element type. Otherwise, the hash function of a collation-aware data type such as text will error out, since the introduction of nondeterministic collation made hash functions require a collation, too. The consequence of this is that before this change, hash partitioning using an array over text in the partition key would not work. Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://www.postgresql.org/message-id/flat/32c1fdae-95c6-5dc6-058a-a90330a3b621%40enterprisedb.com	2020-11-04 12:46:28 +01:00
Fujii Masao	ac22929a26	Get rid of the dedicated latch for signaling the startup process. This commit gets rid of the dedicated latch for signaling the startup process in favor of using its procLatch, since that comports better with possible generic signal handlers using that latch. Commit `1e53fe0e70` changed background processes so that they use standard SIGHUP handler. Like that, this commit also makes the startup process use standard SIGHUP handler to simplify the code. Author: Fujii Masao Reviewed-by: Bharath Rupireddy, Michael Paquier Discussion: https://postgr.es/m/CALj2ACXPorUqePswDtOeM_s82v9RW32E1fYmOPZ5NuE+TWKj_A@mail.gmail.com	2020-11-04 16:43:43 +09:00
Fujii Masao	02d332297f	Use standard SIGHUP handler in syslogger. Commit `1e53fe0e70` changed background processes so that they use standard SIGHUP handler. Like that, this commit makes syslogger use standard SIGHUP handler to simplify the code. Author: Bharath Rupireddy Reviewed-by: Fujii Masao Discussion: https://postgr.es/m/CALj2ACXPorUqePswDtOeM_s82v9RW32E1fYmOPZ5NuE+TWKj_A@mail.gmail.com	2020-11-04 14:48:02 +09:00
Thomas Munro	9f12a3b95d	Tolerate version lookup failure for old style Windows locale names. Accept that we can't get versions for such locale names for now. Users will need to specify the newer language tag format to enable the collation versioning feature. It's not clear that we can do automatic conversion from the old style to the new style reliably enough for this purpose. Unfortunately, this means that collation versioning probably won't work for the default collation unless you provide something like en-US at initdb or CREATE DATABASE time (though, for reasons not yet understood, it does seem to work on some systems). It'd be nice to find a better solution, or document this quirk if we settle on it, but this should unbreak the 3 failing build farm animals in the meantime. Reviewed-by: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAEepm%3D0uEQCpfq_%2BLYFBdArCe4Ot98t1aR4eYiYTe%3DyavQygiQ%40mail.gmail.com	2020-11-04 15:13:08 +13:00
Michael Paquier	e152506ade	Revert pg_relation_check_pages() This reverts the following set of commits, following complaints about the lack of portability of the central part of the code in bufmgr.c as well as the use of partition mapping locks during page reads: `c780a7a9` `f2b88396` `b787d4ce` `ce7f772c` `60a51c6b` Per discussion with Andres Freund, Robert Haas and myself. Bump catalog version. Discussion: https://postgr.es/m/20201029181729.2nrub47u7yqncsv7@alap3.anarazel.de	2020-11-04 10:21:46 +09:00
Tomas Vondra	90851d1d26	Use INT64_FORMAT to print int64 variables in sort debug Commit `6ee3b5fb99` cleaned up most of the long/int64 confusion related to incremental sort, but the sort debug messages were still using %ld for int64 variables. So fix that. Author: Haiying Tang Backpatch-through: 13, where the incremental sort code was added Discussion: https://postgr.es/m/4250be9d350c4992abb722a76e288aef%40G08CNEXMBPEKD05.g08.fujitsu.local	2020-11-03 22:31:57 +01:00
Tomas Vondra	ebb7ae839d	Fix get_useful_pathkeys_for_relation for volatile expressions When considering Incremental Sort below a Gather Merge, we need to be a bit more careful when matching pathkeys to EC members. It's not enough to find a member whose Vars are all in the current relation's target; volatile expressions in particular need to be contained in the target, otherwise it's too early to use the pathkey. Reported-by: Jaime Casanova Author: James Coleman Reviewed-by: Tomas Vondra Backpatch-through: 13, where the incremental sort code was added Discussion: https://postgr.es/m/CAJGNTeNaxpXgBVcRhJX%2B2vSbq%2BF2kJqGBcvompmpvXb7pq%2BoFA%40mail.gmail.com	2020-11-03 22:31:57 +01:00
Tom Lane	92f87182f2	Guard against core dump from uninitialized subplan. If the planner erroneously puts a non-parallel-safe SubPlan into a parallelized portion of the query tree, nodeSubplan.c will fail in the worker processes because it finds a null in es_subplanstates, which it's unable to cope with. It seems worth a test-and-elog to make that an error case rather than a core dump case. This probably should have been included in commit `16ebab688`, which was responsible for allowing nulls to appear in es_subplanstates to begin with. So, back-patch to v10 where that came in. Discussion: https://postgr.es/m/924226.1604422326@sss.pgh.pa.us	2020-11-03 16:16:36 -05:00
Tom Lane	17fb60387c	Improve error messages around REPLICATION and BYPASSRLS properties. Clarify wording as per suggestion from Wolfgang Walther. No back-patch; this doesn't seem worth thrashing translatable strings in the back branches. Tom Lane and Stephen Frost Discussion: https://postgr.es/m/a5548a9f-89ee-3167-129d-162b5985fcf8@technowledgy.de	2020-11-03 15:49:05 -05:00
Tom Lane	d907bd0543	Allow users with BYPASSRLS to alter their own passwords. The intention in commit `491c029db` was to require superuserness to change the BYPASSRLS property, but the actual effect of the coding in AlterRole() was to require superuserness to change anything at all about a BYPASSRLS role. Other properties of a BYPASSRLS role should be changeable under the same rules as for a normal role, though. Fix that, and also take care of some documentation omissions related to BYPASSRLS and REPLICATION role properties. Tom Lane and Stephen Frost, per bug report from Wolfgang Walther. Back-patch to all supported branches. Discussion: https://postgr.es/m/a5548a9f-89ee-3167-129d-162b5985fcf8@technowledgy.de	2020-11-03 15:41:32 -05:00
Peter Eisentraut	bf797a8d97	Disallow ALTER TABLE ONLY / DROP EXPRESSION The current implementation cannot handle this correctly, so just forbid it for now. GENERATED clauses must be attached to the column definition and cannot be added later like DEFAULT, so if a child table has a generation expression that the parent does not have, the child column will necessarily be an attlocal column. So to implement ALTER TABLE ONLY / DROP EXPRESSION, we'd need extra code to update attislocal of the direct child tables, somewhat similar to how DROP COLUMN does it, so that the resulting state can be properly dumped and restored. Discussion: https://www.postgresql.org/message-id/flat/15830.1575468847%40sss.pgh.pa.us	2020-11-03 15:28:23 +01:00
Magnus Hagander	13cfa02f77	Improve error handling in backend OpenSSL implementation Commit `d94c36a45a` introduced error handling to sslinfo to handle OpenSSL errors gracefully. This ports this errorhandling to the backend TLS implementation. Author: Daniel Gustafsson <daniel@yesql.se>	2020-11-03 09:55:51 +01:00
Amit Kapila	8c2d8f6cc4	Fix typos. Author: Hou Zhijie Discussion: https://postgr.es/m/855a9421839d402b8b351d273c89a8f8@G08CNEXMBPEKD05.g08.fujitsu.local	2020-11-03 08:38:27 +05:30
Tom Lane	0a4b340312	Fix unportable use of getnameinfo() in pg_hba_file_rules view. fill_hba_line() thought it could get away with passing sizeof(struct sockaddr_storage) rather than the actual addrlen previously returned by getaddrinfo(). While that appears to work on many platforms, it does not work on FreeBSD 11: you get back a failure, which leads to the view showing NULL for the address and netmask columns in all rows. The POSIX spec for getnameinfo() is pretty clearly on FreeBSD's side here: you should pass the actual address length. So it seems plausible that there are other platforms where this coding also fails, and we just hadn't noticed. Also, IMO the fact that getnameinfo() failure leads to a NULL output is pretty bogus in itself. Our pg_getnameinfo_all() wrapper is careful to emit "???" on failure, and we should use that in such cases. NULL should only be emitted in rows that don't have IP addresses. Per bug #16695 from Peter Vandivier. Back-patch to v10 where this code was added. Discussion: https://postgr.es/m/16695-a665558e2f630be7@postgresql.org	2020-11-02 21:11:50 -05:00
Tom Lane	e1339bfc7a	Remove special checks for pg_rewrite.ev_qual and ev_action being NULL. make_ruledef() and make_viewdef() were coded to cope with possible null-ness of these columns, but they've been marked BKI_FORCE_NOT_NULL for some time. So there's not really any need to do more than what we do for the other columns of pg_rewrite, i.e. just Assert that we got non-null results. (There is a school of thought that says Asserts aren't the thing to do to check for corrupt data, but surely here is not the place to start if we want such a policy.) Also, remove long-dead-if-indeed-it-ever-wasn't-dead handling of an empty actions list in make_ruledef(). That's an error case and should be treated as such. (DO INSTEAD NOTHING is represented by a CMD_NOTHING Query, not an empty list; cf transformRuleStmt.) Kyotaro Horiguchi, some changes by me Discussion: https://postgr.es/m/CAEudQApoA=tMTic6xEPYP_hsNZ8XtToVThK_0x7D_aFQYowq3w@mail.gmail.com	2020-11-02 14:34:34 -05:00
Tom Lane	8e1f37c07a	Rethink the generation rule for fmgroids.h macros. Traditionally, the names of fmgroids.h macros for pg_proc OIDs have been constructed from the prosrc field. But sometimes the same C function underlies multiple pg_proc entries, forcing us to make an arbitrary choice of which OID to reference; the other entries are then not namable via fmgroids.h. Moreover, we could not have macros at all for pg_proc entries that aren't for C-coded functions. Instead, use the proname field, and append the proargtypes field (replacing inter-argument spaces with underscores) if proname is not unique. Special-casing unique entries such as F_OIDEQ removes the need to change a lot of code. Indeed, I can only find two places in the tree that need to be adjusted; while this changes quite a few existing entries in fmgroids.h, few of them are referenced from C code. With this patch, all entries in pg_proc.dat have macros in fmgroids.h. Discussion: https://postgr.es/m/472274.1604258384@sss.pgh.pa.us	2020-11-02 11:57:28 -05:00
Peter Eisentraut	dd26a0ad76	Use PG_GETARG_TRANSACTIONID where appropriate Some places were using PG_GETARG_UINT32 where PG_GETARG_TRANSACTIONID would be more appropriate. (Of course, they are the same internally, so there is no externally visible effect.) To do that, export PG_GETARG_TRANSACTIONID outside of xid.c. We also export PG_RETURN_TRANSACTIONID for symmetry, even though there are currently no external users. Author: Ashutosh Bapat <ashutosh.bapat@2ndquadrant.com> Discussion: https://www.postgresql.org/message-id/flat/d8f6bdd536df403b9b33816e9f7e0b9d@G08CNEXMBPEKD05.g08.fujitsu.local	2020-11-02 16:48:22 +01:00
Thomas Munro	257836a755	Track collation versions for indexes. Record the current version of dependent collations in pg_depend when creating or rebuilding an index. When accessing the index later, warn that the index may be corrupted if the current version doesn't match. Thanks to Douglas Doole, Peter Eisentraut, Christoph Berg, Laurenz Albe, Michael Paquier, Robert Haas, Tom Lane and others for very helpful discussion. Author: Thomas Munro <thomas.munro@gmail.com> Author: Julien Rouhaud <rjuju123@gmail.com> Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com> (earlier versions) Discussion: https://postgr.es/m/CAEepm%3D0uEQCpfq_%2BLYFBdArCe4Ot98t1aR4eYiYTe%3DyavQygiQ%40mail.gmail.com	2020-11-03 01:19:50 +13:00
Thomas Munro	cd6f479e79	Add pg_depend.refobjversion. Provide a place for the version of referenced database objects to be recorded. A follow-up commit will use this to record dependencies on collation versions for indexes, but similar ideas for other kinds of objects have also been mooted. Author: Thomas Munro <thomas.munro@gmail.com> Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com> Discussion: https://postgr.es/m/CAEepm%3D0uEQCpfq_%2BLYFBdArCe4Ot98t1aR4eYiYTe%3DyavQygiQ%40mail.gmail.com	2020-11-03 00:44:59 +13:00
Thomas Munro	7d1297df08	Remove pg_collation.collversion. This model couldn't be extended to cover the default collation, and didn't have any information about the affected database objects when the version changed. Remove, in preparation for a follow-up commit that will add a new mechanism. Author: Thomas Munro <thomas.munro@gmail.com> Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com> Discussion: https://postgr.es/m/CAEepm%3D0uEQCpfq_%2BLYFBdArCe4Ot98t1aR4eYiYTe%3DyavQygiQ%40mail.gmail.com	2020-11-03 00:44:59 +13:00
Michael Paquier	8a15e735be	Fix some grammar and typos in comments and docs The documentation fixes are backpatched down to where they apply. Author: Justin Pryzby Discussion: https://postgr.es/m/20201031020801.GD3080@telsasoft.com Backpatch-through: 9.6	2020-11-02 15:14:41 +09:00
Amit Kapila	644f0d7cc9	Use Enum for top level logical replication message types. Logical replication protocol uses a single byte character to identify a message type in logical replication protocol. The code uses string literals for the same. Use Enum so that 1. All the string literals used can be found at a single place. This makes it easy to add more types without the risk of conflicts. 2. It's easy to locate the code handling a given message type. 3. When used with switch statements, it is easy to identify the missing cases using -Wswitch. Author: Ashutosh Bapat Reviewed-by: Kyotaro Horiguchi, Andres Freund, Peter Smith and Amit Kapila Discussion: https://postgr.es/m/CAExHW5uPzQ7L0oAd_ENyvaiYMOPgkrAoJpE+ZY5-obdcVT6NPg@mail.gmail.com	2020-11-02 08:18:18 +05:30
David Rowley	a929e17e5a	Allow run-time pruning on nested Append/MergeAppend nodes Previously we only tagged on the required information to allow the executor to perform run-time partition pruning for Append/MergeAppend nodes belonging to base relations. It was thought that nested Append/MergeAppend nodes were just about always pulled up into the top-level Append/MergeAppend and that making the run-time pruning info for any sub Append/MergeAppend nodes was a waste of time. However, that was likely badly thought through. Some examples of cases we're unable to pullup nested Append/MergeAppends are: 1) Parallel Append nodes with a mix of parallel and non-parallel paths into a Parallel Append. 2) When planning an ordered Append scan a sub-partition which is unordered may require a nested MergeAppend path to ensure sub-partitions don't mix up the order of tuples being fed into the top-level Append. Unfortunately, it was not just as simple as removing the lines in createplan.c which were purposefully not building the run-time pruning info for anything but RELOPT_BASEREL relations. The code in add_paths_to_append_rel() was far too sloppy about which partitioned_rels it included for the Append/MergeAppend paths. The original code there would always assume accumulate_append_subpath() would pull each sub-Append and sub-MergeAppend path into the top-level path. While it does not appear that there were any actual bugs caused by having the additional partitioned table RT indexes recorded, what it did mean is that later in planning, when we built the run-time pruning info that we wasted effort and built PartitionedRelPruneInfos for partitioned tables that we had no subpaths for the executor to run-time prune. Here we tighten that up so that partitioned_rels only ever contains the RT index for partitioned tables which actually have subpaths in the given Append/MergeAppend. We can now Assert that every PartitionedRelPruneInfo has a non-empty present_parts. That should allow us to catch any weird corner cases that have been missed. In passing, it seems there is no longer a good reason to have the AppendPath and MergeAppendPath's partitioned_rel fields a List of IntList. We can simply have a List of Relids instead. This is more compact in memory and faster to add new members to. We still know which is the root level partition as these always have a lower relid than their children. Previously this field was used for more things, but run-time partition pruning now remains the only user of it and it has no need for a List of IntLists. Here we also get rid of the RelOptInfo partitioned_child_rels field. This is what was previously used to (sometimes incorrectly) set the Append/MergeAppend path's partitioned_rels field. That was the only usage of that field, so we can happily just remove it. I also couldn't resist changing some nearby code to make use of the newly added for_each_from macro so we can skip the first element in the list without checking if the current item was the first one on each iteration. A bug report from Andreas Kretschmer prompted all this work, however, after some consideration, I'm not personally classing this as a bug fix. So no backpatch. In Andreas' test case, it just wasn't that clear that there was a nested Append since the top-level Append just had a single sub-path which was pulled up a level, per `8edd0e794`. Author: David Rowley Reviewed-by: Amit Langote Discussion: https://postgr.es/m/flat/CAApHDvqSchs%2BubdybcfFaSPB%2B%2BEA7kqMaoqajtP0GtZvzOOR3g%40mail.gmail.com	2020-11-02 13:46:56 +13:00
Michael Paquier	b17ff07aa3	Preserve index data in pg_statistic across REINDEX CONCURRENTLY Statistics associated to an index got lost after running REINDEX CONCURRENTLY, while the non-concurrent case preserves these correctly. The concurrent and non-concurrent operations need to be consistent for the end-user, and missing statistics would force to wait for a new analyze to happen, which could take some time depending on the activity of the existing autovacuum workers. This issue is fixed by copying any existing entries in pg_statistic associated to the old index to the new one. Note that this copy is already done with the data of the index in the stats collector. Reported-by: Fabrízio de Royes Mello Author: Michael Paquier, Fabrízio de Royes Mello Reviewed-by: Justin Pryzby Discussion: https://postgr.es/m/CAFcNs+qpFPmiHd1oTXvcPdvAHicJDA9qBUSujgAhUMJyUMb+SA@mail.gmail.com Backpatch-through: 12	2020-11-01 21:22:07 +09:00
Noah Misch	f90e80b913	Reproduce debug_query_string==NULL on parallel workers. Certain background workers initiate parallel queries while debug_query_string==NULL, at which point they attempted strlen(NULL) and died to SIGSEGV. Older debug_query_string observers allow NULL, so do likewise in these newer ones. Back-patch to v11, where commit `7de4a1bcc5` introduced the first of these. Discussion: https://postgr.es/m/20201014022636.GA1962668@rfd.leadboat.com	2020-10-31 08:43:28 -07:00
Tom Lane	970c050575	Fix assertion failure in check_new_partition_bound(). Commit `6b2c4e59d` was overly confident about not being able to see a negative cmpval result from partition_range_bsearch(). Adjust the code to cope with that. Report and patch by Amul Sul; some additional cosmetic changes by me Discussion: https://postgr.es/m/CAAJ_b97WCO=EyVA7fKzc86kKfojHXLU04_zs7-7+yVzm=-1QkQ@mail.gmail.com	2020-10-30 17:00:59 -04:00
Heikki Linnakangas	6f0bc5e1da	Fix missing validation for the new GiST sortsupport functions. Because of this, if you tried to create an operator family with the new sortsupport function, you got an error: ERROR: support function number 11 is invalid for access method gist We missed this in commit `16fa9b2b30` that added the sortsupport function, because it only added sortsupport to a built-in operator family. Author: Andrey Borodin Discussion: https://www.postgresql.org/message-id/3520A18A-5C38-4697-A2E3-F3BDE3496CD5%40yandex-team.ru	2020-10-30 19:30:19 +02:00
Tom Lane	f90149e628	Don't use custom OID symbols in pg_type.dat, either. On the same reasoning as in commit `36b931214`, forbid using custom oid_symbol macros in pg_type as well as pg_proc, so that we always rely on the predictable macro names generated by genbki.pl. We do continue to grant grandfather status to the names CASHOID and LSNOID, although those are now considered deprecated aliases for the preferred names MONEYOID and PG_LSNOID. This is because there's likely to be client-side code using the old names, and this bout of neatnik-ism doesn't quite seem worth breaking client code. There might be a case for grandfathering EVTTRIGGEROID, too, since externally-maintained PLs may reference that symbol. But renaming such references to EVENT_TRIGGEROID doesn't seem like a particularly heavy lift --- we make far more significant backend API changes in every major release. For now I didn't add that, but we could reconsider if there's pushback. The other names changed here seem pretty unlikely to have any outside uses. Again, we could add alias macros if there are complaints, but for now I didn't. As before, no need for a catversion bump. John Naylor Discussion: https://postgr.es/m/CAFBsxsHpCbjfoddNGpnnnY5pHwckWfiYkMYSF74PmP1su0+ZOw@mail.gmail.com	2020-10-29 13:33:38 -04:00
Andres Freund	1c7675a7a4	Fix wrong data table horizon computation during backend startup. When ComputeXidHorizons() was called before MyDatabaseOid is set, e.g. because a dead row in a shared relation is encountered during InitPostgres(), the horizon for normal tables was computed too aggressively, ignoring all backends connected to a database. During subsequent pruning in a data table the too aggressive horizon could end up still being used, possibly leading to still needed tuples being removed. Not good. This is a bug in `dc7420c2c9`, which the test added in `94bc27b576` made visible, if run with force_parallel_mode set to regress. In that case the bug is reliably triggered, because "pruning_query" is run in a parallel worker and the start of that parallel worker is likely to encounter a dead row in pg_database. The fix is trivial: Compute a more pessimistic data table horizon if MyDatabaseId is not yet known. Author: Andres Freund Discussion: https://postgr.es/m/20201029040030.p4osrmaywhqaesd4@alap3.anarazel.de	2020-10-28 21:49:07 -07:00
Amit Kapila	8e90ec5580	Track statistics for streaming of changes from ReorderBuffer. This adds the statistics about transactions streamed to the decoding output plugin from ReorderBuffer. Users can query the pg_stat_replication_slots view to check these stats and call pg_stat_reset_replication_slot to reset the stats of a particular slot. Users can pass NULL in pg_stat_reset_replication_slot to reset stats of all the slots. Commit `9868167500` has added the basic infrastructure to capture the stats of slot and this commit extends the statistics collector to track additional information about slots. Bump the catversion as we have added new columns in the catalog entry. Author: Ajin Cherian and Amit Kapila Reviewed-by: Sawada Masahiko and Dilip Kumar Discussion: https://postgr.es/m/CAA4eK1+chpEomLzgSoky-D31qev19AmECNiEAietPQUGEFhtVA@mail.gmail.com	2020-10-29 09:11:51 +05:30
Andres Freund	94bc27b576	Centralize horizon determination for temp tables, fixing bug due to skew. This fixes a bug in the edge case where, for a temp table, heap_page_prune() can end up with a different horizon than heap_vacuum_rel(). Which can trigger errors like "ERROR: cannot freeze committed xmax ...". The bug was introduced due to interaction of `a7212be8b9` "Set cutoff xmin more aggressively when vacuuming a temporary table." with `dc7420c2c9` "snapshot scalability: Don't compute global horizons while building snapshots.". The problem is caused by lazy_scan_heap() assuming that the only reason its HeapTupleSatisfiesVacuum() call would return HEAPTUPLE_DEAD is if the tuple is a HOT tuple, or if the tuple's inserting transaction has aborted since the heap_page_prune() call. But after `a7212be8b9` that was also possible in other cases for temp tables, because heap_page_prune() uses a different visibility test after `dc7420c2c9`. The fix is fairly simple: Move the special case logic for temp tables from vacuum_set_xid_limits() to the infrastructure introduced in `dc7420c2c9`. That ensures that the horizon used for pruning is at least as aggressive as the one used by lazy_scan_heap(). The concrete horizon used for temp tables is slightly different than the logic in `dc7420c2c9`, but should always be as aggressive as before (see comments). A significant benefit to centralizing the logic procarray.c is that now the more aggressive horizons for temp tables does not just apply to VACUUM but also to e.g. HOT pruning and the nbtree killtuples logic. Because isTopLevel is not needed by vacuum_set_xid_limits() anymore, I undid the the related changes from `a7212be8b9`. This commit also adds an isolation test ensuring that the more aggressive vacuuming and pruning of temp tables keeps working. Debugged-By: Amit Kapila <amit.kapila16@gmail.com> Debugged-By: Tom Lane <tgl@sss.pgh.pa.us> Debugged-By: Ashutosh Sharma <ashu.coek88@gmail.com> Author: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20201014203103.72oke6hqywcyhx7s@alap3.anarazel.de Discussion: https://postgr.es/m/20201015083735.derdzysdtqdvxshp@alap3.anarazel.de	2020-10-28 18:02:31 -07:00
Michael Paquier	60a51c6b32	Fix incorrect placement of pfree() in pg_relation_check_pages() This would cause the function to crash when more than one page is considered as broken and reported in the SRF. Reported-by: Noriyoshi Shinoda Discussion: https://postgr.es/m/TU4PR8401MB11523D42C315AAF822E74275EE170@TU4PR8401MB1152.NAMPRD84.PROD.OUTLOOK.COM	2020-10-29 09:17:34 +09:00
Tom Lane	ad77039fad	Calculate extraUpdatedCols in query rewriter, not parser. It's unsafe to do this at parse time because addition of generated columns to a table would not invalidate stored rules containing UPDATEs on the table ... but there might now be dependent generated columns that were not there when the rule was made. This also fixes an oversight that rewriteTargetView failed to update extraUpdatedCols when transforming an UPDATE on an updatable view. (Since the new calculation is downstream of that, rewriteTargetView doesn't actually need to do anything; but before, there was a demonstrable bug there.) In v13 and HEAD, this leads to easily-visible bugs because (since commit `c6679e4fc`) we won't recalculate generated columns that aren't listed in extraUpdatedCols. In v12 this bitmap is mostly just used for trigger-firing decisions, so you'd only notice a problem if a trigger cared whether a generated column had been updated. I'd complained about this back in May, but then forgot about it until bug #16671 from Michael Paul Killian revived the issue. Back-patch to v12 where this field was introduced. If existing stored rules contain any extraUpdatedCols values, they'll be ignored because the rewriter will overwrite them, so the bug will be fixed even for existing rules. (But note that if someone were to update to 13.1 or 12.5, store some rules with UPDATEs on tables having generated columns, and then downgrade to a prior minor version, they might observe issues similar to what this patch fixes. That seems unlikely enough to not be worth going to a lot of effort to fix.) Discussion: https://postgr.es/m/10206.1588964727@sss.pgh.pa.us Discussion: https://postgr.es/m/16671-2fa55851859fb166@postgresql.org	2020-10-28 13:47:02 -04:00
Tom Lane	36b9312143	Don't use custom OID symbols in pg_proc.dat. We have a perfectly good convention for OID macros for built-in functions already, so making custom symbols is just introducing unnecessary deviation from the convention. Remove the one case that had snuck in, and add an error check in genbki.pl to discourage future instances. Although this touches pg_proc.dat, there's no need for a catversion bump since the actual catalog data isn't changed. John Naylor Discussion: https://postgr.es/m/CAFBsxsHpCbjfoddNGpnnnY5pHwckWfiYkMYSF74PmP1su0+ZOw@mail.gmail.com	2020-10-28 12:18:45 -04:00
Tom Lane	ad1c36b070	Fix foreign-key selectivity estimation in the presence of constants. get_foreign_key_join_selectivity() looks for join clauses that equate the two sides of the FK constraint. However, if we have a query like "WHERE fktab.a = pktab.a and fktab.a = 1", it won't find any such join clause, because equivclass.c replaces the given clauses with "fktab.a = 1 and pktab.a = 1", which can be enforced at the scan level, leaving nothing to be done for column "a" at the join level. We can fix that expectation without much trouble, but then a new problem arises: applying the foreign-key-based selectivity rule produces a rowcount underestimate, because we're effectively double-counting the selectivity of the "fktab.a = 1" clause. So we have to cancel that selectivity out of the estimate. To fix, refactor process_implied_equality() so that it can pass back the new RestrictInfo to its callers in equivclass.c, allowing the generated "fktab.a = 1" clause to be saved in the EquivalenceClass's ec_derives list. Then it's not much trouble to dig out the relevant RestrictInfo when we need to adjust an FK selectivity estimate. (While at it, we can also remove the expensive use of initialize_mergeclause_eclasses() to set up the new RestrictInfo's left_ec and right_ec pointers. The equivclass.c code can set those basically for free.) This seems like clearly a bug fix, but I'm hesitant to back-patch it, first because there's some API/ABI risk for extensions and second because we're usually loath to destabilize plan choices in stable branches. Per report from Sigrid Ehrenreich. Discussion: https://postgr.es/m/1019549.1603770457@sss.pgh.pa.us Discussion: https://postgr.es/m/AM6PR02MB5287A0ADD936C1FA80973E72AB190@AM6PR02MB5287.eurprd02.prod.outlook.com	2020-10-28 11:15:47 -04:00
Michael Paquier	ce7f772c5e	Use correct GetDatum() in pg_relation_check_pages() UInt32GetDatum() was getting used, while the result needs Int64GetDatum(). Oversight in `f2b8839`. Per buildfarm member florican. Discussion: https://postgr.es/m/1226629.1603859189@sss.pgh.pa.us	2020-10-28 13:59:18 +09:00
Michael Paquier	f2b8839695	Add pg_relation_check_pages() to check on-disk pages of a relation This makes use of CheckBuffer() introduced in `c780a7a`, adding a SQL wrapper able to do checks for all the pages of a relation. By default, all the fork types of a relation are checked, and it is possible to check only a given relation fork. Note that if the relation given in input has no physical storage or is temporary, then no errors are generated, allowing full-database checks when coupled with a simple scan of pg_class for example. This is not limited to clusters with data checksums enabled, as clusters without data checksums can still apply checks on pages using the page headers or for the case of a page full of zeros. This function returns a set of tuples consisting of: - The physical file where a broken page has been detected (without the segment number as that can be AM-dependent, which can be guessed from the block number for heap). A relative path from PGPATH is used. - The block number of the broken page. By default, only superusers have an access to this function but execution rights can be granted to other users. The feature introduced here is still minimal, and more improvements could be done, like: - Addition of a start and end block number to run checks on a range of blocks, which would apply only if one fork type is checked. - Addition of some progress reporting. - Throttling, with configuration parameters in function input or potentially some cost-based GUCs. Regression tests are added for positive cases in the main regression test suite, and TAP tests are added for cases involving the emulation of page corruptions. Bump catalog version. Author: Julien Rouhaud, Michael Paquier Reviewed-by: Masahiko Sawada, Justin Pryzby Discussion: https://postgr.es/m/CAOBaU_aVvMjQn=ge5qPiJOPMmOj5=ii3st5Q0Y+WuLML5sR17w@mail.gmail.com	2020-10-28 12:15:00 +09:00
Michael Paquier	c780a7a90a	Add CheckBuffer() to check on-disk pages without shared buffer loading CheckBuffer() is designed to be a concurrent-safe function able to run sanity checks on a relation page without loading it into the shared buffers. The operation is done using a lock on the partition involved in the shared buffer mapping hashtable and an I/O lock for the buffer itself, preventing the risk of false positives due to any concurrent activity. The primary use of this function is the detection of on-disk corruptions for relation pages. If a page is found in shared buffers, the on-disk page is checked if not dirty (a follow-up checkpoint would flush a valid version of the page if dirty anyway), as it could be possible that a page was present for a long time in shared buffers with its on-disk version corrupted. Such a scenario could lead to a corrupted cluster if a host is plugged off for example. If the page is not found in shared buffers, its on-disk state is checked. PageIsVerifiedExtended() is used to apply the same sanity checks as when a page gets loaded into shared buffers. This function will be used by an upcoming patch able to check the state of on-disk relation pages using a SQL function. Author: Julien Rouhaud, Michael Paquier Reviewed-by: Masahiko Sawada Discussion: https://postgr.es/m/CAOBaU_aVvMjQn=ge5qPiJOPMmOj5=ii3st5Q0Y+WuLML5sR17w@mail.gmail.com	2020-10-28 11:12:46 +09:00
Peter Eisentraut	f893e68d76	Add select_common_typmod() This accompanies select_common_type() and select_common_collation(). Typmods were previously combined using hand-coded logic in several places. The logic in select_common_typmod() isn't very exciting, but it makes the code more compact and readable in a few locations, and in the future we can perhaps do more complicated things if desired. As a small enhancement, the type unification of the direct and aggregate arguments of hypothetical-set aggregates now unifies the typmod as well using this new function, instead of just dropping it. Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Discussion: https://www.postgresql.org/message-id/flat/97df3af9-8b5e-fb7f-a029-3eb7e80d7af9@2ndquadrant.com	2020-10-27 18:10:42 +01:00
Alvaro Herrera	59ab4ac324	Accept relations of any kind in LOCK TABLE The restriction that only tables and views can be locked by LOCK TABLE is quite arbitrary, since the underlying mechanism can lock any relation type. Drop the restriction so that programs such as pg_dump can lock all relations they're interested in, preventing schema changes that could cause a dump to fail after expending much effort. Backpatch to 9.5. Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reported-by: Wells Oliver <wells.oliver@gmail.com> Discussion: https://postgr.es/m/20201021200659.GA32358@alvherre.pgsql	2020-10-27 13:49:19 -03:00
Peter Eisentraut	0525572860	Fix enum errdetail to mention bytes, not chars The enum label length is in terms of bytes, not charactes. Author: Ian Lawrence Barwick <barwick@gmail.com> Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CAB8KJ=itZEJ7C9BacTHSYgeUysH4xx8wDiOnyppnSLyn6-g+Bw@mail.gmail.com	2020-10-27 11:50:18 +01:00
Peter Eisentraut	9213462c53	Make procedure OUT parameters work with JDBC The JDBC driver sends OUT parameters with type void. This makes sense when calling a function, so that the parameters are ignored in ParseFuncOrColumn(). For a procedure call we want to treat them as unknown. Reviewed-by: Andrew Dunstan <andrew@dunslane.net> Discussion: https://www.postgresql.org/message-id/flat/d7e49540-ea92-b4e2-5fff-42036102f968%402ndquadrant.com	2020-10-27 09:01:54 +01:00
Tom Lane	20d3fe9009	In INSERT/UPDATE, use the table's real tuple descriptor as target. Previously, ExecInitModifyTable relied on ExecInitJunkFilter, and thence ExecCleanTypeFromTL, to build the target descriptor from the query tlist. While we just checked (in ExecCheckPlanOutput) that the tlist produces compatible output, this is not a great substitute for the relation's actual tuple descriptor that's available from the relcache. For one thing, dropped columns will not be correctly marked attisdropped; it's a bit surprising that we've gotten away with that this long. But the real reason for being concerned with this is that using the table's descriptor means that the slot will have correct attrmissing data, allowing us to revert the klugy fix of commit `ba9f18abd`. (This commit undoes that one's changes in trigger.c, but keeps the new test case.) Thus we can solve the bogus-trigger-tuple problem with fewer cycles rather than more. No back-patch, since this doesn't fix any additional bug, and it seems somewhat more likely to have unforeseen side effects than ba9f18abd's narrow fix. Discussion: https://postgr.es/m/16644-5da7ef98a7ac4545@postgresql.org	2020-10-26 11:36:53 -04:00
Michael Paquier	d401c5769e	Extend PageIsVerified() to handle more custom options This is useful for checks of relation pages without having to load the pages into the shared buffers, and two cases can make use of that: page verification in base backups and the online, lock-safe, flavor. Compatibility is kept with past versions using a macro that calls the new extended routine with the set of options compatible with the original version. Extracted from a larger patch by the same author. Author: Anastasia Lubennikova Reviewed-by: Michael Paquier, Julien Rouhaud Discussion: https://postgr.es/m/608f3476-0598-2514-2c03-e05c7d2b0cbd@postgrespro.ru	2020-10-26 09:55:28 +09:00
Tom Lane	ba9f18abd3	Fix corner case for a BEFORE ROW UPDATE trigger returning OLD. If the old row has any "missing" attributes that are supposed to be retrieved from an associated tuple descriptor, the wrong things happened because the trigger result is shoved directly into an executor slot that lacks the missing-attribute data. Notably, CHECK-constraint verification would incorrectly see those columns as NULL, and so would RETURNING-list evaluation. Band-aid around this by forcibly expanding the tuple before passing it to the trigger function. (IMO it was a fundamental misdesign to put the missing-attribute data into tuple constraints, which so much of the system considers to be optional. But we're probably stuck with that now, and will have to continue to apply band-aids as we find other places with similar issues.) Back-patch to v12. v11 would also have the issue, except that commit `920311ab1` already applied a similar band-aid. That forced expansion in more cases than seem really necessary, though, so this isn't a directly equivalent fix. Amit Langote, with some cosmetic changes by me Discussion: https://postgr.es/m/16644-5da7ef98a7ac4545@postgresql.org	2020-10-25 13:57:46 -04:00
David Rowley	e83c9f913c	Fix incorrect parameter name in a function header comment Author: Zhijie Hou Discussion: https://postgr.es/m/14cd74ea00204cc8a7ea5d738ac82cd1@G08CNEXMBPEKD05.g08.fujitsu.local Backpatch-through: 12, where the mistake was introduced	2020-10-25 22:40:03 +13:00
Tom Lane	87a174c0e7	Fix broken XML formatting in EXPLAIN output for incremental sorts. The ExplainCloseGroup arguments for incremental sort usage data didn't match the corresponding ExplainOpenGroup. This only matters for XML-format output, which is probably why we'd not noticed. Daniel Gustafsson, per bug #16683 from Frits Jalvingh Discussion: https://postgr.es/m/16683-8005033324ad34e9@postgresql.org	2020-10-23 11:32:33 -04:00
Heikki Linnakangas	22b73d3cb0	Fix initialization of es_result_relations in EvalPlanQualStart(). Thinko in commit `1375422c78`. EvalPlanQualStart() was mistakenly resetting the parent EState's es_result_relations, when it should initialize the field in the child EPQ EState it just created. That was clearly wrong, but it didn't cause any ill effects, because es_result_relations is currently not used after the ExecInit* phase. Author: Amit Langote Discussion: https://www.postgresql.org/message-id/CA%2BHiwqFEuq8AAAmxXsTDVZ1r38cHbfYuiPQx_%3DYyKe2DC-6q4A%40mail.gmail.com	2020-10-23 09:30:08 +03:00
Robert Haas	866e24d47d	Extend amcheck to check heap pages. Mark Dilger, reviewed by Peter Geoghegan, Andres Freund, Álvaro Herrera, Michael Paquier, Amul Sul, and by me. Some last-minute cosmetic revisions by me. Discussion: http://postgr.es/m/12ED3DA8-25F0-4B68-937D-D907CFBF08E7@enterprisedb.com	2020-10-22 08:44:18 -04:00
David Rowley	e7c2b95d37	Optimize a few list_delete_ptr calls There is a handful of places where we called list_delete_ptr() to remove some element from a List. In many of these places we know, or with very little additional effort know the index of the ListCell that we need to remove. Here we change all of those places to instead either use one of; list_delete_nth_cell(), foreach_delete_current() or list_delete_last(). Each of these saves from having to iterate over the list to search for the element to remove by its pointer value. There are some small performance gains to be had by doing this, but in the general case, none of these lists are likely to be very large, so the lookup was probably never that expensive anyway. However, some of the calls are in fairly hot code paths, e.g process_equivalence(). So any small gains there are useful. Author: Zhijie Hou and David Rowley Discussion: https://postgr.es/m/b3517353ec7c4f87aa560678fbb1034b@G08CNEXMBPEKD05.g08.fujitsu.local	2020-10-22 14:36:32 +13:00
Peter Eisentraut	8a58347a3c	Fix -Wcast-function-type warnings on Windows/MinGW After `de8feb1f3a`, some warnings remained that were only visible when using GCC on Windows. Fix those as well. Note that the ecpg test source files don't use the full pg_config.h, so we can't use pg_funcptr_t there but have to do it the long way.	2020-10-21 08:17:51 +02:00
Alvaro Herrera	bbb927b4db	Fix ALTER TABLE .. ENABLE/DISABLE TRIGGER recursion More precisely, correctly handle the ONLY flag indicating not to recurse. This was implemented in `86f575948c` by recursing in trigger.c, but that's the wrong place; use ATSimpleRecursion instead, which behaves properly. However, because legacy inheritance has never recursed in that situation, make sure to do that only for new-style partitioning. I noticed this problem while testing a fix for another bug in the vicinity. This has been wrong all along, so backpatch to 11. Discussion: https://postgr.es/m/20201016235925.GA29829@alvherre.pgsql	2020-10-20 19:22:09 -03:00
Amit Kapila	03d51b776d	Change the attribute name in pg_stat_replication_slots view. Change the attribute 'name' to 'slot_name' in pg_stat_replication_slots view to make it clear and that way we will be consistent with the other places like pg_stat_wal_receiver view where we display the same attribute. In the passing, fix the typo in one of the macros in the related code. Bump the catversion as we have modified the name in the catalog as well. Reported-by: Noriyoshi Shinoda Author: Noriyoshi Shinoda Reviewed-by: Sawada Masahiko and Amit Kapila Discussion: https://postgr.es/m/CA+fd4k5_pPAYRTDrO2PbtTOe0eHQpBvuqmCr8ic39uTNmR49Eg@mail.gmail.com	2020-10-20 10:24:36 +05:30
Tom Lane	c8ab970179	Fix list-munging bug that broke SQL function result coercions. Since commit `913bbd88d`, check_sql_fn_retval() can either insert type coercion steps in-line in the Query that produces the SQL function's results, or generate a new top-level Query to perform the coercions, if modifying the Query's output in-place wouldn't be safe. However, it appears that the latter case has never actually worked, because the code tried to inject the new Query back into the query list it was passed ... which is not the list that will be used for later processing when we execute the SQL function "normally" (without inlining it). So we ended up with no coercion happening at run-time, leading to wrong results or crashes depending on the datatypes involved. While the regression tests look like they cover this area well enough, through a huge bit of bad luck all the test cases that exercise the separate-Query path were checking either inline-able cases (which accidentally didn't have the bug) or cases that are no-ops at runtime (e.g., varchar to text), so that the failure to perform the coercion wasn't obvious. The fact that the cases that don't work weren't allowed at all before v13 probably contributed to not noticing the problem sooner, too. To fix, get rid of the separate "flat" list of Query nodes and instead pass the real two-level list that is going to be used later. I chose to make the same change in check_sql_fn_statements(), although that has no actual bug, just so that we don't need that data structure at all. This is an API change, as evidenced by the adjustments needed to callers outside functions.c. That's a bit scary to be doing in a released branch, but so far as I can tell from a quick search, there are no outside callers of these functions (and they are sufficiently specific to our semantics for SQL-language functions that it's not apparent why any extension would need to call them). In any case, v13 already changed the API of check_sql_fn_retval() compared to prior branches. Per report from pinker. Back-patch to v13 where this code came in. Discussion: https://postgr.es/m/1603050466566-0.post@n3.nabble.com	2020-10-19 14:33:09 -04:00
Heikki Linnakangas	fb5883da86	Remove PartitionRoutingInfo struct. The extra indirection neeeded to access its members via its enclosing ResultRelInfo seems pointless. Move all the fields from PartitionRoutingInfo to ResultRelInfo. Author: Amit Langote Reviewed-by: Alvaro Herrera Discussion: https://www.postgresql.org/message-id/CA%2BHiwqFViT47Zbr_ASBejiK7iDG8%3DQ1swQ-tjM6caRPQ67pT%3Dw%40mail.gmail.com	2020-10-19 14:42:55 +03:00
Heikki Linnakangas	6973533650	Revise child-to-root tuple conversion map management. Store the tuple conversion map to convert a tuple from a child table's format to the root format in a new ri_ChildToRootMap field in ResultRelInfo. It is initialized if transition tuple capture for FOR STATEMENT triggers or INSERT tuple routing on a partitioned table is needed. Previously, ModifyTable kept the maps in the per-subplan ModifyTableState->mt_per_subplan_tupconv_maps array, or when tuple routing was used, in ResultRelInfo->ri_Partitioninfo->pi_PartitionToRootMap. The new field replaces both of those. Now that the child-to-root tuple conversion map is always available in ResultRelInfo (when needed), remove the TransitionCaptureState.tcs_map field. The callers of Exec*Trigger() functions no longer need to set or save it, which is much less confusing and bug-prone. Also, as a future optimization, this will allow us to delay creating the map for a given result relation until the relation is actually processed during execution. Author: Amit Langote Discussion: https://www.postgresql.org/message-id/CA%2BHiwqHtCWLdK-LO%3DNEsvOdHx%2B7yv4mE_zYK0i3BH7dXb-wxog%40mail.gmail.com	2020-10-19 14:42:55 +03:00
Heikki Linnakangas	f49b85d783	Clean up code to resolve the "root target relation" in nodeModifyTable.c When executing DDL on a partitioned table or on a table with inheritance children, statement-level triggers must be fired against the table given in the original statement. The code to look that up was a bit messy and duplicative. Commit `501ed02cf6` added a helper function, getASTriggerResultRelInfo() (later renamed to getTargetResultRelInfo()) for it, but for some reason it was only used when firing AFTER STATEMENT triggers and the code to fire BEFORE STATEMENT triggers duplicated the logic. Determine the target relation in ExecInitModifyTable(), and set it always in ModifyTableState. Code that used to call getTargetResultRelInfo() can now use ModifyTableState->rootResultRelInfo directly. Discussion: https://www.postgresql.org/message-id/CA%2BHiwqFViT47Zbr_ASBejiK7iDG8%3DQ1swQ-tjM6caRPQ67pT%3Dw%40mail.gmail.com	2020-10-19 14:42:40 +03:00
Peter Eisentraut	26ec6b5948	Avoid invalid alloc size error in shm_mq In shm_mq_receive(), a huge payload could trigger an unjustified "invalid memory alloc request size" error due to the way the buffer size is increased. Add error checks (documenting the upper limit) and avoid the error by limiting the allocation size to MaxAllocSize. Author: Markus Wanner <markus.wanner@2ndquadrant.com> Discussion: https://www.postgresql.org/message-id/flat/3bb363e7-ac04-0ac4-9fe8-db1148755bfa%402ndquadrant.com	2020-10-19 08:52:25 +02:00
David Rowley	a90c950fc7	Prevent overly large and NaN row estimates in relations Given a query with enough joins, it was possible that the query planner, after multiplying the row estimates with the join selectivity that the estimated number of rows would exceed the limits of the double data type and become infinite. To give an indication on how extreme a case is required to hit this, the particular example case reported required 379 joins to a table without any statistics, which resulted in the 1.0/DEFAULT_NUM_DISTINCT being used for the join selectivity. This eventually caused the row estimates to go infinite and resulted in an assert failure in initial_cost_mergejoin() where the infinite row estimated was multiplied by an outerstartsel of 0.0 resulting in NaN. The failing assert verified that NaN <= Inf, which is false. To get around this we use clamp_row_est() to cap row estimates at a maximum of 1e100. This value is thought to be low enough that costs derived from it would remain within the bounds of what the double type can represent. Aside from fixing the failing Assert, this also has the added benefit of making it so add_path() will still receive proper numerical values as costs which will allow it to make more sane choices when determining the cheaper path in extreme cases such as the one described above. Additionally, we also get rid of the isnan() checks in the join costing functions. The actual case which originally triggered those checks to be added in the first place never made it to the mailing lists. It seems likely that the new code being added to clamp_row_est() will result in those becoming checks redundant, so just remove them. The fairly harmless assert failure problem does also exist in the backbranches, however, a more minimalistic fix will be applied there. Reported-by: Onder Kalaci Reviewed-by: Tom Lane Discussion: https://postgr.es/m/DM6PR21MB1211FF360183BCA901B27F04D80B0@DM6PR21MB1211.namprd21.prod.outlook.com	2020-10-19 10:53:52 +13:00
Andres Freund	fe2a16d8b3	llvmjit: Work around bug in LLVM 3.9 causing crashes after `72559438f9`. Unfortunately in LLVM 3.9 LLVMGetAttributeCountAtIndex(func, index) crashes when called with an index that has 0 attributes. Since there's no way to work around this in the C API, add a small C++ wrapper doing so. The only reason this didn't fail before `72559438f9` is that there always are function attributes... Author: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20201016001254.w2nfj7gd74jmb5in@alap3.anarazel.de Backpatch: 11-, like `72559438f9`	2020-10-15 18:17:00 -07:00
Andres Freund	72559438f9	llvmjit: Also copy parameter / return value attributes from template functions. Previously we only copied the function attributes. That caused problems at least on s390x: Because we didn't copy the 'zeroext' attribute for ExecAggTransReparent()'s *IsNull parameters, expressions invoking it didn't ensure that the upper bytes of the registers were zeroed. In the - relatively rare - cases where not, ExecAggTransReparent() wrongly ended up in the newValueIsNull branch due to the register not being zero. Subsequently causing a crash. It's quite possible that this would cause problems on other platforms, and in other places than just ExecAggTransReparent() on s390x. Thanks to Christoph (and the Debian project) for providing me with access to a s390x machine, allowing me to debug this. Reported-By: Christoph Berg Author: Andres Freund Discussion: https://postgr.es/m/20201015083246.kie5726xerdt3ael@alap3.anarazel.de Backpatch: 11-, where JIT was added	2020-10-15 14:29:53 -07:00
Alvaro Herrera	93f84d59f8	Revert "Remove pointless HeapTupleHeaderIndicatesMovedPartitions calls" This reverts commit `85adb5e91e`. It was not intended for commit just yet.	2020-10-15 15:16:11 -03:00
Alvaro Herrera	85adb5e91e	Remove pointless HeapTupleHeaderIndicatesMovedPartitions calls Pavan Deolasee recently noted that a few of the HeapTupleHeaderIndicatesMovedPartitions calls added by commit `5db6df0c01` are useless, since they are done after comparing t_self with t_ctid. But because t_self can never be set to the magical values that indicate that the tuple moved partition, this can never succeed: if the first test fails (so we know t_self equals t_ctid), necessarily the second test will also fail. So these checks can be removed and no harm is done. Discussion: https://postgr.es/m/20200929164411.GA15497@alvherre.pgsql	2020-10-15 14:32:34 -03:00
Alvaro Herrera	b05fe7b442	Review logical replication tablesync code Most importantly, remove optimization in LogicalRepSyncTableStart that skips the normal walrcv_startstreaming/endstreaming dance. The optimization is not critically important for production uses anyway, since it only fires in cases with no activity, and saves an uninteresting amount of work even then. Critically, it obscures bugs by hiding the interesting code path from test cases. Also: in GetSubscriptionRelState, remove pointless relation open; access pg_subscription_rel->srsubstate with GETSTRUCT as is typical rather than SysCacheGetAttr; remove unused 'missing_ok' argument. In wait_for_relation_state_change, use explicit catalog snapshot invalidation rather than obscurely (and expensively) through GetLatestSnapshot. In various places: sprinkle comments more liberally and rewrite a number of them. Other cosmetic code improvements. No backpatch, since no bug is being fixed here. Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Petr Jelínek <petr.jelinek@2ndquadrant.com> Discussion: https://postgr.es/m/20201010190637.GA5774@alvherre.pgsql	2020-10-15 11:35:51 -03:00
Heikki Linnakangas	c5b097f8fa	Refactor code for cross-partition updates to a separate function. ExecUpdate() is very long, so extract the part of it that deals with cross-partition updates to a separate function to make it more readable. Per Andres Freund's suggestion. Author: Amit Langote Discussion: https://www.postgresql.org/message-id/CA%2BHiwqEUgb5RdUgxR7Sqco4S09jzJstHiaT2vnCRPGR4JCAPqA%40mail.gmail.com	2020-10-15 17:08:10 +03:00
Michael Paquier	86dba33217	Replace calls of htonl()/ntohl() with pg_bswap.h for GSSAPI encryption The in-core equivalents can make use of built-in functions if the compiler supports this option, making optimizations possible. `0ba99c8` replaced all existing calls in the code base at this time, but `b0b39f7` (GSSAPI encryption) has forgotten to do the switch. Discussion: https://postgr.es/m/20201014055303.GG3349@paquier.xyz	2020-10-15 17:03:56 +09:00
David Rowley	110d81728a	Fixup some appendStringInfo and appendPQExpBuffer calls A number of places were using appendStringInfo() when they could have been using appendStringInfoString() instead. While there's no functionality change there, it's just more efficient to use appendStringInfoString() when no formatting is required. Likewise for some appendStringInfoString() calls which were just appending a single char. We can just use appendStringInfoChar() for that. Additionally, many places were using appendPQExpBuffer() when they could have used appendPQExpBufferStr(). Change those too. Patch by Zhijie Hou, but further searching by me found significantly more places that deserved the same treatment. Author: Zhijie Hou, David Rowley Discussion: https://postgr.es/m/cb172cf4361e4c7ba7167429070979d4@G08CNEXMBPEKD05.g08.fujitsu.local	2020-10-15 20:35:17 +13:00
Thomas Munro	70516a178a	Handle EACCES errors from kevent() better. While registering for postmaster exit events, we have to handle a couple of edge cases where the postmaster is already gone. Commit `815c2f09` missed one: EACCES must surely imply that PostmasterPid no longer belongs to our postmaster process (or alternatively an unexpected permissions model has been imposed on us). Like ESRCH, this should be treated as a WL_POSTMASTER_DEATH event, rather than being raised with ereport(). No known problems reported in the wild. Per code review from Tom Lane. Back-patch to 13. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/3624029.1602701929%40sss.pgh.pa.us	2020-10-15 18:34:21 +13:00
Amit Kapila	d7eb52d718	Execute invalidation messages for each XLOG_XACT_INVALIDATIONS message during logical decoding. Prior to commit `c55040ccd0` we have no way of knowing the invalidations before commit. So, while decoding we use to execute all the invalidations at each command end as we had no way of knowing which invalidations happened before that command. Due to this, transactions involving large amounts of DDLs use to take more time and also lead to high CPU usage. But now we know specific invalidations at each command end so we execute only required invalidations. It has been observed that decoding of a transaction containing truncation of a table with 1000 partitions would be finished in 1s whereas before this patch it used to take 4-5 minutes. Author: Dilip Kumar Reviewed-by: Amit Kapila and Keisuke Kuroda Discussion: https://postgr.es/m/CANDwggKYveEtXjXjqHA6RL3AKSHMsQyfRY6bK+NqhAWJyw8psQ@mail.gmail.com	2020-10-15 08:17:51 +05:30
Alvaro Herrera	4e9821b6fa	Restore replication protocol's duplicate command tags I removed the duplicate command tags for START_REPLICATION inadvertently in commit `07082b08cc`, but the replication protocol requires them. The fact that the replication protocol was broken was not noticed because all our test cases use an optimized code path that exits early, failing to verify that the behavior is correct for non-optimized cases. Put them back. Also document this protocol quirk. Add a test case that shows the failure. It might still succeed even without the patch when run on a fast enough server, but it suffices to show the bug in enough cases that it would be noticed in buildfarm. Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Reported-by: Henry Hinze <henry.hinze@gmail.com> Reviewed-by: Petr Jelínek <petr.jelinek@2ndquadrant.com> Discussion: https://postgr.es/m/16643-eaadeb2a1a58d28c@postgresql.org	2020-10-14 20:12:26 -03:00
Thomas Munro	b94109ce37	Make WL_POSTMASTER_DEATH level-triggered on kqueue builds. If WaitEventSetWait() reports that the postmaster has gone away, later calls to WaitEventSetWait() should continue to report that. Otherwise further waits that occur in the proc_exit() path after we already noticed the postmaster's demise could block forever. Back-patch to 13, where the kqueue support landed. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/3624029.1602701929%40sss.pgh.pa.us	2020-10-15 11:41:58 +13:00
Heikki Linnakangas	a04daa97a4	Remove es_result_relation_info from EState. Maintaining 'es_result_relation_info' correctly at all times has become cumbersome, especially with partitioning where each partition gets its own result relation info. Having to set and reset it across arbitrary operations has caused bugs in the past. This changes all the places that used 'es_result_relation_info', to receive the currently active ResultRelInfo via function parameters instead. Author: Amit Langote Discussion: https://www.postgresql.org/message-id/CA%2BHiwqGEmiib8FLiHMhKB%2BCH5dRgHSLc5N5wnvc4kym%2BZYpQEQ%40mail.gmail.com	2020-10-14 11:41:40 +03:00
Heikki Linnakangas	178f2d560d	Include result relation info in direct modify ForeignScan nodes. FDWs that can perform an UPDATE/DELETE remotely using the "direct modify" set of APIs need to access the ResultRelInfo of the target table. That's currently available in EState.es_result_relation_info, but the next commit will remove that field. This commit adds a new resultRelation field in ForeignScan, to store the target relation's RT index, and the corresponding ResultRelInfo in ForeignScanState. The FDW's PlanDirectModify callback is expected to set 'resultRelation' along with 'operation'. The core code doesn't need them for anything, they are for the convenience of FDW's Begin- and IterateDirectModify callbacks. Authors: Amit Langote, Etsuro Fujita Discussion: https://www.postgresql.org/message-id/CA%2BHiwqGEmiib8FLiHMhKB%2BCH5dRgHSLc5N5wnvc4kym%2BZYpQEQ%40mail.gmail.com	2020-10-14 10:58:38 +03:00
Peter Eisentraut	4e118fc33e	Correct error message Apparently copy-and-pasted from nearby.	2020-10-14 07:54:14 +02:00
Heikki Linnakangas	1375422c78	Create ResultRelInfos later in InitPlan, index them by RT index. Instead of allocating all the ResultRelInfos upfront in one big array, allocate them in ExecInitModifyTable(). es_result_relations is now an array of ResultRelInfo pointers, rather than an array of structs, and it is indexed by the RT index. This simplifies things: we get rid of the separate concept of a "result rel index", and don't need to set it in setrefs.c anymore. This also allows follow-up optimizations (not included in this commit yet) to skip initializing ResultRelInfos for target relations that were not needed at runtime, and removal of the es_result_relation_info pointer. The EState arrays of regular result rels and root result rels are merged into one array. Similarly, the resultRelations and rootResultRelations lists in PlannedStmt are merged into one. It's not actually clear to me why they were kept separate in the first place, but now that the es_result_relations array is indexed by RT index, it certainly seems pointless. The PlannedStmt->resultRelations list is now only needed for ExecRelationIsTargetRelation(). One visible effect of this change is that ExecRelationIsTargetRelation() will now return 'true' also for the partition root, if a partitioned table is updated. That seems like a good thing, although the function isn't used in core code, and I don't see any reason for an FDW to call it on a partition root. Author: Amit Langote Discussion: https://www.postgresql.org/message-id/CA%2BHiwqGEmiib8FLiHMhKB%2BCH5dRgHSLc5N5wnvc4kym%2BZYpQEQ%40mail.gmail.com	2020-10-13 12:57:02 +03:00
Tom Lane	371668a838	Fix GiST buffering build to work when there are included columns. gistRelocateBuildBuffersOnSplit did not get the memo about which attribute count to use. This could lead to a crash if there were included columns and buffering build was chosen. (Because there are random page-split decisions elsewhere in GiST index build, the crashes are not entirely deterministic.) Back-patch to v12 where GiST gained support for included columns. Pavel Borisov Discussion: https://postgr.es/m/CALT9ZEECCV5m7wvxg46PC-7x-EybUmnpupBGhSFMoAAay+r6HQ@mail.gmail.com	2020-10-12 18:01:34 -04:00
Tom Lane	78c0b6ed27	Re-allow testing of GiST buffered builds. Commit `16fa9b2b3` broke the ability to reliably test GiST buffered builds, because it caused sorted builds to be done instead if sortsupport is available, regardless of any attempt to override that. While a would-be test case could try to work around that by choosing an opclass that has no sortsupport function, coverage would be silently lost the moment someone decides it'd be a good idea to add a sortsupport function. Hence, rearrange the logic in gistbuild() so that if "buffering = on" is specified in CREATE INDEX, we will use that method, sortsupport or no. Also document the interaction between sorting and the buffering parameter, as `16fa9b2b3` failed to do. (Note that in fact we still lack any test coverage of buffered builds, but this is a prerequisite to adding a non-fragile test.) Discussion: https://postgr.es/m/3249980.1602532990@sss.pgh.pa.us	2020-10-12 17:09:50 -04:00
Tom Lane	397ea901e8	Fix memory leak when guc.c decides a setting can't be applied now. The prohibitValueChange code paths in set_config_option(), which are executed whenever we re-read a PGC_POSTMASTER variable from postgresql.conf, neglected to free anything before exiting. Thus we'd leak the proposed new value of a PGC_STRING variable, as noted by BoChen in bug #16666. For all variable types, if the check hook creates an "extra" chunk, we'd also leak that. These are malloc not palloc chunks, so there is no mechanism for recovering the leaks before process exit. Fortunately, the values are typically not very large, meaning you'd have to go through an awful lot of SIGHUP configuration-reload cycles to make the leakage amount to anything. Still, for a long-lived postmaster process it could potentially be a problem. Oversight in commit `2594cf0e8`. Back-patch to all supported branches. Discussion: https://postgr.es/m/16666-2c41a4eec61b03e1@postgresql.org	2020-10-12 13:31:24 -04:00
Thomas Munro	f0f13a3a08	Fix estimates for ModifyTable paths without RETURNING. In the past, we always estimated that a ModifyTable node would emit the same number of rows as its subpaths. Without a RETURNING clause, the correct estimate is zero. Fix, in preparation for a proposed parallel write patch that is sensitive to that number. A remaining problem is that for RETURNING queries, the estimated width is based on subpath output rather than the RETURNING tlist. Reviewed-by: Greg Nancarrow <gregn4422@gmail.com> Discussion: https://postgr.es/m/CAJcOf-cXnB5cnMKqWEp2E2z7Mvcd04iLVmV%3DqpFJrR3AcrTS3g%40mail.gmail.com	2020-10-13 00:26:49 +13:00
Tom Lane	fe27009cbb	Recognize network-failure errnos as indicating hard connection loss. Up to now, only ECONNRESET (and EPIPE, in most but not quite all places) received special treatment in our error handling logic. This patch changes things so that related error codes such as ECONNABORTED are also recognized as indicating that the connection's dead and unlikely to come back. We continue to think, however, that only ECONNRESET and EPIPE should be reported as probable server crashes; the other cases indicate network connectivity problems but prove little about the server's state. Thus, there's no change in the error message texts that are output for such cases. The key practical effect is that errcode_for_socket_access() will report ERRCODE_CONNECTION_FAILURE rather than ERRCODE_INTERNAL_ERROR for a network failure. It's expected that this will fix buildfarm member lorikeet's failures since commit `32a9c0bdf`, as that seems to be due to not treating ECONNABORTED equivalently to ECONNRESET. The set of errnos treated this way now includes ECONNABORTED, EHOSTDOWN, EHOSTUNREACH, ENETDOWN, ENETRESET, and ENETUNREACH. Several of these were second-class citizens in terms of their handling in places like get_errno_symbol(), so upgrade the infrastructure where necessary. As committed, this patch assumes that all these symbols are defined everywhere. POSIX specifies all of them except EHOSTDOWN, but that seems to exist on all platforms of interest; we'll see what the buildfarm says about that. Probably this should be back-patched, but let's see what the buildfarm thinks of it first. Fujii Masao and Tom Lane Discussion: https://postgr.es/m/2621622.1602184554@sss.pgh.pa.us	2020-10-10 13:28:12 -04:00
Amit Kapila	f13f2e4841	Fix typos in logical.c and reorderbuffer.c. Reviewed-by: Sawada Masahiko Discussion: https://postgr.es/m/CAA4eK1K6zTpuqf_d7wXCBjo_EF0_B6Fz3Ecp71Vq18t=wG-nzg@mail.gmail.com	2020-10-09 08:16:43 +05:30
Tom Lane	7538708394	Avoid gratuitous inaccuracy in numeric width_bucket(). Multiply before dividing, not the reverse, so that cases that should produce exact results do produce exact results. (width_bucket_float8 got this right already.) Even when the result is inexact, this avoids making it more inexact, since only the division step introduces any imprecision. While at it, fix compute_bucket() to not uselessly repeat the sign check already done by its caller, and avoid duplicating the multiply/divide steps by adjusting variable usage. Per complaint from Martin Visser. Although this seems like a bug fix, I'm hesitant to risk changing width_bucket()'s results in stable branches, so no back-patch. Discussion: https://postgr.es/m/6FA5117D-6AED-4656-8FEF-B74AC18FAD85@brytlyt.com	2020-10-08 13:06:27 -04:00
Tom Lane	8ce423b191	Fix numeric width_bucket() to allow its first argument to be infinite. While the calculation is not well-defined if the bounds arguments are infinite, there is a perfectly sane outcome if the test operand is infinite: it's just like any other value that's before the first bucket or after the last one. width_bucket_float8() got this right, but I was too hasty about the case when adding infinities to numerics (commit `a57d312a7`), so that width_bucket_numeric() just rejected it. Fix that, and sync the relevant error message strings. No back-patch needed, since infinities-in-numeric haven't shipped yet. Discussion: https://postgr.es/m/2465409.1602170063@sss.pgh.pa.us	2020-10-08 12:37:59 -04:00
Michael Paquier	b90b79e140	Fix typo in multixact.c AtEOXact_MultiXact() was referenced in two places with an incorrect routine name. Author: Hou Zhijie Discussion: https://postgr.es/m/1b41e9311e8f474cb5a360292f0b3cb1@G08CNEXMBPEKD05.g08.fujitsu.local	2020-10-08 14:06:12 +09:00
Amit Kapila	9868167500	Track statistics for spilling of changes from ReorderBuffer. This adds the statistics about transactions spilled to disk from ReorderBuffer. Users can query the pg_stat_replication_slots view to check these stats and call pg_stat_reset_replication_slot to reset the stats of a particular slot. Users can pass NULL in pg_stat_reset_replication_slot to reset stats of all the slots. This commit extends the statistics collector to track this information about slots. Author: Sawada Masahiko and Amit Kapila Reviewed-by: Amit Kapila and Dilip Kumar Discussion: https://postgr.es/m/CA+fd4k5_pPAYRTDrO2PbtTOe0eHQpBvuqmCr8ic39uTNmR49Eg@mail.gmail.com	2020-10-08 09:09:08 +05:30
Tom Lane	8d2a01ae12	Fix optimization hazard in gram.y's makeOrderedSetArgs(), redux. It appears that commit `cf63c641c`, which intended to prevent misoptimization of the result-building step in makeOrderedSetArgs, didn't go far enough: buildfarm member hornet's version of xlc is now optimizing back to the old, broken behavior in which list_length(directargs) is fetched only after list_concat() has changed that value. I'm not entirely convinced whether that's an undeniable compiler bug or whether it can be justified by a sufficiently aggressive interpretation of C sequence points. So let's just change the code to make it harder to misinterpret. Back-patch to all supported versions, just in case. Discussion: https://postgr.es/m/1830491.1601944935@sss.pgh.pa.us	2020-10-07 18:41:39 -04:00
Tom Lane	3db322eaab	Prevent internal overflows in date-vs-timestamp and related comparisons. The date-vs-timestamp, date-vs-timestamptz, and timestamp-vs-timestamptz comparators all worked by promoting the first type to the second and then doing a simple same-type comparison. This works fine, except when the conversion result is out of range, in which case we throw an entirely avoidable error. The sources of such failures are (a) type date can represent dates much farther in the future than the timestamp types can; (b) timezone rotation might cause a just-in-range timestamp value to become a just-out-of-range timestamptz value. Up to now we just ignored these corner-case issues, but now we have an actual user complaint (bug #16657 from Huss EL-Sheikh), so let's do something about it. It turns out that commit `52ad1e659` already built all the necessary infrastructure to support error-free comparisons, but neglected to actually use it in the main-line code paths. Fix that, do a little bit of code style review, and remove the now-duplicate logic in jsonpath_exec.c. Back-patch to v13 where `52ad1e659` came in. We could take this back further by back-patching said infrastructure, but given the small number of complaints so far, I don't feel a great need to. Discussion: https://postgr.es/m/16657-cde2f876d8cc7971@postgresql.org	2020-10-07 17:10:26 -04:00
Amit Kapila	f07707099c	Display the names of missing columns in error during logical replication. In logical replication when a subscriber is missing some columns, it currently emits an error message that says "some" columns are missing, but it doesn't specify the missing column names. Change that to display missing column names which makes an error to be more informative to the user. We have decided not to backpatch this commit as this is a minor usability improvement and no user has reported this. Reported-by: Bharath Rupireddy Author: Bharath Rupireddy Reviewed-by: Kyotaro Horiguchi and Amit Kapila Discussion: https://postgr.es/m/CALj2ACVkW-EXH_4pmBK8tNeHRz5ksUC4WddGactuCjPiBch-cg@mail.gmail.com	2020-10-07 08:14:19 +05:30
Tom Lane	d7885b1f87	Build EC members for child join rels in the right memory context. This patch prevents crashes or wrong plans when partition-wise joins are considered during GEQO planning, as a consequence of the EquivalenceClass data structures becoming corrupt after a GEQO context reset. A remaining problem is that successive GEQO cycles will make multiple copies of the required EC members, since add_child_join_rel_equivalences has no idea that such members might exist already. For now we'll just live with that. The lack of field complaints of crashes suggests that this is a mighty little-used situation. Back-patch to v12 where this code was introduced. Discussion: https://postgr.es/m/1683100.1601860653@sss.pgh.pa.us	2020-10-06 11:43:53 -04:00
Michael Paquier	0a3c864c32	Fix compilation warning in xlog.c Oversight in 9d0bd95. Reported-by: Andres Freund Discussion: https://postgr.es/m/20201006023802.qqfi6m5bw5y77zql@alap3.anarazel.de	2020-10-06 15:29:34 +09:00
Bruce Momjian	253f1025da	Overhaul pg_hba.conf clientcert's API Since PG 12, clientcert no longer supported only on/off, so remove 1/0 as possible values, and instead support only the text strings 'verify-ca' and 'verify-full'. Remove support for 'no-verify' since that is possible by just not specifying clientcert. Also, throw an error if 'verify-ca' is used and 'cert' authentication is used, since cert authentication requires verify-full. Also improve the docs. THIS IS A BACKWARD INCOMPATIBLE API CHANGE. Reported-by: Kyotaro Horiguchi Discussion: https://postgr.es/m/20200716.093012.1627751694396009053.horikyota.ntt@gmail.com Author: Kyotaro Horiguchi Backpatch-through: master	2020-10-05 15:48:50 -04:00
Tom Lane	18c170a08e	Include the process PID in assertion-failure messages. This should help to identify what happened when studying the postmaster log after-the-fact. While here, clean up some old comments in the same function. Discussion: https://postgr.es/m/1568983.1601845687@sss.pgh.pa.us	2020-10-05 13:40:28 -04:00
Tom Lane	53c6daff43	Fix two latent(?) bugs in equivclass.c. get_eclass_for_sort_expr() computes expr_relids and nullable_relids early on, even though they won't be needed unless we make a new EquivalenceClass, which we often don't. Aside from the probably-minor inefficiency, there's a memory management problem: these bitmapsets will be built in the caller's context, leading to dangling pointers if that is shorter-lived than root->planner_cxt. This would be a live bug if get_eclass_for_sort_expr() could be called with create_it = true during GEQO join planning. So far as I can find, the core code never does that, but it's hard to be sure that no extensions do, especially since the comments make it clear that that's supposed to be a supported case. Fix by not computing these values until we've switched into planner_cxt to build the new EquivalenceClass. generate_join_implied_equalities() uses inner_rel->relids to look up relevant eclasses, but it ought to be using nominal_inner_relids. This is presently harmless because a child RelOptInfo will always have exactly the same eclass_indexes as its topmost parent; but that might not be true forever, and anyway it makes the code confusing. The first of these is old (introduced by me in `f3b3b8d5b`), so back-patch to all supported branches. The second only dates to v13, but we might as well back-patch it to keep the code looking similar across branches. Discussion: https://postgr.es/m/1508010.1601832581@sss.pgh.pa.us	2020-10-05 13:15:39 -04:00
Peter Eisentraut	2453ea1422	Support for OUT parameters in procedures Unlike for functions, OUT parameters for procedures are part of the signature. Therefore, they have to be listed in pg_proc.proargtypes as well as mentioned in ALTER PROCEDURE and DROP PROCEDURE. Reviewed-by: Andrew Dunstan <andrew.dunstan@2ndquadrant.com> Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/2b8490fe-51af-e671-c504-47359dc453c5@2ndquadrant.com	2020-10-05 09:21:43 +02:00
Michael Paquier	10c5291cc2	Fix handling of redundant options with COPY for "freeze" and "header" The handling of those options was inconsistent, as the processing used directly the value assigned to the option to check if it was redundant, leading to patterns like this one to succeed (note that false is specified first): COPY hoge to '/path/to/file/' (header off, header on); And the opposite would fail correctly (note that true is first here): COPY hoge to '/path/to/file/' (header on, header off); While on it, add some tests to check for all redundant patterns with the options of COPY. I have gone through the code and did not notice similar mistakes for other commands. "header" got it wrong since `b63990c`, and "freeze" was wrong from the start as of `8de72b6`. No backpatch is done per the lack of complaints. Reported-by: Rémi Lapeyre Discussion: https://postgr.es/m/20200929072433.GA15570@paquier.xyz Discussion: https://postgr.es/m/0B55BD07-83E4-439F-AACC-FA2D7CF50532@lenstra.fr	2020-10-05 09:43:17 +09:00
Tom Lane	97b6144826	Make postgres.bki use the same literal-string syntax as postgresql.conf. The BKI file's string quoting conventions were previously quite weird, perhaps as a result of repurposing a function built to scan single-quoted strings to scan double-quoted ones. Change to use the same rules as we use in GUC files, allowing some simplifications in genbki.pl and initdb.c. While at it, completely remove the backend's scanstr() function, which was essentially a duplicate of the string dequoting code in guc-file.l. Instead export that one (under a less generic name than it had) and let bootscanner.l use it. Now we can clarify that scansup.c exists only to support the main lexer. We could alternatively have removed GUC_scanstr, but this way seems better since the previous arrangement could mislead a reader into thinking that scanstr() had something to do with the main lexer's handling of string literals. Maybe it did once, but if so it was a long time ago. This patch does not bump catversion, since the initially-installed catalog contents don't change. Note however that successful initdb after applying this patch will require up-to-date postgres.bki as well as postgres and initdb executables. In passing, remove a bunch of very-long-obsolete #include's in bootparse.y and bootscanner.l. John Naylor Discussion: https://postgr.es/m/CACPNZCtDpd18T0KATTmCggO2GdVC4ow86ypiq5ENff1VnauL8g@mail.gmail.com	2020-10-04 16:09:55 -04:00
Fujii Masao	8d9a935965	Add pg_stat_wal statistics view. This view shows the statistics about WAL activity. Currently it has only two columns: wal_buffers_full and stats_reset. wal_buffers_full column indicates the number of times WAL data was written to the disk because WAL buffers got full. This information is useful when tuning wal_buffers. stats_reset column indicates the time at which these statistics were last reset. pg_stat_wal view is also the basic infrastructure to expose other various statistics about WAL activity later. Bump PGSTAT_FILE_FORMAT_ID due to the change in pgstat format. Bump catalog version. Author: Masahiro Ikeda Reviewed-by: Takayuki Tsunakawa, Kyotaro Horiguchi, Amit Kapila, Fujii Masao Discussion: https://postgr.es/m/188bd3f2d2233cf97753b5ced02bb050@oss.nttdata.com	2020-10-02 10:17:11 +09:00
Michael Paquier	9d0bd95fa9	Add block information in error context of WAL REDO apply loop Providing this information can be useful for example when diagnosing problems related to recovery conflicts or for recovery issues without having to go through the output generated by pg_waldump to get some information about the blocks a WAL record works on. The block information is printed in the same format as pg_waldump. This already existed in xlog.c for debugging purposes with -DWAL_DEBUG, so adding the block information in the callback has required just a small refactoring. Author: Bertrand Drouvot Reviewed-by: Michael Paquier, Masahiko Sawada Discussion: https://postgr.es/m/c31e2cba-efda-762c-f4ad-5c25e5dac3d0@amazon.com	2020-10-02 09:31:50 +09:00
Heikki Linnakangas	265ea56785	Set right-links during sorted GiST index build. This is not strictly necessary, as the right-links are only needed by scans that are concurrent with page splits, and neither scans or page splits can happen during sorted index build. But it seems like a good idea to set them anyway, if we e.g. want to add a check to amcheck in the future to verify that the chain of right-links is complete. Author: Andrey Borodin Discussion: https://www.postgresql.org/message-id/4D68C21F-9FB9-41DA-B663-FDFC8D143788%40yandex-team.ru	2020-10-01 11:10:43 +03:00
Andres Freund	7b28913bca	Fix and test snapshot behavior on standby. I (Andres) broke this in 623a9CA79bx, because I didn't think about the way snapshots are built on standbys sufficiently. Unfortunately our existing tests did not catch this, as they are all just querying with psql (therefore ending up with fresh snapshots). The fix is trivial, we just need to increment the transaction completion counter in ExpireTreeKnownAssignedTransactionIds(), which is the equivalent of ProcArrayEndTransaction() during recovery. This commit also adds a new test doing some basic testing of the correctness of snapshots built on standbys. To avoid the aforementioned issue of one-shot psql's not exercising the snapshot caching, the test uses a long lived psqls, similar to 013_crash_restart.pl. It'd be good to extend the test further. Reported-By: Ian Barwick <ian.barwick@2ndquadrant.com> Author: Andres Freund <andres@anarazel.de> Author: Ian Barwick <ian.barwick@2ndquadrant.com> Discussion: https://postgr.es/m/61291ffe-d611-f889-68b5-c298da9fb18f@2ndquadrant.com	2020-09-30 17:28:51 -07:00
Alvaro Herrera	9fc2122712	Reword partitioning error message The error message about columns in the primary key not including all of the partition key was unclear; reword it. Backpatch all the way to pg11, where it appeared. Reported-by: Nagaraj Raj <nagaraj.sf@yahoo.com> Discussion: https://postgr.es/m/64062533.78364.1601415362244@mail.yahoo.com	2020-09-30 18:25:23 -03:00
Tom Lane	489c9c3407	Fix handling of BC years in to_date/to_timestamp. Previously, a conversion such as to_date('-44-02-01','YYYY-MM-DD') would result in '0045-02-01 BC', as the code attempted to interpret the negative year as BC, but failed to apply the correction needed for our internal handling of BC years. Fix the off-by-one problem. Also, arrange for the combination of a negative year and an explicit "BC" marker to cancel out and produce AD. This is how the negative-century case works, so it seems sane to do likewise. Continue to read "year 0000" as 1 BC. Oracle would throw an error, but we've accepted that case for a long time so I'm hesitant to change it in a back-patch. Per bug #16419 from Saeed Hubaishan. Back-patch to all supported branches. Dar Alathar-Yemen and Tom Lane Discussion: https://postgr.es/m/16419-d8d9db0a7553f01b@postgresql.org	2020-09-30 15:40:23 -04:00
Tom Lane	a094c8ff53	Fix make_timestamp[tz] to accept negative years as meaning BC. Previously we threw an error. But make_date already allowed the case, so it is inconsistent as well as unhelpful for make_timestamp not to. Both functions continue to reject year zero. Code and test fixes by Peter Eisentraut, doc changes by me Discussion: https://postgr.es/m/13c0992c-f15a-a0ca-d839-91d3efd965d9@2ndquadrant.com	2020-09-29 13:48:06 -04:00
Alexander Korotkov	927d9abb65	Support for ISO 8601 in the jsonpath .datetime() method The SQL standard doesn't require jsonpath .datetime() method to support the ISO 8601 format. But our to_json[b]() functions convert timestamps to text in the ISO 8601 format in the sake of compatibility with javascript. So, we add support of the ISO 8601 to the jsonpath .datetime() in the sake compatibility with to_json[b](). The standard mode of datetime parsing currently supports just template patterns and separators in the format string. In order to implement ISO 8601, we have to add support of the format string double quotes to the standard parsing mode. Discussion: https://postgr.es/m/94321be0-cc96-1a81-b6df-796f437f7c66%40postgrespro.ru Author: Nikita Glukhov, revised by me Backpatch-through: 13	2020-09-29 12:00:04 +03:00
Alexander Korotkov	c2aa562ea5	Remove excess space from jsonpath .datetime() default format string `bffe1bd684` has introduced jsonpath .datetime() method, but default formats for time and timestamp contain excess space between time and timezone. This commit removes this excess space making behavior of .datetime() method standard-compliant. Discussion: https://postgr.es/m/94321be0-cc96-1a81-b6df-796f437f7c66%40postgrespro.ru Author: Nikita Glukhov Backpatch-through: 13	2020-09-29 11:00:22 +03:00
Fujii Masao	fd26f78231	Archive timeline history files in standby if archive_mode is set to "always". Previously the standby server didn't archive timeline history files streamed from the primary even when archive_mode is set to "always", while it archives the streamed WAL files. This could cause the PITR to fail because there was no required timeline history file in the archive. The cause of this issue was that walreceiver didn't mark those files as ready for archiving. This commit makes walreceiver mark those streamed timeline history files as ready for archiving if archive_mode=always. Then the archiver process archives the marked timeline history files. Back-patch to all supported versions. Reported-by: Grigory Smolkin Author: Grigory Smolkin, Fujii Masao Reviewed-by: David Zhang, Anastasia Lubennikova Discussion: https://postgr.es/m/54b059d4-2b48-13a4-6f43-95a087c92367@postgrespro.ru	2020-09-29 16:21:46 +09:00
Michael Paquier	e66bcfb4c6	Fix progress reporting of REINDEX CONCURRENTLY This addresses a couple of issues with the so-said subject: - Report the correct parent relation with the index actually being rebuilt or validated. Previously, the command status remained set to the last index created for the progress of the index build and validation, which would be incorrect when working on a table that has more than one index. - Use the correct phase when waiting before the drop of the old indexes. Previously, this was reported with the same status as when waiting before the old indexes are marked as dead. Author: Matthias van de Meent, Michael Paquier Discussion: https://postgr.es/m/CAEze2WhqFgcwe1_tv=sFYhLWV2AdpfukumotJ6JNcAOQs3jufg@mail.gmail.com Backpatch-through: 12	2020-09-29 14:15:57 +09:00
Tom Lane	56fe008996	Add for_each_from, to simplify loops starting from non-first list cells. We have a dozen or so places that need to iterate over all but the first cell of a List. Prior to v13 this was typically written as for_each_cell(lc, lnext(list_head(list))) Commit `1cff1b95a` changed these to for_each_cell(lc, list, list_second_cell(list)) This patch introduces a new macro for_each_from() which expresses the start point as a list index, allowing these to be written as for_each_from(lc, list, 1) This is marginally more efficient, since ForEachState.i can be initialized directly instead of backing into it from a ListCell address. It also seems clearer and less typo-prone. Some of the remaining uses of for_each_cell() look like they could profitably be changed to for_each_from(), but here I confined myself to changing uses of list_second_cell(). Also, fix for_each_cell_setup() and for_both_cell_setup() to const-ify their arguments; that's a simple oversight in `1cff1b95a`. Back-patch into v13, on the grounds that (1) the const-ification is a minor bug fix, and (2) it's better for back-patching purposes if we only have two ways to write these loops rather than three. In HEAD, also remove list_third_cell() and list_fourth_cell(), which were also introduced in `1cff1b95a`, and are unused as of `cc99baa43`. It seems unlikely that any third-party code would have started to use them already; anyone who has can be directed to list_nth_cell instead. Discussion: https://postgr.es/m/CAApHDvpo1zj9KhEpU2cCRZfSM3Q6XGdhzuAS2v79PH7WJBkYVA@mail.gmail.com	2020-09-28 20:33:13 -04:00
Tom Lane	72647ac3bf	Assign collations in partition bound expressions. Failure to do this can result in errors during evaluation of the bound expression, as illustrated by the new regression test. Back-patch to v12 where the ability for partition bounds to be expressions was added. Discussion: https://postgr.es/m/CAJV4CdrZ5mKuaEsRSbLf2URQ3h6iMtKD=hik8MaF5WwdmC9uZw@mail.gmail.com	2020-09-28 14:12:38 -04:00
Tom Lane	2dfa3fea88	Remove complaints about COLLATE clauses in partition bound values. transformPartitionBoundValue went out of its way to do the wrong thing: there is no reason to complain about a non-matching COLLATE clause in a partition boundary expression. We're coercing the bound expression to the target column type as though by an implicit assignment, and the rules for implicit assignment say that collations can be implicitly converted. What we do need to do, and the code is not doing, is apply assign_expr_collations() to the bound expression. While this is merely a definition disagreement, that is a bug that needs to be back-patched, so I'll commit it separately. Discussion: https://postgr.es/m/CAJV4CdrZ5mKuaEsRSbLf2URQ3h6iMtKD=hik8MaF5WwdmC9uZw@mail.gmail.com	2020-09-28 13:44:01 -04:00
Tom Lane	0a87ddff5c	Cache the result of converting now() to a struct pg_tm. SQL operations such as CURRENT_DATE, CURRENT_TIME, LOCALTIME, and conversion of "now" in a datetime input string have to obtain the transaction start timestamp ("now()") as a broken-down struct pg_tm. This is a remarkably expensive conversion, and since now() does not change intra-transaction, it doesn't really need to be done more than once per transaction. Introducing a simple cache provides visible speedups in queries that compute these values many times, for example insertion of many rows that use a default value of CURRENT_DATE. Peter Smith, with a bit of kibitzing by me Discussion: https://postgr.es/m/CAHut+Pu89TWjq530V2gY5O6SWi=OEJMQ_VHMt8bdZB_9JFna5A@mail.gmail.com	2020-09-28 12:05:03 -04:00
Tom Lane	9d299a4924	Minor mop-up for List improvements. Fix a few places that were using written-out versions of the pg_list.h macros that commit `cc99baa43` just improved, making them also use those macros so as to gain whatever performance improvement is to be had. Discussion: https://postgr.es/m/CAApHDvpo1zj9KhEpU2cCRZfSM3Q6XGdhzuAS2v79PH7WJBkYVA@mail.gmail.com	2020-09-27 22:30:52 -04:00
Tom Lane	41efb83408	Move resolution of AlternativeSubPlan choices to the planner. When commit `bd3daddaf` introduced AlternativeSubPlans, I had some ambitions towards allowing the choice of subplan to change during execution. That has not happened, or even been thought about, in the ensuing twelve years; so it seems like a failed experiment. So let's rip that out and resolve the choice of subplan at the end of planning (in setrefs.c) rather than during executor startup. This has a number of positive benefits: * Removal of a few hundred lines of executor code, since AlternativeSubPlans need no longer be supported there. * Removal of executor-startup overhead (particularly, initialization of subplans that won't be used). * Removal of incidental costs of having a larger plan tree, such as tree-scanning and copying costs in the plancache; not to mention setrefs.c's own costs of processing the discarded subplans. * EXPLAIN no longer has to print a weird (and undocumented) representation of an AlternativeSubPlan choice; it sees only the subplan actually used. This should mean less confusion for users. * Since setrefs.c knows which subexpression of a plan node it's working on at any instant, it's possible to adjust the estimated number of executions of the subplan based on that. For example, we should usually estimate more executions of a qual expression than a targetlist expression. The implementation used here is pretty simplistic, because we don't want to expend a lot of cycles on the issue; but it's better than ignoring the point entirely, as the executor had to. That last point might possibly result in shifting the choice between hashed and non-hashed EXISTS subplans in a few cases, but in general this patch isn't meant to change planner choices. Since we're doing the resolution so late, it's really impossible to change any plan choices outside the AlternativeSubPlan itself. Patch by me; thanks to David Rowley for review. Discussion: https://postgr.es/m/1992952.1592785225@sss.pgh.pa.us	2020-09-27 12:51:28 -04:00
Tom Lane	e55f718fc4	Revise RelationBuildRowSecurity() to avoid memory leaks. This function leaked some memory while loading qual clauses for an RLS policy. While ordinarily negligible, that could build up in some repeated-reload cases, as reported by Konstantin Knizhnik. We can improve matters by borrowing the coding long used in RelationBuildRuleLock: build stringToNode's result directly in the target context, and remember to explicitly pfree the input string. This patch by no means completely guarantees zero leaks within this function, since we have no real guarantee that the catalog- reading subroutines it calls don't leak anything. However, practical tests suggest that this is enough to resolve the issue. In any case, any remaining leaks are similar to those risked by RelationBuildRuleLock and other relcache-loading subroutines. If we need to fix them, we should adopt a more global approach such as that used by the RECOVER_RELATION_BUILD_MEMORY hack. While here, let's remove the need for an expensive PG_TRY block by using MemoryContextSetParent to reparent an initially-short-lived context for the RLS data. Back-patch to all supported branches. Discussion: https://postgr.es/m/21356c12-8917-8249-b35f-1c447231922b@postgrespro.ru	2020-09-26 16:04:06 -04:00
Amit Kapila	079d0cacf4	Fix the logical replication from HEAD to lower versions. Commit `464824323e` changed the logical replication protocol to allow the streaming of in-progress transactions and used the new version of protocol irrespective of the server version. Use the appropriate version of the protocol based on the server version. Reported-by: Ashutosh Sharma Author: Dilip Kumar Reviewed-by: Ashutosh Sharma and Amit Kapila Discussion: https://postgr.es/m/CAE9k0P=9OpXcNrcU5Gsvd5MZ8GFpiN833vNHzX6Uc=8+h1ft1Q@mail.gmail.com	2020-09-26 10:13:51 +05:30
Thomas Munro	dee663f784	Defer flushing of SLRU files. Previously, we called fsync() after writing out individual pg_xact, pg_multixact and pg_commit_ts pages due to cache pressure, leading to regular I/O stalls in user backends and recovery. Collapse requests for the same file into a single system call as part of the next checkpoint, as we already did for relation files, using the infrastructure developed by commit `3eb77eba`. This can cause a significant improvement to recovery performance, especially when it's otherwise CPU-bound. Hoist ProcessSyncRequests() up into CheckPointGuts() to make it clearer that it applies to all the SLRU mini-buffer-pools as well as the main buffer pool. Rearrange things so that data collected in CheckpointStats includes SLRU activity. Also remove the Shutdown{CLOG,CommitTS,SUBTRANS,MultiXact}() functions, because they were redundant after the shutdown checkpoint that immediately precedes them. (I'm not sure if they were ever needed, but they aren't now.) Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> (parts) Tested-by: Jakub Wartak <Jakub.Wartak@tomtom.com> Discussion: https://postgr.es/m/CA+hUKGLJ=84YT+NvhkEEDAuUtVHMfQ9i-N7k_o50JmQ6Rpj_OQ@mail.gmail.com	2020-09-25 19:00:15 +12:00
Robert Haas	55b7e2f4d7	Fix two bugs in MaintainOldSnapshotTimeMapping. The previous coding was confused about whether head_timestamp was intended to represent the timestamp for the newest bucket in the mapping or the oldest timestamp for the oldest bucket in the mapping. Decide that it's intended to be the oldest one, and repair accordingly. To do that, we need to do two things. First, when advancing to a new bucket, don't categorically set head_timestamp to the new timestamp. Do this only if we're blowing out the map completely because a lot of time has passed since we last maintained it. If we're replacing entries one by one, advance head_timestamp by 1 minute for each; if we're filling in unused entries, don't advance head_timestamp at all. Second, fix the computation of how many buckets we need to advance. The previous formula would be correct if head_timestamp were the timestamp for the new bucket, but we're now making all the code agree that it's the timestamp for the oldest bucket, so adjust the formula accordingly. This is certainly a bug fix, but I don't feel good about back-patching it without the introspection tools added by commit `aecf5ee2bb`, and perhaps also some actual tests. Since back-patching the introspection tools might not attract sufficient support and since there are no automated tests of these fixes yet, I'm just committing this to master for now. Patch by me, reviewed by Thomas Munro, Dilip Kumar, Hamid Akhtar. Discussion: http://postgr.es/m/CA+TgmoY=aqf0zjTD+3dUWYkgMiNDegDLFjo+6ze=Wtpik+3XqA@mail.gmail.com	2020-09-24 15:27:19 -04:00
Peter Eisentraut	c005eb00e7	Standardize the printf format for st_size Existing code used various inconsistent ways to printf struct stat's st_size member. The type of that is off_t, which is in most cases a signed 64-bit integer, so use the long long int format for it.	2020-09-24 21:04:21 +02:00
Robert Haas	f5ea92e8d6	Expose oldSnapshotControl definition via new header. This makes it possible for code outside snapmgr.c to examine the contents of this data structure. This commit does not add any code which actually does so; a subsequent commit will make that change. Patch by me, reviewed by Thomas Munro, Dilip Kumar, Hamid Akhtar. Discussion: http://postgr.es/m/CA+TgmoY=aqf0zjTD+3dUWYkgMiNDegDLFjo+6ze=Wtpik+3XqA@mail.gmail.com	2020-09-24 13:33:09 -04:00
Tom Lane	83b61319a1	Improve behavior of tsearch_readline(), and remove t_readline(). Commit `fbeb9da22`, which added the tsearch_readline APIs, left t_readline() in place as a compatibility measure. But that function has been unused and deprecated for twelve years now, so that seems like enough time to remove it. Doing so, and merging t_readline's code into tsearch_readline, aids in making several useful improvements: * The hard-wired 4K limit on line length in tsearch data files is removed, by using a StringInfo buffer instead of a fixed-size buffer. * We can buy back the per-line palloc/pfree added by `3ea7e9550` in the common case where encoding conversion is not required. * We no longer need a separate pg_verify_mbstr call, as that functionality was folded into encoding conversion some time ago. (We could have done some of this stuff while keeping t_readline as a separate API, but there seems little point, since there's no reason for anyone to still be using t_readline directly.) Discussion: https://postgr.es/m/48A4FA71-524E-41B9-953A-FD04EF36E2E7@yesql.se	2020-09-23 20:26:58 -04:00
Thomas Munro	aca74843e4	Fix missing fsync of SLRU directories. Harmonize behavior by moving reponsibility for fsyncing directories down into slru.c. In 10 and later, only the multixact directories were missed (see commit `1b02be21`), and in older branches all SLRUs were missed. Back-patch to all supported releases. Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CA%2BhUKGLtsTUOScnNoSMZ-2ZLv%2BwGh01J6kAo_DM8mTRq1sKdSQ%40mail.gmail.com	2020-09-24 10:39:52 +12:00
Tom Lane	6b2c4e59d0	Improve error cursor positions for problems with partition bounds. We failed to pass down the query string to check_new_partition_bound, so that its attempts to provide error cursor positions were for naught; one must have the query string to get parser_errposition to do anything. Adjust its API to require a ParseState to be passed down. Also, improve the logic inside check_new_partition_bound so that the cursor points at the partition bound for the specific column causing the issue, when one can be identified. That part is also for naught if we can't determine the query position of the column with the problem. Improve transformPartitionBoundValue so that it makes sure that const-simplified partition expressions will be properly labeled with positions. In passing, skip calling evaluate_expr if the value is already a Const, which is surely the most common case. Alexandra Wang, Ashwin Agrawal, Amit Langote; reviewed by Ashutosh Bapat Discussion: https://postgr.es/m/CACiyaSopZoqssfMzgHk6fAkp01cL6vnqBdmTw2C5_KJaFR_aMg@mail.gmail.com Discussion: https://postgr.es/m/CAJV4CdrZ5mKuaEsRSbLf2URQ3h6iMtKD=hik8MaF5WwdmC9uZw@mail.gmail.com	2020-09-23 18:04:53 -04:00
Tom Lane	3ea7e9550e	Avoid possible dangling-pointer access in tsearch_readline_callback. tsearch_readline() saves the string pointer it returns to the caller for possible use in the associated error context callback. However, the caller will usually pfree that string sometime before it next calls tsearch_readline(), so that there is a window where an ereport will try to print an already-freed string. The built-in users of tsearch_readline() happen to all do that pfree at the bottoms of their loops, so that the window is effectively empty for them. However, this is not documented as a requirement, and contrib/dict_xsyn doesn't do it like that, so it seems likely that third-party dictionaries might have live bugs here. The practical consequences of this seem pretty limited in any case, since production builds wouldn't clobber the freed string immediately, besides which you'd not expect syntax errors in dictionary files being used in production. Still, it's clearly a bug waiting to bite somebody. Fix by pstrdup'ing the string to be saved for the error callback, and then pfree'ing it next time through. It's been like this for a long time, so back-patch to all supported branches. Discussion: https://postgr.es/m/48A4FA71-524E-41B9-953A-FD04EF36E2E7@yesql.se	2020-09-23 11:36:13 -04:00
Thomas Munro	733fa9aa51	Allow WaitLatch() to be used without a latch. Due to flaws in commit `3347c982ba`, using WaitLatch() without WL_LATCH_SET could cause an assertion failure or crash. Repair. While here, also add a check that the latch we're switching to belongs to this backend, when changing from one latch to another. Discussion: https://postgr.es/m/CA%2BhUKGK1607VmtrDUHQXrsooU%3Dap4g4R2yaoByWOOA3m8xevUQ%40mail.gmail.com	2020-09-23 15:17:30 +12:00
Tom Lane	ce90f075f0	Improve the error message for an inappropriate column definition list. The existing message about "a column definition list is only allowed for functions returning "record"" could be given in some cases where it was fairly confusing; in particular, a function with multiple OUT parameters does return record according to pg_proc. Break it down into a couple more cases to deliver a more on-point complaint. Per complaint from Bruce Momjian. Discussion: https://postgr.es/m/798909.1600562993@sss.pgh.pa.us	2020-09-22 10:49:11 -04:00
Tom Lane	f859c2ffa0	Fix a few more generator scripts to produce pgindent-clean output. This completes the project of making all our derived files be pgindent-clean (or else explicitly excluded from indentation), so that no surprises result when running pgindent in a built-out development tree. Discussion: https://postgr.es/m/79ed5348-be7a-b647-dd40-742207186a22@2ndquadrant.com	2020-09-21 13:58:26 -04:00
Tom Lane	9436041ed8	Copy editing: fix a bunch of misspellings and poor wording. 99% of this is docs, but also a couple of comments. No code changes. Justin Pryzby Discussion: https://postgr.es/m/20200919175804.GE30557@telsasoft.com	2020-09-21 12:43:42 -04:00
Peter Eisentraut	80fc96eceb	Standardize order of use strict and use warnings in Perl code The standard order in PostgreSQL and other code is use strict first, but some code was uselessly inconsistent about this.	2020-09-21 17:04:36 +02:00
Heikki Linnakangas	c47a240fe6	Fix checksum calculation in the new sorting GiST build. Since we're bypassing the buffer manager, we need to call PageSetChecksumInplace() directly. As reported by Justin Pryzby. In the passing, add RelationOpenSmgr() calls before all smgrwrite() and smgrextend() calls. Tom added one before the first smgrextend() call in commit `c2bb287025`, which seems to be enough, but let's play it safe and do it before each one. That's how it's done in the similar code in nbtsort.c, too. Discussion: https://www.postgresql.org/message-id/20200920224446.GF30557@telsasoft.com	2020-09-21 14:50:07 +03:00
Tom Lane	c2bb287025	Fix new GIST build code to work under CLOBBER_CACHE_ALWAYS. Can't say if this fixes all cases, but at least we get through the "point" regression test now, which hyrax's last run did not. Report: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hyrax&dt=2020-09-19%2021%3A27%3A23	2020-09-20 17:08:49 -04:00
Peter Eisentraut	3d13867a2c	Fix whitespace	2020-09-20 14:42:54 +02:00
Tom Lane	28a61fc6c5	Remove precedence hacks no longer needed without postfix operators. It's no longer necessary to assign explicit precedences to GENERATED, NULL_P, PRESERVE, or STRIP_P. Actually, we don't need to assign precedence to IDENT either; that was really just there to govern the behavior of target_el's "a_expr IDENT" production, which no longer ends with that terminal. However, it seems like a good idea to continue to do so, because it provides a reference point for a precedence level that we can assign to other unreserved keywords that lack a natural precedence level. Research by Peter Eisentraut and John Naylor; comment rewrite by me. Discussion: https://postgr.es/m/38ca86db-42ab-9b48-2902-337a0d6b8311@2ndquadrant.com	2020-09-19 15:11:26 -04:00
Thomas Munro	ff28809feb	Code review for dynahash change. Commit `be0a6666` left behind a comment about the order of some tests that didn't make sense without the expensive division, and in fact we might as well change the order to one that fails more cheaply most of the time as a micro-optimization. Also, remove the "+ 1" applied to max_bucket, to drop an instruction and match the original behavior. Per review from Tom Lane. Discussion: https://postgr.es/m/VI1PR0701MB696044FC35013A96FECC7AC8F62D0%40VI1PR0701MB6960.eurprd07.prod.outlook.com	2020-09-19 15:53:06 +12:00
Thomas Munro	be0a666665	Remove large fill factor support from dynahash.c. Since ancient times we have had support for a fill factor (maximum load factor) to be set for a dynahash hash table, but: 1. It was an integer, whereas for in-memory hash tables interesting load factor targets are probably somewhere near the 0.75-1.0 range. 2. It was implemented in a way that performed an expensive division operation that regularly showed up in profiles. 3. We are not aware of anyone ever having used a non-default value. Therefore, remove support, effectively fixing it at 1. Author: Jakub Wartak <Jakub.Wartak@tomtom.com> Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com> Reviewed-by: Tomas Vondra <tomas.vondra@2ndquadrant.com> Reviewed-by: Thomas Munro <thomas.munro@gmail.com> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/VI1PR0701MB696044FC35013A96FECC7AC8F62D0%40VI1PR0701MB6960.eurprd07.prod.outlook.com	2020-09-19 11:40:39 +12:00
Tom Lane	06a7c3154f	Allow most keywords to be used as column labels without requiring AS. Up to now, if you tried to omit "AS" before a column label in a SELECT list, it would only work if the column label was an IDENT, that is not any known keyword. This is rather unfriendly considering that we have so many keywords and are constantly growing more. In the wake of commit `1ed6b8956` it's possible to improve matters quite a bit. We'd originally tried to make this work by having some of the existing keyword categories be allowed without AS, but that didn't work too well, because each category contains a few special cases that don't work without AS. Instead, invent an entirely orthogonal keyword property "can be bare column label", and mark all keywords that way for which we don't get shift/reduce errors by doing so. It turns out that of our 450 current keywords, all but 39 can be made bare column labels, improving the situation by over 90%. This number might move around a little depending on future grammar work, but it's a pretty nice improvement. Mark Dilger, based on work by myself and Robert Haas; review by John Naylor Discussion: https://postgr.es/m/38ca86db-42ab-9b48-2902-337a0d6b8311@2ndquadrant.com	2020-09-18 16:46:36 -04:00
Amit Kapila	24fb35e111	Update file header comments for logical/relation.c. Author: Amit Langote Reviewed-by: Amit Kapila Discussion: https://postgr.es/m/CA+HiwqE20oZoix13JyCeALpTf_SmjarZWtBFe5sND6zz+iupAw@mail.gmail.com	2020-09-18 10:14:30 +05:30
Amit Kapila	0d32511eca	Fix comments in heapam.c. After commits `85f6b49c2c` and `3ba59ccc89`, we can allow parallel inserts which was earlier not possible as parallel group members won't conflict for relation extension and page lock. In those commits, we forgot to update comments at few places. Author: Amit Kapila Reviewed-by: Robert Haas and Dilip Kumar Backpatch-through: 13 Discussion: https://postgr.es/m/CAFiTN-tMrQh5FFMPx5aWJ+1gi1H6JxktEhq5mDwCHgnEO5oBkA@mail.gmail.com	2020-09-18 09:50:44 +05:30
Tom Lane	1ed6b89563	Remove support for postfix (right-unary) operators. This feature has been a thorn in our sides for a long time, causing many grammatical ambiguity problems. It doesn't seem worth the pain to continue to support it, so remove it. There are some follow-on improvements we can make in the grammar, but this commit only removes the bare minimum number of productions, plus assorted backend support code. Note that pg_dump and psql continue to have full support, since they may be used against older servers. However, pg_dump warns about postfix operators. There is also a check in pg_upgrade. Documentation-wise, I (tgl) largely removed the "left unary" terminology in favor of saying "prefix operator", which is a more standard and IMO less confusing term. I included a catversion bump, although no initial catalog data changes here, to mark the boundary at which oprkind = 'r' stopped being valid in pg_operator. Mark Dilger, based on work by myself and Robert Haas; review by John Naylor Discussion: https://postgr.es/m/38ca86db-42ab-9b48-2902-337a0d6b8311@2ndquadrant.com	2020-09-17 19:38:05 -04:00
Amit Kapila	b7f2dd959a	Update parallel BTree scan state when the scan keys can't be satisfied. For parallel btree scan to work for array of scan keys, it should reach BTPARALLEL_DONE state once for every distinct combination of array keys. This is required to ensure that the parallel workers don't try to seize blocks at the same time for different scan keys. We missed to update this state when we discovered that the scan keys can't be satisfied. Author: James Hunter Reviewed-by: Amit Kapila Tested-by: Justin Pryzby Backpatch-through: 10, where it was introduced Discussion: https://postgr.es/m/4248CABC-25E3-4809-B4D0-128E1BAABC3C@amazon.com	2020-09-17 16:11:48 +05:30
Peter Eisentraut	45b9805706	Allow CURRENT_ROLE where CURRENT_USER is accepted In the particular case of GRANTED BY, this is specified in the SQL standard. Since in PostgreSQL, CURRENT_ROLE is equivalent to CURRENT_USER, and CURRENT_USER is already supported here, adding CURRENT_ROLE is trivial. The other cases are PostgreSQL extensions, but for the same reason it also makes sense there. Reviewed-by: Vik Fearing <vik@postgresfriends.org> Reviewed-by: Asif Rehman <asifr.rehman@gmail.com> Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com> Discussion: https://www.postgresql.org/message-id/flat/f2feac44-b4c5-f38f-3699-2851d6a76dc9%402ndquadrant.com	2020-09-17 11:40:08 +02:00
Heikki Linnakangas	16fa9b2b30	Add support for building GiST index by sorting. This adds a new optional support function to the GiST access method: sortsupport. If it is defined, the GiST index is built by sorting all data to the order defined by the sortsupport's comparator function, and packing the tuples in that order to GiST pages. This is similar to how B-tree index build works, and is much faster than inserting the tuples one by one. The resulting index is smaller too, because the pages are packed more tightly, upto 'fillfactor'. The normal build method works by splitting pages, which tends to lead to more wasted space. The quality of the resulting index depends on how good the opclass-defined sort order is. A good order preserves locality of the input data. As the first user of this facility, add 'sortsupport' function to the point_ops opclass. It sorts the points in Z-order (aka Morton Code), by interleaving the bits of the X and Y coordinates. Author: Andrey Borodin Reviewed-by: Pavel Borisov, Thomas Munro Discussion: https://www.postgresql.org/message-id/1A36620E-CAD8-4267-9067-FB31385E7C0D%40yandex-team.ru	2020-09-17 11:33:40 +03:00
Tom Lane	babef40c9a	Teach walsender to update its process title for replication commands. Because the code path taken for SQL commands executed in a walsender will update the process title, we pretty much have to update the title for replication commands as well. Otherwise, the title shows "idle" for the rest of a logical walsender's lifetime once it's executed any SQL command. Playing with this, I confirm that a walsender now typically spends most of its life reporting walsender postgres [local] START_REPLICATION Considering this in isolation, it might be better to have it say walsender postgres [local] sending replication data However, consistency with the other cases seems to be a stronger argument. In passing, remove duplicative pgstat_report_activity call. Discussion: https://postgr.es/m/880181.1600026471@sss.pgh.pa.us	2020-09-16 21:06:50 -04:00
Alvaro Herrera	07082b08cc	Fix bogus completion tag usage in walsender Since commit `fd5942c18f` (2012, 9.3-era), walsender has been sending completion tags for certain replication commands twice -- and they're not even consistent. Apparently neither libpq nor JDBC have a problem with it, but it's not kosher. Fix by remove the EndCommand() call in the common code path for them all, and inserting specific calls to EndReplicationCommand() specifically in those places where it's needed. EndReplicationCommand() is a new simple function to send the completion tag for replication commands. Do this instead of sending a generic SELECT completion tag for them all, which was also pretty bogus (if innocuous). While at it, change StartReplication() to use EndReplicationCommand() instead of pg_puttextmessage(). In commit `2f9661311b`, I failed to realize that replication commands are not close-enough kin of regular SQL commands, so the DROP_REPLICATION_SLOT tag I added is undeserved and a type pun. Take it out. Backpatch to 13, where the latter commit appeared. The duplicate tag has been sent since 9.3, but since nothing is broken, it doesn't seem worth fixing. Per complaints from Tom Lane. Discussion: https://postgr.es/m/1347966.1600195735@sss.pgh.pa.us	2020-09-16 21:16:25 -03:00
Tom Lane	44fc6e259b	Centralize setup of SIGQUIT handling for postmaster child processes. We decided that the policy established in commit `7634bd4f6` for the bgwriter, checkpointer, walwriter, and walreceiver processes, namely that they should accept SIGQUIT at all times, really ought to apply uniformly to all postmaster children. Therefore, get rid of the duplicative and inconsistent per-process code for establishing that signal handler and removing SIGQUIT from BlockSig. Instead, make InitPostmasterChild do it. The handler set up by InitPostmasterChild is SignalHandlerForCrashExit, which just summarily does _exit(2). In interactive backends, we almost immediately replace that with quickdie, since we would prefer to try to tell the client that we're dying. However, this patch is changing the behavior of autovacuum (both launcher and workers), as well as walsenders. Those processes formerly also used quickdie, but AFAICS that was just mindless copy-and-paste: they don't have any interactive client that's likely to benefit from being told this. The stats collector continues to be an outlier, in that it thinks SIGQUIT means normal exit. That should probably be changed for consistency, but there's another patch set where that's being dealt with, so I didn't do so here. Discussion: https://postgr.es/m/644875.1599933441@sss.pgh.pa.us	2020-09-16 16:04:36 -04:00
Tom Lane	2000b6c10a	Don't fetch partition check expression during InitResultRelInfo. Since there is only one place that actually needs the partition check expression, namely ExecPartitionCheck, it's better to fetch it from the relcache there. In this way we will never fetch it at all if the query never has use for it, and we still fetch it just once when we do need it. The reason for taking an interest in this is that if the relcache doesn't already have the check expression cached, fetching it requires obtaining AccessShareLock on the partition root. That means that operations that look like they should only touch the partition itself will also take a lock on the root. In particular we observed that TRUNCATE on a partition may take a lock on the partition's root, contributing to a deadlock situation in parallel pg_restore. As written, this patch does have a small cost, which is that we are microscopically reducing efficiency for the case where a partition has an empty check expression. ExecPartitionCheck will be called, and will go through the motions of setting up and checking an empty qual, where before it would not have been called at all. We could avoid that by adding a separate boolean flag to track whether there is a partition expression to test. However, this case only arises for a default partition with no siblings, which surely is not an interesting case in practice. Hence adding complexity for it does not seem like a good trade-off. Amit Langote, per a suggestion by me Discussion: https://postgr.es/m/VI1PR03MB31670CA1BD9625C3A8C5DD05EB230@VI1PR03MB3167.eurprd03.prod.outlook.com	2020-09-16 14:28:18 -04:00
Tom Lane	e5fac1cb19	Avoid unnecessary recursion to child tables in ALTER TABLE SET NOT NULL. If a partitioned table's column is already marked NOT NULL, there is no need to examine its partitions, because we can rely on previous DDL to have enforced that the child columns are NOT NULL as well. (Unfortunately, the same cannot be said for traditional inheritance, so for now we have to restrict the optimization to partitioned tables.) Hence, we may skip recursing to child tables in this situation. The reason this case is worth worrying about is that when pg_dump dumps a partitioned table having a primary key, it will include the requisite NOT NULL markings in the CREATE TABLE commands, and then add the primary key as a separate step. The primary key addition generates a SET NOT NULL as a subcommand, just to be sure. So the situation where a SET NOT NULL is redundant does arise in the real world. Skipping the recursion does more than just save a few cycles: it means that a command such as "ALTER TABLE ONLY partition_parent ADD PRIMARY KEY" will take locks only on the partition parent table, not on the partitions. It turns out that parallel pg_restore is effectively assuming that that's true, and has little choice but to do so because the dependencies listed for such a TOC entry don't include the partitions. pg_restore could thus issue this ALTER while data restores on the partitions are still in progress. Taking unnecessary locks on the partitions not only hurts concurrency, but can lead to actual deadlock failures, as reported by Domagoj Smoljanovic. (A contributing factor in the deadlock is that TRUNCATE on a child partition wants a non-exclusive lock on the parent. This seems likewise unnecessary, but the fix for it is more invasive so we won't consider back-patching it. Fortunately, getting rid of one of these two poor behaviors is enough to remove the deadlock.) Although support for partitioned primary keys came in with v11, this patch is dependent on the SET NOT NULL refactoring done by commit `f4a3fdfbd`, so we can only patch back to v12. Patch by me; thanks to Alvaro Herrera and Amit Langote for review. Discussion: https://postgr.es/m/VI1PR03MB31670CA1BD9625C3A8C5DD05EB230@VI1PR03MB3167.eurprd03.prod.outlook.com	2020-09-16 13:38:26 -04:00
Tom Lane	3d65b0593c	Fix bogus cache-invalidation logic in logical replication worker. The code recorded cache invalidation events by zeroing the "localreloid" field of affected cache entries. However, it's possible for an inval event to occur even while we have the entry open and locked. So an ill-timed inval could result in "cache lookup failed for relation 0" errors, if the worker's code tried to use the cleared field. We can fix that by creating a separate bool field to record whether the entry needs to be revalidated. (In the back branches, cram the bool into what had been padding space, to avoid an ABI break in the somewhat unlikely event that any extension is looking at this struct.) Also, rearrange the logic in logicalrep_rel_open so that it does the right thing in cases where table_open would fail. We should retry the lookup by name in that case, but we didn't. The real-world impact of this is probably small. In the first place, the error conditions are very low probability, and in the second place, the worker would just exit and get restarted. We only noticed because in a CLOBBER_CACHE_ALWAYS build, the failure can occur repeatedly, preventing the worker from making progress. Nonetheless, it's clearly a bug, and it impedes a useful type of testing; so back-patch to v10 where this code was introduced. Discussion: https://postgr.es/m/1032727.1600096803@sss.pgh.pa.us	2020-09-16 12:07:31 -04:00
Jeff Davis	c8aeaf3ab3	Change LogicalTapeSetBlocks() to use nBlocksWritten. Previously, it was based on nBlocksAllocated to account for tapes with open write buffers that may not have made it to the BufFile yet. That was unnecessary, because callers do not need to get the number of blocks while a tape has an open write buffer; and it also conflicted with the preallocation logic added for HashAgg. Reviewed-by: Peter Geoghegan Discussion: https://postgr.es/m/ce5af05900fdbd0e9185747825a7423c48501964.camel@j-davis.com Backpatch-through: 13	2020-09-15 21:42:25 -07:00
Jeff Davis	3bd35d4f51	HashAgg: release write buffers sooner by rewinding tape. This was an oversight. The purpose of `7fdd919ae7` was to avoid keeping tape buffers around unnecessisarily, but HashAgg didn't rewind early enough. Reviewed-by: Peter Geoghegan Discussion: https://postgr.es/m/1fb1151c2cddf8747d14e0532da283c3f97e2685.camel@j-davis.com Backpatch-through: 13	2020-09-15 21:18:22 -07:00
Amit Kapila	69bd60672a	Fix initialization of RelationSyncEntry for streaming transactions. In commit `464824323e`, for each RelationSyncEntry we maintained the list of xids (streamed_txns) for which we have already sent the schema. This helps us to track when to send the schema to the downstream node for replication of streaming transactions. Before this list got initialized, we were processing invalidation messages which access this list and led to an assertion failure. In passing, clean up the nearby code: * Initialize the list of xids with NIL instead of NULL which is our usual coding practice. * Remove the MemoryContext switch for creating a RelationSyncEntry in dynahash. Diagnosed-by: Amit Kapila and Tom Lane Author: Amit Kapila Reviewed-by: Tom Lane and Dilip Kumar Discussion: https://postgr.es/m/904373.1600033123@sss.pgh.pa.us	2020-09-16 07:45:44 +05:30
David Rowley	19c60ad69a	Optimize compactify_tuples function This function could often be seen in profiles of vacuum and could often be a significant bottleneck during recovery. The problem was that a qsort was performed in order to sort an array of item pointers in reverse offset order so that we could use that to safely move tuples up to the end of the page without overwriting the memory of yet-to-be-moved tuples. i.e. we used to compact the page starting at the back of the page and move towards the front. The qsort that this required could be expensive for pages with a large number of tuples. In this commit, we take another approach to tuple compactification. Now, instead of sorting the remaining item pointers array we first check if the array is presorted and only memmove() the tuples that need to be moved. This presorted check can be done very cheaply in the calling functions when the array is being populated. This presorted case is very fast. When the item pointer array is not presorted we must copy tuples that need to be moved into a temp buffer before copying them back into the page again. This differs from what we used to do here as we're now copying the tuples back into the page in reverse line pointer order. Previously we left the existing order alone. Reordering the tuples results in an increased likelihood of hitting the pre-sorted case the next time around. Any newly added tuple which consumes a new line pointer will also maintain the correct sort order of tuples in the page which will also result in the presorted case being hit the next time. Only consuming an unused line pointer can cause the order of tuples to go out again, but that will be corrected next time the function is called for the page. Benchmarks have shown that the non-presorted case is at least equally as fast as the original qsort method even when the page just has a few tuples. As the number of tuples becomes larger the new method maintains its performance whereas the original qsort method became much slower when the number of tuples on the page became large. Author: David Rowley Reviewed-by: Thomas Munro Tested-by: Jakub Wartak Discussion: https://postgr.es/m/CA+hUKGKMQFVpjr106gRhwk6R-nXv0qOcTreZuQzxgpHESAL6dw@mail.gmail.com	2020-09-16 13:22:20 +12:00
Alvaro Herrera	ced138e8cb	Fix use-after-free bug with event triggers in an extension script ALTER TABLE commands in an extension script are added to an event trigger command list; but starting with commit `b5810de3f4` they do so in a memory context that's too short-lived, so when execution ends and time comes to use the entries, they've already been freed. (This would also be a problem with ALTER TABLE commands in a multi-command query string, but these serendipitously end in PortalContext -- which probably explains why it took so long for this to be reported.) Fix by using the memory context specifically set for that, instead. Backpatch to 13, where the aforementioned commit appeared. Reported-by: Philippe Beaudoin Author: Jehan-Guillaume de Rorthais <jgdr@dalibo.com> Discussion: https://postgr.es/m/20200902193715.6e0269d4@firost	2020-09-15 21:03:14 -03:00
David Rowley	10a5b35a00	Report resource usage at the end of recovery Reporting this has been rather useful in some recent recovery speedup work. It also seems like something that will be useful to the average DBA too. Author: David Rowley Reviewed-by: Thomas Munro Discussion: https://postgr.es/m/CAApHDvqYVORiZxq2xPvP6_ndmmsTkvr6jSYv4UTNaFa5i1kd%3DQ%40mail.gmail.com	2020-09-16 11:25:46 +12:00
David Rowley	62e221e1c0	Allow incremental sorts for windowing functions This expands on the work done in `d2d8a229b` and allows incremental sort to be considered during create_window_paths(). Author: David Rowley Reviewed-by: Daniel Gustafsson, Tomas Vondra Discussion: https://postgr.es/m/CAApHDvoOHobiA2x13NtWnWLcTXYj9ddpCkv9PnAJQBMegYf_xw%40mail.gmail.com	2020-09-15 23:44:45 +12:00
David Rowley	fe4f36bcde	Fix compiler warning Introduced in `0aa8f7640`. MSVC warned about performing 32-bit bit shifting when it appeared like we might like a 64-bit result. We did, but it just so happened that none of the calls to this function could have caused the 32-bit shift to overflow. Here we just cast the constant to int64 to make the compiler happy. Discussion: https://postgr.es/m/CAApHDvofA_vsrpC13mq_hZyuye5B-ssKEaer04OouXYCO5-uXQ@mail.gmail.com	2020-09-15 15:07:57 +12:00
Tom Lane	f560209c6e	Make walsenders show their replication commands in pg_stat_activity. A walsender process that has executed a SQL command left the text of that command in pg_stat_activity.query indefinitely, which is quite confusing if it's in RUNNING state but not doing that query. An easy and useful fix is to treat replication commands as if they were SQL queries, and show them in pg_stat_activity according to the same rules as for regular queries. While we're at it, it seems also sensible to set debug_query_string, allowing error logging and debugging to see the replication command. While here, clean up assorted silliness in exec_replication_command: * The SQLCmd path failed to restore CurrentMemoryContext to the caller's value, and failed to delete the temp context created in this routine. It's only through great good fortune that these oversights did not result in long-term memory leaks or other problems. It seems cleaner to code SQLCmd as a separate early-exit path, so do it like that. * Remove useless duplicate call of SnapBuildClearExportedSnapshot(). * replication_scanner_finish() was never called. None of those things are significant enough to merit a backpatch, so this is for HEAD only. Discussion: https://postgr.es/m/880181.1600026471@sss.pgh.pa.us	2020-09-14 12:35:00 -04:00
Fujii Masao	95233011a0	Fix typos. Author: Naoki Nakamichi Discussion: https://postgr.es/m/b6919d145af00295a8e86ce4d034b7cd@oss.nttdata.com	2020-09-14 14:16:07 +09:00
Michael Paquier	83158f74d3	Make index_set_state_flags() transactional `3c84046` is the original commit that introduced index_set_state_flags(), where the presence of SnapshotNow made necessary the use of an in-place update. SnapshotNow has been removed in `813fb03`, so there is no actual reasons to not make this operation transactional. Note that while making the operation more robust, using a transactional operation in this routine was not strictly necessary as there was no use case for it yet. However, some future features are going to need a transactional behavior, like support for CREATE/DROP INDEX CONCURRENTLY with partitioned tables, where indexes in a partition tree need to have all their pg_index.indis* flags updated in the same transaction to make the operation stable to the end-user by keeping partition trees consistent, even with a failure mid-flight. REINDEX CONCURRENTLY uses already transactional updates when swapping the old and new indexes, making this change more consistent with the index-swapping logic. Author: Michael Paquier Reviewed-by: Anastasia Lubennikova Discussion: https://postgr.es/m/20200903080440.GA8559@paquier.xyz	2020-09-14 13:56:41 +09:00
Peter Eisentraut	3e0242b24c	Message fixes and style improvements	2020-09-14 06:42:30 +02:00
Tom Lane	19f5a37b9f	Use the properly transformed RangeVar for expandTableLikeClause(). transformCreateStmt() adjusts the transformed statement's RangeVar to specify the target schema explicitly, for the express reason of making sure that auxiliary statements derived by parse transformation operate on the right table. But the refactoring I did in commit `502898192` got this wrong and passed the untransformed RangeVar to expandTableLikeClause(). This could lead to assertion failures or weird misbehavior if the wrong table was accessed. Per report from Alexander Lakhin. Like the previous patch, back-patch to all supported branches. Discussion: https://postgr.es/m/05051f9d-b32b-cb35-6735-0e9f2ab86b5f@gmail.com	2020-09-13 12:51:21 -04:00
Amit Kapila	03c7f1f37a	Fix inconsistency in determining the timestamp of the db statfile. We use the timestamp of the global statfile if we are not able to determine it for a particular database in case the entry for that database doesn't exist. However, we were using it even when the statfile is corrupt. As there is no user reported issue and it is not clear if there is any impact of this on actual application so decided not to backpatch. Reported-by: Amit Kapila Author: Amit Kapila Reviewed-by: Sawada Masahiko, Magnus Hagander and Alvaro Herrera Discussion: https://postgr.es/m/CAA4eK1J3oTJKyVq6v7K4d3jD+vtnruG9fHRib6UuWWsrwAR6Aw@mail.gmail.com	2020-09-12 08:02:54 +05:30
Amit Kapila	ddd5f6d260	Remove unused function declaration in logicalproto.h. In the passing, fix a typo in pgoutput.c. Reported-by: Tomas Vondra Author: Tomas Vondra Reviewed-by: Dilip Kumar Discussion: https://postgr.es/m/20200909084353.pncuclpbwlr7vylh@development	2020-09-12 07:47:53 +05:30
Jeff Davis	0758964963	logtape.c: do not preallocate for tapes when sorting The preallocation logic is only useful for HashAgg, so disable it when sorting. Also, adjust an out-of-date comment. Reviewed-by: Peter Geoghegan Discussion: https://postgr.es/m/CAH2-Wzn_o7tE2+hRVvwSFghRb75AJ5g-nqGzDUqLYMexjOAe=g@mail.gmail.com Backpatch-through: 13	2020-09-11 17:10:02 -07:00
Tom Lane	7634bd4f6d	Accept SIGQUIT during error recovery in auxiliary processes. The bgwriter, checkpointer, walwriter, and walreceiver processes claimed to allow SIGQUIT "at all times". In reality SIGQUIT would get re-blocked during error recovery, because we didn't update the actual signal mask immediately, so sigsetjmp() would save and reinstate a mask that includes SIGQUIT. This appears to be simply a coding oversight. There's never a good reason to hold off SIGQUIT in these processes, because it's going to just call _exit(2) which should be safe enough, especially since the postmaster is going to tear down shared memory afterwards. Hence, stick in PG_SETMASK() calls to install the modified BlockSig mask immediately. Also try to improve the comments around sigsetjmp blocks. Most of them were just referencing postgres.c, which is misleading because actually postgres.c manages the signals differently. No back-patch, since there's no evidence that this is causing any problems in the field. Discussion: https://postgr.es/m/CALDaNm1d1hHPZUg3xU4XjtWBOLCrA+-2cJcLpw-cePZ=GgDVfA@mail.gmail.com	2020-09-11 16:01:36 -04:00
Tom Lane	10095ca634	Log a message when resorting to SIGKILL during shutdown/crash recovery. Currently, no useful trace is left in the logs when the postmaster is forced to use SIGKILL to shut down children that failed to respond to SIGQUIT. Some questions were raised about how often that scenario happens in the buildfarm, so let's add a LOG-level message showing that it happened. Discussion: https://postgr.es/m/1850884.1599601164@sss.pgh.pa.us	2020-09-11 12:24:46 -04:00
Tom Lane	6693a96b32	Don't run atexit callbacks during signal exits from ProcessStartupPacket. Although `58c6feccf` fixed the case for SIGQUIT, we were still calling proc_exit() from signal handlers for SIGTERM and timeout failures in ProcessStartupPacket. Fortunately, at the point where that code runs, we haven't yet connected to shared memory in any meaningful way, so there is nothing we need to undo in shared memory. This means it should be safe to use _exit(1) here, ie, not run any atexit handlers but also inform the postmaster that it's not a crash exit. To make sure nobody breaks the "nothing to undo" expectation, add a cross-check that no on-shmem-exit or before-shmem-exit handlers have been registered yet when we finish using these signal handlers. This change is simple enough that maybe it could be back-patched, but I won't risk that right now. Discussion: https://postgr.es/m/1850884.1599601164@sss.pgh.pa.us	2020-09-11 12:20:16 -04:00
Alvaro Herrera	6a68a233ce	Update copyright year Thinko in `40b3e2c201`. Reported-by: "Wang, Shenhao" <wangsh.fnst@cn.fujitsu.com> Discussion: https://postgr.es/m/ed98706b82694b57a8c0d339a10732aa@G08CNEXMBPEKD06.g08.fujitsu.local	2020-09-11 12:55:13 -03:00
Alvaro Herrera	9f1cf97bb5	Print WAL logical message contents in pg_waldump This helps debuggability when looking at WAL streams containing logical messages. Author: Ashutosh Bapat <ashutosh.bapat@2ndquadrant.com> Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://postgr.es/m/CAExHW5sWx49rKmXbg5H1Xc1t+nRv9PaYKQmgw82HPt6vWDVmDg@mail.gmail.com	2020-09-10 19:37:02 -03:00
Tom Lane	58c6feccfa	Use _exit(2) for SIGQUIT during ProcessStartupPacket, too. Bring the signal handling for startup-packet collection into line with the policy established in commits `bedadc732` and `8e19a8264`, namely don't risk running atexit callbacks when handling SIGQUIT. Ideally, we'd not do so for SIGTERM or timeout interrupts either, but that change seems a bit too risky for the back branches. For now, just improve the comments in this area to describe the risk. Also relocate where BackendInitialize re-disables these interrupts, to minimize the code span where they're active. This doesn't buy a whole lot of safety, but it can't hurt. In passing, rename startup_die() to remove confusion about whether it is for the startup process. Like the previous commits, back-patch to all supported branches. Discussion: https://postgr.es/m/1850884.1599601164@sss.pgh.pa.us	2020-09-10 12:06:37 -04:00
Etsuro Fujita	3857f98f14	Clean up some code and comments in partbounds.c. Do some minor cleanup for commit `c8434d64c`: 1) remove a useless assignment (in normal builds) and 2) improve comments a little. Back-patch to v13 where the aforementioned commit went in. Author: Etsuro Fujita Reviewed-by: Alvaro Herrera Discussion: https://postgr.es/m/CAPmGK16yCd2R4=bQ4g8N2dT9TtA5ZU+qNmJ3LPc_nypbNy4_2A@mail.gmail.com	2020-09-10 18:00:00 +09:00
Michael Paquier	aad546bd0a	doc: Fix some grammar and inconsistencies Some comments are fixed while on it. Author: Justin Pryzby Discussion: https://postgr.es/m/20200818171702.GK17022@telsasoft.com Backpatch-through: 9.6	2020-09-10 15:50:19 +09:00
Noah Misch	fe4d022c8e	Fix rd_firstRelfilenodeSubid for nailed relations, in parallel workers. Move applicable code out of RelationBuildDesc(), which nailed relations bypass. Non-assert builds experienced no known problems. Back-patch to v13, where commit `c6b92041d3` introduced rd_firstRelfilenodeSubid. Kyotaro Horiguchi. Reported by Justin Pryzby. Discussion: https://postgr.es/m/20200907023737.GA7158@telsasoft.com	2020-09-09 18:50:24 -07:00
Tom Lane	bedadc7322	Make archiver's SIGQUIT handler exit via _exit(). Commit `8e19a8264` changed the SIGQUIT handlers of almost all server processes not to run atexit callbacks. The archiver process was skipped, perhaps because it's not connected to shared memory; but it's just as true here that running atexit callbacks in a signal handler is unsafe. So let's make it work like the rest. In HEAD and v13, we can use the common SignalHandlerForCrashExit handler. Before that, just tweak pgarch_exit to use _exit(2) explicitly. Like the previous commit, back-patch to all supported branches. Kyotaro Horiguchi, back-patching by me Discussion: https://postgr.es/m/1850884.1599601164@sss.pgh.pa.us	2020-09-09 15:32:45 -04:00
Peter Eisentraut	0aa8f76408	Expose internal function for converting int64 to numeric Existing callers had to take complicated detours via DirectFunctionCall1(). This simplifies a lot of code. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/42b73d2d-da12-ba9f-570a-420e0cce19d9@phystech.edu	2020-09-09 20:16:28 +02:00
Tom Lane	f3e1e66196	Minor fixes in docs and error messages. Alexander Lakhin Discussion: https://postgr.es/m/ce7debdd-c943-d7a7-9b41-687107b27831@gmail.com	2020-09-09 11:53:39 -04:00
Alvaro Herrera	f481d28232	Check default partitions constraints while descending Partitioning tuple route code assumes that the partition chosen while descending the partition hierarchy is always the correct one. This is true except when the partition is the default partition and another partition has been added concurrently: the partition constraint changes and we don't recheck it. This can lead to tuples mistakenly being added to the default partition that should have been rejected. Fix by rechecking the default partition constraint while descending the hierarchy. An isolation test based on the reproduction steps described by Hao Wu (with tweaks for extra coverage) is included. Backpatch to 12, where this bug came in with `898e5e3290`. Reported by: Hao Wu <hawu@vmware.com> Author: Amit Langote <amitlangote09@gmail.com> Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://postgr.es/m/CA+HiwqFqBmcSSap4sFnCBUEL_VfOMmEKaQ3gwUhyfa4c7J_-nA@mail.gmail.com Discussion: https://postgr.es/m/DM5PR0501MB3910E97A9EDFB4C775CF3D75A42F0@DM5PR0501MB3910.namprd05.prod.outlook.com	2020-09-08 19:35:15 -03:00
Tom Lane	c9ae5cbb88	Install an error check into cancel_before_shmem_exit(). Historically, cancel_before_shmem_exit() just silently did nothing if the specified callback wasn't the top-of-stack. The folly of ignoring this case was exposed by the bugs fixed in `303640199` and `bab150045`, so let's make it throw elog(ERROR) instead. There is a decent argument to be made that PG_ENSURE_ERROR_CLEANUP should use some separate infrastructure, so it wouldn't break if something inside the guarded code decides to register a new before_shmem_exit callback. However, a survey of the surviving uses of before_shmem_exit() and PG_ENSURE_ERROR_CLEANUP doesn't show any plausible conflicts of that sort today, so for now we'll forgo the extra complexity. (It will almost certainly become necessary if anyone ever wants to wrap PG_ENSURE_ERROR_CLEANUP around arbitrary user-defined actions, though.) No backpatch, since this is developer support not a production issue. Bharath Rupireddy, per advice from Andres Freund, Robert Haas, and myself Discussion: https://postgr.es/m/CALj2ACWk7j4F2v2fxxYfrroOF=AdFNPr1WsV+AGtHAFQOqm_pw@mail.gmail.com	2020-09-08 15:54:25 -04:00
Andres Freund	5871f09c98	Fix autovacuum cancellation. The problem is caused by me (Andres) having ProcSleep() look at the wrong PGPROC entry in `5788e258bb`. Unfortunately it seems hard to write a reliable test for autovacuum cancellations. Perhaps somebody will come up with a good approach, but it seems worth fixing the issue even without a test. Reported-By: Jeff Janes <jeff.janes@gmail.com> Author: Jeff Janes <jeff.janes@gmail.com> Discussion: https://postgr.es/m/CAMkU=1wH2aUy+wDRDz+5RZALdcUnEofV1t9PzXS_gBJO9vZZ0Q@mail.gmail.com	2020-09-08 11:25:34 -07:00
Tom Lane	3438c988fd	Use plain memset() in numeric.c, not MemSet and friends. This essentially reverts a micro-optimization I made years ago, as part of the much larger commit `d72f6c750`. It's doubtful that there was any hard evidence for it being helpful even then, and the case is even more dubious now that modern compilers are so much smarter about inlining memset(). The proximate reason for undoing it is to get rid of the type punning inherent in MemSet, for fear that that may cause problems now that we're applying additional optimization switches to numeric.c. At the very least this'll silence some warnings from a few old buildfarm animals. (It's probably past time for another look at whether MemSet is still worth anything at all, but I do not propose to tackle that question right now.) Discussion: https://postgr.es/m/CAJ3gD9evtA_vBo+WMYMyT-u=keHX7-r8p2w7OSRfXf42LTwCZQ@mail.gmail.com	2020-09-08 11:47:37 -04:00
Peter Eisentraut	728d4bc16b	Use <unnamed> for name of unnamed portal's memory context Otherwise just printing an empty string makes the memory context debug output slightly confusing. Discussion: https://www.postgresql.org/message-id/flat/ccb353ef-89ff-09b3-8046-1d2514624b9c%402ndquadrant.com	2020-09-08 17:15:00 +02:00
Michael Paquier	a6642b3ae0	Add support for partitioned tables and indexes in REINDEX Until now, REINDEX was not able to work with partitioned tables and indexes, forcing users to reindex partitions one by one. This extends REINDEX INDEX and REINDEX TABLE so as they can accept a partitioned index and table in input, respectively, to reindex all the partitions assigned to them with physical storage (foreign tables, partitioned tables and indexes are then discarded). This shares some logic with schema and database REINDEX as each partition gets processed in its own transaction after building a list of relations to work on. This choice has the advantage to minimize the number of invalid indexes to one partition with REINDEX CONCURRENTLY in the event a cancellation or failure in-flight, as the only indexes handled at once in a single REINDEX CONCURRENTLY loop are the ones from the partition being working on. Isolation tests are added to emulate some cases I bumped into while developing this feature, particularly with the concurrent drop of a leaf partition reindexed. However, this is rather limited as LOCK would cause REINDEX to block in the first transaction building the list of partitions. Per its multi-transaction nature, this new flavor cannot run in a transaction block, similarly to REINDEX SCHEMA, SYSTEM and DATABASE. Author: Justin Pryzby, Michael Paquier Reviewed-by: Anastasia Lubennikova Discussion: https://postgr.es/m/db12e897-73ff-467e-94cb-4af03705435f.adger.lj@alibaba-inc.com	2020-09-08 10:09:22 +09:00
Jeff Davis	a547e68675	Adjust cost model for HashAgg that spills to disk. Tomas Vondra observed that the IO behavior for HashAgg tends to be worse than for Sort. Penalize HashAgg IO costs accordingly. Also, account for the CPU effort of spilling the tuples and reading them back. Discussion: https://postgr.es/m/20200906212112.nzoy5ytrzjjodpfh@development Reviewed-by: Tomas Vondra Backpatch-through: 13	2020-09-07 13:31:59 -07:00
Tom Lane	53367e6c62	Clarify comments in enforce_generic_type_consistency(). Some of the pre-existing comments were vague about whether they referred to all polymorphic types or only the old-style ones. Also be more consistent about using the "family 1" vs "family 2" terminology. Himanshu Upadhyaya and Tom Lane Discussion: https://postgr.es/m/CAPF61jBUg9XoMPNuLpoZ+h6UZ2VxKdNt3rQL1xw1GOBwjWzAXQ@mail.gmail.com	2020-09-07 14:52:33 -04:00
Tom Lane	9c79e646c6	Frob numeric.c loop so that clang will auto-vectorize it too. Experimentation shows that clang will auto-vectorize the critical multiplication loop if the termination condition is written "i2 < limit" rather than "i2 <= limit". This seems unbelievably stupid, but I've reproduced it on both clang 9.0.1 (RHEL8) and 11.0.3 (macOS Catalina). gcc doesn't care, so tweak the code to do it that way. Discussion: https://postgr.es/m/CAJ3gD9evtA_vBo+WMYMyT-u=keHX7-r8p2w7OSRfXf42LTwCZQ@mail.gmail.com	2020-09-07 12:03:04 -04:00
Thomas Munro	861c6e7c8e	Skip unnecessary stat() calls in walkdir(). Some kernels can tell us the type of a "dirent", so we can avoid a call to stat() or lstat() in many cases. Define a new function get_dirent_type() to contain that logic, for use by the backend and frontend versions of walkdir(), and perhaps other callers in future. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Juan José Santamaría Flecha <juanjo.santamaria@gmail.com> Discussion: https://postgr.es/m/CA%2BhUKG%2BFzxupGGN4GpUdbzZN%2Btn6FQPHo8w0Q%2BAPH5Wz8RG%2Bww%40mail.gmail.com	2020-09-07 18:28:06 +12:00
Tom Lane	8870917623	Apply auto-vectorization to the inner loop of numeric multiplication. Compile numeric.c with -ftree-vectorize where available, and adjust the innermost loop of mul_var() so that it is amenable to being auto-vectorized. (Mainly, that involves making it process the arrays left-to-right not right-to-left.) Applying -ftree-vectorize actually makes numeric.o smaller, at least with my compiler (gcc 8.3.1 on x86_64), and it's a little faster too. Independently of that, fixing the inner loop to be vectorizable also makes things a bit faster. But doing both is a huge win for multiplications with lots of digits. For me, the numeric regression test is the same speed to within measurement noise, but numeric_big is a full 45% faster. We also looked into applying -funroll-loops, but that makes numeric.o bloat quite a bit, and the additional speed improvement is very marginal. Amit Khandekar, reviewed and edited a little by me Discussion: https://postgr.es/m/CAJ3gD9evtA_vBo+WMYMyT-u=keHX7-r8p2w7OSRfXf42LTwCZQ@mail.gmail.com	2020-09-06 21:40:39 -04:00
Tom Lane	695de5d1ed	Split Makefile symbol CFLAGS_VECTOR into two symbols. Replace CFLAGS_VECTOR with CFLAGS_UNROLL_LOOPS and CFLAGS_VECTORIZE, allowing us to distinguish whether we want to apply -funroll-loops, -ftree-vectorize, or both to a particular source file. Up to now the only consumer of the symbol has been checksum.c which wants both, so that there was no need to distinguish; but that's about to change. Amit Khandekar, reviewed and edited a little by me Discussion: https://postgr.es/m/CAJ3gD9evtA_vBo+WMYMyT-u=keHX7-r8p2w7OSRfXf42LTwCZQ@mail.gmail.com	2020-09-06 21:28:16 -04:00
Tom Lane	8e3c58e6e4	Refactor pg_get_line() to expose an alternative StringInfo-based API. Letting the caller provide a StringInfo to read into is helpful when the caller needs to merge lines or otherwise modify the data after it's been read. Notably, now the code added by commit `8f8154a50` can use pg_get_line_append() instead of having its own copy of that logic. A follow-on commit will also make use of this. Also, since StringInfo buffers are a minimum of 1KB long, blindly using pg_get_line() in a loop can eat a lot more memory than one would expect. I discovered for instance that commit `e0f05cd5b` caused initdb to consume circa 10MB to read postgres.bki, even though that's under 1MB worth of data. A less memory-hungry alternative is to re-use the same StringInfo for all lines and pg_strdup the results. Discussion: https://postgr.es/m/1315832.1599345736@sss.pgh.pa.us	2020-09-06 14:13:19 -04:00
Magnus Hagander	2a093355aa	Fix typo in comment Author: Hou, Zhijie	2020-09-06 19:26:55 +02:00
Tom Lane	19ad7e1d7b	Fix misleading error message about inconsistent moving-aggregate types. We reported the wrong types when complaining that an aggregate's moving-aggregate implementation is inconsistent with its regular implementation. This was wrong since the feature was introduced, so back-patch to all supported branches. Jeff Janes Discussion: https://postgr.es/m/CAMkU=1x808LH=LPhZp9mNSP0Xd1xDqEd+XeGcvEe48dfE6xV=A@mail.gmail.com	2020-09-06 12:55:13 -04:00
Tom Lane	e0f05cd5ba	Improve some ancient, crufty code in bootstrap + initdb. At some point back in the last century, somebody felt that reading all of pg_type twice was cheaper, or at least easier, than using repalloc() to resize the Typ[] array dynamically. That seems like an entirely wacko proposition, so rewrite the code to do it the other way. (To add insult to injury, there were two not-quite-identical copies of said code.) initdb.c's readfile() function had the same disease of preferring to do double the I/O to avoid resizing its output array. Here, we can make things easier by using the just-invented pg_get_line() function to handle reading individual lines without a predetermined notion of how long they are. On my machine, it's difficult to detect any net change in the overall runtime of initdb from these changes; but they should help on slower buildfarm machines (especially since a buildfarm cycle involves a lot of initdb's these days). My attention was drawn to these places by scan-build complaints, but on inspection they needed a lot more work than just suppressing dead stores :-(	2020-09-05 16:20:04 -04:00
Tom Lane	a5cc4dab6d	Yet more elimination of dead stores and useless initializations. I'm not sure what tool Ranier was using, but the ones I contributed were found by using a newer version of scan-build than I tried before. Ranier Vilela and Tom Lane Discussion: https://postgr.es/m/CAEudQAo1+AcGppxDSg8k+zF4+Kv+eJyqzEDdbpDg58-=MQcerQ@mail.gmail.com	2020-09-05 13:17:32 -04:00
Michael Paquier	8febfd1855	Switch to multi-inserts when registering dependencies for many code paths This commit improves the dependency registrations by taking advantage of the preliminary work done in `63110c62`, to group together the insertion of dependencies of the same type to pg_depend. With the current layer of routines available, and as only dependencies of the same type can be grouped, there are code paths still doing more than one multi-insert when it is necessary to register dependencies of multiple types (constraint and index creation are two cases doing that). While on it, this refactors some of the code to use ObjectAddressSet() when manipulating object addresses. Author: Daniel Gustafsson, Michael Paquier Reviewed-by: Andres Freund, Álvaro Herrera Discussion: https://postgr.es/m/20200807061619.GA23955@paquier.xyz	2020-09-05 21:33:53 +09:00
Peter Eisentraut	556cbdfce4	Fix typo in comment	2020-09-05 11:32:20 +02:00
Michael Paquier	63110c6264	Use multi-inserts for pg_depend This is a follow-up of the work done in `e3931d01`. This case is a bit different than pg_attribute and pg_shdepend: the maximum number of items to insert is known in advance, but there is no need to handle pinned dependencies. Hence, the base allocation for slots is done based on the number of items and the maximum allowed with a cap at 64kB. Slots are initialized once used to minimize the overhead of the operation. The insertions can be done for dependencies of the same type. More could be done by grouping the insertion of multiple dependency types in a single batch. This is left as future work. Some of the multi-insert logic is also simplified for pg_shdepend, as per the feedback discussed for this specific patch. This also moves to indexing.h the variable capping the maximum amount of data that can be used at once for a multi-insert, instead of having separate definitions for pg_attribute, pg_depend and pg_shdepend. Author: Daniel Gustafsson, Michael Paquier Reviewed-by: Andres Freund, Álvaro Herrera Discussion: https://postgr.es/m/20200807061619.GA23955@paquier.xyz	2020-09-05 13:52:47 +09:00
Tom Lane	c8746f999e	Fix over-eager ping'ing in logical replication receiver. Commit `3f60f690f` only partially fixed the broken-status-tracking issue in LogicalRepApplyLoop: we need ping_sent to have the same lifetime as last_recv_timestamp. The effects are much less serious than what that commit fixed, though. AFAICS this would just lead to extra ping requests being sent, once per second until the sender responds. Still, it's a bug, so backpatch to v10 as before. Discussion: https://postgr.es/m/959627.1599248476@sss.pgh.pa.us	2020-09-04 20:33:36 -04:00
Tom Lane	9a851039aa	Remove still more useless assignments. Fix some more things scan-build pointed to as dead stores. In some of these cases, rearranging the code a little leads to more readable code IMO. It's all cosmetic, though. Discussion: https://postgr.es/m/CAEudQAo1+AcGppxDSg8k+zF4+Kv+eJyqzEDdbpDg58-=MQcerQ@mail.gmail.com	2020-09-04 20:33:36 -04:00
Jeff Davis	0852006a94	Fix bogus MaxAllocSize check in logtape.c. Reported-by: Peter Geoghegan Discussion: https://postgr.es/m/CAH2-Wz=NZPZc3-fkdmvu=w2itx0PiB-G6QpxHXZOjuvFAzPdZw@mail.gmail.com Backpatch-through: 13	2020-09-04 12:09:52 -07:00
Alvaro Herrera	f43e295f68	Report expected contrecord length on mismatch When reading a WAL record fails to find continuation record(s) of the proper length, report what it expects, for clarity. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/20200903212152.GA15319@alvherre.pgsql	2020-09-04 14:58:32 -04:00
Tom Lane	38a2d70329	Remove some more useless assignments. Found with clang's scan-build tool. It also whines about a lot of other dead stores that we should not change IMO, either as a matter of style or future-proofing. But these places seem like clear oversights. Discussion: https://postgr.es/m/CAEudQAo1+AcGppxDSg8k+zF4+Kv+eJyqzEDdbpDg58-=MQcerQ@mail.gmail.com	2020-09-04 14:32:19 -04:00
Amit Kapila	ac15b499f7	Fix inline marking introduced in commit `464824323e`. Forgot to add inline marking in changes_filename() declaration. In the passing, add inline marking for a similar function subxact_filename(). Reported-By: Nathan Bossart Discussion: https://postgr.es/m/E98FBE8F-B878-480D-A728-A60C6EED3047@amazon.com	2020-09-04 11:25:16 +05:30
Bruce Momjian	e36e936e0e	remove redundant initializations Reported-by: Ranier Vilela Discussion: https://postgr.es/m/CAEudQAo1+AcGppxDSg8k+zF4+Kv+eJyqzEDdbpDg58-=MQcerQ@mail.gmail.com Author: Ranier Vilela Backpatch-through: master	2020-09-03 22:57:35 -04:00
Michael Paquier	844c05abc3	Remove variable "concurrent" from ReindexStmt This node already handles multiple options using a bitmask, so having a separate boolean flag is not necessary. This simplifies the code a bit with less arguments to give to the reindex routines, by replacing the boolean with an equivalent bitmask value. Reviewed-by: Julien Rouhaud Discussion: https://postgr.es/m/20200902110326.GA14963@paquier.xyz	2020-09-04 10:43:32 +09:00
Tom Lane	67a472d71c	Remove arbitrary restrictions on password length. This patch started out with the goal of harmonizing various arbitrary limits on password length, but after awhile a better idea emerged: let's just get rid of those fixed limits. recv_password_packet() has an arbitrary limit on the packet size, which we don't really need, so just drop it. (Note that this doesn't really affect anything for MD5 or SCRAM password verification, since those will hash the user's password to something shorter anyway. It does matter for auth methods that require a cleartext password.) Likewise remove the arbitrary error condition in pg_saslprep(). The remaining limits are mostly in client-side code that prompts for passwords. To improve those, refactor simple_prompt() so that it allocates its own result buffer that can be made as big as necessary. Actually, it proves best to make a separate routine pg_get_line() that has essentially the semantics of fgets(), except that it allocates a suitable result buffer and hence will never return a truncated line. (pg_get_line has a lot of potential applications to replace randomly-sized fgets buffers elsewhere, but I'll leave that for another patch.) I built pg_get_line() atop stringinfo.c, which requires moving that code to src/common/; but that seems fine since it was a poor fit for src/port/ anyway. This patch is mostly mine, but it owes a good deal to Nathan Bossart who pressed for a solution to the password length problem and created a predecessor patch. Also thanks to Peter Eisentraut and Stephen Frost for ideas and discussion. Discussion: https://postgr.es/m/09512C4F-8CB9-4021-B455-EF4C4F0D55A0@amazon.com	2020-09-03 20:09:18 -04:00
Tom Lane	be4b0c0077	Avoid lockup of a parallel worker when reporting a long error message. Because sigsetjmp() will restore the initial state with signals blocked, the code path in bgworker.c for reporting an error and exiting would execute that way. Usually this is fairly harmless; but if a parallel worker had an error message exceeding the shared-memory communication buffer size (16K) it would lock up, because it would wait for a resume-sending signal from its parallel leader which it would never detect. To fix, just unblock signals at the appropriate point. This can be shown to fail back to 9.6. The lack of parallel query infrastructure makes it difficult to provide a simple test case for 9.5; but I'm pretty sure the issue exists in some form there as well, so apply the code change there too. Vignesh C, reviewed by Bharath Rupireddy, Robert Haas, and myself Discussion: https://postgr.es/m/CALDaNm1d1hHPZUg3xU4XjtWBOLCrA+-2cJcLpw-cePZ=GgDVfA@mail.gmail.com	2020-09-03 16:52:09 -04:00
Tom Lane	8f8154a503	Allow records to span multiple lines in pg_hba.conf and pg_ident.conf. A backslash at the end of a line now causes the next line to be appended to the current one (effectively, the backslash and newline are discarded). This allows long HBA entries to be created without legibility problems. While we're here, get rid of the former hard-wired length limit on pg_hba.conf lines, by using an expansible StringInfo buffer instead of a fixed-size local variable. Since the same code is used to read the ident map file, these changes apply there as well. Fabien Coelho, reviewed by Justin Pryzby and David Zhang Discussion: https://postgr.es/m/alpine.DEB.2.21.2003251906140.15243@pseudo	2020-09-03 12:16:48 -04:00
Amit Kapila	464824323e	Add support for streaming to built-in logical replication. To add support for streaming of in-progress transactions into the built-in logical replication, we need to do three things: * Extend the logical replication protocol, so identify in-progress transactions, and allow adding additional bits of information (e.g. XID of subtransactions). * Modify the output plugin (pgoutput) to implement the new stream API callbacks, by leveraging the extended replication protocol. * Modify the replication apply worker, to properly handle streamed in-progress transaction by spilling the data to disk and then replaying them on commit. We however must explicitly disable streaming replication during replication slot creation, even if the plugin supports it. We don't need to replicate the changes accumulated during this phase, and moreover we don't have a replication connection open so we don't have where to send the data anyway. Author: Tomas Vondra, Dilip Kumar and Amit Kapila Reviewed-by: Amit Kapila, Kuntal Ghosh and Ajin Cherian Tested-by: Neha Sharma, Mahendra Singh Thalor and Ajin Cherian Discussion: https://postgr.es/m/688b0b7f-2f6c-d827-c27b-216a8e3ea700@2ndquadrant.com	2020-09-03 07:54:07 +05:30
Tom Lane	66f1630680	Add string_to_table() function. This splits a string at occurrences of a delimiter. It is exactly like string_to_array() except for producing a set of values instead of an array of values. Thus, the relationship of these two functions is the same as between regexp_split_to_table() and regexp_split_to_array(). Although the same results could be had from unnest(string_to_array()), this is somewhat faster than that, and anyway it seems reasonable to have it for symmetry with the regexp functions. Pavel Stehule, reviewed by Peter Smith Discussion: https://postgr.es/m/CAFj8pRD8HOpjq2TqeTBhSo_QkzjLOhXzGCpKJ4nCs7Y9SQkuPw@mail.gmail.com	2020-09-02 18:23:56 -04:00
Fujii Masao	be9788e998	Avoid unnecessary acquisition of SyncRepLock in transaction commit time. In SyncRepWaitForLSN() routine called in transaction commit time, SyncRepLock is necessary to atomically both check the shared sync_standbys_defined flag and operate the sync replication wait-queue. On the other hand, when the flag is false, the lock is not necessary because the wait-queue is not touched. But due to the changes by commit `48c9f49265`, previously the lock was taken whatever the flag was. This could cause unnecessary performance overhead in every transaction commit time. Therefore this commit avoids that unnecessary aquisition of SyncRepLock. Author: Fujii Masao Reviewed-by: Asim Praveen, Masahiko Sawada, Discussion: https://postgr.es/m/20200406050332.nsscfqjzk2d57zyx@alap3.anarazel.de	2020-09-02 10:55:55 +09:00
Michael Paquier	1d65416661	Improve handling of dropped relations for REINDEX DATABASE/SCHEMA/SYSTEM When multiple relations are reindexed, a scan of pg_class is done first to build the list of relations to work on. However the REINDEX logic has never checked if a relation listed still exists when beginning the work on it, causing for example sudden cache lookup failures. This commit adds safeguards against dropped relations for REINDEX, similarly to VACUUM or CLUSTER where we try to open the relation, ignoring it if it is missing. A new option is added to the REINDEX routines to control if a missed relation is OK to ignore or not. An isolation test, based on REINDEX SCHEMA, is added for the concurrent and non-concurrent cases. Author: Michael Paquier Reviewed-by: Anastasia Lubennikova Discussion: https://postgr.es/m/20200813043805.GE11663@paquier.xyz	2020-09-02 09:08:12 +09:00
Tom Lane	a7212be8b9	Set cutoff xmin more aggressively when vacuuming a temporary table. Since other sessions aren't allowed to look into a temporary table of our own session, we do not need to worry about the global xmin horizon when setting the vacuum XID cutoff. Indeed, if we're not inside a transaction block, we may set oldestXmin to be the next XID, because there cannot be any in-doubt tuples in a temp table, nor any tuples that are dead but still visible to some snapshot of our transaction. (VACUUM, of course, is never inside a transaction block; but we need to test that because CLUSTER shares the same code.) This approach allows us to always clean out a temp table completely during VACUUM, independently of concurrent activity. Aside from being useful in its own right, that simplifies building reproducible test cases. Discussion: https://postgr.es/m/3490536.1598629609@sss.pgh.pa.us	2020-09-01 18:40:43 -04:00
Alvaro Herrera	afc7e0ad55	Raise error on concurrent drop of partitioned index We were already raising an error for DROP INDEX CONCURRENTLY on a partitioned table, albeit a different and confusing one: ERROR: DROP INDEX CONCURRENTLY must be first action in transaction Change that to throw a more comprehensible error: ERROR: cannot drop partitioned index \"%s\" concurrently Michael Paquier authored the test case for indexes on temporary partitioned tables. Backpatch to 11, where indexes on partitioned tables were added. Reported-by: Jan Mussler <jan.mussler@zalando.de> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/16594-d2956ca909585067@postgresql.org	2020-09-01 13:40:43 -04:00
Amit Kapila	4ab77697f6	Fix the SharedFileSetUnregister API. Commit `808e13b282` introduced a few APIs to extend the existing Buffile interface. In SharedFileSetDeleteOnProcExit, it tries to delete the list element while traversing the list with 'foreach' construct which makes the behavior of list traversal unpredictable. Author: Amit Kapila Reviewed-by: Dilip Kumar Tested-by: Dilip Kumar and Neha Sharma Discussion: https://postgr.es/m/CAA4eK1JhLatVcQ2OvwA_3s0ih6Hx9+kZbq107cXVsSWWukH7vA@mail.gmail.com	2020-09-01 08:11:39 +05:30
Tom Lane	3d351d916b	Redefine pg_class.reltuples to be -1 before the first VACUUM or ANALYZE. Historically, we've considered the state with relpages and reltuples both zero as indicating that we do not know the table's tuple density. This is problematic because it's impossible to distinguish "never yet vacuumed" from "vacuumed and seen to be empty". In particular, a user cannot use VACUUM or ANALYZE to override the planner's normal heuristic that an empty table should not be believed to be empty because it is probably about to get populated. That heuristic is a good safety measure, so I don't care to abandon it, but there should be a way to override it if the table is indeed intended to stay empty. Hence, represent the initial state of ignorance by setting reltuples to -1 (relpages is still set to zero), and apply the minimum-ten-pages heuristic only when reltuples is still -1. If the table is empty, VACUUM or ANALYZE (but not CREATE INDEX) will override that to reltuples = relpages = 0, and then we'll plan on that basis. This requires a bunch of fiddly little changes, but we can get rid of some ugly kluges that were formerly needed to maintain the old definition. One notable point is that FDWs' GetForeignRelSize methods will see baserel->tuples = -1 when no ANALYZE has been done on the foreign table. That seems like a net improvement, since those methods were formerly also in the dark about what baserel->tuples = 0 really meant. Still, it is an API change. I bumped catversion because code predating this change would get confused by seeing reltuples = -1. Discussion: https://postgr.es/m/F02298E0-6EF4-49A1-BCB6-C484794D9ACC@thebuild.com	2020-08-30 12:21:51 -04:00
Michael Paquier	9511fb37ac	Reset indisreplident for an invalid index in DROP INDEX CONCURRENTLY A failure when dropping concurrently an index used in a replica identity could leave in pg_index an index marked as !indisvalid and indisreplident. Reindexing this index would switch back indisvalid to true, and if the replica identity of the parent relation was switched to use a different index, it would be possible to finish with more than one index marked as indisreplident. If that were to happen, this could mess up with the relation cache as an incorrect index could be used for the replica identity. Indexes marked as invalid are discarded as candidates for the replica identity, as of RelationGetIndexList(), so similarly to what is done with indisclustered, resetting indisreplident when the index is marked as invalid keeps things consistent. REINDEX CONCURRENTLY's swapping already resets the flag for the old index, while the new index inherits the value of the old index to-be-dropped, so only DROP INDEX was an issue. Even if this is a bug, the sequence able to reproduce a problem requires a failure while running DROP INDEX CONCURRENTLY, something unlikely going to happen in the field, so no backpatch is done. Author: Michael Paquier Reviewed-by: Dmitry Dolgov Discussion: https://postgr.es/m/20200827025721.GN2017@paquier.xyz	2020-08-30 14:14:34 +09:00
Tom Lane	10564ee02c	Fix code for re-finding scan position in a multicolumn GIN index. collectMatchBitmap() needs to re-find the index tuple it was previously looking at, after transiently dropping lock on the index page it's on. The tuple should still exist and be at its prior position or somewhere to the right of that, since ginvacuum never removes tuples but concurrent insertions could add one. However, there was a thinko in that logic, to the effect of expecting any inserted tuples to have the same index "attnum" as what we'd been scanning. Since there's no physical separation of tuples with different attnums, it's not terribly hard to devise scenarios where this fails, leading to transient "lost saved point in index" errors. (While I've duplicated this with manual testing, it seems impossible to make a reproducible test case with our available testing technology.) Fix by just continuing the scan when the attnum doesn't match. While here, improve the error message used if we do fail, so that it matches the wording used in btree for a similar case. collectMatchBitmap()'s posting-tree code path was previously not exercised at all by our regression tests. While I can't make a regression test that exhibits the bug, I can at least improve the code coverage here, so do that. The test case I made for this is an extension of one added by `4b754d6c1`, so it only works in HEAD and v13; didn't seem worth trying hard to back-patch it. Per bug #16595 from Jesse Kinkead. This has been broken since multicolumn capability was added to GIN (commit `27cb66fdf`), so back-patch to all supported branches. Discussion: https://postgr.es/m/16595-633118be8eef9ce2@postgresql.org	2020-08-27 17:36:13 -04:00
Michael Paquier	77c7267c37	Fix comment in procarray.c The description of GlobalVisDataRels was missing, GlobalVisCatalogRels being mentioned instead. Author: Jim Nasby Discussion: https://postgr.es/m/8e06c883-2858-1fd4-07c5-560c28b08dcd@amazon.com	2020-08-27 16:40:34 +09:00
Tom Lane	e942af7b82	Suppress compiler warning in non-cassert builds. Oversight in `808e13b28`, reported by Bruce Momjian. Discussion: https://postgr.es/m/20200826160251.GB21909@momjian.us	2020-08-26 17:08:11 -04:00
Amit Kapila	7e453634bb	Add additional information in the vacuum error context. The additional information added will be an offset number for heap operations. This information will help us in finding the exact tuple due to which the error has occurred. Author: Mahendra Singh Thalor and Amit Kapila Reviewed-by: Sawada Masahiko, Justin Pryzby and Amit Kapila Discussion: https://postgr.es/m/CAKYtNApK488TDF4bMbw+1QH8HJf9cxdNDXquhU50TK5iv_FtCQ@mail.gmail.com	2020-08-26 09:40:52 +05:30
Amit Kapila	808e13b282	Extend the BufFile interface. Allow BufFile to support temporary files that can be used by the single backend when the corresponding files need to be survived across the transaction and need to be opened and closed multiple times. Such files need to be created as a member of a SharedFileSet. Additionally, this commit implements the interface for BufFileTruncate to allow files to be truncated up to a particular offset and extends the BufFileSeek API to support the SEEK_END case. This also adds an option to provide a mode while opening the shared BufFiles instead of always opening in read-only mode. These enhancements in BufFile interface are required for the upcoming patch to allow the replication apply worker, to handle streamed in-progress transactions. Author: Dilip Kumar, Amit Kapila Reviewed-by: Amit Kapila Tested-by: Neha Sharma Discussion: https://postgr.es/m/688b0b7f-2f6c-d827-c27b-216a8e3ea700@2ndquadrant.com	2020-08-26 07:36:43 +05:30
Fujii Masao	50db5964ee	Move codes for pg_backend_memory_contexts from mmgr/mcxt.c to adt/mcxtfuncs.c. Previously the codes for pg_backend_memory_contexts were in src/backend/utils/mmgr/mcxt.c. This commit moves them to src/backend/utils/adt/mcxtfuncs.c so that mcxt.c basically includes only the low-level interface for memory contexts. Author: Atsushi Torikoshi Reviewed-by: Michael Paquier, Fujii Masao Discussion: https://postgr.es/m/20200819135545.GC19121@paquier.xyz	2020-08-26 10:51:31 +09:00
Fujii Masao	29dd6d8bc6	Prevent non-superusers from reading pg_backend_memory_contexts, by default. pg_backend_memory_contexts view contains some internal information of memory contexts. Since exposing them to any users by default may cause security issue, this commit allows only superusers to read this view, by default, like we do for pg_shmem_allocations view. Bump catalog version. Author: Atsushi Torikoshi Reviewed-by: Michael Paquier, Fujii Masao Discussion: https://postgr.es/m/1414992.1597849297@sss.pgh.pa.us	2020-08-26 10:50:02 +09:00
David Rowley	c34605daed	Fixup some misusages of bms_num_members() It's a bit inefficient to test if a Bitmapset is empty by counting all the members and seeing if that number is zero. It's much better just to use bms_is_empty(). Likewise for checking if there are at least two members, just use bms_membership(), which does not need to do anything more after finding two members. Discussion: https://postgr.es/m/CAApHDvpvwm_QjbDOb5xga%2BKmX9XkN9xQavNGm3SvDbVnCYOerQ%40mail.gmail.com Reviewed-by: Tomas Vondra	2020-08-26 10:51:36 +12:00
Amit Kapila	a3c66de6c5	Improve the vacuum error context phase information. We were displaying the wrong phase information for 'info' message in the index clean up phase because we were switching to the previous phase a bit early. We were also not displaying context information for heap phase unless the block number is valid which is fine for error cases but for messages at 'info' or lower error level it appears to be inconsistent with index phase information. Reported-by: Sawada Masahiko Author: Sawada Masahiko Reviewed-by: Amit Kapila Backpatch-through: 13, where it was introduced Discussion: https://postgr.es/m/CA+fd4k4HcbhPnCs7paRTw1K-AHin8y4xKomB9Ru0ATw0UeTy2w@mail.gmail.com	2020-08-24 08:16:19 +05:30
Tom Lane	4d346def15	Avoid pushing quals down into sub-queries that have grouping sets. The trouble with doing this is that an apparently-constant subquery output column isn't really constant if it is a grouping column that appears in only some of the grouping sets. A qual using such a column would be subject to incorrect const-folding after push-down, as seen in bug #16585 from Paul Sivash. To fix, just disable qual pushdown altogether if the sub-query has nonempty groupingSets. While we could imagine far less restrictive solutions, there is not much point in working harder right now, because subquery_planner() won't move HAVING clauses to WHERE within such a subquery. If the qual stays in HAVING it's not going to be a lot more useful than if we'd kept it at the outer level. Having said that, this restriction could be removed if we used a parsetree representation that distinguished such outputs from actual constants, which is something I hope to do in future. Hence, make the patch a minimal addition rather than integrating it more tightly (e.g. by renumbering the existing items in subquery_is_pushdown_safe's comment). Back-patch to 9.5 where grouping sets were introduced. Discussion: https://postgr.es/m/16585-9d8c340d23ade8c1@postgresql.org	2020-08-22 14:46:40 -04:00
Tom Lane	5b02d68e75	Fix ALTER TABLE's scheduling rules for AT_AddConstraint subcommands. Commit `1281a5c90` rearranged the logic in this area rather drastically, and it broke the case of adding a foreign key constraint in the same ALTER that adds the pkey or unique constraint it depends on. While self-referential fkeys are surely a pretty niche case, this used to work so we shouldn't break it. To fix, reorganize the scheduling rules in ATParseTransformCmd so that a transformed AT_AddConstraint subcommand will be delayed into a later pass in all cases, not only when it's been spit out as a side-effect of parsing some other command type. Also tweak the logic so that we won't run ATParseTransformCmd twice while doing this. It seems to work even without that, but it's surely wasting cycles to do so. Per bug #16589 from Jeremy Evans. Back-patch to v13 where the new code was introduced. Discussion: https://postgr.es/m/16589-31c8d981ca503896@postgresql.org	2020-08-22 12:34:17 -04:00
Tom Lane	5028981923	Fix handling of CREATE TABLE LIKE with inheritance. If a CREATE TABLE command uses both LIKE and traditional inheritance, Vars in CHECK constraints and expression indexes that are absorbed from a LIKE parent table tended to get mis-numbered, resulting in wrong answers and/or bizarre error messages (though probably not any actual crashes, thanks to validation occurring in the executor). In v12 and up, the same could happen to Vars in GENERATED expressions, even in cases with no LIKE clause but multiple traditional-inheritance parents. The cause of the problem for LIKE is that parse_utilcmd.c supposed it could renumber such Vars correctly during transformCreateStmt(), which it cannot since we have not yet accounted for columns added via inheritance. Fix that by postponing processing of LIKE INCLUDING CONSTRAINTS, DEFAULTS, GENERATED, INDEXES till after we've performed DefineRelation(). The error with GENERATED and multiple inheritance is a simple oversight in MergeAttributes(); it knows it has to renumber Vars in inherited CHECK constraints, but forgot to apply the same processing to inherited GENERATED expressions (a/k/a defaults). Per bug #16272 from Tom Gottfried. The non-GENERATED variants of the issue are ancient, presumably dating right back to the addition of CREATE TABLE LIKE; hence back-patch to all supported branches. Discussion: https://postgr.es/m/16272-6e32da020e9a9381@postgresql.org	2020-08-21 15:00:47 -04:00

... 3 4 5 6 7 ...

21298 Commits