postgresql

Commit Graph

Author	SHA1	Message	Date
Michael Paquier	f1f10a1ba9	Add declaration-level assertions for compile-time checks Those new assertions can be used at file scope, outside of any function for compilation checks. This commit provides implementations for C and C++, and fallback implementations. Author: Peter Smith Reviewed-by: Andres Freund, Kyotaro Horiguchi, Dagfinn Ilmari Mannsåker, Michael Paquier Discussion: https://postgr.es/m/201DD0641B056142AC8C6645EC1B5F62014B8E8030@SYD1217	2020-02-03 14:48:42 +09:00
Andrew Gierth	1fd687a035	Optimizations for integer to decimal output. Using a lookup table of digit pairs reduces the number of divisions needed, and calculating the length upfront saves some work; these ideas are taken from the code previously committed for floats. David Fetter, reviewed by Kyotaro Horiguchi, Tels, and me. Discussion: https://postgr.es/m/20190924052620.GP31596%40fetter.org	2020-02-01 21:57:14 +00:00
Thomas Munro	93745f1e01	Fix memory leak on DSM slot exhaustion. If we attempt to create a DSM segment when no slots are available, we should return the memory to the operating system. Previously we did that if the DSM_CREATE_NULL_IF_MAXSEGMENTS flag was passed in, but we didn't do it if an error was raised. Repair. Back-patch to 9.4, where DSM segments arrived. Author: Thomas Munro Reviewed-by: Robert Haas Reported-by: Julian Backes Discussion: https://postgr.es/m/CA%2BhUKGKAAoEw-R4om0d2YM4eqT1eGEi6%3DQot-3ceDR-SLiWVDw%40mail.gmail.com	2020-02-01 14:29:13 +13:00
Tom Lane	870ad6a59b	Fix not-quite-right string comparison in parse_jsonb_index_flags(). This code would accept "strinX", where X is any 1-byte character, as meaning "string". Clearly it wasn't meant to do that. No back-patch, since this doesn't affect correct queries and there's some tiny chance we'd break somebody's incorrect query in a minor release. Report and patch by Dominik Czarnota. Discussion: https://postgr.es/m/CABEVAa1dU0mDCAfaT8WF2adVXTDsLVJy_izotg6ze_hh-cn8qQ@mail.gmail.com	2020-01-31 17:26:40 -05:00
Tom Lane	74b35eb468	Fix CheckAttributeType's handling of collations for ranges. Commit `fc7695891` changed CheckAttributeType to recurse into ranges, but made it pass down the wrong collation (always InvalidOid, since ranges as such have no collation). This would result in guaranteed failure when considering a range type whose subtype is collatable. Embarrassingly, we lack any regression tests that would expose such a problem (but fortunately, somebody noticed before we shipped this bug in any release). Fix it to pass down the range's subtype collation property instead, and add some regression test cases to exercise collatable-subtype ranges a bit more. Back-patch to all supported branches, as the previous patch was. Report and patch by Julien Rouhaud, test cases tweaked by me Discussion: https://postgr.es/m/CAOBaU_aBWqNweiGUFX0guzBKkcfJ8mnnyyGC_KBQmO12Mj5f_A@mail.gmail.com	2020-01-31 17:03:55 -05:00
Peter Eisentraut	7c23bfd25c	Sprinkle some const decorations This might help clarify the API a bit.	2020-01-31 12:52:08 +01:00
Thomas Munro	ef02fb15a3	Report time spent in posix_fallocate() as a wait event. When allocating DSM segments with posix_fallocate() on Linux (see commit `899bd785`), report this activity as a wait event exactly as we would if we were using file-backed DSM rather than shm_open()-backed DSM. Author: Thomas Munro Discussion: https://postgr.es/m/CA%2BhUKGKCSh4GARZrJrQZwqs5SYp0xDMRr9Bvb%2BHQzJKvRgL6ZA%40mail.gmail.com	2020-01-31 17:29:41 +13:00
Thomas Munro	d061ea21fc	Adjust DSM and DSA slot usage constants. When running a lot of large parallel queries concurrently, or a plan with a lot of separate Gather nodes, it is possible to run out of DSM slots. There are better solutions to these problems requiring architectural redesign work, but for now, let's adjust the constants so that it's more difficult to hit the limit. 1. Previously, a DSA area would create up to four segments at each size before doubling the size. After this commit, it will create only two at each size, so it ramps up faster and therefore needs fewer slots. 2. Previously, the total limit on DSM slots allowed for 2 per connection. Switch to 5 per connection. Also remove an obsolete nearby comment. Author: Thomas Munro Reviewed-by: Robert Haas, Andres Freund Discussion: https://postgr.es/m/CA%2BhUKGL6H2BpGbiF7Lj6QiTjTGyTLW_vLR%3DSn2tEBeTcYXiMKw%40mail.gmail.com	2020-01-31 17:29:38 +13:00
Thomas Munro	74618e77b4	Handle lack of DSM slots in parallel btree build. If no DSM slots are available, a ParallelContext can still be created, but its seg pointer is NULL. Teach parallel btree build to cope with that by falling back to a regular non-parallel build, to avoid crashing with a segmentation fault. Back-patch to 11, where parallel CREATE INDEX landed. Reported-by: Nicola Contu Reviewed-by: Peter Geoghegan Discussion: https://postgr.es/m/CA%2BhUKGJgJEBnkuODBVomyK3MWFvDBbMVj%3Dgdt6DnRPU-5sQ6UQ%40mail.gmail.com	2020-01-31 10:25:34 +13:00
Alvaro Herrera	c9d2977519	Clean up newlines following left parentheses We used to strategically place newlines after some function call left parentheses to make pgindent move the argument list a few chars to the left, so that the whole line would fit under 80 chars. However, pgindent no longer does that, so the newlines just made the code vertically longer for no reason. Remove those newlines, and reflow some of those lines for some extra naturality. Reviewed-by: Michael Paquier, Tom Lane Discussion: https://postgr.es/m/20200129200401.GA6303@alvherre.pgsql	2020-01-30 13:42:14 -03:00
Alvaro Herrera	4e89c79a52	Remove excess parens in ereport() calls Cosmetic cleanup, not worth backpatching. Discussion: https://postgr.es/m/20200129200401.GA6303@alvherre.pgsql Reviewed-by: Tom Lane, Michael Paquier	2020-01-30 13:32:04 -03:00
Fujii Masao	e6f1e560e4	Make inherited TRUNCATE perform access permission checks on parent table only. Previously, TRUNCATE command through a parent table checked the permissions on not only the parent table but also the children tables inherited from it. This was a bug and inherited queries should perform access permission checks on the parent table only. This commit fixes that bug. Back-patch to all supported branches. Author: Amit Langote Reviewed-by: Fujii Masao Discussion: https://postgr.es/m/CAHGQGwFHdSvifhJE+-GSNqUHSfbiKxaeQQ7HGcYz6SC2n_oDcg@mail.gmail.com	2020-01-31 00:42:06 +09:00
Michael Paquier	b0afdcad21	Fix slot data persistency when advancing physical replication slots Advancing a physical replication slot with pg_replication_slot_advance() did not mark the slot as dirty if any advancing was done, preventing the follow-up checkpoint to flush the slot data to disk. This caused the advancing to be lost even on clean restarts. This does not happen for logical slots as any advancing marked the slot as dirty. Per discussion, the original feature has been implemented so as in the event of a crash the slot may move backwards to a past LSN. This property is kept and more documentation is added about that. This commit adds some new TAP tests to check the persistency of physical and logical slots after advancing across clean restarts. Author: Alexey Kondratov, Michael Paquier Reviewed-by: Andres Freund, Kyotaro Horiguchi, Craig Ringer Discussion: https://postgr.es/m/059cc53a-8b14-653a-a24d-5f867503b0ee@postgrespro.ru Backpatch-through: 11	2020-01-30 11:14:02 +09:00
Tom Lane	50fc694e43	Invent "trusted" extensions, and remove the pg_pltemplate catalog. This patch creates a new extension property, "trusted". An extension that's marked that way in its control file can be installed by a non-superuser who has the CREATE privilege on the current database, even if the extension contains objects that normally would have to be created by a superuser. The objects within the extension will (by default) be owned by the bootstrap superuser, but the extension itself will be owned by the calling user. This allows replicating the old behavior around trusted procedural languages, without all the special-case logic in CREATE LANGUAGE. We have, however, chosen to loosen the rules slightly: formerly, only a database owner could take advantage of the special case that allowed installation of a trusted language, but now anyone who has CREATE privilege can do so. Having done that, we can delete the pg_pltemplate catalog, moving the knowledge it contained into the extension script files for the various PLs. This ends up being no change at all for the in-core PLs, but it is a large step forward for external PLs: they can now have the same ease of installation as core PLs do. The old "trusted PL" behavior was only available to PLs that had entries in pg_pltemplate, but now any extension can be marked trusted if appropriate. This also removes one of the stumbling blocks for our Python 2 -> 3 migration, since the association of "plpythonu" with Python 2 is no longer hard-wired into pg_pltemplate's initial contents. Exactly where we go from here on that front remains to be settled, but one problem is fixed. Patch by me, reviewed by Peter Eisentraut, Stephen Frost, and others. Discussion: https://postgr.es/m/5889.1566415762@sss.pgh.pa.us	2020-01-29 18:42:43 -05:00
Robert Haas	beb4699091	Move jsonapi.c and jsonapi.h to src/common. To make this work, (1) makeJsonLexContextCstringLen now takes the encoding to be used as an argument; (2) check_stack_depth() is made to do nothing in frontend code, and (3) elog(ERROR, ...) is changed to pg_log_fatal + exit in frontend code. Mark Dilger, reviewed and slightly revised by me. Discussion: http://postgr.es/m/CA+TgmoYfOXhd27MUDGioVh6QtpD0C1K-f6ObSA10AWiHBAL5bA@mail.gmail.com	2020-01-29 10:22:51 -05:00
Peter Eisentraut	dc788668bb	Fail if recovery target is not reached Before, if a recovery target is configured, but the archive ended before the target was reached, recovery would end and the server would promote without further notice. That was deemed to be pretty wrong. With this change, if the recovery target is not reached, it is a fatal error. Based-on-patch-by: Leif Gunnar Erlandsen <leif@lako.no> Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/993736dd3f1713ec1f63fc3b653839f5@lako.no	2020-01-29 15:58:14 +01:00
Tom Lane	01d9676a53	Fix dangling pointer in EvalPlanQual machinery. EvalPlanQualStart() supposed that it could re-use the relsubs_rowmark and relsubs_done arrays from a prior instantiation. But since they are allocated in the es_query_cxt of the recheckestate, that's just wrong; EvalPlanQualEnd() will blow away that storage. Therefore we were using storage that could have been reallocated to something else, causing all sorts of havoc. I think this was modeled on the old code's handling of es_epqTupleSlot, but since the code was anyway clearing the arrays at re-use, there's clearly no expectation of importing any outside state. So it's just a dubious savings of a couple of pallocs, which is negligible compared to setting up a new planstate tree. Therefore, just allocate the arrays always. (I moved the allocations slightly for readability.) In principle this bug could cause a problem whenever EPQ rechecks are needed in more than one target table of a ModifyTable plan node. In practice it seems not quite so easy to trigger as that; I couldn't readily duplicate a crash with a partitioned target table, for instance. That's probably down to incidental choices about when to free or reallocate stuff. The added isolation test case does seem to reliably show an assertion failure, though. Per report from Oleksii Kliukin. Back-patch to v12 where the bug was introduced (evidently by commit `3fb307bc4`). Discussion: https://postgr.es/m/EEF05F66-2871-4786-992B-5F45C92FEE2E@hintbits.com	2020-01-28 17:26:37 -05:00
Heikki Linnakangas	30012a04a6	Fix randAccess setting in ReadRecord() Commit `38a957316d` got this backwards. Author: Kyotaro Horiguchi Discussion: https://www.postgresql.org/message-id/20200128.194408.2260703306774646445.horikyota.ntt@gmail.com	2020-01-28 12:55:30 +02:00
Thomas Munro	11da6bccd1	Fix compile error on HP C. Per build farm animal anole, after commit `6f38d4dac3`.	2020-01-28 20:30:40 +13:00
Thomas Munro	78aaa0e823	Don't reset latch in ConditionVariablePrepareToSleep(). It's not OK to do that without calling CHECK_FOR_INTERRUPTS(). Let the next wait loop deal with it, following the usual pattern. One consequence of this bug was that a SIGTERM delivered in a very narrow timing window could leave a parallel worker process waiting forever for a condition variable that will never be signaled, after an error was raised in other process. The code is a bit different in the stable branches due to commit `1321509f`, making problems less likely there. No back-patch for now, but we may finish up deciding to make a similar change after more discussion. Author: Thomas Munro Reviewed-by: Shawn Debnath Reported-by: Tomas Vondra Discussion: https://postgr.es/m/CA%2BhUKGJOm8zZHjVA8svoNT3tHY0XdqmaC_kHitmgXDQM49m1dA%40mail.gmail.com	2020-01-28 15:28:36 +13:00
Amit Kapila	05f18c6b6b	Added relation name in error messages for constraint checks. This gives more information to the user about the error and it makes such messages consistent with the other similar messages in the code. Reported-by: Simon Riggs Author: Mahendra Singh and Simon Riggs Reviewed-by: Beena Emerson and Amit Kapila Discussion: https://postgr.es/m/CANP8+j+7YUvQvGxTrCiw77R23enMJ7DFmyA3buR+fa2pKs4XhA@mail.gmail.com	2020-01-28 07:48:10 +05:30
Michael Paquier	ff8ca5fadd	Add connection parameters to control SSL protocol min/max in libpq These two new parameters, named sslminprotocolversion and sslmaxprotocolversion, allow to respectively control the minimum and the maximum version of the SSL protocol used for the SSL connection attempt. The default setting is to allow any version for both the minimum and the maximum bounds, causing libpq to rely on the bounds set by the backend when negotiating the protocol to use for an SSL connection. The bounds are checked when the values are set at the earliest stage possible as this makes the checks independent of any SSL implementation. Author: Daniel Gustafsson Reviewed-by: Michael Paquier, Cary Huang Discussion: https://postgr.es/m/4F246AE3-A7AE-471E-BD3D-C799D3748E03@yesql.se	2020-01-28 10:40:48 +09:00
Thomas Munro	6f38d4dac3	Remove dependency on HeapTuple from predicate locking functions. The following changes make the predicate locking functions more generic and suitable for use by future access methods: - PredicateLockTuple() is renamed to PredicateLockTID(). It takes ItemPointer and inserting transaction ID instead of HeapTuple. - CheckForSerializableConflictIn() takes blocknum instead of buffer. - CheckForSerializableConflictOut() no longer takes HeapTuple or buffer. Author: Ashwin Agrawal Reviewed-by: Andres Freund, Kuntal Ghosh, Thomas Munro Discussion: https://postgr.es/m/CALfoeiv0k3hkEb3Oqk%3DziWqtyk2Jys1UOK5hwRBNeANT_yX%2Bng%40mail.gmail.com	2020-01-28 13:13:04 +13:00
Tom Lane	4589c6a2a3	Apply project best practices to switches over enum values. In the wake of `1f3a02173`, assorted buildfarm members were warning about "control reaches end of non-void function" or the like. Do what we've done elsewhere: in place of a "default" switch case that will prevent the compiler from warning about unhandled enum values, put a catchall elog() after the switch. And return a dummy value to satisfy compilers that don't know elog() doesn't return.	2020-01-27 18:46:30 -05:00
Robert Haas	73ce2a03f3	Move some code from jsonapi.c to jsonfuncs.c. Specifically, move those functions that depend on ereport() from jsonapi.c to jsonfuncs.c, in preparation for allowing jsonapi.c to be used from frontend code. A few cases where elog(ERROR, ...) is used for can't-happen conditions are left alone; we can handle those in some other way in frontend code. Reviewed by Mark Dilger and Andrew Dunstan. Discussion: http://postgr.es/m/CA+TgmoYfOXhd27MUDGioVh6QtpD0C1K-f6ObSA10AWiHBAL5bA@mail.gmail.com	2020-01-27 11:22:13 -05:00
Robert Haas	1f3a021730	Adjust pg_parse_json() so that it does not directly ereport(). Instead, it now returns a value indicating either success or the type of error which occurred. The old behavior is still available by calling pg_parse_json_or_ereport(). If the new interface is used, an error can be thrown by passing the return value of pg_parse_json() to json_ereport_error(). pg_parse_json() can still elog() in can't-happen cases, but it seems like that issue is best handled separately. Adjust json_lex() and json_count_array_elements() to return an error code, too. This is all in preparation for making the backend's json parser available to frontend code. Reviewed and/or tested by Mark Dilger and Andrew Dunstan. Discussion: http://postgr.es/m/CA+TgmoYfOXhd27MUDGioVh6QtpD0C1K-f6ObSA10AWiHBAL5bA@mail.gmail.com	2020-01-27 11:04:51 -05:00
Thomas Munro	3e4818e9dd	Avoid unnecessary shm writes in Parallel Hash Join. Currently, Parallel Hash Join cannot be used for full/right joins, so there is no point in setting the match flag. It turns out that the cache coherence traffic generated by those writes slows down large systems running many-core joins, so let's stop doing that. In future, if we need to use match bits in parallel joins, we might want to consider setting them only if not already set. Back-patch to 11, where Parallel Hash Join arrived. Reported-by: Deng, Gang Discussion: https://postgr.es/m/0F44E799048C4849BAE4B91012DB910462E9897A%40SHSMSX103.ccr.corp.intel.com	2020-01-27 15:07:03 +13:00
Michael Paquier	10a525230f	Fix some memory leaks and improve restricted token handling on Windows The leaks have been detected by a Coverity run on Windows. No backpatch is done as the leaks are minor. While on it, make restricted token creation more consistent in its error handling by logging an error instead of a warning if missing advapi32.dll, which was missing in the NT4 days. Any modern platform should have this DLL around. Now, if the library is not there, an error is still reported back to the caller, and nothing is done, so there is no behavior change in this commit. Author: Ranier Vilela Discussion: https://postgr.es/m/CAEudQApa9MG0foPkgPX87fipk=vhnF2Xfg+CfUyR08h4R7Mywg@mail.gmail.com	2020-01-27 11:02:05 +09:00
Tom Lane	3ec20c7091	Fix EXPLAIN (SETTINGS) to follow policy about when to print empty fields. In non-TEXT output formats, the "Settings" field should appear when requested, even if it would be empty. Also, get rid of the premature optimization of counting all the GUC_EXPLAIN variables at startup. Since there was no provision for adjusting that count later, all it'd take would be some extension marking a parameter as GUC_EXPLAIN to risk an assertion failure or memory stomp. We could make get_explain_guc_options() count those variables on-the-fly, or dynamically resize its array ... but TBH I do not think that making a transient array of pointers a bit smaller is worth any extra complication, especially when you consider all the other transient space EXPLAIN eats. So just allocate that array at the max possible size. In HEAD, also add some regression test coverage for this feature. Because of the memory-stomp hazard, back-patch to v12 where this feature was added. Discussion: https://postgr.es/m/19416.1580069629@sss.pgh.pa.us	2020-01-26 16:32:19 -05:00
Thomas Munro	f37ff03478	Refactor confusing code in _mdfd_openseg(). As reported independently by a couple of people, _mdfd_openseg() is coded in a way that seems to imply that the segments could be opened in an order that isn't strictly sequential. Even if that were true, it's also using the wrong comparison. It's not an active bug, since the condition is always true anyway, but it's confusing, so replace it with an assertion. Author: Thomas Munro Reviewed-by: Andres Freund, Kyotaro Horiguchi, Noah Misch Discussion: https://postgr.es/m/CA%2BhUKG%2BNBw%2BuSzxF1os-SO6gUuw%3DcqO5DAybk6KnHKzgGvxhxA%40mail.gmail.com Discussion: https://postgr.es/m/20191222091930.GA1280238%40rfd.leadboat.com	2020-01-27 09:12:56 +13:00
Heikki Linnakangas	38a957316d	Refactor XLogReadRecord(), adding XLogBeginRead() function. The signature of XLogReadRecord() required the caller to pass the starting WAL position as argument, or InvalidXLogRecPtr to continue reading at the end of previous record. That's slightly awkward to the callers, as most of them don't want to randomly jump around in the WAL stream, but start reading at one position and then read everything from that point onwards. Remove the 'RecPtr' argument and add a new function XLogBeginRead() to specify the starting position instead. That's more convenient for the callers. Also, xlogreader holds state that is reset when you change the starting position, so having a separate function for doing that feels like a more natural fit. This changes XLogFindNextRecord() function so that it doesn't reset the xlogreader's state to what it was before the call anymore. Instead, it positions the xlogreader to the found record, like XLogBeginRead(). Reviewed-by: Kyotaro Horiguchi, Alvaro Herrera Discussion: https://www.postgresql.org/message-id/5382a7a3-debe-be31-c860-cb810c08f366%40iki.fi	2020-01-26 11:39:00 +02:00
Tom Lane	1001368497	Clean up EXPLAIN's handling of per-worker details. Previously, it was possible for EXPLAIN ANALYZE of a parallel query to produce several different "Workers" fields for a single plan node, because different portions of explain.c independently generated per-worker data and wrapped that output in separate fields. This is pretty bogus, especially for the structured output formats: even if it's not technically illegal, most programs would have a hard time dealing with such data. To improve matters, add infrastructure that allows redirecting per-worker values into a side data structure, and then collect that data into a single "Workers" field after we've finished running all the relevant code for a given plan node. There are a few visible side-effects: * In text format, instead of something like Sort Method: external merge Disk: 4920kB Worker 0: Sort Method: external merge Disk: 5880kB Worker 1: Sort Method: external merge Disk: 5920kB Buffers: shared hit=682 read=10188, temp read=1415 written=2101 Worker 0: actual time=130.058..130.324 rows=1324 loops=1 Buffers: shared hit=337 read=3489, temp read=505 written=739 Worker 1: actual time=130.273..130.512 rows=1297 loops=1 Buffers: shared hit=345 read=3507, temp read=505 written=744 you get Sort Method: external merge Disk: 4920kB Buffers: shared hit=682 read=10188, temp read=1415 written=2101 Worker 0: actual time=130.058..130.324 rows=1324 loops=1 Sort Method: external merge Disk: 5880kB Buffers: shared hit=337 read=3489, temp read=505 written=739 Worker 1: actual time=130.273..130.512 rows=1297 loops=1 Sort Method: external merge Disk: 5920kB Buffers: shared hit=345 read=3507, temp read=505 written=744 * When JIT is enabled, any relevant per-worker JIT stats are attached to the child node of the Gather or Gather Merge node, which is where the other per-worker output has always been. Previously, that info was attached directly to a Gather node, or missed entirely for Gather Merge. * A query's summary JIT data no longer includes a bogus "Worker Number: -1" field. A notable code-level change is that indenting for lines of text-format output should now be handled by calling "ExplainIndentText(es)", instead of hard-wiring how much space to emit. This seems a good deal cleaner anyway. This patch also adds a new "explain.sql" regression test script that's dedicated to testing EXPLAIN. There is more that can be done in that line, certainly, but for now it just adds some coverage of the XML and YAML output formats, which had been completely untested. Although this is surely a bug fix, it's not clear that people would be happy with rearranging EXPLAIN output in a minor release, so apply to HEAD only. Maciek Sakrejda and Tom Lane, based on an idea of Andres Freund's; reviewed by Georgios Kokolatos Discussion: https://postgr.es/m/CAOtHd0AvAA8CLB9Xz0wnxu1U=zJCKrr1r4QwwXi_kcQsHDVU=Q@mail.gmail.com	2020-01-25 18:16:42 -05:00
Dean Rasheed	13661ddd7e	Add functions gcd() and lcm() for integer and numeric types. These compute the greatest common divisor and least common multiple of a pair of numbers using the Euclidean algorithm. Vik Fearing, reviewed by Fabien Coelho. Discussion: https://postgr.es/m/adbd3e0b-e3f1-5bbc-21db-03caf1cef0f7@2ndquadrant.com	2020-01-25 14:00:59 +00:00
Robert Haas	530609aa42	Remove jsonapi.c's lex_accept(). At first glance, this function seems useful, but it actually increases the amount of code required rather than decreasing it. Inline the logic into the callers instead; most callers don't use the 'lexeme' argument for anything and as a result considerable simplification is possible. Along the way, fix the header comment for the nearby function lex_expect(), which mislabeled it as lex_accept(). Patch by me, reviewed by David Steele, Mark Dilger, and Andrew Dunstan. Discussion: http://postgr.es/m/CA+TgmoYfOXhd27MUDGioVh6QtpD0C1K-f6ObSA10AWiHBAL5bA@mail.gmail.com	2020-01-24 10:29:52 -08:00
Robert Haas	11b5e3e35d	Split JSON lexer/parser from 'json' data type support. Keep the code that pertains to the 'json' data type in json.c, but move the lexing and parsing code to a new file jsonapi.c, a name I chose because the corresponding prototypes are in jsonapi.h. This seems like a logical division, because the JSON lexer and parser are also used by the 'jsonb' data type, but the SQL-callable functions in json.c are a separate thing. Also, the new jsonapi.c file needs to include far fewer header files than json.c, which seems like a good sign that this is an appropriate place to insert an abstraction boundary. I took the opportunity to remove a few apparently-unneeded includes from json.c at the same time. Patch by me, reviewed by David Steele, Mark Dilger, and Andrew Dunstan. The previous commit was, too, but I forgot to note it in the commit message. Discussion: http://postgr.es/m/CA+TgmoYfOXhd27MUDGioVh6QtpD0C1K-f6ObSA10AWiHBAL5bA@mail.gmail.com	2020-01-24 10:17:43 -08:00
Robert Haas	ce0425b162	Adjust src/include/utils/jsonapi.h so it's not backend-only. The major change here is that we no longer include jsonb.h into jsonapi.h. The reason that was necessary is that jsonapi.h included several prototypes for functions in jsonfuncs.c that depend on the Jsonb type. Move those prototypes to a new header, jsonfuncs.h, and include it where needed. The other change is that JsonEncodeDateTime is now declared in json.h rather than jsonapi.h. Taken together, these steps eliminate all dependencies of jsonapi.h on backend-only data types and header files, so that it can potentially be included in frontend code.	2020-01-24 09:58:37 -08:00
Fujii Masao	d694e0bb79	Add pg_file_sync() to adminpack extension. This function allows us to fsync the specified file or directory. It's useful, for example, when we want to sync the file that pg_file_write() writes out or that COPY TO exports the data into, for durability. Author: Fujii Masao Reviewed-By: Julien Rouhaud, Arthur Zakirov, Michael Paquier, Atsushi Torikoshi Discussion: https://www.postgresql.org/message-id/CAHGQGwGY8uzZ_k8dHRoW1zDcy1Z7=5GQ+So4ZkVy2u=nLsk=hA@mail.gmail.com	2020-01-24 20:42:52 +09:00
Tom Lane	9a3a75cb81	Fix an oversight in commit `4c70098ff`. I had supposed that the from_char_seq_search() call sites were all passing the constant arrays you'd expect them to pass ... but on looking closer, the one for DY format was passing the days[] array not days_short[]. This accidentally worked because the day abbreviations in English are all the same as the first three letters of the full day names. However, once we took out the "maximum comparison length" logic, it stopped working. As penance for that oversight, add regression test cases covering this, as well as every other switch case in DCH_from_char() that was not reached according to the code coverage report. Also, fold the DCH_RM and DCH_rm cases into one --- now that seq_search is case independent, there's no need to pass different comparison arrays for those cases. Back-patch, as the previous commit was.	2020-01-23 16:15:32 -05:00
Tom Lane	4c70098ffa	Clean up formatting.c's logic for matching constant strings. seq_search(), which is used to match input substrings to constants such as month and day names, had a lot of bizarre and unnecessary behaviors. It was mostly possible to avert our eyes from that before, but we don't want to duplicate those behaviors in the upcoming patch to allow recognition of non-English month and day names. So it's time to clean this up. In particular: * seq_search scribbled on the input string, which is a pretty dangerous thing to do, especially in the badly underdocumented way it was done here. Fortunately the input string is a temporary copy, but that was being made three subroutine levels away, making it something easy to break accidentally. The behavior is externally visible nonetheless, in the form of odd case-folding in error reports about unrecognized month/day names. The scribbling is evidently being done to save a few calls to pg_tolower, but that's such a cheap function (at least for ASCII data) that it's pretty pointless to worry about. In HEAD I switched it to be pg_ascii_tolower to ensure it is cheap in all cases; but there are corner cases in Turkish where this'd change behavior, so leave it as pg_tolower in the back branches. * seq_search insisted on knowing the case form (all-upper, all-lower, or initcap) of the constant strings, so that it didn't have to case-fold them to perform case-insensitive comparisons. This likewise seems like excessive micro-optimization, given that pg_tolower is certainly very cheap for ASCII data. It seems unsafe to assume that we know the case form that will come out of pg_locale.c for localized month/day names, so it's better just to define the comparison rule as "downcase all strings before comparing". (The choice between downcasing and upcasing is arbitrary so far as English is concerned, but it might not be in other locales, so follow citext's lead here.) * seq_search also had a parameter that'd cause it to report a match after a maximum number of characters, even if the constant string were longer than that. This was not actually used because no caller passed a value small enough to cut off a comparison. Replicating that behavior for localized month/day names seems expensive as well as useless, so let's get rid of that too. * from_char_seq_search used the maximum-length parameter to truncate the input string in error reports about not finding a matching name. This leads to rather confusing reports in many cases. Worse, it is outright dangerous if the input string isn't all-ASCII, because we risk truncating the string in the middle of a multibyte character. That'd lead either to delivering an illegible error message to the client, or to encoding-conversion failures that obscure the actual data problem. Get rid of that in favor of truncating at whitespace if any (a suggestion due to Alvaro Herrera). In addition to fixing these things, I const-ified the input string pointers of DCH_from_char and its subroutines, to make sure there aren't any other scribbling-on-input problems. The risk of generating a badly-encoded error message seems like enough of a bug to justify back-patching, so patch all supported branches. Discussion: https://postgr.es/m/29432.1579731087@sss.pgh.pa.us	2020-01-23 13:42:09 -05:00
Michael Paquier	f942dfb952	Clarify some comments in vacuumlazy.c Author: Justin Pryzby Discussion: https://postgr.es/m/20200113004542.GA26045@telsasoft.com	2020-01-23 15:56:56 +09:00
Fujii Masao	41c184bc64	Add GUC ignore_invalid_pages. Detection of WAL records having references to invalid pages during recovery causes PostgreSQL to raise a PANIC-level error, aborting the recovery. Setting ignore_invalid_pages to on causes the system to ignore those WAL records (but still report a warning), and continue recovery. This behavior may cause crashes, data loss, propagate or hide corruption, or other serious problems. However, it may allow you to get past the PANIC-level error, to finish the recovery, and to cause the server to start up. Author: Fujii Masao Reviewed-by: Michael Paquier Discussion: https://www.postgresql.org/message-id/CAHGQGwHCK6f77yeZD4MHOnN+PaTf6XiJfEB+Ce7SksSHjeAWtg@mail.gmail.com	2020-01-22 11:56:34 +09:00
Amit Kapila	79a3efb84d	Fix the computation of max dead tuples during the vacuum. In commit `40d964ec99`, we changed the way memory is allocated for dead tuples but forgot to update the place where we compute the maximum number of dead tuples. This could lead to invalid memory requests. Reported-by: Andres Freund Diagnosed-by: Andres Freund Author: Masahiko Sawada Reviewed-by: Amit Kapila and Dilip Kumar Discussion: https://postgr.es/m/20200121060020.e3cr7s7fj5rw4lok@alap3.anarazel.de	2020-01-22 07:43:51 +05:30
Michael Paquier	a904abe2e2	Fix concurrent indexing operations with temporary tables Attempting to use CREATE INDEX, DROP INDEX or REINDEX with CONCURRENTLY on a temporary relation with ON COMMIT actions triggered unexpected errors because those operations use multiple transactions internally to complete their work. Here is for example one confusing error when using ON COMMIT DELETE ROWS: ERROR: index "foo" already contains data Issues related to temporary relations and concurrent indexing are fixed in this commit by enforcing the non-concurrent path to be taken for temporary relations even if using CONCURRENTLY, transparently to the user. Using a non-concurrent path does not matter in practice as locks cannot be taken on a temporary relation by a session different than the one owning the relation, and the non-concurrent operation is more effective. The problem exists with REINDEX since v12 with the introduction of CONCURRENTLY, and with CREATE/DROP INDEX since CONCURRENTLY exists for those commands. In all supported versions, this caused only confusing error messages to be generated. Note that with REINDEX, it was also possible to issue a REINDEX CONCURRENTLY for a temporary relation owned by a different session, leading to a server crash. The idea to enforce transparently the non-concurrent code path for temporary relations comes originally from Andres Freund. Reported-by: Manuel Rigger Author: Michael Paquier, Heikki Linnakangas Reviewed-by: Andres Freund, Álvaro Herrera, Heikki Linnakangas Discussion: https://postgr.es/m/CA+u7OA6gP7YAeCguyseusYcc=uR8+ypjCcgDDCTzjQ+k6S9ksQ@mail.gmail.com Backpatch-through: 9.4	2020-01-22 09:49:18 +09:00
Tom Lane	9b9c5f279e	Clarify behavior of adding and altering a column in same ALTER command. The behavior of something like ALTER TABLE transactions ADD COLUMN status varchar(30) DEFAULT 'old', ALTER COLUMN status SET default 'current'; is to fill existing table rows with 'old', not 'current'. That's intentional and desirable for a couple of reasons: * It makes the behavior the same whether you merge the sub-commands into one ALTER command or give them separately; * If we applied the new default while filling the table, there would be no way to get the existing behavior in one SQL command. The same reasoning applies in cases that add a column and then manipulate its GENERATED/IDENTITY status in a second sub-command, since the generation expression is really just a kind of default. However, that wasn't very obvious (at least not to me; earlier in the referenced discussion thread I'd thought it was a bug to be fixed). And it certainly wasn't documented. Hence, add documentation, code comments, and a test case to clarify that this behavior is all intentional. In passing, adjust ATExecAddColumn's defaults-related relkind check so that it matches up exactly with ATRewriteTables, instead of being effectively (though not literally) the negated inverse condition. The reasoning can be explained a lot more concisely that way, too (not to mention that the comment now matches the code, which it did not before). Discussion: https://postgr.es/m/10365.1558909428@sss.pgh.pa.us	2020-01-21 16:17:21 -05:00
Andres Freund	affdde2e15	Fix edge case leading to agg transitions skipping ExecAggTransReparent() calls. The code checking whether an aggregate transition value needs to be reparented into the current context has always only compared the transition return value with the previous transition value by datum, i.e. without regard for NULLness. This normally works, because when the transition function returns NULL (via fcinfo->isnull), it'll return a value that won't be the same as its input value. But there's no hard requirement that that's the case. And it turns out, it's possible to hit this case (see discussion or reproducers), leading to a non-null transition value not being reparented, followed by a crash caused by that. Instead of adding another comparison of NULLness, instead have ExecAggTransReparent() ensure that pergroup->transValue ends up as 0 when the new transition value is NULL. That avoids having to add an additional branch to the much more common cases of the transition function returning the old transition value (which is a pointer in this case), and when the new value is different, but not NULL. In branches since `69c3936a14`, also deduplicate the reparenting code between the expression evaluation based transitions, and the path for ordered aggregates. Reported-By: Teodor Sigaev, Nikita Glukhov Author: Andres Freund Discussion: https://postgr.es/m/bd34e930-cfec-ea9b-3827-a8bc50891393@sigaev.ru Backpatch: 9.4-, this issue has existed since at least 7.4	2020-01-20 23:26:51 -08:00
Tom Lane	31f403e95f	Further tweaking of jsonb_set_lax(). Some buildfarm members were still warning about this, because in `9c679a08f` I'd missed decorating one of the ereport() code paths with a dummy return. Also, adjust the error messages to be more in line with project style guide.	2020-01-20 14:26:56 -05:00
Heikki Linnakangas	4c87010981	Fix crash in BRIN inclusion op functions, due to missing datum copy. The BRIN add_value() and union() functions need to make a longer-lived copy of the argument, if they want to store it in the BrinValues struct also passed as argument. The functions for the "inclusion operator classes" used with box, range and inet types didn't take into account that the union helper function might return its argument as is, without making a copy. Check for that case, and make a copy if necessary. That case arises at least with the range_union() function, when one of the arguments is an 'empty' range: CREATE TABLE brintest (n numrange); CREATE INDEX brinidx ON brintest USING brin (n); INSERT INTO brintest VALUES ('empty'); INSERT INTO brintest VALUES (numrange(0, 2^1000::numeric)); INSERT INTO brintest VALUES ('(-1, 0)'); SELECT brin_desummarize_range('brinidx', 0); SELECT brin_summarize_range('brinidx', 0); Backpatch down to 9.5, where BRIN was introduced. Discussion: https://www.postgresql.org/message-id/e6e1d6eb-0a67-36aa-e779-bcca59167c14%40iki.fi Reviewed-by: Emre Hasegeli, Tom Lane, Alvaro Herrera	2020-01-20 10:36:35 +02:00
Amit Kapila	40d964ec99	Allow vacuum command to process indexes in parallel. This feature allows the vacuum to leverage multiple CPUs in order to process indexes. This enables us to perform index vacuuming and index cleanup with background workers. This adds a PARALLEL option to VACUUM command where the user can specify the number of workers that can be used to perform the command which is limited by the number of indexes on a table. Specifying zero as a number of workers will disable parallelism. This option can't be used with the FULL option. Each index is processed by at most one vacuum process. Therefore parallel vacuum can be used when the table has at least two indexes. The parallel degree is either specified by the user or determined based on the number of indexes that the table has, and further limited by max_parallel_maintenance_workers. The index can participate in parallel vacuum iff its size is greater than min_parallel_index_scan_size. Author: Masahiko Sawada and Amit Kapila Reviewed-by: Dilip Kumar, Amit Kapila, Robert Haas, Tomas Vondra, Mahendra Singh and Sergei Kornilov Tested-by: Mahendra Singh and Prabhat Sahu Discussion: https://postgr.es/m/CAD21AoDTPMgzSkV4E3SFo1CH_x50bf5PqZFQf4jmqjk-C03BWg@mail.gmail.com https://postgr.es/m/CAA4eK1J-VoR9gzS5E75pcD-OH0mEyCdp8RihcwKrcuw7J-Q0+w@mail.gmail.com	2020-01-20 07:57:49 +05:30
Tom Lane	9c679a08f0	Silence minor compiler warnings. Ensure that ClassifyUtilityCommandAsReadOnly() has defined behavior even if TransactionStmt.kind has a value that's not one of the declared values for its enum. Suppress warnings from compilers that don't know that elog(ERROR) doesn't return, in ClassifyUtilityCommandAsReadOnly() and jsonb_set_lax(). Per Coverity and buildfarm.	2020-01-19 16:04:36 -05:00
Heikki Linnakangas	7aaefadaac	Remove separate files for the initial contents of pg_(sh)description This data was only in separate files because it was the most convenient way to handle it with a shell script. Now that we use a general-purpose programming language, it's easy to assemble the data into the same format as the rest of the catalogs and output it into postgres.bki. This allows removal of some special-purpose code from initdb.c. Discussion: https://www.postgresql.org/message-id/CACPNZCtVFtjHre6hg9dput0qRPp39pzuyA2A6BT8wdgrRy%2BQdA%40mail.gmail.com Author: John Naylor	2020-01-19 13:54:58 +02:00
Michael Paquier	41aadeeb12	Add GUC checks for ssl_min_protocol_version and ssl_max_protocol_version Mixing incorrect bounds set in the SSL context leads to confusing error messages generated by OpenSSL which are hard to act on. New checks are added within the GUC machinery to improve the user experience as they apply to any SSL implementation, not only OpenSSL, and doing the checks beforehand avoids the creation of a SSL during a reload (or startup) which we know will never be used anyway. Backpatch down to 12, as those parameters have been introduced by `e73e67c`. Author: Michael Paquier Reviewed-by: Daniel Gustafsson Discussion: https://postgr.es/m/20200114035420.GE1515@paquier.xyz Backpatch-through: 12	2020-01-18 12:32:43 +09:00
Alexander Korotkov	4b754d6c16	Avoid full scan of GIN indexes when possible The strategy of a GIN index scan is driven by the opclass-specific extract_query method. One search mode this method can request is GIN_SEARCH_MODE_ALL. This mode means that a matching tuple may contain none of the extracted entries. A simple example is the '!term' tsquery, which doesn't need any term to exist in a matching tsvector. In order to handle such a scan key, GIN calculates a virtual entry, which contains all TIDs of all entries of the attribute. In fact this is a full scan of the index attribute. Typically this is very slow, but it allows GIN to handle some queries correctly. However, the current algorithm calculates such a virtual entry for each GIN_SEARCH_MODE_ALL scan key, even if there are several of them for the same attribute. This is clearly not optimal. This commit improves the situation by introducing "exclude only" scan keys. Such scan keys are not capable of returning a set of matching TIDs. Instead, they are capable only of filtering TIDs produced by normal scan keys. Therefore, each attribute should contain at least one normal scan key, while the rest of them may be "exclude only" if the search mode is GIN_SEARCH_MODE_ALL. The same optimization might be applied to the whole scan, not per-attribute. But that leads to a NULL values elimination problem. There is a trade-off between multiple possible ways to do this. We probably want to do this later using some cost-based decision algorithm. Discussion: https://postgr.es/m/CAOBaU_YGP5-BEt5Cc0%3DzMve92vocPzD%2BXiZgiZs1kjY0cj%3DXBg%40mail.gmail.com Author: Nikita Glukhov, Alexander Korotkov, Tom Lane, Julien Rouhaud Reviewed-by: Julien Rouhaud, Tomas Vondra, Tom Lane	2020-01-18 01:11:39 +03:00
Tom Lane	41c6f9db25	Repair more failures with SubPlans in multi-row VALUES lists. Commit `9b63c13f0` turns out to have been fundamentally misguided: the parent node's subPlan list is by no means the only way in which a child SubPlan node can be hooked into the outer execution state. As shown in bug #16213 from Matt Jibson, we can also get short-lived tuple table slots added to the outer es_tupleTable list. At this point I have little faith that there aren't other possible connections as well; the long time it took to notice this problem shows that this isn't a heavily-exercised situation. Therefore, revert that fix, returning to the coding that passed a NULL parent plan pointer down to the transiently-built subexpressions. That gives us a pretty good guarantee that they won't hook into the outer executor state in any way. But then we need some other solution to make SubPlans work. Adopt the solution speculated about in the previous commit's log message: do expression initialization at plan startup for just those VALUES rows containing SubPlans, abandoning the goal of reclaiming memory intra-query for those rows. In practice it seems unlikely that queries containing a vast number of VALUES rows would be using SubPlans in them, so this should not give up much. (BTW, this test case also refutes my claim in connection with the prior commit that the issue only arises with use of LATERAL. That was just wrong: some variants of SubLink always produce SubPlans.) As with previous patch, back-patch to all supported branches. Discussion: https://postgr.es/m/16213-871ac3bc208ecf23@postgresql.org	2020-01-17 16:17:31 -05:00
Alvaro Herrera	15cac3a523	Set ReorderBufferTXN->final_lsn more eagerly ... specifically, set it incrementally as each individual change is spilled down to disk. This way, it is set correctly when the transaction disappears without trace, ie. without leaving an XACT_ABORT wal record. (This happens when the server crashes midway through a transaction.) Failing to have final_lsn prevents ReorderBufferRestoreCleanup() from working, since it needs the final_lsn in order to know the endpoint of its iteration through spilled files. Commit `df9f682c7b` already tried to fix the problem, but it didn't set the final_lsn in all cases. Revert that, since it's no longer needed. Author: Vignesh C Reviewed-by: Amit Kapila, Dilip Kumar Discussion: https://postgr.es/m/CALDaNm2CLk+K9JDwjYST0sPbGg5AQdvhUt0jbKyX_HdAE0jk3A@mail.gmail.com	2020-01-17 18:00:39 -03:00
Tomas Vondra	543852fd8b	Allocate freechunks bitmap as part of SlabContext The bitmap used by SlabCheck to cross-check free chunks in a block used to be allocated for each SlabCheck call, and was never freed. The memory leak could be fixed by simply adding a pfree call, but it's actually a bad idea to do any allocations in SlabCheck at all as it assumes the state of the memory management as a whole is sane. So instead we allocate the bitmap as part of SlabContext, which means we don't need to do any allocations in SlabCheck and the bitmap goes away together with the SlabContext. Backpatch to 10, where the Slab context was introduced. Author: Tomas Vondra Reported-by: Andres Freund Reviewed-by: Tom Lane Backpatch-through: 10 Discussion: https://www.postgresql.org/message-id/20200116044119.g45f7pmgz4jmodxj%40alap3.anarazel.de	2020-01-17 15:29:11 +01:00
Andrew Dunstan	a83586b554	Add a non-strict version of jsonb_set jsonb_set_lax() is the same as jsonb_set, except that it takes an extra argument that specifies what to do if the value argument is NULL. The default is 'use_json_null'. Other possibilities are 'raise_exception', 'return_target' and 'delete_key', all these behaviours having been suggested as reasonable by various users. Discussion: https://postgr.es/m/375873e2-c957-3a8d-64f9-26c43c2b16e7@2ndQuadrant.com Reviewed by: Pavel Stehule	2020-01-17 11:52:39 +10:30
Michael Paquier	f7cd5896a6	Move OpenSSL routines for min/max protocol setting to src/common/ Two routines have been added in OpenSSL 1.1.0 to set the protocol bounds allowed within a given SSL context: - SSL_CTX_set_min_proto_version - SSL_CTX_set_max_proto_version As Postgres supports OpenSSL down to 1.0.1 (as of HEAD), equivalent replacements exist in the tree, which are only available for the backend. A follow-up patch is planned to add control of the SSL protocol bounds for libpq, so move those routines to src/common/ so as libpq can use them. Author: Daniel Gustafsson Discussion: https://postgr.es/m/4F246AE3-A7AE-471E-BD3D-C799D3748E03@yesql.se	2020-01-17 10:06:17 +09:00
Tom Lane	5afaa2e426	Rationalize code placement between wchar.c, encnames.c, and mbutils.c. Move all the backend-only code that'd crept into wchar.c and encnames.c into mbutils.c. To remove the last few #ifdef dependencies from wchar.c and encnames.c, also make the following changes: * Adjust get_encoding_name_for_icu to return NULL, not throw an error, for unsupported encodings. Its sole caller can perfectly well throw an error instead. (While at it, I also made this function and its sibling is_encoding_supported_by_icu proof against out-of-range encoding IDs.) * Remove the overlength-name error condition from pg_char_to_encoding. It's completely silly not to treat that just like any other the-name-is-not-in-the-table case. Also, get rid of pg_mic_mblen --- there's no obvious reason why conv.c shouldn't call pg_mule_mblen instead. Other than that, this is just code movement and comment-polishing with no functional changes. Notably, I reordered declarations in pg_wchar.h to show which functions are frontend-accessible and which are not. Discussion: https://postgr.es/m/CA+TgmoYO8oq-iy8E02rD8eX25T-9SmyxKWqqks5OMHxKvGXpXQ@mail.gmail.com	2020-01-16 18:08:21 -05:00
Tom Lane	e6afa8918c	Move wchar.c and encnames.c to src/common/. Formerly, various frontend directories symlinked these two sources and then built them locally. That's an ancient, ugly hack, and we now have a much better way: put them into libpgcommon. So do that. (The immediate motivation for this is the prospect of having to introduce still more symlinking if we don't.) This commit moves these two files absolutely verbatim, for ease of reviewing the git history. There's some follow-on work to be done that will modify them a bit. Robert Haas, Tom Lane Discussion: https://postgr.es/m/CA+TgmoYO8oq-iy8E02rD8eX25T-9SmyxKWqqks5OMHxKvGXpXQ@mail.gmail.com	2020-01-16 15:58:55 -05:00
Robert Haas	2eb34ac369	Fix problems with "read only query" checks, and refactor the code. Previously, check_xact_readonly() was responsible for determining which types of queries could not be run in a read-only transaction, standard_ProcessUtility() was responsible for prohibiting things which were allowed in read only transactions but not in recovery, and utility commands were basically prohibited in bulk in parallel mode by calls to CommandIsReadOnly() in functions.c and spi.c. This situation was confusing and error-prone. Accordingly, move all the checks to a new function ClassifyUtilityCommandAsReadOnly(), which determines the degree to which a given statement is read only. In the old code, check_xact_readonly() inadvertently failed to handle several statement types that actually should have been prohibited, specifically T_CreatePolicyStmt, T_AlterPolicyStmt, T_CreateAmStmt, T_CreateStatsStmt, T_AlterStatsStmt, and T_AlterCollationStmt. As a result, these statements were erroneously allowed in read only transactions, parallel queries, and standby operation. Generally, they would fail anyway due to some lower-level error check, but we shouldn't rely on that. In the new code structure, future omissions of this type should cause ClassifyUtilityCommandAsReadOnly() to complain about an unrecognized node type. As a fringe benefit, this means we can allow certain types of utility commands in parallel mode, where it's safe to do so. This allows ALTER SYSTEM, CALL, DO, CHECKPOINT, COPY FROM, EXPLAIN, and SHOW. It might be possible to allow additional commands with more work and thought. Along the way, document the thinking process behind the current set of checks, as per discussion especially with Peter Eisentraut. There is some interest in revising some of these rules, but that seems like a job for another patch. Patch by me, reviewed by Tom Lane, Stephen Frost, and Peter Eisentraut. 
Discussion: http://postgr.es/m/CA+TgmoZ_rLqJt5sYkvh+JpQnfX0Y+B2R+qfi820xNih6x-FQOQ@mail.gmail.com	2020-01-16 12:11:31 -05:00
Tom Lane	0db7c67051	Minor code beautification in regexp.c. Remove duplicated code (apparently introduced by commit `c8ea87e4b`). Also get rid of some PG_USED_FOR_ASSERTS_ONLY variables we don't really need to have. Li Japin, Tom Lane Discussion: https://postgr.es/m/PS1PR0601MB3770A5595B6E5E3FD6F35724B6360@PS1PR0601MB3770.apcprd06.prod.outlook.com	2020-01-16 11:31:30 -05:00
Tom Lane	1281a5c907	Restructure ALTER TABLE execution to fix assorted bugs. We've had numerous bug reports about how (1) IF NOT EXISTS clauses in ALTER TABLE don't behave as-expected, and (2) combining certain actions into one ALTER TABLE doesn't work, though executing the same actions as separate statements does. This patch cleans up all of the cases so far reported from the field, though there are still some oddities associated with identity columns. The core problem behind all of these bugs is that we do parse analysis of ALTER TABLE subcommands too soon, before starting execution of the statement. The root of the bugs in group (1) is that parse analysis schedules derived commands (such as a CREATE SEQUENCE for a serial column) before it's known whether the IF NOT EXISTS clause should cause a subcommand to be skipped. The root of the bugs in group (2) is that earlier subcommands may change the catalog state that later subcommands need to be parsed against. Hence, postpone parse analysis of ALTER TABLE's subcommands, and do that one subcommand at a time, during "phase 2" of ALTER TABLE which is the phase that does catalog rewrites. Thus the catalog effects of earlier subcommands are already visible when we analyze later ones. (The sole exception is that we do parse analysis for ALTER COLUMN TYPE subcommands during phase 1, so that their USING expressions can be parsed against the table's original state, which is what we need. Arguably, these bugs stem from falsely concluding that because ALTER COLUMN TYPE must do early parse analysis, every other command subtype can too.) This means that ALTER TABLE itself must deal with execution of any non-ALTER-TABLE derived statements that are generated by parse analysis. Add a suitable entry point to utility.c to accept those recursive calls, and create a struct to pass through the information needed by the recursive call, rather than making the argument lists of AlterTable() and friends even longer. 
Getting this to work correctly required a little bit of fiddling with the subcommand pass structure, in particular breaking up AT_PASS_ADD_CONSTR into multiple passes. But otherwise it's mostly a pretty straightforward application of the above ideas. Fixing the residual issues for identity columns requires refactoring of where the dependency link from an identity column to its sequence gets set up. So that seems like suitable material for a separate patch, especially since this one is pretty big already. Discussion: https://postgr.es/m/10365.1558909428@sss.pgh.pa.us	2020-01-15 18:49:24 -05:00
Alvaro Herrera	a166d408eb	Report progress of ANALYZE commands This uses the progress reporting infrastructure added by `c16dc1aca5`, adding support for ANALYZE. Co-authored-by: Álvaro Herrera <alvherre@alvh.no-ip.org> Co-authored-by: Tatsuro Yamada <tatsuro.yamada.tf@nttcom.co.jp> Reviewed-by: Julien Rouhaud, Robert Haas, Anthony Nowocien, Kyotaro Horiguchi, Vignesh C, Amit Langote	2020-01-15 11:14:39 -03:00
Michael Paquier	ac5bdf6261	Fix buggy logic in isTempNamespaceInUse() The logic introduced in this routine as of `246a6c8` would report an incorrect result when a session calls it to check if the temporary namespace owned by the session is in use or not. It is possible to optimize more the routine in this case to avoid a PGPROC lookup, but let's keep the logic simple. As this routine is used only by autovacuum for now, there were no live bugs, still let's be correct for any future code involving it. Author: Michael Paquier Reviewed-by: Julien Rouhaud Discussion: https://postgr.es/m/20200113093703.GA41902@paquier.xyz Backpatch-through: 11	2020-01-15 13:58:33 +09:00
Amit Kapila	4d8a8d0c73	Introduce IndexAM fields for parallel vacuum. Introduce new fields amusemaintenanceworkmem and amparallelvacuumoptions in IndexAmRoutine for parallel vacuum. The amusemaintenanceworkmem tells whether a particular IndexAM uses maintenance_work_mem or not. This will help in controlling the memory used by individual workers as otherwise, each worker can consume memory equal to maintenance_work_mem. The amparallelvacuumoptions tell whether a particular IndexAM participates in a parallel vacuum and if so in which phase (bulkdelete, vacuumcleanup) of vacuum. Author: Masahiko Sawada and Amit Kapila Reviewed-by: Dilip Kumar, Amit Kapila, Tomas Vondra and Robert Haas Discussion: https://postgr.es/m/CAD21AoDTPMgzSkV4E3SFo1CH_x50bf5PqZFQf4jmqjk-C03BWg@mail.gmail.com https://postgr.es/m/CAA4eK1LmcD5aPogzwim5Nn58Ki+74a6Edghx4Wd8hAskvHaq5A@mail.gmail.com	2020-01-15 07:24:14 +05:30
Peter Eisentraut	fe233366f2	Fix compiler warning about format on Windows On 64-bit Windows, pid_t is long long int, so a %d format isn't enough.	2020-01-14 23:59:18 +01:00
Peter Eisentraut	3297308278	walreceiver uses a temporary replication slot by default If no permanent replication slot is configured using primary_slot_name, the walreceiver now creates and uses a temporary replication slot. A new setting wal_receiver_create_temp_slot can be used to disable this behavior, for example, if the remote instance is out of replication slots. Reviewed-by: Masahiko Sawada <masahiko.sawada@2ndquadrant.com> Discussion: https://www.postgresql.org/message-id/CA%2Bfd4k4dM0iEPLxyVyme2RAFsn8SUgrNtBJOu81YqTY4V%2BnqZA%40mail.gmail.com	2020-01-14 14:40:41 +01:00
Peter Eisentraut	ee4ac46c8e	Expose PQbackendPID() through walreceiver API This will be used by a subsequent patch. Reviewed-by: Masahiko Sawada <masahiko.sawada@2ndquadrant.com> Discussion: https://www.postgresql.org/message-id/CA%2Bfd4k4dM0iEPLxyVyme2RAFsn8SUgrNtBJOu81YqTY4V%2BnqZA%40mail.gmail.com	2020-01-14 14:40:41 +01:00
Peter Eisentraut	f595117e24	ALTER TABLE ... ALTER COLUMN ... DROP EXPRESSION Add an ALTER TABLE subcommand for dropping the generated property from a column, per SQL standard. Reviewed-by: Sergei Kornilov <sk@zsrv.org> Discussion: https://www.postgresql.org/message-id/flat/2f7f1d9c-946e-0453-d841-4f38eb9d69b6%402ndquadrant.com	2020-01-14 13:36:03 +01:00
Dean Rasheed	d751ba5235	Make rewriter prevent auto-updates on views with conditional INSTEAD rules. A view with conditional INSTEAD rules and no unconditional INSTEAD rules or INSTEAD OF triggers is not auto-updatable. Previously we relied on a check in the executor to catch this, but that's problematic since the planner may fail to properly handle such a query and thus return a particularly unhelpful error to the user, before reaching the executor check. Instead, trap this in the rewriter and report the correct error there. Doing so also allows us to include more useful error detail than the executor check can provide. This doesn't change the existing behaviour of updatable views; it merely ensures that useful error messages are reported when a view isn't updatable. Per report from Pengzhou Tang, though not adopting that suggested fix. Back-patch to all supported branches. Discussion: https://postgr.es/m/CAG4reAQn+4xB6xHJqWdtE0ve_WqJkdyCV4P=trYr4Kn8_3_PEA@mail.gmail.com	2020-01-14 09:52:21 +00:00
Tom Lane	7f380c59f8	Reduce size of backend scanner's tables. Previously, the core scanner's yy_transition[] array had 37045 elements. Since that number is larger than INT16_MAX, Flex generated the array to contain 32-bit integers. By reimplementing some of the bulkier scanner rules, this patch reduces the array to 20495 elements. The much smaller total length, combined with the consequent use of 16-bit integers for the array elements reduces the binary size by over 200kB. This was accomplished in two ways: 1. Consolidate handling of quote continuations into a new start condition, rather than duplicating that logic for five different string types. 2. Treat Unicode strings and identifiers followed by a UESCAPE sequence as three separate tokens, rather than one. The logic to de-escape Unicode strings is moved to the filter code in parser.c, which already had the ability to provide special processing for token sequences. While we could have implemented the conversion in the grammar, that approach was rejected for performance and maintainability reasons. Performance in microbenchmarks of raw parsing seems equal or slightly faster in most cases, and it's reasonable to expect that in real-world usage (with more competition for the CPU cache) there will be a larger win. The exception is UESCAPE sequences; lexing those is about 10% slower, primarily because the scanner now has to be called three times rather than one. This seems acceptable since that feature is very rarely used. The psql and ecpg lexers are likewise modified, primarily because we want to keep them all in sync. Since those lexers don't use the space-hogging -CF option, the space savings is much less, but it's still good for perhaps 10kB apiece. While at it, merge the ecpg lexer's handling of C-style comments used in SQL and in C. 
Those have different rules regarding nested comments, but since we already have the ability to keep track of the previous start condition, we can use that to handle both cases within a single start condition. This matches the core scanner more closely. John Naylor Discussion: https://postgr.es/m/CACPNZCvaoa3EgVWm5yZhcSTX6RAtaLgniCPcBVOCwm8h3xpWkw@mail.gmail.com	2020-01-13 15:04:31 -05:00
Peter Eisentraut	259bbe1778	Fix base backup with database OIDs larger than INT32_MAX The use of pg_atoi() for parsing a string into an Oid fails for values larger than INT32_MAX, since OIDs are unsigned. Instead, use atooid(). While this has less error checking, the contents of the data directory are expected to be trustworthy, so we don't need to go out of our way to do full error checking. Discussion: https://www.postgresql.org/message-id/flat/dea47fc8-6c89-a2b1-07e3-754ff1ab094b%402ndquadrant.com	2020-01-13 13:41:12 +01:00
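The signed-versus-unsigned parsing problem described in the entry above can be sketched as follows. Both helper names are hypothetical stand-ins and are not PostgreSQL functions: `parse_as_int32` mimics a parser that range-checks against a signed 32-bit integer, and `parse_as_oid` follows the spirit of an unsigned conversion built on `strtoul`, with the same reduced error checking the commit message accepts.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse s as a signed 32-bit integer with full range
 * checking. Values above INT32_MAX are rejected, which is the failure mode
 * for large OIDs.
 */
bool
parse_as_int32(const char *s, int32_t *out)
{
    char       *end;
    long long   v;

    errno = 0;
    v = strtoll(s, &end, 10);
    if (errno != 0 || end == s || *end != '\0')
        return false;
    if (v < INT32_MIN || v > INT32_MAX)
        return false;
    *out = (int32_t) v;
    return true;
}

/*
 * Hypothetical helper: parse s as an unsigned 32-bit value with minimal
 * checking, trusting the input to be a well-formed number.
 */
uint32_t
parse_as_oid(const char *s)
{
    return (uint32_t) strtoul(s, NULL, 10);
}
```

With a directory name such as `3000000000`, the signed helper reports failure while the unsigned one returns the full value.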
Michael Paquier	7689d907bb	Fix comment in heapam.c Improvement per suggestion from Tom Lane. Author: Daniel Gustafsson Discussion: https://postgr.es/m/FED18699-4270-4778-8DA8-10F119A5ECF3@yesql.se	2020-01-13 17:57:38 +09:00
Amit Kapila	4e514c6180	Delete empty pages in each pass during GIST VACUUM. Earlier, we used to postpone deleting empty pages till the second stage of vacuum to amortize the cost of scanning internal pages. However, that can sometimes (say vacuum is canceled or errored between first and second stage) delay the pages to be recycled. Another thing is that to facilitate deleting empty pages in the second stage, we need to share the information about internal and empty pages between different stages of vacuum. It will be quite tricky to share this information via DSM which is required for the upcoming parallel vacuum patch. Also, it will bring the logic to reclaim deleted pages closer to nbtree where we delete empty pages in each pass. Overall, the advantages of deleting empty pages in each pass outweigh the advantages of postponing the same. Author: Dilip Kumar, with changes by Amit Kapila Reviewed-by: Sawada Masahiko and Amit Kapila Discussion: https://postgr.es/m/CAA4eK1LGr+MN0xHZpJ2dfS8QNQ1a_aROKowZB+MPNep8FVtwAA@mail.gmail.com	2020-01-13 07:59:44 +05:30
Tomas Vondra	eae056c19e	Apply multiple multivariate MCV lists when possible Until now we've only used a single multivariate MCV list per relation, covering the largest number of clauses. So for example given a query SELECT * FROM t WHERE a = 1 AND b =1 AND c = 1 AND d = 1 and extended statistics on (a,b) and (c,d), we'd only pick and use one of them. This commit improves this by repeatedly picking and applying the best statistics (matching the largest number of remaining clauses) until no additional statistics is applicable. This greedy algorithm is simple, but may not be optimal. A different choice of statistics may leave fewer clauses unestimated and/or give better estimates for some other reason. This can however happen only when there are overlapping statistics, and selecting one makes it impossible to use the other. E.g. with statistics on (a,b), (c,d), (b,c,d), we may pick either (a,b) and (c,d) or (b,c,d). But it's not clear which option is the best one. We however assume cases like this are rare, and the easiest solution is to define statistics covering the whole group of correlated columns. In the future we might support overlapping stats, using some of the clauses as conditions (in conditional probability sense). Author: Tomas Vondra Reviewed-by: Mark Dilger, Kyotaro Horiguchi Discussion: https://postgr.es/m/20191028152048.jc6pqv5hb7j77ocp@development	2020-01-13 01:21:17 +01:00
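The greedy selection described in the entry above can be sketched with column bitmasks. This is an illustration under stated assumptions, not the planner's code: the function names are hypothetical, each clause is assumed to reference exactly one column, and a statistics object is assumed to be worth applying only when it matches at least two remaining clauses.

```c
#include <stddef.h>

/* Count the set bits in a small column bitmask. */
int
count_bits(unsigned mask)
{
    int n = 0;

    while (mask)
    {
        n += (int) (mask & 1u);
        mask >>= 1;
    }
    return n;
}

/*
 * Hypothetical sketch: stats[i] is the set of columns covered by statistics
 * object i, and clauses is the set of columns that have an equality clause.
 * Repeatedly pick the object matching the most remaining clauses, mark it in
 * applied[], and remove the clauses it covers. Returns the columns left
 * without extended statistics.
 */
unsigned
apply_stats_greedy(const unsigned *stats, size_t nstats,
                   unsigned clauses, int *applied)
{
    unsigned remaining = clauses;

    for (size_t i = 0; i < nstats; i++)
        applied[i] = 0;

    for (;;)
    {
        size_t best = nstats;
        int    best_n = 1;      /* assumption: need at least two matches */

        for (size_t i = 0; i < nstats; i++)
        {
            int n;

            if (applied[i])
                continue;
            n = count_bits(stats[i] & remaining);
            if (n > best_n)
            {
                best = i;
                best_n = n;
            }
        }
        if (best == nstats)
            break;              /* nothing applicable is left */
        applied[best] = 1;
        remaining &= ~stats[best];
    }
    return remaining;
}
```

With statistics on (a,b), (c,d) and (b,c,d) and clauses on all four columns, this sketch picks (b,c,d) first and then finds nothing else applicable, leaving `a` uncovered. That is the overlapping case the message describes as possibly suboptimal; with only (a,b) and (c,d) defined, both are applied and nothing is left over.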
Tomas Vondra	aaa6761876	Apply all available functional dependencies When considering functional dependencies during selectivity estimation, it's not necessary to bother with selecting the best extended statistic object and then use just dependencies from it. We can simply consider all applicable functional dependencies at once. This means we need to deserialize all (applicable) dependencies before applying them to the clauses. This is a bit more expensive than picking the best statistics and deserializing dependencies for it. To minimize the additional cost, we ignore statistics that are not applicable. Author: Tomas Vondra Reviewed-by: Mark Dilger Discussion: https://postgr.es/m/20191028152048.jc6pqv5hb7j77ocp@development	2020-01-13 01:21:06 +01:00
Tom Lane	652686a334	Fix edge-case crashes and misestimation in range containment selectivity. When estimating the selectivity of "range_var <@ range_constant" or "range_var @> range_constant", if the upper (or respectively lower) bound of the range_constant was above the last bin of the range_var's histogram, the code would access uninitialized memory and potentially crash (though it seems the probability of a crash is quite low). Handle the endpoint cases explicitly to fix that. While at it, be more paranoid about the possibility of getting NaN or other silly results from the range type's subdiff function. And improve some comments. Ordinarily we'd probably add a regression test case demonstrating the bug in unpatched code. But it's too hard to get it to crash reliably because of the uninitialized-memory dependence, so skip that. Per bug #16122 from Adam Scott. It's been broken from the beginning, apparently, so backpatch to all supported branches. Diagnosis by Michael Paquier, patch by Andrey Borodin and Tom Lane. Discussion: https://postgr.es/m/16122-eb35bc248c806c15@postgresql.org	2020-01-12 14:36:59 -05:00
Michael Paquier	1088729e84	Remove incorrect assertion for INSERT in logical replication's publisher On the publisher, it was assumed that an INSERT change cannot happen for a relation with no replica identity. However this is true only for a change that needs references to old rows, aka UPDATE or DELETE, so trying to use logical replication with a relation that has no replica identity led to an assertion failure in the publisher when issuing an INSERT. This commit removes the incorrect assertion, and adds more regression tests to provide coverage for relations without replica identity. Reported-by: Neha Sharma Author: Dilip Kumar, Michael Paquier Reviewed-by: Andres Freund Discussion: https://postgr.es/m/CANiYTQsL1Hb8_Km08qd32svrqNumXLJeoGo014O7VZymgOhZEA@mail.gmail.com Backpatch-through: 10	2020-01-12 22:43:45 +09:00
Tom Lane	2c0cdc8183	Extensive code review for GSSAPI encryption mechanism. Fix assorted bugs in handling of non-blocking I/O when using GSSAPI encryption. The encryption layer could return the wrong status information to its caller, resulting in effectively dropping some data (or possibly in aborting a not-broken connection), or in a "livelock" situation where data remains to be sent but the upper layers think transmission is done and just go to sleep. There were multiple small thinkos contributing to that, as well as one big one (failure to think through what to do when a send fails after having already transmitted data). Note that these errors could cause failures whether the client application asked for non-blocking I/O or not, since both libpq and the backend always run things in non-block mode at this level. Also get rid of use of static variables for GSSAPI inside libpq; that's entirely not okay given that multiple connections could be open at once inside a single client process. Also adjust a bunch of random small discrepancies between the frontend and backend versions of the send/receive functions -- except for error handling, they should be identical, and now they are. Also extend the Kerberos TAP tests to exercise cases where nontrivial amounts of data need to be pushed through encryption. Before, those tests didn't provide any useful coverage at all for the cases of interest here. (They still might not, depending on timing, but at least there's a chance.) Per complaint from pmc@citylink and subsequent investigation. Back-patch to v12 where this code was introduced. Discussion: https://postgr.es/m/20200109181822.GA74698@gate.oper.dinoex.org	2020-01-11 17:14:08 -05:00
Peter Eisentraut	c67a55da4e	Make lsn argument of walrcv_create_slot() optional Some callers are not using it, so it's wasteful to have to specify it. Reviewed-by: Masahiko Sawada <masahiko.sawada@2ndquadrant.com> Discussion: https://www.postgresql.org/message-id/CA+fd4k4BcYrYucNfTnK-CQX3+jsG+PRPEhHAUSo-W4P0Lec57A@mail.gmail.com	2020-01-11 09:07:14 +01:00
Peter Eisentraut	c096a804d9	Remove STATUS_FOUND Replace the solitary use with a bool. Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://www.postgresql.org/message-id/flat/a6f91ead-0ce4-2a34-062b-7ab9813ea308%402ndquadrant.com	2020-01-11 07:48:57 +01:00
Noah Misch	38fc056074	Maintain valid md.c state when FileClose() fails. FileClose() failure ordinarily causes a PANIC. Suppose the user disables that PANIC via data_sync_retry=on. After mdclose() issued a FileClose() that failed, calls into md.c raised SIGSEGV. This fix adds repalloc() calls during mdclose(); update a comment about ignoring repalloc() cost. The rate of relation segment count change is a minor factor; more relevant to overall performance is the rate of mdclose() and subsequent re-opening of segments. Back-patch to v10, where commit `45e191e3aa` introduced the bug. Reviewed by Kyotaro Horiguchi. Discussion: https://postgr.es/m/20191222091930.GA1280238@rfd.leadboat.com	2020-01-10 18:31:22 -08:00
Alvaro Herrera	a7b6ab5db1	Clean up representation of flags in struct ReorderBufferTXN This simplifies addition of further flags. Author: Nikhil Sontakke Discussion: https://postgr.es/m/CAMGcDxeViP+R-OL7QhzUV9eKCVjURobuY1Zijik4Ay_Ddwo4Cg@mail.gmail.com	2020-01-10 17:46:57 -03:00
Tom Lane	9ce77d75c5	Reconsider the representation of join alias Vars. The core idea of this patch is to make the parser generate join alias Vars (that is, ones with varno pointing to a JOIN RTE) only when the alias Var is actually different from any raw join input, that is a type coercion and/or COALESCE is necessary to generate the join output value. Otherwise just generate varno/varattno pointing to the relevant join input column. In effect, this means that the planner's flatten_join_alias_vars() transformation is already done in the parser, for all cases except (a) columns that are merged by JOIN USING and are transformed in the process, and (b) whole-row join Vars. In principle that would allow us to skip doing flatten_join_alias_vars() in many more queries than we do now, but we don't have quite enough infrastructure to know that we can do so --- in particular there's no cheap way to know whether there are any whole-row join Vars. I'm not sure if it's worth the trouble to add a Query-level flag for that, and in any case it seems like fit material for a separate patch. But even without skipping the work entirely, this should make flatten_join_alias_vars() faster, particularly where there are nested joins that it previously had to flatten recursively. An essential part of this change is to replace Var nodes' varnoold/varoattno fields with varnosyn/varattnosyn, which have considerably more tightly-defined meanings than the old fields: when they differ from varno/varattno, they identify the Var's position in an aliased JOIN RTE, and the join alias is what ruleutils.c should print for the Var. This is necessary because the varno change destroyed ruleutils.c's ability to find the JOIN RTE from the Var's varno. Another way in which this change broke ruleutils.c is that it's no longer feasible to determine, from a JOIN RTE's joinaliasvars list, which join columns correspond to which columns of the join's immediate input relations. 
(If those are sub-joins, the joinaliasvars entries may point to columns of their base relations, not the sub-joins.) But that was a horrid mess requiring a lot of fragile assumptions already, so let's just bite the bullet and add some more JOIN RTE fields to make it more straightforward to figure that out. I added two integer-List fields containing the relevant column numbers from the left and right input rels, plus a count of how many merged columns there are. This patch depends on the ParseNamespaceColumn infrastructure that I added in commit `5815696bc`. The biggest bit of code change is restructuring transformFromClauseItem's handling of JOINs so that the ParseNamespaceColumn data is propagated upward correctly. Other than that and the ruleutils fixes, everything pretty much just works, though some processing is now inessential. I grabbed two pieces of low-hanging fruit in that line: 1. In find_expr_references, we don't need to recurse into join alias Vars anymore. There aren't any except for references to merged USING columns, which are more properly handled when we scan the join's RTE. This change actually fixes an edge-case issue: we will now record a dependency on any type-coercion function present in a USING column's joinaliasvar, even if that join column has no references in the query text. The odds of the missing dependency causing a problem seem quite small: you'd have to posit somebody dropping an implicit cast between two data types, without removing the types themselves, and then having a stored rule containing a whole-row Var for a join whose USING merge depends on that cast. So I don't feel a great need to change this in the back branches. But in theory this way is more correct. 2. markRTEForSelectPriv and markTargetListOrigin don't need to recurse into join alias Vars either, because the cases they care about don't apply to alias Vars for USING columns that are semantically distinct from the underlying columns. 
This removes the only case in which markVarForSelectPriv could be called with NULL for the RTE, so adjust the comments to describe that hack as being strictly internal to markRTEForSelectPriv. catversion bump required due to changes in stored rules. Discussion: https://postgr.es/m/7115.1577986646@sss.pgh.pa.us	2020-01-09 11:56:59 -05:00
Robert Haas	ed10f32e37	Add pg_shmem_allocations view. This tells you about allocations that have been made from the main shared memory segment. The original patch also tried to show information about dynamic shared memory allocation as well, but I decided to leave that problem for another time. Andres Freund and Robert Haas, reviewed by Michael Paquier, Marti Raudsepp, Tom Lane, Álvaro Herrera, and Kyotaro Horiguchi. Discussion: http://postgr.es/m/20140504114417.GM12715@awork2.anarazel.de	2020-01-09 10:59:07 -05:00
Peter Eisentraut	f85a485f89	Add support for automatically updating Unicode derived files We currently have several sets of files generated from data provided by Unicode. These all have ad hoc rules and instructions for updating when new Unicode versions appear, and it's not done consistently. This patch centralizes and automates the process and makes it part of the release checklist. The Unicode and CLDR versions are specified in Makefile.global.in. There is a new make target "update-unicode" that downloads all the relevant files and runs the generation script. There is also a new script for generating the table of combining characters for ucs_wcwidth(). That table is now in a separate include file rather than hardcoded into the middle of other code. This is based on the script that was used for generating `d8594d123c`, but the script itself wasn't committed at that time. Reviewed-by: John Naylor <john.naylor@2ndquadrant.com> Discussion: https://www.postgresql.org/message-id/flat/c8d05f42-443e-6c23-819b-05b31759a37c@2ndquadrant.com	2020-01-09 10:08:14 +01:00
Alvaro Herrera	f5d28710c7	Reimplement nullification of walsender timestamp Make the value null only at pg_stat_activity-output time, as suggested by Tom Lane, instead of messing with the internal state. This should appease buildfarm members with force_parallel_mode=regress, which are running parallel queries on logical replication walsenders. The fact that walsenders can run parallel queries should perhaps be studied more carefully, but for the moment let's get rid of the red blots in buildfarm. Backpatch to pg10, like the previous commit. Discussion: https://postgr.es/m/30804.1578438763@sss.pgh.pa.us	2020-01-08 14:33:49 -03:00
Tom Lane	913bbd88dc	Improve the handling of result type coercions in SQL functions. Use the parser's standard type coercion machinery to convert the output column(s) of a SQL function's final SELECT or RETURNING to the type(s) they should have according to the function's declared result type. We'll allow any case where an assignment-level coercion is available. Previously, we failed unless the required coercion was a binary-compatible one (and the documentation ignored this, falsely claiming that the types must match exactly). Notably, the coercion now accounts for typmods, so that cases where a SQL function is declared to return a composite type whose columns are typmod-constrained now behave as one would expect. Arguably this aspect is a bug fix, but the overall behavioral change here seems too large to consider back-patching. A nice side-effect is that functions can now be inlined in a few cases where we previously failed to do so because of type mismatches. Discussion: https://postgr.es/m/18929.1574895430@sss.pgh.pa.us	2020-01-08 11:07:59 -05:00
Tom Lane	4ac8aaa36f	Fix handling of generated columns in ALTER TABLE. ALTER TABLE failed if a column referenced in a GENERATED expression had been added or changed in type earlier in the ALTER command. That's because the GENERATED expression needs to be evaluated against the table's updated tuples, but it was being evaluated against the original tuples. (Fortunately the executor has adequate cross-checks to notice the mismatch, so we just got an obscure error message and not anything more dangerous.) Per report from Andreas Joseph Krogh. Back-patch to v12 where GENERATED was added. Discussion: https://postgr.es/m/VisenaEmail.200.231b0a41523275d0.16ea7f800c7@tc7-visena	2020-01-08 09:42:53 -05:00
Michael Paquier	65192e0244	Revert "Forbid DROP SCHEMA on temporary namespaces" This reverts commit `a052f6c`, following complaints from Robert Haas and Tom Lane. Backpatch down to 9.4, like the previous commit. Discussion: https://postgr.es/m/CA+TgmobL4npEX5=E5h=5Jm_9mZun3MT39Kq2suJFVeamc9skSQ@mail.gmail.com Backpatch-through: 9.4	2020-01-08 10:36:12 +09:00
Alvaro Herrera	b175bd59fa	pg_stat_activity: show NULL stmt start time for walsenders Returning a non-NULL time is pointless, since a walsender is not a process that would be running normal transactions anyway, but the code was unintentionally exposing the process start time intermittently, which was not only bogus but it also confused monitoring systems looking for idle transactions. Fix by avoiding all updates in walsenders. Backpatch to 11, where walsenders started appearing in pg_stat_activity. Reported-by: Tomas Vondra Discussion: https://postgr.es/m/20191209234409.exe7osmyalwkt5j4@development	2020-01-07 17:38:48 -03:00
Robert Haas	ce242ae154	tableam: New callback relation_fetch_toast_slice. Instead of always calling heap_fetch_toast_slice during detoasting, invoke a table AM callback which, when the toast table is a heap table, will be heap_fetch_toast_slice. This makes it possible for a table AM other than heap to be used as a TOAST table. It also completes the series of commits intended to improve the interaction of tableam with TOAST that began with commit 8b94dab06617ef80a0901ab103ebd8754427ef5a; detoast.c is now, hopefully, fully AM-independent. Patch by me, reviewed by Andres Freund and Peter Eisentraut. Discussion: http://postgr.es/m/CA+TgmoZv-=2iWM4jcw5ZhJeL18HF96+W1yJeYrnGMYdkFFnEpQ@mail.gmail.com	2020-01-07 14:36:38 -05:00
Robert Haas	83322e38da	tableam: Allow choice of toast AM. Previously, the toast table had to be implemented by the same AM that was used for the main table, which was bad, because the detoasting code won't work with anything but heap. This commit doesn't fix the latter problem, although there's another patch coming which does, but it does let you pick something that works (i.e. heap, right now). Patch by me, reviewed by Andres Freund. Discussion: http://postgr.es/m/CA+TgmoZv-=2iWM4jcw5ZhJeL18HF96+W1yJeYrnGMYdkFFnEpQ@mail.gmail.com	2020-01-07 14:23:25 -05:00
Robert Haas	8147278589	Increase the maximum value of track_activity_query_size. This one-line change provoked a lot of discussion, but ultimately the consensus seems to be that allowing a larger value might be useful to somebody, and probably won't hurt anyone who chooses not to take advantage of the higher maximum limit. Vyacheslav Makarov, reviewed by many people. Discussion: http://postgr.es/m/7b5ecc5a9991045e2f13c84e3047541d@postgrespro.ru	2020-01-07 12:14:19 -05:00
Tom Lane	e369f37086	Reduce the number of GetFlushRecPtr() calls done by walsenders. Since the WAL flush position only moves forward, it's safe to cache its previous value within each walsender process, and update from shared memory only once we've caught up to the previously-seen value. When there are many active walsenders, this makes for a very significant reduction in the amount of contention on the XLogCtl->info_lck spinlock. This patch also adjusts the logic so that we update our idea of the flush position after processing a WAL record, rather than beforehand. This may cause us to realize we're not caught up when the preceding coding would've thought that we were, but that seems all to the good; it may avoid a useless sleep-and-wakeup cycle. Back-patch to v12. The contention problem exists in prior branches, but it's much less severe (due to inefficiencies elsewhere) so there seems no need to take any risk of back-patching further. Pierre Ducroquet, reviewed by Julien Rouhaud Discussion: https://postgr.es/m/2931018.Vxl9zapr77@pierred-pdoc	2020-01-06 16:42:20 -05:00
Tom Lane	20d6225d16	Add functions min_scale(numeric) and trim_scale(numeric). These allow better control of trailing zeroes in numeric values. Pavel Stehule, based on an old proposal of Marko Tiikkaja's; review by Karl Pinc Discussion: https://postgr.es/m/CAFj8pRDjs-navGASeF0Wk74N36YGFJ+v=Ok9_knRa7vDc-qugg@mail.gmail.com	2020-01-06 12:13:53 -05:00
Peter Eisentraut	b9c130a1fd	Have logical replication subscriber fire column triggers The logical replication apply worker did not fire per-column update triggers because the updatedCols bitmap in the RTE was not populated. This fixes that. Reviewed-by: Euler Taveira <euler@timbira.com.br> Discussion: https://www.postgresql.org/message-id/flat/21673e2d-597c-6afe-637e-e8b10425b240%402ndquadrant.com	2020-01-06 08:40:00 +01:00
Michael Paquier	7b283d0e1d	Remove support for OpenSSL 0.9.8 and 1.0.0 Support is out of scope from all the major vendors for these versions (for example RHEL5 uses a version based on 0.9.8, and RHEL6 uses 1.0.1), and it created some extra maintenance work. Upstream has stopped support of 0.9.8 in December 2015 and of 1.0.0 in February 2016. Since `b1abfec`, note that the default SSL protocol version set with ssl_min_protocol_version is TLSv1.2, whose support was added in OpenSSL 1.0.1, so there is no point to enforce ssl_min_protocol_version to TLSv1 in the SSL tests. Author: Michael Paquier Reviewed-by: Daniel Gustafsson, Tom Lane Discussion: https://postgr.es/m/20191205083252.GE5064@paquier.xyz	2020-01-06 12:51:44 +09:00
Peter Geoghegan	fc31001123	Remove redundant incomplete split assertion. The fastpath insert optimization's incomplete split flag Assert() is redundant. We'll reach the more general Assert() within _bt_findinsertloc() in all cases. (Besides, Assert()'ing that the rightmost page doesn't have the flag set never made much sense.)	2020-01-05 17:42:13 -08:00
Peter Eisentraut	3fd40b628c	Make better use of ParseState in ProcessUtility Pass ParseState into the functions called from standard_ProcessUtility() instead passing the query string and query environment separately. No functionality change, but it makes the notation consistent. We had already started moving things into that direction piece by piece, and this completes it. Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/6e7aa4a1-be6a-1a75-b1f9-83a678e5184a@2ndquadrant.com	2020-01-04 13:12:41 +01:00
Peter Geoghegan	d2e5e20e57	Add xl_btree_delete optimization. Commit `558a9165e0` taught _bt_delitems_delete() to produce its own XID horizon on the primary. Standbys no longer needed to generate their own latestRemovedXid, since they could just use the explicitly logged value from the primary instead. The deleted offset numbers array from the xl_btree_delete WAL record was no longer used by the REDO routine for anything other than deleting the items. This enables a minor optimization: We now treat the array as buffer state, not generic WAL data, following _bt_delitems_vacuum()'s example. This should be a minor win, since it allows us to avoid including the deleted items array in cases where XLogInsert() stores the whole buffer anyway. The primary goal here is to make the code more maintainable, though. Removing inessential differences between the two functions highlights the fundamental differences that remain. Also change xl_btree_delete to use uint32 for the size of the array of item offsets being deleted. This brings xl_btree_delete closer to xl_btree_vacuum. Furthermore, it seems like a good idea to use an explicit-width integer type (the field was previously an "int"). Bump XLOG_PAGE_MAGIC because xl_btree_delete changed. Discussion: https://postgr.es/m/CAH2-Wzkz4TjmezzfAbaV1zYrh=fr0bCpzuJTvBe5iUQ3aUPsCQ@mail.gmail.com	2020-01-03 12:18:13 -08:00
Peter Geoghegan	0c41c83d8f	Clear up btree_xlog_split() alignment comment. Adjust a comment that describes how alignment of the new left page high key works in btree_xlog_split(), the nbtree page split REDO routine. The wording used before commit `2c03216d83` is much clearer, so go back to that.	2020-01-02 18:30:25 -08:00
Peter Geoghegan	44e44bd258	Correct _bt_delitems_vacuum() lock comments. The expectation within _bt_delitems_vacuum() is that caller has a super-exclusive/cleanup buffer lock (not just a pin and a write lock).	2020-01-02 13:30:40 -08:00
Alvaro Herrera	1fa846f1c9	Fix cloning of row triggers to sub-partitions When row triggers exist in partitioned partitions that are not either part of FKs or deferred unique constraints, they are not correctly cloned to their partitions. That's because they are marked "internal", and those are purposefully skipped when doing the clone triggers dance. Fix by relaxing the condition on which internal triggers are skipped. Amit Langote initially diagnosed the problem and proposed a fix, but I used a different approach. Reported-by: Petr Fedorov Discussion: https://postgr.es/m/6b3f0646-ba8c-b3a9-c62d-1c6651a1920f@phystech.edu	2020-01-02 17:04:24 -03:00
Tom Lane	915c04f091	Fix typmod exposed for scalar function in FROM, too. On further reflection about commit `4d02eb017`, it occurs to me that expandRTE() had better agree with what addRangeTableEntryForFunction() is doing. So teach that about functions possibly having typmods, too.	2020-01-02 14:02:55 -05:00
Tom Lane	4d02eb017e	Fix collation exposed for scalar function in FROM. One code path in addRangeTableEntryForFunction() neglected to assign a collation to the tupdesc entry it constructs (which is a bit odd considering the other path did do so). This didn't matter before commit `5815696bc`, because nothing would look at the type data in this tupdesc; but now it does. While at it, make sure we assign the correct typmod as well. Most function expressions don't have a determinate typmod, but some do. Per buildfarm, which showed failures in non-C collations, a case I'd not thought to test for this patch :-(	2020-01-02 13:48:54 -05:00
Tom Lane	5815696bc6	Make parser rely more heavily on the ParseNamespaceItem data structure. When I added the ParseNamespaceItem data structure (in commit `5ebaaa494`), it wasn't very tightly integrated into the parser's APIs. In the wake of adding p_rtindex to that struct (commit `b541e9acc`), there is a good reason to make more use of it: by passing around ParseNamespaceItem pointers instead of bare RTE pointers, we can get rid of various messy methods for passing back or deducing the rangetable index of an RTE during parsing. Hence, refactor the addRangeTableEntryXXX functions to build and return a ParseNamespaceItem struct, not just the RTE proper; and replace addRTEtoQuery with addNSItemToQuery, which is passed a ParseNamespaceItem rather than building one internally. Also, add per-column data (a ParseNamespaceColumn array) to each ParseNamespaceItem. These arrays are built during addRangeTableEntryXXX, where we have column type data at hand so that it's nearly free to fill the data structure. Later, when we need to build Vars referencing RTEs, we can use the ParseNamespaceColumn info to avoid the rather expensive operations done in get_rte_attribute_type() or expandRTE(). get_rte_attribute_type() is indeed dead code now, so I've removed it. This makes for a useful improvement in parse analysis speed, around 20% in one moderately-complex test query. The ParseNamespaceColumn structs also include Var identity information (varno/varattno). That info isn't actually being used in this patch, except that p_varno == 0 is a handy test for a dropped column. A follow-on patch will make more use of it. Discussion: https://postgr.es/m/2461.1577764221@sss.pgh.pa.us	2020-01-02 11:29:01 -05:00
Amit Kapila	d207038053	Fix running out of file descriptors for spill files. Currently while decoding changes, if the number of changes exceeds a certain threshold, we spill those to disk. And this happens for each (sub)transaction. Now, while reading all these files, we don't close them until we read all the files. While reading these files, if the number of such files exceeds the maximum number of file descriptors, the operation errors out. Use PathNameOpenFile interface to open these files as that internally has the mechanism to release kernel FDs as needed to get us under the max_safe_fds limit. Reported-by: Amit Khandekar Author: Amit Khandekar Reviewed-by: Amit Kapila Backpatch-through: 9.4 Discussion: https://postgr.es/m/CAJ3gD9c-sECEn79zXw4yBnBdOttacoE-6gAyP0oy60nfs_sabQ@mail.gmail.com	2020-01-02 11:41:04 +05:30
Peter Geoghegan	4b25f5d0ba	Revise BTP_HAS_GARBAGE nbtree VACUUM comments. _bt_delitems_vacuum() comments claimed that it isn't worth another scan of the page to avoid falsely unsetting the BTP_HAS_GARBAGE page flag hint (this happens to be the same wording that was removed from _bt_delitems_delete() by my recent commit `fe97c61c`). The comments made little sense, though. The issue can't have much to do with performing a second scan of the target leaf page, since an LP_DEAD test could easily be performed in the first scan of the page anyway (the scan that takes place in btvacuumpage() caller). Revise the explanation. It makes much more sense to frame this as an issue about recovery conflicts. _bt_delitems_vacuum() cannot easily generate an XID cutoff in the same way that _bt_delitems_delete() is designed to. Falsely unsetting the page flag is not ideal, and is likely to happen more often than was supposed by the original comments. Explain why it usually isn't a problem in practice. There may be an argument for _bt_delitems_vacuum() not clearing the BTP_HAS_GARBAGE bit, removing the question of it being falsely unset by VACUUM (there may even be an argument for not using a page level hint at all). This can be revisited later.	2020-01-01 17:29:41 -08:00
Peter Geoghegan	c5f3b53b0e	Update btree_xlog_delete() comments. Commit `fe97c61c` updated LP_DEAD item deletion comments, but missed a minor discrepancy on the REDO side. Fix it now. In passing, don't talk about the btree_xlog_vacuum() behavior within btree_xlog_delete(). The reliance on XLOG_HEAP2_CLEANUP_INFO records for recovery conflicts is already discussed within btvacuumpage() and mentioned again in passing above btree_xlog_vacuum(), which seems sufficient.	2020-01-01 11:32:07 -08:00
Bruce Momjian	7559d8ebfa	Update copyrights for 2020 Backpatch-through: update all files in master, backpatch legal files through 9.4	2020-01-01 12:21:45 -05:00
Tom Lane	0ce38730ac	Micro-optimize AllocSetFreeIndex() by reference to pg_bitutils code. Use __builtin_clz() where available. Where it isn't, we can still win a little by using the pg_leftmost_one_pos[] lookup table instead of having a private table. Also drop the initial right shift by ALLOC_MINBITS in favor of subtracting ALLOC_MINBITS from the leftmost-one-pos result. This is a win because the compiler can fold that adjustment into other constants it'd have to add anyway, making the shift-removal free. Also, we can explain this coding as an unrolled form of pg_leftmost_one_pos32(), even though that's a bit ahistorical since it long predates pg_bitutils.h. John Naylor, with some cosmetic adjustments by me Discussion: https://postgr.es/m/CACPNZCuNUGMxjK7WTn_=WZnRbfASDdBxmjsVf2+m9MdmeNw_sg@mail.gmail.com	2019-12-28 17:21:17 -05:00
Michael Paquier	a052f6cbb8	Forbid DROP SCHEMA on temporary namespaces This operation was possible for the owner of the schema or a superuser. Down to 9.4, doing this operation would cause inconsistencies in a session whose temporary schema was dropped, particularly if trying to create new temporary objects after the drop. A more annoying consequence is a crash of autovacuum on an assertion failure when logging information about an orphaned temp table dropped. Note that because of `246a6c8` (present in v11~), which has made the removal of orphaned temporary tables more aggressive, the failure could be triggered more easily, but it is possible to reproduce down to 9.4. Reported-by: Mahendra Singh, Prabhat Sahu Author: Michael Paquier Reviewed-by: Kyotaro Horiguchi, Mahendra Singh Discussion: https://postgr.es/m/CAKYtNAr9Zq=1-ww4etHo-VCC-k120YxZy5OS01VkaLPaDbv2tg@mail.gmail.com Backpatch-through: 9.4	2019-12-27 17:58:43 +09:00
Michael Paquier	7854e07f25	Revert "Rename files and headers related to index AM" This follows multiple complaints from Peter Geoghegan, Andres Freund and Alvaro Herrera that this issue ought to be dug more before actually happening, if it happens. Discussion: https://postgr.es/m/20191226144606.GA5659@alvherre.pgsql	2019-12-27 08:09:00 +09:00
Tom Lane	b541e9accb	Refactor parser's generation of Var nodes. Instead of passing around a pointer to the RangeTblEntry that provides the desired column, pass a pointer to the associated ParseNamespaceItem. The RTE is trivially reachable from the nsitem, and having the ParseNamespaceItem allows access to additional information. As proof of concept for that, add the rangetable index to ParseNamespaceItem, and use that to get rid of RTERangeTablePosn searches. (I have in mind to teach the parser to generate some different representation for Vars that are nullable by outer joins, and keeping the necessary information in ParseNamespaceItems seems like a reasonable approach to that. But whether that ever happens or not, this seems like good cleanup.) Also refactor the code around scanRTEForColumn so that the "fuzzy match" stuff does not leak out of parse_relation.c. Discussion: https://postgr.es/m/26144.1576858373@sss.pgh.pa.us	2019-12-26 11:16:42 -05:00
Michael Paquier	044b319cd7	Fix some comments related to logical repslot advancing confirmed_flush is part of a replication slot's information, but not confirmed_lsn. Author: Kyotaro Horiguchi Discussion: https://postgr.es/m/20191226.175919.17237335658671970.horikyota.ntt@gmail.com Backpatch-through: 11	2019-12-26 22:26:09 +09:00
Michael Paquier	1ab41a3c8e	Refactor code dedicated to index vacuuming in vacuumlazy.c The part in charge of doing the vacuum on all the indexes of a relation was duplicated, with the same handling for progress reporting done. While on it, update the progress reporting for heap vacuuming in the subroutine doing the actual work, keeping the status update local. This way, any future caller of lazy_vacuum_heap() does not have to worry about doing any progress reporting update. Author: Justin Pryzby, Michael Paquier Discussion: https://postgr.es/m/20191120210600.GC30362@telsasoft.com	2019-12-26 17:01:23 +09:00
Tom Lane	bb4114a4e2	Allow whole-row Vars to be used in partitioning expressions. In the wake of commit `5b9312378`, there's no particular reason for this restriction (previously, it was problematic because of the implied rowtype reference). A simple constraint on a whole-row Var probably isn't that useful, but conceivably somebody would want to pass one to a function that extracts a partitioning key. Besides which, we're expending much more code to enforce the restriction than we save by having it, since the latter quantity is now zero. So drop the restriction. Amit Langote Discussion: https://postgr.es/m/CA+HiwqFUzjfj9HEsJtYWcr1SgQ_=iCAvQ=O2Sx6aQxoDu4OiHw@mail.gmail.com	2019-12-25 15:44:15 -05:00
Tom Lane	42f74f4936	Remove equalPartitionDescs(). This is dead code in the wake of the previous commit. We can always add it back if we need it again someday. Discussion: https://postgr.es/m/CA+HiwqFUzjfj9HEsJtYWcr1SgQ_=iCAvQ=O2Sx6aQxoDu4OiHw@mail.gmail.com	2019-12-25 14:45:57 -05:00
Tom Lane	5b9312378e	Load relcache entries' partitioning data on-demand, not immediately. Formerly the rd_partkey and rd_partdesc data structures were always populated immediately when a relcache entry was built or rebuilt. This patch changes things so that they are populated only when they are first requested. (Hence, callers must now always use RelationGetPartitionKey or RelationGetPartitionDesc; just fetching the pointer directly is no longer acceptable.) This seems to have some performance benefits, but the main reason to do it is that it eliminates a recursive-reload failure that occurs if the partkey or partdesc expressions contain any references to the relation's rowtype (as discovered by Amit Langote). In retrospect, since loading these data structures might result in execution of nearly-arbitrary code via eval_const_expressions, it was a dumb idea to require that to happen during relcache entry rebuild. Also, fix things so that old copies of a relcache partition descriptor will be dropped when the cache entry's refcount goes to zero. In the previous coding it was possible for such copies to survive for the lifetime of the session, as I'd complained of in a previous discussion. (This management technique still isn't perfect, but it's better than before.) Improve the commentary explaining how that works and why it's safe to hand out direct pointers to these relcache substructures. In passing, improve RelationBuildPartitionDesc by using the same memory-context-parent-swap approach used by RelationBuildPartitionKey, thereby making it less dependent on strong assumptions about what partition_bounds_copy does. Avoid doing get_rel_relkind in the critical section, too. Patch by Amit Langote and Tom Lane; Robert Haas deserves some credit for prior work in the area, too. 
Although this is a pre-existing problem, no back-patch: the patch seems too invasive to be safe to back-patch, and the bug it fixes is a corner case that seems relatively unlikely to cause problems in the field. Discussion: https://postgr.es/m/CA+HiwqFUzjfj9HEsJtYWcr1SgQ_=iCAvQ=O2Sx6aQxoDu4OiHw@mail.gmail.com Discussion: https://postgr.es/m/CA+TgmoY3bRmGB6-DUnoVy5fJoreiBJ43rwMrQRCdPXuKt4Ykaw@mail.gmail.com	2019-12-25 14:43:13 -05:00
Michael Paquier	8ce3aa9b59	Rename files and headers related to index AM The following renaming is done so that source files related to index access methods are more consistent with table access methods (the original names used for index AMs were too generic, and could be confused as including features related to table AMs): - amapi.h -> indexam.h. - amapi.c -> indexamapi.c. Here we have an equivalent with backend/access/table/tableamapi.c. - amvalidate.c -> indexamvalidate.c. - amvalidate.h -> indexamvalidate.h. - genam.c -> indexgenam.c. - genam.h -> indexgenam.h. This has been discussed during the development of v12 when table AM was worked on, but the renaming never happened. Author: Michael Paquier Reviewed-by: Fabien Coelho, Julien Rouhaud Discussion: https://postgr.es/m/20191223053434.GF34339@paquier.xyz	2019-12-25 10:23:39 +09:00
Alvaro Herrera	c4dcd9144b	Avoid splitting C string literals with \-newline Using \ is unnecessary and ugly, so remove that. While at it, stitch the literals back into a single line: we've long discouraged splitting error message literals even when they go past the 80 chars line limit, to improve greppability. Leave contrib/tablefunc alone. Discussion: https://postgr.es/m/20191223195156.GA12271@alvherre.pgsql	2019-12-24 12:44:12 -03:00
Thomas Munro	e69d644547	Rotate instead of shifting hash join batch number. Our algorithm for choosing batch numbers turned out not to work effectively for multi-billion key inner relations. We would use more hash bits than we have, and effectively concentrate all tuples into a smaller number of batches than we intended. While ideally we should switch to wider hashes, for now, change the algorithm to one that effectively gives up bits from the bucket number when we don't have enough bits. That means we'll finish up with longer bucket chains than would be ideal, but that's better than having batches that don't fit in work_mem and can't be divided. Back-patch to all supported releases. Author: Thomas Munro Reviewed-by: Tom Lane, thanks also to Tomas Vondra, Alvaro Herrera, Andres Freund for testing and discussion Reported-by: James Coleman Discussion: https://postgr.es/m/16104-dc11ed911f1ab9df%40postgresql.org	2019-12-24 13:05:43 +13:00
Tom Lane	39ebb943de	Disallow partition key expressions that return pseudo-types. This wasn't checked originally, but it should have been, because in general pseudo-types can't be stored to and retrieved from disk. Notably, partition bound values of type "record" would not be interpretable by another session. In v12 and HEAD, add another flag to CheckAttributeType's repertoire so that it can produce a specific error message for this case. That's infeasible in older branches without an ABI break, so fall back to a slightly-less-nicely-worded error message in v10 and v11. Problem noted by Amit Langote, though this patch is not his initial solution. Back-patch to v10 where partitioning was introduced. Discussion: https://postgr.es/m/CA+HiwqFUzjfj9HEsJtYWcr1SgQ_=iCAvQ=O2Sx6aQxoDu4OiHw@mail.gmail.com	2019-12-23 12:53:12 -05:00
Tom Lane	fc7695891d	Prevent a rowtype from being included in itself via a range. We probably should have thought of this case when ranges were added, but we didn't. (It's not the fault of commit `eb51af71f`, because ranges didn't exist then.) It's an old bug, so back-patch to all supported branches. Discussion: https://postgr.es/m/7782.1577051475@sss.pgh.pa.us	2019-12-23 12:08:23 -05:00
Alvaro Herrera	0fd8cfb20d	GetPublicationByName: Don't repeat ourselves Use get_publication_oid() instead of reimplementing it. Discussion: https://postgr.es/m/20191220201017.GA17292@alvherre.pgsql	2019-12-23 12:47:36 -03:00
Peter Geoghegan	fe97c61c87	Update nbtree LP_DEAD item deletion comments. Comments about the consequences of clearing the BTP_HAS_GARBAGE page flag bit that apply only to VACUUM were added to code that deals with opportunistic deletion of LP_DEAD items by commit `a760893d`. The same comment block was added to both _bt_delitems_vacuum() and _bt_delitems_delete(). Correct _bt_delitems_delete()'s copy of the comment block. _bt_delitems_delete() reliably deletes items that were found by caller to have their LP_DEAD bit set. There is no question about whether or not unsetting the BTP_HAS_GARBAGE bit can miss some LP_DEAD items that were set recently. Also tweak a related section of the nbtree README.	2019-12-22 19:57:35 -08:00
Peter Eisentraut	df7fe9e2d7	Disallow dropping rules on system tables by default This was previously not covered by allow_system_table_mods, but now it is. The impact in practice is probably low, but this makes it consistent with most other DDL commands. Reviewed-by: Robert Haas <robertmhaas@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/ee9df1af-c0d8-7c82-5be7-39ce4e3b0a9d%402ndquadrant.com	2019-12-20 08:27:37 +01:00
Peter Eisentraut	8c6d30f211	Fix compiler warnings on MSYS2 The PS_USE_NONE case in ps_status.c left a couple of unused variables exposed. Discussion: https://www.postgresql.org/message-id/flat/6b467edc-4018-521f-ab18-171f098557ca%402ndquadrant.com	2019-12-20 08:16:44 +01:00
Robert Haas	16a4e4aecd	Extend the ProcSignal mechanism to support barriers. A new function EmitProcSignalBarrier() can be used to emit a global barrier which all backends that participate in the ProcSignal mechanism must absorb, and a new function WaitForProcSignalBarrier() can be used to wait until all relevant backends have in fact absorbed the barrier. This can be used to coordinate global state changes, such as turning checksums on while the system is running. There's no real client of this mechanism yet, although two are proposed, but an enum has to have at least one element, so this includes a placeholder type (PROCSIGNAL_BARRIER_PLACEHOLDER) which should be replaced by the first real client of this mechanism to get committed. Andres Freund and Robert Haas, reviewed by Daniel Gustafsson and, in earlier versions, by Magnus Hagander. Discussion: http://postgr.es/m/CA+TgmoZwDk=BguVDVa+qdA6SBKef=PKbaKDQALTC_9qoz1mJqg@mail.gmail.com	2019-12-19 14:56:20 -05:00
Peter Geoghegan	9f83468b35	Remove unneeded "pin scan" nbtree VACUUM code. The REDO routine for nbtree's xl_btree_vacuum record type hasn't performed a "pin scan" since commit `3e4b7d87` went in, so clearly there isn't any point in VACUUM WAL-logging information that won't actually be used. Finish off the work of commit `3e4b7d87` (and the closely related preceding commit `687f2cd7`) by removing the code that generates this unused information. Also remove the REDO routine code disabled by commit `3e4b7d87`. Replace the unneeded lastBlockVacuumed field in xl_btree_vacuum with a new "ndeleted" field. The new field isn't actually needed right now, since we could continue to infer the array length from the overall record length. However, an upcoming patch to add deduplication to nbtree needs to add an "items updated" field to xl_btree_vacuum, so we might as well start being explicit about the number of items now. (Besides, it doesn't seem like a good idea to leave the xl_btree_vacuum struct without any fields; the C standard says that that's undefined.) nbtree VACUUM no longer forces writing a WAL record for the last block in the index. Writing out a WAL record with no items for the final block was supposed to force processing of a lastBlockVacuumed field by a pin scan. Bump XLOG_PAGE_MAGIC because xl_btree_vacuum changed. Discussion: https://postgr.es/m/CAH2-WzmY_mT7UnTzFB5LBQDBkKpdV5UxP3B5bLb7uP%3D%3D6UQJRQ%40mail.gmail.com	2019-12-19 11:35:55 -08:00
Bruce Momjian	b93e9a5c94	revert: Remove meaningless assignments in nbtree code Reverts commit `05684c8255`. Reported-by: Tom Lane Discussion: https://postgr.es/m/404.1576770942@sss.pgh.pa.us Backpatch-through: master	2019-12-19 11:19:10 -05:00
Bruce Momjian	05684c8255	Remove meaningless assignments in nbtree code Reported-by: Ranier Vilela Discussion: https://postgr.es/m/MN2PR18MB2927BB876D12A70FDBE8F35AE3450@MN2PR18MB2927.namprd18.prod.outlook.com Backpatch-through: master	2019-12-19 10:33:48 -05:00
Alvaro Herrera	2b93e3d96b	makeArrayTypeName: Remove pointless relation open/close Discussion: https://postgr.es/m/20191218221326.GA25537@alvherre.pgsql	2019-12-19 12:08:30 -03:00
Robert Haas	7cdcc747a9	Update neglected comment. Commit `d986d4e87f` renamed a variable but neglected to update the corresponding comment. Amit Langote	2019-12-19 09:24:44 -05:00
Robert Haas	303640199d	Fix minor problems with non-exclusive backup cleanup. The previous coding imagined that it could call before_shmem_exit() when a non-exclusive backup began and then remove the previously-added handler by calling cancel_before_shmem_exit() when that backup ended. However, this only works provided that nothing else in the system has registered a before_shmem_exit() hook in the interim, because cancel_before_shmem_exit() is documented to remove a callback only if it is the latest callback registered. It also only works if nothing can ERROR out between the time that sessionBackupState is reset and the time that cancel_before_shmem_exit() is called, which doesn't seem to be strictly true. To fix, leave the handler installed for the lifetime of the session, arrange to install it just once, and teach it to quietly do nothing if there isn't a non-exclusive backup in process. This is a bug, but for now I'm not going to back-patch, because the consequences are minor. It's possible to cause a spurious warning to be generated, but that doesn't really matter. It's also possible to trigger an assertion failure, but production builds shouldn't have assertions enabled. Patch by me, reviewed by Kyotaro Horiguchi, Michael Paquier (who preferred a different approach, but got outvoted), Fujii Masao, and Tom Lane, and with comments by various others. Discussion: http://postgr.es/m/CA+TgmobMjnyBfNhGTKQEDbqXYE3_rXWpc4CM63fhyerNCes3mA@mail.gmail.com	2019-12-19 09:06:54 -05:00
Robert Haas	9aafc4529f	Re-#include <time.h> in checkpointer.c. Commit `7dbfea3c45` thought it could get away with removing this, but Thomas Munro reports, on behalf of the buildfarm, that it's still needed at least on Windows to avoid compiler warnings.	2019-12-18 13:03:41 -05:00
Robert Haas	e9fd0415e6	Move heap-specific detoasting logic into a separate function. The new function, heap_fetch_toast_slice, is shared between toast_fetch_datum_slice and toast_fetch_datum, and does all the work of scanning the TOAST table, fetching chunks, and storing them into the space allocated for the result varlena. As an incidental side effect, this allows toast_fetch_datum_slice to perform the scan with only a single scankey if all chunks are being fetched, which might have some tiny performance benefit. Discussion: http://postgr.es/m/CA+TgmobBzxwFojJ0zV0Own3dr09y43hp+OzU2VW+nos4PMXWEg@mail.gmail.com	2019-12-18 11:08:59 -05:00
Michael Paquier	2032645b19	Fix compiler warning in non-assert builds Oversight in commit `e1551f9`. Reported-by: Erik Rijkers Discussion: https://postgr.es/m/b7ad911d3eaa29af9fcdb9ccb26c363c@xs4all.nl	2019-12-18 16:55:25 +09:00
Michael Paquier	e1551f96e6	Refactor attribute mappings used in logical tuple conversion Tuple conversion support in tupconvert.c is able to convert rowtypes between two relations, inner and outer, which are logically equivalent but have a different ordering or even dropped columns (used mainly for inheritance tree and partitions). This makes use of attribute mappings, which are simple arrays made of AttrNumber elements with a length matching the number of attributes of the outer relation. The length of the attribute mapping has been treated as completely independent of the mapping itself until now, making it easy to pass down an incorrect mapping length. This commit refactors the code related to attribute mappings and moves it into an independent facility called attmap.c, extracted from tupconvert.c. This merges the attribute mapping with its length, so callers no longer have to guess the length of a mapping, as it is computed once, when the map is built. This will avoid mistakes like what has been fixed in `dc816e58`, which has used an incorrect mapping length by matching it with the number of attributes of an inner relation (a child partition) instead of an outer relation (a partitioned table). Author: Michael Paquier Reviewed-by: Amit Langote Discussion: https://postgr.es/m/20191121042556.GD153437@paquier.xyz	2019-12-18 16:23:02 +09:00
Amit Kapila	04c8a69c0c	Fix subscriber invalid memory access on DDL. This patch allows building the local relmap cache for a subscribed relation after processing pending invalidation messages and potential relcache updates. Without this, the attributes in the local cache don't tally with the updated relcache entry leading to invalid memory access. Reported-by: Jehan-Guillaume de Rorthais Author: Jehan-Guillaume de Rorthais and Vignesh C Reviewed-by: Amit Kapila Backpatch-through: 10 Discussion: https://postgr.es/m/20191025175929.7e90dbf5@firost	2019-12-18 07:49:18 +05:30
Michael Paquier	aa3ef7ff50	Fix some OBJS lists in two Makefiles to be ordered alphabetically These have been missed in `01368e5`, and count for plpython and the backend's tsearch code. Author: Mahendra Singh Discussion: https://postgr.es/m/CAKYtNAo4mxRRyDB0YqE6QLh17XD7pPQotpGm3GnHS+gQKz4zQQ@mail.gmail.com	2019-12-18 10:42:40 +09:00
Bruce Momjian	181932a032	Remove redundant not-null test Reported-by: Ranier Vilela Discussion: https://postgr.es/m/MN2PR18MB2927E73FADCA8967B2302469E3490@MN2PR18MB2927.namprd18.prod.outlook.com Author: Ranier Vilela Backpatch-through: master	2019-12-17 20:37:22 -05:00
Michael Paquier	70116493a8	Remove shadow variables linked to RedoRecPtr in xlog.c This changes the routines in charge of recycling WAL segments past the last redo LSN to no longer use "RedoRecPtr" as a local variable, which is also available in the context of the session as a static declaration, replacing it with "lastredoptr". This confusion has been introduced by `d9fadbf`, so backpatch down to v11 like the other commit. Thanks to Tom Lane, Robert Haas, Alvaro Herrera, Mark Dilger and Kyotaro Horiguchi for the input provided. Author: Ranier Vilela Discussion: https://postgr.es/m/MN2PR18MB2927F7B5F690065E1194B258E35D0@MN2PR18MB2927.namprd18.prod.outlook.com Backpatch-through: 11	2019-12-18 10:11:13 +09:00
Tom Lane	2acab054b3	Fix error reporting for index expressions of prohibited types. If CheckAttributeType() threw an error about the datatype of an index expression column, it would report an empty column name, which is pretty unhelpful and certainly not the intended behavior. I (tgl) evidently broke this in commit `cfc5008a5`, by not noticing that the column's attname was used above where I'd placed the assignment of it. In HEAD and v12, this is trivially fixable by moving up the assignment of attname. Before v12 the code is a bit more messy; to avoid doing substantial refactoring, I took the lazy way out and just put in two copies of the assignment code. Report and patch by Amit Langote. Back-patch to all supported branches. Discussion: https://postgr.es/m/CA+HiwqFA+BGyBFimjiYXXMa2Hc3fcL0+OJOyzUNjhU4NCa_XXw@mail.gmail.com	2019-12-17 17:44:27 -05:00
Robert Haas	5184f110aa	Fix bad formula in previous commit. Commit `d5406dea25` used a slightly novel, and wrong, approach to compute the length of the last toast chunk. It worked fine unless the last chunk happened to have the largest possible size.	2019-12-17 15:53:17 -05:00
Robert Haas	d5406dea25	Code cleanup for toast_fetch_datum and toast_fetch_datum_slice. Rework some of the checks for bad TOAST chunks to be a bit simpler and easier to understand. These checks verify that (1) we get all and only the chunk numbers we expect to see and (2) each chunk has the expected size. However, the existing code was a bit hard to understand, at least for me; try to make it clearer. As part of that, have toast_fetch_datum_slice check the relationship between endchunk and totalchunks only with an Assert() rather than checking every chunk number against both values. There's no need to check that relationship in production builds because it's not a function of whether on-disk corruption is present; it's just a question of whether the code does the right math. Also, have toast_fetch_datum_slice() use ereport(ERROR) rather than elog(ERROR). Commit `fd6ec93bf8` made the two functions inconsistent with each other. Rename assorted variables for better clarity and consistency, and move assorted variables from function scope to the function's main loop. Remove a few variables that are used only once entirely. Patch by me, reviewed by Peter Eisentraut. Discussion: http://postgr.es/m/CA+TgmobBzxwFojJ0zV0Own3dr09y43hp+OzU2VW+nos4PMXWEg@mail.gmail.com	2019-12-17 14:27:09 -05:00
Robert Haas	b1cc572f12	Add missing "void" to prototypes. Commit `5910d6c7e3` got this wrong. Report and patch by Andrew Gierth. Discussion: http://postgr.es/m/8736diaj98.fsf@news-spur.riddles.org.uk	2019-12-17 13:56:19 -05:00
Robert Haas	7dbfea3c45	Partially deduplicate interrupt handling for background processes. Where possible, share signal handler code and main loop interrupt checking. This saves quite a bit of code and should simplify maintenance, too. This commit intends not to change the way anything works, even though that might allow more code to be unified. It does unify a bunch of individual variables into a ShutdownRequestPending flag that is now used by a bunch of different process types, though. Patch by me, reviewed by Andres Freund and Daniel Gustafsson. Discussion: http://postgr.es/m/CA+TgmoZwDk=BguVDVa+qdA6SBKef=PKbaKDQALTC_9qoz1mJqg@mail.gmail.com	2019-12-17 13:14:28 -05:00
Robert Haas	1e53fe0e70	Use PostgresSigHupHandler in more places. There seems to be no reason for every background process to have its own flag indicating that a config-file reload is needed. Instead, let's just use ConfigFilePending for that purpose everywhere. Patch by me, reviewed by Andres Freund and Daniel Gustafsson. Discussion: http://postgr.es/m/CA+TgmoZwDk=BguVDVa+qdA6SBKef=PKbaKDQALTC_9qoz1mJqg@mail.gmail.com	2019-12-17 13:03:57 -05:00
Robert Haas	5910d6c7e3	Move interrupt-handling code into subroutines. Some auxiliary processes, as well as the autovacuum launcher, have interrupt handling code directly in their main loops. Try to abstract things a little better by moving it into separate functions. This doesn't make any functional difference, and leaves in place relatively large differences among processes in how interrupts are handled, but hopefully it at least makes it easier to see the commonalities and differences across process types. Patch by me, reviewed by Andres Freund and Daniel Gustafsson. Discussion: http://postgr.es/m/CA+TgmoZwDk=BguVDVa+qdA6SBKef=PKbaKDQALTC_9qoz1mJqg@mail.gmail.com	2019-12-17 12:55:13 -05:00
Amit Kapila	af3290f5e7	Change overly strict Assert in TransactionGroupUpdateXidStatus. This Assert thought that an overflowed transaction can never get registered for the group update. But that is not true, because even when the number of children for a transaction got reduced, the overflow flag is not changed. And, for group update, we only care about the current number of children for a transaction that is being committed. Based on comments by Andres Freund, remove a redundant Assert in TransactionIdSetPageStatus as we already had a static Assert for the same condition a few lines earlier. Reported-by: Vignesh C Author: Dilip Kumar Reviewed-by: Amit Kapila Backpatch-through: 11 Discussion: https://postgr.es/m/CAFiTN-s5=uJw-Z6JC9gcqtBSjXsrHnU63PXBrA=pnBjqnkm5UA@mail.gmail.com	2019-12-17 09:29:22 +05:30
Peter Geoghegan	fcf3b6917b	Rename nbtree tuple macros. Rename two function-style macros, removing the word "inner". This makes things more consistent.	2019-12-16 17:49:45 -08:00
Tom Lane	b925a00f4e	Fix "force_parallel_mode = regress" to work with ANALYZE + VERBOSE. force_parallel_mode = regress is supposed to force use of a Gather node without having any impact on EXPLAIN output. But it failed to accomplish that if both ANALYZE and VERBOSE are given, because that enables per-worker output data that you wouldn't see if the Gather hadn't been inserted. Improve the logic so that we suppress the per-worker data too. This allows putting the new test case added by commit `5935917ce` back into the originally intended form (cf. `776a2c887`, `22864f6e0`). We can also get rid of a kluge in subselect.sql, which previously had to clean up after force_parallel_mode's failure to do what it said on the tin. Discussion: https://postgr.es/m/18445.1576177309@sss.pgh.pa.us	2019-12-16 20:14:35 -05:00
Peter Geoghegan	9067b83955	Update nbtree README's "Scans during Recovery". get_actual_variable_range() hasn't used a dirty snapshot since commit `3ca930fc3`, which invented a new snapshot type specifically to meet selfuncs.c's requirements (HeapTupleSatisfiesNonVacuumable() type snapshots were added). Discussion: https://postgr.es/m/CAH2-Wzn2pSqEOcBDAA40CnO82oEy-EOpE2bNh_XL_cfFoA86jw@mail.gmail.com	2019-12-16 17:11:35 -08:00
Alvaro Herrera	91fca4bb60	Demote variable from global to local recoveryDelayUntilTime was introduced by commit `36da3cfb45` as a global because its method of operation was devilishly intricate. Commit `c945af80cf` removed all that complexity and could have turned it into a local variable, but didn't. Do so now. Discussion: https://postgr.es/m/20191213200751.GA10731@alvherre.pgsql Reviewed-by: Michaël Paquier, Daniel Gustafsson	2019-12-16 14:23:56 -03:00
Heikki Linnakangas	741b884353	Fix yet another crash in page split during GiST index creation. Commit `a7ee7c8513` fixed a bug in GiST page split during index creation, where we failed to re-find the position of a downlink after the page containing it was split. However, that fix was incomplete; the other call to gistinserttuples() in the same function needs to also clear 'downlinkoffnum'. Fixes bug #16134 reported by Alexander Lakhin, for real this time. The previous fix was enough to fix the crash with the reproducer script for bug #16162, but the original script for #16134 was still crashing. Backpatch to v12, like the previous incomplete fix. Discussion: https://www.postgresql.org/message-id/d869f537-abe4-d2ea-0510-38cd053f5152%40gmail.com	2019-12-16 13:57:41 +02:00
Peter Eisentraut	f14413b684	Sort out getpeereid() and peer auth handling on Windows The getpeereid() uses have so far been protected by HAVE_UNIX_SOCKETS, so they didn't ever care about Windows support. But in anticipation of Unix-domain socket support on Windows, that needs to be handled differently. Windows doesn't support getpeereid() at this time, so we use the existing not-supported code path. We let configure do its usual thing of picking up the replacement from libpgport, instead of the custom overrides that it was doing before. But then Windows doesn't have struct passwd, so this patch sprinkles some additional #ifdef WIN32 around to make it work. This is similar to existing code that deals with this issue. Reviewed-by: Andrew Dunstan <andrew.dunstan@2ndquadrant.com> Discussion: https://www.postgresql.org/message-id/5974caea-1267-7708-40f2-6009a9d653b0@2ndquadrant.com	2019-12-16 09:36:08 +01:00
Michael Paquier	e5a02e0fc6	Remove duplicated progress reporting during heap scan of VACUUM This has been introduced by `c16dc1a` since progress reporting for VACUUM has been added. As this issue just causes some extra work and is harmless, no backpatch is done. Author: Justin Pryzby Discussion: https://postgr.es/m/20191213030831.GT2082@telsasoft.com	2019-12-15 22:05:33 +09:00
Tom Lane	6ea364e7e7	Prevent overly-aggressive collapsing of joins to RTE_RESULT relations. The RTE_RESULT simplification logic added by commit `4be058fe9` had a flaw: it would collapse out a RTE_RESULT that is due to compute a PlaceHolderVar, and reassign the PHV to the parent join level, even if another input relation of the join contained a lateral reference to the PHV. That can't work because the PHV would be computed too late. In practice it led to failures of internal sanity checks later in planning (either assertion failures or errors such as "failed to construct the join relation"). To fix, add code to check for the presence of such PHVs in relevant portions of the query tree. Notably, this required refactoring range_table_walker so that a caller could ask to walk individual RTEs not the whole list. (It might be a good idea to refactor range_table_mutator in the same way, if only to keep those functions looking similar; but I didn't do so here as it wasn't necessary for the bug fix.) This exercise also taught me that find_dependent_phvs(), as it stood, could only safely be used on the entire Query, not on subtrees. Adjust its API to reflect that; which in passing allows it to have a fast path for the common case of no PHVs anywhere. Per report from Will Leinweber. Back-patch to v12 where the bug was introduced. Discussion: https://postgr.es/m/CALLb-4xJMd4GZt2YCecMC95H-PafuWNKcmps4HLRx2NHNBfB4g@mail.gmail.com	2019-12-14 13:49:15 -05:00
Michael Paquier	e0e569e1d1	Fix memory leak when initializing DH parameters in backend When loading DH parameters used for the generation of ephemeral DH keys in the backend, the code has never bothered releasing the memory used for the DH information loaded from a file or from libpq's default. This commit makes sure that the information is properly free()'d. Note that as SSL parameters can be reloaded, this can cause an accumulation of memory leaked. As the leak is minor, no backpatch is done. Reported-by: Dmitry Uspenskiy Discussion: https://postgr.es/m/16160-18367e56e9a28264@postgresql.org	2019-12-14 18:17:31 +09:00
Thomas Munro	7c85be08a2	Fix mdsyncfiletag(), take II. The previous commit failed to consider that FileGetRawDesc() might not return a valid fd, as discovered on the build farm. Switch to using the File interface only. Back-patch to 12, like the previous commit.	2019-12-14 18:35:58 +13:00
Thomas Munro	7bb3102cea	Don't use _mdfd_getseg() in mdsyncfiletag(). _mdfd_getseg() opens all segments up to the requested one. That causes problems for mdsyncfiletag(), if mdunlinkfork() has already unlinked other segment files. Open the file we want directly by name instead, if it's not already open. The consequence of this bug was a rare panic in the checkpointer, made more likely if you saturated the sync request queue so that the SYNC_FORGET_REQUEST messages for a given relation were more likely to be absorbed in separate cycles by the checkpointer. Back-patch to 12. Defect in commit `3eb77eba`. Author: Thomas Munro Reported-by: Justin Pryzby Discussion: https://postgr.es/m/20191119115759.GI30362%40telsasoft.com	2019-12-14 16:32:03 +13:00
Heikki Linnakangas	a7ee7c8513	Fix crash when a page was split during GiST index creation. The bug was similar to the one that was fixed in commit `22251686f0`. When we split page X and insert the downlink for the new page, the parent page might also need to be split. When that happens, the downlink offset number we remembered for X is no longer valid. We correctly called gistFindCorrectParent() to re-find it, but gistFindCorrectParent() doesn't do anything if the LSN of the page hasn't changed, and we stopped updating LSNs during index build in commit `9155580fd5`. The buggy codepath was taken if the page was split into three or more pages, and inserting the downlink caused the parent page to split. To fix, explicitly mark the downlink offset number as invalid, to force gistFindCorrectParent() to re-find it. Fixes bug #16134 reported by Alexander Lakhin, reported again as #16162 by Andreas Kunert. Thanks to Jeff Janes, Tom Lane and Tomas Vondra for debugging. Backpatch to v12, where we stopped WAL-logging during index build. Discussion: https://www.postgresql.org/message-id/16134-0423f729671dec64%40postgresql.org Discussion: https://www.postgresql.org/message-id/16162-45d21b7b6c1a3105%40postgresql.org	2019-12-13 23:58:10 +02:00
Tom Lane	1a3efa1eb6	Fix EXTRACT(ISOYEAR FROM timestamp) for years BC. The test cases added by commit `26ae3aa80` exposed an old oversight in timestamp[tz]_part: they didn't correct the result of date2isoyear() for BC years, so that we produced an off-by-one answer for such years. Fix that, and back-patch to all supported branches. Discussion: https://postgr.es/m/SG2PR06MB37762CAE45DB0F6CA7001EA9B6550@SG2PR06MB3776.apcprd06.prod.outlook.com	2019-12-12 12:30:43 -05:00
Tom Lane	26ae3aa80e	Remove redundant function calls in timestamp[tz]_part(). The DTK_DOW/DTK_ISODOW and DTK_DOY switch cases in timestamp_part() and timestamptz_part() contained calls of timestamp2tm() that were fully redundant with the ones done just above the switch. This evidently crept in during commit `258ee1b63`, which relocated that code from another place where the calls were indeed needed. Just delete the redundant calls. I (tgl) noted that our test coverage of these functions left quite a bit to be desired, so extend timestamp.sql and timestamptz.sql to cover all the branches. Back-patch to all supported branches, as the previous commit was. There's no real issue here other than some wasted cycles in some not-too-heavily-used code paths, but the test coverage seems valuable. Report and patch by Li Japin; test case adjustments by me. Discussion: https://postgr.es/m/SG2PR06MB37762CAE45DB0F6CA7001EA9B6550@SG2PR06MB3776.apcprd06.prod.outlook.com	2019-12-12 12:12:49 -05:00
Etsuro Fujita	a41a1456c4	Remove extra parenthesis from comment.	2019-12-12 15:45:00 +09:00
Tom Lane	591d404b9c	Add readfuncs.c support for AppendRelInfo. This is made necessary by the fact that commit `6ef77cf46` added AppendRelInfos to plan trees. I'd concluded that this extra code was not necessary because we don't transmit that data to parallel workers ... but I forgot about -DWRITE_READ_PARSE_PLAN_TREES. Per buildfarm.	2019-12-11 19:08:16 -05:00
Tom Lane	5935917ce5	Allow executor startup pruning to prune all child nodes. Previously, if the startup pruning logic proved that all child nodes of an Append or MergeAppend could be pruned, we still kept one, just to keep EXPLAIN from failing. The previous commit removed the ruleutils.c limitation that required this kluge, so drop it. That results in less-confusing EXPLAIN output, as per a complaint from Yuzuko Hosoya. David Rowley Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp	2019-12-11 17:05:30 -05:00
Tom Lane	6ef77cf46e	Further adjust EXPLAIN's choices of table alias names. This patch causes EXPLAIN to always assign a separate table alias to the parent RTE of an append relation (inheritance set); before, such RTEs were ignored if not actually scanned by the plan. Since the child RTEs now always have that same alias to start with (cf. commit `55a1954da`), the net effect is that the parent RTE usually gets the alias used or implied by the query text, and the children all get that alias with "_N" appended. (The exception to "usually" is if there are duplicate aliases in different subtrees of the original query; then some of those original RTEs will also have "_N" appended.) This results in more uniform output for partitioned-table plans than we had before: the partitioned table itself gets the original alias, and all child tables have aliases with "_N", rather than the previous behavior where one of the children would get an alias without "_N". The reason for giving the parent RTE an alias, even if it isn't scanned by the plan, is that we now use the parent's alias to qualify Vars that refer to an appendrel output column and appear above the Append or MergeAppend that computes the appendrel. But below the append, Vars refer to some one of the child relations, and are displayed that way. This seems clearer than the old behavior where a Var that could carry values from any child relation was displayed as if it referred to only one of them. While at it, change ruleutils.c so that the code paths used by EXPLAIN deal in Plan trees not PlanState trees. This effectively reverts a decision made in commit `1cc29fe7c`, which seemed like a good idea at the time to make ruleutils.c consistent with explain.c. However, it's problematic because we'd really like to allow executor startup pruning to remove all the children of an append node when possible, leaving no child PlanState to resolve Vars against. (That's not done here, but will be in the next patch.) This requires different handling of subplans and initplans than before, but is otherwise a pretty straightforward change. Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp	2019-12-11 17:05:18 -05:00
Alvaro Herrera	ba79cb5dc8	Emit parameter values during query bind/execute errors This makes such log entries more useful, since the cause of the error can be dependent on the parameter values. Author: Alexey Bashtanov, Álvaro Herrera Discussion: https://postgr.es/m/0146a67b-a22a-0519-9082-bc29756b93a2@imap.cc Reviewed-by: Peter Eisentraut, Andres Freund, Tom Lane	2019-12-11 18:03:35 -03:00
Tom Lane	16114f2ea0	Use only one thread to handle incoming signals on Windows. Since its inception, our Windows signal emulation code has worked by running a main signal thread that just watches for incoming signal requests, and then spawns a new thread to handle each such request. That design is meant for servers in which requests can take substantial effort to process, and it's worth parallelizing the handling of requests. But those assumptions are just bogus for our signal code. It's not much more than pg_queue_signal(), which is cheap and can't parallelize at all, plus we don't really expect lots of signals to arrive at the same backend at once. More importantly, this approach creates failure modes that we could do without: either inability to spawn a new thread or inability to create a new pipe handle will risk loss of signals. Hence, dispense with the separate per-signal threads and just service each request in-line in the main signal thread. This should be a bit faster (for the normal case of one signal at a time) as well as more robust. Patch by me; thanks to Andrew Dunstan for testing and Amit Kapila for review. Discussion: https://postgr.es/m/4412.1575748586@sss.pgh.pa.us	2019-12-11 15:09:54 -05:00
Peter Eisentraut	105eb360f2	Remove ATPrepSetStatistics It was once possible to do ALTER TABLE ... SET STATISTICS on system tables without allow_system_table_mods. This was changed apparently by accident between PostgreSQL 9.1 and 9.2, but a code comment still claimed this was possible. Without that functionality, having a separate ATPrepSetStatistics() is useless, so use the generic ATSimplePermissions() instead and move the remaining custom code into ATExecSetStatistics(). Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/cc8d2648-a0ec-7a86-13e5-db473484e19e%402ndquadrant.com	2019-12-11 09:04:04 +01:00
Michael Paquier	c341c7d391	Fix some compiler warnings with timestamp parsing in formatting.c gcc-7 used with a sufficient optimization level complains about warnings around do_to_timestamp() regarding the initialization and handling of some of its variables. Recent commits `66c74f8` and `d589f94` made the interface more confusing, so document which variables are always expected and properly initialize the optional ones when they are set. Author: Andrey Lepikhov, Michael Paquier Discussion: https://postgr.es/m/a7e28b83-27b1-4e1c-c76b-4268c4b785bc@postgrespro.ru	2019-12-11 10:01:06 +09:00
Tom Lane	8729fa7248	Fix tuple column count in pg_control_init(). Oversight in commit `2e4db241b`. Nathan Bossart Discussion: https://postgr.es/m/1B616360-396A-4482-AA28-375566C86160@amazon.com	2019-12-10 17:52:13 -05:00
Alvaro Herrera	6cafde1bd4	Add backend-only appendStringInfoStringQuoted This provides a mechanism to emit literal values in informative messages, such as query parameters. The new code is more complex than what it replaces, primarily because it wants to be more efficient. It also has the (currently unused) additional optional capability of specifying a maximum size to print. The new function lives out of common/stringinfo.c so that frontend users of that file need not pull in unnecessary multibyte-encoding support code. Author: Álvaro Herrera and Alexey Bashtanov, after a suggestion from Andres Freund Reviewed-by: Tom Lane Discussion: https://postgr.es/m/20190920203905.xkv5udsd5dxfs6tr@alap3.anarazel.de	2019-12-10 17:12:56 -03:00
Etsuro Fujita	5a20b0219e	Fix handling of multiple AFTER ROW triggers on a foreign table. AfterTriggerExecute() retrieves a fresh tuple or pair of tuples from a tuplestore and then stores the tuple(s) in the passed-in slot(s) if AFTER_TRIGGER_FDW_FETCH, while it uses the most-recently-retrieved tuple(s) stored in the slot(s) if AFTER_TRIGGER_FDW_REUSE. This was done correctly before 12, but commit `ff11e7f4b` broke it by mistakenly clearing the tuple(s) stored in the slot(s) in that function, leading to an assertion failure as reported in bug #16139 from Alexander Lakhin. Also, fix some other issues with the aforementioned commit in passing: * For tg_newslot, which is a slot added to the TriggerData struct by the commit to store new updated tuples, it didn't ensure the slot was NULL if there was no such tuple. * The commit failed to update the documentation about the trigger interface. Author: Etsuro Fujita Backpatch-through: 12 Discussion: https://postgr.es/m/16139-94f9ccf0db6119ec%40postgresql.org	2019-12-10 18:00:30 +09:00
Tom Lane	28e6a2fd63	Fix race condition in our Windows signal emulation. pg_signal_dispatch_thread() responded to the client (signal sender) and disconnected the pipe before actually setting the shared variables that make the signal visible to the backend process's main thread. In the worst case, it seems, effective delivery of the signal could be postponed for as long as the machine has any other work to do. To fix, just move the pg_queue_signal() call so that we do it before responding to the client. This essentially makes pgkill() synchronous, which is a stronger guarantee than we have on Unix. That may be overkill, but on the other hand we have not seen comparable timing bugs on any Unix platform. While at it, add some comments to this sadly underdocumented code. Problem diagnosis and fix by Amit Kapila; I just added the comments. Back-patch to all supported versions, as it appears that this can cause visible NOTIFY timing oddities on all of them, and there might be other misbehavior due to slow delivery of other signals. Discussion: https://postgr.es/m/32745.1575303812@sss.pgh.pa.us	2019-12-09 15:03:51 -05:00
Amit Kapila	2d0fdfacce	Fix typos in miscinit.c. Commit `f13ea95f9e` moved the description of postmaster.pid file contents from miscadmin.h to pidfile.h, but failed to update the comments in miscinit.c. Author: Hadi Moshayedi Reviewed-by: Amit Kapila Backpatch-through: 10 Discussion: https://postgr.es/m/CAK=1=WpYEM9x3LGkaxgXaxeYQjnkdW8XLsxrYRTE2Gq-H83FMw@mail.gmail.com	2019-12-09 08:39:34 +05:30
Jeff Davis	30d47723fd	Fix comments in execGrouping.c Commit `5dfc1981` missed updating some comments. Also, fix a comment typo found in passing. Author: Jeff Davis Discussion: https://postgr.es/m/9723131d247b919f94699152647fa87ee0bc02c2.camel%40j-davis.com	2019-12-06 11:49:59 -08:00
Tom Lane	fbbf68094c	Disallow non-default collation in ADD PRIMARY KEY/UNIQUE USING INDEX. When creating a uniqueness constraint using a pre-existing index, we have always required that the index have the same properties you'd get if you just let a new index get built. However, when collations were added, we forgot to add the index's collation to that check. It's hard to trip over this without intentionally trying to break it: you'd have to explicitly specify a different collation in CREATE INDEX, then convert it to a pkey or unique constraint. Still, if you did that, pg_dump would emit a script that fails to reproduce the index's collation. The main practical problem is that after a pg_upgrade the index would be corrupt, because its actual physical order wouldn't match what pg_index says. A more theoretical issue, which is new as of v12, is that if you create the index with a nondeterministic collation then it wouldn't be enforcing the normal notion of uniqueness, causing the constraint to mean something different from a normally-created constraint. To fix, just add collation to the conditions checked for index acceptability in ADD PRIMARY KEY/UNIQUE USING INDEX. We won't try to clean up after anybody who's already created such a situation; it seems improbable enough to not be worth the effort involved. (If you do get into trouble, a REINDEX should be enough to fix it.) In principle this is a long-standing bug, but I chose not to back-patch --- the odds of causing trouble seem about as great as the odds of preventing it, and both risks are very low anyway. Per report from Alexey Bashtanov, though this is not his preferred fix. Discussion: https://postgr.es/m/b05ce36a-cefb-ca5e-b386-a400535b1c0b@imap.cc	2019-12-06 11:25:09 -05:00
Peter Eisentraut	b1abfec825	Update minimum SSL version Change default of ssl_min_protocol_version to TLSv1.2 (from TLSv1, which means 1.0). Older versions are still supported, just not by default. TLS 1.0 is widely deprecated, and TLS 1.1 only slightly less so. All OpenSSL versions that support TLS 1.1 also support TLS 1.2, so there would be very little reason to, say, set the default to TLS 1.1 instead on grounds of better compatibility. The test suite overrides this new setting, so it can still run with older OpenSSL versions. Discussion: https://www.postgresql.org/message-id/flat/b327f8df-da98-054d-0cc5-b76a857cfed9%402ndquadrant.com	2019-12-04 22:07:43 +01:00
Etsuro Fujita	4af77aa797	Fix whitespace.	2019-12-04 12:45:00 +09:00
Michael Paquier	68ab982906	Fix thinkos from commit `9989d37` Error messages referring to incorrect WAL segment names could have been generated for a fsync() failure or when creating a new segment at the end of recovery.	2019-12-03 18:59:09 +09:00
Michael Paquier	9989d37d1c	Remove XLogFileNameP() from the tree XLogFileNameP() is a wrapper routine able to build a palloc'd string for a WAL segment name, which is used for error string generation. There were several code paths where it gets called in a critical section, where memory allocation is not allowed. This results in triggering an assertion failure instead of generating the wanted error message. Another, more annoying, problem is that if the allocation to generate the WAL segment name fails on OOM, then the failure would be escalated to a PANIC. This removes the routine and all its callers are replaced with a logic using a fixed-size buffer. This way, all the existing mistakes are fixed and future ones are prevented. Author: Masahiko Sawada Reviewed-by: Michael Paquier, Álvaro Herrera Discussion: https://postgr.es/m/CA+fd4k5gC9H4uoWMLg9K_QfNrnkkdEw+-AFveob9YX7z8JnKTA@mail.gmail.com	2019-12-03 15:06:04 +09:00
Tom Lane	55a1954da1	Fix EXPLAIN's column alias output for mismatched child tables. If an inheritance/partitioning parent table is assigned some column alias names in the query, EXPLAIN mapped those aliases onto the child tables' columns by physical position, resulting in bogus output if a child table's columns aren't one-for-one with the parent's. To fix, make expand_single_inheritance_child() generate a correctly re-mapped column alias list, rather than just copying the parent RTE's alias node. (We have to fill the alias field, not just adjust the eref field, because ruleutils.c will ignore eref in favor of looking at the real column names.) This means that child tables will now always have alias fields in plan rtables, where before they might not have. That results in a rather substantial set of regression test output changes: EXPLAIN will now always show child tables with aliases that match the parent table (usually with "_N" appended for uniqueness). But that seems like a net positive for understandability, since the parent alias corresponds to something that actually appeared in the original query, while the child table names didn't. (Note that this does not change anything for cases where an explicit table alias was written in the query for the parent table; it just makes cases without such aliases behave similarly to that.) Hence, while we could avoid these subsidiary changes if we made inherit.c more complicated, we choose not to. Discussion: https://postgr.es/m/12424.1575168015@sss.pgh.pa.us	2019-12-02 19:08:10 -05:00
Tom Lane	ce76c0ba53	Add a reverse-translation column number array to struct AppendRelInfo. This provides for cheaper mapping of child columns back to parent columns. The one existing use-case in examine_simple_variable() would hardly justify this by itself; but an upcoming bug fix will make use of this array in a mainstream code path, and it seems likely that we'll find other uses for it as we continue to build out the partitioning infrastructure. Discussion: https://postgr.es/m/12424.1575168015@sss.pgh.pa.us	2019-12-02 18:05:29 -05:00
Tom Lane	c35b714caf	Fix misbehavior with expression indexes on ON COMMIT DELETE ROWS tables. We implement ON COMMIT DELETE ROWS by truncating tables marked that way, which requires also truncating/rebuilding their indexes. But RelationTruncateIndexes asks the relcache for up-to-date copies of any index expressions, which may cause execution of eval_const_expressions on them, which can result in actual execution of subexpressions. This is a bad thing to have happening during ON COMMIT. Manuel Rigger reported that use of a SQL function resulted in crashes due to expectations that ActiveSnapshot would be set, which it isn't. The most obvious fix perhaps would be to push a snapshot during PreCommit_on_commit_actions, but I think that would just open the door to more problems: CommitTransaction explicitly expects that no user-defined code can be running at this point. Fortunately, since we know that no tuples exist to be indexed, there seems no need to use the real index expressions or predicates during RelationTruncateIndexes. We can set up dummy index expressions instead (we do need something that will expose the right data type, as there are places that build index tupdescs based on this), and just ignore predicates and exclusion constraints. In a green field it'd likely be better to reimplement ON COMMIT DELETE ROWS using the same "init fork" infrastructure used for unlogged relations. That seems impractical without catalog changes though, and even without that it'd be too big a change to back-patch. So for now do it like this. Per private report from Manuel Rigger. This has been broken forever, so back-patch to all supported branches.	2019-12-01 13:09:26 -05:00
Peter Eisentraut	e6c2d17c53	Small code simplification FLOAT8PASSBYVAL can be used instead of USE_FLOAT8_BYVAL here.	2019-11-29 10:55:31 +01:00
Peter Eisentraut	c4a7a392ec	Make allow_system_table_mods settable at run time Make allow_system_table_mods settable at run time by superusers. It was previously postmaster start only. We don't want to make system catalog DDL wide-open, but there are occasionally useful things to do like setting reloptions or statistics on a busy system table, and blocking those doesn't help anyone. Also, this enables the possibility of writing a test suite for this setting. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/8b00ea5e-28a7-88ba-e848-21528b632354%402ndquadrant.com	2019-11-29 10:22:13 +01:00
Peter Eisentraut	508bf95b76	Remove any-user DML capability from allow_system_table_mods Previously, allow_system_table_mods allowed a non-superuser to do DML on a system table without further permission checks. This has been removed, as it was quite inconsistent with the rest of the meaning of this setting. (Since allow_system_table_mods was previously only accessible with a server restart, it is unlikely that anyone was using this possibility.) Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/8b00ea5e-28a7-88ba-e848-21528b632354%402ndquadrant.com	2019-11-29 10:22:13 +01:00
Peter Eisentraut	d4feadeca1	Add error position to an error message Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/6e7aa4a1-be6a-1a75-b1f9-83a678e5184a@2ndquadrant.com	2019-11-29 09:10:17 +01:00
Tomas Vondra	6d61c3f1cb	Remove unnecessary clauses_attnums variable Commit `c676e659b2` reworked how choose_best_statistics() picks the best extended statistics, but failed to remove clauses_attnums which is now unnecessary. So get rid of it and backpatch to 12, same as `c676e659b2`. Author: Tomas Vondra Discussion: https://postgr.es/m/CA+u7OA7H5rcE2=8f263w4NZD6ipO_XOrYB816nuLXbmSTH9pQQ@mail.gmail.com Backpatch-through: 12	2019-11-28 23:25:14 +01:00
Tomas Vondra	c676e659b2	Fix choose_best_statistics to check clauses individually When picking the best extended statistics object for a list of clauses, it's not enough to look at attnums extracted from the clause list as a whole. Consider for example this query with OR clauses: SELECT * FROM t WHERE (t.a = 1) OR (t.b = 1) OR (t.c = 1) with a statistics object defined on columns (a,b). Relying on attnums extracted from the whole OR clause, we'd consider the statistics usable. That does not work, as we see the conditions as a single OR-clause, referencing an attribute not covered by the statistic, leading to an empty list of clauses to be estimated using the statistics and an assert failure. This changes choose_best_statistics to check which clauses are actually covered, and only using attributes from the fully covered ones. For the previous example this means the statistics object will not be considered as compatible with the OR-clause. Backpatch to 12, where MCVs were introduced. The issue does not affect older versions because functional dependencies don't handle OR clauses. Author: Tomas Vondra Reviewed-by: Dean Rasheed Reported-By: Manuel Rigger Discussion: https://postgr.es/m/CA+u7OA7H5rcE2=8f263w4NZD6ipO_XOrYB816nuLXbmSTH9pQQ@mail.gmail.com Backpatch-through: 12	2019-11-28 22:20:45 +01:00
Alvaro Herrera	3974c4a724	Remove useless "return;" lines Discussion: https://postgr.es/m/20191128144653.GA27883@alvherre.pgsql	2019-11-28 16:48:37 -03:00
Etsuro Fujita	47a3c7fa06	Fix typo in comment.	2019-11-27 16:00:45 +09:00
Tom Lane	553d2ec271	Allow access to child table statistics if user can read parent table. The fix for CVE-2017-7484 disallowed use of pg_statistic data for planning purposes if the user would not be able to select the associated column and a non-leakproof function is to be applied to the statistics values. That turns out to disable use of pg_statistic data in some common cases involving inheritance/partitioning, where the user does have permission to select from the parent table that was actually named in the query, but not from a child table whose stats are needed. Since, in non-corner cases, the user can select the child table's data via the parent, this restriction is not actually useful from a security standpoint. Improve the logic so that we also check the permissions of the originally-named table, and allow access if select permission exists for that. When checking access to stats for a simple child column, we can map the child column number back to the parent, and perform this test exactly (including not allowing access if the child column isn't exposed by the parent). For expression indexes, the current logic just insists on whole-table select access, and this patch allows access if the user can select the whole parent table. In principle, if the child table has extra columns, this might allow access to stats on columns the user can't read. In practice, it's unlikely that the planner is going to do any stats calculations involving expressions that are not visible to the query, so we'll ignore that fine point for now. Perhaps someday we'll improve that logic to detect exactly which columns are used by an expression index ... but today is not that day. Back-patch to v11. The issue was created in 9.2 and up by the CVE-2017-7484 fix, but this patch depends on the append_rel_array[] planner data structure which only exists in v11 and up. In practice the issue is most urgent with partitioned tables, so fixing v11 and later should satisfy much of the practical need. Dilip Kumar and Amit Langote, with some kibitzing by me Discussion: https://postgr.es/m/3876.1531261875@sss.pgh.pa.us	2019-11-26 14:41:48 -05:00
Michael Paquier	12198239c0	Add safeguards for pg_fsync() called with incorrectly-opened fds On some platforms, fsync() returns EBADFD when opening a file descriptor with O_RDONLY (read-only), leading ultimately now to a PANIC to prevent data corruption. This commit adds a new sanity check in pg_fsync() based on fcntl() to make sure that we don't repeat mistakes with incorrectly-set file descriptors, so that problems are detected at an early stage. Without that, such errors could only be detected after running Postgres on a specific supported platform for the culprit code path, which could take some time before being found. `b8e19b93` was a fix for such a problem, which went undetected for more than 5 years, and `a586cc4b` fixed another similar issue. Note that the new check also works when fsync=off is configured, so that all regression tests would detect problems as long as assertions are enabled. Since fcntl() is not available on Windows, the new checks do not happen there. Author: Michael Paquier Reviewed-by: Mark Dilger Discussion: https://postgr.es/m/20191009062640.GB21379@paquier.xyz	2019-11-26 13:32:52 +09:00
Amit Kapila	080313f829	Don't shut down Gather[Merge] early under Limit. Revert part of commit `19df1702f5`. Early shutdown was added by that commit so that we could collect statistics from workers, but unfortunately, it interacted badly with rescans. The problem is that we ended up destroying the parallel context which is required for rescans. This leads to rescans of a Limit node over a Gather node to produce unpredictable results as it tries to access destroyed parallel context. By reverting the early shutdown code, we might lose statistics in some cases of Limit over Gather [Merge], but that will require further study to fix. Reported-by: Jerry Sievers Diagnosed-by: Thomas Munro Author: Amit Kapila, testcase by Vignesh C Backpatch-through: 9.6 Discussion: https://postgr.es/m/87ims2amh6.fsf@jsievers.enova.com	2019-11-26 08:30:24 +05:30
Robert Haas	0d3c3aae33	Use procsignal_sigusr1_handler for auxiliary processes. AuxiliaryProcessMain does ProcSignalInit, so one might expect that auxiliary processes would need to respond to SendProcSignal, but none of the auxiliary processes do that. Change them to use procsignal_sigusr1_handler instead of their own private handlers so that they do. Besides seeming more correct, this is also less code. It shouldn't make any functional difference right now because, as far as we know, there are no current cases where SendProcSignal targets an auxiliary process, but there are plans to change that in the future. Andres Freund Discussion: http://postgr.es/m/20181030051643.elbxjww5jjgnjaxg@alap3.anarazel.de	2019-11-25 16:16:27 -05:00
Alvaro Herrera	0dc8ead463	Refactor WAL file-reading code into WALRead() XLogReader, walsender and pg_waldump all had their own routines to read data from WAL files to memory, with slightly different approaches according to the particular conditions of each environment. There's a lot of commonality, so we can refactor that into a single routine WALRead in XLogReader, and move the differences to a separate (simpler) callback that just opens the next WAL-segment. This results in a clearer (ahem) code flow. The error reporting needs are covered by filling in a new error-info struct, WALReadError, and it's the caller's responsibility to act on it. The backend has WALReadRaiseError() to do so. We no longer ever need to seek in this interface; switch to using pg_pread(). Author: Antonin Houska, with contributions from Álvaro Herrera Reviewed-by: Michaël Paquier, Kyotaro Horiguchi Discussion: https://postgr.es/m/14984.1554998742@spoje.net	2019-11-25 15:04:54 -03:00
Tom Lane	5883f5fe27	Fix unportable printf format introduced in commit `9290ad198`. "%ld" is not an acceptable format spec for int64 variables, though it accidentally works on most non-Windows 64-bit platforms. Follow the lead of commit `6a1cd8b92`, and use "%lld" with an explicit cast to long long. Per buildfarm.	2019-11-25 10:48:36 -05:00
Amit Kapila	e0487223ec	Make the order of the header file includes consistent. Similar to commits `14aec03502`, `7e735035f2` and `dddf4cdc33`, this commit makes the order of header file inclusion consistent in more places. Author: Vignesh C Reviewed-by: Amit Kapila Discussion: https://postgr.es/m/CALDaNm2Sznv8RR6Ex-iJO6xAdsxgWhCoETkaYX=+9DW3q0QCfA@mail.gmail.com	2019-11-25 08:08:57 +05:30
Michael Paquier	2aa84520b3	Fix inconsistent variable name in static function of mac8.c Both argument names were reversed in the declaration of the function. Author: Ranier Vilela Discussion: https://postgr.es/m/MN2PR18MB292755AEFF9A9144B220ABEEE34B0@MN2PR18MB2927.namprd18.prod.outlook.com	2019-11-25 09:57:35 +09:00
Michael Paquier	4cb658af70	Refactor reloption handling for index AMs in-core This reworks the reloption parsing and build of a couple of index AMs by creating new structures for each index AM's options. This split was already done for BRIN, GIN and GiST (which actually has a fillfactor parameter), but not for hash, B-tree and SPGiST which relied on StdRdOptions due to an overlap with the default option set. This saves a couple of bytes for rd_options in each relcache entry with indexes making use of relation options, and brings more consistency between all index AMs. While on it, add a couple of AssertMacro() calls to make sure that utility macros to grab values of reloptions are used with the expected index AM. Author: Nikolay Shaplov Reviewed-by: Amit Langote, Michael Paquier, Álvaro Herrera, Dent John Discussion: https://postgr.es/m/4127670.gFlpRb6XCm@x200m	2019-11-25 09:40:53 +09:00
Tom Lane	d3aa114ac4	Doc: improve discussion of race conditions involved in LISTEN. The user docs didn't really explain how to use LISTEN safely, so clarify that. Also clean up some fuzzy-headed explanations in comments. No code changes. Discussion: https://postgr.es/m/3ac7f397-4d5f-be8e-f354-440020675694@gmail.com	2019-11-24 18:03:39 -05:00
Tom Lane	6b802cfc7f	Avoid assertion failure with LISTEN in a serializable transaction. If LISTEN is the only action in a serializable-mode transaction, and the session was not previously listening, and the notify queue is not empty, predicate.c reported an assertion failure. That happened because we'd acquire the transaction's initial snapshot during PreCommit_Notify, which was called after predicate.c expects any such snapshot to have been established. To fix, just swap the order of the PreCommit_Notify and PreCommit_CheckForSerializationFailure calls during CommitTransaction. This will imply holding the notify-insertion lock slightly longer, but the difference could only be meaningful in serializable mode, which is an expensive option anyway. It appears that this is just an assertion failure, with no consequences in non-assert builds. A snapshot used only to scan the notify queue could not have been involved in any serialization conflicts, so there would be nothing for PreCommit_CheckForSerializationFailure to do except assign it a prepareSeqNo and set the SXACT_FLAG_PREPARED flag. And given no conflicts, neither of those omissions affect the behavior of ReleasePredicateLocks. This admittedly once-over-lightly analysis is backed up by the lack of field reports of trouble. Per report from Mark Dilger. The bug is old, so back-patch to all supported branches; but the new test case only goes back to 9.6, for lack of adequate isolationtester infrastructure before that. Discussion: https://postgr.es/m/3ac7f397-4d5f-be8e-f354-440020675694@gmail.com Discussion: https://postgr.es/m/13881.1574557302@sss.pgh.pa.us	2019-11-24 15:57:49 -05:00
Tom Lane	7900269724	Stabilize NOTIFY behavior by transmitting notifies before ReadyForQuery. This patch ensures that, if any notify messages were received during a just-finished transaction, they get sent to the frontend just before, not just after, the ReadyForQuery message. With libpq and other client libraries that act similarly, this guarantees that the client will see the notify messages as available as soon as it thinks the transaction is done. This probably makes no difference in practice, since in realistic use-cases the application would have to cope with asynchronous arrival of notify events anyhow. However, it makes it a lot easier to build cross-session-notify test cases with stable behavior. I'm a bit surprised now that we've not seen any buildfarm instability with the test cases added by commit `b10f40bf0`. Tests that I intend to add in an upcoming bug fix are definitely unstable without this. Back-patch to 9.6, which is as far back as we can do NOTIFY testing with the isolationtester infrastructure. Discussion: https://postgr.es/m/13881.1574557302@sss.pgh.pa.us	2019-11-24 14:42:59 -05:00
Tom Lane	8b7ae5a82d	Stabilize the results of pg_notification_queue_usage(). This function wasn't touched in commit `51004c717`, but that turns out to be a bad idea, because its results now include any dead space that exists in the NOTIFY queue on account of our being lazy about advancing the queue tail. Notably, the isolation tests now fail if run twice without a server restart between, because async-notify's first test of the function will already show a positive value. It seems likely that end users would be equally unhappy about the result's instability. To fix, just make the function call asyncQueueAdvanceTail before computing its result. That should end in producing the same value as before, and it's hard to believe that there's any practical use-case where pg_notification_queue_usage() is called so often as to create a performance degradation, especially compared to what we did before. Out of paranoia, also mark this function parallel-restricted (it was volatile, but parallel-safe by default, before). Although the code seems to work fine when run in a parallel worker, that's outside the design scope of async.c, and it's a bit scary to have intentional side-effects happening in a parallel worker. There seems no plausible use-case where it'd be important to try to parallelize this, so let's not take any risk of introducing new bugs. In passing, re-pgindent async.c and run reformat-dat-files on pg_proc.dat, just because I'm a neatnik. Discussion: https://postgr.es/m/13881.1574557302@sss.pgh.pa.us	2019-11-24 14:09:33 -05:00
Alvaro Herrera	45ff049e28	Remove debugging aid This Assert(false) was not supposed to be in the committed copy. Reported by: Tom Lane Discussion: https://postgr.es/m/26476.1574525468@sss.pgh.pa.us	2019-11-23 13:19:20 -03:00
Joe Conway	f7a2002e82	Add object TRUNCATE hook All operations with acl permissions checks should have a corresponding hook so that, for example, mandatory access control (MAC) may be enforced by an extension. The command TRUNCATE is missing this hook, so add it. Patch by Yuli Khodorkovskiy with some editorialization by me. Based on the discussion not back-patched. A separate patch will exercise the hook in the sepgsql extension. Author: Yuli Khodorkovskiy Reviewed-by: Joe Conway Discussion: https://postgr.es/m/CAFL5wJcomybj1Xdw7qWmPJRpGuFukKgNrDb6uVBaCMgYS9dkaA%40mail.gmail.com	2019-11-23 10:39:20 -05:00
Tom Lane	4d9ceb0018	Fix bogus tuple-slot management in logical replication UPDATE handling. slot_modify_cstrings seriously abused the TupleTableSlot API by relying on a slot's underlying data to stay valid across ExecClearTuple. Since this abuse was also quite undocumented, it's little surprise that the case got broken during the v12 slot rewrites. As reported in bug #16129 from Ondřej Jirman, this could lead to crashes or data corruption when a logical replication subscriber processes a row update. Problems would only arise if the subscriber's table contained columns of pass-by-ref types that were not being copied from the publisher. Fix by explicitly copying the datum/isnull arrays from the source slot that the old row was in already. This ends up being about the same thing that happened pre-v12, but hopefully in a less opaque and fragile way. We might've caught the problem sooner if there were any test cases dealing with updates involving non-replicated or dropped columns. Now there are. Back-patch to v10 where this code came in. Even though the failure does not manifest before v12, IMO this code is too fragile to leave as-is. In any case we certainly want the additional test coverage. Patch by me; thanks to Tomas Vondra for initial investigation. Discussion: https://postgr.es/m/16129-a0c0f48e71741e5f@postgresql.org	2019-11-22 11:31:19 -05:00
Tom Lane	4a0aab14dc	Defend against self-referential views in relation_is_updatable(). While a self-referential view doesn't actually work, it's possible to create one, and it turns out that this breaks some of the information_schema views. Those views call relation_is_updatable(), which neglected to consider the hazards of being recursive. In older PG versions you get a "stack depth limit exceeded" error, but since v10 it'd recurse to the point of stack overrun and crash, because commit `a4c35ea1c` took out the expression_returns_set() call that was incidentally checking the stack depth. Since this function is only used by information_schema views, it seems like it'd be better to return "not updatable" than suffer an error. Hence, add tracking of what views we're examining, in just the same way that the nearby fireRIRrules() code detects self-referential views. I added a check_stack_depth() call too, just to be defensive. Per private report from Manuel Rigger. Back-patch to all supported versions.	2019-11-21 16:21:43 -05:00
Peter Eisentraut	2e4db241bf	Remove configure --disable-float4-byval This build option was only useful to maintain compatibility for version-0 functions, but those are no longer supported, so this option can be removed. float4 is now always pass-by-value; the pass-by-reference code path is completely removed. Discussion: https://www.postgresql.org/message-id/flat/f3e1e576-2749-bbd7-2d57-3f9dcf75255a@2ndquadrant.com	2019-11-21 18:29:21 +01:00
Fujii Masao	e6d8069522	Make DROP DATABASE command generate less WAL records. Previously DROP DATABASE generated as many XLOG_DBASE_DROP WAL records as the number of tablespaces that the database to drop uses. This caused shared_buffers to be scanned as many times as there are tablespaces during recovery, because WAL replay of one XLOG_DBASE_DROP record needs that full scan. This could make the recovery time longer, especially when shared_buffers is large. This commit changes DROP DATABASE so that it generates only one XLOG_DBASE_DROP record, and registers the information of all the tablespaces into it. Then, WAL replay of the XLOG_DBASE_DROP record needs a full scan of shared_buffers only once, which may improve the recovery performance. Author: Fujii Masao Reviewed-by: Kirk Jamison, Simon Riggs Discussion: https://postgr.es/m/CAHGQGwF8YwNH0ZaL+2wjZPkj+ji9UhC+Z4ScnG97WKtVY5L9iw@mail.gmail.com	2019-11-21 21:10:37 +09:00
Fujii Masao	30840c92ac	Allow ALTER VIEW command to rename the column in the view. The ALTER TABLE RENAME COLUMN command can always be used to rename the column in the view, but it's reasonable to add that syntax to ALTER VIEW too. Author: Fujii Masao Reviewed-by: Ibrar Ahmed, Yu Kimura Discussion: https://postgr.es/m/CAHGQGwHoQMD3b-MqTLcp1MgdhCpOKU7QNRwjFooT4_d+ti5v6g@mail.gmail.com	2019-11-21 19:55:13 +09:00
Amit Kapila	9290ad198b	Track statistics for spilling of changes from ReorderBuffer. This adds the statistics about transactions spilled to disk from ReorderBuffer. Users can query the pg_stat_replication view to check these stats. Author: Tomas Vondra, with bug-fixes and minor changes by Dilip Kumar Reviewed-by: Amit Kapila Discussion: https://postgr.es/m/688b0b7f-2f6c-d827-c27b-216a8e3ea700@2ndquadrant.com	2019-11-21 08:06:51 +05:30
Michael Paquier	168d206400	Provide statistics for hypothetical BRIN indexes Trying to use hypothetical indexes with BRIN currently fails when trying to access a relation that does not exist when looking for the statistics. With the current API, it is not possible to easily pass a value for pages_per_range down to the hypothetical index, so this makes use of the default value of BRIN_DEFAULT_PAGES_PER_RANGE, which should be fine enough in most cases. Being able to refine or enforce the hypothetical costs in more optimistic ways would require more refactoring by filling in the statistics when building IndexOptInfo in plancat.c. This would involve ABI breakages around the costing routines, something not fit for stable branches. This is broken since `7e534ad`, so backpatch down to v10. Author: Julien Rouhaud, Heikki Linnakangas Reviewed-by: Álvaro Herrera, Tom Lane, Michael Paquier Discussion: https://postgr.es/m/CAOBaU_ZH0LKEA8VFCocr6Lpte1ab0b6FpvgS0y4way+RPSXfYg@mail.gmail.com Backpatch-through: 10	2019-11-21 10:23:28 +09:00
Tom Lane	9ff5b699ed	Sync patternsel_common's operator selection logic with pattern_prefix's. Make patternsel_common() select the comparison operators to use with hardwired logic that matches pattern_prefix()'s new logic, eliminating its dependencies on particular index opfamilies. This shouldn't change any behavior, as it's just replacing runtime operator lookups with the same values hard-wired. But it makes these closely-related functions look more alike, and saving some runtime syscache lookups is worth something. Actually, it's not quite true that this is zero behavioral change: when estimating for a column of type "name", the comparison constant will be kept as "text" not coerced to "name". But that's more correct anyway, and it allows additional simplification of the coercion logic, again syncing this more closely with pattern_prefix(). Per consideration of a report from Manuel Rigger. Discussion: https://postgr.es/m/CA+u7OA7nnGYy8rY0vdTe811NuA+Frr9nbcBO9u2Z+JxqNaud+g@mail.gmail.com	2019-11-20 15:00:18 -05:00
Peter Geoghegan	9f0f12ac57	Fix HeapTupleSatisfiesNonVacuumable() comment. Oversight in commit `63746189b2`.	2019-11-20 11:36:54 -08:00
Tom Lane	2ddedcafca	Reduce match_pattern_prefix()'s dependencies on index opfamilies. Historically, the planner's LIKE/regex index optimizations were only carried out for specific index opfamilies. That's never been a great idea from the standpoint of extensibility, but it didn't matter so much as long as we had no practical way to extend such behaviors anyway. With the addition of planner support functions, and in view of ongoing work to support additional table and index AMs, it seems like a good time to relax this. Hence, recast the decisions in match_pattern_prefix() so that rather than decide which operators to generate by looking at what the index opfamily contains, we decide which operators to generate a-priori and then see if the opfamily supports them. This is much more defensible from a semantic standpoint anyway, since we know the semantics of the chosen operators precisely, and we only need to assume that the opfamily correctly implements operators it claims to support. The existing "pattern" opfamilies put a crimp in this approach, since we need to select the pattern operators if we want those to work. So we still have to special-case those opfamilies. But that seems all right, since in view of the addition of collations, the pattern opfamilies seem like a legacy hack that nobody will be building on. The only immediate effect of this change, so far as the core code is concerned, is that anchored LIKE/regex patterns can be mapped onto BRIN index searches, and exact-match patterns can be mapped onto hash indexes, not only btree and spgist indexes as before. That's not a terribly exciting result, but it does fix an omission mentioned in the ancient comments here. Note: no catversion bump, even though this touches pg_operator.dat, because it's only adding OID macros not changing the contents of postgres.bki. Per consideration of a report from Manuel Rigger. Discussion: https://postgr.es/m/CA+u7OA7nnGYy8rY0vdTe811NuA+Frr9nbcBO9u2Z+JxqNaud+g@mail.gmail.com	2019-11-20 14:13:04 -05:00
Tom Lane	b3c265d7be	Fix corner-case failure in match_pattern_prefix(). The planner's optimization code for LIKE and regex operators could error out with a complaint like "no = operator for opfamily NNN" if someone created a binary-compatible index (for example, a bpchar_ops index on a text column) on the LIKE's left argument. This is a consequence of careless refactoring in commit `74dfe58a5`. The old code in match_special_index_operator only accepted specific combinations of the pattern operator and the index opclass, thereby indirectly guaranteeing that the opclass would have a comparison operator with the same LHS input type as the pattern operator. While moving the logic out to a planner support function, I simplified that test in a way that no longer guarantees that. Really though we'd like an altogether weaker dependency on the opclass, so rather than put back exactly the old code, just allow lookup failure. I have in mind now to rewrite this logic completely, but this is the minimum change needed to fix the bug in v12. Per report from Manuel Rigger. Back-patch to v12 where the mistake came in. Discussion: https://postgr.es/m/CA+u7OA7nnGYy8rY0vdTe811NuA+Frr9nbcBO9u2Z+JxqNaud+g@mail.gmail.com	2019-11-19 17:03:34 -05:00
Alexander Korotkov	b107140804	Fix page modification outside of critical section in GIN By oversight `52ac6cd2d0` makes ginDeletePage() set pd_prune_xid of the page to be deleted before entering the critical section. It appears that only versions 11 and later were affected by this oversight. Backpatch-through: 11	2019-11-20 00:12:33 +03:00
Alexander Korotkov	32ca32d0be	Revise GIN README We find GIN concurrency bugs from time to time. One of the problems here is that concurrency of GIN isn't well-documented in the README. So, it can even be hard to distinguish design bugs from implementation bugs. This commit revises the concurrency section in the GIN README, providing more details. Some examples are illustrated in ASCII art. Also, this commit adds an explanation of how the tuple layout in an internal GIN B-tree page differs from that of nbtree. Discussion: https://postgr.es/m/CAPpHfduXR_ywyaVN4%2BOYEGaw%3DcPLzWX6RxYLBncKw8de9vOkqw%40mail.gmail.com Author: Alexander Korotkov Reviewed-by: Peter Geoghegan Backpatch-through: 9.4	2019-11-20 00:04:22 +03:00
Alexander Korotkov	d5ad7a09af	Fix traversing to the deleted GIN page via downlink Current GIN code does not appear to handle traversing to a deleted page via downlink. This commit fixes that by stepping right from the deleted page like we do in nbtree. This commit also fixes setting the 'deleted' flag on GIN pages. Now other page flags are not erased once a page is deleted. That helps to keep our assertions true if we arrive at a deleted page via downlink. Discussion: https://postgr.es/m/CAPpHfdvMvsw-NcE5bRS7R1BbvA4BxoDnVVjkXC5W0Czvy9LVrg%40mail.gmail.com Author: Alexander Korotkov Reviewed-by: Peter Geoghegan Backpatch-through: 9.4	2019-11-20 00:04:22 +03:00
Alexander Korotkov	e14641197a	Fix deadlock between ginDeletePage() and ginStepRight() When ginDeletePage() is about to delete a page, it locks its left sibling to revise the rightlink. So, it locks pages in a right-to-left manner. At the same time, ginStepRight() locks pages in a left-to-right manner, and that could cause a deadlock. This commit makes ginScanToDelete() keep an exclusive lock on the left siblings of the currently investigated path. That eliminates the need to relock the left sibling in ginDeletePage(). Thus, a deadlock with ginStepRight() can't happen anymore. Reported-by: Chen Huajun Discussion: https://postgr.es/m/5c332bd1.87b6.16d7c17aa98.Coremail.chjischj%40163.com Author: Alexander Korotkov Reviewed-by: Peter Geoghegan Backpatch-through: 10	2019-11-20 00:04:09 +03:00
Amit Kapila	cec2edfa78	Add logical_decoding_work_mem to limit ReorderBuffer memory usage. Instead of deciding to serialize a transaction merely based on the number of changes in that xact (toplevel or subxact), this makes the decisions based on amount of memory consumed by the changes. The memory limit is defined by a new logical_decoding_work_mem GUC, so for example we can do this SET logical_decoding_work_mem = '128kB' to reduce the memory usage of walsenders or set the higher value to reduce disk writes. The minimum value is 64kB. When adding a change to a transaction, we account for the size in two places. Firstly, in the ReorderBuffer, which is then used to decide if we reached the total memory limit. And secondly in the transaction the change belongs to, so that we can pick the largest transaction to evict (and serialize to disk). We still use max_changes_in_memory when loading changes serialized to disk. The trouble is we can't use the memory limit directly as there might be multiple subxact serialized, we need to read all of them but we don't know how many are there (and which subxact to read first). We do not serialize the ReorderBufferTXN entries, so if there is a transaction with many subxacts, most memory may be in this type of objects. Those records are not included in the memory accounting. We also do not account for INTERNAL_TUPLECID changes, which are kept in a separate list and not evicted from memory. Transactions with many CTID changes may consume significant amounts of memory, but we can't really do much about that. The current eviction algorithm is very simple - the transaction is picked merely by size, while it might be useful to also consider age (LSN) of the changes for example. With the new Generational memory allocator, evicting the oldest changes would make it more likely the memory gets actually pfreed. The logical_decoding_work_mem can be set in postgresql.conf, in which case it serves as the default for all publishers on that instance. Author: Tomas Vondra, with changes by Dilip Kumar and Amit Kapila Reviewed-by: Dilip Kumar and Amit Kapila Tested-By: Vignesh C Discussion: https://postgr.es/m/688b0b7f-2f6c-d827-c27b-216a8e3ea700@2ndquadrant.com	2019-11-19 07:32:36 +05:30
Peter Geoghegan	2110f71696	nbtree: Tweak _bt_pgaddtup() comments. Make it clear that _bt_pgaddtup() truncates the first data item on an internal page because its key is supposed to be treated as minus infinity within _bt_compare().	2019-11-18 13:04:53 -08:00
Tom Lane	bf2efc55da	Further fix dumping of views that contain just VALUES(...). It turns out that commit `e9f1c01b7` missed a case: we must print a VALUES clause in long format if get_query_def is given a resultDesc that would require the query's output column name(s) to be different from what the bare VALUES clause would produce. This applies in case an ALTER ... RENAME COLUMN has been done to a view that formerly could be printed in simple format, as shown in the added regression test case. It also explains bug #16119 from Dmitry Telpt, because it turns out that (unlike CREATE VIEW) CREATE MATERIALIZED VIEW fails to apply any column aliases it's given to the stored ON SELECT rule. So to get them to be printed, we have to account for the resultDesc renaming. It might be worth changing the matview code so that it creates the ON SELECT rule with the correct aliases; but we'd still need these messy checks in get_simple_values_rte to handle the case of a subsequent column rename, so any such change would be just neatnik-ism not a bug fix. Like the previous patch, back-patch to all supported branches. Discussion: https://postgr.es/m/16119-e64823f30a45a754@postgresql.org	2019-11-16 20:00:19 -05:00
Tomas Vondra	2dc08bd617	Properly determine length for on-disk TOAST values In detoast_attr_slice, VARSIZE_ANY was used to compute compressed length of on-disk TOAST values. That's incorrect, because the varlena value may be just a TOAST pointer, producing either bogus value or crashing. This is likely why the code was crashing on big-endian machines before `540f316809` replaced the VARSIZE with VARSIZE_ANY, which however only masked the issue. Reported-by: Rushabh Lathia Discussion: https://postgr.es/m/CAL-OGkthU9Gs7TZchf5OWaL-Gsi=hXqufTxKv9qpNG73d5na_g@mail.gmail.com	2019-11-16 03:07:11 +01:00
Tomas Vondra	d482f7f867	Skip system attributes when applying mvdistinct stats When estimating number of distinct groups, we failed to ignore system attributes when matching the group expressions to mvdistinct stats, causing failures like ERROR: negative bitmapset member not allowed Fix that by simply skipping anything that is not a regular attribute. Backpatch to PostgreSQL 10, where the extended stats were introduced. Bug: #16111 Reported-by: Tuomas Leikola Author: Tomas Vondra Backpatch-through: 10 Discussion: https://postgr.es/m/16111-687799584c3a7e73@postgresql.org	2019-11-16 01:17:15 +01:00
Thomas Munro	76cbfcdf3a	Always call ExecShutdownNode() if appropriate. Call ExecShutdownNode() after ExecutePlan()'s loop, rather than at each break. We had forgotten to do that in one case. The omission caused intermittent "temporary file leak" warnings from multi-batch parallel hash joins with a LIMIT clause. Back-patch to 11. Though the problem exists in theory in earlier parallel query releases, nothing really depended on it. Author: Kyotaro Horiguchi Reviewed-by: Thomas Munro, Amit Kapila Discussion: https://postgr.es/m/20191111.212418.2222262873417235945.horikyota.ntt%40gmail.com	2019-11-16 10:11:30 +13:00
Michael Paquier	50d22de932	Cleanup code in reloptions.h regarding reloption handling reloptions.h includes since `ba748f7` a set of macros to handle reloption types in a way similar to how parseRelOptions() works. They have never been used in the core code, and we have more simple methods now to parse and fill in rd_options for a given relation depending on its relkind, so remove this interface to simplify things. Per discussion between Amit Langote, Álvaro Herrera and me. Discussion: https://postgr.es/m/CA+HiwqE6zbNO92az6pp5GiTw4tr-9rfCE0t84whQSP+YwSKjMQ@mail.gmail.com	2019-11-14 13:59:59 +09:00
Michael Paquier	1bbd608fda	Split handling of reloptions for partitioned tables Partitioned tables do not have relation options yet, but, similarly to what's done for views which have their own parsing table, it could make sense to introduce new parameters for some of the existing default ones like fillfactor, autovacuum, etc. Splitting things has the advantage to make the information stored in rd_options include only the necessary information, reducing the amount of memory used for a relcache entry with partitioned tables if new reloptions are introduced at this level. Author: Nikolay Shaplov Reviewed-by: Amit Langote, Michael Paquier Discussion: https://postgr.es/m/1627387.Qykg9O6zpu@x200m	2019-11-14 12:34:28 +09:00
Andres Freund	7d962eaf50	Remove unused code from tuplesort. copytup_index() is unused, as tuplesort_putindextuplevalues() doesn't use COPYTUP(). Replace function body with an elog(ERROR), as already done e.g. for copytup_datum(). Author: Andres Freund Discussion: https://postgr.es/m/20191013144153.ooxrfglvnaocsrx2@alap3.anarazel.de	2019-11-13 15:57:01 -08:00
Tom Lane	d57d61533a	Add missing check_collation_set call to bpcharne(). We should throw an error for indeterminate collation, but bpcharne() was missing that logic, resulting in a much less user-friendly error (either an assertion failure or "cache lookup failed for collation 0"). Per report from Manuel Rigger. Back-patch to v12 where the mistake came in, evidently in commit `5e1963fb7`. (Before non-deterministic collations, this function wasn't collation sensitive.) Discussion: https://postgr.es/m/CA+u7OA4HOjtymxAbuGNh4-X_2R0Lw5n01tzvP8E5-i-2gQXYWA@mail.gmail.com	2019-11-13 15:53:53 -05:00
Tom Lane	0cafdd03a8	Fix silly initializations (cosmetic only). Initializing a pointer to "false" isn't per project style, and reportedly some compilers warn about it (though I've not seen any such warnings in the buildfarm). Seems to have come in with commit `ff11e7f4b`, so back-patch to v12 where that was added. Didier Gautheron Discussion: https://postgr.es/m/CAJRYxu+XQuM0qnSqt1Ujztu6fBPzMMAT3VEn6W32rgKG6A2Fsw@mail.gmail.com	2019-11-13 15:26:54 -05:00
Tom Lane	7bf40ea0d0	Avoid using SplitIdentifierString to parse ListenAddresses, too. This gets rid of our former behavior of forcibly downcasing the postmaster's hostname list and truncating the elements to NAMEDATALEN. In principle, DNS hostnames are case-insensitive so the first behavior should be harmless, and server hostnames are seldom long enough for the second behavior to be an issue. But it's still dubious, and an easy fix is available: just use SplitGUCList instead. AFAICT, all other SplitIdentifierString calls in the backend are OK: either the items actually are SQL identifiers, or they are keywords that are short and case-insensitive. Per thinking about bug #16106. While this has been wrong for a very long time, the lack of field complaints means there's little reason to back-patch. Discussion: https://postgr.es/m/16106-7d319e4295d08e70@postgresql.org	2019-11-13 13:51:58 -05:00
Tom Lane	7618eaf5f3	Avoid downcasing/truncation of RADIUS authentication parameters. Commit `6b76f1bb5` changed all the RADIUS auth parameters to be lists rather than single values. But its use of SplitIdentifierString to parse the list format was not very carefully thought through, because that function thinks it's parsing SQL identifiers, which means it will (a) downcase the strings and (b) truncate them to be shorter than NAMEDATALEN. While downcasing should be harmless for the server names and ports, it's just wrong for the shared secrets, and probably for the NAS Identifier strings as well. The truncation aspect is at least potentially a problem too, though typical values for these parameters would fit in 63 bytes. Fortunately, we now have a function SplitGUCList that is exactly the same except for not doing the two unwanted things, so fixing this is a trivial matter of calling that function instead. While here, improve the documentation to show how to double-quote the parameter values. I failed to resist the temptation to do some copy-editing as well. Report and patch from Marcos David (bug #16106); doc changes by me. Back-patch to v10 where the aforesaid commit came in, since this is arguably a regression from our previous behavior with RADIUS auth. Discussion: https://postgr.es/m/16106-7d319e4295d08e70@postgresql.org	2019-11-13 13:41:04 -05:00
Tom Lane	2c7b5dad6e	Include TableFunc references when computing expression dependencies. The TableFunc node (i.e., XMLTABLE) includes type and collation OIDs that might not be referenced anywhere else in the expression tree, so they need to be accounted for when extracting dependencies. Fortunately, the practical effects of this are limited, since (a) it's somewhat unlikely that people would be extracting columns of non-builtin types from an XML document, and (b) in many scenarios, the query would contain other references to such types, or functions depending on them. However, it's not hard to construct examples wherein the existing code lets one drop a type used in XMLTABLE and thereby break a view. This is evidently an original oversight in the XMLTABLE patch, so back-patch to v10 where that came in. Discussion: https://postgr.es/m/18427.1573508501@sss.pgh.pa.us	2019-11-13 12:11:49 -05:00
Fujii Masao	7b8a899bde	Make pg_waldump report more detail information about PREPARE TRANSACTION record. This commit changes xact_desc() so that it reports detailed information about a PREPARE TRANSACTION record, like GID (global transaction identifier), timestamp at prepare transaction, delete-on-abort/commit relations, XID of subtransactions, and invalidation messages. These are helpful when diagnosing 2PC-related troubles. Author: Fujii Masao Reviewed-by: Michael Paquier, Andrey Lepikhov, Kyotaro Horiguchi, Julien Rouhaud, Alvaro Herrera Discussion: https://postgr.es/m/CAHGQGwEvhASad4JJnCv=0dW2TJypZgW_Vpb-oZik2a3utCqcrA@mail.gmail.com	2019-11-13 16:59:17 +09:00
Amit Kapila	1379fd537f	Introduce the 'force' option for the Drop Database command. This new option terminates the other sessions connected to the target database and then drops it. To terminate other sessions, the current user must have the required permissions (same as pg_terminate_backend()). We don't allow terminating the sessions if prepared transactions, active logical replication slots or subscriptions are present in the target database. Author: Pavel Stehule with changes by me Reviewed-by: Dilip Kumar, Vignesh C, Ibrar Ahmed, Anthony Nowocien, Ryan Lambert and Amit Kapila Discussion: https://postgr.es/m/CAP_rwwmLJJbn70vLOZFpxGw3XD7nLB_7+NKz46H5EOO2k5H7OQ@mail.gmail.com	2019-11-13 08:25:33 +05:30
Tom Lane	112caf9039	Finish reverting commit `0a52d378b`. Apply the solution adopted in commit `dcb7d3caf` (ie, explicitly don't call memcmp for a zero-length comparison) to func_get_detail() as well, removing one other place where we were passing an uninitialized array to a parse_func.c entry point. Discussion: https://postgr.es/m/MN2PR18MB2927F24692485D754794F01BE3740@MN2PR18MB2927.namprd18.prod.outlook.com Discussion: https://postgr.es/m/MN2PR18MB2927F6873DF2774A505AC298E3740@MN2PR18MB2927.namprd18.prod.outlook.com	2019-11-12 16:58:08 -05:00
Alvaro Herrera	5c46e7d82e	pg_stat_{ssl,gssapi}: Show only processes with connections It is pointless to show in those views auxiliary processes that don't open network connections. A small incompatibility is that anybody joining pg_stat_activity and pg_stat_ssl/pg_stat_gssapi will have to use a left join if they want to see such auxiliary processes. Author: Euler Taveira Discussion: https://postgr.es/m/20190904151535.GA29108@alvherre.pgsql	2019-11-12 18:48:41 -03:00
Peter Geoghegan	1f55ebae27	Make _bt_keep_natts_fast() use datum_image_eq(). An upcoming patch that adds deduplication to the nbtree AM will rely on _bt_keep_natts_fast() understanding that differences in TOAST input state can never affect its answer. In particular, two opclass-equal datums (with opclasses deemed safe for deduplication) should never be treated as unequal by _bt_keep_natts_fast() due to TOAST input differences. This also seems like a good idea on general principle. nbtsplitloc.c will now occasionally make better decisions about where to split a leaf page. The behavior of _bt_keep_natts_fast() is now somewhat closer to the behavior of _bt_keep_natts(). Discussion: https://postgr.es/m/CAH2-Wzn3Ee49Gmxb7V1VJ3-AC8fWn-Fr8pfWQebHe8rYRxt5OQ@mail.gmail.com	2019-11-12 13:08:41 -08:00
Alvaro Herrera	dcb7d3cafa	Have LookupFuncName accept NULL argtypes for 0 args Prior to this change, it required a valid pointer to be passed just so that it could be handed to a zero-byte memcmp, per `0a52d378b0`. Given the strange resulting code in callsites, it seems better to test for the case specifically and remove the requirement. Reported-by: Ranier Vilela Discussion: https://postgr.es/m/MN2PR18MB2927F24692485D754794F01BE3740@MN2PR18MB2927.namprd18.prod.outlook.com Discussion: https://postgr.es/m/MN2PR18MB2927F6873DF2774A505AC298E3740@MN2PR18MB2927.namprd18.prod.outlook.com	2019-11-12 17:06:58 -03:00
Peter Geoghegan	8c951687f5	Teach datum_image_eq() about cstring datums. Bring datum_image_eq() in line with datumIsEqual() by adding support for comparing cstring datums. An upcoming patch that adds deduplication to the nbtree AM will use datum_image_eq(). datum_image_eq() will need to work with all datatypes that can be used as the storage type of a B-Tree index column, including cstring. (cstring is used as the storage type for columns of type "name" as a space-saving optimization.) Discussion: https://postgr.es/m/CAH2-Wzn3Ee49Gmxb7V1VJ3-AC8fWn-Fr8pfWQebHe8rYRxt5OQ@mail.gmail.com	2019-11-12 11:25:34 -08:00
Tom Lane	7a0574b50e	Fix ecpglib.h to declare bool consistently with c.h. This completes the task begun in commit `1408d5d86`, to synchronize ECPG's exported definitions with the definition of bool used by c.h (and, therefore, the one actually in use in the ECPG library). On practically all modern platforms, ecpglib.h will now just include <stdbool.h>, which should surprise nobody anymore. That removes a header-inclusion-order hazard for ECPG clients, who previously might get build failures or unexpected behavior depending on whether they'd included <stdbool.h> themselves, and if so, whether before or after ecpglib.h. On platforms where sizeof(_Bool) is not 1 (only old PPC-based Mac systems, as far as I know), things are still messy, as inclusion of <stdbool.h> could still break ECPG client code. There doesn't seem to be any clean fix for that, and given the probably-negligible population of users who would care anymore, it's not clear we should go far out of our way to cope with it. This change at least fixes some header-inclusion-order hazards for our own code, since c.h and ecpglib.h previously disagreed on whether bool should be char or unsigned char. To implement this with minimal invasion of ECPG client namespace, move the choice of whether to rely on <stdbool.h> into configure, and have it export a configuration symbol PG_USE_STDBOOL. ecpglib.h no longer exports definitions for TRUE and FALSE, only their lowercase brethren. We could undo that if we get push-back about it. Ideally we'd back-patch this as far as v11, which is where c.h started to rely on <stdbool.h>. But the odds of creating problems for formerly-working ECPG client code seem about as large as the odds of fixing any non-working cases, so we'll just do this in HEAD. Discussion: https://postgr.es/m/CAA4eK1LmaKO7Du9M9Lo=kxGU8sB6aL8fa3sF6z6d5yYYVe3BuQ@mail.gmail.com	2019-11-12 13:00:04 -05:00
Amit Kapila	14aec03502	Make the order of the header file includes consistent in backend modules. Similar to commits `7e735035f2` and `dddf4cdc33`, this commit makes the order of header file inclusion consistent for backend modules. In passing, removed a couple of duplicate inclusions. Author: Vignesh C Reviewed-by: Kuntal Ghosh and Amit Kapila Discussion: https://postgr.es/m/CALDaNm2Sznv8RR6Ex-iJO6xAdsxgWhCoETkaYX=+9DW3q0QCfA@mail.gmail.com	2019-11-12 08:30:16 +05:30
Peter Eisentraut	d0c92527cc	Fix whitespace	2019-11-11 09:51:10 +01:00
Thomas Munro	db2687d1f3	Optimize PredicateLockTuple(). PredicateLockTuple() has a fast exit if tuple was written by the current transaction, as in that case it already has a lock. This check can be performed using TransactionIdIsCurrentTransactionId() instead of SubTransGetTopmostTransaction(), to avoid any chance of having to hit the disk. Author: Ashwin Agrawal, based on a suggestion from Andres Freund Reviewed-by: Thomas Munro Discussion: https://postgr.es/m/CALfoeiv0k3hkEb3Oqk%3DziWqtyk2Jys1UOK5hwRBNeANT_yX%2Bng%40mail.gmail.com	2019-11-11 17:06:59 +13:00
Thomas Munro	695c5977c8	Optimize TransactionIdIsCurrentTransactionId(). If the passed in xid is the current top transaction, we can do a fast check and exit early. This should work well for the current heap but also works very well for proposed AMs that don't use a separate xid for subtransactions. Author: Ashwin Agrawal, based on a suggestion from Andres Freund Reviewed-by: Thomas Munro Discussion: https://postgr.es/m/CALfoeiv0k3hkEb3Oqk%3DziWqtyk2Jys1UOK5hwRBNeANT_yX%2Bng%40mail.gmail.com	2019-11-11 16:33:04 +13:00
Amit Kapila	9fab25c6cd	Rearrange dropdb() to avoid errors after allowing other sessions to exit. During Drop Database, it is better to error out before allowing other sessions to exit and forcefully terminating autovacuum workers. All the other errors except for checking subscriptions are already done before. Author: Amit Kapila Discussion: https://postgr.es/m/CAA4eK1+qhLkCYG2oy9xug9ur_j=G2wQNRYAyd+-kZfZ1z42pLw@mail.gmail.com	2019-11-11 07:42:45 +05:30
Peter Eisentraut	1c60e40ad5	Fix negative bitmapset member not allowed error in logical replication This happens when we add a replica identity column on a subscriber that does not yet exist on the publisher, according to the mapping maintained by the subscriber. Code that checks whether the target relation on the subscriber is updatable would check the replica identity attribute bitmap with a column number -1, which would result in an error. To fix, skip such columns in the bitmap lookup and consider the relation not updatable. The result is consistent with the rule that the replica identity columns on the subscriber must be a subset of those on the publisher, since if the column doesn't exist on the publisher, the column set on the subscriber can't be a subset. Reported-by: Tim Clarke <tim.clarke@minerva.info> Analyzed-by: Jehan-Guillaume de Rorthais <jgdr@dalibo.com> Discussion: https://www.postgresql.org/message-id/flat/a9139c29-7ddd-973b-aa7f-71fed9c38d75%40minerva.info	2019-11-09 08:35:44 +01:00
Andres Freund	aae50236e4	Pass ItemPointer not HeapTuple to IndexBuildCallback. Not all AMs use HeapTuples internally, making it inconvenient to pass a HeapTuple. As the index callbacks really only need the TID, not the full tuple, modify callback to only take ItemPointer. Author: Ashwin Agrawal Reviewed-By: Andres Freund Discussion: https://postgr.es/m/CALfoeis6=8ehuR=VNtHvj3z16cYfCwPdTcpaxU+sfSUJ5QgR3g@mail.gmail.com	2019-11-08 11:49:29 -08:00
Alvaro Herrera	71a8a4f6e3	Add backtrace support for error reporting Add some support for automatically showing backtraces in certain error situations in the server. Backtraces are shown on assertion failure; also, a new setting backtrace_functions can be set to a list of C function names, and all ereport()s and elog()s from the mentioned functions will have backtraces generated. Finally, the function errbacktrace() can be manually added to an ereport() call to generate a backtrace for that call. Authors: Peter Eisentraut, Álvaro Herrera Discussion: https://postgr.es/m//5f48cb47-bf1e-05b6-7aae-3bf2cd01586d@2ndquadrant.com Discussion: https://postgr.es/m/CAMsr+YGL+yfWE=JvbUbnpWtrRZNey7hJ07+zT4bYJdVp4Szdrg@mail.gmail.com	2019-11-08 15:44:20 -03:00
Peter Eisentraut	3dcffb381c	Fix gratuitous error message variation	2019-11-08 18:37:17 +01:00
Peter Eisentraut	b85e43feb3	More precise errors from initial pg_control check Use a separate error message for invalid checkpoint location and invalid state instead of just "invalid data" for both. Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://www.postgresql.org/message-id/20191107041630.GK1768@paquier.xyz	2019-11-08 08:03:16 +01:00
Peter Geoghegan	e86c8ef243	Use "low key" terminology in nbtsort.c. nbtree index builds once stashed the "minimum key" for a page, which was used as the basis of the pivot tuple that gets placed in the next level up (i.e. the tuple that stores the downlink to the page in question). It doesn't quite work that way anymore, so the "minimum key" terminology now seems misleading (these days the minimum key is actually a straight copy of the high key from the left sibling, which is a distinct thing in subtle but important ways). Rename this concept to "low key". This name is a lot clearer given that there is now a sharp distinction between pivot and non-pivot tuples. Also remove comments that describe obsolete details about how the minimum key concept used to work. Rather than generating the minus infinity item for the leftmost page on a level by copying the new item and truncating that copy, simply allocate a small buffer. The old approach confusingly created the impression that the new item had some kind of significance. This was another artifact of how things used to work before commits `8224de4f` and `dd299df8`.	2019-11-07 17:12:09 -08:00
Alvaro Herrera	b4bcc6bfdf	Fix SET CONSTRAINTS .. DEFERRED on partitioned tables SET CONSTRAINTS ... DEFERRED failed on partitioned tables, because of a sanity check that ensures that the affected constraints have triggers. On partitioned tables, the triggers are in the leaf partitions, not in the partitioned relations themselves, so the sanity check fails. Removing the sanity check solves the problem, because the code needed to support the case is already there. Backpatch to 11. Note: deferred unique constraints are not affected by this bug, because they do have triggers in the parent partitioned table. I did not add a test for this scenario. Discussion: https://postgr.es/m/20191105212915.GA11324@alvherre.pgsql	2019-11-07 13:59:24 -03:00
Tom Lane	a7145f6bc8	Fix integer-overflow edge case detection in interval_mul and pgbench. This patch adopts the overflow check logic introduced by commit `cbdb8b4c0` into two more places. interval_mul() failed to notice if it computed a new microseconds value that was one more than INT64_MAX, and pgbench's double-to-int64 logic had the same sorts of edge-case problems that `cbdb8b4c0` fixed in the core code. To make this easier to get right in future, put the guts of the checks into new macros in c.h, and add commentary about how to use the macros correctly. Back-patch to all supported branches, as we did with the previous fix. Yuya Watari Discussion: https://postgr.es/m/CAJ2pMkbkkFw2hb9Qb1Zj8d06EhWAQXFLy73St4qWv6aX=vqnjw@mail.gmail.com	2019-11-07 11:22:58 -05:00
Peter Eisentraut	581a55889b	Fix nested error handling in PG_FINALLY We need to pop the error stack before running the user-supplied PG_FINALLY code. Otherwise an error in the cleanup code would end up at the same sigsetjmp() invocation and result in an infinite error handling loop. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/95a822c3-728b-af0e-d7e5-71890507ae0c%402ndquadrant.com	2019-11-07 09:56:47 +01:00
Fujii Masao	a0c96856e8	Fix assertion failure when running pg_waldump -s. If there is the WAL page that the continuation WAL record just fits within (i.e., the continuation record ends just at the end of the page) and the LSN in such page is specified with -s option, previously pg_waldump caused an assertion failure. The cause of this assertion failure was that XLogFindNextRecord() that pg_waldump -s calls mistakenly handled such special WAL page. This commit changes XLogFindNextRecord() so that it can handle such WAL page correctly. Back-patch to all supported versions. Author: Andrey Lepikhov Reviewed-by: Fujii Masao, Michael Paquier Discussion: https://postgr.es/m/99303554-5dd5-06e6-f943-b3005ccd6edd@postgrespro.ru	2019-11-07 16:31:36 +09:00
Thomas Munro	7815e7efdb	Add reusable routine for making arrays unique. Introduce qunique() and qunique_arg(), which can be used after qsort() and qsort_arg() respectively to remove duplicate values. Use it where appropriate. Author: Thomas Munro Reviewed-by: Tom Lane (in an earlier version) Discussion: https://postgr.es/m/CAEepm%3D2vmFTNpAmwbGGD2WaryM6T3hSDVKQPfUwjdD_5XY6vAA%40mail.gmail.com	2019-11-07 17:00:48 +13:00
Michael Paquier	3feb6ace7c	Check after errors of SPI_execute() in xml.c SPI gets used to build a list of relation OIDs for XML object generation, and one code path building a list uses SPI_execute() without looking at errors it produces. So fix that. Author: Mark Dilger Reviewed-by: Michael Paquier, Pavel Stehule Discussion: https://postgr.es/m/17d30445-4862-7917-170f-84328dcd292d@gmail.com	2019-11-07 11:13:31 +09:00
Tomas Vondra	6e3e6cc0e8	Allow sampling of statements depending on duration This allows logging a sample of statements, without incurring excessive log traffic (which may impact performance). This can be useful when analyzing workloads with lots of short queries. The sampling is configured using two new GUC parameters: * log_min_duration_sample - minimum required statement duration * log_statement_sample_rate - sample rate (0.0 - 1.0) Only statements with duration exceeding log_min_duration_sample are considered for sampling. To enable sampling, both those GUCs have to be set correctly. The existing log_min_duration_statement GUC has a higher priority, i.e. statements with duration exceeding log_min_duration_statement will always be logged, irrespective of how the sampling is configured. This means only configurations log_min_duration_sample < log_min_duration_statement do actually sample the statements, instead of logging everything. Author: Adrien Nayrat Reviewed-by: David Rowley, Vik Fearing, Tomas Vondra Discussion: https://postgr.es/m/bbe0a1a8-a8f7-3be2-155a-888e661cc06c@anayrat.info	2019-11-06 19:11:07 +01:00
Tom Lane	22e44e8dbc	Minor code review for tuple slot rewrite. Avoid creating transiently-inconsistent slot states where possible, by not setting TTS_FLAG_SHOULDFREE until after the slot actually has a free'able tuple pointer, and by making sure that we reset tts_nvalid and related derived state before we replace the tuple contents. This would only matter if something were to examine the slot after we'd suffered some kind of error (e.g. out of memory) while manipulating the slot. We typically don't do that, so these changes might just be cosmetic --- but even if so, it seems like good future-proofing. Also remove some redundant Asserts, and add a couple for consistency. Back-patch to v12 where all this code was rewritten. Discussion: https://postgr.es/m/16095-c3ff2e5283b8dba5@postgresql.org	2019-11-06 12:00:17 -05:00
Tom Lane	ff43b3e88e	Sync our DTrace infrastructure with c.h's definition of type bool. Since commit `d26a810eb`, we've defined bool as being either _Bool from <stdbool.h>, or "unsigned char"; but that commit overlooked the fact that probes.d has "#define bool char". For consistency, make it say "unsigned char" instead. This should be strictly a cosmetic change, but it seems best to be in sync. Formally, in the now-normal case where we're using <stdbool.h>, it'd be better to write "#define bool _Bool". However, then we'd need some build infrastructure to inject that configuration choice into probes.d, and it doesn't seem worth the trouble. We only use <stdbool.h> if sizeof(_Bool) is 1, so having DTrace think that bool parameters are "unsigned char" should be close enough. Back-patch to v12 where `d26a810eb` came in. Discussion: https://postgr.es/m/CAA4eK1LmaKO7Du9M9Lo=kxGU8sB6aL8fa3sF6z6d5yYYVe3BuQ@mail.gmail.com	2019-11-06 11:11:40 -05:00
Peter Eisentraut	d40abd5fcf	Fix memory allocation mistake The previous code was allocating more memory than necessary because the formula used the wrong data type. Reported-by: Jehan-Guillaume de Rorthais <jgdr@dalibo.com> Discussion: https://www.postgresql.org/message-id/20191105172918.3e32a446@firost	2019-11-06 14:20:29 +01:00
Peter Eisentraut	5b7ba75f7f	Remove unused function argument The cache_plan argument to ri_PlanCheck has not been used since `e8c9fd5fdf`. Reviewed-by: vignesh C <vignesh21@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/ec8a8b45-a30b-9193-cd4b-985d60d1497e%402ndquadrant.com	2019-11-06 08:19:27 +01:00
Michael Paquier	5f6b1eb0cf	Fix timestamp of sent message for write context in logical decoding When sending data for logical decoding using the streaming replication protocol via a WAL sender, the timestamp of the sent write message is allocated at the beginning of the message when preparing for the write, and actually computed when the write message is ready to be sent. The timestamp was getting computed after sending the message. This impacts anything using logical decoding, causing for example logical replication to report mostly NULL for last_msg_send_time in pg_stat_subscription. This commit makes sure that the timestamp is computed before sending the message. This is wrong since `5a991ef`, so backpatch down to 9.4. Author: Jeff Janes Discussion: https://postgr.es/m/CAMkU=1z=WMn8jt7iEdC5sYNaPgAgOASb_OW5JYv-vMdYaJSL-w@mail.gmail.com Backpatch-through: 9.4	2019-11-06 16:12:21 +09:00
Andrew Gierth	a9056cc637	Request small targetlist for input to WindowAgg. WindowAgg will potentially store large numbers of input rows into tuplestores to allow access to other rows in the frame. If the input is coming via an explicit Sort node, then unneeded columns will already have been discarded (since Sort requests a small tlist); but there are idioms like COUNT(*) OVER () that result in the input not being sorted at all, and cases where the input is being sorted by some means other than a Sort; if we don't request a small tlist, then WindowAgg's storage requirement is inflated by the unneeded columns. Backpatch back to 9.6, where the current tlist handling was added. (Prior to that, WindowAgg would always use a small tlist.) Discussion: https://postgr.es/m/87a7ator8n.fsf@news-spur.riddles.org.uk	2019-11-06 04:13:30 +00:00
Fujii Masao	979766c0af	Correct the command tags for ALTER ... RENAME COLUMN. Previously ALTER MATERIALIZED VIEW / FOREIGN TABLE ... RENAME COLUMN ... returned "ALTER TABLE" as a command tag. This commit fixes them so that they return "ALTER MATERIALIZED VIEW" and "ALTER FOREIGN TABLE" as command tags, respectively. This issue exists in all supported versions, but we don't back-patch this because it's not enough of a bug to justify taking any compatibility risks for. Otherwise, the back-patch would cause minor version update to break, for example, the existing event trigger functions using TG_TAG. Author: Fujii Masao Reviewed-by: Ibrar Ahmed Discussion: https://postgr.es/m/CAHGQGwGUaC03FFdTFoHsCuDrrNvFvNVQ6xyd40==P25WvuBJjg@mail.gmail.com	2019-11-06 12:54:17 +09:00
Andres Freund	26aaf97b68	Make StringInfo available to frontend code. There are plenty of places in frontend code that could benefit from a string buffer implementation. Some because it yields simpler and faster code, and some others because of the desire to share code between backend and frontend. While there is a string buffer implementation available to frontend code, libpq's PQExpBuffer, it is clunkier than stringinfo, it introduces a libpq dependency, doesn't allow for sharing between frontend and backend code, and has a higher API/ABI stability requirement due to being exposed via libpq. Therefore it seems best to just make StringInfo usable by frontend code. There's not much to do for that, except for rewriting two subsequent elog/ereport calls into other types of error reporting, and deciding on a maximum string length. For the maximum string size I decided to privately define MaxAllocSize to the same value as used in the backend. It seems likely that we'll want to reconsider this for both backend and frontend code in the not too far away future. For now I've left stringinfo.h in lib/, rather than common/, to reduce the likelihood of unnecessary breakage. We could alternatively decide to provide a redirecting stringinfo.h in lib/, or just not provide compatibility. Author: Andres Freund Reviewed-By: Kyotaro Horiguchi, Daniel Gustafsson Discussion: https://postgr.es/m/20190920051857.2fhnvhvx4qdddviz@alap3.anarazel.de	2019-11-05 14:56:40 -08:00
Andres Freund	01368e5d9d	Split all OBJS style lines in makefiles into one-line-per-entry style. When maintaining or merging patches, one of the most common sources for conflicts are the list of objects in makefiles. Especially when the split across lines has been changed on both sides, which is somewhat common due to attempting to stay below 80 columns, those conflicts are unnecessarily laborious to resolve. By splitting, and alphabetically sorting, OBJS style lines into one object per line, conflicts should be less frequent, and easier to resolve when they still occur. Author: Andres Freund Discussion: https://postgr.es/m/20191029200901.vww4idgcxv74cwes@alap3.anarazel.de	2019-11-05 14:41:07 -08:00
Tom Lane	66c61c81b9	Tweak some authentication debug messages to follow project style. Avoid initial capital, since that's not how we do it. Discussion: https://postgr.es/m/CACP=ajbrFFYUrLyJBLV8=q+eNCapa1xDEyvXhMoYrNphs-xqPw@mail.gmail.com	2019-11-05 14:29:08 -05:00
Tom Lane	3affe76ef8	Avoid logging complaints about abandoned connections when using PAM. For a long time (since commit `aed378e8d`) we have had a policy to log nothing about a connection if the client disconnects when challenged for a password. This is because libpq-using clients will typically do that, and then come back for a new connection attempt once they've collected a password from their user, so that logging the abandoned connection attempt will just result in log spam. However, this did not work well for PAM authentication: the bottom-level function pam_passwd_conv_proc() was on board with it, but we logged messages at higher levels anyway, for lack of any reporting mechanism. Add a flag and tweak the logic so that the case is silent, as it is for other password-using auth mechanisms. Per complaint from Yoann La Cancellera. It's been like this for a while, so back-patch to all supported branches. Discussion: https://postgr.es/m/CACP=ajbrFFYUrLyJBLV8=q+eNCapa1xDEyvXhMoYrNphs-xqPw@mail.gmail.com	2019-11-05 14:27:37 -05:00
Tom Lane	a30531c5c8	Fix "unexpected relkind" error when denying permissions on toast tables. get_relkind_objtype, and hence get_object_type, failed when applied to a toast table. This is not a good thing, because it prevents reporting of perfectly legitimate permissions errors. (At present, these functions are in fact only used to determine the ObjectType argument for acl_error() calls.) It seems best to have them fall back to returning OBJECT_TABLE in every case where they can't determine an object type for a pg_class entry, so do that. In passing, make some edits to alter.c to make it more obvious that those calls of get_object_type() are used only for error reporting. This might save a few cycles in the non-error code path, too. Back-patch to v11 where this issue originated. John Hsu, Michael Paquier, Tom Lane Discussion: https://postgr.es/m/C652D3DF-2B0C-4128-9420-FB5379F6B1E4@amazon.com	2019-11-05 13:40:37 -05:00
Tom Lane	529ebb20aa	Generate EquivalenceClass members for partitionwise child join rels. Commit `d25ea0127` got rid of what I thought were entirely unnecessary derived child expressions in EquivalenceClasses for EC members that mention multiple baserels. But it turns out that some of the child expressions that code created are necessary for partitionwise joins, else we fail to find matching pathkeys for Sort nodes. (This happens only for certain shapes of the resulting plan; it may be that partitionwise aggregation is also necessary to show the failure, though I'm not sure of that.) Reverting that commit entirely would be quite painful performance-wise for large partition sets. So instead, add code that explicitly generates child expressions that match only partitionwise child join rels we have actually generated. Per report from Justin Pryzby. (Amit Langote noticed the problem earlier, though it's not clear if he recognized then that it could result in a planner error, not merely failure to exploit partitionwise join, in the code as-committed.) Back-patch to v12 where commit `d25ea0127` came in. Amit Langote, with lots of kibitzing from me Discussion: https://postgr.es/m/CA+HiwqG2WVUGmLJqtR0tPFhniO=H=9qQ+Z3L_ZC+Y3-EVQHFGg@mail.gmail.com Discussion: https://postgr.es/m/20191011143703.GN10470@telsasoft.com	2019-11-05 11:42:24 -05:00
Michael Paquier	3534fa2233	Refactor code building relation options Historically, the code to build relation options has been shaped the same way in multiple code paths by using a set of datums in input with the options parsed with a static table which is then filled with the option values. This introduces a new common routine in reloptions.c to do most of the legwork for the in-core code paths. Author: Amit Langote Reviewed-by: Michael Paquier Discussion: https://postgr.es/m/CA+HiwqGsoSn_uTPPYT19WrtR7oYpYtv4CdS0xuedTKiHHWuk_g@mail.gmail.com	2019-11-05 09:17:05 +09:00
Tom Lane	ec28808ba8	Fix ginEntryInsert's counting of GIN leaf tuples. As the code stands, nEntries counts the number of ginEntryInsert() calls, so that's what you end up with at the end of a GIN index build. However, ginvacuumcleanup() recomputes nEntries as the number of surviving leaf tuples, and that's generally consistent with the way that gincostestimate() uses the value. So let's clearly define nEntries as the number of leaf tuples, and therefore adjust ginEntryInsert() to increment it only when we make a new one, not when we add TIDs into an existing tuple or posting tree. In practice this inconsistency probably has little impact, so I don't feel a need to back-patch. Insung Moon and Keisuke Kuroda Discussion: https://postgr.es/m/CAEMmqBuH_O-oXL+3_ArQ6F5cJ7kXVow2SGQB3HRacku_T+xkmA@mail.gmail.com	2019-11-04 14:16:42 -05:00
Peter Eisentraut	a63c84e59a	Fix some compiler warnings on older compilers Some older compilers appear to not understand the recently introduced PG_FINALLY code structure that well in some circumstances and complain about possibly uninitialized variables. So to fix, initialize the variables explicitly in the cases complained about. Discussion: https://www.postgresql.org/message-id/flat/95a822c3-728b-af0e-d7e5-71890507ae0c%402ndquadrant.com	2019-11-04 11:07:32 +01:00
Peter Eisentraut	8557a6f10c	Catch invalid typlens in a couple of places Rearrange the logic in record_image_cmp() and datum_image_eq() to error out on unexpected typlens (either not supported there or completely invalid due to corruption). Barring corruption, this is not possible today but it seems more future-proof and robust to fix this. Reported-by: Peter Geoghegan <pg@bowt.ie>	2019-11-04 09:08:15 +01:00
Tom Lane	db27b60f07	Suppress warning from older compilers. Commit `8af1624e3` introduced a warning about possibly returning without a value, on compilers that don't realize that ereport(ERROR) doesn't return. Tweak the code to avoid that. Per buildfarm. Back-patch to 9.6, like the aforesaid commit.	2019-11-03 16:10:23 -05:00
Tom Lane	8af1624e3f	Validate ispell dictionaries more carefully. Using incorrect, or just mismatched, dictionary and affix files could result in a crash, due to failure to cross-check offsets obtained from the file. Add necessary validation, as well as some Asserts for future-proofing. Per bug #16050 from Alexander Lakhin. Back-patch to 9.6 where the problem was introduced. Arthur Zakirov, per initial investigation by Tomas Vondra Discussion: https://postgr.es/m/16050-024ae722464ab604@postgresql.org Discussion: https://postgr.es/m/20191013012610.2p2fp3zzpoav7jzf@development	2019-11-02 16:45:32 -04:00
Michael Paquier	dc816e5815	Fix failure when creating cloned indexes for a partition When using CREATE TABLE for a new partition, the partitioned indexes of the parent are created automatically in a fashion similar to LIKE INDEXES. The new partition and its parent use a mapping for attribute numbers for this operation, and while the mapping was correctly built, its length was defined as the number of attributes of the newly-created child, and not the parent. If the parent includes dropped columns, this could cause failures. This is wrong since `8b08f7d` which has introduced the concept of partitioned indexes, so backpatch down to 11. Reported-by: Wyatt Alt Author: Michael Paquier Reviewed-by: Amit Langote Discussion: https://postgr.es/m/CAGem3qCcRmhbs4jYMkenYNfP2kEusDXvTfw-q+eOhM0zTceG-g@mail.gmail.com Backpatch-through: 11	2019-11-02 14:16:04 +09:00
Michael Paquier	e174f699c4	Add some assertions in syncrep.c A couple of routines assume that the LWLock SyncRepLock needs to be taken, so add a couple of assertions to be sure of that. Also, when waiting for a given LSN at transaction commit, the code implied that the syncrep queue cleanup happens while holding interrupts, but the code never checked after that. Author: Michael Paquier Reviewed-by: Fujii Masao, Kyotaro Horiguchi, Dongming Liu Discussion: https://postgr.es/m/a0806273-8bbb-43b3-bbe1-c45a58f6ae21.lingce.ldm@alibaba-inc.com	2019-11-01 22:51:05 +09:00
Michael Paquier	20345197ff	Fix race condition at backend exit when deleting element in syncrep queue When a backend exits, it gets deleted from the syncrep queue if present. The queue was checked without SyncRepLock taken in exclusive mode, so it would have been possible for a backend to remove itself after a WAL sender already did the job. Fix this issue based on a suggestion from Fujii Masao, by first checking the queue without the lock. Then, if the backend is present in the queue, take the lock and perform an additional lookup check before doing the element deletion. Author: Dongming Liu Reviewed-by: Kyotaro Horiguchi, Fujii Masao, Michael Paquier Discussion: https://postgr.es/m/a0806273-8bbb-43b3-bbe1-c45a58f6ae21.lingce.ldm@alibaba-inc.com Backpatch-through: 9.4	2019-11-01 22:38:32 +09:00
Peter Eisentraut	604bd36711	PG_FINALLY This gives an alternative way of catching exceptions, for the common case where the cleanup code is the same in the error and non-error cases. So instead of PG_TRY(); { ... code that might throw ereport(ERROR) ... } PG_CATCH(); { cleanup(); PG_RE_THROW(); } PG_END_TRY(); cleanup(); one can write PG_TRY(); { ... code that might throw ereport(ERROR) ... } PG_FINALLY(); { cleanup(); } PG_END_TRY(); Discussion: https://www.postgresql.org/message-id/flat/95a822c3-728b-af0e-d7e5-71890507ae0c%402ndquadrant.com	2019-11-01 11:18:03 +01:00
Peter Eisentraut	7302514088	Add const qualifiers to internal range type APIs Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://www.postgresql.org/message-id/flat/dc9b45fa-b950-fadc-4751-85d6f729df55%402ndquadrant.com	2019-10-31 07:48:21 +01:00
Michael Paquier	f921ea624e	Fix typo in comment of syncrep.c Author: Kyotaro Horiguchi Discussion: https://postgr.es/m/20191030.123428.18823202335157111.horikyota.ntt@gmail.com	2019-10-31 10:22:24 +09:00
Peter Eisentraut	c5e1df951d	Remove one use of IDENT_USERNAME_MAX IDENT_USERNAME_MAX is the maximum length of the information returned by an ident server, per RFC 1413. Using it as the buffer size in peer authentication is inappropriate. It was done here because of the historical relationship between peer and ident authentication. To reduce confusion between the two authentication methods and disentangle their code, use a dynamically allocated buffer instead. Discussion: https://www.postgresql.org/message-id/flat/c798fba5-8b71-4f27-c78e-37714037ea31%402ndquadrant.com	2019-10-30 11:18:00 +01:00
Peter Eisentraut	5cc1e64fb6	Update code comments about peer authentication For historical reasons, the functions for peer authentication were grouped under ident authentication. But they are really completely separate, so give them their own section headings.	2019-10-30 09:13:39 +01:00
Michael Paquier	6ca86bb7e9	Fix typos in the code Author: Vignesh C Reviewed-by: Dilip Kumar, Michael Paquier Discussion: https://postgr.es/m/CALDaNm0ni+GAOe4+fbXiOxNrVudajMYmhJFtXGX-zBPoN8ixhw@mail.gmail.com	2019-10-30 10:03:00 +09:00
Michael Paquier	d80be6f2f6	Fix handling of pg_class.relispartition at swap phase in REINDEX CONCURRENTLY When cancelling REINDEX CONCURRENTLY after swapping the old and new indexes (for example interruption at step 5), the old index remains around and is marked as invalid. The old index should also be manually droppable to clean up the parent relation from any invalid indexes still remaining. For a partition index reindexed, pg_class.relispartition was not getting updated, causing the index to not be droppable as DROP INDEX would look for dependencies in a partition tree, which do not exist anymore after the swap phase is done. The fix here is simple: when swapping the old and new indexes, make sure that pg_class.relispartition is correctly switched, similarly to what is done for the index name. Reported-by: Justin Pryzby Author: Michael Paquier Discussion: https://postgr.es/m/20191015164047.GA22729@telsasoft.com Backpatch-through: 12	2019-10-29 11:08:09 +09:00
Tom Lane	8b7a0f1d11	Allow extracting fields from a ROW() expression in more cases. Teach get_expr_result_type() to manufacture a tuple descriptor directly from a RowExpr node. If the RowExpr has type RECORD, this is the only way to get a tupdesc for its result, since even if the rowtype has been blessed, we don't have its typmod available at this point. (If the RowExpr has some named composite type, we continue to let the existing code handle it, since the RowExpr might well not have the correct column names embedded in it.) This fixes assorted corner cases illustrated by the added regression tests. Discussion: https://postgr.es/m/10872.1572202006@sss.pgh.pa.us	2019-10-28 15:08:24 -04:00
Tom Lane	bd1ef5799b	Handle empty-string edge cases correctly in strpos(). Commit `9556aa01c` rearranged the innards of text_position() in a way that would make it not work for empty search strings. Which is fine, because all callers of that code special-case an empty pattern in some way. However, the primary use-case (text_position itself) got special-cased incorrectly: historically it's returned 1 not 0 for an empty search string. Restore the historical behavior. Per complaint from Austin Drenski (via Shay Rojansky). Back-patch to v12 where it got broken. Discussion: https://postgr.es/m/CADT4RqAz7oN4vkPir86Kg1_mQBmBxCp-L_=9vRpgSNPJf0KRkw@mail.gmail.com	2019-10-28 12:21:13 -04:00
Michael Paquier	68ac9cf249	Fix dependency handling at swap phase of REINDEX CONCURRENTLY When swapping the dependencies of the old and new indexes, the code has been correctly switching all links in pg_depend from the old to the new index for both referencing and referenced entries. However it forgot the fact that the new index may itself have existing entries in pg_depend, like references to the parent table attributes. This resulted in duplicated entries in pg_depend after running REINDEX CONCURRENTLY. Fix this problem by removing any existing entries in pg_depend on the new index before switching the dependencies of the old index to the new one. More regression tests are added to check the consistency of entries in pg_depend for indexes, including partition indexes. Author: Michael Paquier Discussion: https://postgr.es/m/20191025064318.GF8671@paquier.xyz Backpatch-through: 12	2019-10-28 11:57:31 +09:00
Michael Paquier	51970fa8df	Fix initialization of fake LSN for unlogged relations `9155580` has changed the value of the first fake LSN for unlogged relations from 1 to FirstNormalUnloggedLSN (aka 1000), GiST requiring a non-zero LSN on some pages to allow an interlocking logic to work, but its value was still initialized to 1 at the beginning of recovery or after running pg_resetwal. This fixes the initialization for both code paths. Author: Takayuki Tsunakawa Reviewed-by: Dilip Kumar, Kyotaro Horiguchi, Michael Paquier Discussion: https://postgr.es/m/OSBPR01MB2503CE851940C17DE44AE3D9FE6F0@OSBPR01MB2503.jpnprd01.prod.outlook.com Backpatch-through: 12	2019-10-27 13:54:12 +09:00
Peter Eisentraut	2fc2a88e67	Remove obsolete information schema tables Remove SQL_LANGUAGES, which was eliminated in SQL:2008, and SQL_PACKAGES and SQL_SIZING_PROFILES, which were eliminated in SQL:2011. Since they were dropped by the SQL standard, the information in them was no longer updated and therefore no longer useful. This also removes the feature-package association information in sql_feature_packages.txt, but for the time being we are keeping the information about which features are in the Core package (that is, mandatory SQL features). Maybe at some point someone wants to invent a way to store that that does not involve using the "package" mechanism anymore. Discussion: https://www.postgresql.org/message-id/flat/91334220-7900-071b-9327-0c6ecd012017%402ndquadrant.com	2019-10-25 21:37:14 +02:00
Tom Lane	22f6f2c1cc	Improve management of statement timeouts. Commit `f8e5f156b` added private state in postgres.c to track whether a statement timeout is running. This seems like bad design to me; timeout.c's private state should be the single source of truth about that. We already fixed one bug associated with failure to keep those states in sync (cf. `be42015fc`), and I've got little faith that we won't find more in future. So get rid of postgres.c's local variable by exposing a way to ask timeout.c whether a timeout is running. (Obviously, such an inquiry is subject to race conditions, but it seems fine for the purpose at hand.) To make get_timeout_active() as cheap as possible, add a flag in the per-timeout struct showing whether that timeout is active. This allows some small savings elsewhere in timeout.c, mainly elimination of unnecessary searches of the active_timeouts array. While at it, fix enable_statement_timeout to not call disable_timeout when statement_timeout is 0 and the timeout is not running. This avoids a useless deschedule-and-reschedule-timeouts cycle, which represents a significant savings (at least one kernel call) when there is any other active timeout. Right now, there usually isn't, but there are proposals around to change that. Discussion: https://postgr.es/m/16035-456e6e69ebfd4374@postgresql.org	2019-10-25 11:41:16 -04:00
Tom Lane	2b2bacdca0	Reset statement_timeout between queries of a multi-query string. Historically, we started the timer (if StatementTimeout > 0) at the beginning of a simple-Query message and usually let it run until the end, so that the timeout limit applied to the entire query string, and intra-string changes of the statement_timeout GUC had no effect. But, confusingly, a COMMIT within the string would reset the state and allow a fresh timeout cycle to start with the current setting. Commit `f8e5f156b` changed the behavior of statement_timeout for extended query protocol, and as an apparently-unintended side effect, a change in the statement_timeout GUC during a multi-statement simple-Query message might have an effect immediately --- but only if it was going from "disabled" to "enabled". This is all pretty confusing, not to mention completely undocumented. Let's change things so that the timeout is always reset between queries of a multi-query string, whether they're transaction control commands or not. Thus the active timeout setting is applied to each query in the string, separately. This costs a few more cycles if statement_timeout is active, but it provides much more intuitive behavior, especially if one changes statement_timeout in one of the queries of the string. Also, add something to the documentation to explain all this. Per bug #16035 from Raj Mohite. Although this is a bug fix, I'm hesitant to back-patch it; conceivably somebody has worked out the old behavior and is depending on it. (But note that this change should make the behavior less restrictive in most cases, since the timeout will now be applied to shorter segments of code.) Discussion: https://postgr.es/m/16035-456e6e69ebfd4374@postgresql.org	2019-10-25 11:15:50 -04:00
Michael Paquier	8270a0d9a9	Handle interrupts within a transaction context in REINDEX CONCURRENTLY Phases 2 (building the new index) and 3 (validating the new index) checked for interrupts outside a transaction context, with the consequence that session-level locks taken on the parent relation and the old and new indexes processed were not released. This could for example be triggered with statement_timeout and a bad timing, and would issue confusing error messages when shutting down the session still holding the locks (note that an assertion failure would be triggered first), on top of more issues with concurrent sessions trying to take a lock that would interfere with the SHARE UPDATE EXCLUSIVE locks held here. This moves all the interruption checks inside a transaction context. Note that I have manually tested all interruptions to make sure that invalid indexes can be cleaned up properly. Partition indexes still have issues on their own with some missing dependency handling, which will be dealt with in a follow-up patch. Reported-by: Justin Pryzby Author: Michael Paquier Discussion: https://postgr.es/m/20191013025145.GC4475@telsasoft.com Backpatch-through: 12	2019-10-25 10:20:08 +09:00
Fujii Masao	3b0c59ac1c	Fix typo in xlog.c. Author: Fujii Masao Reviewed-by: Amit Kapila Discussion: https://postgr.es/m/CAHGQGwH7dtYvOZZ8c0AG5AJwH5pfiRdKaCptY1_RdHy0HYeRfQ@mail.gmail.com	2019-10-24 14:13:36 +09:00
Michael Paquier	5d3500da72	Acquire properly session-level lock on new index in REINDEX CONCURRENTLY In the first transaction run for REINDEX CONCURRENTLY, a thinko in the existing logic caused two session locks to be taken on the old index, causing the session lock on the newly-created index to be missed. This made possible concurrent DDL commands (like ALTER INDEX) on the new index while REINDEX CONCURRENTLY was processing from the point where the first internal transaction committed. This issue has been discovered while digging into another bug. Author: Michael Paquier Discussion: https://postgr.es/m/20191021074323.GB1869@paquier.xyz Backpatch-through: 12	2019-10-23 15:04:48 +09:00
Michael Paquier	e3db3f829f	Clean up properly error_context_stack in autovacuum worker on exception Any callback set would have no meaning in the context of an exception. As an autovacuum worker exits quickly in this context, this could be only an issue within EmitErrorReport(), where the elog hook is for example called. That's unlikely to be a problem, but let's be clean and consistent with other code paths handling exceptions. This is present since `2909419`, which introduced autovacuum. Author: Ashwin Agrawal Reviewed-by: Tom Lane, Michael Paquier Discussion: https://postgr.es/m/CALfoeisM+_+dgmAdAOHAu0k-ZpEHHqSSG=GRf3pKJGm8OqWX0w@mail.gmail.com Backpatch-through: 9.4	2019-10-23 10:25:06 +09:00
Peter Eisentraut	f86f46d091	Fix comment The last argument of smgrextend() was renamed from isTemp to skipFsync in `debcec7dc3`, but the comments at two call sites were not updated.	2019-10-22 09:58:20 +02:00
Alexander Korotkov	52ad1e6599	Refactor jsonpath's compareDatetime() This commit refactors some ridiculous coding in compareDatetime(). Also, it provides correct cross-datatype comparison even when one of the values overflows during cast. That eliminates the dilemma on whether we should suppress overflow errors during cast. Reported-by: Tom Lane Discussion: https://postgr.es/m/32308.1569455803%40sss.pgh.pa.us Discussion: https://postgr.es/m/a5629d0c-8162-7559-16aa-0c8390d6ba5f%40postgrespro.ru Author: Nikita Glukhov, Alexander Korotkov	2019-10-21 23:07:07 +03:00
Alexander Korotkov	a6888fde7f	Refactor timestamp2timestamptz_opt_error() While casting from timestamp to timestamptz we do timestamp2tm() then tm2timestamp(). This commit eliminates call to tm2timestamp(). Instead, it directly applies timezone offset to the original timestamp value. That makes upcoming datetime overflow handling in jsonpath easier. That should also save us some CPU cycles. Discussion: https://postgr.es/m/CAPpHfdvRPRh_mTGar5WmDeRZ%3DU5dOXHdxspYYD%3D76m3knNGjXA%40mail.gmail.com Author: Alexander Korotkov Reviewed-by: Tom Lane	2019-10-21 23:07:07 +03:00
Etsuro Fujita	80831bcdbe	Update obsolete comment. Commit `b52b7dc25`, which moved code creating PartitionBoundInfo in RelationBuildPartitionDesc() in partcache.c (relocated to partdesc.c afterwards) to partbounds.c, should have updated this, but didn't. Author: Etsuro Fujita Reviewed-by: Alvaro Herrera Backpatch-through: 12 Discussion: https://postgr.es/m/CAPmGK16Uxr%3DPatiGyaRwiQVLB7Y-GqbkK3AxRLVYzU0Czv%3DsEw%40mail.gmail.com	2019-10-21 17:30:00 +09:00
Amit Kapila	70a6c37d52	Fix memory leak introduced in commit `7df159a620`. We memorize all internal and empty leaf pages in the 1st vacuum stage for gist indexes. They are used in the 2nd stage, to delete all the empty pages. There was a memory context page_set_context for this purpose, but we never used it. Reported-by: Amit Kapila Author: Dilip Kumar Reviewed-by: Amit Kapila Backpatch-through: 12, where it got introduced Discussion: https://postgr.es/m/CAA4eK1LGr+MN0xHZpJ2dfS8QNQ1a_aROKowZB+MPNep8FVtwAA@mail.gmail.com	2019-10-21 08:57:32 +05:30
Peter Eisentraut	5d3587d14b	Fix most -Wundef warnings In some cases #if was used instead of #ifdef in an inconsistent style. Cleaning this up also helps when analyzing cases like `38d8dce61f` where this makes a difference. There are no behavior changes here, but the change in pg_bswap.h would prevent possible accidental misuse by third-party code. Discussion: https://www.postgresql.org/message-id/flat/3b615ca5-c595-3f1d-fdf7-a429e564f614%402ndquadrant.com	2019-10-19 18:31:38 +02:00
Noah Misch	48cc59ed24	Use standard compare_exchange loop style in ProcArrayGroupClearXid(). Besides style, this might improve performance in the contended case. Reviewed by Amit Kapila. Discussion: https://postgr.es/m/20191015035348.GA4166224@rfd.leadboat.com	2019-10-18 20:21:10 -07:00
Michael Paquier	f25968c496	Remove last traces of heap_open/close in the tree Since pluggable storage has been introduced, those two routines have been replaced by table_open/close, with some compatibility macros still present to allow extensions to compile correctly with v12. Some code paths using the old routines still remained, so replace them. Based on the discussion done, the consensus reached is that it is better to remove those compatibility macros so as nothing new uses the old routines, so remove also the compatibility macros. Discussion: https://postgr.es/m/20191017014706.GF5605@paquier.xyz	2019-10-19 11:18:15 +09:00
Fujii Masao	ec1259e880	Fix failure of archive recovery with recovery_min_apply_delay enabled. recovery_min_apply_delay parameter is intended for use with streaming replication deployments. However, the document clearly explains that the parameter will be honored in all cases if it's specified. So it should take effect even in archive recovery. But, previously, archive recovery with recovery_min_apply_delay enabled always failed, and caused assertion failure if --enable-cassert is enabled. The cause of this problem is that the ownership of recoveryWakeupLatch that recovery_min_apply_delay uses was taken only when standby mode is requested. So an unowned latch could be used in archive recovery, which caused the failure. This commit changes recovery code so that the ownership of recoveryWakeupLatch is taken even in archive recovery, which prevents archive recovery with recovery_min_apply_delay from failing. Back-patch to v9.4 where recovery_min_apply_delay was added. Author: Fujii Masao Reviewed-by: Michael Paquier Discussion: https://postgr.es/m/CAHGQGwEyD6HdZLfdWc+95g=VQFPR4zQL4n+yHxQgGEGjaSVheQ@mail.gmail.com	2019-10-18 22:32:18 +09:00
Fujii Masao	9b95a36be8	Make crash recovery ignore recovery_min_apply_delay setting. In v11 or before, this setting could not take effect in crash recovery because it's specified in recovery.conf and crash recovery always starts without recovery.conf. But commit `2dedf4d9a8` integrated recovery.conf into postgresql.conf and which unexpectedly allowed this setting to take effect even in crash recovery. This is definitely not good behavior. To fix the issue, this commit makes crash recovery always ignore recovery_min_apply_delay setting. Back-patch to v12 where the issue was added. Author: Fujii Masao Reviewed-by: Michael Paquier Discussion: https://postgr.es/m/CAHGQGwEyD6HdZLfdWc+95g=VQFPR4zQL4n+yHxQgGEGjaSVheQ@mail.gmail.com Discussion: https://postgr.es/m/e445616d-023e-a268-8aa1-67b8b335340c@pgmasters.net	2019-10-18 22:24:18 +09:00
Alvaro Herrera	89403ed228	Fix typo Apparently while this code was being developed, ReindexRelationConcurrently operated on multiple relations. The version that was ultimately pushed doesn't, so this comment's use of plural is inaccurate.	2019-10-18 14:49:39 +02:00
Alvaro Herrera	d2efb90dba	Update comments about progress reporting by index_drop Michaël Paquier complained that index_drop is requesting progress reporting for non-obvious reasons, so let's add a comment to explain why. Discussion: https://postgr.es/m/20191017010412.GH2602@paquier.xyz	2019-10-18 07:23:05 -03:00
Michael Paquier	3f60f690fa	Fix timeout handling in logical replication worker The timestamp tracking the last moment a message is received in a logical replication worker was initialized in each loop checking if a message was received or not, causing wal_receiver_timeout to be ignored in basically any logical replication deployments. This also broke the ping sent to the server when reaching half of wal_receiver_timeout. This simply moves the initialization of the timestamp out of the apply loop to the beginning of LogicalRepApplyLoop(). Reported-by: Jehan-Guillaume De Rorthais Author: Julien Rouhaud Discussion: https://postgr.es/m/CAOBaU_ZHESFcWva8jLjtZdCLspMj7vqaB2k++rjHLY897ZxbYw@mail.gmail.com Backpatch-through: 10	2019-10-18 14:26:29 +09:00
Alvaro Herrera	38ddeab13b	Fix minor bug in logical-replication walsender shutdown Logical walsender should exit when it catches up with sending WAL during shutdown; but there was a rare corner case when it failed to because of a race condition that puts it back to wait for more WAL instead -- but since there wasn't any, it'd not shut down immediately. It would only continue the shutdown when wal_sender_timeout terminates the sleep, which causes annoying waits during shutdown procedure. Restructure the code so that we no longer forget to set WalSndCaughtUp in that case. This was an oversight in commit `c6c333436`. Backpatch all the way down to 9.4. Author: Craig Ringer, Álvaro Herrera Discussion: https://postgr.es/m/CAMsr+YEuz4XwZX_QmnX_-2530XhyAmnK=zCmicEnq1vLr0aZ-g@mail.gmail.com	2019-10-17 15:06:06 +02:00
Thomas Munro	3c8c55dd54	When restoring GUCs in parallel workers, show an error context. Otherwise it can be hard to see where an error is coming from, when the parallel worker sets all the GUCs that it received from the leader. Bug #15726. Back-patch to 9.5, where RestoreGUCState() appeared. Reported-by: Tiago Anastacio Reviewed-by: Daniel Gustafsson, Tom Lane Discussion: https://postgr.es/m/15726-6d67e4fa14f027b3%40postgresql.org	2019-10-17 13:47:01 +13:00
Thomas Munro	6bda2af039	Fix bug that could try to freeze running multixacts. Commits `801c2dc7` and `801c2dc7` made it possible for vacuum to try to freeze a multixact that is still running. That was prevented by a check, but raised an error. Repair. Back-patch all the way. Author: Nathan Bossart, Jeremy Schneider Reported-by: Jeremy Schneider Reviewed-by: Jim Nasby, Thomas Munro Discussion: https://postgr.es/m/DAFB8AFF-2F05-4E33-AD7F-FF8B0F760C17%40amazon.com	2019-10-17 09:59:21 +13:00
Alvaro Herrera	0d21f919eb	Fix crash when reporting CREATE INDEX progress A race condition can make us try to dereference a NULL pointer to the PGPROC struct of a process that's already finished. That results in crashes during REINDEX CONCURRENTLY and CREATE INDEX CONCURRENTLY. This was introduced in `ab0dfc961b`, so backpatch to pg12. Reported by: Justin Pryzby Reviewed-by: Michaël Paquier Discussion: https://postgr.es/m/20191012004446.GT10470@telsasoft.com	2019-10-16 14:51:34 +02:00
Michael Paquier	1de4fd1092	Refresh some incorrect links in pg_crc.c/h Author: Vignesh C Discussion: https://postgr.es/m/CALDaNm0LPk9vTGTBPBRv0=fX=94o4r6-DuBbHNeCN2AH5bufLw@mail.gmail.com	2019-10-16 15:10:14 +09:00
Thomas Munro	d5ac14f9cc	Use libc version as a collation version on glibc systems. Using glibc's version string to detect potential collation definition changes is not 100% reliable, but it's better than nothing. Currently this affects only collations explicitly provided by "libc". More work will be needed to handle the default collation. Author: Thomas Munro, based on a suggestion from Christoph Berg Reviewed-by: Peter Eisentraut Discussion: https://postgr.es/m/4b76c6d4-ae5e-0dc6-7d0d-b5c796a07e34%402ndquadrant.com	2019-10-16 17:28:24 +13:00
Andres Freund	cef82eda14	Fix CLUSTER on expression indexes. Since the introduction of different slot types, in `1a0586de36`, we create a virtual slot in tuplesort_begin_cluster(). While that looks right, it unfortunately doesn't actually work, as ExecStoreHeapTuple() is used to store tuples in the slot. Unfortunately no regression tests for CLUSTER on expression indexes existed so far. Fix the slot type, and add bare bones tests for CLUSTER on expression indexes. Reported-By: Justin Pryzby Author: Andres Freund Discussion: https://postgr.es/m/20191011210320.GS10470@telsasoft.com Backpatch: 12, like `1a0586de36`	2019-10-15 10:40:13 -07:00
Peter Eisentraut	bdb839cbde	Update unicode.org URLs Use https, consistent host name, remove references to ftp. Also update the URLs for CLDR, which has moved from Trac to GitHub.	2019-10-13 22:10:38 +02:00
Tom Lane	9abb2bfc04	In the postmaster, rely on the signal infrastructure to block signals. POSIX sigaction(2) can be told to block a set of signals while a signal handler executes. Make use of that instead of manually blocking and unblocking signals in the postmaster's signal handlers. This should save a few cycles, and it also prevents recursive invocation of signal handlers when many signals arrive in close succession. We have seen buildfarm failures that seem to be due to postmaster stack overflow caused by such recursion (exacerbated by a Linux PPC64 kernel bug). This doesn't change anything about the way that it works on Windows. Somebody might consider adjusting port/win32/signal.c to let it work similarly, but I'm not in a position to do that. For the moment, just apply to HEAD. Possibly we should consider back-patching this, but it'd be good to let it age awhile first. Discussion: https://postgr.es/m/14878.1570820201@sss.pgh.pa.us	2019-10-13 15:48:26 -04:00
Michael Paquier	1df5875d39	Fix dependency handling of column drop with partitioned tables When dropping a column on a partitioned table which has one or more partitioned indexes, the operation was failing as dependencies with partitioned indexes using the column dropped were not getting removed in a way consistent with the columns involved across all the relations part of an inheritance tree. This commit refactors the code executing column drop so as all the columns from an inheritance tree to remove are gathered first, and dropped all at the end. This way, we let the dependency machinery sort out by itself the deletion of all the columns with the partitioned indexes across a partition tree. This issue has been introduced by `1d92a0c`, so backpatch down to REL_12_STABLE. Author: Amit Langote, Michael Paquier Reviewed-by: Álvaro Herrera, Ashutosh Sharma Discussion: https://postgr.es/m/CA+HiwqE9kuBsZ3b5pob2-cvE8ofzPWs-og+g8bKKGnu6b4-yTQ@mail.gmail.com Backpatch-through: 12	2019-10-13 17:51:55 +09:00
Peter Eisentraut	b4675a8ae2	Fix use of term "verifier" Within the context of SCRAM, "verifier" has a specific meaning in the protocol, per RFCs. The existing code used "verifier" differently, to mean whatever is or would be stored in pg_auth.rolpassword. Fix this by using the term "secret" for this, following RFC 5803. Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://www.postgresql.org/message-id/flat/be397b06-6e4b-ba71-c7fb-54cae84a7e18%402ndquadrant.com	2019-10-12 21:41:59 +02:00
Fujii Masao	20961ceaf0	Make crash recovery ignore restore_command and recovery_end_command settings. In v11 or before, those settings could not take effect in crash recovery because they are specified in recovery.conf and crash recovery always starts without recovery.conf. But commit `2dedf4d9a8` integrated recovery.conf into postgresql.conf and which unexpectedly allowed those settings to take effect even in crash recovery. This is definitely not good behavior. To fix the issue, this commit makes crash recovery always ignore restore_command and recovery_end_command settings. Back-patch to v12 where the issue was added. Author: Fujii Masao Reviewed-by: Peter Eisentraut Discussion: https://postgr.es/m/e445616d-023e-a268-8aa1-67b8b335340c@pgmasters.net	2019-10-11 15:47:59 +09:00
Andres Freund	93765bd956	Fix table rewrites that include a column without a default. In `c2fe139c20` I made ATRewriteTable() use tuple slots. Unfortunately I did not notice that columns can be added in a rewrite that do not have a default, when another column is added/altered requiring one. Initialize columns to NULL again, and add tests. Bug: #16038 Reported-By: anonymous Author: Andres Freund Discussion: https://postgr.es/m/16038-5c974541f2bf6749@postgresql.org Backpatch: 12, where the bug was introduced in `c2fe139c20`	2019-10-09 22:00:50 -07:00
Peter Eisentraut	50518ec296	Revert "Use libc version as a collation version on glibc systems." This reverts commit `9f90b1d08d`. This needs some refinements in the pg_dump and pg_upgrade tests.	2019-10-09 21:36:01 +02:00
Peter Eisentraut	9f90b1d08d	Use libc version as a collation version on glibc systems. Using glibc's version number to detect potential collation definition changes is not 100% reliable, but it's better than nothing. Author: Thomas Munro Reviewed-by: Peter Eisentraut Discussion: https://postgr.es/m/4b76c6d4-ae5e-0dc6-7d0d-b5c796a07e34%402ndquadrant.com	2019-10-09 21:17:47 +02:00
Michael Paquier	b8e19b932a	Flush logical mapping files with fd opened for read/write at checkpoint The file descriptor was opened with read-only to fsync a regular file, which would cause EBADFD errors on some platforms. This is similar to the recent fix done by `a586cc4b` (which was broken by me with `82a5649`), except that I noticed this issue while monitoring the backend code for similar mistakes. Backpatch to 9.4, as this has been introduced since logical decoding exists as of `b89e151`. Author: Michael Paquier Reviewed-by: Andres Freund Discussion: https://postgr.es/m/20191006045548.GA14532@paquier.xyz Backpatch-through: 9.4	2019-10-09 13:30:43 +09:00
Peter Eisentraut	38d8dce61f	Remove some code for old unsupported versions of MSVC As of `d9dd406fe2`, we require MSVC 2013, which means _MSC_VER >= 1800. This means that conditionals about older versions of _MSC_VER can be removed or simplified. Previous code was also in some cases handling MinGW, where _MSC_VER is not defined at all, incorrectly, such as in pg_ctl.c and win32_port.h, leading to some compiler warnings. This should now be handled better. Reviewed-by: Michael Paquier <michael@paquier.xyz>	2019-10-08 10:50:54 +02:00
Michael Paquier	a7471bd85c	Update some outdated links about XLC and UNIX specification Author: Vignesh C Discussion: https://postgr.es/m/CALDaNm3Dy=dTdx8UCVw=DWbzLzmRUC1dkq45=heOZDUg3U_PtA@mail.gmail.com	2019-10-08 14:31:30 +09:00
Tom Lane	3887e9455f	Check for too many postmaster children before spawning a bgworker. The postmaster's code path for spawning a bgworker neglected to check whether we already have the max number of live child processes. That's a bit hard to hit, since it would necessarily be a transient condition; but if we do, AssignPostmasterChildSlot() fails causing a postmaster crash, as seen in a report from Bhargav Kamineni. To fix, invoke canAcceptConnections() in the bgworker code path, as we do in the other code paths that spawn children. Since we don't want the same pmState tests in this case, add a child-process-type parameter to canAcceptConnections() so that it can know what to do. Back-patch to 9.5. In principle the same hazard exists in 9.4, but the code is enough different that this patch wouldn't quite fix it there. Given the tiny usage of bgworkers in that branch it doesn't seem worth creating a variant patch for it. Discussion: https://postgr.es/m/18733.1570382257@sss.pgh.pa.us	2019-10-07 12:39:09 -04:00
Tom Lane	ac12ab06a9	Avoid trying to release a List's initial allocation via repalloc(). Commit `1cff1b95a` included some code that supposed it could repalloc() a memory chunk to a smaller size without risk of the chunk moving. That was not a great idea, because it depended on undocumented behavior of AllocSetRealloc, which commit `c477f3e44` changed thereby breaking it. (Not to mention that this code ought to work with other memory context types, which might not work the same...) So get rid of the repalloc calls, and instead just wipe the now-unused ListCell array and/or tell Valgrind it's NOACCESS, as if we'd freed it. In cases where the initial list allocation had been quite large, this could represent an annoying waste of space. In principle we could ameliorate that by allocating the initial cell array separately when it exceeds some threshold. But that would complicate new_list() which is hot code, and the returns would materialize only in narrow cases. On balance I don't think it'd be worth it. Discussion: https://postgr.es/m/17059.1570208426@sss.pgh.pa.us	2019-10-06 12:06:30 -04:00
Tomas Vondra	36425ece5d	Change MemoryContextMemAllocated to return Size Commit `f2369bc610` switched most of the memory accounting from int64 to Size, but it forgot to change the MemoryContextMemAllocated return type. So this fixes that omission. Discussion: https://www.postgresql.org/message-id/11238.1570200198%40sss.pgh.pa.us	2019-10-05 20:49:39 +02:00
Andres Freund	d986d4e87f	Fix crash caused by EPQ happening with a before update trigger present. When ExecBRUpdateTriggers()'s GetTupleForTrigger() follows an EPQ chain the former needs to run the result tuple through the junkfilter again, and update the slot containing the new version of the tuple to contain that new version. The input tuple may already be in the junkfilter's output slot, which used to be OK - we don't need the previous version anymore. Unfortunately `ff11e7f4b9` started to use ExecCopySlot() to update newslot, and ExecCopySlot() doesn't support copying a slot into itself, leading to a slot in a corrupt state, which then can cause crashes or other symptoms. Fix this by skipping the ExecCopySlot() when copying into itself. While we could have easily made ExecCopySlot() handle that case, it seems better to add an assert forbidding doing so instead. As the goal of copying might be to make the contents of one slot independent from another, it seems failure prone to handle doing so silently. A follow-up commit will add tests for the obviously under-covered combination of EPQ and triggers. Done as a separate commit as it might make sense to backpatch them further than this bug. Also remove confusion with confusing variable names for slots in ExecBRDeleteTriggers() and ExecBRUpdateTriggers(). Bug: #16036 Reported-By: Антон Власов Author: Andres Freund Discussion: https://postgr.es/m/16036-28184c90d952fb7f@postgresql.org Backpatch: 12-, where `ff11e7f4b9` was merged	2019-10-04 13:50:49 -07:00
Andres Freund	a586cc4b6c	Use a fd opened for read/write when syncing slots during startup, take 2. Cribbing from dfbaed45975: Some operating systems, including the reporter's windows, return EBADFD or similar when fsync() is invoked on a O_RDONLY file descriptor. Unfortunately RestoreSlotFromDisk() does exactly that; which causes failures after restarts in at least some scenarios. If you hit the bug the error message will be something like ERROR: could not fsync file "pg_replslot/$name/state": Bad file descriptor Simply use O_RDWR instead of O_RDONLY when opening the relevant file descriptor to fix the bug. Unfortunately this fix was undone in `82a5649fb9`. Re-apply, and add a comment. Bug: 16039 Reported-By: Hans Buschmann Author: Andres Freund Discussion: https://postgr.es/m/16039-196fc97cc05e141c@postgresql.org Backpatch: 12-, as `82a5649fb9`	2019-10-04 13:34:28 -07:00
Robert Haas	2e8b6bfa90	Rename some toasting functions based on whether they are heap-specific. The old names for the attribute-detoasting functions names included the word "heap," which seems outdated now that the heap is only one of potentially many table access methods. On the other hand, toast_insert_or_update and toast_delete are heap-specific, so rename them by adding "heap_" as a prefix. Not all of the work of making the TOAST system fully accessible to AMs other than the heap is done yet, but there seems to be little harm in getting this renaming out of the way now. Commit `8b94dab066` already divided up the functions among various files partially according to whether it was intended that they should be heap-specific or AM-agnostic, so this is just clarifying the division contemplated by that commit. Patch by me, reviewed and tested by Prabhat Sabu, Thomas Munro, Andres Freund, and Álvaro Herrera. Discussion: http://postgr.es/m/CA+TgmoZv-=2iWM4jcw5ZhJeL18HF96+W1yJeYrnGMYdkFFnEpQ@mail.gmail.com	2019-10-04 14:24:46 -04:00
Tom Lane	61aa9f544a	Fix bitshiftright()'s zero-padding some more. Commit `5ac0d9360` failed to entirely fix bitshiftright's habit of leaving one-bits in the pad space that should be all zeroes, because in a moment of sheer brain fade I'd concluded that only the code path used for not-a-multiple-of-8 shift distances needed to be fixed. Of course, a multiple-of-8 shift distance can also cause the problem, so we need to forcibly zero the extra bits in both cases. Per bug #16037 from Alexander Lakhin. As before, back-patch to all supported branches. Discussion: https://postgr.es/m/16037-1d1ebca564db54f4@postgresql.org	2019-10-04 10:34:40 -04:00
Tomas Vondra	f2369bc610	Use Size instead of int64 to track allocated memory Commit `5dd7fc1519` added block-level memory accounting, but used an int64 variable to track the amount of allocated memory. That is incorrect, because we have Size for exactly these purposes, but it was mostly harmless until `c477f3e449` which changed how we deal with repalloc() when downsizing the chunk. Previously we've ignored these cases and just kept using the original chunk, but now we need to update the accounting, and the code was doing this: context->mem_allocated += blksize - oldblksize; Both blksize and oldblksize are Size (so unsigned) which means the subtraction underflows, producing a very high positive value. On 64-bit platforms (where Size has the same size as mem_allocated) this happens to work because the result wraps to the right value, but on (some) 32-bit platforms this fails. This fixes two things - it changes mem_allocated (and related variables) to Size, and it splits the update to two separate steps, to prevent any underflows. Discussion: https://www.postgresql.org/message-id/15151.1570163761%40sss.pgh.pa.us	2019-10-04 16:10:56 +02:00
Robert Haas	967e276e9f	Remove AtSubStart_Notify. Allocate notify-related state lazily instead. This makes trivial subtransactions noticeably faster. Patch by me, reviewed and tested by Dilip Kumar, Kyotaro Horiguchi, and Jeevan Ladhe. Discussion: https://postgr.es/m/CA+TgmobE1J22S1eC-6N-je9LgrcwZypkwp+zH6JXo9mc=4Nk3A@mail.gmail.com	2019-10-04 08:19:25 -04:00
Tom Lane	8e10405c74	Avoid unnecessary out-of-memory errors during encoding conversion. Encoding conversion uses the very simplistic rule that the output can't be more than 4X longer than the input, and palloc's a buffer of that size. This results in failure to convert any string longer than 1/4 GB, which is becoming an annoying limitation. As a band-aid to improve matters, allow the allocated output buffer size to exceed 1GB. We still insist that the final result fit into MaxAllocSize (1GB), though. Perhaps it'd be safe to relax that restriction, but it'd require close analysis of all callers, which is daunting (not least because external modules might call these functions). For the moment, this should allow a 2X to 4X improvement in the longest string we can convert, which is a useful gain in return for quite a simple patch. Also, once we have successfully converted a long string, repalloc the output down to the actual string length, returning the excess to the malloc pool. This seems worth doing since we can usually expect to give back several MB if we take this path at all. This still leaves much to be desired, most notably that the assumption that MAX_CONVERSION_GROWTH == 4 is very fragile, and yet we have no guard code verifying that the output buffer isn't overrun. Fixing that would require significant changes in the encoding conversion APIs, so it'll have to wait for some other day. The present patch seems safely back-patchable, so patch all supported branches. Alvaro Herrera and Tom Lane Discussion: https://postgr.es/m/20190816181418.GA898@alvherre.pgsql Discussion: https://postgr.es/m/3614.1569359690@sss.pgh.pa.us	2019-10-03 17:34:25 -04:00
Tom Lane	c477f3e449	Allow repalloc() to give back space when a large chunk is downsized. Up to now, if you resized a large (>8K) palloc chunk down to a smaller size, aset.c made no attempt to return any space to the malloc pool. That's unpleasant if a really large allocation is resized to a significantly smaller size. I think no such cases existed when this code was designed, and I'm not sure whether they're common even yet, but an upcoming fix to encoding conversion will certainly create such cases. Therefore, fix AllocSetRealloc so that it gives realloc() a chance to do something with the block. This doesn't noticeably increase complexity, we mostly just have to change the order in which the cases are considered. Back-patch to all supported branches. Discussion: https://postgr.es/m/20190816181418.GA898@alvherre.pgsql Discussion: https://postgr.es/m/3614.1569359690@sss.pgh.pa.us	2019-10-03 13:56:26 -04:00
Andrew Gierth	b7a1c5539a	Selectively include window frames in expression walks/mutates. query_tree_walker and query_tree_mutator were skipping the windowClause of the query, without regard for the fact that the startOffset and endOffset in a WindowClause node are expression trees that need to be processed. This was an oversight in commit `ec4be2ee6` from 2010 which added the expression fields; the main symptom is that function parameters in window frame clauses don't work in inlined functions. Fix (as conservatively as possible since this needs to not break existing out-of-tree callers) and add tests. Backpatch all the way, since this has been broken since 9.0. Per report from Alastair McKinley; fix by me with kibitzing and review from Tom Lane. Discussion: https://postgr.es/m/DB6PR0202MB2904E7FDDA9D81504D1E8C68E3800@DB6PR0202MB2904.eurprd02.prod.outlook.com	2019-10-03 10:54:52 +01:00
Michael Paquier	df86e52cac	Remove temporary WAL and history files at the end of archive recovery `cbc55da` has reworked the order of some actions at the end of archive recovery. Unfortunately this overlooked the fact that the startup process needs to remove RECOVERYXLOG (for temporary WAL segment newly recovered from archives) and RECOVERYHISTORY (for temporary history file) at this step, leaving the files around even after recovery ended. Backpatch to 9.5, like the previous commit. Author: Sawada Masahiko Reviewed-by: Fujii Masao, Michael Paquier Discussion: https://postgr.es/m/CAD21AoBO_eDQub6zojFnWtnmutRBWvYf7=cW4Hsqj+U_R26w3Q@mail.gmail.com Backpatch-through: 9.5	2019-10-02 15:53:07 +09:00

20459 Commits