postgresql

Commit Graph

Author	SHA1	Message	Date
Tom Lane	9a3cebeaa7	Change executor to just Assert that table locks were already obtained. Instead of locking tables during executor startup, just Assert that suitable locks were obtained already during the parse/plan pipeline (or re-obtained by the plan cache). This must be so, else we have a hazard that concurrent DDL has invalidated the plan. This is pretty inefficient as well as undercommented, but it's all going to go away shortly, so I didn't try hard. This commit is just another attempt to use the buildfarm to see if we've missed anything in the plan to simplify the executor's table management. Note that the change needed here in relation_open() exposes that parallel workers now really are accessing tables without holding any lock of their own, whereas they were not doing that before this commit. This does not give me a warm fuzzy feeling about that aspect of parallel query; it does not seem like a good design, and we now know that it's had exactly no actual testing. I think that we should modify parallel query so that that change can be reverted. Discussion: https://postgr.es/m/468c85d9-540e-66a2-1dde-fec2b741e688@lab.ntt.co.jp	2018-10-03 16:05:12 -04:00
Andres Freund	c03c1449c0	Fix issues around EXPLAIN with JIT. I (Andres) was more than a bit hasty in committing `33001fd7a7` after last minute changes, leading to a number of problems (jit output was only shown for JIT in parallel workers, and just EXPLAIN without ANALYZE didn't work). Lukas luckily found these issues quickly. Instead of combining instrumentation in in standard_ExecutorEnd(), do so on demand in the new ExplainPrintJITSummary(). Also update a documentation example of the JIT output, changed in `52050ad8eb`. Author: Lukas Fittl, with minor changes by me Discussion: https://postgr.es/m/CAP53PkxmgJht69pabxBXJBM+0oc6kf3KHMborLP7H2ouJ0CCtQ@mail.gmail.com Backpatch: 11, where JIT compilation was introduced	2018-10-03 12:48:37 -07:00
Amit Kapila	9bc9f72b28	MAXALIGN the target address where we store flattened value. The API (EOH_flatten_into) that flattens the expanded value representation expects the target address to be maxaligned. All it's usage adhere to that principle except when serializing datums for parallel query. Fix that usage. Diagnosed-by: Tom Lane Author: Tom Lane and Amit Kapila Backpatch-through: 9.6 Discussion: https://postgr.es/m/11629.1536550032@sss.pgh.pa.us	2018-10-03 09:15:03 +05:30
Tom Lane	6e35939feb	Change rewriter/planner/executor/plancache to depend on RTE rellockmode. Instead of recomputing the required lock levels in all these places, just use what commit `fdba460a2` made the parser store in the RTE fields. This already simplifies the code measurably in these places, and follow-on changes will remove a bunch of no-longer-needed infrastructure. In a few cases, this change causes us to acquire a higher lock level than we did before. This is OK primarily because said higher lock level should've been acquired already at query parse time; thus, we're saving a useless extra trip through the shared lock manager to acquire a lesser lock alongside the original lock. The only known exception to this is that re-execution of a previously planned SELECT FOR UPDATE/SHARE query, for a table that uses ROW_MARK_REFERENCE or ROW_MARK_COPY methods, might have gotten only AccessShareLock before. Now it will get RowShareLock like the first execution did, which seems fine. While there's more to do, push it in this state anyway, to let the buildfarm help verify that nothing bad happened. Amit Langote, reviewed by David Rowley and Jesper Pedersen, and whacked around a bit more by me Discussion: https://postgr.es/m/468c85d9-540e-66a2-1dde-fec2b741e688@lab.ntt.co.jp	2018-10-02 14:43:09 -04:00
Andres Freund	cc2905e963	Use slots more widely in tuple mapping code and make naming more consistent. It's inefficient to use a single slot for mapping between tuple descriptors for multiple tuples, as previously done when using ConvertPartitionTupleSlot(), as that means the slot's tuple descriptors change for every tuple. Previously we also, via ConvertPartitionTupleSlot(), built new tuples after the mapping even in cases where we, immediately afterwards, access individual columns again. Refactor the code so one slot, on demand, is used for each partition. That avoids having to change the descriptor (and allows to use the more efficient "fixed" tuple slots). Then use slot->slot mapping, to avoid unnecessarily forming a tuple. As the naming between the tuple and slot mapping functions wasn't consistent, rename them to execute_attr_map_{tuple,slot}. It's likely that we'll also rename convert_tuples_by_* to denote that these functions "only" build a map, but that's left for later. Author: Amit Khandekar and Amit Langote, editorialized by me Reviewed-By: Amit Langote, Amit Khandekar, Andres Freund Discussion: https://postgr.es/m/CAJ3gD9fR0wRNeAE8VqffNTyONS_UfFPRpqxhnD9Q42vZB+Jvpg@mail.gmail.com https://postgr.es/m/e4f9d743-cd4b-efb0-7574-da21d86a7f36%40lab.ntt.co.jp Backpatch: -	2018-10-02 11:14:26 -07:00
Tom Lane	3d0f68dd30	Fix corner-case failures in has_foo_privilege() family of functions. The variants of these functions that take numeric inputs (OIDs or column numbers) are supposed to return NULL rather than failing on bad input; this rule reduces problems with snapshot skew when queries apply the functions to all rows of a catalog. has_column_privilege() had careless handling of the case where the table OID didn't exist. You might get something like this: select has_column_privilege(9999,'nosuchcol','select'); ERROR: column "nosuchcol" of relation "(null)" does not exist or you might get a crash, depending on the platform's printf's response to a null string pointer. In addition, while applying the column-number variant to a dropped column returned NULL as desired, applying the column-name variant did not: select has_column_privilege('mytable','........pg.dropped.2........','select'); ERROR: column "........pg.dropped.2........" of relation "mytable" does not exist It seems better to make this case return NULL as well. Also, the OID-accepting variants of has_foreign_data_wrapper_privilege, has_server_privilege, and has_tablespace_privilege didn't follow the principle of returning NULL for nonexistent OIDs. Superusers got TRUE, everybody else got an error. Per investigation of Jaime Casanova's report of a new crash in HEAD. These behaviors have been like this for a long time, so back-patch to all supported branches. Patch by me; thanks to Stephen Frost for discussion and review Discussion: https://postgr.es/m/CAJGNTeP=-6Gyqq5TN9OvYEydi7Fv1oGyYj650LGTnW44oAzYCg@mail.gmail.com	2018-10-02 11:54:12 -04:00
Michael Paquier	e3a25ab9ea	Refactor relation opening for VACUUM and ANALYZE VACUUM and ANALYZE share similar logic when it comes to opening a relation to work on in terms of how the relation is opened, in which order locks are tried and how logs should be generated when something does not work as expected. This commit refactors things so as both use the same code path to handle the way a relation is opened, so as the integration of new options becomes easier. Author: Michael Paquier Reviewed-by: Nathan Bossart Discussion: https://postgr.es/m/20180927075152.GT1659@paquier.xyz	2018-10-02 08:53:38 +09:00
Peter Eisentraut	cf3dfea45b	Change PROCEDURE to FUNCTION in CREATE EVENT TRIGGER syntax This was claimed to have been done in `0a63f996e0`, but that actually only changed the documentation and not the grammar. (That commit did fully change it for CREATE TRIGGER.)	2018-10-01 23:02:55 +02:00
Tom Lane	b04aeb0a05	Add assertions that we hold some relevant lock during relation open. Opening a relation with no lock at all is unsafe; there's no guarantee that we'll see a consistent state of the relevant catalog entries. While use of MVCC scans to read the catalogs partially addresses that complaint, it's still possible to switch to a new catalog snapshot partway through loading the relcache entry. Moreover, whether or not you trust the reasoning behind sometimes using less than AccessExclusiveLock for ALTER TABLE, that reasoning is certainly not valid if concurrent users of the table don't hold a lock corresponding to the operation they want to perform. Hence, add some assertion-build-only checks that require any caller of relation_open(x, NoLock) to hold at least AccessShareLock. This isn't a full solution, since we can't verify that the lock level is semantically appropriate for the action --- but it's definitely of some use, because it's already caught two bugs. We can also assert that callers of addRangeTableEntryForRelation() hold at least the lock level specified for the new RTE. Amit Langote and Tom Lane Discussion: https://postgr.es/m/16565.1538327894@sss.pgh.pa.us	2018-10-01 12:43:21 -04:00
Tom Lane	e27453bd83	Fix ALTER COLUMN TYPE to not open a relation without any lock. If the column being modified is referenced by a foreign key constraint of another table, ALTER TABLE would open the other table (to re-parse the constraint's definition) without having first obtained a lock on it. This was evidently intentional, but that doesn't mean it's really safe. It's especially not safe in 9.3, which pre-dates use of MVCC scans for catalog reads, but even in current releases it doesn't seem like a good idea. We know we'll need AccessExclusiveLock shortly to drop the obsoleted constraint, so just get that a little sooner to close the hole. Per testing with a patch that complains if we open a relation without holding any lock on it. I don't plan to back-patch that patch, but we should close the holes it identifies in all supported branches. Discussion: https://postgr.es/m/2038.1538335244@sss.pgh.pa.us	2018-10-01 11:39:13 -04:00
Tom Lane	fdba460a26	Create an RTE field to record the query's lock mode for each relation. Add RangeTblEntry.rellockmode, which records the appropriate lock mode for each RTE_RELATION rangetable entry (either AccessShareLock, RowShareLock, or RowExclusiveLock depending on the RTE's role in the query). This patch creates the field and makes all creators of RTE nodes fill it in reasonably, but for the moment nothing much is done with it. The plan is to replace assorted post-parser logic that re-determines the right lockmode to use with simple uses of rte->rellockmode. For now, just add Asserts in each of those places that the rellockmode matches what they are computing today. (In some cases the match isn't perfect, so the Asserts are weaker than you might expect; but this seems OK, as per discussion.) This passes check-world for me, but it seems worth pushing in this state to see if the buildfarm finds any problems in cases I failed to test. catversion bump due to change of stored rules. Amit Langote, reviewed by David Rowley and Jesper Pedersen, and whacked around a bit more by me Discussion: https://postgr.es/m/468c85d9-540e-66a2-1dde-fec2b741e688@lab.ntt.co.jp	2018-09-30 13:55:51 -04:00
Stephen Frost	8bddc86400	Add application_name to connection authorized msg The connection authorized message has quite a bit of useful information in it, but didn't include the application_name (when provided), so let's add that as it can be very useful. Note that at the point where we're emitting the connection authorized message, we haven't processed GUCs, so it's not possible to get this by using log_line_prefix (which pulls from the GUC). There's also something to be said for having this included in the connection authorized message and then not needing to repeat it for every line, as having it in log_line_prefix would do. The GUC cleans the application name to pure-ascii, so do that here too, but pull out the logic for cleaning up a string into its own function in common and re-use it from those places, and check_cluster_name which was doing the same thing. Author: Don Seiler <don@seiler.us> Discussion: https://postgr.es/m/CAHJZqBB_Pxv8HRfoh%2BAB4KxSQQuPVvtYCzMg7woNR3r7dfmopw%40mail.gmail.com	2018-09-28 19:04:50 -04:00
Tom Lane	2b04dfc472	Improve error reporting for unsupported effective_io_concurrency setting. Give a specific error complaining about lack of posix_fadvise() when someone tries to set effective_io_concurrency > 0 on platforms without that. This probably isn't worth extensive back-patching, but I (tgl) felt cramming it into v11 was reasonable. James Robinson Discussion: https://postgr.es/m/153771876450.14994.560017943128223619@wrigleys.postgresql.org Discussion: https://postgr.es/m/A3942987-5BC7-4F05-B54D-2A0EC2914B33@jlr-photo.com	2018-09-28 16:12:13 -04:00
Amit Kapila	a86bf6057e	Fix assertion failure when updating full_page_writes for checkpointer. When the checkpointer receives a SIGHUP signal to update its configuration, it may need to update the shared memory for full_page_writes and need to write a WAL record for it. Now, it is quite possible that the XLOG machinery has not been initialized by that time and it will lead to assertion failure while doing that. Fix is to allow the initialization of the XLOG machinery outside critical section. This bug has been introduced by the commit `2c03216d83` which added the XLOG machinery initialization in RecoveryInProgress code path. Reported-by: Dilip Kumar Author: Dilip Kumar Reviewed-by: Michael Paquier and Amit Kapila Backpatch-through: 9.5 Discussion: https://postgr.es/m/CAFiTN-u4BA8KXcQUWDPNgaKAjDXC=C2whnzBM8TAcv=stckYUw@mail.gmail.com	2018-09-28 16:40:04 +05:30
Michael Paquier	78ea8b5daa	Fix WAL recycling on standbys depending on archive_mode A restart point or a checkpoint recycling WAL segments treats segments marked with neither ".done" (archiving is done) or ".ready" (segment is ready to be archived) in archive_status the same way for archive_mode being "on" or "always". While for a primary this is fine, a standby running a restart point with archive_mode = on would try to mark such a segment as ready for archiving, which is something that will never happen except after the standby is promoted. Note that this problem applies only to WAL segments coming from the local pg_wal the first time archive recovery is run. Segments part of a self-contained base backup are the most common case where this could happen, however even in this case normally the .done markers would be most likely part of the backup. Segments recovered from an archive are marked as .ready or .done by the startup process, and segments finished streaming are marked as such by the WAL receiver, so they are handled already. Reported-by: Haruka Takatsuka Author: Michael Paquier Discussion: https://postgr.es/m/15402-a453c90ed4cf88b2@postgresql.org Backpatch-through: 9.5, where archive_mode = always has been added.	2018-09-28 11:54:38 +09:00
Tom Lane	aaf10f32a3	Fix assorted bugs in pg_get_partition_constraintdef(). It failed if passed a nonexistent relation OID, or one that was a non-heap relation, because of blindly applying heap_open to a user-supplied OID. This is not OK behavior for a SQL-exposed function; we have a project policy that we should return NULL in such cases. Moreover, since pg_get_partition_constraintdef ought now to work on indexes, restricting it to heaps is flat wrong anyway. The underlying function generate_partition_qual() wasn't on board with indexes having partition quals either, nor for that matter with rels having relispartition set but yet null relpartbound. (One wonders whether the person who wrote the function comment blocks claiming that these functions allow a missing relpartbound had ever tested it.) Fix by testing relispartition before opening the rel, and by using relation_open not heap_open. (If any other relkinds ever grow the ability to have relispartition set, the code will work with them automatically.) Also, don't reject null relpartbound in generate_partition_qual. Back-patch to v11, and all but the null-relpartbound change to v10. (It's not really necessary to change generate_partition_qual at all in v10, but I thought s/heap_open/relation_open/ would be a good idea anyway just to keep the code in sync with later branches.) Per report from Justin Pryzby. Discussion: https://postgr.es/m/20180927200020.GJ776@telsasoft.com	2018-09-27 18:15:17 -04:00
Alexander Korotkov	4ec90f53f1	Minor formatting cleanup for `2a6368343f`	2018-09-27 23:29:50 +03:00
Alexander Korotkov	0f64595894	Remove extra usage of BoxPGetDatum() macro Author: Mark Dilger Discussion: https://postgr.es/m/B2AEFCD0-836D-4654-9D59-3DF616E0A6F3%40gmail.com	2018-09-27 23:25:22 +03:00
Andres Freund	27e082b0c6	Clean up in the wake of TupleDescGetSlot() removal / `10763358c3`. The previous commit wasn't careful enough to remove all traces of TupleDescGetSlot(). Besides fixing the oversight of not removing TupleDescGetSlot()'s declaration, this also removes FuncCallContext->slot. That was documented to be for use in combination with TupleDescGetSlot(), a cursory search over extensions finds no users, and there doesn't seem to be convincing reasons to keep it around. If we later in the v12 release cycle find users, we can re-consider this part of the commit. Reported-By: Michael Paquier Discussion: https://postgr.es/m/20180926000413.GC1659@paquier.xyz	2018-09-27 11:38:11 -07:00
Michael Paquier	ba16aade33	Switch flags tracking pending interrupts to sig_atomic_t Those previously used bool, which should be safe on any modern platforms, however the C standard is clear that it is better to use sig_atomic_t for variables manipulated in signal handlers. This commit adds at the same time PGDLLIMPORT to ClientConnectionLost. Author: Michael Paquier Reviewed-by: Tom Lane, Chris Travers, Andres Freund Discussion: https://postgr.es/m/20180925011311.GD1354@paquier.xyz	2018-09-27 07:47:20 +09:00
Peter Eisentraut	0320ddaf3c	Recurse to sequences on ownership change for all relkinds When a table ownership is changed, we must apply that also to any owned sequences. (Otherwise, it would result in a situation that cannot be restored, because linked sequences must have the same owner as the table.) But this was previously only applied to regular tables and materialized views. But it should also apply to at least foreign tables. This patch removes the relkind check altogether, because it doesn't save very much and just introduces the possibility of similar omissions. Bug: #15238 Reported-by: Christoph Berg <christoph.berg@credativ.de>	2018-09-26 20:19:15 +02:00
Tom Lane	d6c55de1f9	Implement %m in src/port/snprintf.c, and teach elog.c to rely on that. I started out with the idea that we needed to detect use of %m format specs in contexts other than elog/ereport calls, because we couldn't rely on that working in *printf calls. But a better answer is to fix things so that it does work. Now that we're using snprintf.c all the time, we can implement %m in that and we've fixed the problem. This requires also adjusting our various printf-wrapping functions so that they ensure "errno" is preserved when they call snprintf.c. Remove elog.c's handmade implementation of %m, and let it rely on snprintf to support the feature. That should provide some performance gain, though I've not attempted to measure it. There are a lot of places where we could now simplify 'printf("%s", strerror(errno))' into 'printf("%m")', but I'm not in any big hurry to make that happen. Patch by me, reviewed by Michael Paquier Discussion: https://postgr.es/m/2975.1526862605@sss.pgh.pa.us	2018-09-26 13:31:56 -04:00
Tom Lane	26e9d4d4ef	Convert elog.c's useful_strerror() into a globally-used strerror wrapper. elog.c has long had a private strerror wrapper that handles assorted possible failures or deficiencies of the platform's strerror. On Windows, it also knows how to translate Winsock error codes, which the native strerror does not. Move all this code into src/port/strerror.c and define strerror() as a macro that invokes it, so that both our frontend and backend code will have all of this behavior. I believe this constitutes an actual bug fix on Windows, since AFAICS our frontend code did not report Winsock error codes properly before this. However, the main point is to lay the groundwork for implementing %m in src/port/snprintf.c: the behavior we want %m to have is this one, not the native strerror's. Note that this throws away the prior use of src/port/strerror.c, which was to implement strerror() on platforms lacking it. That's been dead code for nigh twenty years now, since strerror() was already required by C89. We should likewise cause strerror_r to use this behavior, but I'll tackle that separately. Patch by me, reviewed by Michael Paquier Discussion: https://postgr.es/m/2975.1526862605@sss.pgh.pa.us	2018-09-26 11:06:42 -04:00
Peter Eisentraut	a49ceda6a0	Update dummy CREATE ASSERTION grammar While we are probably still far away from fully implementing assertions, all patch proposals appear to take issue with the existing dummy grammar CREATE/DROP ASSERTION productions, so update those a little bit. Rename the rule, use any_name instead of name, and remove some unused code. Also remove the production for DROP ASSERTION, since that would most likely be handled via the generic DROP support. extracted from a patch by Joe Wildish	2018-09-26 13:26:24 +02:00
Tomas Vondra	2e2a392de3	Fix problems in handling the line data type According to the source history, the internal format of line data type has changed, but various functions working with it did were not updated and thus were producing wrong results. This patch addresses various such issues, in particular: * Reject invalid specification A=B=0 on receive * Reject same points on line_construct_pp() * Fix perpendicular operator when negative values are involved * Avoid division by zero on perpendicular operator * Fix intersection and distance operators when neither A nor B are 1 * Return NULL for closest point when objects are parallel * Check whether closest point of line segments is the intersection point * Fix closest point of line segments being on the wrong segment Aside from handling those issues, the patch also aims to make operators more symmetric and less sen to precision loss. The EPSILON interferes with even minor changes, but the least we can do is applying it to both sides of the operators equally. Author: Emre Hasegeli Reviewed-by: Tomas Vondra Discussion: https://www.postgresql.org/message-id/CAE2gYzxF7-5djV6-cEvqQu-fNsnt%3DEqbOURx7ZDg%2BVv6ZMTWbg%40mail.gmail.com	2018-09-26 10:25:24 +02:00
Michael Paquier	8d28bf500f	Rework activation of commit timestamps during recovery The activation and deactivation of commit timestamp tracking has not been handled consistently for a primary or standbys at recovery. The facility can be activated at three different moments of recovery: - The beginning, where a primary would use the GUC value for the decision-making, and where a standby relies on the contents of the control file. - When replaying a XLOG_PARAMETER_CHANGE record at redo. - The end, where both primary and standby rely on the GUC value. Using the GUC value for a primary at the beginning of recovery causes problems with commit timestamp access when doing crash recovery. Particularly, when replaying transaction commits, it could be possible that an attempt to read commit timestamps is done for a transaction which committed at a moment when track_commit_timestamp was disabled. A test case is added to reproduce the failure. The test works down to v11 as it takes advantage of transaction commits within procedures. Reported-by: Hailong Li Author: Masahiko Sawasa, Michael Paquier Reviewed-by: Kyotaro Horiguchi Discussion: https://postgr.es/m/11224478-a782-203b-1f17-e4797b39bdf0@qunar.com Backpatch-through: 9.5, where commit timestamps have been introduced.	2018-09-26 10:25:54 +09:00
Andres Freund	10763358c3	Remove absolete function TupleDescGetSlot(). TupleDescGetSlot() was kept around for backward compatibility for user-written SRFs. With the TupleTableSlot abstraction work, that code will need to be version specific anyway, so there's no point in keeping the function around any longer. Author: Ashutosh Bapat Reviewed-By: Andres Freund Discussion: https://postgr.es/m/20180220224318.gw4oe5jadhpmcdnm@alap3.anarazel.de	2018-09-25 16:28:57 -07:00
Andres Freund	29c94e03c7	Split ExecStoreTuple into ExecStoreHeapTuple and ExecStoreBufferHeapTuple. Upcoming changes introduce further types of tuple table slots, in preparation of making table storage pluggable. New storage methods will have different representation of tuples, therefore the slot accessor should refer explicitly to heap tuples. Instead of just renaming the functions, split it into one function that accepts heap tuples not residing in buffers, and one accepting ones in buffers. Previously one function was used for both, but that was a bit awkward already, and splitting will allow us to represent slot types for tuples in buffers and normal memory separately. This is split out from the patch introducing abstract slots, as this largely consists out of mechanical changes. Author: Ashutosh Bapat Reviewed-By: Andres Freund Discussion: https://postgr.es/m/20180220224318.gw4oe5jadhpmcdnm@alap3.anarazel.de	2018-09-25 16:27:48 -07:00
Andres Freund	bbdfbb9154	Remove function list from prologue of execTuples.c. That section is never in sync with the actual routines available and their functionality. Author: Ashutosh Bapat Discussion: https://postgr.es/m/20180220224318.gw4oe5jadhpmcdnm@alap3.anarazel.de	2018-09-25 16:27:48 -07:00
Andres Freund	a598708ffa	Change TupleTableSlot->tts_nvalid to type AttrNumber. Previously it was an int / 4 bytes. The maximum number of attributes in a tuple is restricted by the maximum value Var->varattno, which is an AttrNumber/int16. Hence use the same data type for TupleTableSlot->tts_nvalid. Author: Ashutosh Bapat Discussion: https://postgr.es/m/20180220224318.gw4oe5jadhpmcdnm@alap3.anarazel.de	2018-09-25 15:59:46 -07:00
Alvaro Herrera	5913b9bbf3	Remove obsolete comment The documented shortcoming was actually fixed in `4c728f3829` so the comment is not true anymore.	2018-09-25 17:55:22 -03:00
Andres Freund	33001fd7a7	Collect JIT instrumentation from workers. Previously, when using parallel query, EXPLAIN (ANALYZE)'s JIT compilation timings did not include the overhead from doing so on the workers. Fix that. We do so by simply aggregating the cost of doing JIT compilation on workers and the leader together. Arguably that's not quite accurate, because the total time spend doing so is spent in parallel - but it's hard to do much better. For additional detail, when VERBOSE is specified, the stats for workers are displayed separately. Author: Amit Khandekar and Andres Freund Discussion: https://postgr.es/m/CAJ3gD9eLrz51RK_gTkod+71iDcjpB_N8eC6vU2AW-VicsAERpQ@mail.gmail.com Backpatch: 11-	2018-09-25 13:12:44 -07:00
Thomas Munro	64171b3206	Constify dsa_size_class_map and use a better type. Author: Mark G Reviewed-by: Tom Lane Discussion: https://postgr.es/m/CAEeOP_Zy_FvVwcAU0UX9nkOhnoR5KN%3D0B6LWX_kv0ZuSc4wbGw%40mail.gmail.com	2018-09-25 14:58:41 +12:00
Tom Lane	fd582317e1	Sync our Snowball stemmer dictionaries with current upstream. We haven't touched these since text search functionality landed in core in 2007 :-(. While the upstream project isn't a beehive of activity, they do make additions and bug fixes from time to time. Update our copies of these files. Also update our documentation about how to keep things in sync, since they're not making distribution tarballs these days. Fortunately, their source code turns out to be a breeze to build. Notable changes: * The non-UTF8 version of the hungarian stemmer now works in LATIN2 not LATIN1. * New stemmers have appeared for arabic, indonesian, irish, lithuanian, nepali, and tamil. These all work in UTF8, and the indonesian and irish ones also work in LATIN1. (There are some new stemmers that I did not incorporate, mainly because their names don't match the underlying languages, suggesting that they're not to be considered mainstream.) Worth noting: the upstream Nepali dictionary was contributed by Arthur Zakirov. initdb forced because the contents of snowball_create.sql have changed. Still TODO: see about updating the stopword lists. Arthur Zakirov, minor mods and doc work by me Discussion: https://postgr.es/m/20180626122025.GA12647@zakirov.localdomain Discussion: https://postgr.es/m/20180219140849.GA9050@zakirov.localdomain	2018-09-24 17:29:38 -04:00
Andres Freund	52050ad8eb	Make EXPLAIN output for JIT compilation more dense. A discussion about also reporting JIT compilation overhead on workers brought unhappiness with the verbosity of the current explain format to light. Make the text format more dense, and restructure the structured output to mirror that more closely. As we're re-jiggering the output format anyway: The denser format allows us to report all flags for JIT compilation (now also reporting PGJIT_EXPR and PGJIT_DEFORM), and report the total time in addition to the individual times. Per complaint from Tom Lane. Author: Andres Freund Discussion: https://postgr.es/m/27812.1537221015@sss.pgh.pa.us Backpatch: 11-, where JIT compilation was introduced	2018-09-24 13:35:45 -07:00
Andrew Dunstan	7636e5c60f	Fast default trigger and expand_tuple fixes Ensure that triggers get properly filled in tuples for the OLD value. Also fix the logic of detecting missing null values. The previous logic failed to detect a missing null column before the first missing column with a default. Fixing this has simplified the logic a bit. Regression tests are added to test changes. This should ensure better coverage of expand_tuple(). Original bug reports, and some code and test scripts from Tomas Vondra Backpatch to release 11.	2018-09-24 16:11:24 -04:00
Tom Lane	87d9bbca13	Fix over-allocation of space for array_out()'s result string. array_out overestimated the space needed for its output, possibly by a very substantial amount if the array is multi-dimensional, because of wrong order of operations in the loop that counts the number of curly-brace pairs needed. While the output string is normally short-lived, this could still cause problems in extreme cases. An additional minor error was that it counted one more delimiter than is actually needed. Repair those errors, add an Assert that the space is now correctly calculated, and make some minor improvements in the comments. I also failed to resist the temptation to get rid of an integer modulus operation per array element; a simple comparison is sufficient. This bug dates clear back to Berkeley days, so back-patch to all supported versions. Keiichi Hirobe, minor additional work by me Discussion: https://postgr.es/m/CAH=EFxE9W0tRvQkixR2XJRRCToUYUEDkJZk6tnADXugPBRdcdg@mail.gmail.com	2018-09-24 11:30:59 -04:00
Joe Conway	c62dd80cdf	Document aclitem functions and operators aclitem functions and operators have been heretofore undocumented. Fix that. While at it, ensure the non-operator aclitem functions have pg_description strings. Does not seem worthwhile to back-patch. Author: Fabien Coelho, with pg_description from John Naylor, and significant refactoring and editorialization by me. Reviewed by: Tom Lane Discussion: https://postgr.es/m/flat/alpine.DEB.2.21.1808010825490.18204%40lancre	2018-09-24 10:14:57 -04:00
Noah Misch	d18f6674bd	Initialize random() in bootstrap/stand-alone postgres and in initdb. This removes a difference between the standard IsUnderPostmaster execution environment and that of --boot and --single. In a stand-alone backend, "SELECT random()" always started at the same seed. On a system capable of using posix shared memory, initdb could still conclude "selecting dynamic shared memory implementation ... sysv". Crashed --boot or --single postgres processes orphaned shared memory objects having names that collided with the not-actually-random names that initdb probed. The sysv fallback appeared after ten crashes of --boot or --single postgres. Since --boot and --single are rare in production use, systems used for PostgreSQL development are the principal candidate to notice this symptom. Back-patch to 9.3 (all supported versions). PostgreSQL 9.4 introduced dynamic shared memory, but 9.3 does share the "SELECT random()" problem. Reviewed by Tom Lane and Kyotaro HORIGUCHI. Discussion: https://postgr.es/m/20180915221546.GA3159382@rfd.leadboat.com	2018-09-23 22:56:39 -07:00
Tom Lane	89b280e139	Fix failure in WHERE CURRENT OF after rewinding the referenced cursor. In a case where we have multiple relation-scan nodes in a cursor plan, such as a scan of an inheritance tree, it's possible to fetch from a given scan node, then rewind the cursor and fetch some row from an earlier scan node. In such a case, execCurrent.c mistakenly thought that the later scan node was still active, because ExecReScan hadn't done anything to make it look not-active. We'd get some sort of failure in the case of a SeqScan node, because the node's scan tuple slot would be pointing at a HeapTuple whose t_self gets reset to invalid by heapam.c. But it seems possible that for other relation scan node types we'd actually return a valid tuple TID to the caller, resulting in updating or deleting a tuple that shouldn't have been considered current. To fix, forcibly clear the ScanTupleSlot in ExecScanReScan. Another issue here, which seems only latent at the moment but could easily become a live bug in future, is that rewinding a cursor does not necessarily lead to immediately applying ExecReScan to every scan-level node in the plan tree. Upper-level nodes will think that they can postpone that call if their child node is already marked with chgParam flags. I don't see a way for that to happen today in a plan tree that's simple enough for execCurrent.c's search_plan_tree to understand, but that's one heck of a fragile assumption. So, add some logic in search_plan_tree to detect chgParam flags being set on nodes that it descended to/through, and assume that that means we should consider lower scan nodes to be logically reset even if their ReScan call hasn't actually happened yet. Per bug #15395 from Matvey Arye. This has been broken for a long time, so back-patch to all supported branches. Discussion: https://postgr.es/m/153764171023.14986.280404050547008575@wrigleys.postgresql.org	2018-09-23 16:05:45 -04:00
Alexander Korotkov	2f39106a20	Replace CAS loop with single TAS in ProcArrayGroupClearXid() Single pg_atomic_exchange_u32() is expected to be faster than loop of pg_atomic_compare_exchange_u32(). Also, it would be consistent with clog group update code. Discussion: https://postgr.es/m/CAPpHfdtxLsC-bqfxFcHswZ91OxXcZVNDBBVfg9tAWU0jvn1tQA%40mail.gmail.com Reviewed-by: Amit Kapila	2018-09-22 16:22:30 +03:00
Michael Paquier	db361db2fc	Make GUC wal_sender_timeout user-settable Being able to use a value that can be changed on a connection basis is useful with clusters distributed geographically, and makes failure detection more flexible. A note is added in the documentation about the use of "options" in primary_conninfo, which can be hard to grasp for newcomers with the need of two single quotes when listing a set of parameters. Author: Tsunakawa Takayuki Reviewed-by: Masahiko Sawada, Michael Paquier Discussion: https://postgr.es/m/0A3221C70F24FB45833433255569204D1FAAD3AE@G01JPEXMBYT05	2018-09-22 15:23:59 +09:00
Thomas Munro	f025bd2ddd	Use size_t consistently in dsa.{ch}. Takeshi Ideriha complained that there is a mixture of Size and size_t in dsa.c and corresponding header. Let's use size_t. Back-patch to 10 where dsa.c landed, to make future back-patching easy. Discussion: https://postgr.es/m/4E72940DA2BF16479384A86D54D0988A6F19ABD9%40G01JPEXMBKW04	2018-09-22 00:40:13 +12:00
Tom Lane	1dba1b61c2	Add a "return" statement to pacify perlcritic. Per buildfarm member crake.	2018-09-20 16:10:23 -04:00
Tom Lane	3dc820c43e	Teach genbki.pl to auto-generate pg_type entries for array types. This eliminates some more tedium in adding new catalog entries, specifically the need to set up an array type when adding a new built-in data type. Now it's sufficient to assign an OID for the array type and write it in an "array_type_oid" metadata field. You don't have to fill the base type's typarray link explicitly, either. No catversion bump since the contents of pg_type aren't changed. (Well, their order might be different, but that doesn't matter.) John Naylor, reviewed and whacked around a bit by Dagfinn Ilmari Mannsåker, and some more by me. Discussion: https://postgr.es/m/CAJVSVGVTb6m9pJF49b3SuA8J+T-THO9c0hxOmoyv-yGKh-FbNg@mail.gmail.com	2018-09-20 15:14:46 -04:00
Alexander Korotkov	09e99ce86e	Fix handling of format string text characters in to_timestamp()/to_date() `cf984672` introduced improvement of handling of spaces and separators in to_timestamp()/to_date() functions. In particular, now we're skipping spaces both before and after fields. That may cause format string text character to consume part of field in the situations, when it didn't happen before `cf984672`. This commit cause format string text character consume input string characters only when since previous field (or string beginning) number of skipped input string characters is not greater than number of corresponding format string characters (that is we didn't skip any extra characters in input string).	2018-09-20 15:48:04 +03:00
Thomas Munro	38763d6778	Fix segment_bins corruption in dsa.c. If a segment has been freed by dsa.c because it is entirely empty, other backends must make sure to unmap it before following links to new segments that might happen to have the same index number, or they could finish up looking at a defunct segment and then corrupt the segment_bins lists. The correct protocol requires checking freed_segment_counter after acquiring the area lock and before resolving any index number to a segment. Add the missing checks and an assertion. Back-patch to 10, where dsa.c first arrived. Author: Thomas Munro Reported-by: Tomas Vondra Discussion: https://postgr.es/m/CAEepm%3D0thg%2Bja5zGVa7jBy-uqyHrTqTm8HGhEOtMmigGrAqTbw%40mail.gmail.com	2018-09-20 15:52:39 +12:00
Thomas Munro	6c3c9d4189	Defer restoration of libraries in parallel workers. Several users of extensions complained of crashes in parallel workers that turned out to be due to syscache access from their _PG_init() functions. Reorder the initialization of parallel workers so that libraries are restored after the caches are initialized, and inside a transaction. This was reported in bug #15350 and elsewhere. We don't consider it to be a bug: extensions shouldn't do that, because then they can't be used in shared_preload_libraries. However, it's a fairly obscure hazard and these extensions worked in practice before parallel query came along. So let's make it work. Later commits might add a warning message and eventually an error. Back-patch to 9.6, where parallel query landed. Author: Thomas Munro Reviewed-by: Amit Kapila Reported-by: Kieran McCusker, Jimmy Discussion: https://postgr.es/m/153512195228.1489.8545997741965926448%40wrigleys.postgresql.org	2018-09-20 14:21:18 +12:00
Tom Lane	0d38e4ebb7	Fix minor error message style guide violation. No periods at the ends of primary error messages, please. Daniel Gustafsson Discussion: https://postgr.es/m/43E004C0-18C6-42B4-A313-003B43EB0571@yesql.se	2018-09-19 17:06:40 -04:00
Tom Lane	8f0de712c3	Don't ignore locktable-full failures in StandbyAcquireAccessExclusiveLock. Commit `37c54863c` removed the code in StandbyAcquireAccessExclusiveLock that checked the return value of LockAcquireExtended. That created a bug, because it's still passing reportMemoryError = false to LockAcquireExtended, meaning that LOCKACQUIRE_NOT_AVAIL will be returned if we're out of shared memory for the lock table. In such a situation, the startup process would believe it had acquired an exclusive lock even though it hadn't, with potentially dire consequences. To fix, just drop the use of reportMemoryError = false, which allows us to simplify the call into a plain LockAcquire(). It's unclear that the locktable-full situation arises often enough that it's worth having a better recovery method than crash-and-restart. (I strongly suspect that the only reason the code path existed at all was that it was relatively simple to do in the pre-37c54863c implementation. But now it's not.) LockAcquireExtended's reportMemoryError parameter is now dead code and could be removed. I refrained from doing so, however, because there was some interest in resurrecting the behavior if we do get reports of locktable-full failures in the field. Also, it seems unwise to remove the parameter concurrently with shipping commit `f868a8143`, which added a parameter; if there are any third-party callers of LockAcquireExtended, we want them to get a wrong-number-of-parameters compile error rather than a possibly-silent misinterpretation of its last parameter. Back-patch to 9.6 where the bug was introduced. Discussion: https://postgr.es/m/6202.1536359835@sss.pgh.pa.us	2018-09-19 12:43:51 -04:00
Alexander Korotkov	2a6368343f	Add support for nearest-neighbor (KNN) searches to SP-GiST Currently, KNN searches were supported only by GiST. SP-GiST also capable to support them. This commit implements that support. SP-GiST scan stack is replaced with queue, which serves as stack if no ordering is specified. KNN support is provided for three SP-GIST opclasses: quad_point_ops, kd_point_ops and poly_ops (catversion is bumped). Some common parts between GiST and SP-GiST KNNs are extracted into separate functions. Discussion: https://postgr.es/m/570825e8-47d0-4732-2bf6-88d67d2d51c8%40postgrespro.ru Author: Nikita Glukhov, Alexander Korotkov based on GSoC work by Vlad Sterzhanov Review: Andrey Borodin, Alexander Korotkov	2018-09-19 01:54:10 +03:00
Tom Lane	d0cfc3d6a4	Add a debugging option to stress-test outfuncs.c and readfuncs.c. In the normal course of operation, query trees will be serialized only if they are stored as views or rules; and plan trees will be serialized only if they get passed to parallel-query workers. This leaves an awful lot of opportunity for bugs/oversights to not get detected, as indeed we've just been reminded of the hard way. To improve matters, this patch adds a new compile option WRITE_READ_PARSE_PLAN_TREES, which is modeled on the longstanding option COPY_PARSE_PLAN_TREES; but instead of passing all parse and plan trees through copyObject, it passes them through nodeToString + stringToNode. Enabling this option in a buildfarm animal or two will catch problems at least for cases that are exercised by the regression tests. A small problem with this idea is that readfuncs.c historically has discarded location fields, on the reasonable grounds that parse locations in a retrieved view are not relevant to the current query. But doing that in WRITE_READ_PARSE_PLAN_TREES breaks pg_stat_statements, and it could cause problems for future improvements that might try to report error locations at runtime. To fix that, provide a variant behavior in readfuncs.c that makes it restore location fields when told to. In passing, const-ify the string arguments of stringToNode and its subsidiary functions, just because it annoyed me that they weren't const already. Discussion: https://postgr.es/m/17114.1537138992@sss.pgh.pa.us	2018-09-18 17:11:54 -04:00
Tom Lane	db1071d4ee	Fix some minor issues exposed by outfuncs/readfuncs testing. A test patch to pass parse and plan trees through outfuncs + readfuncs exposed several issues that need to be fixed to get clean matches: Query.withCheckOptions failed to get copied; it's intentionally ignored by outfuncs/readfuncs on the grounds that it'd always be NIL anyway in stored rules. This seems less than future-proof, and it's not even saving very much, so just undo the decision and treat the field like all others. Several places that convert a view RTE into a subquery RTE, or similar manipulations, failed to clear out fields that were specific to the original RTE type and should be zero in a subquery RTE. Since readfuncs.c will leave such fields as zero, equalfuncs.c thinks the nodes are different leading to a reported mismatch. It seems like a good idea to clear out the no-longer-needed fields, even though in principle nothing should look at them; the node ought to be indistinguishable from how it would look if we'd built a new node instead of scribbling on the old one. BuildOnConflictExcludedTargetlist randomly set the resname of some TargetEntries to "" not NULL. outfuncs/readfuncs don't distinguish those cases, and so the string will read back in as NULL ... but equalfuncs.c does distinguish. Perhaps we ought to try to make things more consistent in this area --- but it's just useless extra code space for BuildOnConflictExcludedTargetlist to not use NULL here, so I fixed it for now by making it do that. catversion bumped because the change in handling of Query.withCheckOptions affects stored rules. Discussion: https://postgr.es/m/17114.1537138992@sss.pgh.pa.us	2018-09-18 15:08:28 -04:00
Tom Lane	09991e5a47	Fix some probably-minor oversights in readfuncs.c. The system expects TABLEFUNC RTEs to have coltypes, coltypmods, and colcollations lists, but outfuncs doesn't dump them and readfuncs doesn't restore them. This doesn't cause obvious failures, because the only things that look at those fields are expandRTE() and get_rte_attribute_type(), which are mostly used during parse analysis, before anything would've passed the parsetree through outfuncs/readfuncs. But expandRTE() is used in build_physical_tlist(), which means that that function will return a wrong answer for a TABLEFUNC RTE that came from a view. Very accidentally, this doesn't cause serious problems, because what it will return is NIL which callers will interpret as "couldn't build a physical tlist because of dropped columns". So you still get a plan that works, though it's marginally less efficient than it could be. There are also some other expandRTE() calls associated with transformation of whole-row Vars in the planner. I have been unable to exhibit misbehavior from that, and it may be unreachable in any case that anyone would care about ... but I'm not entirely convinced, so this seems like something we should back- patch a fix for. Fortunately, we can fix it without forcing a change of stored rules and a catversion bump, because we can just copy these lists from the subsidiary TableFunc object. readfuncs.c was also missing support for NamedTuplestoreScan plan nodes. This accidentally fails to break parallel query because a query using a named tuplestore would never be considered parallel-safe anyway. However, project policy since parallel query came in is that all plan node types should have outfuncs/readfuncs support, so this is clearly an oversight that should be repaired. Noted while fooling around with a patch to test outfuncs/readfuncs more thoroughly. That exposed some other issues too, but these are the only ones that seem worth back-patching. Back-patch to v10 where both of these features came in. Discussion: https://postgr.es/m/17114.1537138992@sss.pgh.pa.us	2018-09-18 13:02:27 -04:00
Thomas Munro	422952ee78	Allow DSM allocation to be interrupted. Chris Travers reported that the startup process can repeatedly try to cancel a backend that is in a posix_fallocate()/EINTR loop and cause it to loop forever. Teach the retry loop to give up if an interrupt is pending. Don't actually check for interrupts in that loop though, because a non-local exit would skip some clean-up code in the caller. Back-patch to 9.4 where DSM was added (and posix_fallocate() was later back-patched). Author: Chris Travers Reviewed-by: Ildar Musin, Murat Kabilov, Oleksii Kliukin Tested-by: Oleksii Kliukin Discussion: https://postgr.es/m/CAN-RpxB-oeZve_J3SM_6%3DHXPmvEG%3DHX%2B9V9pi8g2YR7YW0rBBg%40mail.gmail.com	2018-09-18 22:56:36 +12:00
Michael Paquier	1d6fbc38d9	Refactor routines for subscription and publication lookups Those routines gain a missing_ok argument, allowing a caller to get a NULL result instead of an error if set to true. This is part of a larger refactoring effort for objectaddress.c where trying to check for non-existing objects does not result in cache lookup failures. Author: Michael Paquier Reviewed-by: Aleksander Alekseev, Álvaro Herrera Discussion: https://postgr.es/m/CAB7nPqSZxrSmdHK-rny7z8mi=EAFXJ5J-0RbzDw6aus=wB5azQ@mail.gmail.com	2018-09-18 12:00:18 +09:00
Tom Lane	07a3af0ff8	Fix parsetree representation of XMLTABLE(XMLNAMESPACES(DEFAULT ...)). The original coding for XMLTABLE thought it could represent a default namespace by a T_String Value node with a null string pointer. That's not okay, though; in particular outfuncs.c/readfuncs.c are not on board with such a representation, meaning you'll get a null pointer crash if you try to store a view or rule containing this construct. To fix, change the parsetree representation so that we have a NULL list element, instead of a bogus Value node. This isn't really a functional limitation since default XML namespaces aren't yet implemented in the executor; you'd just get "DEFAULT namespace is not supported" anyway. But crashes are not nice, so back-patch to v10 where this syntax was added. Ordinarily we'd consider a parsetree representation change to be un-backpatchable; but since existing releases would crash on the way to storing such constructs, there can't be any existing views/rules to be incompatible with. Per report from Andrey Lepikhov. Discussion: https://postgr.es/m/3690074f-abd2-56a9-144a-aa5545d7a291@postgrespro.ru	2018-09-17 13:16:32 -04:00
Tom Lane	3844adbf3c	Add outfuncs.c support for RawStmt nodes. I noticed while poking at a report from Andrey Lepikhov that the recent addition of RawStmt nodes at the top of raw parse trees makes it impossible to print any raw parse trees whatsoever, because outfuncs.c doesn't know RawStmt and hence fails to descend into it. While we generally lack outfuncs.c support for utility statements, there is reasonably complete support for what you can find in a raw SELECT statement. It was not my intention to make that all dead code ... so let's add support for RawStmt. Back-patch to v10 where RawStmt appeared.	2018-09-16 13:02:47 -04:00
Tom Lane	8f32bacc00	In v11, disable JIT by default (it's still enabled by default in HEAD). Per discussion, JIT isn't quite mature enough to ship enabled-by-default. I failed to resist the temptation to do a bunch of copy-editing on the related documentation. Also, clean up some inconsistencies in which section of config.sgml the JIT GUCs are documented in vs. what guc.c and postgresql.config.sample had. Discussion: https://postgr.es/m/20180914222657.mw25esrzbcnu6qlu@alap3.anarazel.de	2018-09-15 17:24:35 -04:00
Tom Lane	1f4a920b73	Fix failure with initplans used conditionally during EvalPlanQual rechecks. The EvalPlanQual machinery assumes that any initplans (that is, uncorrelated sub-selects) used during an EPQ recheck would have already been evaluated during the main query; this is implicit in the fact that execPlan pointers are not copied into the EPQ estate's es_param_exec_vals. But it's possible for that assumption to fail, if the initplan is only reached conditionally. For example, a sub-select inside a CASE expression could be reached during a recheck when it had not been previously, if the CASE test depends on a column that was just updated. This bug is old, appearing to date back to my rewrite of EvalPlanQual in commit `9f2ee8f28`, but was not detected until Kyle Samson reported a case. To fix, force all not-yet-evaluated initplans used within the EPQ plan subtree to be evaluated at the start of the recheck, before entering the EPQ environment. This could be inefficient, if such an initplan is expensive and goes unused again during the recheck --- but that's piling one layer of improbability atop another. It doesn't seem worth adding more complexity to prevent that, at least not in the back branches. It was convenient to use the new-in-v11 ExecEvalParamExecParams function to implement this, but I didn't like either its name or the specifics of its API, so revise that. Back-patch all the way. Rather than rewrite the patch to avoid depending on bms_next_member() in the oldest branches, I chose to back-patch that function into 9.4 and 9.3. (This isn't the first time back-patches have needed that, and it exhausted my patience.) I also chose to back-patch some test cases added by commits `71404af2a` and `342a1ffa2` into 9.4 and 9.3, so that the 9.x versions of eval-plan-qual.spec are all the same. Andrew Gierth diagnosed the problem and contributed the added test cases, though the actual code changes are by me. Discussion: https://postgr.es/m/A033A40A-B234-4324-BE37-272279F7B627@tripadvisor.com	2018-09-15 13:42:33 -04:00
Alvaro Herrera	6b78231d91	Move PartitionDispatchData struct definition to execPartition.c There's no reason to expose the struct definition, so don't. Author: Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> Discussion: https://postgr.es/m/d3fa24c1-bc65-7133-81df-6474387ccc4f@lab.ntt.co.jp	2018-09-14 19:06:57 -03:00
Alvaro Herrera	20bef2c311	Fix ALTER/TYPE on columns referenced by FKs in partitioned tables When ALTER TABLE ... SET DATA TYPE affects a column referenced by constraints and indexes, it drop those constraints and indexes and recreates them afterwards, so that the definitions match the new data type. The original code did this by dropping one object at a time (commit `077db40fa1` of May 2004), which worked fine because the dependencies between the objects were pretty straightforward, and ordering the objects in a specific way was enough to make this work. However, when there are foreign key constraints in partitioned tables, the dependencies are no longer so straightforward, and we were getting errors when attempted: ERROR: cache lookup failed for constraint 16398 This can be fixed by doing all the drops in one pass instead, using performMultipleDeletions (introduced by `df18c51f29` of Aug 2006). With this change we can also remove the code to carefully order the list of objects to be deleted. Reported-by: Rajkumar Raghuwanshi <rajkumar.raghuwanshi@enterprisedb.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAKcux6nWS_m+s=1Udk_U9B+QY7pA-Ac58qR5BdUfOyrwnWHDew@mail.gmail.com	2018-09-14 13:41:20 -03:00
Andrew Gierth	728202b63c	Order active window clauses for greater reuse of Sort nodes. By sorting the active window list lexicographically by the sort clause list but putting longer clauses before shorter prefixes, we generate more chances to elide Sort nodes when building the path. Author: Daniel Gustafsson (with some editorialization by me) Reviewed-by: Alexander Kuzmenkov, Masahiko Sawada, Tom Lane Discussion: https://postgr.es/m/124A7F69-84CD-435B-BA0E-2695BE21E5C2%40yesql.se	2018-09-14 17:35:42 +01:00
Amit Kapila	75f9c4ca5a	Don't allow LIMIT/OFFSET clause within sub-selects to be pushed to workers. Allowing sub-select containing LIMIT/OFFSET in workers can lead to inconsistent results at the top-level as there is no guarantee that the row order will be fully deterministic. The fix is to prohibit pushing LIMIT/OFFSET within sub-selects to workers. Reported-by: Andrew Fletcher Bug: 15324 Author: Amit Kapila Reviewed-by: Dilip Kumar Backpatch-through: 9.6 Discussion: https://postgr.es/m/153417684333.10284.11356259990921828616@wrigleys.postgresql.org	2018-09-14 09:36:30 +05:30
Michael Paquier	28a8fa984c	Improve autovacuum logging for aggressive and anti-wraparound runs A log message was being generated when log_min_duration is reached for autovacuum on a given relation to indicate if it was an aggressive run, and missed the point of mentioning if it is doing an anti-wrapround run. The log message generated is improved so as one, both or no extra details are added depending on the option set. Author: Sergei Kornilov Reviewed-by: Masahiko Sawada, Michael Paquier Discussion: https://postgr.es/m/11587951532155118@sas1-19a94364928d.qloud-c.yandex.net	2018-09-14 07:35:39 +09:00
Amit Kapila	bc153c941d	Attach FPI to the first record after full_page_writes is turned on. XLogInsert fails to attach a required FPI to the first record after full_page_writes is turned on by the last checkpoint. This bug got introduced in 9.5 due to code rearrangement in commits `2c03216d83` and `2076db2aea`. Fix it by ensuring that XLogInsertRecord performs a recomputation when the given record is generated with FPW as off but found that the flag has been turned on while actually inserting the record. Reported-by: Kyotaro Horiguchi Author: Kyotaro Horiguchi Reviewed-by: Amit Kapila Backpatch-through: 9.5 where this problem was introduced Discussion: https://postgr.es/m/20180420.151043.74298611.horiguchi.kyotaro@lab.ntt.co.jp	2018-09-13 15:32:50 +05:30
Michael Paquier	514a731ddc	Simplify static function in extension.c An extra argument for the filename defining the extension script location was present, aimed at being used for error reporting, but has never been used. This was around since extensions have been added in `d9572c4`. Author: Yugo Nagata Reviewed-by: Tatsuo Ishii Discussion: https://postgr.es/m/20180907180504.1ff19e1675bb44a67e9c7ab1@sraoss.co.jp	2018-09-13 16:56:57 +09:00
Peter Eisentraut	e5f1bb92cf	Simplify index tuple descriptor initialization We have two code paths for initializing the tuple descriptor for a new index: For a normal index, we copy the tuple descriptor from the table and reset a number of fields that are not applicable to indexes. For an expression index, we make a blank tuple descriptor and fill in the needed fields based on the provided expressions. As pg_attribute has grown over time, the number of fields that we need to reset in the first case is now bigger than the number of fields we actually want to copy, so it's sensible to do it the other way around: Make a blank descriptor and copy just the fields we need. This also allows more code sharing between the two branches, and it avoids having to touch this code for almost every unrelated change to the pg_attribute structure. Reviewed-by: Arthur Zakirov <a.zakirov@postgrespro.ru>	2018-09-13 08:22:03 +02:00
Andrew Gierth	b7f6bcbffc	Repair bug in regexp split performance improvements. Commit `c8ea87e4b` introduced a temporary conversion buffer for substrings extracted during regexp splits. Unfortunately the code that sized it was failing to ignore the effects of ignored degenerate regexp matches, so for regexp_split_* calls it could under-size the buffer in such cases. Fix, and add some regression test cases (though those will only catch the bug if run in a multibyte encoding). Backpatch to 9.3 as the faulty code was. Thanks to the PostGIS project, Regina Obe and Paul Ramsey for the report (via IRC) and assistance in analysis. Patch by me.	2018-09-12 19:31:06 +01:00
Tom Lane	fedc97cdfd	Remove ruleutils.c's special case for BIT [VARYING] literals. Up to now, get_const_expr() insisted on prefixing BIT and VARBIT literals with 'B'. That's not really necessary, because we always append explicit-cast syntax to identify the constant's type. Moreover, it's subtly wrong for VARBIT, because the parser will interpret B'...' as '...'::"bit"; see make_const() which explicitly assigns type BITOID for a T_BitString literal. So what had been a simple VARBIT literal is reconstructed as ('...'::"bit")::varbit, which is not the same thing, at least not before constant folding. This results in odd differences after dump/restore, as complained of by the patch submitter, and it could result in actual failures in partitioning or inheritance DDL operations (see commit `542320c2b`, which repaired similar misbehaviors for some other data types). Fixing it is pretty easy: just remove the special case and let the default code path handle these types. We could have kept the special case for BIT only, but there seems little point in that. Like the previous patch, I judge that back-patching this into stable branches wouldn't be a good idea. However, it seems not quite too late for v11, so let's fix it there. Paul Guo, reviewed by Davy Machado and John Naylor, minor adjustments by me Discussion: https://postgr.es/m/CABQrizdTra=2JEqA6+Ms1D1k1Kqw+aiBBhC9TreuZRX2JzxLAA@mail.gmail.com	2018-09-11 16:32:25 -04:00
Andrew Gierth	500d49794f	Repair double-free in SP-GIST rescan (bug #15378 ) spgrescan would first reset traversalCxt, and then traverse a potentially non-empty stack containing pointers to traversalValues which had been allocated in those contexts, freeing them a second time. This bug originates in commit `ccd6eb49a` where traversalValue was introduced. Repair by traversing the stack before the context reset; this isn't ideal, since it means doing retail pfree in a context that's about to be reset, but the freeing of a stack entry is also done in other places in the code during the scan so it's not worth trying to refactor it further. Regression test added. Backpatch to 9.6 where the problem was introduced. Per bug #15378; analysis and patch by me, originally from a report on IRC by user velix; see also PostGIS ticket #4174; review by Alexander Korotkov. Discussion: https://postgr.es/m/153663176628.23136.11901365223750051490@wrigleys.postgresql.org	2018-09-11 18:14:19 +01:00
Alexander Korotkov	cf98467242	Improve behavior of to_timestamp()/to_date() functions to_timestamp()/to_date() functions were introduced mainly for Oracle compatibility, and became very popular among PostgreSQL users. However, some behavior of to_timestamp()/to_date() functions are both incompatible with Oracle and confusing for our users. This behavior is related to handling of spaces and separators in non FX (fixed format) mode. This commit reworks this behavior making less confusing, better documented and more compatible with Oracle. Nevertheless, there are still following incompatibilities with Oracle. 1) We don't insist that there are no format string patterns unmatched to input string. 2) In FX mode we don't insist space and separators in format string to exactly match input string. 3) When format string patterns are divided by mix of spaces and separators, we don't distinguish them, while Oracle takes into account only last group of spaces/separators. Discussion: https://postgr.es/m/1873520224.1784572.1465833145330.JavaMail.yahoo%40mail.yahoo.com Author: Artur Zakirov, Alexander Korotkov, Liudmila Mantrova Review: Amul Sul, Robert Haas, Tom Lane, Dmitry Dolgov, David G. Johnston	2018-09-09 21:19:51 +03:00
Alexander Korotkov	5f08accdad	Fix past pd_upper write in ginRedoRecompress() ginRedoRecompress() replays actions over compressed segments of posting list in-place. However, it might lead to write past pg_upper, because intermediate state during playing the changes can take more space than both original state and final state. This commit fixes that by refuse from in-place modification. Instead page tail is copied once modification is started, and then it's used as the source of original segments. Backpatch to 9.4 where posting list compression was introduced. Reported-by: Sivasubramanian Ramasubramanian Discussion: https://postgr.es/m/1536091151804.6588%40amazon.com Author: Alexander Korotkov based on patch from and ideas by Sivasubramanian Ramasubramanian Review: Sivasubramanian Ramasubramanian Backpatch-through: 9.4	2018-09-09 21:19:29 +03:00
Tom Lane	ff47d4bf1f	Work around stdbool problem in dfmgr.c. Commit `842cb9fa6` refactored things so that dfmgr.c includes <dlfcn.h>, which before that had only been directly included in platform-specific stub files. It turns out that on macOS, <dlfcn.h> includes <stdbool.h>, and that causes problems on platforms where _Bool is not char-sized ... which happens to include the PPC versions of macOS. Work around it much as we have in plperl.h, by #undef'ing bool after including the problematic file, but only if we're not using stdbool-style booleans. Discussion: https://postgr.es/m/E1fxqjl-0003YS-NS@gemulon.postgresql.org	2018-09-09 12:41:27 -04:00
Tom Lane	ed0cdf0e05	Install a check for mis-linking of src/port and src/common functions. On ELF-based platforms (and maybe others?) it's possible for a shared library, when dynamically loaded into the backend, to call the backend versions of src/port and src/common functions rather than the frontend versions that are actually linked into the shlib. This is definitely not what we want, because the frontend versions often behave slightly differently. Up to now it's been "slight" enough that nobody noticed; but with the addition of SCRAM support functions in src/common, we're observing crashes due to the difference between palloc and malloc memory allocation rules, as reported in bug #15367 from Jeremy Evans. The purpose of this patch is to create a direct test for this type of mis-linking, so that we know whether any given platform requires extra measures to prevent using the wrong functions. If the test fails, it will lead to connection failures in the contrib/postgres_fdw regression test. At the moment, *BSD platforms using ELF format are known to have the problem and can be expected to fail; but we need to know whether anything else does, and we need a reliable ongoing check for future platforms. Actually fixing the problem will be the subject of later commit(s). Discussion: https://postgr.es/m/153626613985.23143.4743626885618266803@wrigleys.postgresql.org	2018-09-09 12:23:23 -04:00
Tom Lane	f47f314801	Minor cleanup/future-proofing for pg_saslprep(). Ensure that pg_saslprep() initializes its output argument to NULL in all failure paths, and then remove the redundant initialization that some (not all) of its callers did. This does not fix any live bug, but it reduces the odds of future bugs of omission. Also add a comment about why the existing failure-path coding is adequate. Back-patch so as to keep the function's API consistent across branches, again to forestall future bug introduction. Patch by me, reviewed by Michael Paquier Discussion: https://postgr.es/m/16558.1536407783@sss.pgh.pa.us	2018-09-08 18:20:36 -04:00
Michael Paquier	9226a3b89b	Remove duplicated words split across lines in comments This has been detected using some interesting tricks with sed, and the method used is mentioned in details in the discussion below. Author: Justin Pryzby Discussion: https://postgr.es/m/20180908013109.GB15350@telsasoft.com	2018-09-08 12:24:19 -07:00
Tom Lane	361844fe56	Save/restore SPI's global variables in SPI_connect() and SPI_finish(). This patch removes two sources of interference between nominally independent functions when one SPI-using function calls another, perhaps without knowing that it does so. Chapman Flack pointed out that xml.c's query_to_xml_internal() expects SPI_tuptable and SPI_processed to stay valid across datatype output function calls; but it's possible that such a call could involve re-entrant use of SPI. It seems likely that there are similar hazards elsewhere, if not in the core code then in third-party SPI users. Previously SPI_finish() reset SPI's API globals to zeroes/nulls, which would typically make for a crash in such a situation. Restoring them to the values they had at SPI_connect() seems like a considerably more useful behavior, and it still meets the design goal of not leaving any dangling pointers to tuple tables of the function being exited. Also, cause SPI_connect() to reset these variables to zeroes/nulls after saving them. This prevents interference in the opposite direction: it's possible that a SPI-using function that's only ever been tested standalone contains assumptions that these variables start out as zeroes. That was the case as long as you were the outermost SPI user, but not so much for an inner user. Now it's consistent. Report and fix suggestion by Chapman Flack, actual patch by me. Back-patch to all supported branches. Discussion: https://postgr.es/m/9fa25bef-2e4f-1c32-22a4-3ad0723c4a17@anastigmatix.net	2018-09-07 20:09:57 -04:00
Tom Lane	f510412df3	Limit depth of forced recursion for CLOBBER_CACHE_RECURSIVELY. It's somewhat surprising that we got away with this before. (Actually, since nobody tests this routinely AFAIK, it might've been broken for awhile. But it's definitely broken in the wake of commit f868a8143.) It seems sufficient to limit the forced recursion to a small number of levels. Back-patch to all supported branches, like the preceding patch. Discussion: https://postgr.es/m/12259.1532117714@sss.pgh.pa.us	2018-09-07 18:13:29 -04:00
Tom Lane	f868a8143a	Fix longstanding recursion hazard in sinval message processing. LockRelationOid and sibling routines supposed that, if our session already holds the lock they were asked to acquire, they could skip calling AcceptInvalidationMessages on the grounds that we must have already read any remote sinval messages issued against the relation being locked. This is normally true, but there's a critical special case where it's not: processing inside AcceptInvalidationMessages might attempt to access system relations, resulting in a recursive call to acquire a relation lock. Hence, if the outer call had acquired that same system catalog lock, we'd fall through, despite the possibility that there's an as-yet-unread sinval message for that system catalog. This could, for example, result in failure to access a system catalog or index that had just been processed by VACUUM FULL. This is the explanation for buildfarm failures we've been seeing intermittently for the past three months. The bug is far older than that, but commits `a54e1f158` et al added a new recursion case within AcceptInvalidationMessages that is apparently easier to hit than any previous case. To fix this, we must not skip calling AcceptInvalidationMessages until we have finished a call to it since acquiring a relation lock, not merely acquired the lock. (There's already adequate logic inside AcceptInvalidationMessages to deal with being called recursively.) Fortunately, we can implement that at trivial cost, by adding a flag to LOCALLOCK hashtable entries that tracks whether we know we have completed such a call. There is an API hazard added by this patch for external callers of LockAcquire: if anything is testing for LOCKACQUIRE_ALREADY_HELD, it might be fooled by the new return code LOCKACQUIRE_ALREADY_CLEAR into thinking the lock wasn't already held. This should be a fail-soft condition, though, unless something very bizarre is being done in response to the test. Also, I added an additional output argument to LockAcquireExtended, assuming that that probably isn't called by any outside code given the very limited usefulness of its additional functionality. Back-patch to all supported branches. Discussion: https://postgr.es/m/12259.1532117714@sss.pgh.pa.us	2018-09-07 18:04:54 -04:00
Michael Paquier	8582b4d044	Improve handling of corrupted two-phase state files at recovery When a corrupted two-phase state file is found by WAL replay, be it for crash recovery or archive recovery, then the file is simply skipped and a WARNING is logged to the user, causing the transaction to be silently lost. Facing an on-disk WAL file which is corrupted is as likely to happen as what is stored in WAL records, but WAL records are already able to fail hard if there is a CRC mismatch. On-disk two-phase state files, on the contrary, are simply ignored if corrupted. Note that when restoring the initial two-phase data state at recovery, files newer than the horizon XID are discarded hence no files present in pg_twophase/ should be torned and have been made durable by a previous checkpoint, so recovery should never see any corrupted two-phase state file by design. The situation got better since `978b2f6` which has added two-phase state information directly in WAL instead of using on-disk files, so the risk is limited to two-phase transactions which live across at least one checkpoint for long periods. Backups having legit two-phase state files on-disk could also lose silently transactions when restored if things get corrupted. This behavior exists since two-phase commit has been introduced, no back-patch is done for now per the lack of complaints about this problem. Author: Michael Paquier Discussion: https://postgr.es/m/20180709050309.GM1467@paquier.xyz	2018-09-07 11:00:16 -07:00
Peter Eisentraut	98afa68d93	Use C99 designated initializers for some structs These are just a few particularly egregious cases that were hard to read and write, and error prone because of many similar adjacent types. Discussion: https://www.postgresql.org/message-id/flat/4c9f01be-9245-2148-b569-61a8562ef190%402ndquadrant.com	2018-09-07 11:40:03 +02:00
Peter Eisentraut	842cb9fa62	Refactor dlopen() support Nowadays, all platforms except Windows and older HP-UX have standard dlopen() support. So having a separate implementation per platform under src/backend/port/dynloader/ is a bit excessive. Instead, treat dlopen() like other library functions that happen to be missing sometimes and put a replacement implementation under src/port/. Discussion: https://www.postgresql.org/message-id/flat/e11a49cb-570a-60b7-707d-7084c8de0e61%402ndquadrant.com#54e735ae37476a121abb4e33c2549b03	2018-09-06 11:33:04 +02:00
Tom Lane	54b01b9293	Remove no-longer-used variable. Oversight in `2fbdf1b38`. Per buildfarm.	2018-09-05 14:29:58 -04:00
Alvaro Herrera	2fbdf1b38b	Simplify partitioned table creation vs. relcache In the original code, we were storing the pg_inherits row for a partitioned table too early: enough that we had a hack for relcache to avoid falling flat on its face while reading such a partial entry. If we finish the pg_class creation first and then store the pg_inherits entry, we don't need that hack. Also recognize that pg_class.relpartbound is not marked NOT NULL and therefore it's entirely possible to read null values, so having only Assert() protection isn't enough. Change those to if/elog tests instead. This qualifies as a robustness fix, so backpatch to pg11. In passing, remove one access that wasn't actually needed, and reword one message to be like all the others that check for the same thing. Reviewed-by: Amit Langote Discussion: https://postgr.es/m/20180903213916.hh6wasnrdg6xv2ud@alvherre.pgsql	2018-09-05 14:36:13 -03:00
Michael Paquier	d6e98ebe37	Improve some error message strings and errcodes This makes a bit less work for translators, by unifying error strings a bit more with what the rest of the code does, this time for three error strings in autoprewarm and one in base backup code. After some code review of slot.c, some file-access errcodes are reported but lead to an incorrect internal error, while corrupted data makes the most sense, similarly to the previous work done in `e41d0a1`. Also, after calling rmtree(), a WARNING gets reported, which is a duplicate of what the internal call report, so make the code more consistent with all other code paths calling this function. Author: Michael Paquier Discussion: https://postgr.es/m/20180902200747.GC1343@paquier.xyz	2018-09-04 11:06:04 -07:00
Tom Lane	17b7c302b5	Fully enforce uniqueness of constraint names. It's been true for a long time that we expect names of table and domain constraints to be unique among the constraints of that table or domain. However, the enforcement of that has been pretty haphazard, and it missed some corner cases such as creating a CHECK constraint and then an index constraint of the same name (as per recent report from André Hänsel). Also, due to the lack of an actual unique index enforcing this, duplicates could be created through race conditions. Moreover, the code that searches pg_constraint has been quite inconsistent about how to handle duplicate names if one did occur: some places checked and threw errors if there was more than one match, while others just processed the first match they came to. To fix, create a unique index on (conrelid, contypid, conname). Since either conrelid or contypid is zero, this will separately enforce uniqueness of constraint names among constraints of any one table and any one domain. (If we ever implement SQL assertions, and put them into this catalog, more thought might be needed. But it'd be at least as reasonable to put them into a new catalog; having overloaded this one catalog with two kinds of constraints was a mistake already IMO.) This index can replace the existing non-unique index on conrelid, though we need to keep the one on contypid for query performance reasons. Having done that, we can simplify the logic in various places that either coped with duplicates or neglected to, as well as potentially improve lookup performance when searching for a constraint by name. Also, as per our usual practice, install a preliminary check so that you get something more friendly than a unique-index violation report in the case complained of by André. And teach ChooseIndexName to avoid choosing autogenerated names that would draw such a failure. While it's not possible to make such a change in the back branches, it doesn't seem quite too late to put this into v11, so do so. Discussion: https://postgr.es/m/0c1001d4428f$0942b430$1bc81c90$@webkr.de	2018-09-04 13:45:35 -04:00
Amit Kapila	14e9b2a752	Prohibit pushing subqueries containing window function calculation to workers. Allowing window function calculation in workers leads to inconsistent results because if the input row ordering is not fully deterministic, the output of window functions might vary across workers. The fix is to treat them as parallel-restricted. In the passing, improve the coding pattern in max_parallel_hazard_walker so that it has a chain of mutually-exclusive if ... else if ... else if ... else if ... IsA tests. Reported-by: Marko Tiikkaja Bug: 15324 Author: Amit Kapila Reviewed-by: Tom Lane Backpatch-through: 9.6 Discussion: https://postgr.es/m/CAL9smLAnfPJCDUUG4ckX2iznj53V7VSMsYefzZieN93YxTNOcw@mail.gmail.com	2018-09-04 10:28:08 +05:30
Amit Kapila	7c9e19ca9a	During the split, set checksum on an empty hash index page. On a split, we allocate a new splitpoint's worth of bucket pages wherein we initialize the last page with zeros which is fine, but we forgot to set the checksum for that last page. We decided to back-patch this fix till 10 because we don't have an easy way to test it in prior versions. Another reason is that the hash-index code is changed heavily in 10, so it is not advisable to push the fix without testing it in prior versions. Author: Amit Kapila Reviewed-by: Yugo Nagata Backpatch-through: 10 Discussion: https://postgr.es/m/5d03686d-727c-dbf8-0064-bf8b97ffe850@2ndquadrant.com	2018-09-04 08:35:42 +05:30
Alvaro Herrera	c076f3d74a	Remove pg_constraint.conincluding This column was added in commit `8224de4f42` ("Indexes with INCLUDE columns and their support in B-tree") to ease writing the ruleutils.c supporting code for that feature, but it turns out to be unnecessary -- we can do the same thing with just one more syscache lookup. Even the documentation for the new column being removed in this commit is awkward. Discussion: https://postgr.es/m/20180902165018.33otxftp3olgtu4t@alvherre.pgsql	2018-09-03 12:59:26 -03:00
Tomas Vondra	4ddd8f5f55	Fix memory leak in TRUNCATE decoding When decoding a TRUNCATE record, the relids array was being allocated in the main ReorderBuffer memory context, but not released with the change resulting in a memory leak. The array was also ignored when serializing/deserializing the change, assuming all the information is stored in the change itself. So when spilling the change to disk, we've only we have serialized only the pointer to the relids array. Thanks to never releasing the array, the pointer however remained valid even after loading the change back to memory, preventing an actual crash. This fixes both the memory leak and (de)serialization. The relids array is still allocated in the main ReorderBuffer memory context (none of the existing ones seems like a good match, and adding an extra context seems like an overkill). The allocation is wrapped in a new ReorderBuffer API functions, to keep the details within reorderbuffer.c, just like the other ReorderBufferGet methods do. Author: Tomas Vondra Discussion: https://www.postgresql.org/message-id/flat/66175a41-9342-2845-652f-1bd4c3ee50aa%402ndquadrant.com Backpatch: 11, where decoding of TRUNCATE was introduced	2018-09-03 02:10:24 +02:00
Michael Paquier	caa0c6ceba	Fix initial sync of slot parent directory when restoring status At the beginning of recovery, information from replication slots is recovered from disk to memory. In order to ensure the durability of the information, the status file as well as its parent directory are synced. It happens that the sync on the parent directory was done directly using the status file path, which is logically incorrect, and the current code has been doing a sync on the same object twice in a row. Reported-by: Konstantin Knizhnik Diagnosed-by: Konstantin Knizhnik Author: Michael Paquier Discussion: https://postgr.es/m/9eb1a6d5-b66f-2640-598d-c5ea46b8f68a@postgrespro.ru Backpatch-through: 9.4-	2018-09-02 12:40:30 -07:00
Tom Lane	44cac93464	Avoid using potentially-under-aligned page buffers. There's a project policy against using plain "char buf[BLCKSZ]" local or static variables as page buffers; preferred style is to palloc or malloc each buffer to ensure it is MAXALIGN'd. However, that policy's been ignored in an increasing number of places. We've apparently got away with it so far, probably because (a) relatively few people use platforms on which misalignment causes core dumps and/or (b) the variables chance to be sufficiently aligned anyway. But this is not something to rely on. Moreover, even if we don't get a core dump, we might be paying a lot of cycles for misaligned accesses. To fix, invent new union types PGAlignedBlock and PGAlignedXLogBlock that the compiler must allocate with sufficient alignment, and use those in place of plain char arrays. I used these types even for variables where there's no risk of a misaligned access, since ensuring proper alignment should make kernel data transfers faster. I also changed some places where we had been palloc'ing short-lived buffers, for coding style uniformity and to save palloc/pfree overhead. Since this seems to be a live portability hazard (despite the lack of field reports), back-patch to all supported versions. Patch by me; thanks to Michael Paquier for review. Discussion: https://postgr.es/m/1535618100.1286.3.camel@credativ.de	2018-09-01 15:27:17 -04:00
Alexander Korotkov	ec74369931	Implement "pg_ctl logrotate" command Currently there are two ways to trigger log rotation in logging collector process: call pg_rotate_logfile() SQL-function or send SIGUSR1 signal directly to logging collector process. However, it's nice to have more suitable way for external tools to do that, which wouldn't require SQL connection or knowledge of logging collector pid. This commit implements triggering log rotation by "pg_ctl logrotate" command. Discussion: https://postgr.es/m/20180416.115435.28153375.horiguchi.kyotaro%40lab.ntt.co.jp Author: Kyotaro Horiguchi, Alexander Kuzmenkov, Alexander Korotkov	2018-09-01 19:46:49 +03:00
Noah Misch	ab0ed6153a	Ignore server-side delays when enforcing wal_sender_timeout. Healthy clients of servers having poor I/O performance, such as buildfarm members hamster and tern, saw unexpected timeouts. That disagreed with documentation. This fix adds one gettimeofday() call whenever ProcessRepliesIfAny() finds no client reply messages. Back-patch to 9.4; the bug's symptom is rare and mild, and the code all moved between 9.3 and 9.4. Discussion: https://postgr.es/m/20180826034600.GA1105084@rfd.leadboat.com	2018-08-31 22:59:58 -07:00
Michael Paquier	c186ba135e	Ensure correct minimum consistent point on standbys Startup process has improved its calculation of incorrect minimum consistent point in `8d68ee6`, which ensures that all WAL available gets replayed when doing crash recovery, and has introduced an incorrect calculation of the minimum recovery point for non-startup processes, which can cause incorrect page references on a standby when for example the background writer flushed a couple of pages on-disk but was not updating the control file to let a subsequent crash recovery replay to where it should have. The only case where this has been reported to be a problem is when a standby needs to calculate the latest removed xid when replaying a btree deletion record, so one would need connections on a standby that happen just after recovery has thought it reached a consistent point. Using a background worker which is started after the consistent point is reached would be the easiest way to get into problems if it connects to a database. Having clients which attempt to connect periodically could also be a problem, but the odds of seeing this problem are much lower. The fix used is pretty simple, as the idea is to give access to the minimum recovery point written in the control file to non-startup processes so as they use a reference, while the startup process still initializes its own references of the minimum consistent point so as the original problem with incorrect page references happening post-promotion with a crash do not show up. Reported-by: Alexander Kukushkin Diagnosed-by: Alexander Kukushkin Author: Michael Paquier Reviewed-by: Kyotaro Horiguchi, Alexander Kukushkin Discussion: https://postgr.es/m/153492341830.1368.3936905691758473953@wrigleys.postgresql.org Backpatch-through: 9.3	2018-08-31 11:03:40 -07:00
Etsuro Fujita	7cfdc77023	Disable support for partitionwise joins in problematic cases. Commit `f49842d`, which added support for partitionwise joins, built the child's tlist by applying adjust_appendrel_attrs() to the parent's. So in the case where the parent's included a whole-row Var for the parent, the child's contained a ConvertRowtypeExpr. To cope with that, that commit added code to the planner, such as setrefs.c, but some code paths still assumed that the tlist for a scan (or join) rel would only include Vars and PlaceHolderVars, which was true before that commit, causing errors: * When creating an explicit sort node for an input path for a mergejoin path for a child join, prepare_sort_from_pathkeys() threw the 'could not find pathkey item to sort' error. * When deparsing a relation participating in a pushed down child join as a subquery in contrib/postgres_fdw, get_relation_column_alias_ids() threw the 'unexpected expression in subquery output' error. * When performing set_plan_references() on a local join plan generated by contrib/postgres_fdw for EvalPlanQual support for a pushed down child join, fix_join_expr() threw the 'variable not found in subplan target lists' error. To fix these, two approaches have been proposed: one by Ashutosh Bapat and one by me. While the former keeps building the child's tlist with a ConvertRowtypeExpr, the latter builds it with a whole-row Var for the child not to violate the planner assumption, and tries to fix it up later, But both approaches need more work, so refuse to generate partitionwise join paths when whole-row Vars are involved, instead. We don't need to handle ConvertRowtypeExprs in the child's tlists for now, so this commit also removes the changes to the planner. Previously, partitionwise join computed attr_needed data for each child separately, and built the child join's tlist using that data, which also required an extra step for adding PlaceHolderVars to that tlist, but it would be more efficient to build it from the parent join's tlist through the adjust_appendrel_attrs() transformation. So this commit builds that list that way, and simplifies build_joinrel_tlist() and placeholder.c as well as part of set_append_rel_size() to basically what they were before partitionwise join went in. Back-patch to PG11 where partitionwise join was introduced. Report by Rajkumar Raghuwanshi. Analysis by Ashutosh Bapat, who also provided some of regression tests. Patch by me, reviewed by Robert Haas. Discussion: https://postgr.es/m/CAKcux6ktu-8tefLWtQuuZBYFaZA83vUzuRd7c1YHC-yEWyYFpg@mail.gmail.com	2018-08-31 20:34:06 +09:00
Etsuro Fujita	2e39f69b66	Remove extra word from src/backend/optimizer/README	2018-08-31 16:40:17 +09:00
Peter Eisentraut	1e5e4efd02	Error position support for partition specifications Add support for error position reporting for partition specifications. Reviewed-by: Fabien COELHO <coelho@cri.ensmp.fr>	2018-08-30 08:20:23 +02:00
Peter Eisentraut	a4a232b1e7	Error position support for defaults and check constraints Add support for error position reporting for the expressions contained in defaults and check constraint definitions. This currently works only for CREATE TABLE, not ALTER TABLE, because the latter is not set up to pass around the original query string. Reviewed-by: Fabien COELHO <coelho@cri.ensmp.fr>	2018-08-30 08:20:23 +02:00
Michael Paquier	55875b6d2a	Stop bgworkers during fast shutdown with postmaster in startup phase When a postmaster gets into its phase PM_STARTUP, it would start background workers using BgWorkerStart_PostmasterStart mode immediately, which would cause problems for a fast shutdown as the postmaster forgets to send SIGTERM to already-started background workers. With smart and immediate shutdowns, this correctly happened, and fast shutdown is the only mode missing the shot. Author: Alexander Kukushkin Reviewed-by: Michael Paquier Discussion: https://postgr.es/m/CAFh8B=mvnD8+DZUfzpi50DoaDfZRDfd7S=gwj5vU9GYn8UvHkA@mail.gmail.com Backpatch-through: 9.5	2018-08-29 17:10:02 -07:00
Andrew Gierth	c8ea87e4bd	Avoid quadratic slowdown in regexp match/split functions. regexp_matches, regexp_split_to_table and regexp_split_to_array all work by compiling a list of match positions as character offsets (NOT byte positions) in the source string. Formerly, they then used text_substr to extract the matched text; but in a multi-byte encoding, that counts the characters in the string, and the characters needed to reach the starting byte position, on every call. Accordingly, the performance degraded as the product of the input string length and the number of match positions, such that splitting a string of a few hundred kbytes could take many minutes. Repair by keeping the wide-character copy of the input string available (only in the case where encoding_max_length is not 1) after performing the match operation, and extracting substrings from that instead. This reduces the complexity to being linear in the number of result bytes, discounting the actual regexp match itself (which is not affected by this patch). In passing, remove cleanup using retail pfree() which was obsoleted by commit `ff428cded` (Feb 2008) which made cleanup of SRF multi-call contexts automatic. Also increase (to ~134 million) the maximum number of matches and provide an error message when it is reached. Backpatch all the way because this has been wrong forever. Analysis and patch by me; review by Kaiting Chen. Discussion: https://postgr.es/m/87pnyn55qh.fsf@news-spur.riddles.org.uk see also https://postgr.es/m/87lg996g4r.fsf@news-spur.riddles.org.uk	2018-08-28 12:17:33 +01:00
Peter Eisentraut	7a3b7bbfde	Fix snapshot leak warning for some procedures The problem arises with the combination of CALL with output parameters and doing a COMMIT inside the procedure. When a CALL has output parameters, the portal uses the strategy PORTAL_UTIL_SELECT instead of PORTAL_MULTI_QUERY. Using PORTAL_UTIL_SELECT causes the portal's snapshot to be registered with the current resource owner (portal->holdSnapshot); see `9ee1cf04ab` for the reason. Normally, PortalDrop() unregisters the snapshot. If not, then ResourceOwnerRelease() will print a warning about a snapshot leak on transaction commit. A transaction commit normally drops all portals (PreCommit_Portals()), except the active portal. So in case of the active portal, we need to manually release the snapshot to avoid the warning. Reported-by: Prabhat Sahu <prabhat.sahu@enterprisedb.com> Reviewed-by: Jonathan S. Katz <jkatz@postgresql.org>	2018-08-27 22:16:15 +02:00
Michael Paquier	a556549d7e	Improve VACUUM and ANALYZE by avoiding early lock queue A caller of VACUUM can perform early lookup obtention which can cause other sessions to block on the request done, causing potentially DOS attacks as even a non-privileged user can attempt a vacuum fill of a critical catalog table to block even all incoming connection attempts. Contrary to TRUNCATE, a client could attempt a system-wide VACUUM after building the list of relations to VACUUM, which can cause vacuum_rel() or analyze_rel() to try to lock the relation but the operation would just block. When the client specifies a list of relations and the relation needs to be skipped, ownership checks are done when building the list of relations to work on, preventing a later lock attempt. vacuum_rel() already had the sanity checks needed, except that those were applied too late. This commit refactors the code so as relation skips are checked beforehand, making it safer to avoid too early locks, for both manual VACUUM with and without a list of relations specified. An isolation test is added emulating the fact that early locks do not happen anymore, issuing a WARNING message earlier if the user calling VACUUM is not a relation owner. When a partitioned table is listed in a manual VACUUM or ANALYZE command, its full list of partitions is fetched, all partitions get added to the list to work on, and then each one of them is processed one by one, with ownership checks happening at the later phase of vacuum_rel() or analyze_rel(). Trying to do early ownership checks for each partition is proving to be tedious as this would result in deadlock risks with lock upgrades, and skipping all partitions if the listed partitioned table is not owned would result in a behavior change compared to how Postgres 10 has implemented vacuum for partitioned tables. The original problem reported related to early lock queue for critical relations is fixed anyway, so priority is given to avoiding a backward-incompatible behavior. Reported-by: Lloyd Albin, Jeremy Schneider Author: Michael Paquier Reviewed by: Nathan Bossart, Kyotaro Horiguchi Discussion: https://postgr.es/m/152512087100.19803.12733865831237526317@wrigleys.postgresql.org Discussion: https://postgr.es/m/20180812222142.GA6097@paquier.xyz	2018-08-27 09:11:12 +09:00
Thomas Munro	18e586741b	Fix typos. Author: David Rowley Discussion: https://postgr.es/m/CAKJS1f8du35u5DprpykWvgNEScxapbWYJdHq%2Bz06Wj3Y2KFPbw%40mail.gmail.com	2018-08-27 09:32:59 +12:00
Tom Lane	bff84a547d	Make syslogger more robust against failures in opening CSV log files. The previous coding figured it'd be good enough to postpone opening the first CSV log file until we got a message we needed to write there. This is unsafe, though, because if the open fails we end up in infinite recursion trying to report the failure. Instead make the CSV log file management code look as nearly as possible like the longstanding logic for the stderr log file. In particular, open it immediately at postmaster startup (if enabled), or when we get a SIGHUP in which we find that log_destination has been changed to enable CSV logging. It seems OK to fail if a postmaster-start-time open attempt fails, as we've long done for the stderr log file. But we can't die if we fail to open a CSV log file during SIGHUP, so we're still left with a problem. In that case, write any output meant for the CSV log file to the stderr log file. (This will also cover race-condition cases in which backends send CSV log data before or after we have the CSV log file open.) This patch also fixes an ancient oversight that, if CSV logging was turned off during a SIGHUP, we never actually closed the last CSV log file. In passing, remember to reset whereToSendOutput = DestNone during syslogger start, since (unlike all other postmaster children) it's forked before the postmaster has done that. This made for a platform-dependent difference in error reporting behavior between the syslogger and other children: except on Windows, it'd report problems to the original postmaster stderr as well as the normal error log file(s). It's barely possible that that was intentional at some point; but it doesn't seem likely to be desirable in production, and the platform dependency definitely isn't desirable. Per report from Alexander Kukushkin. It's been like this for a long time, so back-patch to all supported branches. Discussion: https://postgr.es/m/CAFh8B==iLUD_gqC-dAENS0V+kVrCeGiKujtKqSQ7++S-caaChw@mail.gmail.com	2018-08-26 14:21:55 -04:00
Jeff Davis	ba9d35b8eb	Reconsider new file extension in commit `91f26d5f`. Andres and Tom objected to the choice of the ".tmp" extension. Changing to Andres's suggestion of ".spill". Discussion: https://postgr.es/m/88092095-3348-49D8-8746-EB574B1D30EA%40anarazel.de	2018-08-25 22:52:46 -07:00
Jeff Davis	91f26d5fe4	Change extension of spilled ReorderBufferChange data to ".tmp". The previous extension, ".snap", was chosen for historical reasons and became confusing. Discussion: https://postgr.es/m/CAMp0ubd_P8vBGx8=MfDXQJZxHA5D_Zarw5cCkDxJ_63+pWRJ9w@mail.gmail.com	2018-08-25 09:36:09 -07:00
Andres Freund	cb92520563	LLVMJIT: LLVMGetHostCPUFeatures now is upstream, use LLMV version if available. Noticed thanks to buildfarm animal seawasp. Author: Andres Freund Backpatch: v11-, where LLVM based JIT compliation was introduced.	2018-08-24 10:21:38 -07:00
Tom Lane	b0c5da615e	Suppress uninitialized-variable warning in new SCRAM code. While we generally don't sweat too much about "may be used uninitialized" warnings from older compilers, I noticed that there's a fair number of buildfarm animals that are producing such a warning only for this variable. So it seems worth silencing.	2018-08-24 10:51:10 -04:00
Andres Freund	143290efd0	Introduce minimal C99 usage to verify compiler support. This just converts a few for loops in postgres.c to declare variables in the loop initializer, and uses designated initializers in smgr.c's definition of smgr callbacks. Author: Andres Freund Discussion: https://postgr.es/m/97d4b165-192d-3605-749c-f614a0c4e783@2ndquadrant.com	2018-08-23 18:36:07 -07:00
Andres Freund	88ebd62fcc	Deduplicate code between slot_getallattrs() and slot_getsomeattrs(). Code in slot_getallattrs() is the same as if slot_getsomeattrs() is called with number of attributes specified in the tuple descriptor. Implement it that way instead of duplicating the code between those two functions. This is part of a patchseries abstracting TupleTableSlots so they can store arbitrary forms of tuples, but is a nice enough cleanup on its own. Author: Ashutosh Bapat Reviewed-By: Andres Freund Discussion: https://postgr.es/m/20180220224318.gw4oe5jadhpmcdnm@alap3.anarazel.de	2018-08-23 16:58:53 -07:00
Andrew Gierth	a40631a920	Fix lexing of standard multi-character operators in edge cases. Commits `c6b3c939b` (which fixed the precedence of >=, <=, <> operators) and `865f14a2d` (which added support for the standard => notation for named arguments) created a class of lexer tokens which look like multi-character operators but which have their own token IDs distinct from Op. However, longest-match rules meant that following any of these tokens with another operator character, as in (1<>-1), would cause them to be incorrectly returned as Op. The error here isn't immediately obvious, because the parser would usually still find the correct operator via the Op token, but there were more subtle problems: 1. If immediately followed by a comment or +-, >= <= <> would be given the old precedence of Op rather than the correct new precedence; 2. If followed by a comment, != would be returned as Op rather than as NOT_EQUAL, causing it not to be found at all; 3. If followed by a comment or +-, the => token for named arguments would be lexed as Op, causing the argument to be mis-parsed as a simple expression, usually causing an error. Fix by explicitly checking for the operators in the {operator} code block in addition to all the existing special cases there. Backpatch to 9.5 where the problem was introduced. Analysis and patch by me; review by Tom Lane. Discussion: https://postgr.es/m/87va851ppl.fsf@news-spur.riddles.org.uk	2018-08-23 21:42:40 +01:00
Andrew Gierth	d4a63f8297	Reduce an unnecessary O(N^3) loop in lexer. The lexer's handling of operators contained an O(N^3) hazard when dealing with long strings of + or - characters; it seems hard to prevent this case from being O(N^2), but the additional N multiplier was not needed. Backpatch all the way since this has been there since 7.x, and it presents at least a mild hazard in that trying to do Bind, PREPARE or EXPLAIN on a hostile query could take excessive time (without honouring cancels or timeouts) even if the query was never executed.	2018-08-23 21:42:40 +01:00
Peter Eisentraut	0a63f996e0	Change PROCEDURE to FUNCTION in CREATE TRIGGER syntax Since procedures are now a different thing from functions, change the CREATE TRIGGER and CREATE EVENT TRIGGER syntax to use FUNCTION in the clause that specifies the function. PROCEDURE is still accepted for compatibility. pg_dump and ruleutils.c output is not changed yet, because that would require a change in information_schema.sql and thus a catversion change. Reported-by: Peter Geoghegan <pg@bowt.ie> Reviewed-by: Jonathan S. Katz <jonathan.katz@excoventures.com>	2018-08-22 14:44:49 +02:00
Peter Eisentraut	d12782898e	Change PROCEDURE to FUNCTION in CREATE OPERATOR syntax Since procedures are now a different thing from functions, change the CREATE OPERATOR syntax to use FUNCTION in the clause that specifies the function. PROCEDURE is still accepted for compatibility. Reported-by: Peter Geoghegan <pg@bowt.ie> Reviewed-by: Jonathan S. Katz <jonathan.katz@excoventures.com>	2018-08-22 14:44:49 +02:00
Peter Eisentraut	b19495772e	doc: Update uses of the word "procedure" Historically, the term procedure was used as a synonym for function in Postgres/PostgreSQL. Now we have procedures as separate objects from functions, so we need to clean up the documentation to not mix those terms. In particular, mentions of "trigger procedures" are changed to "trigger functions", and access method "support procedures" are changed to "support functions". (The latter already used FUNCTION in the SQL syntax anyway.) Also, the terminology in the SPI chapter has been cleaned up. A few tests, examples, and code comments are also adjusted to be consistent with documentation changes, but not everything. Reported-by: Peter Geoghegan <pg@bowt.ie> Reviewed-by: Jonathan S. Katz <jonathan.katz@excoventures.com>	2018-08-22 14:44:49 +02:00
Thomas Munro	af63926cf5	Wrap long line in postgresql.conf.sample. Per complaint from Michael Paquier.	2018-08-22 21:34:50 +12:00
Thomas Munro	f9fe269ca2	Provide plan_cache_mode options in postgresql.conf.sample. Author: David Rowley Discussion: https://postgr.es/m/CAKJS1f8YkwojSTSg8YjNYCLCXzx0fR7wBR3Gf%2BrA9_52eoPZKg%40mail.gmail.com	2018-08-22 18:19:39 +12:00
Alvaro Herrera	55a0154efd	Fix typo	2018-08-21 17:16:10 -03:00
Alvaro Herrera	083651da17	fix typo	2018-08-21 17:03:35 -03:00
Michael Paquier	72be8c29a1	Fix set of NLS translation issues While monitoring the code, a couple of issues related to string translation has showed up: - Some routines for auto-updatable views return an error string, which sometimes missed the shot. A comment regarding string translation is added for each routine to help with future features. - GSSAPI authentication missed two translations. - vacuumdb handles non-translated strings. - GetConfigOptionByNum should translate strings. This part is not back-patched as after a minor upgrade this could be surprising for users. Reported-by: Kyotaro Horiguchi Author: Kyotaro Horiguchi Reviewed-by: Michael Paquier, Tom Lane Discussion: https://postgr.es/m/20180810.152131.31921918.horiguchi.kyotaro@lab.ntt.co.jp Backpatch-through: 9.3	2018-08-21 15:17:13 +09:00
Michael Paquier	d8c83800c3	Fix typo in description of enable_parallel_hash Author: Kyotaro Horiguchi Discussion: https://postgr.es/m/20180821.115841.93250330.horiguchi.kyotaro@lab.ntt.co.jp	2018-08-21 12:13:16 +09:00
Michael Paquier	1339fcc896	Clarify comment about assignment and reset of temp namespace ID in MyProc The new wording comes from Álvaro, which I modified a bit. Reported-by: Andres Freund, Álvaro Herrera Author: Álvaro Herrera, Michael Paquier Discussion: https://postgr.es/m/20180809165047.GK13638@paquier.xyz Backpatch-through: 11	2018-08-21 08:32:18 +09:00
Peter Eisentraut	d83423db86	Improve error messages for CREATE OR REPLACE PROCEDURE Change the hint to recommend DROP PROCEDURE instead of FUNCTION. Also make the error message when changing the return type more specific to the case of procedures. Reported-by: Jeremy Evans <code@jeremyevans.net> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>	2018-08-18 22:05:43 +02:00
Peter Eisentraut	e4597ee65d	InsertPgAttributeTuple() to set attcacheoff InsertPgAttributeTuple() is the interface between in-memory tuple descriptors and on-disk pg_attribute, so it makes sense to give it the job of resetting attcacheoff. This avoids having all the callers having to do so. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>	2018-08-17 22:08:21 +02:00
Andrew Gierth	520acab171	Set scan direction appropriately for SubPlans (bug #15336 ) When executing a SubPlan in an expression, the EState's direction field was left alone, resulting in an attempt to execute the subplan backwards if it was encountered during a backwards scan of a cursor. Also, though much less likely, it was possible to reach the execution of an InitPlan while in backwards-scan state. Repair by saving/restoring estate->es_direction and forcing forward scan mode in the relevant places. Backpatch all the way, since this has been broken since 8.3 (prior to commit `c7ff7663e`, SubPlans had their own EStates rather than sharing the parent plan's, so there was no confusion over scan direction). Per bug #15336 reported by Vladimir Baranoff; analysis and patch by me, review by Tom Lane. Discussion: https://postgr.es/m/153449812167.1304.1741624125628126322@wrigleys.postgresql.org	2018-08-17 15:44:13 +01:00
Tomas Vondra	c4c3400885	Use the built-in float datatypes to implement geometric types This patch makes the geometric operators and functions use the exported function of the float4/float8 datatypes. The main reason of doing so is to check for underflow and overflow, and to handle NaNs consciously. The float datatypes consider NaNs values to be equal and greater than all non-NaN values. This change considers NaNs equal only for equality operators. The placement operators, contains, overlaps, left/right of etc. continue to return false when NaNs are involved. We don't need to worry about them being considered greater than any-NaN because there aren't any basic comparison operators like less/greater than for the geometric datatypes. The changes may be summarised as: * Check for underflow, overflow and division by zero * Consider NaN values to be equal * Return NULL when the distance is NaN for all closest point operators * Favour not-NaN over NaN where it makes sense The patch also replaces all occurrences of "double" as "float8". They are the same, but were used inconsistently in the same file. Author: Emre Hasegeli Reviewed-by: Kyotaro Horiguchi, Tomas Vondra Discussion: https://www.postgresql.org/message-id/CAE2gYzxF7-5djV6-cEvqQu-fNsnt%3DEqbOURx7ZDg%2BVv6ZMTWbg%40mail.gmail.com	2018-08-16 19:56:11 +02:00
Tomas Vondra	a082aed072	Remove remaining GEODEBUG references from geo_ops.c Commit `a7dc63d904` removed most of the GEODEBUG occurrences, but there were a couple remaining. So remove them too, to get rid of the macro entirely. Author: Emre Hasegeli Discussion: https://www.postgresql.org/message-id/CAE2gYzxF7-5djV6-cEvqQu-fNsnt%3DEqbOURx7ZDg%2BVv6ZMTWbg%40mail.gmail.com	2018-08-16 19:55:43 +02:00
Tom Lane	e1d19c902e	Require a C99-compliant snprintf(), and remove related workarounds. Since our substitute snprintf now returns a C99-compliant result, there's no need anymore to have complicated code to cope with pre-C99 behavior. We can just make configure substitute snprintf.c if it finds that the system snprintf() is pre-C99. (Note: I do not believe that there are any platforms where this test will trigger that weren't already being rejected due to our other C99-ish feature requirements for snprintf. But let's add the check for paranoia's sake.) Then, simplify the call sites that had logic to cope with the pre-C99 definition. I also dropped some stuff that was being paranoid about the possibility of snprintf overrunning the given buffer. The only reports we've ever heard of that being a problem were for Solaris 7, which is long dead, and we've sure not heard any reports of these assertions triggering in a long time. So let's drop that complexity too. Likewise, drop some code that wasn't trusting snprintf to set errno when it returns -1. That would be not-per-spec, and again there's no real reason to believe it is a live issue, especially not for snprintfs that pass all of configure's feature checks. Discussion: https://postgr.es/m/17245.1534289329@sss.pgh.pa.us	2018-08-16 13:01:09 -04:00
Alvaro Herrera	1eb9221585	Fix executor prune failure when plan already pruned In a multi-layer partitioning setup, if at plan time all the sub-partitions are pruned but the intermediate one remains, the executor later throws a spurious error that there's nothing to prune. That is correct, but there's no reason to throw an error. Therefore, don't. Reported-by: Andreas Seltenreich <seltenreich@gmx.de> Author: David Rowley <david.rowley@2ndquadrant.com> Discussion: https://postgr.es/m/87in4h98i0.fsf@ansel.ydns.eu	2018-08-16 12:53:43 -03:00
Tomas Vondra	fa73b377ee	Close the file descriptor in ApplyLogicalMappingFile The function was forgetting to close the file descriptor, resulting in failures like this: ERROR: 53000: exceeded maxAllocatedDescs (492) while trying to open file "pg_logical/mappings/map-4000-4eb-1_60DE1E08-5376b5-537c6b" LOCATION: OpenTransientFile, fd.c:2161 Simply close the file at the end, and backpatch to 9.4 (where logical decoding was introduced). While at it, fix a nearby typo. Discussion: https://www.postgresql.org/message-id/flat/738a590a-2ce5-9394-2bef-7b1caad89b37%402ndquadrant.com	2018-08-16 16:49:57 +02:00
Michael Paquier	3593579bcd	Update comment in header of errcodes.txt This file mentions all the files generated from it, but missed that errcodes-list.sgml is no more, while errcodes-table.sgml is. Author: Noriyoshi Shinoda Discussion: https://postgr.es/m/TU4PR8401MB0430855D6B971E49EB55F328EE3E0@TU4PR8401MB0430.NAMPRD84.PROD.OUTLOOK.COM	2018-08-16 09:47:59 +02:00
Thomas Munro	ca1e64feba	Improve comment in GetNewObjectId(). The previous comment gave the impression that skipping OIDs before FirstNormalObjectId was merely an optimization to avoid likely collisions. In fact other parts of the system have been relying on this threshold to detect system-created objects since commit `8e18d04d4d`, so adjust the wording. Author: Thomas Munro Reviewed-by: Tom Lane Discussion: https://postgr.es/m/CAEepm%3D33JASACeOayr_W3%3DCSjy2jiPxM-k89axu0akFbHdjnjA%40mail.gmail.com	2018-08-16 17:17:30 +12:00
Alvaro Herrera	ab7dbd681c	Update FSM on WAL replay of page all-visible/frozen We aren't very strict about keeping FSM up to date on WAL replay, because per-page freespace values aren't critical in replicas (can't write to heap in a replica; and if the replica is promoted, the values would be updated by VACUUM anyway). However, VACUUM since 9.6 can skip processing pages marked all-visible or all-frozen, and if such pages are recorded in FSM with wrong values, those values are blindly propagated to FSM's upper layers by VACUUM's FreeSpaceMapVacuum. (This rationale assumes that crashes are not very frequent, because those would cause outdated FSM to occur in the primary.) Even when the FSM is outdated in standby, things are not too bad normally, because, most per-page FSM values will be zero (other than those propagated with the base-backup that created the standby); only once the remaining free space is less than 0.2*BLCKSZ the per-page value is maintained by WAL replay of heap ins/upd/del. However, if wal_log_hints=on causes complete FSM pages to be propagated to a standby via full-page images, many too-optimistic per-page values can end up being registered in the standby. Incorrect per-page values aren't critical in most cases, since an inserter that is given a page that doesn't actually contain the claimed free space will update FSM with the correct value, and retry until it finds a usable page. However, if there are many such updates to do, an inserter can spend a long time doing them before a usable page is found; in a heavily trafficked insert-only table with many concurrent inserters this has been observed to cause several second stalls, causing visible application malfunction. To fix this problem, it seems sufficient to have heap_xlog_visible (replay of setting all-visible and all-frozen VM bits for a heap page) update the FSM value for the page being processed. This fixes the per-page counters together with making the page skippable to vacuum, so when vacuum does FreeSpaceMapVacuum, the values propagated to FSM upper layers are the correct ones, avoiding the problem. While at it, apply the same fix to heap_xlog_clean (replay of tuple removal by HOT pruning and vacuum). This makes any space freed by the cleaning available earlier than the next vacuum in the promoted replica. Backpatch to 9.6, where this problem was diagnosed on an insert-only table with all-frozen pages, which were introduced as a concept in that release. Theoretically it could apply with all-visible pages to older branches, but there's been no report of that and it doesn't backpatch cleanly anyway. Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://postgr.es/m/20180802172857.5skoexsilnjvgruk@alvherre.pgsql	2018-08-15 18:09:29 -03:00
Tom Lane	cc4f6b7786	Clean up assorted misuses of snprintf()'s result value. Fix a small number of places that were testing the result of snprintf() but doing so incorrectly. The right test for buffer overrun, per C99, is "result >= bufsize" not "result > bufsize". Some places were also checking for failure with "result == -1", but the standard only says that a negative value is delivered on failure. (Note that this only makes these places correct if snprintf() delivers C99-compliant results. But at least now these places are consistent with all the other places where we assume that.) Also, make psql_start_test() and isolation_start_test() check for buffer overrun while constructing their shell commands. There seems like a higher risk of overrun, with more severe consequences, here than there is for the individual file paths that are made elsewhere in the same functions, so this seemed like a worthwhile change. Also fix guc.c's do_serialize() to initialize errno = 0 before calling vsnprintf. In principle, this should be unnecessary because vsnprintf should have set errno if it returns a failure indication ... but the other two places this coding pattern is cribbed from don't assume that, so let's be consistent. These errors are all very old, so back-patch as appropriate. I think that only the shell command overrun cases are even theoretically reachable in practice, but there's not much point in erroneous error checks. Discussion: https://postgr.es/m/17245.1534289329@sss.pgh.pa.us	2018-08-15 16:29:31 -04:00
Peter Eisentraut	b68ff3ea67	Remove obsolete netbsd dynloader code dlopen() has been documented since NetBSD 1.1 (1995).	2018-08-13 23:21:01 +02:00
Peter Eisentraut	29351a06af	Remove obsolete openbsd dynloader code dlopen() has been documented since OpenBSD 2.0 (1996).	2018-08-13 23:21:01 +02:00
Peter Eisentraut	2db1905fdd	Remove obsolete freebsd dynloader code dlopen() has been documented since FreeBSD 3.0 (1989).	2018-08-13 23:21:01 +02:00
Peter Eisentraut	351855fc4e	Remove obsolete linux dynloader code This has been obsolete probably since the late 1990s.	2018-08-13 23:21:01 +02:00
Peter Eisentraut	b5d29299cc	Remove obsolete darwin dynloader code not needed since macOS 10.3 (2003)	2018-08-13 23:21:01 +02:00
Peter Eisentraut	3ebdd21b79	Remove obsolete comment The sequence name is no longer stored in the sequence relation, since `1753b1b027`.	2018-08-13 21:07:31 +02:00
Michael Paquier	246a6c8f7b	Make autovacuum more aggressive to remove orphaned temp tables Commit `dafa084`, added in 10, made the removal of temporary orphaned tables more aggressive. This commit makes an extra step into the aggressiveness by adding a flag in each backend's MyProc which tracks down any temporary namespace currently in use. The flag is set when the namespace gets created and can be reset if the temporary namespace has been created in a transaction or sub-transaction which is aborted. The flag value assignment is assumed to be atomic, so this can be done in a lock-less fashion like other flags already present in PGPROC like databaseId or backendId, still the fact that the temporary namespace and table created are still locked until the transaction creating those commits acts as a barrier for other backends. This new flag gets used by autovacuum to discard more aggressively orphaned tables by additionally checking for the database a backend is connected to as well as its temporary namespace in-use, removing orphaned temporary relations even if a backend reuses the same slot as one which created temporary relations in a past session. The base idea of this patch comes from Robert Haas, has been written in its first version by Tsunakawa Takayuki, then heavily reviewed by me. Author: Tsunakawa Takayuki Reviewed-by: Michael Paquier, Kyotaro Horiguchi, Andres Freund Discussion: https://postgr.es/m/0A3221C70F24FB45833433255569204D1F8A4DC6@G01JPEXMBYT05 Backpatch: 11-, as PGPROC gains a new flag and we don't want silent ABI breakages on already released versions.	2018-08-13 11:49:04 +02:00
Amit Kapila	4f9a97e417	Adjust comment atop ExecShutdownNode. After commits `a315b967cc` and `b805b63ac2`, part of the comment atop ExecShutdownNode is redundant. Adjust it. Author: Amit Kapila Backpatch-through: 10 where both the mentioned commits are present. Discussion: https://postgr.es/m/86137f17-1dfb-42f9-7421-82fd786b04a1@anayrat.info	2018-08-13 10:04:39 +05:30
Amit Kapila	2cd0acfdad	Prohibit shutting down resources if there is a possibility of back up. Currently, we release the asynchronous resources as soon as it is evident that no more rows will be needed e.g. when a Limit is filled. This can be problematic especially for custom and foreign scans where we can scan backward. Fix that by disallowing the shutting down of resources in such cases. Reported-by: Robert Haas Analysed-by: Robert Haas and Amit Kapila Author: Amit Kapila Reviewed-by: Robert Haas Backpatch-through: 9.6 where this code was introduced Discussion: https://postgr.es/m/86137f17-1dfb-42f9-7421-82fd786b04a1@anayrat.info	2018-08-13 08:22:18 +05:30
Andrew Gierth	07172d5aff	Avoid query-lifetime memory leaks in XMLTABLE (bug #15321 ) Multiple calls to XMLTABLE in a query (e.g. laterally applying it to a table with an xml column, an important use-case) were leaking large amounts of memory into the per-query context, blowing up memory usage. Repair by reorganizing memory context usage in nodeTableFuncscan; use the usual per-tuple context for row-by-row evaluations instead of perValueCxt, and use the explicitly created context -- renamed from perValueCxt to perTableCxt -- for arguments and state for each individual table-generation operation. Backpatch to PG10 where this code was introduced. Original report by IRC user begriffs; analysis and patch by me. Reviewed by Tom Lane and Pavel Stehule. Discussion: https://postgr.es/m/153394403528.10284.7530399040974170549@wrigleys.postgresql.org	2018-08-13 01:59:45 +01:00
Tom Lane	4a2994f055	Fix wrong order of operations in inheritance_planner. When considering a partitioning parent rel, we should stop processing that subroot as soon as we've done adjust_appendrel_attrs and any securityQuals updates. The rest of this is unnecessary, and indeed adding duplicate subquery RTEs to the subroot is wrong. As the code stood, the children of that partition ended up with two sets of copied subquery RTEs, confusing matters greatly. Even more hilarity ensued if all of the children got excluded by constraint exclusion, so that the extra RTEs didn't make it back into the parent rtable. Per fuzz testing by Andreas Seltenreich. Back-patch to v11 where this got broken (by commit `0a480502b`, it looks like). Discussion: https://postgr.es/m/87va8g7vq0.fsf@ansel.ydns.eu	2018-08-11 15:53:20 -04:00
Andrew Dunstan	5c047fd709	Revert changes in execMain.c from commit `16828d5c02` These changes were put in at some stage of the development process, but are unnecessary and should not have made it into the final patch. Mea culpa. Per gripe from Andreas Freund Backpatch to REL_11_STABLE	2018-08-10 16:09:25 -04:00
Peter Geoghegan	4974d7f87e	Handle parallel index builds on mapped relations. Commit `9da0cc3528`, which introduced parallel CREATE INDEX, failed to propagate relmapper.c backend local cache state to parallel worker processes. This could result in parallel index builds against mapped catalog relations where the leader process (participating as a worker) scans the new, pristine relfilenode, while worker processes scan the obsolescent relfilenode. When this happened, the final index structure was typically not consistent with the owning table's structure. The final index structure could contain entries formed from both heap relfilenodes. Only rebuilds on mapped catalog relations that occur as part of a VACUUM FULL or CLUSTER could become corrupt in practice, since their mapped relation relfilenode swap is what allows the inconsistency to arise. On master, fix the problem by propagating the required relmapper.c backend state as part of standard parallel initialization (Cf. commit `29d58fd3`). On v11, simply disallow builds against mapped catalog relations by deeming them parallel unsafe. Author: Peter Geoghegan Reported-By: "death lock" Reviewed-By: Tom Lane, Amit Kapila Bug: #15309 Discussion: https://postgr.es/m/153329671686.1405.18298309097348420351@wrigleys.postgresql.org Backpatch: 11-, where parallel CREATE INDEX was introduced.	2018-08-10 13:01:34 -07:00
Michael Paquier	f841ceb26d	Improve TRUNCATE by avoiding early lock queue A caller of TRUNCATE could previously queue for an access exclusive lock on a relation it may not have permission to truncate, potentially interfering with users authorized to work on it. This can be very intrusive depending on the lock attempted to be taken. For example, pg_authid could be blocked, preventing any authentication attempt to happen on a PostgreSQL instance. This commit fixes the case of TRUNCATE so as RangeVarGetRelidExtended is used with a callback doing the necessary ACL checks at an earlier stage, avoiding lock queuing issues, so as an immediate failure happens for unprivileged users instead of waiting on a lock that would not be taken. This is rather similar to the type of work done in `cbe24a6` for CLUSTER, and the code of TRUNCATE is this time refactored so as there is no user-facing changes. As the commit for CLUSTER, no back-patch is done. Reported-by: Lloyd Albin, Jeremy Schneider Author: Michael Paquier Reviewed by: Nathan Bossart, Kyotaro Horiguchi Discussion: https://postgr.es/m/152512087100.19803.12733865831237526317@wrigleys.postgresql.org Discussion: https://postgr.es/m/20180806165816.GA19883@paquier.xyz	2018-08-10 18:26:59 +02:00

1 2 3 4 5 ...

18802 Commits