postgresql

Commit Graph

Author	SHA1	Message	Date
Peter Eisentraut	3ad2afc2e9	Define LDAPS_PORT if it's missing and disable implicit LDAPS on Windows Some versions of Windows don't define LDAPS_PORT. Also, Windows' ldap_sslinit() is documented to use LDAPS even if you said secure=0 when the port number happens to be 636 or 3269. Let's avoid using the port number to imply that you want LDAPS, so that connection strings have the same meaning on Windows and Unix. Author: Thomas Munro Discussion: https://postgr.es/m/CAEepm%3D23B7GV4AUz3MYH1TKpTv030VHxD2Sn%2BLYWDv8d-qWxww%40mail.gmail.com	2018-01-04 10:34:41 -05:00
Robert Haas	c759395617	Code review for Parallel Append. - Remove unnecessary #include mistakenly added in execnodes.h. - Fix mistake in comment in choose_next_subplan_for_leader. - Adjust row estimates in cost_append for a possibly-different parallel divisor. - Clamp row estimates in cost_append after operations that may not produce integers. Amit Kapila, with cosmetic adjustments by me. Discussion: http://postgr.es/m/CAA4eK1+qcbeai3coPpRW=GFCzFeLUsuY4T-AKHqMjxpEGZBPQg@mail.gmail.com	2018-01-04 07:56:09 -05:00
Tom Lane	47c6772eb7	Clean up tupdesc.c for recent changes. TupleDescCopy needs to have the same effects as CreateTupleDescCopy in that, since it doesn't copy constraints, it should clear the per-attribute fields associated with them. Oversight in commit `cc5f81366`. Since TupleDescCopy has already established the presumption that it can just flat-copy the entire attribute array in one go, propagate that approach into CreateTupleDescCopy and CreateTupleDescCopyConstr. (I'm suspicious that this would lead to valgrind complaints if we had any trailing padding in the struct, but we do not, and anyway fixing that seems like a job for a separate commit.) Add some better comments. Thomas Munro, reviewed by Vik Fearing, some additional hacking by me Discussion: https://postgr.es/m/CAEepm=0NvOGZ8B6GbQyQe2C_c2m3LKJ9w=8OMBaYRLgZ_Gw6Nw@mail.gmail.com	2018-01-03 17:53:41 -05:00
Alvaro Herrera	bab2969867	Fix typo Author: Dagfinn Ilmari Mannsåker Discussion: https://postgr.es/m/d8jefpk4jtd.fsf@dalvik.ping.uio.no	2018-01-03 19:12:06 -03:00
Alvaro Herrera	3c27944fb2	Make XactLockTableWait work for transactions that are not yet self-locked XactLockTableWait assumed that its xid argument has already added itself to the lock table. That assumption led to another assumption that if locking the xid has succeeded but the xid is reported as still in progress, then the input xid must have been a subtransaction. These assumptions hold true for the original uses of this code in locking related to on-disk tuples, but they break down in logical replication slot snapshot building -- in particular, when a standby snapshot logged contains an xid that's already in ProcArray but not yet in the lock table. This leads to assertion failures that can be reproduced all the way back to 9.4, when logical decoding was introduced. To fix, change SubTransGetParent to SubTransGetTopmostTransaction which has a slightly different API: it returns the argument Xid if there is no parent, and it goes all the way to the top instead of moving up the levels one by one. Also, to avoid busy-waiting, add a 1ms sleep to give the other process time to register itself in the lock table. For consistency, change ConditionalXactLockTableWait the same way. Author: Petr Jelínek Discussion: https://postgr.es/m/1B3E32D8-FCF4-40B4-AEF9-5C0E3AC57969@postgrespro.ru Reported-by: Konstantin Knizhnik Diagnosed-by: Stas Kelvich, Petr Jelínek Reviewed-by: Andres Freund, Robert Haas	2018-01-03 17:26:20 -03:00
Tom Lane	6fcde24063	Fix some minor errors in new PHJ code. Correct ExecParallelHashTuplePrealloc's estimate of whether the space_allowed limit is exceeded. Be more consistent about tuples that are exactly HASH_CHUNK_THRESHOLD in size (they're "small", not "large"). Neither of these things explain the current buildfarm unhappiness, but they're still bugs. Thomas Munro, per gripe by me Discussion: https://postgr.es/m/CAEepm=34PDuR69kfYVhmZPgMdy8pSA-MYbpesEN1SR+2oj3Y+w@mail.gmail.com	2018-01-03 12:53:49 -05:00
Tom Lane	3decd150a2	Teach eval_const_expressions() to handle some more cases. Add some infrastructure (mostly macros) to make it easier to write typical cases for constant-expression simplification. Add simplification processing for ArrayRef, RowExpr, and ScalarArrayOpExpr node types, which formerly went unsimplified even if all their inputs were constants. Also teach it to simplify FieldSelect from a composite constant. Make use of the new infrastructure to reduce the amount of code needed for the existing ArrayExpr and ArrayCoerceExpr cases. One existing test case changes output as a result of the fact that RowExpr can now be folded to a constant. All the new code is exercised by existing test cases according to gcov, so I feel no need to add additional tests. Tom Lane, reviewed by Dmitry Dolgov Discussion: https://postgr.es/m/3be3b82c-e29c-b674-2163-bf47d98817b1@iki.fi	2018-01-03 12:35:09 -05:00
Peter Eisentraut	35c0754fad	Allow ldaps when using ldap authentication While ldaptls=1 provides an RFC 4513 conforming way to do LDAP authentication with TLS encryption, there was an earlier de facto standard way to do LDAP over SSL called LDAPS. Even though it's not enshrined in a standard, it's still widely used and sometimes required by organizations' network policies. There seems to be no reason not to support it when available in the client library. Therefore, add support when using OpenLDAP 2.4+ or Windows. It can be configured with ldapscheme=ldaps or ldapurl=ldaps://... Add tests for both ways of requesting LDAPS and a test for the pre-existing ldaptls=1. Modify the 001_auth.pl test for "diagnostic messages", which was previously relying on the server rejecting ldaptls=1. Author: Thomas Munro Reviewed-By: Peter Eisentraut Discussion: https://postgr.es/m/CAEepm=1s+pA-LZUjQ-9GQz0Z4rX_eK=DFXAF1nBQ+ROPimuOYQ@mail.gmail.com	2018-01-03 10:11:26 -05:00
Bruce Momjian	9d4649ca49	Update copyright for 2018 Backpatch-through: certain files through 9.3	2018-01-02 23:30:12 -05:00
Andres Freund	f9ccf92e16	Simplify representation of aggregate transition values a bit. Previously aggregate transition values for hash and other forms of aggregation (i.e. sort and no group by) were represented differently. Hash based aggregation used a grouping set indexed array pointing to an array of transition values, whereas other forms of aggregation used one flattened array with the index being computed out of grouping set and transition offsets. That made upcoming changes hard, so represent both as grouping set indexed array of per-group data. As a nice side-effect this also makes aggregation slightly faster, because computing offsets with `transno + (setno * numTrans)` turns out not to be that cheap (too big for x86 lea for example). Author: Andres Freund Discussion: https://postgr.es/m/20171128003121.nmxbm2ounxzb6n2t@alap3.anarazel.de	2018-01-02 18:23:37 -08:00
Tom Lane	5dc692f78d	Ensure proper alignment of tuples in HashMemoryChunkData buffers. The previous coding relied (without any documentation) on the data[] member of HashMemoryChunkData being at a MAXALIGN'ed offset. If it was not, the tuples would not be maxaligned either, leading to failures on alignment-picky machines. While there seems to be no live bug on any platform we support, this is clearly pretty fragile: any addition to or rearrangement of the fields in HashMemoryChunkData could break it. Let's remove the hazard by getting rid of the data[] member and instead using pointer arithmetic with an explicitly maxalign'ed offset. Discussion: https://postgr.es/m/14483.1514938129@sss.pgh.pa.us	2018-01-02 21:23:06 -05:00
Alvaro Herrera	54eff5311d	Fix deadlock hazard in CREATE INDEX CONCURRENTLY Multiple sessions doing CREATE INDEX CONCURRENTLY simultaneously are supposed to be able to work in parallel, as evidenced by fixes in commit `c3d09b3bd2` specifically to support this case. In reality, one of the sessions would be aborted by a misterious "deadlock detected" error. Jeff Janes diagnosed that this is because of leftover snapshots used for system catalog scans -- this was broken by `8aa3e47510` keeping track of (registering) the catalog snapshot. To fix the deadlocks, it's enough to de-register that snapshot prior to waiting. Backpatch to 9.4, which introduced MVCC catalog scans. Include an isolationtester spec that 8 out of 10 times reproduces the deadlock with the unpatched code for me (Álvaro). Author: Jeff Janes Diagnosed-by: Jeff Janes Reported-by: Jeremy Finzel Discussion: https://postgr.es/m/CAMa1XUhHjCv8Qkx0WOr1Mpm_R4qxN26EibwCrj0Oor2YBUFUTg%40mail.gmail.com	2018-01-02 19:16:16 -03:00
Peter Eisentraut	438036264a	Don't cast between GinNullCategory and bool The original idea was that we could use an isNull-style bool array directly as a GinNullCategory array. However, the existing code already acknowledges that that doesn't really work, because of the possibility that bool as currently defined can have arbitrary bit patterns for true values. So it has to loop through the nullFlags array to set each bool value to an acceptable value. But if we are looping through the whole array anyway, we might as well build a proper GinNullCategory array instead and abandon the type casting. That makes the code much safer in case bool is ever changed to something else. Reviewed-by: Michael Paquier <michael.paquier@gmail.com>	2018-01-02 12:20:56 -05:00
Andres Freund	93ea78b17c	Fix EXPLAIN ANALYZE output for Parallel Hash. In a race case, EXPLAIN ANALYZE could fail to display correct nbatch and size information. Refactor so that participants report only on batches they worked on rather than trying to report on all of them, and teach explain.c to consider the HashInstrumentation object from all participants instead of picking the first one it can find. This should fix an occasional build farm failure in the "join" regression test. Author: Thomas Munro Reviewed-By: Andres Freund Discussion: https://postgr.es/m/30219.1514428346%40sss.pgh.pa.us	2018-01-01 14:38:23 -08:00
Andres Freund	b40933101c	Perform slot validity checks in a separate pass over expression. This reduces code duplication a bit, but the primary benefit that it makes JITing expression evaluation easier. When doing so we can't, as previously done in the interpreted case, really change opcode without recompiling. Nor dow we just carry around unnecessary branches to avoid re-checking over and over. As a minor side-effect this makes ExecEvalStepOp() O(log(N)) rather than O(N). Author: Andres Freund Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de	2017-12-29 12:45:25 -08:00
Andres Freund	4717fdb14c	Rely on executor utils to build targetlist for DML RETURNING. This is useful because it gets rid of the sole direct user of ExecAssignResultType(). A future commit will likely make use of that and combine creating the targetlist with the initialization of the result slot. But it seems like good code hygiene anyway. Author: Andres Freund Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de	2017-12-29 12:26:29 -08:00
Magnus Hagander	d02974e32e	Properly set base backup backends to active in pg_stat_activity When walsenders were included in pg_stat_activity, only the ones actually streaming WAL were listed as active when they were active. In particular, the connections sending base backups were listed as being idle. Which means that a regular pg_basebackup would show up with one active and one idle connection, when both were active. This patch updates to set all walsenders to active when they are (including those doing very fast things like IDENTIFY_SYSTEM), and then back to idle. Details about exactly what they are doing is available in pg_stat_replication. Patch by me, review by Michael Paquier and David Steele.	2017-12-29 16:28:32 +01:00
Simon Riggs	48c9f49265	Fix race condition when changing synchronous_standby_names A momentary window exists when synchronous_standby_names changes that allows commands issued after the change to continue to act as async until the change becomes visible. Remove the race by using more appropriate test in syncrep.c Author: Asim Rama Praveen and Ashwin Agrawal Reported-by: Xin Zhang, Ashwin Agrawal, and Asim Rama Praveen Reviewed-by: Michael Paquier, Masahiko Sawada	2017-12-29 14:30:33 +00:00
Simon Riggs	2958a672b1	Extend near-wraparound hints to include replication slots Author: Feike Steenbergen Reviewed-by: Michael Paquier	2017-12-29 14:01:25 +00:00
Andres Freund	f83040c62a	Fix rare assertion failure in parallel hash join. When a backend runs out of inner tuples to hash, it should detach from grow_batch_barrier only after it has flushed all batches to disk and merged counters, not before. Otherwise a concurrent backend in ExecParallelHashIncreaseNumBatches() could stop waiting for this backend and try to read tuples before they have been written. This commit reorders those operations and should fix the assertion failures seen occasionally on the build farm since commit `1804284042`. Author: Thomas Munro Discussion: https://postgr.es/m/E1eRwXy-0004IK-TO%40gemulon.postgresql.org	2017-12-28 02:41:53 -08:00
Alvaro Herrera	be2343221f	Protect against hypothetical memory leaks in RelationGetPartitionKey Also, fix a comment that commit `8a0596cb65` made obsolete. Reported-by: Robert Haas Discussion: http://postgr.es/m/CA+TgmoYbpuUUUp2GhYNwWm0qkah39spiU7uOiNXLz20ASfKYoA@mail.gmail.com	2017-12-27 18:06:14 -03:00
Robert Haas	62d02f39e7	Fix race-under-concurrency in PathNameCreateTemporaryDir. Thomas Munro Discussion: http://postgr.es/m/CAEepm=1Vp1e3KtftLtw4B60ZV9teNeKu6HxoaaBptQMsRWjJbQ@mail.gmail.com	2017-12-27 10:56:14 -08:00
Teodor Sigaev	ad337c76b6	Update relation's stats in pg_class during vacuum full. Hash index depends on estimation of numbers of tuples and pages of relations, incorrect value could be a reason of significantly growing of index. Vacuum full recreates heap and reindex all indexes before renewal stats. The patch fixes that, so indexes will see correct values. Backpatch to v10 only because earlier versions haven't usable hash index and growing of hash index is a single user-visible symptom. Author: Amit Kapila Reviewed-by: Ashutosh Sharma, me Discussion: https://www.postgresql.org/message-id/flat/20171115232922.5tomkxnw3iq6jsg7@inml.weebeastie.net	2017-12-27 18:25:37 +03:00
Teodor Sigaev	ff963b393c	Add polygon opclass for SP-GiST Polygon opclass uses compress method feature of SP-GiST added earlier. For now it's a single operator class which uses this feature. SP-GiST actually indexes a bounding boxes of input polygons, so part of supported operations are lossy. Opclass uses most methods of corresponding opclass over boxes of SP-GiST and treats bounding boxes as point in 4D-space. Bump catalog version. Authors: Nikita Glukhov, Alexander Korotkov with minor editorization by me Reviewed-By: all authors + Darafei Praliaskouski Discussion: https://www.postgresql.org/message-id/flat/54907069.1030506@sigaev.ru	2017-12-25 18:59:38 +03:00
Andres Freund	4e2970f880	Fix assert with side effects in the new PHJ code. Instead of asserting the assert just set the value to what it was supposed to test... Per coverity.	2017-12-24 02:57:55 -08:00
Tom Lane	c4c2885cbb	Fix UNION/INTERSECT/EXCEPT over no columns. Since 9.4, we've allowed the syntax "select union select" and variants of that. However, the planner wasn't expecting a no-column set operation and ended up treating the set operation as if it were UNION ALL. Turns out it's trivial to fix in v10 and later; we just need to be careful about not generating a Sort node with no sort keys. However, since a weird corner case like this is never going to be exercised by developers, we'd better have thorough regression tests if we want to consider it supported. Per report from Victor Yegorov. Discussion: https://postgr.es/m/CAGnEbojGJrRSOgJwNGM7JSJZpVAf8xXcVPbVrGdhbVEHZ-BUMw@mail.gmail.com	2017-12-22 12:08:06 -05:00
Teodor Sigaev	854823fa33	Add optional compression method to SP-GiST Patch allows to have different types of column and value stored in leaf tuples of SP-GiST. The main application of feature is to transform complex column type to simple indexed type or for truncating too long value, transformation could be lossy. Simple example: polygons are converted to their bounding boxes, this opclass follows. Authors: me, Heikki Linnakangas, Alexander Korotkov, Nikita Glukhov Reviewed-By: all authors + Darafei Praliaskouski Discussions: https://www.postgresql.org/message-id/5447B3FF.2080406@sigaev.ru https://www.postgresql.org/message-id/flat/54907069.1030506@sigaev.ru#54907069.1030506@sigaev.ru	2017-12-22 13:33:16 +03:00
Alvaro Herrera	9373baa0f7	Minor edits to catalog files and scripts This fixes a few typos and small mistakes; it also cleans a few minor stylistic issues. The biggest functional change is that Gen_fmgrtab.pl no longer knows the OID of language 'internal'. Author: John Naylor Discussion: https://postgr.es/m/CAJVSVGXAkwbk-A9QHHHf00N905kKisyQbaYwKqaRpze_gPXGfg@mail.gmail.com	2017-12-21 19:07:32 -03:00
Robert Haas	cce1ecfc77	Adjust assertion in GetCurrentCommandId. currentCommandIdUsed is only used to skip redundant increments of the command counter, and CommandCounterIncrement() is categorically denied under parallelism anyway. Therefore, it's OK for GetCurrentCommandId() to mark the counter value used, as long as it happens in the leader, not a worker. Prior to commit `e9baa5e9fa`, the slightly incorrect check didn't matter, but now it does. A test case added by commit `1804284042` uncovered the problem by accident; it caused failures with force_parallel_mode=on/regress. Report and review by Andres Freund. Patch by me. Discussion: http://postgr.es/m/20171221143106.5lhtygohvmazli3x@alap3.anarazel.de	2017-12-21 13:19:59 -05:00
Tom Lane	6719b238e8	Rearrange execution of PARAM_EXTERN Params for plpgsql's benefit. This patch does three interrelated things: * Create a new expression execution step type EEOP_PARAM_CALLBACK and add the infrastructure needed for add-on modules to generate that. As discussed, the best control mechanism for that seems to be to add another hook function to ParamListInfo, which will be called by ExecInitExpr if it's supplied and a PARAM_EXTERN Param is found. For stand-alone expressions, we add a new entry point to allow the ParamListInfo to be specified directly, since it can't be retrieved from the parent plan node's EState. * Redesign the API for the ParamListInfo paramFetch hook so that the ParamExternData array can be entirely virtual. This also lets us get rid of ParamListInfo.paramMask, instead leaving it to the paramFetch hook to decide which param IDs should be accessible or not. plpgsql_param_fetch was already doing the identical masking check, so having callers do it too seemed redundant. While I was at it, I added a "speculative" flag to paramFetch that the planner can specify as TRUE to avoid unwanted failures. This solves an ancient problem for plpgsql that it couldn't provide values of non-DTYPE_VAR variables to the planner for fear of triggering premature "record not assigned yet" or "field not found" errors during planning. * Rework plpgsql to get rid of the need for "unshared" parameter lists, by dint of turning the single ParamListInfo per estate into a nearly read-only data structure that doesn't instantiate any per-variable data. Instead, the paramFetch hook controls access to per-variable data and can make the right decisions on the fly, replacing the cases that we used to need multiple ParamListInfos for. This might perhaps have been a performance loss on its own, but by using a paramCompile hook we can bypass plpgsql_param_fetch entirely during normal query execution. (It's now only called when, eg, we copy the ParamListInfo into a cursor portal. copyParamList() or SerializeParamList() effectively instantiate the virtual parameter array as a simple physical array without a paramFetch hook, which is what we want in those cases.) This allows reverting most of commit `6c82d8d1f`, though I kept the cosmetic code-consolidation aspects of that (eg the assign_simple_var function). Performance testing shows this to be at worst a break-even change, and it can provide wins ranging up to 20% in test cases involving accesses to fields of "record" variables. The fact that values of such variables can now be exposed to the planner might produce wins in some situations, too, but I've not pursued that angle. In passing, remove the "parent" pointer from the arguments to ExecInitExprRec and related functions, instead storing that pointer in a transient field in ExprState. The ParamListInfo pointer for a stand-alone expression is handled the same way; we'd otherwise have had to add yet another recursively-passed-down argument in expression compilation. Discussion: https://postgr.es/m/32589.1513706441@sss.pgh.pa.us	2017-12-21 12:57:45 -05:00
Alvaro Herrera	8a0596cb65	Get rid of copy_partition_key That function currently exists to avoid leaking memory in CacheMemoryContext in case of trouble while the partition key is being built, but there's a better way: allocate everything in a memcxt that goes away if the current (sub)transaction fails, and once the partition key is built and no further errors can occur, make the memcxt permanent by making it a child of CacheMemoryContext. Reviewed-by: Tom Lane Discussion: https://postgr.es/m/20171027172730.eh2domlkpn4ja62m@alvherre.pgsql	2017-12-21 14:21:39 -03:00
Alvaro Herrera	9ef6aba1d3	Fix typo	2017-12-21 13:36:52 -03:00
Tom Lane	c98c35cd08	Avoid putting build-location-dependent strings into generated files. Various Perl scripts we use to generate files were in the habit of printing things like "generated by $0" into their output files. That looks like a fine idea at first glance, but it results in non-reproducible output, because in VPATH builds $0 won't be just the name of the script file, but a full path for it. We'd prefer that you get identical results whether using VPATH or not, so this is a bad thing. Some of these places also printed their input file name(s), causing an additional hazard of the same type. Hence, establish a policy that thou shalt not print $0, nor input file pathnames, into output files (they're still allowed in error messages, though). Instead just write the script name verbatim. While we are at it, we can make these annotations more useful by giving the script's full relative path name within the PG source tree, eg instead of Gen_fmgrtab.pl let's print src/backend/utils/Gen_fmgrtab.pl. Not all of the changes made here actually affect any files shipped in finished tarballs today, but it seems best to apply the policy everyplace so that nobody copies unsafe code into places where it could matter. Christoph Berg and Tom Lane Discussion: https://postgr.es/m/20171215102223.GB31812@msg.df7cb.de	2017-12-21 10:57:06 -05:00
Robert Haas	59d1e2b95a	Cancel CV sleep during subtransaction abort. Generally, error recovery paths that need to do things like LWLockReleaseAll and pgstat_report_wait_end also need to call ConditionVariableCancelSleep, but AbortSubTransaction was missed. Since subtransaction abort might destroy up the DSM segment that contains the ConditionVariable stored in cv_sleep_target, this can result in a crash for anything using condition variables. Reported and diagnosed by Andres Freund. Discussion: http://postgr.es/m/20171221110048.rxk6464azzl5t2fi@alap3.anarazel.de	2017-12-21 09:24:30 -05:00
Andres Freund	1804284042	Add parallel-aware hash joins. Introduce parallel-aware hash joins that appear in EXPLAIN plans as Parallel Hash Join with Parallel Hash. While hash joins could already appear in parallel queries, they were previously always parallel-oblivious and had a partial subplan only on the outer side, meaning that the work of the inner subplan was duplicated in every worker. After this commit, the planner will consider using a partial subplan on the inner side too, using the Parallel Hash node to divide the work over the available CPU cores and combine its results in shared memory. If the join needs to be split into multiple batches in order to respect work_mem, then workers process different batches as much as possible and then work together on the remaining batches. The advantages of a parallel-aware hash join over a parallel-oblivious hash join used in a parallel query are that it: * avoids wasting memory on duplicated hash tables * avoids wasting disk space on duplicated batch files * divides the work of building the hash table over the CPUs One disadvantage is that there is some communication between the participating CPUs which might outweigh the benefits of parallelism in the case of small hash tables. This is avoided by the planner's existing reluctance to supply partial plans for small scans, but it may be necessary to estimate synchronization costs in future if that situation changes. Another is that outer batch 0 must be written to disk if multiple batches are required. A potential future advantage of parallel-aware hash joins is that right and full outer joins could be supported, since there is a single set of matched bits for each hashtable, but that is not yet implemented. A new GUC enable_parallel_hash is defined to control the feature, defaulting to on. Author: Thomas Munro Reviewed-By: Andres Freund, Robert Haas Tested-By: Rafia Sabih, Prabhat Sahu Discussion: https://postgr.es/m/CAEepm=2W=cOkiZxcg6qiFQP-dHUe09aqTrEMM7yJDrHMhDv_RA@mail.gmail.com https://postgr.es/m/CAEepm=37HKyJ4U6XOLi=JgfSHM3o6B-GaeO-6hkOmneTDkH+Uw@mail.gmail.com	2017-12-21 00:43:41 -08:00
Robert Haas	f94eec490b	When passing query strings to workers, pass the terminating \0. Otherwise, when the query string is read, we might trailing garbage beyond the end, unless there happens to be a \0 there by good luck. Report and patch by Thomas Munro. Reviewed by Rafia Sabih. Discussion: http://postgr.es/m/CAEepm=2SJs7X+_vx8QoDu8d1SMEOxtLhxxLNzZun_BvNkuNhrw@mail.gmail.com	2017-12-20 17:26:50 -05:00
Robert Haas	8526bcb2df	Try again to fix accumulation of parallel worker instrumentation. When a Gather or Gather Merge node is started and stopped multiple times, accumulate instrumentation data only once, at the end, instead of after each execution, to avoid recording inflated totals. Commit `778e78ae9f`, the previous attempt at a fix, instead reset the state after every execution, which worked for the general instrumentation data but had problems for the additional instrumentation specific to Sort and Hash nodes. Report by hubert depesz lubaczewski. Analysis and fix by Amit Kapila, following a design proposal from Thomas Munro, with a comment tweak by me. Discussion: http://postgr.es/m/20171127175631.GA405@depesz.com	2017-12-19 12:21:56 -05:00
Robert Haas	38fc54703e	Re-fix wrong costing of Sort under Gather Merge. Commit `dc02c7bca4` changed this call to create_sort_path() to take -1 rather than limit_tuples because, at that time, there was no way for a Sort beneath a Gather Merge to become a top-N sort. Later, commit `3452dc5240` provided a way for a Sort beneath a Gather Merge to become a top-N sort, but failed to revert the previous commit in the process. Do that. Report and analysis by Jeff Janes; patch by Thomas Munro; review by Amit Kapila and by me. Discussion: http://postgr.es/m/CAEepm=1BWtC34vUroA0Uqjw02MaqdUrW+d6WD85_k8SLyPiKHQ@mail.gmail.com	2017-12-19 10:42:17 -05:00
Andres Freund	ab9e0e718a	Add shared tuplestores. SharedTuplestore allows multiple participants to write into it and then read the tuples back from it in parallel. Each reader receives partial results. For now it always uses disk files, but other buffering policies and other kinds of scans (ie each reader receives complete results) may be useful in future. The upcoming parallel hash join feature will use this facility. Author: Thomas Munro Reviewed-By: Peter Geoghegan, Andres Freund, Robert Haas Discussion: https://postgr.es/m/CAEepm=2W=cOkiZxcg6qiFQP-dHUe09aqTrEMM7yJDrHMhDv_RA@mail.gmail.com	2017-12-18 14:23:19 -08:00
Peter Eisentraut	25d532698d	Move SCRAM-related name definitions to scram-common.h Mechanism names for SCRAM and channel binding names have been included in scram.h by the libpq frontend code, and this header references a set of routines which are only used by the backend. scram-common.h is on the contrary usable by both the backend and libpq, so getting those names from there seems more reasonable. Author: Michael Paquier <michael.paquier@gmail.com>	2017-12-18 16:59:48 -05:00
Fujii Masao	56a95ee511	Fix bug in cancellation of non-exclusive backup to avoid assertion failure. Previously an assertion failure occurred when pg_stop_backup() for non-exclusive backup was aborted while it's waiting for WAL files to be archived. This assertion failure happened in do_pg_abort_backup() which was called when a non-exclusive backup was canceled. do_pg_abort_backup() assumes that there is at least one non-exclusive backup running when it's called. But pg_stop_backup() can be canceled even after it marks the end of non-exclusive backup (e.g., during waiting for WAL archiving). This broke the assumption that do_pg_abort_backup() relies on, and which caused an assertion failure. This commit changes do_pg_abort_backup() so that it does nothing when non-exclusive backup has been already marked as completed. That is, the asssumption is also changed, and do_pg_abort_backup() now can handle even the case where it's called when there is no running backup. Backpatch to 9.6 where SQL-callable non-exclusive backup was added. Author: Masahiko Sawada and Michael Paquier Reviewed-By: Robert Haas and Fujii Masao Discussion: https://www.postgresql.org/message-id/CAD21AoD2L1Fu2c==gnVASMyFAAaq3y-AQ2uEVj-zTCGFFjvmDg@mail.gmail.com	2017-12-19 03:46:14 +09:00
Robert Haas	fd7c0fa732	Fix crashes on plans with multiple Gather (Merge) nodes. es_query_dsa turns out to be broken by design, because it supposes that there is only one DSA for the whole query, whereas there is actually one per Gather (Merge) node. For now, work around that problem by setting and clearing the pointer around the sections of code that might need it. It's probably a better idea to get rid of es_query_dsa altogether in favor of having each node keep track individually of which DSA is relevant, but that seems like more than we would want to back-patch. Thomas Munro, reviewed and tested by Andreas Seltenreich, Amit Kapila, and by me. Discussion: http://postgr.es/m/CAEepm=1U6as=brnVvMNixEV2tpi8NuyQoTmO8Qef0-VV+=7MDA@mail.gmail.com	2017-12-18 12:22:31 -05:00
Magnus Hagander	7731c32087	Fix typo on comment Author: David Rowley	2017-12-18 11:24:55 +01:00
Tom Lane	b31a9d7dd3	Suppress compiler warning about no function return value. Compilers that don't know that ereport(ERROR) doesn't return complained about the new coding in scanint8() introduced by commit `101c7ee3e`. Tweak coding to avoid the warning. Per buildfarm.	2017-12-17 00:41:41 -05:00
Andres Freund	699bf7d05c	Perform a lot more sanity checks when freezing tuples. The previous commit has shown that the sanity checks around freezing aren't strong enough. Strengthening them seems especially important because the existance of the bug has caused corruption that we don't want to make even worse during future vacuum cycles. The errors are emitted with ereport rather than elog, despite being "should never happen" messages, so a proper error code is emitted. To avoid superflous translations, mark messages as internal. Author: Andres Freund and Alvaro Herrera Reviewed-By: Alvaro Herrera, Michael Paquier Discussion: https://postgr.es/m/20171102112019.33wb7g5wp4zpjelu@alap3.anarazel.de Backpatch: 9.3-	2017-12-14 18:20:47 -08:00
Andres Freund	9c2f0a6c3c	Fix pruning of locked and updated tuples. Previously it was possible that a tuple was not pruned during vacuum, even though its update xmax (i.e. the updating xid in a multixact with both key share lockers and an updater) was below the cutoff horizon. As the freezing code assumed, rightly so, that that's not supposed to happen, xmax would be preserved (as a member of a new multixact or xmax directly). That causes two problems: For one the tuple is below the xmin horizon, which can cause problems if the clog is truncated or once there's an xid wraparound. The bigger problem is that that will break HOT chains, which in turn can lead two to breakages: First, failing index lookups, which in turn can e.g lead to constraints being violated. Second, future hot prunes / vacuums can end up making invisible tuples visible again. There's other harmful scenarios. Fix the problem by recognizing that tuples can be DEAD instead of RECENTLY_DEAD, even if the multixactid has alive members, if the update_xid is below the xmin horizon. That's safe because newer versions of the tuple will contain the locking xids. A followup commit will harden the code somewhat against future similar bugs and already corrupted data. Author: Andres Freund, with changes by Alvaro Herrera Reported-By: Daniel Wood Analyzed-By: Andres Freund, Alvaro Herrera, Robert Haas, Peter Geoghegan, Daniel Wood, Yi Wen Wong, Michael Paquier Reviewed-By: Alvaro Herrera, Robert Haas, Michael Paquier Discussion: https://postgr.es/m/E5711E62-8FDF-4DCA-A888-C200BF6B5742@amazon.com https://postgr.es/m/20171102112019.33wb7g5wp4zpjelu@alap3.anarazel.de Backpatch: 9.3-	2017-12-14 18:20:47 -08:00
Andrew Dunstan	0fedb4ea69	Fix walsender timeouts when decoding a large transaction The logical slots have a fast code path for sending data so as not to impose too high a per message overhead. The fast path skips checks for interrupts and timeouts. However, the existing coding failed to consider the fact that a transaction with a large number of changes may take a very long time to be processed and sent to the client. This causes the walsender to ignore interrupts for potentially a long time and more importantly it will result in the walsender being killed due to timeout at the end of such a transaction. This commit changes the fast path to also check for interrupts and only allows calling the fast path when the last keepalive check happened less than half the walsender timeout ago. Otherwise the slower code path will be taken. Backpatched to 9.4 Petr Jelinek, reviewed by Kyotaro HORIGUCHI, Yura Sokolov, Craig Ringer and Robert Haas. Discussion: https://postgr.es/m/e082a56a-fd95-a250-3bae-0fff93832510@2ndquadrant.com	2017-12-14 11:13:14 -05:00
Andres Freund	538d114f6d	Allow executor nodes to change their ExecProcNode function. In order for executor nodes to be able to change their ExecProcNode function after ExecInitNode() has finished, provide ExecSetExecProcNode(). This allows any wrappers functions that only execProcnode.c knows about to be reinstalled. The motivation for wanting to change ExecProcNode after ExecInitNode() has finished is that it is not known until later whether parallel query is available, so if a parallel variant is to be installed then ExecInitNode() is too soon to decide. Author: Thomas Munro Reviewed-By: Andres Freund Discussion: https://postgr.es/m/CAEepm=09rr65VN+cAV5FgyM_z=D77Xy8Fuc9CDDDYbq3pQUezg@mail.gmail.com	2017-12-13 15:47:01 -08:00
Andres Freund	923e8dee88	Add defenses against pre-crash files to BufFileOpenShared(). Crash restarts currently don't clean up temporary files, as a debugging aid. If a left-over file happens to have the same name as a segment file we're trying to create, we'll just truncate and reuse it, but there is a problem: BufFileOpenShared() determines how many segment files exist by trying to open .0, .1, .2, ... until it finds no more files. It might be confused by a junk file that has the next segment number. To defend against that, make sure we always create a gap after the end file by unlinking the following name if it exists. Also make it an error to try to open a BufFile that doesn't exist (has no segment 0), so as not to encourage the development of client code that depends on an interface that we can't reliably provide. Author: Thomas Munro Reviewed-By: Andres Freund Discussion: https://postgr.es/m/CAEepm%3D2jhCbC_GFQJaaDhWxLB4EXtT3vVd5czuRNaqF5CWSTog%40mail.gmail.com	2017-12-13 13:27:41 -08:00
Robert Haas	884a60840c	Fix parallel index scan hang with deleted or half-dead pages. The previous coding forgot to release the scan before seizing it again, leading to a lockup. Report by Patrick Hemmer. Diagnosis by Thomas Munro. Patch by Amit Kapila. Discussion: http://postgr.es/m/CAEepm=2xZUcOGP9V0O_G0=2P2wwXwPrkF=upWTCJSisUxMnuSg@mail.gmail.com	2017-12-13 16:15:44 -05:00
Robert Haas	1d6fb35ad6	Revert "Fix accumulation of parallel worker instrumentation." This reverts commit `2c09a5c12a`. Per further discussion, that doesn't seem to be the best possible fix. Discussion: http://postgr.es/m/CAA4eK1LW2aFKzY3=vwvc=t-juzPPVWP2uT1bpx_MeyEqnM+p8g@mail.gmail.com	2017-12-13 15:19:28 -05:00
Tom Lane	9fa6f00b13	Rethink MemoryContext creation to improve performance. This patch makes a number of interrelated changes to reduce the overhead involved in creating/deleting memory contexts. The key ideas are: * Include the AllocSetContext header of an aset.c context in its first malloc request, rather than allocating it separately in TopMemoryContext. This means that we now always create an initial or "keeper" block in an aset, even if it never receives any allocation requests. * Create freelists in which we can save and recycle recently-destroyed asets (this idea is due to Robert Haas). * In the common case where the name of a context is a constant string, just store a pointer to it in the context header, rather than copying the string. The first change eliminates a palloc/pfree cycle per context, and also avoids bloat in TopMemoryContext, at the price that creating a context now involves a malloc/free cycle even if the context never receives any allocations. That would be a loser for some common usage patterns, but recycling short-lived contexts via the freelist eliminates that pain. Avoiding copying constant strings not only saves strlen() and strcpy() overhead, but is an essential part of the freelist optimization because it makes the context header size constant. Currently we make no attempt to use the freelist for contexts with non-constant names. (Perhaps someday we'll need to think harder about that, but in current usage, most contexts with custom names are long-lived anyway.) The freelist management in this initial commit is pretty simplistic, and we might want to refine it later --- but in common workloads that will never matter because the freelists will never get full anyway. To create a context with a non-constant name, one is now required to call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME option. AllocSetContextCreate becomes a wrapper macro, and it includes a test that will complain about non-string-literal context name parameters on gcc and similar compilers. An unfortunate side effect of making AllocSetContextCreate a macro is that one is now required to use the size parameter abstraction macros (ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of writing out individual size parameters no longer works unless you switch to AllocSetContextCreateExtended. Internally to the memory-context-related modules, the context creation APIs are simplified, removing the rather baroque original design whereby a context-type module called mcxt.c which then called back into the context-type module. That saved a bit of code duplication, but not much, and it prevented context-type modules from exercising control over the allocation of context headers. In passing, I converted the test-and-elog validation of aset size parameters into Asserts to save a few more cycles. The original thought was that callers might compute size parameters on the fly, but in practice nobody does that, so it's useless to expend cycles on checking those numbers in production builds. Also, mark the memory context method-pointer structs "const", just for cleanliness. Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us	2017-12-13 13:55:16 -05:00
Peter Eisentraut	3d8874224f	Fix crash when using CALL on an aggregate Author: Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> Reported-by: Rushabh Lathia <rushabh.lathia@gmail.com>	2017-12-13 10:37:48 -05:00
Andres Freund	8e211f5391	Add float.h include to int8.c, for isnan(). port.h redirects isnan() to _isnan() on windows, which in turn is provided by float.h rather than math.h. Therefore include the latter as well. Per buildfarm.	2017-12-12 23:32:43 -08:00
Andres Freund	f512a6e132	Consistently use PG_INT(16\|32\|64)_(MIN\|MAX). Per buildfarm animal woodlouse.	2017-12-12 18:19:13 -08:00
Andres Freund	101c7ee3ee	Use new overflow aware integer operations. A previous commit added inline functions that provide fast(er) and correct overflow checks for signed integer math. Use them in a significant portion of backend code. There's more to touch in both backend and frontend code, but these were the easily identifiable cases. The old overflow checks are noticeable in integer heavy workloads. A secondary benefit is that getting rid of overflow checks that rely on signed integer overflow wrapping around, will allow us to get rid of -fwrapv in the future. Which in turn slows down other code. Author: Andres Freund Discussion: https://postgr.es/m/20171024103954.ztmatprlglz3rwke@alap3.anarazel.de	2017-12-12 16:55:37 -08:00
Robert Haas	95b52351fe	Remove obsolete comment. Commit `8b304b8b72` removed replacement selection, but left behind this comment text. The optimization to which the comment refers is not relevant without replacement selection, because if we had so few tuples as to require only one tape, we would have just completed the sort in memory. Peter Geoghegan Discussion: http://postgr.es/m/CAH2-WznqupLA8CMjp+vqzoe0yXu0DYYbQSNZxmgN76tLnAOZ_w@mail.gmail.com	2017-12-12 19:33:50 -05:00
Robert Haas	d329dc2ea4	Remove bug from OPTIMIZER_DEBUG code for partition-wise join. Etsuro Fujita, reviewed by Ashutosh Bapat Discussion: http://postgr.es/m/5A2A60E6.6000008@lab.ntt.co.jp	2017-12-12 10:52:15 -05:00
Peter Eisentraut	4034db215b	Fix comment Reported-by: Noah Misch <noah@leadboat.com>	2017-12-11 16:37:39 -05:00
Tom Lane	7eb16ab17d	Fix corner-case coredump in _SPI_error_callback(). I noticed that _SPI_execute_plan initially sets spierrcontext.arg = NULL, and only fills it in some time later. If an error were to happen in between, _SPI_error_callback would try to dereference the null pointer. This is unlikely --- there's not much between those points except push-snapshot calls --- but it's clearly not impossible. Tweak the callback to do nothing if the pointer isn't set yet. It's been like this for awhile, so back-patch to all supported branches.	2017-12-11 16:34:28 -05:00
Robert Haas	01a0ca1bed	Improve comment about PartitionBoundInfoData. Ashutosh Bapat, per discussion with Julien Rouhaund, who also reviewed this patch. Discussion: http://postgr.es/m/CAFjFpReBR3ftK9C23LLCZY_TDXhhjB_dgE-L9+mfTnA=gkvdvQ@mail.gmail.com	2017-12-11 12:52:15 -05:00
Magnus Hagander	d8f632caec	Fix typo Reported by Robins Tharakan	2017-12-09 11:40:31 +01:00
Peter Eisentraut	005ac298b1	Prohibit identity columns on typed tables and partitions Those cases currently crash and supporting them is more work then originally thought, so we'll just prohibit these scenarios for now. Author: Michael Paquier <michael.paquier@gmail.com> Reviewed-by: Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> Reported-by: Мансур Галиев <gomer94@yandex.ru> Bug: #14866	2017-12-08 12:13:04 -05:00
Peter Eisentraut	af9f8b7ca3	Fix mistake in comment Reported-by: Masahiko Sawada <sawada.mshk@gmail.com>	2017-12-08 11:23:36 -05:00
Peter Eisentraut	2d2d06b7e2	Apply identity sequence values on COPY A COPY into a table should apply identity sequence values just like it does for ordinary defaults. This was previously forgotten, leading to null values being inserted, which in turn would fail because identity columns have not-null constraints. Author: Michael Paquier <michael.paquier@gmail.com> Reported-by: Steven Winfield <steven.winfield@cantabcapital.com> Bug: #14952	2017-12-08 09:18:18 -05:00
Robert Haas	28724fd90d	Report failure to start a background worker. When a worker is flagged as BGW_NEVER_RESTART and we fail to start it, or if it is not marked BGW_NEVER_RESTART but is terminated before startup succeeds, what BgwHandleStatus should be reported? The previous code really hadn't considered this possibility (as indicated by the comments which ignore it completely) and would typically return BGWH_NOT_YET_STARTED, but that's not a good answer, because then there's no way for code using GetBackgroundWorkerPid() to tell the difference between a worker that has not started but will start later and a worker that has not started and will never be started. So, when this case happens, return BGWH_STOPPED instead. Update the comments to reflect this. The preceding fix by itself is insufficient to fix the problem, because the old code also didn't send a notification to the process identified in bgw_notify_pid when startup failed. That might've been technically correct under the theory that the status of the worker was BGWH_NOT_YET_STARTED, because the status would indeed not change when the worker failed to start, but now that we're more usefully reporting BGWH_STOPPED, a notification is needed. Without these fixes, code which starts background workers and then uses the recommended APIs to wait for those background workers to start would hang indefinitely if the postmaster failed to fork a worker. Amit Kapila and Robert Haas Discussion: http://postgr.es/m/CAA4eK1KDfKkvrjxsKJi3WPyceVi3dH1VCkbTJji2fuwKuB=3uw@mail.gmail.com	2017-12-06 08:58:27 -05:00
Robert Haas	9c64ddd414	Fix Parallel Append crash. Reported by Tom Lane and the buildfarm. Amul Sul and Amit Khandekar Discussion: http://postgr.es/m/17868.1512519318@sss.pgh.pa.us Discussion: http://postgr.es/m/CAJ3gD9cJQ4d-XhmZ6BqM9rMM2KDBfpkdgOAb4+psz56uBuMQ_A@mail.gmail.com	2017-12-06 08:42:50 -05:00
Robert Haas	ab72716778	Support Parallel Append plan nodes. When we create an Append node, we can spread out the workers over the subplans instead of piling on to each subplan one at a time, which should typically be a bit more efficient, both because the startup cost of any plan executed entirely by one worker is paid only once and also because of reduced contention. We can also construct Append plans using a mix of partial and non-partial subplans, which may allow for parallelism in places that otherwise couldn't support it. Unfortunately, this patch doesn't handle the important case of parallelizing UNION ALL by running each branch in a separate worker; the executor infrastructure is added here, but more planner work is needed. Amit Khandekar, Robert Haas, Amul Sul, reviewed and tested by Ashutosh Bapat, Amit Langote, Rafia Sabih, Amit Kapila, and Rajkumar Raghuwanshi. Discussion: http://postgr.es/m/CAJ3gD9dy0K_E8r727heqXoBmWZ83HwLFwdcaSSmBQ1+S+vRuUQ@mail.gmail.com	2017-12-05 17:28:39 -05:00
Robert Haas	2c09a5c12a	Fix accumulation of parallel worker instrumentation. When a Gather or Gather Merge node is started and stopped multiple times, the old code wouldn't reset the shared state between executions, potentially resulting in dramatically inflated instrumentation data for nodes beneath it. (The per-worker instrumentation ended up OK, I think, but the overall totals were inflated.) Report by hubert depesz lubaczewski. Analysis and fix by Amit Kapila, reviewed and tweaked a bit by me. Discussion: http://postgr.es/m/20171127175631.GA405@depesz.com	2017-12-05 14:35:33 -05:00
Andres Freund	5bcf389ecf	Fix EXPLAIN ANALYZE of hash join when the leader doesn't participate. If a hash join appears in a parallel query, there may be no hash table available for explain.c to inspect even though a hash table may have been built in other processes. This could happen either because parallel_leader_participation was set to off or because the leader happened to hit the end of the outer relation immediately (even though the complete relation is not empty) and decided not to build the hash table. Commit `bf11e7ee` introduced a way for workers to exchange instrumentation via the DSM segment for Sort nodes even though they are not parallel-aware. This commit does the same for Hash nodes, so that explain.c has a way to find instrumentation data from an arbitrary participant that actually built the hash table. Author: Thomas Munro Reviewed-By: Andres Freund Discussion: https://postgr.es/m/CAEepm%3D3DUQC2-z252N55eOcZBer6DPdM%3DFzrxH9dZc5vYLsjaA%40mail.gmail.com	2017-12-05 10:55:56 -08:00
Tom Lane	8dc3c971a9	Treat directory open failures as hard errors in ResetUnloggedRelations(). Previously, this code just reported such problems at LOG level and kept going. The problem with this approach is that transient failures (e.g., ENFILE) could prevent us from resetting unlogged relations to empty, yet allow recovery to appear to complete successfully. That seems like a data corruption hazard large enough to treat such problems as reasons to fail startup. For the same reason, treat unlink failures for unlogged files as hard errors not just LOG messages. It's a little odd that we did it like that when file-level errors in other steps (copy_file, fsync_fname) are ERRORs. The sole case that I left alone is that ENOENT failure on a tablespace (not database) directory is not an error, though it will now be logged rather than just silently ignored. This is to cover the scenario where a previous DROP TABLESPACE removed the tablespace directory but failed before removing the pg_tblspc symlink. I'm not sure that that's very likely in practice, but that seems like the only real excuse for the old behavior here, so let's allow for it. (As coded, this will also allow ENOENT on $PGDATA/base/. But since we'll fail soon enough if that's gone, I don't think we need to complicate this code by distinguishing that from a true tablespace case.) Discussion: https://postgr.es/m/21040.1512418508@sss.pgh.pa.us	2017-12-04 20:52:59 -05:00
Tom Lane	066bc21c0e	Simplify do_pg_start_backup's API by opening pg_tblspc internally. do_pg_start_backup() expects its callers to pass in an open DIR pointer for the pg_tblspc directory, but there's no apparent advantage in that. It complicates the callers without adding any flexibility, and there's no robustness advantage, since we surely have to be prepared for errors during the scan of pg_tblspc anyway. In fact, by holding an extra kernel resource during operations like the preliminary checkpoint, we might be making things a fraction more failure-prone not less. Hence, remove that argument and open the directory just for the duration of the actual scan. Discussion: https://postgr.es/m/28752.1512413887@sss.pgh.pa.us	2017-12-04 18:37:54 -05:00
Tom Lane	561885db05	Improve error handling in RemovePgTempFiles(). Modify this function and its subsidiaries so that syscall failures are reported via ereport(LOG), rather than silently ignored as before. We don't want to throw a hard ERROR, as that would prevent database startup, and getting rid of leftover temporary files is not important enough for that. On the other hand, not reporting trouble at all seems like an odd choice not in line with current project norms, especially since any failure here is quite unexpected. On the same reasoning, adjust these functions' AllocateDir/ReadDir calls so that failure to scan a directory results in LOG not ERROR. I also removed the previous practice of silently ignoring ENOENT failures during directory opens --- there are some corner cases where that could happen given a previous database crash, but that seems like a bad excuse for ignoring a condition that isn't expected in most cases. A LOG message during postmaster start seems OK in such situations, and better than no output at all. In passing, make RemovePgTempRelationFiles' test for "is the file name all digits" look more like the way it's done elsewhere. Discussion: https://postgr.es/m/19907.1512402254@sss.pgh.pa.us	2017-12-04 17:59:35 -05:00
Tom Lane	2069e6faa0	Clean up assorted messiness around AllocateDir() usage. This patch fixes a couple of low-probability bugs that could lead to reporting an irrelevant errno value (and hence possibly a wrong SQLSTATE) concerning directory-open or file-open failures. It also fixes places where we took shortcuts in reporting such errors, either by using elog instead of ereport or by using ereport but forgetting to specify an errcode. And it eliminates a lot of just plain redundant error-handling code. In service of all this, export fd.c's formerly-static function ReadDirExtended, so that external callers can make use of the coding pattern dir = AllocateDir(path); while ((de = ReadDirExtended(dir, path, LOG)) != NULL) if they'd like to treat directory-open failures as mere LOG conditions rather than errors. Also fix FreeDir to be a no-op if we reach it with dir == NULL, as such a coding pattern would cause. Then, remove code at many call sites that was throwing an error or log message for AllocateDir failure, as ReadDir or ReadDirExtended can handle that job just fine. Aside from being a net code savings, this gets rid of a lot of not-quite-up-to-snuff reports, as mentioned above. (In some places these changes result in replacing a custom error message such as "could not open tablespace directory" with more generic wording "could not open directory", but it was agreed that the custom wording buys little as long as we report the directory name.) In some other call sites where we can't just remove code, change the error reports to be fully project-style-compliant. Also reorder code in restoreTwoPhaseData that was acquiring a lock between AllocateDir and ReadDir; in the unlikely but surely not impossible case that LWLockAcquire changes errno, AllocateDir failures would be misreported. There is no great value in opening the directory before acquiring TwoPhaseStateLock, so just do it in the other order. Also fix CheckXLogRemoved to guarantee that it preserves errno, as quite a number of call sites are implicitly assuming. (Again, it's unlikely but I think not impossible that errno could change during a SpinLockAcquire. If so, this function was broken for its own purposes as well as breaking callers.) And change a few places that were using not-per-project-style messages, such as "could not read directory" when "could not open directory" is more correct. Back-patch the exporting of ReadDirExtended, in case we have occasion to back-patch some fix that makes use of it; it's not needed right now but surely making it global is pretty harmless. Also back-patch the restoreTwoPhaseData and CheckXLogRemoved fixes. The rest of this is essentially cosmetic and need not get back-patched. Michael Paquier, with a bit of additional work by me Discussion: https://postgr.es/m/CAB7nPqRpOCxjiirHmebEFhXVTK7V5Jvw4bz82p7Oimtsm3TyZA@mail.gmail.com	2017-12-04 17:02:56 -05:00
Robert Haas	ab6eaee884	When VACUUM or ANALYZE skips a concurrently dropped table, log it. Hopefully, the additional logging will help avoid confusion that could otherwise result. Nathan Bossart, reviewed by Michael Paquier, Fabrízio Mello, and me	2017-12-04 15:25:55 -05:00
Tom Lane	ecc27d55f4	Support boolean columns in functional-dependency statistics. There's no good reason that the multicolumn stats stuff shouldn't work on booleans. But it looked only for "Var = pseudoconstant" clauses, and it will seldom find those for boolean Vars, since earlier phases of planning will fold "boolvar = true" or "boolvar = false" to just "boolvar" or "NOT boolvar" respectively. Improve dependencies_clauselist_selectivity() to recognize such clauses as equivalent to equality restrictions. This fixes a failure of the extended stats mechanism to apply in a case reported by Vitaliy Garnashevich. It's not a complete solution to his problem because the bitmap-scan costing code isn't consulting extended stats where it should, but that's surely an independent issue. In passing, improve some comments, get rid of a NumRelids() test that's redundant with the preceding bms_membership() test, and fix dependencies_clauselist_selectivity() so that estimatedclauses actually is a pure output argument as stated by its API contract. Back-patch to v10 where this code was introduced. Discussion: https://postgr.es/m/73a4936d-2814-dc08-ed0c-978f76f435b0@gmail.com	2017-12-04 11:51:43 -05:00
Robert Haas	9f4992e2a9	Remove memory leak protection from Gather and Gather Merge nodes. Before commit `6b65a7fe62`, tqueue.c could perform tuple remapping and thus leak memory, which is why commit `af33039317` made TupleQueueReaderNext run in a short-lived context. Now, however, tqueue.c has been reduced to a shadow of its former self, and there shouldn't be any chance of leaks any more. Accordingly, remove some tuple copying and memory context manipulation to speed up processing. Patch by me, reviewed by Amit Kapila. Some testing by Rafia Sabih. Discussion: http://postgr.es/m/CAA4eK1LSDydwrNjmYSNkfJ3ZivGSWH9SVswh6QpNzsMdj_oOQA@mail.gmail.com	2017-12-04 10:39:24 -05:00
Andres Freund	ec6a040056	Adjust #ifdef EXEC_BACKEND RemovePgTempFilesInDir() call. Other callers were adjusted in the course of `dc6c4c9dc2`. Per buildfarm.	2017-12-01 17:28:05 -08:00
Andres Freund	dc6c4c9dc2	Add infrastructure for sharing temporary files between backends. SharedFileSet allows temporary files to be created by one backend and then exported for read-only access by other backends, with clean-up managed by reference counting associated with a DSM segment. This includes changes to fd.c and buffile.c to support the new kind of temporary file. This will be used by an upcoming patch adding support for parallel hash joins. Author: Thomas Munro Reviewed-By: Peter Geoghegan, Andres Freund, Robert Haas, Rushabh Lathia Discussion: https://postgr.es/m/CAEepm=2W=cOkiZxcg6qiFQP-dHUe09aqTrEMM7yJDrHMhDv_RA@mail.gmail.com https://postgr.es/m/CAH2-WznJ_UgLux=_jTgCQ4yFz0iBntudsNKa1we3kN1BAG=88w@mail.gmail.com	2017-12-01 16:30:56 -08:00
Robert Haas	35438e5763	Minor code beautification in partition_bounds_equal. Use get_greatest_modulus more consistently, instead of doing the same thing in an ad-hoc manner in this one place. Ashutosh Bapat Discussion: http://postgr.es/m/CAFjFpReT9L4RCiJBKOyWC2=i02kv9uG2fx=4Fv7kFY2t0SPCgw@mail.gmail.com	2017-12-01 13:52:59 -05:00
Robert Haas	87c37e3291	Re-allow INSERT .. ON CONFLICT DO NOTHING on partitioned tables. Commit `8355a011a0` was reverted in `f05230752d`, but this attempt is hopefully better-considered: we now pass the correct value to ExecOpenIndices, which should avoid the crash that we hit before. Amit Langote, reviewed by Simon Riggs and by me. Some final editing by me. Discussion: http://postgr.es/m/7ff1e8ec-dc39-96b1-7f47-ff5965dceeac@lab.ntt.co.jp	2017-12-01 12:53:21 -05:00
Robert Haas	1cbc17aaca	Try to exclude partitioned tables in toto. Ashutosh Bapat, reviewed by Jeevan Chalke. Comment by me. Discussion: http://postgr.es/m/CAFjFpRcuRaydz88CY_aQekmuvmN2A9ax5z0k=ppT+s8KS8xMRA@mail.gmail.com	2017-12-01 10:59:09 -05:00
Robert Haas	59c8078744	Fix uninitialized memory reference. Without this, when partdesc->nparts == 0, we end up calling ExecBuildSlotPartitionKeyDescription without initializing values and isnull. Reported by Coverity via Michael Paquier. Patch by Michael Paquier, reviewed and revised by Amit Langote. Discussion: http://postgr.es/m/CAB7nPqQ3mwkdMoPY-ocgTpPnjd8TKOadMxdTtMLvEzF8480Zfg@mail.gmail.com	2017-12-01 10:05:00 -05:00
Peter Eisentraut	86ab28fbd1	Check channel binding flag at end of SCRAM exchange We need to check whether the channel-binding flag encoded in the client-final-message is the same one sent in the client-first-message. Reviewed-by: Michael Paquier <michael.paquier@gmail.com>	2017-12-01 09:53:26 -05:00
Robert Haas	06ae669c92	Remove extra word from comment. David Rowley, who also was the primary author of the patch that added this function; the attribution in my previous commit, `84940644de`, was incorrect due to sloppiness on my part. Discussion: http://postgr.es/m/CAKJS1f_0iSiLQsf_c06AzOWAc3eS6ePjjVQFpcFv3W-O5aktnQ@mail.gmail.com	2017-11-30 16:22:38 -05:00
Peter Eisentraut	e4128ee767	SQL procedures This adds a new object type "procedure" that is similar to a function but does not have a return type and is invoked by the new CALL statement instead of SELECT or similar. This implementation is aligned with the SQL standard and compatible with or similar to other SQL implementations. This commit adds new commands CALL, CREATE/ALTER/DROP PROCEDURE, as well as ALTER/DROP ROUTINE that can refer to either a function or a procedure (or an aggregate function, as an extension to SQL). There is also support for procedures in various utility commands such as COMMENT and GRANT, as well as support in pg_dump and psql. Support for defining procedures is available in all the languages supplied by the core distribution. While this commit is mainly syntax sugar around existing functionality, future features will rely on having procedures as a separate object type. Reviewed-by: Andrew Dunstan <andrew.dunstan@2ndquadrant.com>	2017-11-30 11:03:20 -05:00
Robert Haas	1761653bbb	Make create_unique_path manage memory like mark_dummy_rel. Put the unique path in the same context as the owning RelOptInfo, rather than the toplevel planner context. This is how this function worked originally, but commit `f41803bb39` changed it without explanation. mark_dummy_rel adopted the older (or newer?) technique in commit `eca75a12a2`, which also featured a much better explanation of why it is correct. So, switch back to that technique here, with the same explanation given there. Although this fixes a possible memory leak when GEQO is in use, the leak is minor and probably nobody cares, so no back-patch. Ashutosh Bapat, reviewed by Tom Lane and by me Discussion: http://postgr.es/m/CAFjFpRcXkHHrXyD9BCvkgGJV4TnHG2SWJ0PhJfrDu3NAcQvh7g@mail.gmail.com	2017-11-30 09:50:10 -05:00
Tom Lane	7ca25b7de6	Fix neqjoinsel's behavior for semi/anti join cases. Previously, this function estimated the selectivity as 1 minus eqjoinsel() for the negator equality operator, regardless of join type (I think there was an expectation that eqjoinsel would handle the join type). But actually this is completely wrong for semijoin cases: the fraction of the LHS that has a non-matching row is not one minus the fraction of the LHS that has a matching row. In reality a semijoin with <> will nearly always succeed: it can only fail when the RHS is empty, or it contains a single distinct value that is equal to the particular LHS value, or the LHS value is null. The only one of those things we should have much confidence in estimating is the fraction of LHS values that are null, so let's just take the selectivity as 1 minus outer nullfrac. Per coding convention, antijoin should be estimated the same as semijoin. Arguably this is a bug fix, but in view of the lack of field complaints and the risk of destabilizing plans, no back-patch. Thomas Munro, reviewed by Ashutosh Bapat Discussion: https://postgr.es/m/CAEepm=270ze2hVxWkJw-5eKzc3AB4C9KpH3L2kih75R5pdSogg@mail.gmail.com	2017-11-29 22:00:37 -05:00
Andres Freund	1145acc70d	Add a barrier primitive for synchronizing backends. Provide support for dynamic or static parties of processes to wait for all processes to reach point in the code before continuing. This is similar to the mechanism of the same name in POSIX threads and MPI, though has explicit phasing and dynamic party support like the Java core library's Phaser. This will be used by an upcoming patch adding support for parallel hash joins. Author: Thomas Munro Reviewed-By: Andres Freund Discussion: https://postgr.es/m/CAEepm=2_y7oi01OjA_wLvYcWMc9_d=LaoxrY3eiROCZkB_qakA@mail.gmail.com	2017-11-29 17:07:16 -08:00
Robert Haas	84940644de	New C function: bms_add_range This will be used by pending patches to improve partition pruning. Amit Langote and Kyotaro Horiguchi, per a suggestion from David Rowley. Review and testing of the larger patch set of which this is a part by Ashutosh Bapat, David Rowley, Dilip Kumar, Jesper Pedersen, Rajkumar Raghuwanshi, Beena Emerson, Amul Sul, and Kyotaro Horiguchi. Discussion: http://postgr.es/m/098b9c71-1915-1a2a-8d52-1a7a50ce79e8@lab.ntt.co.jp	2017-11-29 17:12:05 -05:00
Robert Haas	eaedf0df71	Update typedefs.list and re-run pgindent Discussion: http://postgr.es/m/CA+TgmoaA9=1RWKtBWpDaj+sF3Stgc8sHgf5z=KGtbjwPLQVDMA@mail.gmail.com	2017-11-29 09:24:24 -05:00
Tom Lane	801386af62	Clarify old comment about qual_is_pushdown_safe's handling of subplans. This comment glossed over the difference between initplans and subplans, but they are indeed different for our purposes here.	2017-11-28 23:32:17 -05:00
Alvaro Herrera	3848d21673	Make memset() use sizeof() rather than re-compute size This is simpler and more closely follows overwhelming precedent. Report and patch by Mark Dilger. Discussion: https://postgr.es/m/9A68FB88-5F45-4848-9926-8586E2D777D1@gmail.com	2017-11-29 00:13:45 -03:00
Alvaro Herrera	414cd434ff	Fix extstat collection when no stats are produced for a column This is a mistakenly placed conditional in `bf2a691e02`. Reported by Justin Pryzby Discussion: https://postgr.es/m/20171117214352.GE25796@telsasoft.com	2017-11-28 23:40:11 -03:00
Robert Haas	cd482f2951	Fix wrong function name in comment. Rushabh Lathia Discussion: http://postgr.es/m/CAGPqQf2z5g+7YmGZSZgKoiFsaUB+63Rzmz8-5PQHuS6hd14FEg@mail.gmail.com	2017-11-28 14:17:21 -05:00
Robert Haas	2d7950f222	If a range-partitioned table has no default partition, reject null keys. Commit `4e5fe9ad19` introduced this problem. Also add a test so it doesn't get broken again. Report by Rushabh Lathia. Fix by Amit Langote. Reviewed by Rushabh Lathia and Amul Sul. Tweaked by me. Discussion: http://postgr.es/m/CAGPqQf0Y1iJyk4QJBdMf=pS9i6Q0JUMM_h5-qkR3OMJ-e04PyA@mail.gmail.com	2017-11-28 14:11:16 -05:00
Robert Haas	445dbd82a3	Fix ReinitializeParallelDSM to tolerate finding no error queues. Commit `d466335064` changed things so that shm_toc_lookup would fail with an error rather than silently returning NULL in the hope that such failures would be reported in a useful way rather than via a system crash. However, it overlooked the fact that the lookup of PARALLEL_KEY_ERROR_QUEUE in ReinitializeParallelDSM is expected to fail when no DSM segment was created in the first place; in that case, we end up with a backend-private memory segment that still contains an entry for PARALLEL_KEY_FIXED but no others. Consequently a benign failure to initialize parallelism can escalate into an elog(ERROR); repair. Discussion: http://postgr.es/m/CA+Tgmob8LFw55DzH1QEREpBEA9RJ_W_amhBFCVZ6WMwUhVpOqg@mail.gmail.com	2017-11-28 12:15:38 -05:00
Robert Haas	c6755e233b	Teach bitmap heap scan to cope with absence of a DSA. If we have a plan that uses parallelism but are unable to execute it using parallelism, for example due to a lack of available DSM segments, then the EState's es_query_dsa will be NULL. Parallel bitmap heap scan needs to fall back to a non-parallel scan in such cases. Patch by me, reviewed by Dilip Kumar Discussion: http://postgr.es/m/CAEepm=0kADK5inNf_KuemjX=HQ=PuTP0DykM--fO5jS5ePVFEA@mail.gmail.com	2017-11-28 11:44:59 -05:00
Robert Haas	7b88d63a91	Add null test to partition constraint for default range partitions. Non-default range partitions have a constraint which include null tests, and both default and non-default list partitions also have a constraint which includes null tests, but for some reason this was missed for default range partitions. This could cause the partition constraint to evaluate to false for rows that were (correctly) routed to that partition by insert tuple routing, which could in turn cause constraint exclusion to prune the default partition in cases where it should not. Amit Langote, reviewed by Kyotaro Horiguchi Discussion: http://postgr.es/m/ba7aaeb1-4399-220e-70b4-62eade1522d0@lab.ntt.co.jp	2017-11-28 10:51:01 -05:00
Robert Haas	487a0c1518	Fix typo. Masahiko Sawada Discussion: http://postgr.es/m/CAD21AoCq_QG6UEo6yb-purmhLQjDLsCA2_B+8Wh3ah9P2-XmrQ@mail.gmail.com	2017-11-28 08:19:01 -05:00

1 2 3 4 5 ...

17864 Commits