postgresql

Commit Graph

Author	SHA1	Message	Date
Tom Lane	af33039317	Fix worst memory leaks in tqueue.c. TupleQueueReaderNext() leaks like a sieve if it has to do any tuple disassembly/reconstruction. While we could try to clean up its allocations piecemeal, it seems like a better idea just to insist that it should be run in a short-lived memory context, so that any transient space goes away automatically. I chose to have nodeGather.c switch into its existing per-tuple context before the call, rather than inventing a separate context inside tqueue.c. This is sufficient to stop all leakage in the simple case I exhibited earlier today (see link below), but it does not deal with leaks induced in more complex cases by tqueue.c's insistence on using TopMemoryContext for data that it's not actually trying hard to keep track of. That issue is intertwined with another major source of inefficiency, namely failure to cache lookup results across calls, so it seems best to deal with it separately. In passing, improve some comments, and modify gather_readnext's method for deciding when it's visited all the readers so that it's more obviously correct. (I'm not actually convinced that the previous code is correct in the case of a reader deletion; it certainly seems fragile.) Discussion: <32763.1469821037@sss.pgh.pa.us>	2016-07-29 19:31:06 -04:00
Tom Lane	bf4ae685ae	Fix tqueue.c's range-remapping code. It's depressingly clear that nobody ever tested this.	2016-07-29 14:13:19 -04:00
Peter Eisentraut	ef5d4a3cfa	Message style improvements	2016-07-28 16:34:44 -04:00
Tom Lane	e1a93dd6ae	tqueue.c's record-typmod hashtables need the HASH_BLOBS option. The keys are integers, not strings. The code accidentally worked on little-endian machines, at least up to 256 distinct record types within a session, but failed utterly on big-endian. This was unexpectedly exposed by a test case added by commit `4452000f3`, which apparently is the only parallelizable query in the regression suite that uses more than one anonymous record type. Fortunately, buildfarm member mandrill is big-endian and is running with force_parallel_mode on, so it failed.	2016-07-28 02:08:52 -04:00
Robert Haas	4bc424b968	pgindent run for 9.6	2016-06-09 18:02:36 -04:00
Robert Haas	c6dbf1fe79	Stop the executor if no more tuples can be sent from worker to leader. If a Gather node has read as many tuples as it needs (for example, due to Limit) it may detach the queue connecting it to the worker before reading all of the worker's tuples. Rather than let the worker continue to generate and send all of the results, have it stop after sending the next tuple. More could be done here to stop the worker even quicker, but this is about as well as we can hope to do for 9.6. This is in response to a problem report from Andreas Seltenreich. Commit `44339b892a` should be actually be sufficient to fix that example even without this change, but it seems better to do this, too, since we might otherwise waste quite a large amount of effort in one or more workers. Discussion: CAA4eK1KOKGqmz9bGu+Z42qhRwMbm4R5rfnqsLCNqFs9j14jzEA@mail.gmail.com Amit Kapila	2016-06-06 14:52:58 -04:00
Bruce Momjian	ee94300446	Update copyright for 2016 Backpatch certain files through 9.1	2016-01-02 13:33:40 -05:00
Robert Haas	2bdfcb52c5	Fix TupleQueueReaderNext not to ignore its nowait argument. This was a silly goof on my (rhaas's) part. Report and fix by Rushabh Lathia.	2015-12-18 12:37:43 -05:00
Robert Haas	adeee97486	Fix dumb bug in tqueue.c When I wrote this code originally, the intention was to recompute the remapinfo only when the tupledesc changes. This presumably only happens once per query, but I copied the design pattern from other DestReceivers. However, due to a silly oversight on my part, tqueue->tupledesc never got set, leading to recomputation for every tuple. This should improve the performance of parallel scans that return a significant number of tuples. Report by Amit Kapila; patch by me, reviewed by him.	2015-11-18 08:25:33 -05:00
Robert Haas	179c97bf58	Remove accidentally-committed debugging code. Amit Kapila	2015-11-15 18:07:57 -05:00
Robert Haas	89ff5c7f75	Add a dummy return statement to TupleQueueRemap. This is unreachable for multiple reasons, but per Amit Kapila the Windows compiler he is using still thinks we can get there.	2015-11-09 10:45:32 -05:00
Robert Haas	fba60e573e	Remove set-but-not-used variables. Reported by both Peter Eisentraunt and Kevin Grittner.	2015-11-07 20:25:32 -05:00
Robert Haas	8d7396e509	Try to convince gcc that TupleQueueRemap never falls off the end. Without this, MacOS gcc version 4.2.1 isn't convinced.	2015-11-06 23:04:21 -05:00
Robert Haas	6e71dd7ce9	Modify tqueue infrastructure to support transient record types. Commit `4a4e6893aa`, which introduced this mechanism, failed to account for the fact that the RECORD pseudo-type uses transient typmods that are only meaningful within a single backend. Transferring such tuples without modification between two cooperating backends does not work. This commit installs a system for passing the tuple descriptors over the same shm_mq being used to send the tuples themselves. The two sides might not assign the same transient typmod to any given tuple descriptor, so we must also substitute the appropriate receiver-side typmod for the one used by the sender. That adds some CPU overhead, but still seems better than being unable to pass records between cooperating parallel processes. Along the way, move the logic for handling multiple tuple queues from tqueue.c to nodeGather.c; tqueue.c now provides a TupleQueueReader, which reads from a single queue, rather than a TupleQueueFunnel, which potentially reads from multiple queues. This change was suggested previously as a way to make sure that nodeGather.c rather than tqueue.c had policy control over the order in which to read from queues, but it wasn't clear to me until now how good an idea it was. typmod mapping needs to be performed separately for each queue, and it is much simpler if the tqueue.c code handles that and leaves multiplexing multiple queues to higher layers of the stack.	2015-11-06 16:58:45 -05:00
Robert Haas	d1b7c1ffe7	Parallel executor support. This code provides infrastructure for a parallel leader to start up parallel workers to execute subtrees of the plan tree being executed in the master. User-supplied parameters from ParamListInfo are passed down, but PARAM_EXEC parameters are not. Various other constructs, such as initplans, subplans, and CTEs, are also not currently shared. Nevertheless, there's enough here to support a basic implementation of parallel query, and we can lift some of the current restrictions as needed. Amit Kapila and Robert Haas	2015-09-28 21:55:57 -04:00
Robert Haas	4a4e6893aa	Glue layer to connect the executor to the shm_mq mechanism. The shm_mq mechanism was built to send error (and notice) messages and tuples between backends. However, shm_mq itself only deals in raw bytes. Since commit `2bd9e412f9`, we have had infrastructure for one message to redirect protocol messages to a queue and for another backend to parse them and do useful things with them. This commit introduces a somewhat analogous facility for tuples by adding a new type of DestReceiver, DestTupleQueue, which writes each tuple generated by a query into a shm_mq, and a new TupleQueueFunnel facility which reads raw tuples out of the queue and reconstructs the HeapTuple format expected by the executor. The TupleQueueFunnel abstraction supports reading from multiple tuple streams at the same time, but only in round-robin fashion. Someone could imaginably want other policies, but this should be good enough to meet our short-term needs related to parallel query, and we can always extend it later. This also makes one minor addition to the shm_mq API that didn' seem worth breaking out as a separate patch. Extracted from Amit Kapila's parallel sequential scan patch. This code was originally written by me, and then it was revised by Amit, and then it was revised some more by me.	2015-09-18 21:56:58 -04:00

16 Commits