postgresql

Commit Graph

Author	SHA1	Message	Date
Tom Lane	e1a11d9311	Use FLEXIBLE_ARRAY_MEMBER for HeapTupleHeaderData.t_bits[]. This requires changing quite a few places that were depending on sizeof(HeapTupleHeaderData), but it seems for the best. Michael Paquier, some adjustments by me	2015-02-21 15:13:06 -05:00
Bruce Momjian	4baaf863ec	Update copyright for 2015 Backpatch certain files through 9.0	2015-01-06 11:43:47 -05:00
Kevin Grittner	30d7ae3c76	Increase number of hash join buckets for underestimate. If we expect batching at the very beginning, we size nbuckets for "full work_mem" (see how many tuples we can get into work_mem, while not breaking NTUP_PER_BUCKET threshold). If we expect to be fine without batching, we start with the 'right' nbuckets and track the optimal nbuckets as we go (without actually resizing the hash table). Once we hit work_mem (considering the optimal nbuckets value), we keep the value. At the end of the first batch, we check whether (nbuckets != nbuckets_optimal) and resize the hash table if needed. Also, we keep this value for all batches (it's OK because it assumes full work_mem, and it makes the batchno evaluation trivial). So the resize happens only once. There could be cases where it would improve performance to allow the NTUP_PER_BUCKET threshold to be exceeded to keep everything in one batch rather than spilling to a second batch, but attempts to generate such a case have so far been unsuccessful; that issue may be addressed with a follow-on patch after further investigation. Tomas Vondra with minor format and comment cleanup by me Reviewed by Robert Haas, Heikki Linnakangas, and Kevin Grittner	2014-10-13 10:16:36 -05:00
Heikki Linnakangas	2df465e696	Fix pointer type in size passed to memset. Pointers are all the same size, so it makes no practical difference, but let's be tidy. Found by Coverity, noted off-list by Tom Lane.	2014-09-14 16:48:57 +03:00
Robert Haas	8cce08f168	Change NTUP_PER_BUCKET to 1 to improve hash join lookup speed. Since this makes the bucket headers use ~10x as much memory, properly account for that memory when we figure out whether everything fits in work_mem. This might result in some cases that previously used only a single batch getting split into multiple batches, but it's unclear as yet whether we need defenses against that case, and if so, what the shape of those defenses should be. It's worth noting that even in these edge cases, users should still be no worse off than they would have been last week, because commit `45f6240a8f` saved a big pile of memory on exactly the same workloads. Tomas Vondra, reviewed and somewhat revised by me.	2014-09-12 16:18:09 -04:00
Heikki Linnakangas	45f6240a8f	Pack tuples in a hash join batch densely, to save memory. Instead of palloc'ing each HashJoinTuple individually, allocate 32kB chunks and pack the tuples densely in the chunks. This avoids the AllocChunk header overhead, and the space wasted by standard allocator's habit of rounding sizes up to the nearest power of two. This doesn't contain any planner changes, because the planner's estimate of memory usage ignores the palloc overhead. Now that the overhead is smaller, the planner's estimates are in fact more accurate. Tomas Vondra, reviewed by Robert Haas.	2014-09-10 21:24:52 +03:00
Bruce Momjian	0a78320057	pgindent run for 9.4 This includes removing tabs after periods in C comments, which was applied to back branches, so this change should not effect backpatching.	2014-05-06 12:12:18 -04:00
Bruce Momjian	7e04792a1c	Update copyright for 2014 Update all files in head, and files COPYRIGHT and legal.sgml in all back branches.	2014-01-07 16:05:30 -05:00
Bruce Momjian	bd61a623ac	Update copyrights for 2013 Fully update git head, and update back branches in ./COPYRIGHT and legal.sgml files.	2013-01-01 17:15:01 -05:00
Tom Lane	691c5ebf79	Add defenses against integer overflow in dynahash numbuckets calculations. The dynahash code requires the number of buckets in a hash table to fit in an int; but since we calculate the desired hash table size dynamically, there are various scenarios where we might calculate too large a value. The resulting overflow can lead to infinite loops, division-by-zero crashes, etc. I (tgl) had previously installed some defenses against that in commit `299d171652`, but that covered only one call path. Moreover it worked by limiting the request size to work_mem, but in a 64-bit machine it's possible to set work_mem high enough that the problem appears anyway. So let's fix the problem at the root by installing limits in the dynahash.c functions themselves. Trouble report and patch by Jeff Davis.	2012-12-11 22:09:05 -05:00
Alvaro Herrera	c219d9b0a5	Split tuple struct defs from htup.h to htup_details.h This reduces unnecessary exposure of other headers through htup.h, which is very widely included by many files. I have chosen to move the function prototypes to the new file as well, because that means htup.h no longer needs to include tupdesc.h. In itself this doesn't have much effect in indirect inclusion of tupdesc.h throughout the tree, because it's also required by execnodes.h; but it's something to explore in the future, and it seemed best to do the htup.h change now while I'm busy with it.	2012-08-30 16:52:35 -04:00
Bruce Momjian	e126958c2e	Update copyright notices for year 2012.	2012-01-01 18:01:58 -05:00
Tom Lane	a0185461dd	Rearrange the implementation of index-only scans. This commit changes index-only scans so that data is read directly from the index tuple without first generating a faux heap tuple. The only immediate benefit is that indexes on system columns (such as OID) can be used in index-only scans, but this is necessary infrastructure if we are ever to support index-only scans on expression indexes. The executor is now ready for that, though the planner still needs substantial work to recognize the possibility. To do this, Vars in index-only plan nodes have to refer to index columns not heap columns. I introduced a new special varno, INDEX_VAR, to mark such Vars to avoid confusion. (In passing, this commit renames the two existing special varnos to OUTER_VAR and INNER_VAR.) This allows ruleutils.c to handle them with logic similar to what we use for subplan reference Vars. Since index-only scans are now fundamentally different from regular indexscans so far as their expression subtrees are concerned, I also chose to change them to have their own plan node type (and hence, their own executor source file).	2011-10-11 14:21:30 -04:00
Tom Lane	f197272365	Make EXPLAIN ANALYZE report the numbers of rows rejected by filter steps. This provides information about the numbers of tuples that were visited but not returned by table scans, as well as the numbers of join tuples that were considered and discarded within a join plan node. There is still some discussion going on about the best way to report counts for outer-join situations, but I think most of what's in the patch would not change if we revise that, so I'm going to go ahead and commit it as-is. Documentation changes to follow (they weren't in the submitted patch either). Marko Tiikkaja, reviewed by Marc Cousin, somewhat revised by Tom	2011-09-22 11:30:11 -04:00
Bruce Momjian	6416a82a62	Remove unnecessary #include references, per pgrminclude script.	2011-09-01 10:04:27 -04:00
Bruce Momjian	bf50caf105	pgindent run before PG 9.1 beta 1.	2011-04-10 11:42:00 -04:00
Bruce Momjian	5d950e3b0c	Stamp copyrights for year 2011.	2011-01-01 13:18:15 -05:00
Tom Lane	f4e4b32743	Support RIGHT and FULL OUTER JOIN in hash joins. This is advantageous first because it allows us to hash the smaller table regardless of the outer-join type, and second because hash join can be more flexible than merge join in dealing with arbitrary join quals in a FULL join. For merge join all the join quals have to be mergejoinable, but hash join will work so long as there's at least one hashjoinable qual --- the others can be any condition. (This is true essentially because we don't keep per-inner-tuple match flags in merge join, while hash join can do so.) To do this, we need a has-it-been-matched flag for each tuple in the hashtable, not just one for the current outer tuple. The key idea that makes this practical is that we can store the match flag in the tuple's infomask, since there are lots of bits there that are of no interest for a MinimalTuple. So we aren't increasing the size of the hashtable at all for the feature. To write this without turning the hash code into even more of a pile of spaghetti than it already was, I rewrote ExecHashJoin in a state-machine style, similar to ExecMergeJoin. Other than that decision, it was pretty straightforward.	2010-12-30 20:26:08 -05:00
Magnus Hagander	9f2e211386	Remove cvs keywords from all files.	2010-09-20 22:08:53 +02:00
Tom Lane	53e757689c	Make NestLoop plan nodes pass outer-relation variables into their inner relation using the general PARAM_EXEC executor parameter mechanism, rather than the ad-hoc kluge of passing the outer tuple down through ExecReScan. The previous method was hard to understand and could never be extended to handle parameters coming from multiple join levels. This patch doesn't change the set of possible plans nor have any significant performance effect, but it's necessary infrastructure for future generalization of the concept of an inner indexscan plan. ExecReScan's second parameter is now unused, so it's removed.	2010-07-12 17:01:06 +00:00
Bruce Momjian	65e806cba1	pgindent run for 9.0	2010-02-26 02:01:40 +00:00
Robert Haas	e26c539e9f	Wrap calls to SearchSysCache and related functions using macros. The purpose of this change is to eliminate the need for every caller of SearchSysCache, SearchSysCacheCopy, SearchSysCacheExists, GetSysCacheOid, and SearchSysCacheList to know the maximum number of allowable keys for a syscache entry (currently 4). This will make it far easier to increase the maximum number of keys in a future release should we choose to do so, and it makes the code shorter, too. Design and review by Tom Lane.	2010-02-14 18:42:19 +00:00
Robert Haas	42a8ab0a14	Augment EXPLAIN output with more details on Hash nodes. We show the number of buckets, the number of batches (and also the original number if it has changed), and the peak space used by the hash table. Minor executor changes to track peak space used.	2010-02-01 15:43:36 +00:00
Tom Lane	40608e7f94	When estimating the selectivity of an inequality "column > constant" or "column < constant", and the comparison value is in the first or last histogram bin or outside the histogram entirely, try to fetch the actual column min or max value using an index scan (if there is an index on the column). If successful, replace the lower or upper histogram bound with that value before carrying on with the estimate. This limits the estimation error caused by moving min/max values when the comparison value is close to the min or max. Per a complaint from Josh Berkus. It is tempting to consider using this mechanism for mergejoinscansel as well, but that would inject index fetches into main-line join estimation not just endpoint cases. I'm refraining from that until we can get a better handle on the costs of doing this type of lookup.	2010-01-04 02:44:40 +00:00
Bruce Momjian	0239800893	Update copyright for the year 2010.	2010-01-02 16:58:17 +00:00
Tom Lane	649b5ec7c8	Add the ability to store inheritance-tree statistics in pg_statistic, and teach ANALYZE to compute such stats for tables that have subclasses. Per my proposal of yesterday. autovacuum still needs to be taught about running ANALYZE on parent tables when their subclasses change, but the feature is useful even without that.	2009-12-29 20:11:45 +00:00
Tom Lane	8442317beb	Make the overflow guards in ExecChooseHashTableSize be more protective. The original coding ensured nbuckets and nbatch didn't exceed INT_MAX, which while not insane on its own terms did nothing to protect subsequent code like "palloc(nbatch * sizeof(BufFile *))". Since enormous join size estimates might well be planner error rather than reality, it seems best to constrain the initial sizes to be not more than work_mem/sizeof(pointer), thus ensuring the allocated arrays don't exceed work_mem. We will allow nbatch to get bigger than that during subsequent ExecHashIncreaseNumBatches calls, but we should still guard against integer overflow in those palloc requests. Per bug #5145 from Bernt Marius Johnsen. Although the given test case only seems to fail back to 8.2, previous releases have variants of this issue, so patch all supported branches.	2009-10-30 20:58:45 +00:00
Tom Lane	421d7d8edb	Remove no-longer-needed ExecCountSlots infrastructure.	2009-09-27 21:10:53 +00:00
Bruce Momjian	d747140279	8.4 pgindent run, with new combined Linux/FreeBSD/MinGW typedef list provided by Andrew.	2009-06-11 14:49:15 +00:00
Bruce Momjian	0e550ff617	Revert DTrace patch from Robert Lor	2009-04-02 20:59:10 +00:00
Bruce Momjian	227f817c1f	Add support for additional DTrace probes. Robert Lor	2009-04-02 19:14:34 +00:00
Tom Lane	596efd27ed	Optimize multi-batch hash joins when the outer relation has a nonuniform distribution, by creating a special fast path for the (first few) most common values of the outer relation. Tuples having hashvalues matching the MCVs are effectively forced to be in the first batch, so that we never write them out to the batch temp files. Bryce Cutt and Ramon Lawrence, with some editorialization by me.	2009-03-21 00:04:40 +00:00
Bruce Momjian	511db38ace	Update copyright for 2009.	2009-01-01 17:24:05 +00:00
Bruce Momjian	9098ab9e32	Update copyrights in source tree to 2008.	2008-01-01 19:46:01 +00:00
Bruce Momjian	fdf5a5efb7	pgindent run for 8.3.	2007-11-15 21:14:46 +00:00
Tom Lane	24ee8af573	Rework temp_tablespaces patch so that temp tablespaces are assigned separately for each temp file, rather than once per sort or hashjoin; this allows spreading the data of a large sort or join across multiple tablespaces. (I remain dubious that this will make any difference in practice, but certain people insisted.) Arrange to cache the results of parsing the GUC variable instead of recomputing from scratch on every demand, and push usage of the cache down to the bottommost fd.c level.	2007-06-07 19:19:57 +00:00
Tom Lane	acfce502ba	Create a GUC parameter temp_tablespaces that allows selection of the tablespace(s) in which to store temp tables and temporary files. This is a list to allow spreading the load across multiple tablespaces (a random list element is chosen each time a temp object is to be created). Temp files are not stored in per-database pgsql_tmp/ directories anymore, but per-tablespace directories. Jaime Casanova and Albert Cervera, with review by Bernd Helmle and Tom Lane.	2007-06-03 17:08:34 +00:00
Tom Lane	bd2c980b22	Buy back some of the cycles spent in more-expensive hash functions by selecting power-of-2, rather than prime, numbers of buckets in hash joins. If the hash functions are doing their jobs properly by making all hash bits equally random, this is good enough, and it saves expensive integer division and modulus operations.	2007-06-01 17:38:44 +00:00
Tom Lane	3c5985b473	Fix bug I introduced in recent patch to make hash joins discard null tuples immediately: ExecHashGetHashValue failed to restore the caller's memory context before taking the failure exit.	2007-02-22 22:49:27 +00:00
Tom Lane	a635c08fa1	Add support for cross-type hashing in hash index searches and hash joins. Hashing for aggregation purposes still needs work, so it's not time to mark any cross-type operators as hashable for general use, but these cases work if the operators are so marked by hand in the system catalogs.	2007-01-30 01:33:36 +00:00
Tom Lane	b39e91501c	Improve hash join to discard input tuples immediately if they can't match because they contain a null join key (and the join operator is known strict). Improves performance significantly when the inner relation contains a lot of nulls, as per bug #2930.	2007-01-28 23:21:26 +00:00
Bruce Momjian	29dccf5fe0	Update CVS HEAD for 2007 copyright. Back branches are typically not back-stamped for this.	2007-01-05 22:20:05 +00:00
Bruce Momjian	03c2e5924e	Add additional includes needed on some platforms.	2006-07-14 04:44:46 +00:00
Bruce Momjian	fad1ea86bd	Move math.h after postgresql.h	2006-07-13 20:14:12 +00:00
Bruce Momjian	a22d76d96a	Allow include files to compile own their own. Strip unused include files out unused include files, and add needed includes to C files. The next step is to remove unused include files in C files.	2006-07-13 16:49:20 +00:00
Tom Lane	69d0a15e2a	Convert hash join code to use MinimalTuple format in tuple hash table and batch files. Should reduce memory and I/O demands for such joins.	2006-06-27 21:31:20 +00:00
Bruce Momjian	87bd07d979	Make EXPLAIN sampling smarter, to avoid excessive sampling delay. Martijn van Oosterhout	2006-05-30 14:01:58 +00:00
Tom Lane	798e63ffb0	Remove CXT_printf/CXT1_printf macros. If anyone had found them to be of any use in the past many years, we'd have made some effort to include them in all executor node types; but in fact they were only in nodeAppend.c and nodeIndexscan.c, up until I copied nodeIndexscan.c's occurrence into the new bitmap node types. Remove some other unused macros in execdebug.h, too. Some day the whole header probably ought to go away in favor of better-designed facilities.	2006-05-23 15:21:52 +00:00
Bruce Momjian	f2f5b05655	Update copyright for 2006. Update scripts.	2006-03-05 15:59:11 +00:00
Tom Lane	2c0ef9777c	Extend the ExecInitNode API so that plan nodes receive a set of flag bits indicating which optional capabilities can actually be exercised at runtime. This will allow Sort and Material nodes, and perhaps later other nodes, to avoid unnecessary overhead in common cases. This commit just adds the infrastructure and arranges to pass the correct flag values down to plan nodes; none of the actual optimizations are here yet. I'm committing this separately in case anyone wants to measure the added overhead. (It should be negligible.) Simon Riggs and Tom Lane	2006-02-28 04:10:28 +00:00
Tom Lane	4dd2048a47	Get rid of ExecAssignResultTypeFromOuterPlan() and make all plan node types generate their output tuple descriptors from their target lists (ie, using ExecAssignResultTypeFromTL()). We long ago fixed things so that all node types have minimally valid tlists, so there's no longer any good reason to have two different ways of doing it. This change is needed to fix bug reported by Hayden James: the fix of 2005-11-03 to emit the correct column names after optimizing away a SubqueryScan node didn't work if the new top-level plan node used ExecAssignResultTypeFromOuterPlan to generate its tupdesc, since the next plan node down won't have the correct column labels.	2005-11-23 20:27:58 +00:00
Bruce Momjian	436a2956d8	Re-run pgindent, fixing a problem where comment lines after a blank comment line where output as too long, and update typedefs for /lib directory. Also fix case where identifiers were used as variable names in the backend, but as typedefs in ecpg (favor the backend for indenting). Backpatch to 8.1.X.	2005-11-22 18:17:34 +00:00
Tom Lane	dd218ae7b0	Remove the t_datamcxt field of HeapTupleData. This was introduced for the convenience of tuptoaster.c and is no longer needed, so may as well get rid of some small amount of overhead.	2005-11-20 19:49:08 +00:00
Bruce Momjian	1dc3498251	Standard pgindent run for 8.1.	2005-10-15 02:49:52 +00:00
Tom Lane	e990b9ce23	The original patch to avoid building a hash join's hashtable when the outer relation is empty did not work, per test case from Patrick Welche. It tried to use nodeHashjoin.c's high-level mechanisms for fetching an outer-relation tuple, but that code expected the hash table to be filled already. As patched, the code failed in corner cases such as having no outer-relation tuples for the first hash batch. Revert and rewrite.	2005-09-25 19:37:35 +00:00
Neil Conway	c119c5bd49	Change the implementation of hash join to attempt to avoid unnecessary work if either of the join relations are empty. The logic is: (1) if the inner relation's startup cost is less than the outer relation's startup cost and this is not an outer join, read a single tuple from the inner relation via ExecHash() - if NULL, we're done (2) read a single tuple from the outer relation - if NULL, we're done (3) build the hash table on the inner relation - if hash table is empty and this is not an outer join, we're done (4) otherwise, do hash join as usual The implementation uses the new MultiExecProcNode API, per a suggestion from Tom: invoking ExecHash() now produces the first tuple from the Hash node's child node, whereas MultiExecHash() builds the hash table. I had to put in a bit of a kludge to get the row count returned for EXPLAIN ANALYZE to be correct: since ExecHash() is invoked to return a tuple, and then MultiExecHash() is invoked, we would return one too many tuples to EXPLAIN ANALYZE. I hacked around this by just manually detecting this situation and subtracting 1 from the EXPLAIN ANALYZE row count.	2005-06-15 07:27:44 +00:00
Tom Lane	d8b1bf4791	Create a new 'MultiExecProcNode' call API for plan nodes that don't return just a single tuple at a time. Currently the only such node type is Hash, but I expect we will soon have indexscans that can return tuple bitmaps. A side benefit is that EXPLAIN ANALYZE now shows the correct tuple count for a Hash node.	2005-04-16 20:07:35 +00:00
Neil Conway	aeb502346b	Minor code cleanup: ExecHash() was returning a null TupleTableSlot, and an old comment in the code claimed that this was necessary. Since it is not actually necessary any more, it is clearer to remove the comment and just return NULL instead -- the return value of ExecHash() is not used.	2005-03-31 02:02:52 +00:00
Tom Lane	f97aebd162	Revise TupleTableSlot code to avoid unnecessary construction and disassembly of tuples when passing data up through multiple plan nodes. A slot can now hold either a normal "physical" HeapTuple, or a "virtual" tuple consisting of Datum/isnull arrays. Upper plan levels can usually just copy the Datum arrays, avoiding heap_formtuple() and possible subsequent nocachegetattr() calls to extract the data again. This work extends Atsushi Ogawa's earlier patch, which provided the key idea of adding Datum arrays to TupleTableSlots. (I believe however that something like this was foreseen way back in Berkeley days --- see the old comment on ExecProject.) A test case involving many levels of join of fairly wide tables (about 80 columns altogether) showed about 3x overall speedup, though simple queries will probably not be helped very much. I have also duplicated some code in heaptuple.c in order to provide versions of heap_formtuple and friends that use "bool" arrays to indicate null attributes, instead of the old convention of "char" arrays containing either 'n' or ' '. This provides a better match to the convention used by ExecEvalExpr. While I have not made a concerted effort to get rid of uses of the old routines, I think they should be deprecated and eventually removed.	2005-03-16 21:38:10 +00:00
Tom Lane	dffbbb3e55	Forgot that I had intended to replace division by masking in hash calculation.	2005-03-13 19:59:40 +00:00
Tom Lane	849074f9ae	Revise hash join code so that we can increase the number of batches on-the-fly, and thereby avoid blowing out memory when the planner has underestimated the hash table size. Hash join will now obey the work_mem limit with some faithfulness. Per my recent proposal (hash aggregate part isn't done yet though).	2005-03-06 22:15:05 +00:00
PostgreSQL Daemon	2ff501590b	Tag appropriate files for rc3 Also performed an initial run through of upgrading our Copyright date to extend to 2005 ... first run here was very simple ... change everything where: grep 1996-2004 && the word 'Copyright' ... scanned through the generated list with 'less' first, and after, to make sure that I only picked up the right entries ...	2004-12-31 22:04:05 +00:00
Tom Lane	9fcbe2af11	Arrange for hash join to skip scanning the outer relation if it detects that the inner one is completely empty. Per recent discussion. Also some cosmetic cleanups in nearby code.	2004-09-22 19:13:52 +00:00
Bruce Momjian	da9a8649d8	Update copyright to 2004.	2004-08-29 04:13:13 +00:00
Neil Conway	72b6ad6313	Use the new List API function names throughout the backend, and disable the list compatibility API by default. While doing this, I decided to keep the llast() macro around and introduce llast_int() and llast_oid() variants.	2004-05-30 23:40:41 +00:00
Neil Conway	d0b4399d81	Reimplement the linked list data structure used throughout the backend. In the past, we used a 'Lispy' linked list implementation: a "list" was merely a pointer to the head node of the list. The problem with that design is that it makes lappend() and length() linear time. This patch fixes that problem (and others) by maintaining a count of the list length and a pointer to the tail node along with each head node pointer. A "list" is now a pointer to a structure containing some meta-data about the list; the head and tail pointers in that structure refer to ListCell structures that maintain the actual linked list of nodes. The function names of the list API have also been changed to, I hope, be more logically consistent. By default, the old function names are still available; they will be disabled-by-default once the rest of the tree has been updated to use the new API names.	2004-05-26 04:41:50 +00:00
Tom Lane	c1352052ef	Replace the switching function ExecEvalExpr() with a macro that jumps directly to the appropriate per-node execution function, using a function pointer stored by ExecInitExpr. This speeds things up by eliminating one level of function call. The function-pointer technique also enables further small improvements such as only making one-time tests once (and then changing the function pointer). Overall this seems to gain about 10% on evaluation of simple expressions, which isn't earthshaking but seems a worthwhile gain for a relatively small hack. Per recent discussion on pghackers.	2004-03-17 01:02:24 +00:00
Tom Lane	391c3811a2	Rename SortMem and VacuumMem to work_mem and maintenance_work_mem. Make btree index creation and initial validation of foreign-key constraints use maintenance_work_mem rather than work_mem as their memory limit. Add some code to guc.c to allow these variables to be referenced by their old names in SHOW and SET commands, for backwards compatibility.	2004-02-03 17:34:04 +00:00
PostgreSQL Daemon	969685ad44	$Header: -> $PostgreSQL Changes ...	2003-11-29 19:52:15 +00:00
Tom Lane	a64846f3ad	Get rid of hashkeys field of Hash plan node, since it's redundant with the hashclauses field of the parent HashJoin. This avoids problems with duplicated links to SubPlans in hash clauses, as per report from Andrew Holm-Hansen.	2003-11-25 21:00:54 +00:00
Bruce Momjian	f3c3deb7d0	Update copyrights to 2003.	2003-08-04 02:40:20 +00:00
Bruce Momjian	089003fb46	pgindent run.	2003-08-04 00:43:34 +00:00
Tom Lane	5e6d691e0d	Error message editing in backend/executor.	2003-07-21 17:05:12 +00:00
Tom Lane	bff0422b6c	Revise hash join and hash aggregation code to use the same datatype- specific hash functions used by hash indexes, rather than the old not-datatype-aware ComputeHashFunc routine. This makes it safe to do hash joining on several datatypes that previously couldn't use hashing. The sets of datatypes that are hash indexable and hash joinable are now exactly the same, whereas before each had some that weren't in the other.	2003-06-22 22:04:55 +00:00
Bruce Momjian	54f7338fa1	This patch implements holdable cursors, following the proposal (materialization into a tuple store) discussed on pgsql-hackers earlier. I've updated the documentation and the regression tests. Notes on the implementation: - I needed to change the tuple store API slightly -- it assumes that it won't be used to hold data across transaction boundaries, so the temp files that it uses for on-disk storage are automatically reclaimed at end-of-transaction. I added a flag to tuplestore_begin_heap() to control this behavior. Is changing the tuple store API in this fashion OK? - in order to store executor results in a tuple store, I added a new CommandDest. This works well for the most part, with one exception: the current DestFunction API doesn't provide enough information to allow the Executor to store results into an arbitrary tuple store (where the particular tuple store to use is chosen by the call site of ExecutorRun). To workaround this, I've temporarily hacked up a solution that works, but is not ideal: since the receiveTuple DestFunction is passed the portal name, we can use that to lookup the Portal data structure for the cursor and then use that to get at the tuple store the Portal is using. This unnecessarily ties the Portal code with the tupleReceiver code, but it works... The proper fix for this is probably to change the DestFunction API -- Tom suggested passing the full QueryDesc to the receiveTuple function. In that case, callers of ExecutorRun could "subclass" QueryDesc to add any additional fields that their particular CommandDest needed to get access to. This approach would work, but I'd like to think about it for a little bit longer before deciding which route to go. In the mean time, the code works fine, so I don't think a fix is urgent. - (semi-related) I added a NO SCROLL keyword to DECLARE CURSOR, and adjusted the behavior of SCROLL in accordance with the discussion on -hackers. - (unrelated) Cleaned up some SGML markup in sql.sgml, copy.sgml Neil Conway	2003-03-27 16:51:29 +00:00
Tom Lane	1afac12910	Create a new file executor/execGrouping.c to centralize utility routines shared by nodeGroup, nodeAgg, and soon nodeSubplan.	2003-01-10 23:54:24 +00:00
Tom Lane	a0fa0117a5	Better solution to integer overflow problem in hash batch-number computation: reduce the bucket number mod nbatch. This changes the association between original bucket numbers and batches, but that doesn't matter. Minor other cleanups in hashjoin code to help centralize decisions.	2002-12-30 15:21:23 +00:00
Tom Lane	b33265e9e6	Adjust hash table sizing algorithm to avoid integer overflow in ExecHashJoinGetBatch(). Fixes core dump on large hash joins, as in example from Rae Stiening.	2002-12-29 22:28:50 +00:00
Tom Lane	5bab36e9f6	Revise executor APIs so that all per-query state structure is built in a per-query memory context created by CreateExecutorState --- and destroyed by FreeExecutorState. This provides a final solution to the longstanding problem of memory leaked by various ExecEndNode calls.	2002-12-15 16:17:59 +00:00
Tom Lane	3a4f7dde16	Phase 3 of read-only-plans project: ExecInitExpr now builds expression execution state trees, and ExecEvalExpr takes an expression state tree not an expression plan tree. The plan tree is now read-only as far as the executor is concerned. Next step is to begin actually exploiting this property.	2002-12-13 19:46:01 +00:00
Tom Lane	1fd0c59e25	Phase 1 of read-only-plans project: cause executor state nodes to point to plan nodes, not vice-versa. All executor state nodes now inherit from struct PlanState. Copying of plan trees has been simplified by not storing a list of SubPlans in Plan nodes (eliminating duplicate links). The executor still needs such a list, but it can build it during ExecutorStart since it has to scan the plan tree anyway. No initdb forced since no stored-on-disk structures changed, but you will need a full recompile because of node-numbering changes.	2002-12-05 15:50:39 +00:00
Tom Lane	ddb2d78de0	Upgrade planner and executor to allow multiple hash keys for a hash join, instead of only one. This should speed up planning (only one hash path to consider for a given pair of relations) as well as allow more effective hashing, when there are multiple hashable joinclauses.	2002-11-30 00:08:22 +00:00
Tom Lane	2103b7baa2	Phase 2 of hashed-aggregation project. nodeAgg.c now knows how to do hashed aggregation, but there's not yet planner support for it.	2002-11-06 22:31:24 +00:00
Bruce Momjian	e50f52a074	pgindent run.	2002-09-04 20:31:48 +00:00
Bruce Momjian	97ac103289	Remove sys/types.h in files that include postgres.h, and hence c.h, because c.h has sys/types.h.	2002-09-02 02:47:07 +00:00
Tom Lane	976246cc7e	The cstring datatype can now be copied, passed around, etc. The typlen value '-2' is used to indicate a variable-width type whose width is computed as strlen(datum)+1. Everything that looks at typlen is updated except for array support, which Joe Conway is working on; at the moment it wouldn't work to try to create an array of cstring.	2002-08-24 15:00:47 +00:00
Bruce Momjian	d84fe82230	Update copyright to 2002.	2002-06-20 20:29:54 +00:00
Tom Lane	c422b5ca6b	Code review for improved-hashing patch. Fix some portability issues (char != unsigned char, Datum != uint32); make use of new hash code in dynahash hash tables and hash joins.	2002-03-09 17:35:37 +00:00
Bruce Momjian	7ab7467318	I've attached a patch which implements Bob Jenkin's hash function for PostgreSQL. This hash function replaces the one used by hash indexes and the catalog cache. Hash joins use a different, relatively poor-quality hash function, but I'll fix that later. As suggested by Tom Lane, this patch also changes the size of the fixed hash table used by the catalog cache to be a power-of-2 (instead of a prime: I chose 256 instead of 257). This allows the catcache to lookup hash buckets using a simple bitmask. This should improve the performance of the catalog cache slightly, since the previous method (modulo a prime) was slow. In my tests, this improves the performance of hash indexes by between 4% and 8%; the performance when using btree indexes or seqscans is basically unchanged. Neil Conway <neilconway@rogers.com>	2002-03-06 20:49:46 +00:00
Bruce Momjian	b81844b173	pgindent run on all C files. Java run to follow. initdb/regression tests pass.	2001-10-25 05:50:21 +00:00
Tom Lane	38cfc95865	Make hashjoin give the right answer with toasted input data.	2001-08-13 19:50:11 +00:00
Tom Lane	01a819abe3	Make planner compute the number of hash buckets the same way that nodeHash.c will compute it (by sharing code).	2001-06-11 00:17:08 +00:00
Tom Lane	a3855c5761	Cause ExecCountSlots() accounting to bear some relationship to reality. Rather surprising we hadn't seen bug reports about this ...	2001-05-27 20:42:20 +00:00
Bruce Momjian	0686d49da0	Remove dashes in comments that don't need them, rewrap with pgindent.	2001-03-22 06:16:21 +00:00
Bruce Momjian	9e1552607a	pgindent run. Make it all clean.	2001-03-22 04:01:46 +00:00
Bruce Momjian	623bf843d2	Change Copyright from PostgreSQL, Inc to PostgreSQL Global Development Group.	2001-01-24 19:43:33 +00:00
Tom Lane	a933ee38bb	Change SearchSysCache coding conventions so that a reference count is maintained for each cache entry. A cache entry will not be freed until the matching ReleaseSysCache call has been executed. This eliminates worries about cache entries getting dropped while still in use. See my posting to pg-hackers of even date for more info.	2000-11-16 22:30:52 +00:00
Tom Lane	782c16c6a1	SQL-language functions are now callable in ordinary fmgr contexts ... for example, an SQL function can be used in a functional index. (I make no promises about speed, but it'll work ;-).) Clean up and simplify handling of functions returning sets.	2000-08-24 03:29:15 +00:00
Tom Lane	0147b1934f	Fix a many-legged critter reported by chifungfan@yahoo.com: under the right circumstances a hash join executed as a DECLARE CURSOR/FETCH query would crash the backend. Problem as seen in current sources was that the hash tables were stored in a context that was a child of TransactionCommandContext, which got zapped at completion of the FETCH command --- but cursor cleanup executed at COMMIT expected the tables to still be valid. I haven't chased down the details as seen in 7.0.* but I'm sure it's the same general problem.	2000-08-22 04:06:22 +00:00
Tom Lane	bec98a31c5	Revise aggregate functions per earlier discussions in pghackers. There's now only one transition value and transition function. NULL handling in aggregates is a lot cleaner. Also, use Numeric accumulators instead of integer accumulators for sum/avg on integer datatypes --- this avoids overflow at the cost of being a little slower. Implement VARIANCE() and STDDEV() aggregates in the standard backend. Also, enable new LIKE selectivity estimators by default. Unrelated change, but as long as I had to force initdb anyway...	2000-07-17 03:05:41 +00:00

1 2 3 4

199 Commits