Commit Graph

223 Commits

Author SHA1 Message Date
Tom Lane a0fa0117a5 Better solution to integer overflow problem in hash batch-number
computation: reduce the bucket number mod nbatch.  This changes the
association between original bucket numbers and batches, but that
doesn't matter.  Minor other cleanups in hashjoin code to help
centralize decisions.
2002-12-30 15:21:23 +00:00
Tom Lane b33265e9e6 Adjust hash table sizing algorithm to avoid integer overflow in
ExecHashJoinGetBatch().  Fixes core dump on large hash joins, as in
example from Rae Stiening.
2002-12-29 22:28:50 +00:00
Tom Lane 5bab36e9f6 Revise executor APIs so that all per-query state structure is built in
a per-query memory context created by CreateExecutorState --- and destroyed
by FreeExecutorState.  This provides a final solution to the longstanding
problem of memory leaked by various ExecEndNode calls.
2002-12-15 16:17:59 +00:00
Tom Lane 3a4f7dde16 Phase 3 of read-only-plans project: ExecInitExpr now builds expression
execution state trees, and ExecEvalExpr takes an expression state tree
not an expression plan tree.  The plan tree is now read-only as far as
the executor is concerned.  Next step is to begin actually exploiting
this property.
2002-12-13 19:46:01 +00:00
Tom Lane 1fd0c59e25 Phase 1 of read-only-plans project: cause executor state nodes to point
to plan nodes, not vice-versa.  All executor state nodes now inherit from
struct PlanState.  Copying of plan trees has been simplified by not
storing a list of SubPlans in Plan nodes (eliminating duplicate links).
The executor still needs such a list, but it can build it during
ExecutorStart since it has to scan the plan tree anyway.
No initdb forced since no stored-on-disk structures changed, but you
will need a full recompile because of node-numbering changes.
2002-12-05 15:50:39 +00:00
Tom Lane ddb2d78de0 Upgrade planner and executor to allow multiple hash keys for a hash join,
instead of only one.  This should speed up planning (only one hash path
to consider for a given pair of relations) as well as allow more effective
hashing, when there are multiple hashable joinclauses.
2002-11-30 00:08:22 +00:00
Tom Lane 2103b7baa2 Phase 2 of hashed-aggregation project. nodeAgg.c now knows how to do
hashed aggregation, but there's not yet planner support for it.
2002-11-06 22:31:24 +00:00
Bruce Momjian e50f52a074 pgindent run. 2002-09-04 20:31:48 +00:00
Bruce Momjian 97ac103289 Remove sys/types.h in files that include postgres.h, and hence c.h,
because c.h has sys/types.h.
2002-09-02 02:47:07 +00:00
Tom Lane 976246cc7e The cstring datatype can now be copied, passed around, etc. The typlen
value '-2' is used to indicate a variable-width type whose width is
computed as strlen(datum)+1.  Everything that looks at typlen is updated
except for array support, which Joe Conway is working on; at the moment
it wouldn't work to try to create an array of cstring.
2002-08-24 15:00:47 +00:00
Bruce Momjian d84fe82230 Update copyright to 2002. 2002-06-20 20:29:54 +00:00
Tom Lane c422b5ca6b Code review for improved-hashing patch. Fix some portability issues
(char != unsigned char, Datum != uint32); make use of new hash code in
dynahash hash tables and hash joins.
2002-03-09 17:35:37 +00:00
Bruce Momjian 7ab7467318 I've attached a patch which implements Bob Jenkin's hash function for
PostgreSQL. This hash function replaces the one used by hash indexes and
the catalog cache. Hash joins use a different, relatively poor-quality
hash function, but I'll fix that later.

As suggested by Tom Lane, this patch also changes the size of the fixed
hash table used by the catalog cache to be a power-of-2 (instead of a
prime: I chose 256 instead of 257). This allows the catcache to lookup
hash buckets using a simple bitmask. This should improve the performance
of the catalog cache slightly, since the previous method (modulo a
prime) was slow.

In my tests, this improves the performance of hash indexes by between 4%
and 8%; the performance when using btree indexes or seqscans is
basically unchanged.

Neil Conway <neilconway@rogers.com>
2002-03-06 20:49:46 +00:00
Bruce Momjian b81844b173 pgindent run on all C files. Java run to follow. initdb/regression
tests pass.
2001-10-25 05:50:21 +00:00
Tom Lane 38cfc95865 Make hashjoin give the right answer with toasted input data. 2001-08-13 19:50:11 +00:00
Tom Lane 01a819abe3 Make planner compute the number of hash buckets the same way that
nodeHash.c will compute it (by sharing code).
2001-06-11 00:17:08 +00:00
Tom Lane a3855c5761 Cause ExecCountSlots() accounting to bear some relationship to reality.
Rather surprising we hadn't seen bug reports about this ...
2001-05-27 20:42:20 +00:00
Bruce Momjian 0686d49da0 Remove dashes in comments that don't need them, rewrap with pgindent. 2001-03-22 06:16:21 +00:00
Bruce Momjian 9e1552607a pgindent run. Make it all clean. 2001-03-22 04:01:46 +00:00
Bruce Momjian 623bf843d2 Change Copyright from PostgreSQL, Inc to PostgreSQL Global Development Group. 2001-01-24 19:43:33 +00:00
Tom Lane a933ee38bb Change SearchSysCache coding conventions so that a reference count is
maintained for each cache entry.  A cache entry will not be freed until
the matching ReleaseSysCache call has been executed.  This eliminates
worries about cache entries getting dropped while still in use.  See
my posting to pg-hackers of even date for more info.
2000-11-16 22:30:52 +00:00
Tom Lane 782c16c6a1 SQL-language functions are now callable in ordinary fmgr contexts ...
for example, an SQL function can be used in a functional index.  (I make
no promises about speed, but it'll work ;-).)  Clean up and simplify
handling of functions returning sets.
2000-08-24 03:29:15 +00:00
Tom Lane 0147b1934f Fix a many-legged critter reported by chifungfan@yahoo.com: under the
right circumstances a hash join executed as a DECLARE CURSOR/FETCH
query would crash the backend.  Problem as seen in current sources was
that the hash tables were stored in a context that was a child of
TransactionCommandContext, which got zapped at completion of the FETCH
command --- but cursor cleanup executed at COMMIT expected the tables
to still be valid.  I haven't chased down the details as seen in 7.0.*
but I'm sure it's the same general problem.
2000-08-22 04:06:22 +00:00
Tom Lane bec98a31c5 Revise aggregate functions per earlier discussions in pghackers.
There's now only one transition value and transition function.
NULL handling in aggregates is a lot cleaner.  Also, use Numeric
accumulators instead of integer accumulators for sum/avg on integer
datatypes --- this avoids overflow at the cost of being a little slower.
Implement VARIANCE() and STDDEV() aggregates in the standard backend.

Also, enable new LIKE selectivity estimators by default.  Unrelated
change, but as long as I had to force initdb anyway...
2000-07-17 03:05:41 +00:00
Tom Lane badce86a2c First stage of reclaiming memory in executor by resetting short-term
memory contexts.  Currently, only leaks in expressions executed as
quals or projections are handled.  Clean up some old dead cruft in
executor while at it --- unused fields in state nodes, that sort of thing.
2000-07-12 02:37:39 +00:00
Tom Lane 1aebc3618a First phase of memory management rewrite (see backend/utils/mmgr/README
for details).  It doesn't really do that much yet, since there are no
short-term memory contexts in the executor, but the infrastructure is
in place and long-term contexts are handled reasonably.  A few long-
standing bugs have been fixed, such as 'VACUUM; anything' in a single
query string crashing.  Also, out-of-memory is now considered a
recoverable ERROR, not FATAL.
Eliminate a large amount of crufty, now-dead code in and around
memory management.
Fix problem with holding off SIGTRAP, SIGSEGV, etc in postmaster and
backend startup.
2000-06-28 03:33:33 +00:00
Bruce Momjian 946e80c435 Final #include cleanup. 2000-06-15 04:10:30 +00:00
Peter Eisentraut 6a68f42648 The heralded `Grand Unified Configuration scheme' (GUC)
That means you can now set your options in either or all of $PGDATA/configuration,
some postmaster option (--enable-fsync=off), or set a SET command. The list of
options is in backend/utils/misc/guc.c, documentation will be written post haste.

pg_options is gone, so is that pq_geqo config file. Also removed were backend -K,
-Q, and -T options (no longer applicable, although -d0 does the same as -Q).

Added to configure an --enable-syslog option.

changed all callers from TPRINTF to elog(DEBUG)
2000-05-31 00:28:42 +00:00
Tom Lane 25442d8d2f Correct oversight in hashjoin cost estimation: nodeHash sizes its hash
table for an average of NTUP_PER_BUCKET tuples/bucket, but cost_hashjoin
was assuming a target load of one tuple/bucket.  This was causing a
noticeable underestimate of hashjoin costs.
2000-04-18 05:43:02 +00:00
Bruce Momjian 5c25d60244 Add:
* Portions Copyright (c) 1996-2000, PostgreSQL, Inc

to all files copyright Regents of Berkeley.  Man, that's a lot of files.
2000-01-26 05:58:53 +00:00
Tom Lane 6d1efd76fb Fix handling of NULL constraint conditions: per SQL92 spec, a NULL result
from a constraint condition does not violate the constraint (cf. discussion
on pghackers 12/9/99).  Implemented by adding a parameter to ExecQual,
specifying whether to return TRUE or FALSE when the qual result is
really NULL in three-valued boolean logic.  Currently, ExecRelCheck is
the only caller that asks for TRUE, but if we find any other places that
have the wrong response to NULL, it'll be easy to fix them.
2000-01-19 23:55:03 +00:00
Tom Lane 166b5c1def Another round of planner/optimizer work. This is just restructuring and
code cleanup; no major improvements yet.  However, EXPLAIN does produce
more intuitive outputs for nested loops with indexscans now...
2000-01-09 00:26:47 +00:00
Jan Wieck 397e9b32a3 Some changes to prepare for LONG attributes.
Jan
1999-12-16 22:20:03 +00:00
Bruce Momjian 97dec77fab Rename several destroy* functions/tags to drop*. 1999-12-10 03:56:14 +00:00
Tom Lane db3c4c3a2d Split 'BufFile' routines out of fd.c into a new module, buffile.c. Extend
BufFile so that it handles multi-segment temporary files transparently.
This allows sorts and hashes to work with data exceeding 2Gig (or whatever
the local limit on file size is).  Change psort.c to use relative seeks
instead of absolute seeks for backwards scanning, so that it won't fail
when the data volume exceeds 2Gig.
1999-10-13 15:02:32 +00:00
Bruce Momjian 3406901a29 Move some system includes into c.h, and remove duplicates. 1999-07-17 20:18:55 +00:00
Bruce Momjian 2e6b1e63a3 Remove unused #includes in *.c files. 1999-07-15 22:40:16 +00:00
Bruce Momjian 07842084fe pgindent run over code. 1999-05-25 16:15:34 +00:00
Tom Lane 26069a58e8 Rewrite hash join to use simple linked lists instead of a
fixed-size hashtable.  This should prevent 'hashtable out of memory' errors,
unless you really do run out of memory.  Note: target size for hashtable
is now taken from -S postmaster switch, not -B, since it is local memory
in the backend rather than shared memory.
1999-05-18 21:33:06 +00:00
Tom Lane 71d5d95376 Update hash and join routines to use fd.c's new temp-file
code, instead of not-very-bulletproof stuff they had before.
1999-05-09 00:53:22 +00:00
Tom Lane 9f82f9e459 Fix some nasty coredump bugs in hashjoin. This code was just
about certain to fail anytime it decided the relation to be hashed was
too big to fit in memory --- the code for 'batching' a series of hashjoins
had multiple errors.  I've fixed the easier problems.  A remaining big
problem is that you can get 'hashtable out of memory' if the code's
guesstimate about how much overflow space it will need turns out wrong.
That will require much more extensive revisions to fix, so I'm committing
these fixes now before I start on that problem.
1999-05-06 00:30:47 +00:00
Tom Lane af87148065 Fix some more hashjoin-related bugs in pg_operator. Fix
hashjoin's hashFunc() so that it does the right thing with pass-by-value
data types (the old code would always return 0 for int2 or char values,
which would work but would slow things down a lot).  Extend opr_sanity
regress test to catch more kinds of errors.
1999-04-07 23:33:33 +00:00
Bruce Momjian 6724a50787 Change my-function-name-- to my_function_name, and optimizer renames. 1999-02-13 23:22:53 +00:00
Bruce Momjian 9322950aa4 Cleanup of source files where 'return' or 'var =' is alone on a line. 1999-02-03 21:18:02 +00:00
Bruce Momjian 7a6b562fdf Apply Win32 patch from Horak Daniel. 1999-01-17 06:20:06 +00:00
Vadim B. Mikheev 3f7fbf85dc Initial MVCC code.
New code for locking buffer' context.
1998-12-15 12:47:01 +00:00
Marc G. Fournier 9396802f14 more cleanups...of note, appendStringInfo now performs like sprintf(),
where you state a format and arguments.  the old behavior required
each appendStringInfo to have to have a sprintf() before it if any
formatting was required.

Also shortened several instances where there were multiple appendStringInfo()
calls in a row, doing nothing more then adding one more word to the String,
instead of doing them all in one call.
1998-12-14 08:11:17 +00:00
Marc G. Fournier df1468e251 Many more cleanups... 1998-12-14 06:50:32 +00:00
Marc G. Fournier 7c3b7d2744 Initial attempt to clean up the code...
Switch sprintf() to snprintf()
Remove any/all #if 0 -or- #ifdef NOT_USED -or- #ifdef FALSE sections of
	code
1998-12-14 05:19:16 +00:00
Vadim B. Mikheev 6beba218d7 New HeapTuple structure/interface. 1998-11-27 19:52:36 +00:00
Bruce Momjian fa1a8d6a97 OK, folks, here is the pgindent output. 1998-09-01 04:40:42 +00:00
Bruce Momjian af74855a60 Renaming cleanup, no pgindent yet. 1998-09-01 03:29:17 +00:00
Bruce Momjian 6bd323c6b3 Remove un-needed braces around single statements. 1998-06-15 19:30:31 +00:00
Bruce Momjian a32450a585 pgindent run before 6.3 release, with Thomas' requested changes. 1998-02-26 04:46:47 +00:00
Vadim B. Mikheev 1a105cefbd Support for subselects.
ExecReScan for nodeAgg, nodeHash, nodeHashjoin, nodeNestloop and nodeResult.
Fixed ExecReScan for nodeMaterial.
Get rid of #ifdef INDEXSCAN_PATCH.
Get rid of ExecMarkPos and ExecRestrPos in nodeNestloop.
1998-02-13 03:26:53 +00:00
Bruce Momjian 24cab6bd0d Goodbye register keyword. Compiler knows better. 1998-02-11 19:14:04 +00:00
Bruce Momjian c16ebb0f67 getpid/pid cleanup 1998-01-25 05:15:15 +00:00
Marc G. Fournier 374bb5d261 Some *very* major changes by darrenk@insightdist.com (Darren King)
==========================================
What follows is a set of diffs that cleans up the usage of BLCKSZ.

As a side effect, the person compiling the code can change the
value of BLCKSZ _at_their_own_risk_.  By that, I mean that I've
tried it here at 4096 and 16384 with no ill-effects.  A value
of 4096 _shouldn't_ affect much as far as the kernel/file system
goes, but making it bigger than 8192 can have severe consequences
if you don't know what you're doing.  16394 worked for me, _BUT_
when I went to 32768 and did an initdb, the SCSI driver broke and
the partition that I was running under went to hell in a hand
basket. Had to reboot and do a good bit of fsck'ing to fix things up.

The patch can be safely applied though.  Just leave BLCKSZ = 8192
and everything is as before.  It basically only cleans up all of the
references to BLCKSZ in the code.

If this patch is applied, a comment in the config.h file though above
the BLCKSZ define with warning about monkeying around with it would
be a good idea.

Darren  darrenk@insightdist.com

(Also cleans up some of the #includes in files referencing BLCKSZ.)
==========================================
1998-01-13 04:05:12 +00:00
Bruce Momjian 679d39b9c8 Goodbye ABORT. Hello ERROR for all errors. 1998-01-07 21:07:04 +00:00
Bruce Momjian 0d9fc5afd6 Change elog(WARN) to elog(ERROR) and elog(ABORT). 1998-01-05 03:35:55 +00:00
Bruce Momjian 59f6a57e59 Used modified version of indent that understands over 100 typedefs. 1997-09-08 21:56:23 +00:00
Bruce Momjian 319dbfa736 Another PGINDENT run that changes variable indenting and case label indenting. Also static variable indenting. 1997-09-08 02:41:22 +00:00
Bruce Momjian 1ccd423235 Massive commit to run PGINDENT on all *.c and *.h files. 1997-09-07 05:04:48 +00:00
Bruce Momjian 1d8bbfd2e7 Make functions static where possible, enclose unused functions in #ifdef NOT_USED. 1997-08-19 21:40:56 +00:00
Bruce Momjian 79e78f0b80 Added SCO support, from Daniel Harris. 1997-07-28 00:57:08 +00:00
Vadim B. Mikheev 051b4210e3 Fix for Hash and arrays 1997-04-22 03:32:38 +00:00
Marc G. Fournier ce4c0ce1de Some compile failure fixes from Keith Parks <emkxp01@mtcc.demon.co.uk> 1996-11-06 06:52:23 +00:00
Marc G. Fournier 3df33180a1 add #include "postgres.h", as required by all .c files 1996-10-31 10:12:26 +00:00
Marc G. Fournier 87b48ff032 D'Arcy's cleanups 1996-10-26 04:15:05 +00:00
Marc G. Fournier f796387b60 |From: Dan McGuirk <mcguirk@indirect.com>
|
|This patch fixes a backend crash that happens sometimes when you try to
|join on a field that contains NULL in some rows.  Postgres tries to
|compute a hash value of the field you're joining on, but when the field
|is NULL, the pointer it thinks is pointing to the data is really just
|pointing to random memory.  This forces the hash value of NULL to be 0.
|
|It seems that nothing matches NULL on joins, even other NULL's (with or
|without this patch).  Is that what's supposed to happen?
|
1996-08-19 01:52:36 +00:00
Marc G. Fournier e4b2558fa3 Minor bug fix 1996-07-26 20:03:21 +00:00
Marc G. Fournier e11744e164 More of Dr. George's changes...
- src/backend/catalog/*
                - no changes
        - src/backend/executor/*
                - change how nodeHash.c handles running out of memory
        - src/backend/optimizer/*
                - mostly cosmetic changes
1996-07-22 23:30:57 +00:00
Marc G. Fournier d31084e9d1 Postgres95 1.01 Distribution - Virgin Sources 1996-07-09 06:22:35 +00:00