* new split algorithm (as proposed in http://archives.postgresql.org/pgsql-hackers/2006-06/msg00254.php)
* possible call pickSplit() for second and below columns
* add spl_(l|r)datum_exists to GIST_SPLITVEC -
pickSplit should check its values to use already defined
spl_(l|r)datum for splitting. pickSplit should set
spl_(l|r)datum_exists to 'false' (if they was 'true') to
signal to caller about using spl_(l|r)datum.
* support for old pickSplit(): not very optimal
but correct split
* remove 'bytes' field from GISTENTRY: in any case size of
value is defined by it's type.
* split GIST_SPLITVEC to two structures: one for using in picksplit
and second - for internal use.
* some code refactoring
* support of subsplit to rtree opclasses
TODO: add support of subsplit to contrib modules
this someday, but right now it seems that posix_fadvise is immature to
the point of being broken on many platforms ... and we don't have any
benchmark evidence proving it's worth spending time on.
per-tuple space overhead for sorts in memory. I chose to replace the
previous patch that tried to write out the bare minimum amount of data
when sorting on disk; instead, just dump the MinimalTuples as-is. This
wastes 3 to 10 bytes per tuple depending on architecture and null-bitmap
length, but the simplification in the writetup/readtup routines seems
worth it.
analyzing, so that future analyze threshold calculations don't get confused.
Also, make sure we correctly track the decrease of live tuples cause by
deletes.
Per report from Dylan Hansen, patches by Tom Lane and me.
tuples with less header overhead than a regular HeapTuple, per my
recent proposal. Teach TupleTableSlot code how to deal with these.
As proof of concept, change tuplestore.c to store MinimalTuples instead
of HeapTuples. Future patches will expand the concept to other places
where it is useful.
will be expanded to a list of their member fields, rather than creating
a nested rowtype field as formerly. (The old behavior is still available
by omitting '.*'.) This syntax is not allowed by the SQL spec AFAICS,
so changing its behavior doesn't violate the spec. The new behavior is
substantially more useful since it allows, for example, triggers to check
for data changes with 'if row(new.*) is distinct from row(old.*)'. Per
my recent proposal.
palloc() will normally round allocation requests up to the next power of 2,
so make dynahash choose allocation sizes that are as close to a power of 2
as possible.
Back-patch to 8.1 --- the problem exists further back, but a much larger
patch would be needed and it doesn't seem worth taking any risks.
aggregates. We just disallowed that, and AFAICS there should be no other
cases where direct (non-aggregated) references to input columns are allowed
in a query with aggregation and no GROUP BY.
This is disallowed by the SQL spec because it doesn't have any very sensible
interpretation. Historically Postgres has allowed it but behaved strangely.
As of PG 8.1 a server crash is possible if the MIN/MAX index optimization gets
applied; rather than try to "fix" that, it seems best to just enforce the
spec restriction. Per report from Josh Drake and Alvaro Herrera.
changing semantics too much. statement_timestamp is now set immediately
upon receipt of a client command message, and the various places that used
to do their own gettimeofday() calls to mark command startup are referenced
to that instead. I have also made stats_command_string use that same
value for pg_stat_activity.query_start for both the command itself and
its eventual replacement by <IDLE> or <idle in transaction>. There was
some debate about that, but no argument that seemed convincing enough to
justify an extra gettimeofday() call.
libpq/md5.h, so that there's a clear separation between backend-only
definitions and shared frontend/backend definitions. (Turns out this
is reversing a bad decision from some years ago...) Fix up references
to crypt.h as needed. I looked into moving the code into src/port, but
the headers in src/include/libpq are sufficiently intertwined that it
seems more work than it's worth to do that.
current commands; instead, store current-status information in shared
memory. This substantially reduces the overhead of stats_command_string
and also ensures that pg_stat_activity is fully up to date at all times.
Per my recent proposal.
for it. Hopefully will fix core dump evidenced by some buildfarm members
since fadvise patch went in. The actual definition of the function is not
ABI-compatible with compiler's default assumption in the absence of any
declaration, so it's clearly unsafe to try to call it without seeing a
declaration.
by creating a reference-count mechanism, similar to what we did a long time
ago for catcache entries. The back branches have an ugly solution involving
lots of extra copies, but this way is more efficient. Reference counting is
only applied to tupdescs that are actually in caches --- there seems no need
to use it for tupdescs that are generated in the executor, since they'll go
away during plan shutdown by virtue of being in the per-query memory context.
Neil Conway and Tom Lane
remove the infrastructure needed to enforce the limit, ie, the global
LRU list of cache entries. On small-to-middling databases this wins
because maintaining the LRU list is a waste of time. On large databases
this wins because it's better to keep more cache entries (we assume
such users can afford to use some more per-backend memory than was
contemplated in the Berkeley-era catcache design). This provides a
noticeable improvement in the speed of psql \d on a 10000-table
database, though it doesn't make it instantaneous.
While at it, use per-catcache settings for the number of hash buckets
per catcache, rather than the former one-size-fits-all value. It's a
bit silly to be using the same number of hash buckets for, eg, pg_am
and pg_attribute. The specific values I used might need some tuning,
but they seem to be in the right ballpark based on CATCACHE_STATS
results from the standard regression tests.
function call. Previously, there may have been no CHECK_FOR_INTERRUPTS
at all in the fastpath code path, making it impossible to cancel an
operation such as \lo_import externally. This addition doesn't ensure
you can cancel, since your SIGINT may arrive while the backend is idle
waiting for the client, but it gives the largest window we can easily
provide. Noted while experimenting with new control-C code for psql.
already-aborted transaction block. GetSnapshotData throws an Assert if
not in a valid transaction; hence we mustn't attempt to set a snapshot
for the function until after checking for aborted transaction. This is
harmless AFAICT if Asserts aren't enabled (GetSnapshotData will compute
a bogus snapshot, but it doesn't matter since HandleFunctionRequest will
throw an error shortly anywy). Hence, not a major bug.
Along the way, add some ability to log fastpath calls when statement
logging is turned on. This could probably stand to be improved further,
but not logging anything is clearly undesirable.
Backpatched as far as 8.0; bug doesn't exist before that.
LWLocks during a panic exit. This avoids the possible self-deadlock pointed
out by Qingqing Zhou. Also, I noted that an error during LoadFreeSpaceMap()
or BuildFlatFiles() would result in exit(0) which would leave the postmaster
thinking all is well. Added a critical section to ensure such errors don't
allow startup to proceed.
Backpatched to 8.1. The 8.0 code is a bit different and I'm not sure if the
problem exists there; given we've not seen this reported from the field, I'm
going to be conservative about backpatching any further.
o remove many WIN32_CLIENT_ONLY defines
o add WIN32_ONLY_COMPILER define
o add 3rd argument to open() for portability
o add include/port/win32_msvc directory for
system includes
Magnus Hagander
it is just the total time to do INSTR_TIME_SET_CURRENT(), and not any of
the other code involved in InstrStartNode/InstrStopNode. Even though I
fear we may end up reverting this patch altogether, we may as well have
the most correct version in our CVS archive.
choose_bitmap_and(). It was way too fuzzy --- per comment, it was meant to be
1% relative difference, but was actually coded as 0.01 absolute difference,
thus causing selectivities of say 0.001 and 0.000000000001 to be treated as
equal. I believe this thinko explains Maxim Boguk's recent complaint. While
we could change it to a relative test coded like compare_fuzzy_path_costs(),
there's a bigger problem here, which is that any fuzziness at all renders the
comparison function non-transitive, which could confuse qsort() to the point
of delivering completely wrong results. So forget the whole thing and just
do an exact comparison.