> Changes to avoid collisions with WIN32 & MFC names...
> 1. Renamed:
> a. PROC => PGPROC
> b. GetUserName() => GetUserNameFromId()
> c. GetCurrentTime() => GetCurrentDateTime()
> d. IGNORE => IGNORE_DTF in include/utils/datetime.h & utils/adt/datetim
>
> 2. Added _P to some lex/yacc tokens:
> CONST, CHAR, DELETE, FLOAT, GROUP, IN, OUT
Jan
transaction, so as to avoid returning them out of the index AM. Saves
repeated heap_fetch operations on frequently-updated rows. Also detect
queries on unique keys (equality to all columns of a unique index), and
don't bother continuing scan once we have found first match.
Killing is implemented in the btree and hash AMs, but not yet in rtree
or gist, because there isn't an equally convenient place to do it in
those AMs (the outer amgetnext routine can't do it without re-pinning
the index page).
Did some small cleanup on APIs of HeapTupleSatisfies, heap_fetch, and
index_insert to make this a little easier.
in snapshots, per my proposal of a few days ago. Also, tweak heapam.c
routines (heap_insert, heap_update, heap_delete, heap_mark4update) to
be passed the command ID to use, instead of doing GetCurrentCommandID.
For catalog updates they'll still get passed current command ID, but
for updates generated from the main executor they'll get passed the
command ID saved in the snapshot the query is using. This should fix
some corner cases associated with functions and triggers that advance
current command ID while an outer query is still in progress.
yesterday's proposal to pghackers. Also remove unnecessary parameters
to heap_beginscan, heap_rescan. I modified pg_proc.h to reflect the
new numbers of parameters for the AM interface routines, but did not
force an initdb because nothing actually looks at those fields.
As proof of concept, provide an alternate implementation based on POSIX
semaphores. Also push the SysV shared-memory implementation into a
separate file so that it can be replaced conveniently.
was in the thread "make BufferGetBlockNumber() a macro". Tom
objected to the original patch, so I prepared a new one which
doesn't change BufferGetBlockNumber() into a macro, it just
cleans up some comments and fixes an assertion. The patch
is attached.
Neil Conway
for Solaris on SPARC. Scott Brunza (sbrunza@sonalysts.com) gets
credit for identifying the issue, making the change, and doing
the regression tests.
Earlier testing on 7.2rc2 and 7.2 showed performance gains of
1% to 10% on pgbench, osdb-pg, and some locally developed apps.
Solaris Intimate Shared Memory is described in "SOLARIS INTERNALS
Core Kernel Components" by Jim Mauro and Richard McDougall,
Copyright 2001 Sun Microsystem, Inc. ISBN 0-13-022496-0
P.J. "Josh" Rovero
The patch enables the mips2 ISA for the ll/sc operations, and then restores
it when done. The kernel/libc emulation code will take over on CPUs without
ll/sc, and on CPUs with it, it'll use the operations provided by the CPU.
Combined with the earlier fix (removing -mips2), postgresql builds again on
mips and mipsel. The patch is against 7.2-7.
Oliver Elphick
o Change all current CVS messages of NOTICE to WARNING. We were going
to do this just before 7.3 beta but it has to be done now, as you will
see below.
o Change current INFO messages that should be controlled by
client_min_messages to NOTICE.
o Force remaining INFO messages, like from EXPLAIN, VACUUM VERBOSE, etc.
to always go to the client.
o Remove INFO from the client_min_messages options and add NOTICE.
Seems we do need three non-ERROR elog levels to handle the various
behaviors we need for these messages.
Regression passed.
now just below FATAL in server_min_messages. Added more text to
highlight ordering difference between it and client_min_messages.
---------------------------------------------------------------------------
REALLYFATAL => PANIC
STOP => PANIC
New INFO level the prints to client by default
New LOG level the prints to server log by default
Cause VACUUM information to print only to the client
NOTICE => INFO where purely information messages are sent
DEBUG => LOG for purely server status messages
DEBUG removed, kept as backward compatible
DEBUG5, DEBUG4, DEBUG3, DEBUG2, DEBUG1 added
DebugLvl removed in favor of new DEBUG[1-5] symbols
New server_min_messages GUC parameter with values:
DEBUG[5-1], INFO, NOTICE, ERROR, LOG, FATAL, PANIC
New client_min_messages GUC parameter with values:
DEBUG[5-1], LOG, INFO, NOTICE, ERROR, FATAL, PANIC
Server startup now logged with LOG instead of DEBUG
Remove debug_level GUC parameter
elog() numbers now start at 10
Add test to print error message if older elog() values are passed to elog()
Bootstrap mode now has a -d that requires an argument, like postmaster
removes any empty chunks, the chunk previously added won't be there
anymore, so it's possible there is zero free space in the rel's page list
afterwards. Must loop back and rerun the part that adds a chunk to
the list.
to prevent spreading of corruption when page header pointers are bad.
Merge PageZero into PageInit, since it was never used separately, and
remove separate memset calls used at most other PageInit call points.
Remove IndexPageCleanup, which wasn't used at all.
granted the lock when awakened; the signal now only means that the lock
is potentially available. The waiting process must retry its attempt
to get the lock when it gets to run. This allows the lock releasing
process to re-acquire the lock later in its timeslice. Since LWLocks
are usually held for short periods, it is possible for a process to
acquire and release the same lock many times in a timeslice. The old
spinlock-based implementation of these locks allowed for that; but the
original coding of LWLock would force a process swap for each acquisition
if there was any contention. Although this approach reopens the door to
process starvation (a waiter might repeatedly fail to get the lock),
the odds of that being a big problem seem low, and the performance cost
of the previous approach is considerable.
'volatile' pointers to access those structures, so that optimizing
compilers will not decide to move the structure accesses outside of the
spinlock-acquire-to-spinlock-release sequence. There are no known bugs
in these uses at present, but based on bad experience with lwlock.c,
it seems prudent to ensure that we protect these other uses too.
Per pghackers discussion around 12-Dec. (Note: it should not be
necessary to worry about structures protected by LWLocks, since the
LWLock acquire and release operations are not inline macros.)
should be accounted for in the PROC_SEM_MAP_ENTRIES() macro. Otherwise
the ports that rely on this macro to size data structures are broken.
Mea culpa.
so that only one signal number is used not three. Flags in shared
memory tell the reason(s) for the current signal. This method is
extensible to handle more signal reasons without chewing up even more
signal numbers, but the immediate reason is to keep pg_pwd reloads
separate from SIGHUP processing in the postmaster.
Also clean up some problems in the postmaster with delayed response to
checkpoint status changes --- basically, it wouldn't schedule a checkpoint
if it wasn't getting connection requests on a regular basis.
never overwrite adjacent pages with copied data, even if page header
and/or item pointers are already corrupt. Change inspired by trouble
report from Alvaro Herrera.
readability. Bizarre '(long *) TRUE' return convention is gone,
in favor of just raising an error internally in dynahash.c when
we detect hashtable corruption. HashTableWalk is gone, in favor
of using hash_seq_search directly, since it had no hope of working
with non-LONGALIGNable datatypes. Simplify some other code that was
made undesirably grotty by promixity to HashTableWalk.
portability issues). Caller-visible data structures are now allocated
on MAXALIGN boundaries, allowing safe use of datatypes wider than 'long'.
Rejigger hash_create API so that caller specifies size of key and
total size of entry, not size of key and size of rest of entry.
This simplifies life considerably since each number is just a sizeof(),
and padding issues etc. are taken care of automatically.
upper limit on what we will believe from sysconf(_SC_OPEN_MAX). The
default value is 1000, so that under ordinary conditions it won't
affect the behavior. But on platforms where the kernel promises far
more than it can deliver, this can be used to prevent running out of
file descriptors. See numerous past discussions, eg, pgsql-hackers
around 23-Dec-2000.
existing lock manager and spinlocks: it understands exclusive vs shared
lock but has few other fancy features. Replace most uses of spinlocks
with lightweight locks. All remaining uses of spinlocks have very short
lock hold times (a few dozen instructions), so tweak spinlock backoff
code to work efficiently given this assumption. All per my proposal on
pghackers 26-Sep-01.
a hung client or lost connection can't indefinitely block a postmaster
child (not to mention the possibility of deliberate DoS attacks).
Timeout is controlled by new authentication_timeout GUC variable,
which I set to 60 seconds by default ... does that seem reasonable?
for them, and making them just wastes time during backend startup/shutdown.
Also, remove compile-time MAXBACKENDS limit per long-ago proposal.
You can now set MaxBackends as high as your kernel can stand without
any reconfiguration/recompilation.
available in freeSemMap. As noted by Tatsuo, this is now a likely
scenario for detecting MaxBackends-exceeded; if MaxBackends is a multiple
of PROC_NSEMS_PER_SET then we will fail here and not in sinval.c. The
cleanup path did not work correctly before, anyway.
system. Some systems did not understand the 'l' section, and in general
it wasn't entirely appropriate.
On SCO OpenServer, the man pages won't be installed at all until someone
figures out their man system.
buffer manager with 'pg_clog', a specialized access method modeled
on pg_xlog. This simplifies startup (don't need to play games to
open pg_log; among other things, OverrideTransactionSystem goes away),
should improve performance a little, and opens the door to recycling
commit log space by removing no-longer-needed segments of the commit
log. Actual recycling is not there yet, but I felt I should commit
this part separately since it'd still be useful if we chose not to
do transaction ID wraparound.
cvs.
The Debian bug report says, "The upstream source makes use of NOFILE
unconditionalized. As the Hurd doesn't have an arbitrary limit on the
number of open files, this is not defined. But _SC_OPEN_MAX works fine
and returns 1024 (applications can increase this as they want), so I
suggest the below diff. Please forward this upstream, too."
Oliver Elphick
in GetSnapshotData, GetNewTransactionId, CommitTransaction, AbortTransaction,
etc. Correct race condition in transaction status testing in
HeapTupleSatisfiesVacuum --- this wasn't important for old VACUUM with
exclusive lock on its table, but it sure is important now. All per
pghackers discussion 7/11/01 and 7/12/01.
validity checking rules for VACUUM. Make some other rearrangements of the
VACUUM code to allow more code to be shared between full and lazy VACUUM.
Minor code cleanups and added comments for TransactionId manipulations.
useful as yet, since its primary source of information is (full) VACUUM,
which makes a concerted effort to get rid of free space before telling
the map about it ... next stop is concurrent VACUUM ...
stub) into the rest of the system. Adopt a cleaner approach to preventing
deadlock in concurrent heap_updates: allow RelationGetBufferForTuple to
select any page of the rel, and put the onus on it to lock both buffers
in a consistent order. Remove no-longer-needed isExtend hack from
API of ReleaseAndReadBuffer.
do anything yet, but it has the necessary connections to initialization
and so forth. Make some gestures towards allowing number of blocks in
a relation to be BlockNumber, ie, unsigned int, rather than signed int.
(I doubt I got all the places that are sloppy about it, yet.) On the
way, replace the hardwired NLOCKS_PER_XACT fudge factor with a GUC
variable.
SI messages now include the relevant database OID, so that operations
in one database do not cause useless cache flushes in backends attached
to other databases. Declare SI messages properly using a union, to
eliminate the former assumption that Oid is the same size as int or Index.
Rewrite the nearly-unreadable code in inval.c, and document it better.
Arrange for catcache flushes at end of command/transaction to happen before
relcache flushes do --- this avoids loading a new tuple into the catcache
while setting up new relcache entry, only to have it be flushed again
immediately.
detected sooner in backend startup, and is treated as an expected error
(it gives 'Sorry, too many clients already' now). This allows us not
to have to enforce the MaxBackends limit exactly in the postmaster.
Also, remove ProcRemove() and fold its functionality into ProcKill().
There's no good reason for a backend not to be responsible for removing
its PROC entry, and there are lots of good reasons for the postmaster
not to be touching shared-memory data structures.
pg_database now has unique indexes on oid and on datname.
pg_shadow now has unique indexes on usename and on usesysid.
pg_am now has unique index on oid.
pg_opclass now has unique index on oid.
pg_amproc now has unique index on amid+amopclaid+amprocnum.
Remove pg_rewrite's unnecessary index on oid, delete unused RULEOID syscache.
Remove index on pg_listener and associated syscache for performance reasons
(caching rows that are certain to change before you need 'em again is
rather pointless).
Change pg_attrdef's nonunique index on adrelid into a unique index on
adrelid+adnum.
Fix various incorrect settings of pg_class.relisshared, make that the
primary reference point for whether a relation is shared or not.
IsSharedSystemRelationName() is now only consulted to initialize relisshared
during initial creation of tables and indexes. In theory we might now
support shared user relations, though it's not clear how one would get
entries for them into pg_class &etc of multiple databases.
Fix recently reported bug that pg_attribute rows created for an index all have
the same OID. (Proof that non-unique OID doesn't matter unless it's
actually used to do lookups ;-))
There's no need to treat pg_trigger, pg_attrdef, pg_relcheck as bootstrap
relations. Convert them into plain system catalogs without hardwired
entries in pg_class and friends.
Unify global.bki and template1.bki into a single init script postgres.bki,
since the alleged distinction between them was misleading and pointless.
Not to mention that it didn't work for setting up indexes on shared
system relations.
Rationalize locking of pg_shadow, pg_group, pg_attrdef (no need to use
AccessExclusiveLock where ExclusiveLock or even RowExclusiveLock will do).
Also, hold locks until transaction commit where necessary.
directory (which can be made a symlink to put temp files on another disk).
Add code to delete leftover temp files during postmaster startup.
Bruce, with some kibitzing from Tom.
appropriate pin-count manipulation, and instead use ReleaseAndReadBuffer.
Make use of the fact that the passed-in buffer (if there is one) must
be pinned to avoid grabbing the bufmgr spinlock when we are able to
return this same buffer. Eliminate unnecessary 'previous tuple' and
'next tuple' fields of HeapScanDesc and IndexScanDesc, thereby removing
a whole lot of bookkeeping from heap_getnext() and related routines.
checkpoint's redo pointer, not its undo pointer, per discussion in
pghackers a few days ago. No point in hanging onto undo information
until we have the ability to do something with it --- and this solves
a rather large problem with log space for long-running transactions.
Also, change all calls of write() to detect the case where write
returned a count less than requested, but failed to set errno.
Presume that this situation indicates ENOSPC, and give the appropriate
error message, rather than a random message associated with the previous
value of errno.
Python) to support shared extension modules, I have learned that Guido
prefers the style of the attached patch to solve the above problem.
I feel that this solution is particularly appropriate in this case
because the following:
PglargeType
PgType
PgQueryType
are already being handled in the way that I am proposing for PgSourceType.
Jason Tishler
when we need to move to a new page; as long as we can insert the new
tuple on the same page as before, we only need LockBuffer and not the
expensive stuff. Also, twiddle bufmgr interfaces to avoid redundant
lseeks in RelationGetBufferForTuple and BufferAlloc. Successive inserts
now require one lseek per page added, rather than one per tuple with
several additional ones at each page boundary as happened before.
Lock contention when multiple backends are inserting in same table
is also greatly reduced.
not being consulted anywhere, so remove it and remove the _mdnblocks()
calls that were used to set it. Change smgrextend interface to pass in
the target block number (ie, current file length) --- the caller always
knows this already, having already done smgrnblocks(), so it's silly to
do it over again inside mdextend. Net result: extension of a file now
takes one lseek(SEEK_END) and a write(), not three lseeks and a write.
> cronjob:
> NOTICE: RegisterSharedInvalid: SI buffer overflow
> NOTICE: InvalidateSharedInvalid: cache state reset
> I don't understand what these mean. Should I be concerned about them
> and what do they signify?
No real need to worry. Those should've been downgraded to DEBUG-level
messages a release or two back, but nobody bothered...
Tom Lane
VFD entries. On platforms where dereferencing a null pointer doesn't
lead to coredump, it's possible that this omission could have led to
unpleasant behavior like deleting the wrong file.
*before* acquiring shlock on buffer context. This way we should be
protected against conflicts with FlushRelationBuffers.
(Seems we never do excl lock and then StartBufferIO for the same
buffer, so there should be no deadlock here, - but we'd better
check this very soon).
* Store two past checkpoint locations, not just one, in pg_control.
On startup, we fall back to the older checkpoint if the newer one
is unreadable. Also, a physical copy of the newest checkpoint record
is kept in pg_control for possible use in disaster recovery (ie,
complete loss of pg_xlog). Also add a version number for pg_control
itself. Remove archdir from pg_control; it ought to be a GUC
parameter, not a special case (not that it's implemented yet anyway).
* Suppress successive checkpoint records when nothing has been entered
in the WAL log since the last one. This is not so much to avoid I/O
as to make it actually useful to keep track of the last two
checkpoints. If the things are right next to each other then there's
not a lot of redundancy gained...
* Change CRC scheme to a true 64-bit CRC, not a pair of 32-bit CRCs
on alternate bytes. Polynomial borrowed from ECMA DLT1 standard.
* Fix XLOG record length handling so that it will work at BLCKSZ = 32k.
* Change XID allocation to work more like OID allocation. (This is of
dubious necessity, but I think it's a good idea anyway.)
* Fix a number of minor bugs, such as off-by-one logic for XLOG file
wraparound at the 4 gig mark.
* Add documentation and clean up some coding infelicities; move file
format declarations out to include files where planned contrib
utilities can get at them.
* Checkpoint will now occur every CHECKPOINT_SEGMENTS log segments or
every CHECKPOINT_TIMEOUT seconds, whichever comes first. It is also
possible to force a checkpoint by sending SIGUSR1 to the postmaster
(undocumented feature...)
* Defend against kill -9 postmaster by storing shmem block's key and ID
in postmaster.pid lockfile, and checking at startup to ensure that no
processes are still connected to old shmem block (if it still exists).
* Switch backends to accept SIGQUIT rather than SIGUSR1 for emergency
stop, for symmetry with postmaster and xlog utilities. Clean up signal
handling in bootstrap.c so that xlog utilities launched by postmaster
will react to signals better.
* Standalone bootstrap now grabs lockfile in target directory, as added
insurance against running it in parallel with live postmaster.
only if at least N other backends currently have open transactions. This
is not a great deal of intelligence about whether a delay might be
profitable ... but it beats no intelligence at all. Note that the default
COMMIT_DELAY is still zero --- this new code does nothing unless that
setting is changed.
Also, mark ENABLEFSYNC as a system-wide setting. It's no longer safe to
allow that to be set per-backend, since we may be relying on some other
backend's fsync to have synced the WAL log.
does not lead to a one-second delay, but to an immediate EINVAL failure.
This causes CHECKPOINT to crash with s_lock_stuck much too quickly :-(.
Fix by breaking down the requested wait div/mod 1e6.
> Is there one LOCKMETHODCTL for every backend? I thought there was only
> one of them.
>>
>> You're right, that line is erroneous; it should read
>>
>> size += MAX_LOCK_METHODS * MAXALIGN(sizeof(LOCKMETHODCTL));
>>
>> Not a significant error but it should be changed for clarity ...
waste of cycles on single-CPU machines, and of dubious utility on multi-CPU
machines too.
Tweak s_lock_stuck so that caller can specify timeout interval, and
increase interval before declaring stuck spinlock for buffer locks and XLOG
locks.
On systems that have fdatasync(), use that rather than fsync() to sync WAL
log writes. Ensure that WAL file is entirely allocated during XLogFileInit.
are now separate files "postgres.h" and "postgres_fe.h", which are meant
to be the primary include files for backend .c files and frontend .c files
respectively. By default, only include files meant for frontend use are
installed into the installation include directory. There is a new make
target 'make install-all-headers' that adds the whole content of the
src/include tree to the installed fileset, for use by people who want to
develop server-side code without keeping the complete source tree on hand.
Cleaned up a whole lot of crufty and inconsistent header inclusions.
bothering to check the return value --- which meant that in case the
update or delete failed because of a concurrent update, you'd not find
out about it, except by observing later that the transaction produced
the wrong outcome. There are now subroutines simple_heap_update and
simple_heap_delete that should be used anyplace that you're not prepared
to do the full nine yards of coping with concurrent updates. In
practice, that seems to mean absolutely everywhere but the executor,
because *noplace* else was checking.
rewrite of deadlock checking. Lock holder objects are now reachable from
the associated LOCK as well as from the owning PROC. This makes it
practical to find all the processes holding a lock, as well as all those
waiting on the lock. Also, clean up some of the grottier aspects of the
SHMQueue API, and cause the waitProcs list to be stored in the intuitive
direction instead of the nonintuitive one. (Bet you didn't know that
the code followed the 'prev' link to get to the next waiting process,
instead of the 'next' link. It doesn't do that anymore.)
here is the patch attached which do check in each BLOB operation, if we are
in transaction, and raise an error otherwise. This will prevent such mistakes.
--
Sincerely Yours,
Denis Perchine
are treated more like 'cancel' interrupts: the signal handler sets a
flag that is examined at well-defined spots, rather than trying to cope
with an interrupt that might happen anywhere. See pghackers discussion
of 1/12/01.
are now critical sections, so as to ensure die() won't interrupt us while
we are munging shared-memory data structures. Avoid insecure intermediate
states in some code that proc_exit will call, like palloc/pfree. Rename
START/END_CRIT_CODE to START/END_CRIT_SECTION, since that seems to be
what people tend to call them anyway, and make them be called with () like
a function call, in hopes of not confusing pg_indent.
I doubt that this is sufficient to make SIGTERM safe anywhere; there's
just too much code that could get invoked during proc_exit().
starting a new hashtable search no longer clobbers any other search
active anywhere in the system. Fix RelationCacheInvalidate() so that
it will not crash or go into an infinite loop if invoked recursively,
as for example by a second SI Reset message arriving while we are still
processing a prior one.
In theory we should always get EEXIST if there's a key collision, but
if the kernel code tests error conditions in a weird order, perhaps
EACCES or EIDRM could occur too.
assume that TAS() will always succeed the first time, even if the lock
is known to be free. Also, make sure that code will eventually time out
and report a stuck spinlock, rather than looping forever. Small cleanups
in s_lock.h, too.
level" locks. A session lock is not released at transaction commit (but it
is released on transaction abort, to ensure recovery after an elog(ERROR)).
In VACUUM, use a session lock to protect the master table while vacuuming a
TOAST table, so that the TOAST table can be done in an independent
transaction.
I also took this opportunity to do some cleanup and renaming in the lock
code. The previously noted bug in ProcLockWakeup, that it couldn't wake up
any waiters beyond the first non-wakeable waiter, is now fixed. Also found
a previously unknown bug of the same kind (failure to scan all members of
a lock queue in some cases) in DeadLockCheck. This might have led to failure
to detect a deadlock condition, resulting in indefinite waits, but it's
difficult to characterize the conditions required to trigger a failure.
might change it. Experimentation shows that the signal handler call
mechanism does not save/restore errno for you, at least not on Linux
or HPUX, so this is definitely a real risk.
to ensure that we have released buffer refcounts and so forth, rather than
putting ad-hoc operations before (some of the calls to) proc_exit. Add
commentary to discourage future hackers from repeating that mistake.
included by everything that includes bufmgr.h --- it's supposed to be
internals, after all, not part of the API! This fixes the conflict
against FreeBSD headers reported by Rosenman, by making it unnecessary
for s_lock.h to be included by plperl.c.
IPC key assignment will now work correctly even when multiple postmasters
are using same logical port number (which is possible given -k switch).
There is only one shared-mem segment per postmaster now, not 3.
Rip out broken code for non-TAS case in bufmgr and xlog, substitute a
complete S_LOCK emulation using semaphores in spin.c. TAS and non-TAS
logic is now exactly the same.
When deadlock is detected, "Deadlock detected" is now the elog(ERROR)
message, rather than a NOTICE that comes out before an unhelpful ERROR.
Context diff this time.
Remove -m486 compile args for FreeBSD-i386, compile -O2 on i386.
Compile with only -O on alpha for codegen safety.
Make the port use the TEST_AND_SET for alpha and i386 on FreeBSD.
Fix a lot of bogus string formats for outputting pointers (cast to int
and %u/%x replaced with no cast and %p), and 'Size'(size_t) are now
cast to 'unsigned long' and output with %lu/
Remove an unused variable.
Alfred Perlstein
that search loops only have to scan that far and not through all maxBackends
entries. This eliminates a performance penalty for setting maxBackends
much higher than the average number of active backends. Also, eliminate
no-longer-used 'backend tag' concept. Remove setting of environment
variables at backend start (except for CYR_RECODE), since none of them
are being examined by the backend any longer.
(WAL logging for this is not done yet, however.) Clean up a number of really
crufty things that are no longer needed now that DROP behaves nicely. Make
temp table mapper do the right things when drop or rename affecting a temp
table is rolled back. Also, remove "relation modified while in use" error
check, in favor of locking tables at first reference and holding that lock
throughout the statement.
kibitzing from Tom Lane. Large objects are now all stored in a single
system relation "pg_largeobject" --- no more xinv or xinx files, no more
relkind 'l'. This should offer substantial performance improvement for
large numbers of LOs, since there won't be directory bloat anymore.
It'll also fix problems like running out of locktable space when you
access thousands of LOs in one transaction.
Also clean up cruft in read/write routines. LOs with "holes" in them
(never-written byte ranges) now work just like Unix files with holes do:
a hole reads as zeroes but doesn't occupy storage space.
INITDB forced!
from bufmgr - it would be nice to have separate hash in smgr
for node <--> fd mappings, but for the moment it's easy to
add new hash to relcache.
Fixed small bug in xlog.c:ReadRecord.
and DropBuffers. Formerly we cleared the flag for each buffer currently
belonging to the target rel or database, but that's completely wrong!
Must look at BufferTagLastDirtied to see whether the BufferDirtiedByMe
flag is relevant to target rel or not; this is *independent* of the
current contents of the buffer. Vadim spotted this problem, but his
fix was only partially correct...
> Regression tests opr_sanity and sanity_check are now failing.
Um, Bruce, I've said several times that I didn't think Perchine's large
object changes should be applied until someone had actually reviewed
them.
I tested it restoring my database with > 100000 BLOBS, and dumping it out.
But unfortunatly I can not restore it back due to problems in pg_dump.
--
Sincerely Yours,
Denis Perchine
source directory. This involves mostly makefiles using $(srcdir) when they
might have used ".". (Regression tests don't work with this, yet.)
Sort out usage of CPPFLAGS, CFLAGS (and CXXFLAGS). Add "override" keyword
in most places, to preserve necessary flags even when the user overrode the
flags.
> this is patch v 0.4 to support transactions with BLOBs.
> All BLOBs are in one table. You need to make initdb.
>
> --
> Sincerely Yours,
> Denis Perchine
after that dynamic loading isn't working and shared memory handling is
broken.
Attached with this message, there is a Zip file which contain :
* beos.diff = patch file generated with difforig
* beos = folder with beos support files which need to be moved in /
src/backend/port
* expected = foler with three file for message and precision
difference in regression test
* regression.diff = rule problem (need to kill the backend manualy)
* dynloader = dynloader files (they are also in the pacth files,
but there is so much modification that I have join full files)
Everything works except a problem in 'rules' Is there some problems
with rules in the current tree ? It used to works with last week tree.
Cyril VELTER
working on the VERY latest version of BeOS. I'm sure there will be
alot of comments, but then if there weren't I'd be disappointed!
Thanks for your continuing efforts to get this into your tree.
Haven't bothered with the new files as they haven't changed.
BTW Peter, the compiler is "broken" about the bool define and so on.
I'm filing a bug report to try and get it addressed. Hopefully then we
can tidy up the code a bit.
I await the replies with interest :)
David Reid
virtual FDs, we just return the ENFILE/EMFILE error to the caller,
rather than immediate elog(). This allows more robust behavior in
the postmaster, which uses AllocateFile() but does not want elog().
duplicate keys by letting search go to the left rather than right when an
equal key is seen at an upper tree level. Fix poor choice of page split
point (leading to insertion failures) that was forced by chaining logic.
Don't store leftmost key in non-leaf pages, since it's not necessary.
Don't create root page until something is first stored in the index, so an
unused index is now 8K not 16K. (Doesn't seem to be as easy to get rid of
the metadata page, unfortunately.) Massive cleanup of unreadable code,
fix poor, obsolete, and just plain wrong documentation and comments.
See src/backend/access/nbtree/README for the gory details.
There's now only one transition value and transition function.
NULL handling in aggregates is a lot cleaner. Also, use Numeric
accumulators instead of integer accumulators for sum/avg on integer
datatypes --- this avoids overflow at the cost of being a little slower.
Implement VARIANCE() and STDDEV() aggregates in the standard backend.
Also, enable new LIKE selectivity estimators by default. Unrelated
change, but as long as I had to force initdb anyway...
pass-by-ref data types --- eg, an index on lower(textfield) --- no longer
leak memory during index creation or update. Clean up a lot of redundant
code ... did you know that copy, vacuum, truncate, reindex, extend index,
and bootstrap each basically duplicated the main executor's logic for
extracting information about an index and preparing index entries?
Functional indexes should be a little faster now too, due to removal
of repeated function lookups.
CREATE INDEX 'opt_type' clause is deimplemented by these changes,
but I haven't removed it from the parser yet (need to merge with
Thomas' latest change set first).
Don't go through pg_exec_query_dest(), but directly to the execution
routines. Also, extend parameter lists so that there's no need to
change the global setting of allowSystemTableMods, a hack that was
certain to cause trouble in the event of any error.
for details). It doesn't really do that much yet, since there are no
short-term memory contexts in the executor, but the infrastructure is
in place and long-term contexts are handled reasonably. A few long-
standing bugs have been fixed, such as 'VACUUM; anything' in a single
query string crashing. Also, out-of-memory is now considered a
recoverable ERROR, not FATAL.
Eliminate a large amount of crufty, now-dead code in and around
memory management.
Fix problem with holding off SIGTRAP, SIGSEGV, etc in postmaster and
backend startup.