additional argument specifying the kind of lock to acquire/release (or
'NoLock' to do no lock processing). Ensure that all relations are locked
with some appropriate lock level before being examined --- this ensures
that relevant shared-inval messages have been processed and should prevent
problems caused by concurrent VACUUM. Fix several bugs having to do with
mismatched increment/decrement of relation ref count and mismatched
heap_open/close (which amounts to the same thing). A bogus ref count on
a relation doesn't matter much *unless* a SI Inval message happens to
arrive at the wrong time, which is probably why we got away with this
sloppiness for so long. Repair missing grab of AccessExclusiveLock in
DROP TABLE, ALTER/RENAME TABLE, etc, as noted by Hiroshi.
Recommend 'make clean all' after pulling this update; I modified the
Relation struct layout slightly.
Will post further discussion to pghackers list shortly.
and possibly for other cases too:
DO NOT cache status of transaction in unknown state
(i.e. non-committed and non-aborted ones)
Example:
T1 reads row updated/inserted by running T2 and cache T2 status.
T2 commits.
Now T1 reads a row updated by T2 and with HEAP_XMAX_COMMITTED
in t_infomask (so cached T2 status is not changed).
Now T1 EvalPlanQual gets updated row version without HEAP_XMIN_COMMITTED
-> TransactionIdDidCommit(t_xmin) and TransactionIdDidAbort(t_xmin)
return FALSE and T2 decides that t_xmin is not committed and gets
ERROR above.
It's too late to find more smart way to handle such cases and so
I just changed xact status caching and got rid TransactionIdFlushCache()
from code.
Changed: transam.c, xact.c, lmgr.c and transam.h - last three
just because of TransactionIdFlushCache() is removed.
2. heapam.c:
T1 marked a row for update. T2 waits for T1 commit/abort.
T1 commits. T3 updates the row before T2 locks row page.
Now T2 sees that new row t_xmax is different from xact id (T1)
T2 was waiting for. Old code did Assert here. New one goes to
HeapTupleSatisfiesUpdate. Obvious changes too.
3. Added Assert to vacuum.c
4. bufmgr.c: break
Assert(buf->r_locks == 0 && !buf->ri_lock)
into two Asserts.
2. varsup.c:ReadNewTransactionId(): don't read nextXid from disk -
this func doesn't allocate next xid, so ShmemVariableCache->nextXid
may be used (but GetNewTransactionId() must be called first).
3. vacuum.c: change elog(ERROR, "Child item....") to elog(NOTICE) -
this is not ERROR, proper handling is just not implemented, yet.
4. s_lock.c: increase S_MAX_BUSY by 2 times.
5. shmem.c:GetSnapshotData(): have to call ReadNewTransactionId()
_after_ SpinAcquire(ShmemIndexLock).
2. Much faster btree tuples deletion in the case when first on page
index tuple is deleted (no movement to the left page(s)).
3. Remember blkno of new root page in BTPageOpaque of
left/right siblings when root page is splitted.
NetBSD/macppc
LinuxPPC
FreeBSD 2.2.6-RELEASE
All of them seem happy with the regression test. Note that, however,
compiling with optimization enabled on NetBSD/macppc causes an initdb
failure (other two platforms are ok). After checking the asm code, we
are suspecting that might be a compiler(egcs) bug.
Tatsuo Ishii
shared memory space allocation. It's a wonder we have not seen bug
reports traceable to this area ... it's quite clear that the routine
dir_realloc() has never worked correctly, for example.
Ok. I made patches replacing all of "#if FALSE" or "#if 0" to "#ifdef
NOT_USED" for current. I have tested these patches in that the
postgres binaries are identical.
Nakajima. Since he is not subscribing the mailing list, I'm posting
his patches by his request. According to him, he has successfully
compiled and passed the regression test on Mac SE/30 running
NetBSD/m68k. Also, another person has reported that with the patches
PostgreSQL is working on NetBSD/sun3 too.
--
Tatsuo Ishii
compiled with -O0. Included are patches that should fix the problem
(of course I have confirmed -O2 works with this patch).
BTW, here is a platforms/regression test failure(serious one--backend
death) matrix.
Tatsuo Ishii
no longer returns buffer pointer, can be gotten from scan;
descriptor; bootstrap can create multi-key indexes;
pg_procname index now is multi-key index; oidint2, oidint4, oidname
are gone (must be removed from regression tests); use System Cache
rather than sequential scan in many places; heap_modifytuple no
longer takes buffer parameter; remove unused buffer parameter in
a few other functions; oid8 is not index-able; remove some use of
single-character variable names; cleanup Buffer variables usage
and scan descriptor looping; cleaned up allocation and freeing of
tuples; 18k lines of diff;
This incorporates all the precedeing patches and emailed suggestions
and the results of the performance testing I posted last week. I
would like to get this tested on as many platforms as possible so
I can verify it went in correctly (as opposed to the horrorshow
last time I sent in a patch).
Once this is confirmed, I will make a tarball of files that can be
dropped into a 6.3.2 source tree as a few people have asked for
this in 6.3.2 as well.
David Gould
Attached patch will add a version() function to Postges, e.g.
template1=> select version();
version
------------------------------------------------------------
PostgreSQL 6.3.2 on i586-pc-linux-gnu, compiled by gcc 2.8.1
(1 row)
Ok, I have finally gotten all of the defines for Dec/Alpha and
Linux/Alpha sorted out as Marc asked. There is no longer any need for
'-Dalpha' or '-Dlinuxalpha' in either the Dec/Alpha or the Linux/Alpha
template files (./src/template/{alpha,linuxalpha}). I have replaced every
instance of 'alpha' or '__alpha__' with '__alpha', as that appears to be
the common symbol between C compilers on both operating systems (RH4.2 &
DecUnix 4.0b) for alpha.
Attached you'll find a (big) patch that fixes make dep and make
depend in all Makefiles where I found it to be appropriate.
It also removes the dependency in Makefile.global for NAMEDATALEN
and OIDNAMELEN by making backend/catalog/genbki.sh and bin/initdb/initdb.sh
a little smarter.
This no longer requires initdb.sh that is turned into initdb with
a sed script when installing Postgres, hence initdb.sh should be
renamed to initdb (after the patch has been applied :-) )
This patch is against the 6.3 sources, as it took a while to
complete.
Please review and apply,
Cheers,
Jeroen van Vianen
[This is a repost - it supercedes the previous one. It fixes the patch so
it doesn't bread aix port, plus there's a file missing out of the
original post because difforig doesn't pick up new files. It's now
attached. peter]
This patch brings the JDBC driver up to the current protocol spec.
Basically, the backend now tells the driver what authentication scheme to
use.
The patch also fixes a performance problem with large objects. In the
buffer manager, each fastpath call was sending multiple Notifications to
the backend (sometimes more data in the form of notifications were being
sent than blob data!).
==========================================
What follows is a set of diffs that cleans up the usage of BLCKSZ.
As a side effect, the person compiling the code can change the
value of BLCKSZ _at_their_own_risk_. By that, I mean that I've
tried it here at 4096 and 16384 with no ill-effects. A value
of 4096 _shouldn't_ affect much as far as the kernel/file system
goes, but making it bigger than 8192 can have severe consequences
if you don't know what you're doing. 16394 worked for me, _BUT_
when I went to 32768 and did an initdb, the SCSI driver broke and
the partition that I was running under went to hell in a hand
basket. Had to reboot and do a good bit of fsck'ing to fix things up.
The patch can be safely applied though. Just leave BLCKSZ = 8192
and everything is as before. It basically only cleans up all of the
references to BLCKSZ in the code.
If this patch is applied, a comment in the config.h file though above
the BLCKSZ define with warning about monkeying around with it would
be a good idea.
Darren darrenk@insightdist.com
(Also cleans up some of the #includes in files referencing BLCKSZ.)
==========================================
Makefile.global.
End result, if all goes well, should allow for much easier porting, since
there will no longer be a concept of a "port". Most, if not everything,
*should* be determined by configure, or by the compiler itself. Still
work to be done though :)
all local buffers @ xact commit, so accordingly nextFreeLocalBuf
is first local buffer now.
It helps to avoid unnecessary local buffer allocations in LocalBufferAlloc()
latter ("memmory leaks" in 'order by').
2. ResetLocalBufferPool() lost allocated local buffers:
memset(LocalBufferDescriptors, 0, sizeof(BufferDesc) * NLocBuffer);
(local buffers leak @ xact aborts).
Reply-To: hackers@hub.org, Dan McGuirk <mcguirk@indirect.com>
To: hackers@hub.org
Subject: [HACKERS] tmin writeback optimization
I was doing some profiling of the backend, and noticed that during a certain
benchmark I was running somewhere between 30% and 75% of the backend's CPU
time was being spent in calls to TransactionIdDidCommit() from
HeapTupleSatisfiesNow() or HeapTupleSatisfiesItself() to determine that
changed rows' transactions had in fact been committed even though the rows'
tmin values had not yet been set.
When a query looks at a given row, it needs to figure out whether the
transaction that changed the row has been committed and hence it should pay
attention to the row, or whether on the other hand the transaction is still
in progress or has been aborted and hence the row should be ignored. If
a tmin value is set, it is known definitively that the row's transaction
has been committed. However, if tmin is not set, the transaction
referred to in xmin must be looked up in pg_log, and this is what the
backend was spending a lot of time doing during my benchmark.
So, implementing a method suggested by Vadim, I created the following
patch that, the first time a query finds a committed row whose tmin value
is not set, sets it, and marks the buffer where the row is stored as
dirty. (It works for tmax, too.) This doesn't result in the boost in
real time performance I was hoping for, however it does decrease backend
CPU usage by up to two-thirds in certain situations, so it could be
rather beneficial in high-concurrency settings.
1. New flag - BM_JUST_DIRTIED - added for BufferDesc;
2. All data "dirtiers" (WriteBuffer and WriteNoReleaseBuffer)
set this flag (and BM_DIRTY too);
3. All data "flushers" (FlushBuffer, BufferSync and BufferReplace)
turn this flag off just before calling smgr[blind]write/smgrflush
and check this flag after flushing buffer: if it turned ON then
BM_DIRTY will stay ON.