Now indexes of pg_class and pg_type are unique indexes
and guarantee the uniqueness of correponding attributes.
heap_create() was changed to take another boolean parameter
which allows to postpone the creation of disk file.
The name of rd_nonameunlinked was changed to rd_unlinked.
It is used generally(not only for noname relations) now.
Requires initdb.
eliminating some wildly inconsistent coding in various parts of the
system. I set MAXPGPATH = 1024 in config.h.in. If anyone is really
convinced that there ought to be a configure-time test to set the
value, go right ahead ... but I think it's a waste of time.
proc_exit time. I discovered that if the frontend closes the connection
when you're inside a transaction block, there is nothing ensuring that
temp files go away ... I wonder whether proc_exit ought to try to do an
explicit transaction abort?
recycle storage within sort temp file on a block-by-block basis. This
reduces peak disk usage to essentially just the volume of data being
sorted, whereas it had been about 4x the data volume before.
BufFile so that it handles multi-segment temporary files transparently.
This allows sorts and hashes to work with data exceeding 2Gig (or whatever
the local limit on file size is). Change psort.c to use relative seeks
instead of absolute seeks for backwards scanning, so that it won't fail
when the data volume exceeds 2Gig.
Cygwin snapshots (tested on 990115 which is recommended to use - it fixes
some errors in B20.1)
And I have another patch for including <sys/ipc.h> before <sys/sem.h> in
backend/storage/lmgr/proc.c - it is required due the design of cygipc
headers
Dan
* Buffer refcount cleanup (per my "progress report" to pghackers, 9/22).
* Add links to backend PROC structs to sinval's array of per-backend info,
and use these links for routines that need to check the state of all
backends (rather than the slow, complicated search of the ShmemIndex
hashtable that was used before). Add databaseOID to PROC structs.
* Use this to implement an interlock that prevents DESTROY DATABASE of
a database containing running backends. (It's a little tricky to prevent
a concurrently-starting backend from getting in there, since the new
backend is not able to lock anything at the time it tries to look up
its database in pg_database. My solution is to recheck that the DB is
OK at the end of InitPostgres. It may not be a 100% solution, but it's
a lot better than no interlock at all...)
* In ALTER TABLE RENAME, flush buffers for the relation before doing the
rename of the physical files, to ensure we don't get failures later from
mdblindwrt().
* Update TRUNCATE patch so that it actually compiles against current
sources :-(.
You should do "make clean all" after pulling these changes.
additional argument specifying the kind of lock to acquire/release (or
'NoLock' to do no lock processing). Ensure that all relations are locked
with some appropriate lock level before being examined --- this ensures
that relevant shared-inval messages have been processed and should prevent
problems caused by concurrent VACUUM. Fix several bugs having to do with
mismatched increment/decrement of relation ref count and mismatched
heap_open/close (which amounts to the same thing). A bogus ref count on
a relation doesn't matter much *unless* a SI Inval message happens to
arrive at the wrong time, which is probably why we got away with this
sloppiness for so long. Repair missing grab of AccessExclusiveLock in
DROP TABLE, ALTER/RENAME TABLE, etc, as noted by Hiroshi.
Recommend 'make clean all' after pulling this update; I modified the
Relation struct layout slightly.
Will post further discussion to pghackers list shortly.
offended my aesthestic sensibility that there was so much unreadable code
doing so little. Rewritten code is about half the size, faster, and
(I hope) much more intelligible.
has positive refcount, it is rebuilt from pg_class data. This ensures
that relcache entries will track changes made by other backends. Formerly,
a shared inval report would just be ignored if it happened to arrive while
the relcache entry was in use. Also, fix relcache to reset ref counts
to zero during transaction abort. Finally, change LockRelation() so that
it checks for shared inval reports after obtaining the lock. In this way,
once any kind of lock has been obtained on a rel, we can trust the relcache
entry to be up-to-date.
the SInval spinlock while it is calling the passed invalFunction or
resetFunction. This is necessary to avoid deadlock with lmgr change;
InvalidateSharedInvalid can be called recursively now. It should be
a good performance improvement anyway --- holding a spinlock for more
than a very short interval is a no-no.
insight that RelationFlushRelation ought to invoke smgrclose, and that the
way to make that work is to ensure that mdclose doesn't fail if the relation
is already closed (or unlinked, if we are looking at a DROP TABLE). While
I was testing that, I was able to identify several problems that we had
with multiple-segment relations. The system is now able to do initdb and
pass the regression tests with a very small segment size (I had it set to
64Kb per segment for testing). I don't believe that ever worked before.
File descriptor leaks seem to be gone too.
I have partially addressed the concerns we had about mdtruncate(), too.
On a Win32 or NFS filesystem it is not possible to unlink a file that
another backend is holding open, so what md.c now does is to truncate
unwanted files to zero length before trying to unlink them. The other
backends will be forced to close their open files by relation cache
invalidation --- but I think it would take considerable work to make
that happen before vacuum truncates the relation rather than after.
Leaving zero-length files lying around seems a usable compromise.
and possibly for other cases too:
DO NOT cache status of transaction in unknown state
(i.e. non-committed and non-aborted ones)
Example:
T1 reads row updated/inserted by running T2 and cache T2 status.
T2 commits.
Now T1 reads a row updated by T2 and with HEAP_XMAX_COMMITTED
in t_infomask (so cached T2 status is not changed).
Now T1 EvalPlanQual gets updated row version without HEAP_XMIN_COMMITTED
-> TransactionIdDidCommit(t_xmin) and TransactionIdDidAbort(t_xmin)
return FALSE and T2 decides that t_xmin is not committed and gets
ERROR above.
It's too late to find more smart way to handle such cases and so
I just changed xact status caching and got rid TransactionIdFlushCache()
from code.
Changed: transam.c, xact.c, lmgr.c and transam.h - last three
just because of TransactionIdFlushCache() is removed.
2. heapam.c:
T1 marked a row for update. T2 waits for T1 commit/abort.
T1 commits. T3 updates the row before T2 locks row page.
Now T2 sees that new row t_xmax is different from xact id (T1)
T2 was waiting for. Old code did Assert here. New one goes to
HeapTupleSatisfiesUpdate. Obvious changes too.
3. Added Assert to vacuum.c
4. bufmgr.c: break
Assert(buf->r_locks == 0 && !buf->ri_lock)
into two Asserts.
2. varsup.c:ReadNewTransactionId(): don't read nextXid from disk -
this func doesn't allocate next xid, so ShmemVariableCache->nextXid
may be used (but GetNewTransactionId() must be called first).
3. vacuum.c: change elog(ERROR, "Child item....") to elog(NOTICE) -
this is not ERROR, proper handling is just not implemented, yet.
4. s_lock.c: increase S_MAX_BUSY by 2 times.
5. shmem.c:GetSnapshotData(): have to call ReadNewTransactionId()
_after_ SpinAcquire(ShmemIndexLock).
through MAXBACKENDS array entries used to be fine when MAXBACKENDS = 64.
It's not so cool with MAXBACKENDS = 1024 (or more!), especially not in a
frequently-used routine like SIDelExpiredDataEntries. Repair by making
procState array size be the soft MaxBackends limit rather than the hard
limit, and by converting SIGetProcStateLimit() to a macro.
looks
like someone just didn't add support for multiple segments for
truncation.
The following patch seems to do the right thing, for me at least.
It passed my tests, my data looks right(no data that shouldn't be in
there) and regression is ok.
Ole Gjerde
files to be closed automatically at transaction abort or commit, should
they still be open. Also close any still-open stdio files allocated with
AllocateFile at abort/commit. This should eliminate problems with leakage
of file descriptors after an error. Also, put in some primitive buffered-IO
support so that psort.c can use virtual files without severe performance
penalties.
2. Much faster btree tuples deletion in the case when first on page
index tuple is deleted (no movement to the left page(s)).
3. Remember blkno of new root page in BTPageOpaque of
left/right siblings when root page is splitted.
NetBSD/macppc
LinuxPPC
FreeBSD 2.2.6-RELEASE
All of them seem happy with the regression test. Note that, however,
compiling with optimization enabled on NetBSD/macppc causes an initdb
failure (other two platforms are ok). After checking the asm code, we
are suspecting that might be a compiler(egcs) bug.
Tatsuo Ishii
shared memory space allocation. It's a wonder we have not seen bug
reports traceable to this area ... it's quite clear that the routine
dir_realloc() has never worked correctly, for example.
Ok. I made patches replacing all of "#if FALSE" or "#if 0" to "#ifdef
NOT_USED" for current. I have tested these patches in that the
postgres binaries are identical.
of MAXBACKENDS is now 1024, since all it's costing is about 32 bytes of memory
per array slot. configure's --with-maxbackends switch now controls DEF_MAXBACKENDS
which is simply the default value of the postmaster's -N switch. Thus,
the out-of-the-box configuration will still limit you to 64 backends,
but you can go up to 1024 backends simply by restarting the postmaster with
a different -N switch --- no rebuild required.
(--with-maxbackends). Add a postmaster switch (-N backends) that allows
the limit to be reduced at postmaster start time. (You can't increase it,
sorry to say, because there are still some fixed-size arrays.)
Grab the number of semaphores indicated by min(MAXBACKENDS, -N) at
postmaster startup, so that this particular form of bogus configuration
is exposed immediately rather than under heavy load.
a field was labelled as a primary key, the system automatically
created a unique index on the field. This patch extends it so
that the index has the indisprimary field set. You can pull a list
of primary keys with the followiing select.
SELECT pg_class.relname, pg_attribute.attname
FROM pg_class, pg_attribute, pg_index
WHERE pg_class.oid = pg_attribute.attrelid AND
pg_class.oid = pg_index.indrelid AND
pg_index.indkey[0] = pg_attribute.attnum AND
pg_index.indisunique = 't';
There is nothing in this patch that modifies the template database to
set the indisprimary attribute for system tables. Should they be
changed or should we only be concerned with user tables?
D'Arcy
Nakajima. Since he is not subscribing the mailing list, I'm posting
his patches by his request. According to him, he has successfully
compiled and passed the regression test on Mac SE/30 running
NetBSD/m68k. Also, another person has reported that with the patches
PostgreSQL is working on NetBSD/sun3 too.
--
Tatsuo Ishii
> Open portability issues:
>
> /usr/local should be searched for lib and include for all ports if
present
> (currently not working, I have libreadline there)
>
> the stream functions on AIX need a size_t for addrlen's in
fe-connect.c and pqcomm.c.
>
> lock.c still has an incompatible TPRINTF(flags, args...) definition
Massimo