Now indexes of pg_class and pg_type are unique indexes
and guarantee the uniqueness of the corresponding attributes.
heap_create() was changed to take another boolean parameter
which allows creation of the disk file to be postponed.
The name of rd_nonameunlinked was changed to rd_unlinked.
It is now used generally (not only for noname relations).
Requires initdb.
eliminating some wildly inconsistent coding in various parts of the
system. I set MAXPGPATH = 1024 in config.h.in. If anyone is really
convinced that there ought to be a configure-time test to set the
value, go right ahead ... but I think it's a waste of time.
proc_exit time. I discovered that if the frontend closes the connection
when you're inside a transaction block, there is nothing ensuring that
temp files go away ... I wonder whether proc_exit ought to try to do an
explicit transaction abort?
recycle storage within sort temp file on a block-by-block basis. This
reduces peak disk usage to essentially just the volume of data being
sorted, whereas it had been about 4x the data volume before.
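
To illustrate block-by-block recycling, here is a minimal sketch of a free-block
pool. It is not the actual sort code: BlockPool, pool_alloc and pool_release are
hypothetical names, and the fixed-size LIFO free list is an assumption made to
keep the example short. The idea shown is only that a released block number is
handed out again before the temp file is extended.

```c
#include <assert.h>

#define POOL_MAX_FREE 64            /* hypothetical cap on remembered free blocks */

typedef struct BlockPool
{
    long freeBlocks[POOL_MAX_FREE]; /* block numbers available for reuse */
    int  nFree;                     /* number of valid entries in freeBlocks */
    long nextNew;                   /* next never-used block number */
} BlockPool;

/* Hand out a block: reuse a released one if possible, else extend the file. */
long
pool_alloc(BlockPool *pool)
{
    if (pool->nFree > 0)
        return pool->freeBlocks[--pool->nFree];
    return pool->nextNew++;
}

/* Return a block whose contents have been read and are no longer needed. */
int
pool_release(BlockPool *pool, long blocknum)
{
    assert(blocknum >= 0 && blocknum < pool->nextNew);
    if (pool->nFree >= POOL_MAX_FREE)
        return -1;                  /* sketch only: block is simply forgotten */
    pool->freeBlocks[pool->nFree++] = blocknum;
    return 0;
}
```

With a zero-initialised pool, allocating two blocks, releasing the first and
allocating again returns the released block number rather than a new one, so
the file does not grow.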
BufFile so that it handles multi-segment temporary files transparently.
This allows sorts and hashes to work with data exceeding 2Gig (or whatever
the local limit on file size is). Change psort.c to use relative seeks
instead of absolute seeks for backwards scanning, so that it won't fail
when the data volume exceeds 2Gig.
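
A rough sketch of why relative seeks help: if a position is kept as a
(segment, offset) pair, stepping backwards never needs an absolute byte
position that might overflow a long. SegPos, SEG_SIZE and seg_seek_back are
hypothetical names for illustration, and the 1 GB segment size is an
assumption, not necessarily what BufFile uses.

```c
#include <assert.h>

#define SEG_SIZE (1L << 30)     /* hypothetical segment size: 1 GB */

typedef struct SegPos
{
    int  segno;                 /* which physical segment file */
    long offset;                /* byte offset inside that segment */
} SegPos;

/*
 * Move 'cur' backwards by 'delta' bytes.  Returns 0 on success, or -1
 * (leaving 'cur' untouched) if that would go before the start of the data.
 */
int
seg_seek_back(SegPos *cur, long delta)
{
    SegPos p = *cur;

    assert(delta >= 0);
    while (delta > p.offset)
    {
        if (p.segno == 0)
            return -1;          /* cannot back up past the first segment */
        delta -= p.offset;      /* consume the rest of this segment */
        p.segno--;
        p.offset = SEG_SIZE;    /* continue from the end of the previous one */
    }
    p.offset -= delta;
    *cur = p;
    return 0;
}
```

Every intermediate value stays within one segment's size, which is the
property the relative-seek approach relies on.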
Cygwin snapshots (tested on 990115 which is recommended to use - it fixes
some errors in B20.1)
And I have another patch for including <sys/ipc.h> before <sys/sem.h> in
backend/storage/lmgr/proc.c - it is required due to the design of cygipc
headers
Dan
* Buffer refcount cleanup (per my "progress report" to pghackers, 9/22).
* Add links to backend PROC structs to sinval's array of per-backend info,
and use these links for routines that need to check the state of all
backends (rather than the slow, complicated search of the ShmemIndex
hashtable that was used before). Add databaseOID to PROC structs.
* Use this to implement an interlock that prevents DESTROY DATABASE of
a database containing running backends. (It's a little tricky to prevent
a concurrently-starting backend from getting in there, since the new
backend is not able to lock anything at the time it tries to look up
its database in pg_database. My solution is to recheck that the DB is
OK at the end of InitPostgres. It may not be a 100% solution, but it's
a lot better than no interlock at all...)
* In ALTER TABLE RENAME, flush buffers for the relation before doing the
rename of the physical files, to ensure we don't get failures later from
mdblindwrt().
* Update TRUNCATE patch so that it actually compiles against current
sources :-(.
You should do "make clean all" after pulling these changes.
additional argument specifying the kind of lock to acquire/release (or
'NoLock' to do no lock processing). Ensure that all relations are locked
with some appropriate lock level before being examined --- this ensures
that relevant shared-inval messages have been processed and should prevent
problems caused by concurrent VACUUM. Fix several bugs having to do with
mismatched increment/decrement of relation ref count and mismatched
heap_open/close (which amounts to the same thing). A bogus ref count on
a relation doesn't matter much *unless* a SI Inval message happens to
arrive at the wrong time, which is probably why we got away with this
sloppiness for so long. Repair missing grab of AccessExclusiveLock in
DROP TABLE, ALTER/RENAME TABLE, etc, as noted by Hiroshi.
Recommend 'make clean all' after pulling this update; I modified the
Relation struct layout slightly.
Will post further discussion to pghackers list shortly.
offended my aesthetic sensibility that there was so much unreadable code
doing so little. Rewritten code is about half the size, faster, and
(I hope) much more intelligible.
has positive refcount, it is rebuilt from pg_class data. This ensures
that relcache entries will track changes made by other backends. Formerly,
a shared inval report would just be ignored if it happened to arrive while
the relcache entry was in use. Also, fix relcache to reset ref counts
to zero during transaction abort. Finally, change LockRelation() so that
it checks for shared inval reports after obtaining the lock. In this way,
once any kind of lock has been obtained on a rel, we can trust the relcache
entry to be up-to-date.
the SInval spinlock while it is calling the passed invalFunction or
resetFunction. This is necessary to avoid deadlock with lmgr change;
InvalidateSharedInvalid can be called recursively now. It should be
a good performance improvement anyway --- holding a spinlock for more
than a very short interval is a no-no.
insight that RelationFlushRelation ought to invoke smgrclose, and that the
way to make that work is to ensure that mdclose doesn't fail if the relation
is already closed (or unlinked, if we are looking at a DROP TABLE). While
I was testing that, I was able to identify several problems that we had
with multiple-segment relations. The system is now able to do initdb and
pass the regression tests with a very small segment size (I had it set to
64Kb per segment for testing). I don't believe that ever worked before.
File descriptor leaks seem to be gone too.
I have partially addressed the concerns we had about mdtruncate(), too.
On a Win32 or NFS filesystem it is not possible to unlink a file that
another backend is holding open, so what md.c now does is to truncate
unwanted files to zero length before trying to unlink them. The other
backends will be forced to close their open files by relation cache
invalidation --- but I think it would take considerable work to make
that happen before vacuum truncates the relation rather than after.
Leaving zero-length files lying around seems a usable compromise.
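
The truncate-before-unlink compromise can be sketched roughly as below. This
is not the md.c code: truncate_then_unlink is a hypothetical helper using
plain POSIX calls, and the return convention (0 = gone, 1 = left behind at
zero length, -1 = could not truncate) is an assumption for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Release the disk space of an unwanted segment file first, then try to
 * remove its directory entry.  If the unlink fails because someone else
 * still has the file open, only a zero-length file is left behind.
 */
int
truncate_then_unlink(const char *path)
{
    int fd;

    assert(path != NULL);
    fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, 0) < 0)
    {
        close(fd);
        return -1;
    }
    close(fd);
    if (unlink(path) < 0)
        return 1;               /* space is freed, empty file remains */
    return 0;
}
```

On a local Unix filesystem the unlink normally succeeds and the function
returns 0; the return value 1 covers the Win32/NFS situation described above.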
and possibly for other cases too:
DO NOT cache status of transaction in unknown state
(i.e. non-committed and non-aborted ones)
Example:
T1 reads a row updated/inserted by running T2 and caches T2's status.
T2 commits.
Now T1 reads a row updated by T2 and with HEAP_XMAX_COMMITTED
in t_infomask (so cached T2 status is not changed).
Now T1 EvalPlanQual gets updated row version without HEAP_XMIN_COMMITTED
-> TransactionIdDidCommit(t_xmin) and TransactionIdDidAbort(t_xmin)
return FALSE and T1 decides that t_xmin is not committed and gets
ERROR above.
It's too late to find a smarter way to handle such cases, so
I just changed xact status caching and got rid of TransactionIdFlushCache()
in the code.
Changed: transam.c, xact.c, lmgr.c and transam.h - the last three
just because TransactionIdFlushCache() was removed.
2. heapam.c:
T1 marked a row for update. T2 waits for T1 commit/abort.
T1 commits. T3 updates the row before T2 locks row page.
Now T2 sees that new row t_xmax is different from xact id (T1)
T2 was waiting for. Old code did Assert here. New one goes to
HeapTupleSatisfiesUpdate. Obvious changes too.
3. Added Assert to vacuum.c
4. bufmgr.c: break
Assert(buf->r_locks == 0 && !buf->ri_lock)
into two Asserts.
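
As an illustration of the "DO NOT cache status of transaction in unknown
state" rule from the xact status caching change above, here is a minimal
sketch of a one-entry status cache. All names (XactId, XactStatus, fake_log,
xact_status) are hypothetical, and fake_log is only a stand-in for the real
transaction log lookup; this is not the transam.c code.

```c
#include <assert.h>

typedef unsigned int XactId;

typedef enum
{
    XACT_IN_PROGRESS = 0,       /* neither committed nor aborted yet */
    XACT_COMMITTED,
    XACT_ABORTED
} XactStatus;

#define FAKE_LOG_SIZE 8

XactStatus fake_log[FAKE_LOG_SIZE]; /* stand-in for the real log; all in progress */
int        log_lookups = 0;         /* counts how often the "log" is consulted */

static XactId     cached_xid = 0;   /* 0 means nothing is cached */
static XactStatus cached_status;

XactStatus
xact_status(XactId xid)
{
    XactStatus status;

    assert(xid > 0 && xid < FAKE_LOG_SIZE);
    if (xid == cached_xid)
        return cached_status;

    status = fake_log[xid];
    log_lookups++;

    /* Only final states are safe to remember; in-progress can still change. */
    if (status != XACT_IN_PROGRESS)
    {
        cached_xid = xid;
        cached_status = status;
    }
    return status;
}
```

Because an in-progress answer is never stored, a second lookup after the
other transaction commits goes back to the log and sees the commit, which is
the behaviour the example with T1 and T2 needs.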
2. varsup.c:ReadNewTransactionId(): don't read nextXid from disk -
this func doesn't allocate next xid, so ShmemVariableCache->nextXid
may be used (but GetNewTransactionId() must be called first).
3. vacuum.c: change elog(ERROR, "Child item....") to elog(NOTICE) -
this is not ERROR, proper handling is just not implemented, yet.
4. s_lock.c: double S_MAX_BUSY.
5. shmem.c:GetSnapshotData(): have to call ReadNewTransactionId()
_after_ SpinAcquire(ShmemIndexLock).
through MAXBACKENDS array entries used to be fine when MAXBACKENDS = 64.
It's not so cool with MAXBACKENDS = 1024 (or more!), especially not in a
frequently-used routine like SIDelExpiredDataEntries. Repair by making
procState array size be the soft MaxBackends limit rather than the hard
limit, and by converting SIGetProcStateLimit() to a macro.
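
A sketch of the idea, under stated assumptions: the per-backend array is
sized by the runtime (soft) limit, so routines that scan it touch only that
many entries. SISegSketch, ProcStateSketch, si_init and si_min_msg_num are
hypothetical names, and the "-1 means inactive" convention is an assumption
for the example; the real sinval structures differ.

```c
#include <assert.h>
#include <stdlib.h>

#define HARD_MAX_BACKENDS 1024  /* compile-time ceiling (assumed value) */

typedef struct ProcStateSketch
{
    int nextMsgNum;             /* -1 means this slot is inactive */
} ProcStateSketch;

typedef struct SISegSketch
{
    int              maxBackends;   /* soft limit chosen at startup */
    ProcStateSketch *procState;     /* array of maxBackends entries */
} SISegSketch;

/* Allocate the per-backend array for the soft limit, not the hard one. */
int
si_init(SISegSketch *seg, int soft_limit)
{
    int i;

    assert(soft_limit > 0 && soft_limit <= HARD_MAX_BACKENDS);
    seg->procState = malloc(soft_limit * sizeof(ProcStateSketch));
    if (seg->procState == NULL)
        return -1;
    seg->maxBackends = soft_limit;
    for (i = 0; i < soft_limit; i++)
        seg->procState[i].nextMsgNum = -1;
    return 0;
}

/* Smallest nextMsgNum among active slots, or -1 if none are active. */
int
si_min_msg_num(const SISegSketch *seg)
{
    int i;
    int min = -1;

    for (i = 0; i < seg->maxBackends; i++)  /* loop bound is the soft limit */
    {
        int n = seg->procState[i].nextMsgNum;

        if (n >= 0 && (min < 0 || n < min))
            min = n;
    }
    return min;
}
```

With a soft limit of, say, 32, the scan visits 32 slots per call instead of
1024, which matters in a frequently called routine.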