functions, which would lead to trouble with datatypes that paid attention
to the typelem or typmod parameters to these functions. In particular,
incorrect code in pg_aggregate.c explains the platform-specific failures
that have been reported in NUMERIC avg().
errors. VACUUM normally compacts the table back-to-front, and stops
as soon as it gets to a page that it has moved some tuples onto.
(This logic doesn't make for a complete packing of the table, but it
should be pretty close.) But the way it was checking whether it had
got to a page with some moved-in tuples was to look at whether the
current page was the same as the last page of the list of pages that
have enough free space to be move-in targets. And there was other
code that would remove pages from that list once they got full.
There was a kluge that prevented the last list entry from being
removed, but it didn't get the job done. Fixed by keeping a separate
variable that contains the largest block number into which a tuple
has been moved. There's no longer any need to protect the last element
of the fraged_pages list.
Also, fix NOTICE messages to describe elapsed user/system CPU time
correctly.
table owner in order to vacuum a table. This is mainly to prevent
denial-of-service attacks via repeated vacuums. Allow VACUUM to gather
statistics about system relations, except for pg_statistic itself ---
not clear that it's worth the trouble to make that case work cleanly.
Cope with possible tuple size overflow in pg_statistic tuples; I'm
surprised we never realized that could happen. Hold a couple of locks
a little longer to try to prevent deadlocks between concurrent VACUUMs.
There still seem to be some problems in that last area though :-(
parallel --- and, not incidentally, removing a common reason for needing
manual cleanup by the DB admin after a crash. Remove initial global
delete of pg_statistics rows in VACUUM ANALYZE; this was not only bad
for performance of other backends that had to run without stats for a
while, but it was fundamentally broken because it was done outside any
transaction. Surprising we didn't see more consequences of that.
Detect attempt to run VACUUM inside a transaction block. Check for
query cancel request before starting vacuum of each table. Clean up
vacuum's private portal storage if vacuum is aborted.
Make all system indexes unique.
Make all cache loads use system indexes.
Rename *rel to *relid in inheritance tables.
Rename cache names to be clearer.
* Buffer refcount cleanup (per my "progress report" to pghackers, 9/22).
* Add links to backend PROC structs to sinval's array of per-backend info,
and use these links for routines that need to check the state of all
backends (rather than the slow, complicated search of the ShmemIndex
hashtable that was used before). Add databaseOID to PROC structs.
* Use this to implement an interlock that prevents DESTROY DATABASE of
a database containing running backends. (It's a little tricky to prevent
a concurrently-starting backend from getting in there, since the new
backend is not able to lock anything at the time it tries to look up
its database in pg_database. My solution is to recheck that the DB is
OK at the end of InitPostgres. It may not be a 100% solution, but it's
a lot better than no interlock at all...)
* In ALTER TABLE RENAME, flush buffers for the relation before doing the
rename of the physical files, to ensure we don't get failures later from
mdblindwrt().
* Update TRUNCATE patch so that it actually compiles against current
sources :-(.
You should do "make clean all" after pulling these changes.
additional argument specifying the kind of lock to acquire/release (or
'NoLock' to do no lock processing). Ensure that all relations are locked
with some appropriate lock level before being examined --- this ensures
that relevant shared-inval messages have been processed and should prevent
problems caused by concurrent VACUUM. Fix several bugs having to do with
mismatched increment/decrement of relation ref count and mismatched
heap_open/close (which amounts to the same thing). A bogus ref count on
a relation doesn't matter much *unless* a SI Inval message happens to
arrive at the wrong time, which is probably why we got away with this
sloppiness for so long. Repair missing grab of AccessExclusiveLock in
DROP TABLE, ALTER/RENAME TABLE, etc, as noted by Hiroshi.
Recommend 'make clean all' after pulling this update; I modified the
Relation struct layout slightly.
Will post further discussion to pghackers list shortly.
neqsel now behave as per my suggestions in pghackers a few days ago.
selectivity for < > <= >= should work OK for integral types as well, but
still need work for nonintegral types. Since these routines have never
actually executed before :-(, this may result in some significant changes
in the optimizer's choices of execution plans. Let me know if you see
any serious misbehavior.
CAUTION: THESE CHANGES REQUIRE INITDB. pg_statistic table has changed.
/*
* Read above about cases when !ItemIdIsUsed(Citemid)
* (child item is removed)... Due to the fact that
* at the moment we don't remove unuseful part of
* update-chain, it's possible to get too old
* parent row here. Like as in the case which
* caused this problem, we stop shrinking here.
* I could try to find real parent row but want
* not to do it because of real solution will
* be implemented anyway, latter, and we are too
* close to 6.5 release. - vadim 06/11/99
*/
if (Ptp.t_data->t_xmax != tp.t_data->t_xmin)
...
and possibly for other cases too:
DO NOT cache status of transaction in unknown state
(i.e. non-committed and non-aborted ones)
Example:
T1 reads row updated/inserted by running T2 and cache T2 status.
T2 commits.
Now T1 reads a row updated by T2 and with HEAP_XMAX_COMMITTED
in t_infomask (so cached T2 status is not changed).
Now T1 EvalPlanQual gets updated row version without HEAP_XMIN_COMMITTED
-> TransactionIdDidCommit(t_xmin) and TransactionIdDidAbort(t_xmin)
return FALSE and T2 decides that t_xmin is not committed and gets
ERROR above.
It's too late to find more smart way to handle such cases and so
I just changed xact status caching and got rid TransactionIdFlushCache()
from code.
Changed: transam.c, xact.c, lmgr.c and transam.h - last three
just because of TransactionIdFlushCache() is removed.
2. heapam.c:
T1 marked a row for update. T2 waits for T1 commit/abort.
T1 commits. T3 updates the row before T2 locks row page.
Now T2 sees that new row t_xmax is different from xact id (T1)
T2 was waiting for. Old code did Assert here. New one goes to
HeapTupleSatisfiesUpdate. Obvious changes too.
3. Added Assert to vacuum.c
4. bufmgr.c: break
Assert(buf->r_locks == 0 && !buf->ri_lock)
into two Asserts.
2. varsup.c:ReadNewTransactionId(): don't read nextXid from disk -
this func doesn't allocate next xid, so ShmemVariableCache->nextXid
may be used (but GetNewTransactionId() must be called first).
3. vacuum.c: change elog(ERROR, "Child item....") to elog(NOTICE) -
this is not ERROR, proper handling is just not implemented, yet.
4. s_lock.c: increase S_MAX_BUSY by 2 times.
5. shmem.c:GetSnapshotData(): have to call ReadNewTransactionId()
_after_ SpinAcquire(ShmemIndexLock).
2. Get rid of locking when updating statistics in vacuum.
3. Use QuerySnapshot in COPY TO and call SetQuerySnashot
in main tcop loop before FETCH and COPY TO.
to save a little bit of backend startup time. This way, the first
backend started after a VACUUM will rebuild the init file with up-to-date
statistics for the critical system indexes.
2. Much faster btree tuples deletion in the case when first on page
index tuple is deleted (no movement to the left page(s)).
3. Remember blkno of new root page in BTPageOpaque of
left/right siblings when root page is splitted.
Ok. I made patches replacing all of "#if FALSE" or "#if 0" to "#ifdef
NOT_USED" for current. I have tested these patches in that the
postgres binaries are identical.