postgresql

Commit Graph

Author	SHA1	Message	Date
Tom Lane	f519d04a43	Update comment that I missed the first time around.	2005-05-19 23:57:11 +00:00
Tom Lane	191b13aaca	Factor out lock cleanup code that is needed in several places in lock.c. Also, remove the rather useless return value of LockReleaseAll. Change response to detection of corruption in the shared lock tables to PANIC, since that is the only way of cleaning up fully. Originally an idea of Heikki Linnakangas, variously hacked on by Alvaro Herrera and Tom Lane.	2005-05-19 23:30:18 +00:00
Tom Lane	ee3b71f6bc	Split the shared-memory array of PGPROC pointers out of the sinval communication structure, and make it its own module with its own lock. This should reduce contention at least a little, and it definitely makes the code seem cleaner. Per my recent proposal.	2005-05-19 21:35:48 +00:00
Neil Conway	f38e413b20	Code cleanup: in C89, there is no point casting the first argument to memset() or MemSet() to a char *. For one, memset()'s first argument is a void *, and further void * can be implicitly coerced to/from any other pointer type.	2005-05-11 01:26:02 +00:00
Tom Lane	93b2477278	Use the standard lock manager to establish priority order when there is contention for a tuple-level lock. This solves the problem of a would-be exclusive locker being starved out by an indefinite succession of share-lockers. Per recent discussion with Alvaro.	2005-04-30 19:03:33 +00:00
Tom Lane	3a694bb0a1	Restructure LOCKTAG as per discussions of a couple months ago. Essentially, we shoehorn in a lockable-object-type field by taking a byte away from the lockmethodid, which can surely fit in one byte instead of two. This allows less artificial definitions of all the other fields of LOCKTAG; we can get rid of the special pg_xactlock pseudo-relation, and also support locks on individual tuples and general database objects (including shared objects). None of those possibilities are actually exploited just yet, however. I removed pg_xactlock from pg_class, but did not force initdb for that change. At this point, relkind 's' (SPECIAL) is unused and could be removed entirely.	2005-04-29 22:28:24 +00:00
Tom Lane	bedb78d386	Implement sharable row-level locks, and use them for foreign key references to eliminate unnecessary deadlocks. This commit adds SELECT ... FOR SHARE paralleling SELECT ... FOR UPDATE. The implementation uses a new SLRU data structure (managed much like pg_subtrans) to represent multiple-transaction-ID sets. When more than one transaction is holding a shared lock on a particular row, we create a MultiXactId representing that set of transactions and store its ID in the row's XMAX. This scheme allows an effectively unlimited number of row locks, just as we did before, while not costing any extra overhead except when a shared lock actually has to be shared. Still TODO: use the regular lock manager to control the grant order when multiple backends are waiting for a row lock. Alvaro Herrera and Tom Lane.	2005-04-28 21:47:18 +00:00
Bruce Momjian	3b0a5e50d7	Update VACUUM VERBOSE FSM message, per Tom.	2005-04-24 03:51:49 +00:00
Bruce Momjian	714d5a4c37	Update VACUUM VERBOSE update, per Alvaro.	2005-04-23 21:16:34 +00:00
Bruce Momjian	9ba6587f8b	Update wording of VACUUM VERBOSE.	2005-04-23 21:10:20 +00:00
Bruce Momjian	52e08c35f7	Make VACUUM VERBOSE FSM output all output in a single INFO output statement.	2005-04-23 20:56:01 +00:00
Bruce Momjian	e947e1153a	Modify output of VACUUM VERBOSE to be clearer.	2005-04-23 15:20:39 +00:00
Neil Conway	ea208aca00	Remove an unused variable "waitingForSignal". From Qingqing Zhou.	2005-04-15 04:18:10 +00:00
Tom Lane	162bd08b3f	Completion of project to use fixed OIDs for all system catalogs and indexes. Replace all heap_openr and index_openr calls by heap_open and index_open. Remove runtime lookups of catalog OID numbers in various places. Remove relcache's support for looking up system catalogs by name. Bulky but mostly very boring patch ...	2005-04-14 20:03:27 +00:00
Tom Lane	2193a856a2	Simplify initdb-time assignment of OIDs as I proposed yesterday, and avoid encroaching on the 'user' range of OIDs by allowing automatic OID assignment to use values below 16k until we reach normal operation. initdb not forced since this doesn't make any incompatible change; however a lot of stuff will have different OIDs after your next initdb.	2005-04-13 18:54:57 +00:00
Tom Lane	badb83f9ec	If we're going to have a non-panic check for held_lwlocks[] overrun, it must occur before we get into the critical state of holding a lock we have no place to record. Per discussion with Qingqing Zhou.	2005-04-08 14:18:35 +00:00
Tom Lane	e794dfa511	Use an always-there test, not an Assert, to check for overrun of the held_lwlocks[] array. Per Qingqing Zhou.	2005-04-08 03:43:54 +00:00
Neil Conway	5b1c607abe	Remove an unused variable `ShmemBootstrap', and remove an obsolete comment. Patch from Alvaro.	2005-04-04 04:34:41 +00:00
Tom Lane	94e03330cb	Create a routine PageIndexMultiDelete() that replaces a loop around PageIndexTupleDelete() with a single pass of compactification --- logic mostly lifted from PageRepairFragmentation. I noticed while profiling that a VACUUM that's cleaning up a whole lot of deleted tuples would spend as much as a third of its CPU time in PageIndexTupleDelete; not too surprising considering the loop method was roughly O(N^2) in the number of tuples involved.	2005-03-22 06:17:03 +00:00
Tom Lane	354049c709	Remove unnecessary calls of FlushRelationBuffers: there is no need to write out data that we are about to tell the filesystem to drop. smgr_internal_unlink already had a DropRelFileNodeBuffers call to get rid of dead buffers without a write after it's no longer possible to roll back the deleting transaction. Adding a similar call in smgrtruncate simplifies callers and makes the overall division of labor clearer. This patch removes the former behavior that VACUUM would write all dirty buffers of a relation unconditionally.	2005-03-20 22:00:54 +00:00
Tom Lane	91728fa26c	Add temp_buffers GUC variable to allow users to determine the size of the local buffer arena for temporary table access.	2005-03-19 23:27:11 +00:00
Tom Lane	d65522aeb6	Upgrade localbuf.c to use a hash table instead of linear search to find already-allocated local buffers. This is the last obstacle in the way of setting NLocBuffer to something reasonably large.	2005-03-19 17:39:43 +00:00
Tom Lane	88164799ce	Need to reset local buffer pin counts, not only shared buffer pins, before we attempt any file deletions in ShutdownPostgres. Per Tatsuo.	2005-03-18 16:16:09 +00:00
Tom Lane	cef01c3355	Avoid infinite loop in InvalidateBuffer if we ourselves are holding a pin on the victim buffer.	2005-03-18 05:25:23 +00:00
Bruce Momjian	2c4dea126a	Issue free space notices to both the user and the server log file.	2005-03-14 20:15:09 +00:00
Bruce Momjian	45905425a0	Add warning about the need to increase "max_fsm_relations" and "max_fsm_pages" for vacuums. Also improve VACUUM VERBOSE final message text. Ron Mayer	2005-03-12 05:21:52 +00:00
Neil Conway	c129c16492	Slight refactoring and optimization of some code in WaitOnLock().	2005-03-11 03:52:06 +00:00
Tom Lane	5d5087363d	Replace the BufMgrLock with separate locks on the lookup hashtable and the freelist, plus per-buffer spinlocks that protect access to individual shared buffer headers. This requires abandoning a global freelist (since the freelist is a global contention point), which shoots down ARC and 2Q as well as plain LRU management. Adopt a clock sweep algorithm instead. Preliminary results show substantial improvement in multi-backend situations.	2005-03-04 20:21:07 +00:00
Tom Lane	a2ad04f4b0	Release proclock immediately in RemoveFromWaitQueue() if it represents no held locks. This maintains the invariant that proclocks are present only for procs that are holding or awaiting a lock; when this is not true, LockRelease will fail. Per report from Stephen Clouse.	2005-03-01 21:14:59 +00:00
Bruce Momjian	0542b1e2fe	Use _() macro consistently rather than gettext(). Add translation macros around strings that were missing them.	2005-02-22 04:43:23 +00:00
Neil Conway	11635c3f6f	Refactor some duplicated code in lock.c: create UnGrantLock(), move code from LockRelease() and LockReleaseAll() into it. From Heikki Linnakangas.	2005-02-04 02:04:53 +00:00
Tom Lane	cc4f58f4cd	Ensure that all details of the ARC algorithm are hidden within freelist.c. This refactoring does not change any algorithms or data structures, just remove visibility of the ARC datastructures from other source files.	2005-02-03 23:29:19 +00:00
Neil Conway	a885ecd6ef	Change heap_modifytuple() to require a TupleDesc rather than a Relation. Patch from Alvaro Herrera, minor editorializing by Neil Conway.	2005-01-27 23:24:11 +00:00
Tom Lane	0ce4d56924	Phase 1 of fix for 'SMgrRelation hashtable corrupted' problem. This is the minimum required fix. I want to look next at taking advantage of it by simplifying the message semantics in the shared inval message queue, but that part can be held over for 8.1 if it turns out too ugly.	2005-01-10 20:02:24 +00:00
Tom Lane	c9d8edc906	Repair bufmgr deadlock problem reported by Michael Wildpaner. Must take share lock on a buffer being written out before releasing BufMgrLock in the BufferAlloc code path; if we do it later we might block on someone who's re-pinned the buffer. I believe this is only an issue for BufferAlloc and not the other places that call FlushBuffer. BufferSync must continue to do it the old way since it may well be trying to write buffers that other backends have pinned; but it should not be holding any conflicting locks. FlushRelationBuffers is okay since it's got exclusive lock at the relation level.	2005-01-03 18:49:41 +00:00
PostgreSQL Daemon	2ff501590b	Tag appropriate files for rc3 Also performed an initial run through of upgrading our Copyright date to extend to 2005 ... first run here was very simple ... change everything where: grep 1996-2004 && the word 'Copyright' ... scanned through the generated list with 'less' first, and after, to make sure that I only picked up the right entries ...	2004-12-31 22:04:05 +00:00
Tom Lane	96ecf9d5aa	Support Sun's compiler on SunOS4 (a/k/a Solaris 9). Per ayan@ayan.net	2004-12-29 23:47:40 +00:00
Tom Lane	eee5abce46	Refactor EXEC_BACKEND code so that postmaster child processes reattach to shared memory as soon as possible, ie, right after read_backend_variables. The effective difference from the original code is that this happens before instead of after read_nondefault_variables(), which loads GUC information and is apparently capable of expanding the backend's memory allocation more than you'd think it should. This should fix the failure-to-attach-to-shared-memory reports we've been seeing on Windows. Also clean up a few bits of unnecessarily grotty EXEC_BACKEND code.	2004-12-29 21:36:09 +00:00
Bruce Momjian	08690d0688	Allow NetBSD, m68k to compile the ASM spinlock code. Rémi Zara	2004-12-18 22:12:52 +00:00
Neil Conway	4acc97d7e4	Assert that BufferIsPinned() in IncrBufferRefCount(), rather than using a home-brewed combination of assertions that boiled down to the same thing.	2004-11-24 02:56:17 +00:00
Tom Lane	8ecbc46bdb	Reduce the default size of the local lock hash table. There's usually no need for it to be nearly as big as the global hash table, and since it's not in shared memory it can grow if it does need to be bigger. By reducing the size, we speed up hash_seq_search(), which saves a significant fraction of subtransaction entry/exit overhead.	2004-11-20 20:16:54 +00:00
Peter Eisentraut	0ed3c7665e	Small message clarifications	2004-11-05 17:11:34 +00:00
Neil Conway	8ec05b28b7	Modify hash_create() to elog(ERROR) if an error occurs, rather than returning a NULL pointer (some callers remembered to check the return value, but some did not -- it is safer to just bail out). Also, cleanup pgstat.c to use elog(ERROR) rather than elog(LOG) followed by exit().	2004-10-25 00:46:43 +00:00
Tom Lane	4347cc2392	Allow background writing to be shut down by setting limit values to zero. This does not disable the bgwriter process: it still has to wake up often enough to collect fsync requests from backends in a timely fashion. But it responds to the recent gripe about not being able to prevent the disk from being spun up constantly.	2004-10-17 22:01:51 +00:00
Tom Lane	fdd13f1568	Give the ResourceOwner mechanism full responsibility for releasing buffer pins at end of transaction, and reduce AtEOXact_Buffers to an Assert cross-check that this was done correctly. When not USE_ASSERT_CHECKING, AtEOXact_Buffers is a complete no-op. This gets rid of an O(NBuffers) bottleneck during transaction commit/abort, which recent testing has shown becomes significant above a few tens of thousands of shared buffers.	2004-10-16 18:57:26 +00:00
Tom Lane	1c2de47746	Remove BufferLocks[] array in favor of a single pointer to the buffer (if any) currently waited for by LockBufferForCleanup(), which is all that we were using it for anymore. Saves some space and eliminates proportional-to-NBuffers slowdown in UnlockBuffers().	2004-10-16 18:05:07 +00:00
Tom Lane	9ffc8ed58b	Repair possible failure to update hint bits back to disk, per http://archives.postgresql.org/pgsql-hackers/2004-10/msg00464.php. This fix is intended to be permanent: it moves the responsibility for calling SetBufferCommitInfoNeedsSave() into the tqual.c routines, eliminating the requirement for callers to test whether t_infomask changed. Also, tighten validity checking on buffer IDs in bufmgr.c --- several routines were paranoid about out-of-range shared buffer numbers but not about out-of-range local ones, which seems a tad pointless.	2004-10-15 22:40:29 +00:00
Neil Conway	0683a47556	Allow the spinlock test to be compiled successfully in a vpath build.	2004-10-07 00:08:04 +00:00
Tom Lane	0fb3152ea9	Minor adjustments to improve the accuracy of our computation of required shared memory size.	2004-09-29 15:15:56 +00:00
Tom Lane	3a246cc285	Arrange to preallocate all required space for the buffer and FSM hash tables in shared memory. This ensures that overflow of the lock table creates no long-lasting problems. Per discussion with Merlin Moncure.	2004-09-28 20:46:37 +00:00
Tom Lane	86fff990b2	RecentXmin is too recent to use as the cutoff point for accessing pg_subtrans --- what we need is the oldest xmin of any snapshot in use in the current top transaction. Introduce a new variable TransactionXmin to play this role. Fixes intermittent regression failure reported by Neil Conway.	2004-09-16 18:35:23 +00:00
Tom Lane	8f9f198603	Restructure subtransaction handling to reduce resource consumption, as per recent discussions. Invent SubTransactionIds that are managed like CommandIds (ie, counter is reset at start of each top transaction), and use these instead of TransactionIds to keep track of subtransaction status in those modules that need it. This means that a subtransaction does not need an XID unless it actually inserts/modifies rows in the database. Accordingly, don't assign it an XID nor take a lock on the XID until it tries to do that. This saves a lot of overhead for subtransactions that are only used for error recovery (eg plpgsql exceptions). Also, arrange to release a subtransaction's XID lock as soon as the subtransaction exits, in both the commit and abort cases. This avoids holding many unique locks after a long series of subtransactions. The price is some additional overhead in XactLockTableWait, but that seems acceptable. Finally, restructure the state machine in xact.c to have a more orthogonal set of states for subtransactions.	2004-09-16 16:58:44 +00:00
Tom Lane	abc98dcc15	When LockAcquire fails at the stage of creating a proclock object, be sure to clean up the already-created lock object, if it has no other references. Avoids possibly-permanent leak of shared memory.	2004-09-12 18:30:50 +00:00
Tom Lane	083258e535	Fix a number of places where brittle data structures or overly strong Asserts would lead to a server core dump if an error occurred while trying to abort a failed subtransaction (thereby leading to re-execution of whatever parts of AbortSubTransaction had already run). This of course does not prevent such an error from creating an infinite loop, but at least we don't make the situation worse. Responds to an open item on the subtransactions to-do list.	2004-09-06 23:33:48 +00:00
Tom Lane	23645f0582	Fix incorrect ordering of smgr cleanup relative to buffer pin cleanup during transaction abort. Add a regression test case to catch related mistakes in future. Alvaro Herrera and Tom Lane.	2004-09-06 17:56:33 +00:00
Tom Lane	eb917c1a21	I can't see any good reason for DropRelFileNodeBuffers to be issuing FATAL when it detects a nonzero reference count. Reduce to ERROR.	2004-09-06 17:31:32 +00:00
Tom Lane	a421b4e850	FlushRelationBuffers was also being a bit cavalier about whether the relation is already opened by smgr.	2004-08-31 16:13:06 +00:00
Tom Lane	332ee2dc41	Improve spinlock selftest to make it able to detect misdeclaration of the slock_t datatype (ie, declared type smaller than what the hardware TAS instruction needs).	2004-08-30 23:47:20 +00:00
Tom Lane	303e46ea93	Tweak md.c logic to cope with the situation where WAL replay tries to write into a high-numbered segment of a relation that was later deleted. We need to temporarily recreate missing segment files, instead of failing.	2004-08-30 03:52:43 +00:00
Bruce Momjian	15d3f9f6b7	Another pgindent run with lib typedefs added.	2004-08-30 02:54:42 +00:00
Bruce Momjian	b6b71b85bc	Pgindent run for 8.0.	2004-08-29 05:07:03 +00:00
Bruce Momjian	da9a8649d8	Update copyright to 2004.	2004-08-29 04:13:13 +00:00
Tom Lane	1785acebf2	Introduce local hash table for lock state, as per recent proposal. PROCLOCK structs in shared memory now have only a bitmask for held locks, rather than counts (making them 40 bytes smaller, which is a good thing). Multiple locks within a transaction are counted in the local hash table instead, and we have provision for tracking which ResourceOwner each count belongs to. Solves recently reported problem with memory leakage within long transactions.	2004-08-27 17:07:42 +00:00
Tom Lane	337b513e07	Fix user locks. Broken some time ago for all platforms by Windows-related changes.	2004-08-26 17:23:30 +00:00
Tom Lane	4dbb880d3c	Rearrange pg_subtrans handling as per recent discussion. pg_subtrans updates are no longer WAL-logged nor even fsync'd; we do not need to, since after a crash no old pg_subtrans data is needed again. We truncate pg_subtrans to RecentGlobalXmin at each checkpoint. slru.c's API is refactored a little bit to separate out the necessary decisions.	2004-08-23 23:22:45 +00:00
Tom Lane	f009c316ba	Tweak code so that pg_subtrans is never consulted for XIDs older than RecentXmin (== MyProc->xmin). This ensures that it will be safe to truncate pg_subtrans at RecentGlobalXmin, which should largely eliminate any fear of bloat. Along the way, eliminate SubTransXidsHaveCommonAncestor, which isn't really needed and could not give a trustworthy result anyway under the lookback restriction. In an unrelated but nearby change, #ifdef out GetUndoRecPtr, which has been dead code since 2001 and seems unlikely to ever be resurrected.	2004-08-22 02:41:58 +00:00
Tom Lane	1a3de15a3a	Dept. of further reflection: I looked around to see if any other callers of XLogInsert had the same sort of checkpoint interlock problem as RecordTransactionCommit, and indeed I found some. Btree index build and ALTER TABLE SET TABLESPACE write data outside the friendly confines of the buffer manager, and therefore they have to take their own responsibility for checkpoint interlock. The easiest solution seems to be to force smgrimmedsync at the end of the index build or table copy, even when the operation is being WAL-logged. This is sufficient since the new index or table will be of interest to no one if we don't get as far as committing the current transaction.	2004-08-15 23:44:46 +00:00
Tom Lane	057ea3471f	Xmin calculations should consider only top transaction IDs, and therefore starting with GetCurrentTransactionId is wrong. Fixes miscomputation of RecentGlobalXmin leading to bizarre behavior reported by Gavin Sherry.	2004-08-15 17:03:36 +00:00
Tom Lane	efcaf1e868	Some mop-up work for savepoints (nested transactions). Store a small number of active subtransaction XIDs in each backend's PGPROC entry, and use this to avoid expensive probes into pg_subtrans during TransactionIdIsInProgress. Extend EOXactCallback API to allow add-on modules to get control at subxact start/end. (This is deliberately not compatible with the former API, since any uses of that API probably need manual review anyway.) Add basic reference documentation for SAVEPOINT and related commands. Minor other cleanups to check off some of the open issues for subtransactions. Alvaro Herrera and Tom Lane.	2004-08-01 17:32:22 +00:00
Tom Lane	a393fbf937	Restructure error handling as recently discussed. It is now really possible to trap an error inside a function rather than letting it propagate out to PostgresMain. You still have to use AbortCurrentTransaction to clean up, but at least the error handling itself will cooperate.	2004-07-31 00:45:57 +00:00
Tom Lane	1bf3d61504	Fix subtransaction behavior for large objects, temp namespace, files, password/group files. Also allow read-only subtransactions of a read-write parent, but not vice versa. These are the reasonably noncontroversial parts of Alvaro's recent mop-up patch, plus further work on large objects to minimize use of the TopTransactionResourceOwner.	2004-07-28 14:23:31 +00:00
Tom Lane	cc813fc2b8	Replace nested-BEGIN syntax for subtransactions with spec-compliant SAVEPOINT/RELEASE/ROLLBACK-TO syntax. (Alvaro) Cause COMMIT of a failed transaction to report ROLLBACK instead of COMMIT in its command tag. (Tom) Fix a few loose ends in the nested-transactions stuff.	2004-07-27 05:11:48 +00:00
Tom Lane	2042b3428d	Invent WAL timelines, as per recent discussion, to make point-in-time recovery more manageable. Also, undo recent change to add FILE_HEADER and WASTED_SPACE records to XLOG; instead make the XLOG page header variable-size with extra fields in the first page of an XLOG file. This should fix the boundary-case bugs observed by Mark Kirkwood. initdb forced due to change of XLOG representation.	2004-07-21 22:31:26 +00:00
Tom Lane	fe548629c5	Invent ResourceOwner mechanism as per my recent proposal, and use it to keep track of portal-related resources separately from transaction-related resources. This allows cursors to work in a somewhat sane fashion with nested transactions. For now, cursor behavior is non-subtransactional, that is a cursor's state does not roll back if you abort a subtransaction that fetched from the cursor. We might want to change that later.	2004-07-17 03:32:14 +00:00
Tom Lane	8801110b20	Move TablespaceCreateDbspace() call into smgrcreate(), which is where it probably should have been to begin with; this is to cover cases like needing to recreate the per-db directory during WAL replay. Also, fix heap_create to force pg_class.reltablespace to be zero instead of the database's default tablespace; this makes the world safe for CREATE DATABASE to handle all tables in the default tablespace alike, as per previous discussion. And force pg_class.reltablespace to zero when creating a relation without physical storage (eg, a view); this avoids possibly having dangling references in this column after a subsequent DROP TABLESPACE.	2004-07-11 19:52:52 +00:00
Tom Lane	77a436ba55	Fix seriously nasty memory leak in new TransactionIdIsInProgress code.	2004-07-01 03:13:05 +00:00
Tom Lane	573a71a5da	Nested transactions. There is still much left to do, especially on the performance front, but with feature freeze upon us I think it's time to drive a stake in the ground and say that this will be in 7.5. Alvaro Herrera, with some help from Tom Lane.	2004-07-01 00:52:04 +00:00
Tom Lane	c1d9dec3e3	Looks like s_lock_test needs <time.h> on some platforms.	2004-06-19 20:31:55 +00:00
Tom Lane	1232878159	s_lock_test requires libpgport to build now.	2004-06-19 19:43:11 +00:00
Tom Lane	2467394ee1	Tablespaces. Alternate database locations are dead, long live tablespaces. There are various things left to do: contrib dbsize and oid2name modules need work, and so does the documentation. Also someone should think about COMMENT ON TABLESPACE and maybe RENAME TABLESPACE. Also initlocation is dead, it just doesn't know it yet. Gavin Sherry and Tom Lane.	2004-06-18 06:14:31 +00:00
Tom Lane	bbf0ebadaf	StrategyDirtyBufferList wasn't being careful to honor max_buffers limit. Bug is only latent given that sole caller is passing NBuffers, but it could bite someone in the rear someday.	2004-06-11 17:20:39 +00:00
Tom Lane	e6cba71503	Add some code to Assert that when we release pin on a buffer, we are not holding the buffer's cntx_lock or io_in_progress_lock. A recent report from Litao Wu makes me wonder whether it is ever possible for us to drop a buffer and forget to release its cntx_lock. The Assert does not fire in the regression tests, but that proves little ...	2004-06-11 16:43:24 +00:00
Bruce Momjian	a1ccbb9019	Previous code cleanup was for bufpage.c, not bufmgr.c. This cleanup just cleans up a comment.	2004-06-09 13:11:34 +00:00
Bruce Momjian	ce04221a1e	Stylistic changes in bufmgr.c Basically replaces (*a).b with a->b as it is everywhere else in Postgres. Manfred Koizar	2004-06-08 14:00:35 +00:00
Tom Lane	c3a153afed	Tweak palloc/repalloc to allow zero bytes to be requested, as per recent proposal. Eliminate several dozen now-unnecessary hacks to avoid palloc(0). (It's likely there are more that I didn't find.)	2004-06-05 19:48:09 +00:00
Tom Lane	921d749bd4	Adjust our timezone library to use pg_time_t (typedef'd as int64) in place of time_t, as per prior discussion. The behavior does not change on machines without a 64-bit-int type, but on machines with one, which is most, we are rid of the bizarre boundary behavior at the edges of the 32-bit-time_t range (1901 and 2038). The system will now treat times over the full supported timestamp range as being in your local time zone. It may seem a little bizarre to consider that times in 4000 BC are PST or EST, but this is surely at least as reasonable as propagating Gregorian calendar rules back that far. I did not modify the format of the zic timezone database files, which means that for the moment the system will not know about daylight-savings periods outside the range 1901-2038. Given the way the files are set up, it's not a simple decision like 'widen to 64 bits'; we have to actually think about the range of years that need to be supported. We should probably inquire what the plans of the upstream zic people are before making any decisions of our own.	2004-06-03 02:08:07 +00:00
Bruce Momjian	e8d9d68ca4	Per previous discussions, here are two functions to send INT and TERM (cancel and terminate) signals to other backends. They permit only INT and TERM, and permits sending only to postgresql backends. Magnus Hagander	2004-06-02 21:29:29 +00:00
Tom Lane	2095206de1	Adjust btree index build to not use shared buffers, thereby avoiding the locking conflict against concurrent CHECKPOINT that was discussed a few weeks ago. Also, if not using WAL archiving (which is always true ATM but won't be if PITR makes it into this release), there's no need to WAL-log the index build process; it's sufficient to force-fsync the completed index before commit. This seems to gain about a factor of 2 in my tests, which is consistent with writing half as much data. I did not try it with WAL on a separate drive though --- probably the gain would be a lot less in that scenario.	2004-06-02 17:28:18 +00:00
Tom Lane	91d20ff7aa	Additional mop-up for sync-to-fsync changes: avoid issuing fsyncs for temp tables, and avoid WAL-logging truncations of temp tables. Do issue fsync on truncated files (not sure this is necessary but it seems like a good idea).	2004-05-31 20:31:33 +00:00
Tom Lane	e674707968	Minor code rationalization: FlushRelationBuffers just returns void, rather than an error code, and does elog(ERROR) not elog(WARNING) when it detects a problem. All callers were simply elog(ERROR)'ing on failure return anyway, and I find it hard to envision a caller that would not, so we may as well simplify the callers and produce the more useful error message directly.	2004-05-31 19:24:05 +00:00
Tom Lane	9b178555fc	Per previous discussions, get rid of use of sync(2) in favor of explicitly fsync'ing every (non-temp) file we have written since the last checkpoint. In the vast majority of cases, the burden of the fsyncs should fall on the bgwriter process not on backends. (To this end, we assume that an fsync issued by the bgwriter will force out blocks written to the same file by other processes using other file descriptors. Anyone have a problem with that?) This makes the world safe for WIN32, which ain't even got sync(2), and really makes the world safe for Unixen as well, because sync(2) never had the semantics we need: it offers no way to wait for the requested I/O to finish. Along the way, fix a bug I recently introduced in xlog recovery: file truncation replay failed to clear bufmgr buffers for the dropped blocks, which could result in 'PANIC: heap_delete_redo: no block' later on in xlog replay.	2004-05-31 03:48:10 +00:00
Tom Lane	c6719a2784	Implement new PostmasterIsAlive() check for WIN32, per Claudio Natoli. In passing, align a few error messages with the style guide.	2004-05-30 03:50:15 +00:00
Tom Lane	076a055acf	Separate out bgwriter code into a logically separate module, rather than being random pieces of other files. Give bgwriter responsibility for all checkpoint activity (other than a post-recovery checkpoint); so this child process absorbs the functionality of the former transient checkpoint and shutdown subprocesses. While at it, create an actual include file for postmaster.c, which for some reason never had its own file before.	2004-05-29 22:48:23 +00:00
Tom Lane	1a321f26d8	Code review for EXEC_BACKEND changes. Reduce the number of #ifdefs by about a third, make it work on non-Windows platforms again. (But perhaps I broke the WIN32 code, since I have no way to test that.) Fold all the paths that fork postmaster child processes to go through the single routine SubPostmasterMain, which takes care of resurrecting the state that would normally be inherited from the postmaster (including GUC variables). Clean up some places where there's no particularly good reason for the EXEC and non-EXEC cases to work differently. Take care of one or two FIXMEs that remained in the code.	2004-05-28 05:13:32 +00:00
Tom Lane	ebfc56d3fb	Handle impending sinval queue overflow by means of a separate signal (SIGUSR1, which we have not been using recently) instead of piggybacking on SIGUSR2-driven NOTIFY processing. This has several good results: the processing needed to drain the sinval queue is a lot less than the processing needed to answer a NOTIFY; there's less contention since we don't have a bunch of backends all trying to acquire exclusive lock on pg_listener; backends that are sitting inside a transaction block can still drain the queue, whereas NOTIFY processing can't run if there's an open transaction block. (This last is a fairly serious issue that I don't think we ever recognized before --- with clients like JDBC that tend to sit with open transaction blocks, the sinval queue draining mechanism never really worked as intended, probably resulting in a lot of useless cache-reset overhead.) This is the last of several proposed changes in response to Philip Warner's recent report of sinval-induced performance problems.	2004-05-23 03:50:45 +00:00
Tom Lane	4af3421161	Get rid of rd_nblocks field in relcache entries. Turns out this was costing us lots more to maintain than it was worth. On shared tables it was of exactly zero benefit because we couldn't trust it to be up to date. On temp tables it sometimes saved an lseek, but not often enough to be worth getting excited about. And the real problem was that we forced an lseek on every relcache flush in order to update the field. So all in all it seems best to lose the complexity.	2004-05-08 19:09:25 +00:00
Neil Conway	0370951347	Tiny assorted fixes: correct a typo in a comment in vacuumlazy.c, remove some unused #include directives from bufmgr.c, and clarify comments in bufmgr.h and buf.h	2004-04-25 23:50:58 +00:00
Neil Conway	139abc2896	Make LocalRefCount and PrivateRefCount arrays of int32, rather than long. This saves a small amount of per-backend memory for LP64 machines.	2004-04-22 07:21:55 +00:00
Tom Lane	95a03e9cdf	Another round of code cleanup on bufmgr. Use BM_VALID flag to keep track of whether we have successfully read data into a buffer; this makes the error behavior a bit more transparent (IMHO anyway), and also makes it work correctly for local buffers which don't use Start/TerminateBufferIO. Collapse three separate functions for writing a shared buffer into one. This overlaps a bit with cleanups that Neil proposed awhile back, but seems not to have committed yet.	2004-04-21 18:06:30 +00:00
Tom Lane	011c3e62e7	Code review for ARC patch. Eliminate static variables, improve handling of VACUUM cases so that VACUUM requests don't affect the ARC state at all, avoid corner case where BufferSync would uselessly rewrite a buffer that no longer contains the page that was to be flushed. Make some minor other cleanups in and around the bufmgr as well, such as moving PinBuffer and UnpinBuffer into bufmgr.c where they really belong.	2004-04-19 23:27:17 +00:00
Bruce Momjian	31338352bd	* Most changes are to fix warnings issued when compiling win32 * removed a few redundant defines * get_user_name safe under win32 * rationalized pipe read EOF for win32 (UPDATED PATCH USED) * changed all backend instances of sleep() to pg_usleep - except for the SLEEP_ON_ASSERT in assert.c, as it would exceed a 32-bit long [Note to patcher: If a SLEEP_ON_ASSERT of 2000 seconds is acceptable, please replace with pg_usleep(2000000000L)] I added a comment to that part of the code: /* * It would be nice to use pg_usleep() here, but only does 2000 sec * or 33 minutes, which seems too short. */ sleep(1000000); Claudio Natoli	2004-04-19 17:42:59 +00:00
Bruce Momjian	48b2802eee	When changing select() calls for delays into pg_usleep(), two comments in s_lock.c were not updated, and still refers to select. Made my grep hit the wrong files, so I figured a simple patch was in order.. (other refs in the same comment block was changed..) Magnus Hagander	2004-03-23 21:39:46 +00:00
Bruce Momjian	3947f653f9	* postmaster.c: cleanup pmdaemonize under win32; missed failure message in CreateOptsFile * s_lock.c: minor comment fix * findbe.c: variables not used under win32 moved within #ifndef WIN32 case Claudio Natoli	2004-03-15 16:18:43 +00:00
Bruce Momjian	c672aa823b	For application to HEAD, following community review. * Changes incorrect CYGWIN defines to __CYGWIN__ * Some localtime returns NULL checks (when unchecked cause SEGVs under Win32 regression tests) * Rationalized CreateSharedMemoryAndSemaphores and AttachSharedMemoryAndSemaphores (Bruce, I finally remembered to do it); requires attention. Claudio Natoli	2004-02-25 19:41:23 +00:00
Tom Lane	7a57a67278	Replace opendir/closedir calls throughout the backend with AllocateDir and FreeDir routines modeled on the existing AllocateFile/FreeFile. Like the latter, these routines will avoid failing on EMFILE/ENFILE conditions whenever possible, and will prevent leakage of directory descriptors if an elog() occurs while one is open. Also, reduce PANIC to ERROR in MoveOfflineLogs() --- this is not critical code and there is no reason to force a DB restart on failure. All per recent trouble report from Olivier Hubaut.	2004-02-23 23:03:10 +00:00
Tom Lane	f83356c7f5	Do a direct probe during postmaster startup to determine the maximum number of openable files and the number already opened. This eliminates depending on sysconf(_SC_OPEN_MAX), and allows much saner behavior on platforms where open-file slots are used up by semaphores.	2004-02-23 20:45:59 +00:00
Bruce Momjian	af3b182a57	Here is a patch that implements setitimer() on win32. With this patch applied, deadlock detection and statement_timeout now works. The file timer.c goes into src/backend/port/win32/. The patch also removes two lines of "printf debugging" accidentally left in pqsignal.h, in the console control handler. Magnus Hagander	2004-02-18 16:25:12 +00:00
Tom Lane	da99cce7cd	Avoid delaying postmaster shutdown by up to 10 seconds on platforms where signals do not terminate sleep() delays.	2004-02-12 20:07:26 +00:00
Jan Wieck	fc65a3e1fd	Fixed bug where FlushRelationBuffers() did call StrategyInvalidateBuffer() for already empty buffers because their buffer tag was not cleared out when the buffers have been invalidated before. Also removed the misnamed BM_FREE bufhdr flag and replaced the checks, which effectively ask if the buffer is unpinned, with checks against the refcount field. Jan	2004-02-12 15:06:56 +00:00
Tom Lane	c3c09be34b	Commit the reasonably uncontroversial parts of J.R. Nield's PITR patch, to wit: Add a header record to each WAL segment file so that it can be reliably identified. Avoid splitting WAL records across segment files (this is not strictly necessary, but makes it simpler to incorporate the header records). Make WAL entries for file creation, deletion, and truncation (as foreseen but never implemented by Vadim). Also, add support for making XLOG_SEG_SIZE configurable at compile time, similarly to BLCKSZ. Fix a couple bugs I introduced in WAL replay during recent smgr API changes. initdb is forced due to changes in pg_control contents.	2004-02-11 22:55:26 +00:00
Tom Lane	58f337a343	Centralize implementation of delay code by creating a pg_usleep() subroutine in src/port/pgsleep.c. Remove platform dependencies from miscadmin.h and put them in port.h where they belong. Extend recent vacuum cost-based-delay patch to apply to VACUUM FULL, ANALYZE, and non-btree index vacuuming. By the way, where is the documentation for the cost-based-delay patch?	2004-02-10 03:42:45 +00:00
Tom Lane	87bd956385	Restructure smgr API as per recent proposal. smgr no longer depends on the relcache, and so the notion of 'blind write' is gone. This should improve efficiency in bgwriter and background checkpoint processes. Internal restructuring in md.c to remove the not-very-useful array of MdfdVec objects --- might as well just use pointers. Also remove the long-dead 'persistent main memory' storage manager (mm.c), since it seems quite unlikely to ever get resurrected.	2004-02-10 01:55:27 +00:00
Neil Conway	f06e79525a	Win32 signals cleanup. Patch by Magnus Hagander, with input from Claudio Natoli and Bruce Momjian (and some cosmetic fixes from Neil Conway). Changes: - remove duplicate signal definitions from pqsignal.h - replace pqkill() with kill() and redefine kill() in Win32 - use ereport() in place of fprintf() in some error handling in pqsignal.c - export pg_queue_signal() and make use of it where necessary - add a console control handler for Ctrl-C and similar handling on Win32 - do WaitForSingleObjectEx() in CHECK_FOR_INTERRUPTS() on Win32; query cancelling should now work on Win32 - various other fixes and cleanups	2004-02-08 22:28:57 +00:00
Jan Wieck	f425b605f4	Cost based vacuum delay feature. Jan	2004-02-06 19:36:18 +00:00
Jan Wieck	8d09e25693	Backing out the background writer sync() option. Jan	2004-02-04 01:24:53 +00:00
Bruce Momjian	5ee2ae2049	Remove sleep() and use single PG_SLEEP call for Win32 signal handling and consistency. Change PG_USLEEP to use SleepEx() for signal interruptibility.	2004-01-30 15:57:04 +00:00
Bruce Momjian	50491963cb	Here's the latest win32 signals code, this time in the form of a patch against the latest snapshot. It also includes the replacement of kill() with pqkill() and sigsetmask() with pqsigsetmask(). Passes all tests fine on my linux machine once applied. Still doesn't link completely on Win32 - there are a few things still required. But much closer than before. At Bruce's request, I'm going to write up a README file about the method of signals delivery chosen and why the others were rejected (basically a summary of the mailing list discussions). I'll finish that up once/if the patch is accepted. Magnus Hagander	2004-01-27 00:45:26 +00:00
Bruce Momjian	eec08b95e7	[all] Removed call to getppid in SendPostmasterSignal, replacing with a PostmasterPid variable, which gets set (early) in PostmasterMain getppid would not be the postmaster? [fork/exec] Implements processCancelRequest by keeping an array of pid/cancel_key structs in shared mem [fork/exec] Moves AttachSharedMemoryAndSemaphores call for backends into SubPostmasterMain [win32] Implements reaper/waitpid by keeping an arrays of children pids,handles in postmaster local mem - this item is largely untested, for reasons which should be obvious, but appears sound [win32/all] Added extern for pgpipe in Win32 case, and changed the second pipe call (which seems to have been missed earlier) to pgpipe [win32] #define'd ftruncate to chsize in the Win32 case [win32] PG_USLEEP for Win32 has a misplaced paren. Fixed. [win32] DLLIMPORT handling for MingW case Claudio Natoli	2004-01-26 22:59:54 +00:00
Bruce Momjian	ede3b762a3	Back out win32 patch so we can apply it separately.	2004-01-26 22:54:58 +00:00
Bruce Momjian	f4921e5ca3	Attached is a patch that fixes some trivial typos and alignment. Please apply. Alvaro Herrera	2004-01-26 22:51:56 +00:00
Tom Lane	c77f363384	Ensure that close() and fclose() are checked for errors, at least in cases involving writes. Per recent discussion about the possibility of close-time failures on some filesystems. There is a TODO item for this, too.	2004-01-26 22:35:32 +00:00
Jan Wieck	d77b63b17c	Added GUC variable bgwriter_flush_method controlling the action done by the background writer between writing dirty blocks and napping. none (default) no action sync bgwriter calls smgrsync() causing a sync(2) A global sync() is only good on dedicated database servers, so more flush methods should be added in the future. Jan	2004-01-24 20:00:46 +00:00
Jan Wieck	dfdd59e918	Adjusted calculation of shared memory requirements to new ARC buffer replacement strategy. Jan	2004-01-15 16:14:26 +00:00
Bruce Momjian	4cdf51e646	Drops in the CreateProcess calls for Win32 (essentially wrapping up the fork/exec portion of the port), and fixes a handful of whitespace issues Claudio Natoli	2004-01-11 03:49:31 +00:00
Bruce Momjian	38081fd000	Change PG_DELAY from msec to usec and use it consistently rather than select(). Add Win32 Sleep() for delay.	2004-01-09 21:08:50 +00:00
Neil Conway	192ad63bd7	More janitorial work: remove the explicit casting of NULL literals to a pointer type when it is not necessary to do so. For future reference, casting NULL to a pointer type is only necessary when (a) invoking a function AND either (b) the function has no prototype OR (c) the function is a varargs function.	2004-01-07 18:56:30 +00:00
Neil Conway	dfc7e7b71d	Code cleanup, mostly in the smgr: - Update comment in IsReservedName() to the present day - Improve some variable & function names in commands/vacuum.c. I was planning to rewrite this to avoid lappend(), but since I still intend to do the list rewrite, there's no need for that. - Update some smgr comments which seemed to imply that we still forced all dirty pages to disk at commit-time. - Replace some #ifdef DIAGNOSTIC code with assertions. - Make the distinction between OS-level file descriptors and virtual file descriptors a little clearer in a few comments - Other minor comment improvements in the smgr code	2004-01-06 18:07:32 +00:00
Tom Lane	e8aa10ee47	ShmemInitHash forgot to specify HASH_ALLOC flag bit in its hash_create call. You'd think this would cause some problems, but because of the way hash_create is coded, the only side-effect was creation of a useless memory context for the hashtable.	2003-12-30 00:03:03 +00:00
Tom Lane	f8eed65dfb	Improve spinlock code for recent x86 processors: insert a PAUSE instruction in the s_lock() wait loop, and use test before test-and-set in TAS() macro to avoid unnecessary bus traffic. Patch from Manfred Spraul, reworked a bit by Tom.	2003-12-27 20:58:58 +00:00
Bruce Momjian	aeddc2a60d	Continued rearrangement to permit pgstat + BootstrapMain processes to be fork/exec'd, in the same mode as the previous patch for backends. Claudio Natoli	2003-12-25 03:52:51 +00:00
Tom Lane	afb09b5a31	Use inlined TAS() on PA-RISC, if we are compiling with gcc. Patch inspired by original submission from ViSolve.	2003-12-23 22:15:07 +00:00
Tom Lane	9adaf64da3	Mop-up for HAS_TEST_AND_SET refactoring. Un-break two or three platforms that were broken, try to make layout of s_lock.h entries consistent, use HAVE_SPINLOCKS in preference to HAS_TEST_AND_SET everywhere outside s_lock.h itself.	2003-12-23 18:13:17 +00:00
Bruce Momjian	69f2e9b0fc	Move slock_t typedefs into s_lock.h from include/port files for centralization and easier maintenance.	2003-12-23 03:31:30 +00:00
Bruce Momjian	887b5a7be0	Remove NEED_I386_TAS_ASM and just test for compiler defines.	2003-12-23 00:32:06 +00:00
Bruce Momjian	9114cb1c5f	This applied patch removes NEED_SPARC_TAS_ASM and instead uses __sparc || __sparc__.	2003-12-22 23:39:53 +00:00
Bruce Momjian	ced30eb857	[ This description should have been on the earlier fork/exec commit, but I am adding it now so it is in CVS.] The patch basically is a slight rearrangement of the code to allow fork/exec on Unix, with the ultimate goal of doing CreateProcess on Win32. The changes are: o Write out postmaster global variables and per-backend variables to be read by the exec'ed backend o Mark some static variables as global when exec is used so they can be dumped from postmaster.c, marked NON_EXEC_STATIC o Remove value passing with -p now that we have per-backend file o Move some pointer storage out of shared memory for easier dumping. o Modified pgsql_temp directory cleanup to handle per-database directories and the backend exec directory under datadir. Claudio Natoli	2003-12-21 04:30:10 +00:00
Tom Lane	772d0f9345	The recent DUMMY_PROCS patch broke accounting for the number of semaphores needed. This caused us to fail all the time on Darwin, and we'd fail for some values of maxBackends on SysV-sema platforms, too.	2003-12-21 00:33:33 +00:00
Tom Lane	16cc9dff4f	bufmgr.c failed to compile on Darwin, because it didn't include <sys/time.h> where struct timeval is defined.	2003-12-20 22:18:02 +00:00
Bruce Momjian	d75b2ec4eb	This patch is the next step towards (re)allowing fork/exec. Claudio Natoli	2003-12-20 17:31:21 +00:00
Neil Conway	fef0c8345a	I posted some bufmgr cleanup a few weeks ago, but it conflicted with some concurrent changes Jan was making to the bufmgr. Here's an updated version of the patch -- it should apply cleanly to CVS HEAD and passes the regression tests. This patch makes the following changes: - remove the UnlockAndReleaseBuffer() and UnlockAndWriteBuffer() macros, and replace uses of them with calls to the appropriate functions. - remove a bunch of #ifdef BMTRACE code: it is ugly & broken (i.e. it doesn't compile) - make BufferReplace() return a bool, not an int - cleanup some logic in bufmgr.c; should be functionality equivalent to the previous code, just cleaner now - remove the BM_PRIVATE flag as it is unused - improve a few comments, etc.	2003-12-14 00:34:47 +00:00
Peter Eisentraut	2afacfc403	This patch properly sets the prototype for the on_shmem_exit and on_proc_exit functions, and adjust all other related code to use the proper types too. by Kurt Roeckx	2003-12-12 18:45:10 +00:00
Bruce Momjian	e7ca867485	Try to reduce confusion about what is a lock method identifier, a lock method control structure, or a table of control structures. . Use type LOCKMASK where an int is not a counter. . Get rid of INVALID_TABLEID, use INVALID_LOCKMETHOD instead. . Use INVALID_LOCKMETHOD instead of (LOCKMETHOD) NULL, because LOCKMETHOD is not a pointer. . Define and use macro LockMethodIsValid. . Rename LOCKMETHOD to LOCKMETHODID. . Remove global variable LongTermTableId in lmgr.c, because it is never used. . Make LockTableId static in lmgr.c, because it is used nowhere else. Why not remove it and use DEFAULT_LOCKMETHOD? . Rename the lock method control structure from LOCKMETHODTABLE to LockMethodData. Introduce a pointer type named LockMethod. . Remove elog(FATAL) after InitLockTable() call in CreateSharedMemoryAndSemaphores(), because if something goes wrong, there is elog(FATAL) in LockMethodTableInit(), and if this doesn't help, an elog(ERROR) in InitLockTable() is promoted to FATAL. . Make InitLockTable() void, because its only caller does not use its return value any more. . Rename variables in lock.c to avoid statements like LockMethodTable[NumLockMethods] = lockMethodTable; lockMethodTable = LockMethodTable[lockmethod]; . Change LOCKMETHODID type to uint16 to fit into struct LOCKTAG. . Remove static variables BITS_OFF and BITS_ON from lock.c, because I agree to this doubt: * XXX is a fetch from a static array really faster than a shift? . Define and use macros LOCKBIT_ON/OFF. Manfred Koizar	2003-12-01 21:59:25 +00:00
Tom Lane	0902ece5b9	Force zero_damaged_pages to be effectively ON during recovery from WAL, since there is no need to worry about damaged pages when we are going to overwrite them anyway from the WAL. Per recent discussion.	2003-12-01 16:53:19 +00:00
PostgreSQL Daemon	969685ad44	$Header: -> $PostgreSQL Changes ...	2003-11-29 19:52:15 +00:00
Peter Eisentraut	c9190ef074	Conditionalize variable that is only used conditionally, to avoid warning.	2003-11-27 18:12:50 +00:00
Tom Lane	9ea738827c	Second try at fixing no-room-to-move-down PANIC in compact_fsm_storage. Ward's report that it can still happen in RC2 forces me to realize that this is not a can't-happen condition after all, and that the compaction code had better cope rather than panicking.	2003-11-26 20:50:11 +00:00
Tom Lane	42ce74bf17	COMMENT ON casts, conversions, languages, operator classes, and large objects. Dump all these in pg_dump; also add code to pg_dump user-defined conversions. Make psql's large object code rely on the backend for inserting/deleting LOB comments, instead of trying to hack pg_description directly. Documentation and regression tests added. Christopher Kings-Lynne, code reviewed by Tom	2003-11-21 22:32:49 +00:00
Tom Lane	0a97cb37fc	Remove unused variable.	2003-11-21 17:41:31 +00:00
Jan Wieck	cfeca62148	Background writer process This first part of the background writer does no syncing at all. Its only purpose is to keep the LRU heads clean so that regular backends seldom to never have to call write(). Jan	2003-11-19 15:55:08 +00:00
Jan Wieck	1f45555892	Changed parameter name for shared cache status report interval to debug_shared_buffers = <seconds> as per previous discussion. Jan	2003-11-16 16:41:01 +00:00
Jan Wieck	7c360d65a8	Added documentation for the new interface between the buffer manager and the cache replacement strategy as well as a description of the ARC algorithm and the special tailoring of that done for PostgreSQL. Jan	2003-11-14 04:32:11 +00:00
Jan Wieck	6b86d62b00	2nd try for the ARC strategy. I added a couple more Assertions while tracking down the exact cause of the former bug. All 93 regression tests pass now. Jan	2003-11-13 14:57:15 +00:00
Jan Wieck	923e994d79	ARC strategy backed out ... sorry Jan	2003-11-13 05:34:58 +00:00
Jan Wieck	48adc0b34b	Replacement of the buffer replacement strategy with an ARC algorithm adopted for PostgreSQL. Jan	2003-11-13 00:40:02 +00:00
Tom Lane	fa5c8a055a	Cross-data-type comparisons are now indexable by btrees, pursuant to my pghackers proposal of 8-Nov. All the existing cross-type comparison operators (int2/int4/int8 and float4/float8) have appropriate support. The original proposal of storing the right-hand-side datatype as part of the primary key for pg_amop and pg_amproc got modified a bit in the event; it is easier to store zero as the 'default' case and only store a nonzero when the operator is actually cross-type. Along the way, remove the long-since-defunct bigbox_ops operator class.	2003-11-12 21:15:59 +00:00
Tom Lane	c1d62bfd00	Add operator strategy and comparison-value datatype fields to ScanKey. Remove the 'strategy map' code, which was a large amount of mechanism that no longer had any use except reverse-mapping from procedure OID to strategy number. Passing the strategy number to the index AM in the first place is simpler and faster. This is a preliminary step in planned support for cross-datatype index operations. I'm committing it now since the ScanKeyEntryInitialize() API change touches quite a lot of files, and I want to commit those changes before the tree drifts under me.	2003-11-09 21:30:38 +00:00
Tom Lane	4240d2bffd	Update future-tense comments in README to present tense. Noted by Neil Conway.	2003-10-31 22:48:08 +00:00
Tom Lane	abec4cbf1f	compact_fsm_storage() does need to handle the case where a relation's FSM data has to be both moved down and compressed. Per report from Dror Matalon.	2003-10-29 17:36:57 +00:00
Tom Lane	624292aa35	Ensure that all places that are complaining about exhaustion of shared memory say 'out of shared memory'; some were doing that and some just said 'out of memory'. Also add a HINT about increasing max_locks_per_transaction where relevant, per suggestion from Sean Chittenden. (The former change does not break the strings freeze; the latter does, but I think it's worth doing anyway.)	2003-10-16 20:59:35 +00:00
Bruce Momjian	7fb9893f42	Back out -fstrict-aliasing void* casting.	2003-10-11 18:04:26 +00:00
Bruce Momjian	d51368dbbd	This patch will stop gcc from issuing warnings about type-punned objects when -fstrict-aliasing is turned on, as it is in the latest gcc when you use -O2 Andrew Dunstan	2003-10-11 16:30:55 +00:00
Tom Lane	55d85f42a8	Repair RI trigger visibility problems (this time for sure ;-)) per recent discussion on pgsql-hackers: in READ COMMITTED mode we just have to force a QuerySnapshot update in the trigger, but in SERIALIZABLE mode we have to run the scan under a current snapshot and then complain if any rows would be updated/deleted that are not visible in the transaction snapshot.	2003-10-01 21:30:53 +00:00
Peter Eisentraut	7438af96fa	More message editing, some suggested by Alvaro Herrera	2003-09-29 00:05:25 +00:00
Peter Eisentraut	feb4f44d29	Message editing: remove gratuitous variations in message wording, standardize terms, add some clarifications, fix some untranslatable attempts at dynamic message building.	2003-09-25 06:58:07 +00:00
Tom Lane	a56a016ceb	Repair some REINDEX problems per recent discussions. The relcache is now able to cope with assigning new relfilenode values to nailed-in-cache indexes, so they can be reindexed using the fully crash-safe method. This leaves only shared system indexes as special cases. Remove the 'index deactivation' code, since it provides no useful protection in the shared- index case. Require reindexing of shared indexes to be done in standalone mode, but remove other restrictions on REINDEX. -P (IgnoreSystemIndexes) now prevents using indexes for lookups, but does not disable index updates. It is therefore safe to allow from PGOPTIONS. Upshot: reindexing system catalogs can be done without a standalone backend for all cases except shared catalogs.	2003-09-24 18:54:02 +00:00
Tom Lane	5aa29e88e9	Arrange to align shared disk buffers on at least 32-byte boundaries, not just MAXALIGN boundaries. This makes a noticeable difference in the speed of transfers to and from kernel space, at least on recent Pentiums, and might help other CPUs too. We should look at making this happen for local buffers and buffile.c too. Patch from Manfred Spraul.	2003-09-21 17:57:21 +00:00
Bruce Momjian	aaafbdcfd3	Fix old mention of exec() in AttachSharedMemoryAndSemaphores comment.	2003-09-12 02:13:23 +00:00
Tom Lane	7a3693716d	Reimplement hash index locking algorithms, per my recent proposal to pghackers. This fixes the problem recently reported by Markus Kräutner (hash bucket split corrupts the state of scans being done concurrently), and I believe it also fixes all the known problems with deadlocks in hash index operations. Hash indexes are still not really ready for prime time (since they aren't WAL-logged), but this is a step forward.	2003-09-04 22:06:27 +00:00
Tom Lane	de9c553f6b	Clean up locktable init code per recent gripe from Kurt Roeckx. No change in behavior, but old code would have failed to detect overrun of MAX_LOCKMODES.	2003-08-17 22:41:12 +00:00
Tom Lane	ffafacc1f6	Repair potential deadlock created by recent changes to recycle btree index pages: when _bt_getbuf asks the FSM for a free index page, it is possible (and, in some cases, even moderately likely) that the answer will be the same page that _bt_split is trying to split. _bt_getbuf already knew that the returned page might not be free, but it wasn't prepared for the possibility that even trying to lock the page could be problematic. Fix by doing a conditional rather than unconditional grab of the page lock.	2003-08-10 19:48:08 +00:00
Bruce Momjian	46785776c4	Another pgindent run with updated typedefs.	2003-08-08 21:42:59 +00:00
Tom Lane	d5f7d2c682	Adopt a random backoff algorithm for sleep delays when waiting for a spinlock. Per recent pghackers discussion.	2003-08-06 16:43:43 +00:00
Tom Lane	963c1fa9d3	Minor cleanups in S_LOCK_TEST code.	2003-08-04 15:28:33 +00:00
Bruce Momjian	f3c3deb7d0	Update copyrights to 2003.	2003-08-04 02:40:20 +00:00
Bruce Momjian	089003fb46	pgindent run.	2003-08-04 00:43:34 +00:00
Tom Lane	81b5c8a136	A visit from the message-style police ...	2003-07-28 00:09:16 +00:00
Tom Lane	b556e8200e	elog mop-up: bring some straggling fprintf(stderr)'s into the elog world.	2003-07-27 21:49:55 +00:00
Tom Lane	cfa191f3b8	Error message editing in backend/storage.	2003-07-24 22:04:15 +00:00
Bruce Momjian	acd1536d9f	Up to now, SerializableSnapshot and QuerySnapshot are malloc'ed and free'd for every transaction or statement, respectively. This patch puts these data structures into static memory, thus saving a few CPU cycles and two malloc calls per transaction or (in isolation level READ COMMITTED) per query. Manfred Koizar	2003-06-12 01:42:21 +00:00
Bruce Momjian	0abe7431c6	This patch extracts page buffer pooling and the simple least-recently-used strategy from clog.c into slru.c. It doesn't change any visible behaviour and passes all regression tests plus a TruncateCLOG test done manually. Apart from refactoring I made a little change to SlruRecentlyUsed, formerly ClogRecentlyUsed: It now skips incrementing lru_counts, if slotno is already the LRU slot, thus saving a few CPU cycles. To make this work, lru_counts are initialised to 1 in SimpleLruInit. SimpleLru will be used by pg_subtrans (part of the nested transactions project), so the main purpose of this patch is to avoid future code duplication. Manfred Koizar	2003-06-11 22:37:46 +00:00
Bruce Momjian	98b6f37e47	Make debug_ GUC variables output DEBUG1 rather than LOG, and mention in docs that CLIENT/LOG_MIN_MESSAGES now controls debug_* output location. Doc changes included.	2003-05-27 17:49:47 +00:00
Bruce Momjian	12c9423832	Allow Win32 to compile under MinGW. Major changes are: Win32 port is now called 'win32' rather than 'win' add -lwsock32 on Win32 make gethostname() be only used when kerberos4 is enabled use /port/getopt.c new /port/opendir.c routines disable GUC unix_socket_group on Win32 convert some keywords.c symbols to KEYWORD_P to prevent conflict create new FCNTL_NONBLOCK macro to turn off socket blocking create new /include/port.h file that has /port prototypes, move out of c.h new /include/port/win32_include dir to hold missing include files work around ERROR being defined in Win32 includes	2003-05-15 16:35:30 +00:00
Tom Lane	a4e775a263	Make use of new error context stack mechanism to allow random errors detected during buffer dump to be labeled with the buffer location. For example, if a page LSN is clobbered, we now produce something like ERROR: XLogFlush: request 2C000000/8468EC8 is not satisfied --- flushed only to 0/8468EF0 CONTEXT: writing block 0 of relation 428946/566240 whereas before there was no convenient way to find out which page had been trashed.	2003-05-10 19:04:30 +00:00
Bruce Momjian	d9fd7d12f6	Pass shared memory id and socket descriptor number on command line for fork/exec.	2003-05-06 23:34:56 +00:00
Bruce Momjian	a7fd03e1de	Handle clog structure in shared memory in exec() case, for Win32.	2003-05-03 03:52:07 +00:00
Bruce Momjian	a2e038fbee	Back out last commit --- wrong patch.	2003-05-02 21:59:31 +00:00
Bruce Momjian	fb1f7ccec5	Dump/read non-default GUC values for use by exec'ed backends, for Win32.	2003-05-02 21:52:42 +00:00
Tom Lane	4a5f38c4e6	Code review for holdable-cursors patch. Fix error recovery, memory context sloppiness, some other things. Includes Neil's mopup patch of 22-Apr.	2003-04-29 03:21:30 +00:00
Tom Lane	f9ba0a7fe5	Apple's assembler likes the inlined TAS syntax too, so no reason to maintain a separate out-of-line version of PPC tas() anymore. Also fix S_UNLOCK for __powerpc64__ platforms.	2003-04-20 21:54:34 +00:00
Bruce Momjian	d46e643822	Add Win32 path handling for / vs. \ and drive letters.	2003-04-04 20:42:13 +00:00
Tom Lane	7cd30e1590	TestConfiguration returns int, not bool. This mistake is relatively harmless on signed-char machines but would lead to core dump in the deadlock detection code if char is unsigned. Amazingly, this bug has been here since 7.1 and yet wasn't reported till now. Thanks to Robert Bruccoleri for providing the opportunity to track it down.	2003-03-31 20:32:29 +00:00
Tom Lane	fd42262836	Add code to apply some simple sanity checks to the header fields of a page when it's read in, per pghackers discussion around 17-Feb. Add a GUC variable zero_damaged_pages that causes the response to be a WARNING followed by zeroing the page, rather than the normal ERROR; this is per Hiroshi's suggestion that there needs to be a way to get at the data in the rest of the table.	2003-03-28 20:17:13 +00:00
Bruce Momjian	54f7338fa1	This patch implements holdable cursors, following the proposal (materialization into a tuple store) discussed on pgsql-hackers earlier. I've updated the documentation and the regression tests. Notes on the implementation: - I needed to change the tuple store API slightly -- it assumes that it won't be used to hold data across transaction boundaries, so the temp files that it uses for on-disk storage are automatically reclaimed at end-of-transaction. I added a flag to tuplestore_begin_heap() to control this behavior. Is changing the tuple store API in this fashion OK? - in order to store executor results in a tuple store, I added a new CommandDest. This works well for the most part, with one exception: the current DestFunction API doesn't provide enough information to allow the Executor to store results into an arbitrary tuple store (where the particular tuple store to use is chosen by the call site of ExecutorRun). To workaround this, I've temporarily hacked up a solution that works, but is not ideal: since the receiveTuple DestFunction is passed the portal name, we can use that to lookup the Portal data structure for the cursor and then use that to get at the tuple store the Portal is using. This unnecessarily ties the Portal code with the tupleReceiver code, but it works... The proper fix for this is probably to change the DestFunction API -- Tom suggested passing the full QueryDesc to the receiveTuple function. In that case, callers of ExecutorRun could "subclass" QueryDesc to add any additional fields that their particular CommandDest needed to get access to. This approach would work, but I'd like to think about it for a little bit longer before deciding which route to go. In the mean time, the code works fine, so I don't think a fix is urgent. - (semi-related) I added a NO SCROLL keyword to DECLARE CURSOR, and adjusted the behavior of SCROLL in accordance with the discussion on -hackers. - (unrelated) Cleaned up some SGML markup in sql.sgml, copy.sgml Neil Conway	2003-03-27 16:51:29 +00:00
Tom Lane	4b6c198a6a	Add code to dump contents of free space map into $PGDATA/global/pg_fsm.cache at database shutdown, and then load it again at database startup. This preserves our hard-won knowledge of free space across restarts (given an orderly shutdown, that is).	2003-03-06 00:04:27 +00:00
Tom Lane	391eb5e5b6	Reimplement free-space-map management as per recent discussions. Adjustable threshold is gone in favor of keeping track of total requested page storage and doling out proportional fractions to each relation (with a minimum amount per relation, and some quantization of the results to avoid thrashing with small changes in page counts). Provide special-case code for indexes so as not to waste space storing useless page free space counts. Restructure internal data storage to be a flat array instead of list-of-chunks; this may cost a little more work in data copying when reorganizing, but allows binary search to be used during lookup_fsm_page_entry().	2003-03-04 21:51:22 +00:00
Tom Lane	61b22d3aab	btree page recycling can be done as soon as page's next-xact label is older than current Xmin; we don't have to wait till it's older than GlobalXmin.	2003-02-23 23:20:52 +00:00
Tom Lane	88dc31e3f2	First cut at recycling space in btree indexes. Still some rough edges to fix, but it seems to basically work...	2003-02-23 06:17:13 +00:00
Bruce Momjian	69c049cef4	Back out LOCKTAG changes by Rod Taylor, pending code review. Sorry.	2003-02-19 23:41:15 +00:00
Bruce Momjian	d0f3a7e9c4	- Modifies LOCKTAG to include a 'classId'. Relations receive a classId of RelOid_pg_class, and transaction locks XactLockTableId. RelId is renamed to objId. - LockObject() and UnlockObject() functions created, and their use sprinkled throughout the code to do descent locking for domains and types. They accept lock modes AccessShare and AccessExclusive, as we only really need a 'read' and 'write' lock at the moment. Most locking cases are held until the end of the transaction. This fixes the cases Tom mentioned earlier in regards to locking with Domains. If the patch is good, I'll work on cleaning up issues with other database objects that have this problem (most of them). Rod Taylor	2003-02-19 04:02:54 +00:00
Bruce Momjian	9ee8e7a39e	Update README.	2003-02-18 03:33:50 +00:00
Bruce Momjian	32cc6cbe23	Rename 'holder' references to 'proclock' for PROCLOCK references, for consistency.	2003-02-18 02:13:24 +00:00
Bruce Momjian	48ee6f4916	This trivial patch removes the usage of some old statistics code that no longer works -- IncrHeapAccessStat() didn't actually do anything anymore, so no reason to keep it around AFAICS. I also fixed a grammatical error in a comment. Neil Conway	2003-02-13 05:35:11 +00:00
Tom Lane	227a404cf4	Add code to print information about a detected deadlock cycle. The printed data is comparable to what you could read in the pg_locks view, were you fortunate enough to have been looking at it at the right time.	2003-01-16 21:01:45 +00:00
Bruce Momjian	ef581f0552	Rewrite for-loop, because this is not the Obfuscated C Code Contest. Manfred Koizar	2003-01-11 05:01:03 +00:00
Tom Lane	9f1f2bfb66	Fix various places where global s/NOTICE/WARNING/ was applied with too much enthusiasm.	2003-01-07 22:23:17 +00:00
Tom Lane	973a210cce	Tweak mdnblocks() to avoid doing lseek() on segments that it has previously determined not to be the last segment of a relation. This reduces the expected cost to one seek, rather than one seek per segment. We can get away with this because truncation of a relation will cause a relcache flush and so the md.c file descriptor will be closed; when it is re-opened we will re-determine the last segment.	2003-01-07 01:19:12 +00:00
Tom Lane	a2e8e15dd4	localbuf.c must be able to do blind writes.	2002-12-05 22:48:03 +00:00
Tom Lane	8362be35e8	Code review for superuser_reserved_connections patch. Don't try to do database access outside a transaction; revert bogus performance improvement in SIBackendInit(); improve comments; add documentation (this part courtesy Neil Conway).	2002-11-21 06:36:08 +00:00
Tom Lane	6929a1e6ad	Improve comment: add note that grotty special case in mdread() is required by hash index implementation.	2002-11-12 15:26:30 +00:00
Bruce Momjian	ceb4f5ea9c	> > I'll re-check that with the ppc architecture guy here. > > ... he is now about to write an inlined version that can go into > s_lock.h . I'll send the new patch later on... OK, here it comes: An inlined version of tas(), that works for both, powerpc and powerpc64. The patch is against 7.3b5 and passes the test suite on both architectures. Reinhard Max	2002-11-10 00:33:43 +00:00
Tom Lane	643dfb783d	Fix some bogus comments.	2002-11-01 00:40:23 +00:00
Tom Lane	55e4ef138c	Code review for statement_timeout patch. Fix some race conditions between signal handler and enable/disable code, avoid accumulation of timing error due to trying to maintain remaining-time instead of absolute-end-time, disable timeout before commit not after.	2002-10-31 21:34:17 +00:00
Tom Lane	edf497dec9	Avoid palloc(0) when MaxBackends = 1.	2002-10-03 19:17:55 +00:00
Bruce Momjian	5ad4faf13a	This patch removes a use of uninitialized memory in lmgr/lock.c, by adding a missing sprintf(). Neil Conway	2002-09-26 05:18:30 +00:00
Tom Lane	8a6fab412e	Remove ShutdownBufferPoolAccess exit callback, and do the work in ProcKill instead, where we still have a PGPROC with which to wait on LWLocks. This fixes 'can't wait without a PROC structure' failures occasionally seen during backend shutdown (I'm surprised they weren't more frequent, actually). Add an Assert() to LWLockAcquire to help catch any similar mistakes in future. Fix failure to update MyProcPid for standalone backends and pgstat processes.	2002-09-25 20:31:40 +00:00
Tom Lane	7233aae50b	Fix PPC s_lock operations to work correctly on multi-CPU machines. Need 'isync' during TAS and 'sync' during S_UNLOCK.	2002-09-21 00:14:05 +00:00
Tom Lane	b2735fcd52	Performance improvement for MultiRecordFreeSpace on large relations --- avoid O(N^2) behavior. Problem noted and fixed by Stephen Marshall <smarshall@wsicorp.com>, with some help from Tom Lane.	2002-09-20 19:56:01 +00:00
Bruce Momjian	229eebd559	This patch fixes two typos in src/backend/storage/ipc/README. Neil Conway	2002-09-20 03:53:55 +00:00
Tom Lane	c91b8bc537	Cosmetic fixes from Neil Conway.	2002-09-14 19:59:20 +00:00
Tom Lane	52c9d25933	Be careful to include postgres.h before any system headers, to ensure that the right flavors of largefile-related definitions are seen. Most of these changes are probably unnecessary, but better safe than sorry.	2002-09-05 00:43:07 +00:00
Bruce Momjian	e50f52a074	pgindent run.	2002-09-04 20:31:48 +00:00
Bruce Momjian	a12b4e279b	I checked all the previous string handling errors and most of them were already fixed by you. However there were a few left and attached patch should fix the rest of them. I used StringInfo only in 2 places and both of them are inside debug ifdefs. Only performance penalty will come from using strlen() like all the other code does. I also modified some of the already patched parts by changing snprintf(buf, 2 * BUFSIZE, ... style lines to snprintf(buf, sizeof(buf), ... where buf is an array. Jukka Holappa	2002-09-02 06:11:43 +00:00
Bruce Momjian	97ac103289	Remove sys/types.h in files that include postgres.h, and hence c.h, because c.h has sys/types.h.	2002-09-02 02:47:07 +00:00
Tom Lane	c7a165adc6	Code review for HeapTupleHeader changes. Add version number to page headers (overlaying low byte of page size) and add HEAP_HASOID bit to t_infomask, per earlier discussion. Simplify scheme for overlaying fields in tuple header (no need for cmax to live in more than one place). Don't try to clear infomask status bits in tqual.c --- not safe to do it there. Don't try to force output table of a SELECT INTO to have OIDs, either. Get rid of unnecessarily complex three-state scheme for TupleDesc.tdhasoids, which has already caused one recent failure. Improve documentation.	2002-09-02 01:05:06 +00:00
Tom Lane	1bab464eb4	Code review for pg_locks feature. Make shmemoffset of PROCLOCK structs available (else there's no way to interpret the list links). Change pg_locks view to show transaction ID locks separately from ordinary relation locks. Avoid showing N duplicate rows when the same lock is held multiple times (seems unlikely that users care about exact hold count). Improve documentation.	2002-08-31 17:14:28 +00:00
Bruce Momjian	626eca697c	This patch reserves the last superuser_reserved_connections slots for connections by the superuser only. This patch replaces the last patch I sent a couple of days ago. It closes a connection that has not been authorised by a superuser if it would leave fewer than the GUC variable ReservedBackends (superuser_reserved_connections in postgresql.conf) backend process slots free in the SISeg. This differs from the first patch which only reserved the last ReservedBackends slots in the procState array. This has made the free slot test more expensive due to the use of a lock. After thinking about a comment on the first patch I've also made it a fatal error if the number of reserved slots is not less than the maximum number of connections. Nigel J. Andrews	2002-08-29 21:02:12 +00:00
Bruce Momjian	dd912c6977	This patch replaces a few more usages of strcpy() and sprintf() when copying into a fixed-size buffer (in this case, a buffer of NAMEDATALEN bytes). AFAICT nothing to worry about here, but worth fixing anyway... Neil Conway	2002-08-27 03:56:35 +00:00
Tom Lane	58de480999	Clean up comments to be careful about the distinction between variable-width types and varlena types, since with the introduction of CSTRING as a more-or-less-real type, these concepts aren't identical. I've tried to use varlena consistently to denote datatypes with typlen = -1, ie, they have a length word and are potentially TOASTable; while the term variable width covers both varlena and cstring (and, perhaps, someday other types with other rules for computing the actual width). No code changes in this commit except for renaming a couple macros.	2002-08-25 17:20:01 +00:00
Bruce Momjian	82119a696e	[ Newest version of patch applied.] This patch is an updated version of the lock listing patch. I've made the following changes: - write documentation - wrap the SRF in a view called 'pg_locks': all user-level access should be done through this view - re-diff against latest CVS One thing I chose not to do is adapt the SRF to use the anonymous composite type code from Joe Conway. I'll probably do that eventually, but I'm not really convinced it's a significantly cleaner way to bootstrap SRF builtins than the method this patch uses (of course, it has other uses...) Neil Conway	2002-08-17 13:04:19 +00:00
Bruce Momjian	b1a5f87209	Tom Lane wrote: > There's no longer a separate call to heap_storage_create in that routine > --- the right place to make the test is now in the storage_create > boolean parameter being passed to heap_create. A simple change, but > it passeth patch's understanding ... Thanks. Attached is a patch against cvs tip as of 8:30 PM PST or so. Turned out that even after fixing the failed hunks, there was a new spot in bufmgr.c which needed to be fixed (related to temp relations; RelationUpdateNumberOfBlocks). But thankfully the regression test code caught it :-) Joe Conway	2002-08-15 16:36:08 +00:00
Tom Lane	e44beef712	Code review of CLUSTER patch. Clean up problems with relcache getting confused, toasted data getting lost, etc.	2002-08-11 21:17:35 +00:00
Peter Eisentraut	f1d820494c	Fix failure to relink postmaster executable in the first make run if only a single source file a few directories deep in the backend tree has changed.	2002-08-10 17:59:28 +00:00
Tom Lane	ba053de197	Still more paranoia in PageAddItem: disallow specification of an item offset past the last-used-item-plus-one, since that would result in leaving uninitialized holes in the item pointer array. AFAICT the only place that was depending on this was btree index build, which was being cavalier about when to fill in the P_HIKEY pointer; easily fixed. Also a small performance improvement: shuffle itemid's by means of memmove, not a one-at-a-time loop.	2002-08-06 19:41:23 +00:00
Tom Lane	5df307c778	Restructure local-buffer handling per recent pghackers discussion. The local buffer manager is no longer used for newly-created relations (unless they are TEMP); a new non-TEMP relation goes through the shared bufmgr and thus will participate normally in checkpoints. But TEMP relations use the local buffer manager throughout their lifespan. Also, operations in TEMP relations are not logged in WAL, thus improving performance. Since it's no longer necessary to fsync relations as they move out of the local buffers into shared buffers, quite a lot of smgr.c/md.c/fd.c code is no longer needed and has been removed: there's no concept of a dirty relation anymore in md.c/fd.c, and we never fsync anything but WAL. Still TODO: improve local buffer management algorithms so that it would be reasonable to increase NLocBuffer.	2002-08-06 02:36:35 +00:00
Tom Lane	15fe086fba	Restructure system-catalog index updating logic. Instead of having hardwired lists of index names for each catalog, use the relcache's mechanism for caching lists of OIDs of indexes of any table. This reduces the common case of updating system catalog indexes to a single line, makes it much easier to add a new system index (in fact, you can now do so on-the-fly if you want to), and as a nice side benefit improves performance a little. Per recent pghackers discussion.	2002-08-05 03:29:17 +00:00
Bruce Momjian	5e6528adf7	* -Remove LockMethodTable.prio field, not used (Bruce)	2002-08-01 05:18:34 +00:00
Bruce Momjian	b75fcf9326	Complete TODO item: * -HOLDER/HOLDERTAB rename to PROCLOCK/PROCLOCKTAG	2002-07-19 00:17:40 +00:00
Bruce Momjian	981d045e88	Complete TODO item: * Merge LockMethodCtl and LockMethodTable into one shared structure (Bruce)	2002-07-18 23:06:20 +00:00
Bruce Momjian	4db8718e84	Add SET statement_timeout capability. Timeout is in ms. A value of zero turns off the timer.	2002-07-13 01:02:14 +00:00
Bruce Momjian	33f1687879	There already was a macro PageGetItemId; this is now used in (almost) all places, where pd_linp is accessed. Also introduce new macros SizeOfPageHeaderData and BTMaxItemSize. This is just source code cosmetic, no behaviour changed. Manfred Koizar	2002-07-02 05:48:44 +00:00
Bruce Momjian	8864603f3c	Minor code cleanup in bufmgr.c and bufmgr.h, mainly by moving repeated lines of code into internal routines (drop_relfilenode_buffers, release_buffer) and by hiding unused routines (PrintBufferDescs, PrintPinnedBufs) behind #ifdef NOT_USED. Remove AbortBufferIO() declaration from bufmgr.c (already declared in bufmgr.h) Manfred Koizar	2002-07-02 05:47:37 +00:00
Bruce Momjian	d84fe82230	Update copyright to 2002.	2002-06-20 20:29:54 +00:00
Bruce Momjian	6e8a1a6717	WriteBuffer return value: >I'd vote for changing WriteBuffer to >return void, and have it elog() on bad argument. Manfred Koizar	2002-06-15 19:59:59 +00:00
Bruce Momjian	918e864f14	Remove some pre-WAL relics: SharedBufferChanged BufferRelidLastDirtied BufferTagLastDirtied BufferDirtiedByMe Manfred Koizar	2002-06-15 19:55:38 +00:00
Jan Wieck	469cb65aca	Katherine Ward wrote: > Changes to avoid collisions with WIN32 & MFC names... > 1. Renamed: > a. PROC => PGPROC > b. GetUserName() => GetUserNameFromId() > c. GetCurrentTime() => GetCurrentDateTime() > d. IGNORE => IGNORE_DTF in include/utils/datetime.h & utils/adt/datetim > > 2. Added _P to some lex/yacc tokens: > CONST, CHAR, DELETE, FLOAT, GROUP, IN, OUT Jan	2002-06-11 13:40:53 +00:00
Tom Lane	3f4d488022	Mark index entries "killed" when they are no longer visible to any transaction, so as to avoid returning them out of the index AM. Saves repeated heap_fetch operations on frequently-updated rows. Also detect queries on unique keys (equality to all columns of a unique index), and don't bother continuing scan once we have found first match. Killing is implemented in the btree and hash AMs, but not yet in rtree or gist, because there isn't an equally convenient place to do it in those AMs (the outer amgetnext routine can't do it without re-pinning the index page). Did some small cleanup on APIs of HeapTupleSatisfies, heap_fetch, and index_insert to make this a little easier.	2002-05-24 18:57:57 +00:00
Tom Lane	959e61e917	Remove global variable scanCommandId in favor of storing a command ID in snapshots, per my proposal of a few days ago. Also, tweak heapam.c routines (heap_insert, heap_update, heap_delete, heap_mark4update) to be passed the command ID to use, instead of doing GetCurrentCommandID. For catalog updates they'll still get passed current command ID, but for updates generated from the main executor they'll get passed the command ID saved in the snapshot the query is using. This should fix some corner cases associated with functions and triggers that advance current command ID while an outer query is still in progress.	2002-05-21 22:05:55 +00:00
Tom Lane	44fbe20d62	Restructure indexscan API (index_beginscan, index_getnext) per yesterday's proposal to pghackers. Also remove unnecessary parameters to heap_beginscan, heap_rescan. I modified pg_proc.h to reflect the new numbers of parameters for the AM interface routines, but did not force an initdb because nothing actually looks at those fields.	2002-05-20 23:51:44 +00:00
Tom Lane	72a3902a66	Create an internal semaphore API that is not tied to SysV semaphores. As proof of concept, provide an alternate implementation based on POSIX semaphores. Also push the SysV shared-memory implementation into a separate file so that it can be replaced conveniently.	2002-05-05 00:03:29 +00:00
Tom Lane	1a69a37d5b	Fix obsolete comments.	2002-05-03 17:42:11 +00:00
Tom Lane	c2def1b128	Fix backslash-n typo, per Joe Conway.	2002-05-02 21:44:43 +00:00
Bruce Momjian	171824087c	The patch I sent to -patches a little while ago wasn't applied: it was in the thread "make BufferGetBlockNumber() a macro". Tom objected to the original patch, so I prepared a new one which doesn't change BufferGetBlockNumber() into a macro, it just cleans up some comments and fixes an assertion. The patch is attached. Neil Conway	2002-04-15 23:47:12 +00:00
Bruce Momjian	33d1bb76c6	The attached patch corrects an inaccuracy in src/backend/catalog/README and fixes a few spelling mistakes in src/backend/lmgr/README. Neil Conway	2002-04-15 23:46:13 +00:00
Bruce Momjian	b73859db8c	Patch against 7.2.1 sources. Uses Solaris Intimate Shared Memory for Solaris on SPARC. Scott Brunza (sbrunza@sonalysts.com) gets credit for identifying the issue, making the change, and doing the regression tests. Earlier testing on 7.2rc2 and 7.2 showed performance gains of 1% to 10% on pgbench, osdb-pg, and some locally developed apps. Solaris Intimate Shared Memory is described in "SOLARIS INTERNALS Core Kernel Components" by Jim Mauro and Richard McDougall, Copyright 2001 Sun Microsystems, Inc. ISBN 0-13-022496-0 P.J. "Josh" Rovero	2002-04-13 19:52:51 +00:00
Bruce Momjian	3cbe6b2478	Looks like a small patch is needed as well to do the right thing on Linux. The patch enables the mips2 ISA for the ll/sc operations, and then restores it when done. The kernel/libc emulation code will take over on CPUs without ll/sc, and on CPUs with it, it'll use the operations provided by the CPU. Combined with the earlier fix (removing -mips2), postgresql builds again on mips and mipsel. The patch is against 7.2-7. Oliver Elphick	2002-04-05 11:38:13 +00:00
Bruce Momjian	92288a1cf9	Change made to elog: o Change all current CVS messages of NOTICE to WARNING. We were going to do this just before 7.3 beta but it has to be done now, as you will see below. o Change current INFO messages that should be controlled by client_min_messages to NOTICE. o Force remaining INFO messages, like from EXPLAIN, VACUUM VERBOSE, etc. to always go to the client. o Remove INFO from the client_min_messages options and add NOTICE. Seems we do need three non-ERROR elog levels to handle the various behaviors we need for these messages. Regression passed.	2002-03-06 06:10:59 +00:00
Tom Lane	cfae62c476	Some kibitzing about appropriate elog levels for sinval messages.	2002-03-02 23:35:57 +00:00
Bruce Momjian	a033daf566	Commit to match discussed elog() changes. Only update is that LOG is now just below FATAL in server_min_messages. Added more text to highlight ordering difference between it and client_min_messages. REALLYFATAL => PANIC; STOP => PANIC; New INFO level that prints to client by default; New LOG level that prints to server log by default; Cause VACUUM information to print only to the client; NOTICE => INFO where purely information messages are sent; DEBUG => LOG for purely server status messages; DEBUG removed, kept as backward compatible; DEBUG5, DEBUG4, DEBUG3, DEBUG2, DEBUG1 added; DebugLvl removed in favor of new DEBUG[1-5] symbols; New server_min_messages GUC parameter with values: DEBUG[5-1], INFO, NOTICE, ERROR, LOG, FATAL, PANIC; New client_min_messages GUC parameter with values: DEBUG[5-1], LOG, INFO, NOTICE, ERROR, FATAL, PANIC; Server startup now logged with LOG instead of DEBUG; Remove debug_level GUC parameter; elog() numbers now start at 10; Add test to print error message if older elog() values are passed to elog(); Bootstrap mode now has a -d that requires an argument, like postmaster	2002-03-02 21:39:36 +00:00
Tom Lane	d99fb0d909	Don't Assert() that fsync() and close() never fail; I have seen this crash on Solaris when over disk quota. Instead, report such failures via elog(DEBUG).	2002-02-10 22:56:31 +00:00
Tom Lane	bef0c8dc29	Add cast to suppress gcc warning on Darwin platform.	2002-01-30 19:34:55 +00:00
Tom Lane	386f1809a7	Fix logic error in insert_fsm_page_entry: because compact_fsm_page_list removes any empty chunks, the chunk previously added won't be there anymore, so it's possible there is zero free space in the rel's page list afterwards. Must loop back and rerun the part that adds a chunk to the list.	2002-01-24 15:31:43 +00:00
Tom Lane	aa00e6134e	Add more sanity-checking to PageAddItem and PageIndexTupleDelete, to prevent spreading of corruption when page header pointers are bad. Merge PageZero into PageInit, since it was never used separately, and remove separate memset calls used at most other PageInit call points. Remove IndexPageCleanup, which wasn't used at all.	2002-01-15 22:14:17 +00:00
Tom Lane	5b9a058384	Tweak LWLock algorithms so that an awakened waiter for a lock is not granted the lock when awakened; the signal now only means that the lock is potentially available. The waiting process must retry its attempt to get the lock when it gets to run. This allows the lock releasing process to re-acquire the lock later in its timeslice. Since LWLocks are usually held for short periods, it is possible for a process to acquire and release the same lock many times in a timeslice. The old spinlock-based implementation of these locks allowed for that; but the original coding of LWLock would force a process swap for each acquisition if there was any contention. Although this approach reopens the door to process starvation (a waiter might repeatedly fail to get the lock), the odds of that being a big problem seem low, and the performance cost of the previous approach is considerable.	2002-01-07 16:33:00 +00:00
Bruce Momjian	6f901b6f5a	Oops, only wanted datetime.c changes in there. lock stuff reversed out.	2001-12-29 21:30:32 +00:00
Bruce Momjian	9e7b9c6f54	Fix newly introduced datetime.c compile failure; not enough parens.	2001-12-29 21:28:18 +00:00
Tom Lane	198152730b	Improve LOCK_DEBUG logging code for LWLocks.	2001-12-28 23:26:04 +00:00
Tom Lane	d3fc362ec2	Ensure that all direct uses of spinlock-protected data structures use 'volatile' pointers to access those structures, so that optimizing compilers will not decide to move the structure accesses outside of the spinlock-acquire-to-spinlock-release sequence. There are no known bugs in these uses at present, but based on bad experience with lwlock.c, it seems prudent to ensure that we protect these other uses too. Per pghackers discussion around 12-Dec. (Note: it should not be necessary to worry about structures protected by LWLocks, since the LWLock acquire and release operations are not inline macros.)	2001-12-28 18:16:43 +00:00
Tom Lane	584f818bef	Declare LWLock pointers as volatile to prevent AIX compiler from reordering operations at its whim. Releasing TAS lock before we've finished updating proc structure is uncool.	2001-12-10 21:13:50 +00:00
Tom Lane	f6ee99a062	Clean up usage-statistics display code (ShowUsage and friends). StatFp is gone, usage messages now go through elog(DEBUG).	2001-11-10 23:51:14 +00:00
Bruce Momjian	77e4fd889c	Fix indenting for 'extern "C"' cases.	2001-11-08 20:37:52 +00:00
Tom Lane	64af43a15f	Add casts to suppress compiler warnings observed on Darwin platform (surprised no one has reported these yet...)	2001-11-08 04:05:13 +00:00
Tom Lane	ca7578d454	The extra semaphore that proc.c now allocates for checkpoint processes should be accounted for in the PROC_SEM_MAP_ENTRIES() macro. Otherwise the ports that rely on this macro to size data structures are broken. Mea culpa.	2001-11-06 00:38:26 +00:00
Bruce Momjian	ea08e6cd55	New pgindent run with fixes suggested by Tom. Patch manually reviewed, initdb/regression tests pass.	2001-11-05 17:46:40 +00:00
Tom Lane	d556920a98	Remove ill-considered Assert.	2001-11-05 01:34:37 +00:00
Tom Lane	fb5f1b2c13	Merge three existing ways of signaling postmaster from child processes, so that only one signal number is used not three. Flags in shared memory tell the reason(s) for the current signal. This method is extensible to handle more signal reasons without chewing up even more signal numbers, but the immediate reason is to keep pg_pwd reloads separate from SIGHUP processing in the postmaster. Also clean up some problems in the postmaster with delayed response to checkpoint status changes --- basically, it wouldn't schedule a checkpoint if it wasn't getting connection requests on a regular basis.	2001-11-04 19:55:31 +00:00
Bruce Momjian	c41b6b1b9c	Fix small problem Tom Lane found with pgindent run.	2001-10-30 05:38:56 +00:00
Bruce Momjian	6783b2372e	Another pgindent run. Fixes enum indenting, and improves #endif spacing. Also adds space for one-line comments.	2001-10-28 06:26:15 +00:00
Bruce Momjian	b81844b173	pgindent run on all C files. Java run to follow. initdb/regression tests pass.	2001-10-25 05:50:21 +00:00
Tom Lane	087771ae40	Add error checking to PageRepairFragmentation to ensure that it can never overwrite adjacent pages with copied data, even if page header and/or item pointers are already corrupt. Change inspired by trouble report from Alvaro Herrera.	2001-10-23 02:20:15 +00:00
Tom Lane	8a52b893b3	Further cleanup of dynahash.c API, in pursuit of portability and readability. Bizarre '(long *) TRUE' return convention is gone, in favor of just raising an error internally in dynahash.c when we detect hashtable corruption. HashTableWalk is gone, in favor of using hash_seq_search directly, since it had no hope of working with non-LONGALIGNable datatypes. Simplify some other code that was made undesirably grotty by proximity to HashTableWalk.	2001-10-05 17:28:13 +00:00
Tom Lane	c7a7107f41	Revise shmget() and semget() failure messages to mention the possibility of coping by reducing shared_buffers/max_connections settings.	2001-10-01 23:26:55 +00:00
Tom Lane	0648d78ac4	Make inclusion logic for sys/sem.h and sys/ipc.h consistent across all the files that need them. Per trouble report from Teodor.	2001-10-01 18:16:35 +00:00
Bruce Momjian	77d2622498	Add sys/types.h for FreeBSD compile. Teodor Sigaev	2001-10-01 17:52:34 +00:00
Tom Lane	5999e78fc4	Another round of cleanups for dynahash.c (maybe it's finally clean of portability issues). Caller-visible data structures are now allocated on MAXALIGN boundaries, allowing safe use of datatypes wider than 'long'. Rejigger hash_create API so that caller specifies size of key and total size of entry, not size of key and size of rest of entry. This simplifies life considerably since each number is just a sizeof(), and padding issues etc. are taken care of automatically.	2001-10-01 05:36:17 +00:00
Tom Lane	f9f258281e	Create a GUC parameter max_files_per_process that is a configurable upper limit on what we will believe from sysconf(_SC_OPEN_MAX). The default value is 1000, so that under ordinary conditions it won't affect the behavior. But on platforms where the kernel promises far more than it can deliver, this can be used to prevent running out of file descriptors. See numerous past discussions, eg, pgsql-hackers around 23-Dec-2000.	2001-09-30 18:57:45 +00:00
Bruce Momjian	0386ccfed1	Back out change. Too many places to change too close to beta: * HOLDER/HOLDERTAB rename to PROCLOCKLINK/PROCLOCKLINKTAG (Bruce) Will return later.	2001-09-30 00:45:48 +00:00
Bruce Momjian	f738747494	Do this TODO item: * HOLDER/HOLDERTAB rename to PROCLOCK/PROCLOCKTAG (Tom) Didn't use PROCLOCKLINK because it made PROCLOCKLINKTAG too long.	2001-09-29 21:35:14 +00:00
Tom Lane	2a314add00	Whoops, I was a tad too enthusiastic about using shared lock mode for SInvalLock. GetSnapshotData(true) has to use exclusive lock, since it sets MyProc->xmin.	2001-09-29 15:29:48 +00:00
Tom Lane	499abb0c0f	Implement new 'lightweight lock manager' that's intermediate between existing lock manager and spinlocks: it understands exclusive vs shared lock but has few other fancy features. Replace most uses of spinlocks with lightweight locks. All remaining uses of spinlocks have very short lock hold times (a few dozen instructions), so tweak spinlock backoff code to work efficiently given this assumption. All per my proposal on pghackers 26-Sep-01.	2001-09-29 04:02:27 +00:00
Tom Lane	90aebf7f52	Move s_lock.c and spin.c into lmgr subdirectory, which seems a much more reasonable location for them.	2001-09-27 19:10:02 +00:00
Tom Lane	3d59ad00e8	Remove useless LockDisable() function and associated overhead, per my proposal of 26-Aug.	2001-09-27 16:29:13 +00:00
Tom Lane	35b7601b04	Add an overall timeout on the client authentication cycle, so that a hung client or lost connection can't indefinitely block a postmaster child (not to mention the possibility of deliberate DoS attacks). Timeout is controlled by new authentication_timeout GUC variable, which I set to 60 seconds by default ... does that seem reasonable?	2001-09-21 17:06:12 +00:00
Tom Lane	863aceb54f	Get rid of PID entries in shmem hash table; there is no longer any need for them, and making them just wastes time during backend startup/shutdown. Also, remove compile-time MAXBACKENDS limit per long-ago proposal. You can now set MaxBackends as high as your kernel can stand without any reconfiguration/recompilation.	2001-09-07 00:27:30 +00:00
Tom Lane	763554393a	Fix code so that we recover cleanly if there are no free semaphores available in freeSemMap. As noted by Tatsuo, this is now a likely scenario for detecting MaxBackends-exceeded; if MaxBackends is a multiple of PROC_NSEMS_PER_SET then we will fail here and not in sinval.c. The cleanup path did not work correctly before, anyway.	2001-09-04 21:42:17 +00:00
Tom Lane	b553cba15a	Clean up the lock state properly when aborting because of early deadlock detection in ProcSleep(). Bug noted by Tomasz Zielonka --- how did this escape detection for this long??	2001-09-04 02:26:57 +00:00
Peter Eisentraut	3c59a9e3b7	Bring references to ipcclean in sync with reality.	2001-09-04 00:22:34 +00:00
Peter Eisentraut	b1a38a4380	Install the SQL command man pages into a section appropriate for each system. Some systems did not understand the 'l' section, and in general it wasn't entirely appropriate. On SCO OpenServer, the man pages won't be installed at all until someone figures out their man system.	2001-08-29 19:14:40 +00:00
Peter Eisentraut	f45b7270b6	Whoops, wrong logic.	2001-08-29 11:54:12 +00:00
Peter Eisentraut	dd225655b9	Change the conditionals so the mips + gcc code here doesn't apply for Irix. The code in s_lock.h should get used. report from Bruno Mattarollo <bruno@web1.greenpeace.org>	2001-08-28 15:04:27 +00:00
Tom Lane	bc7d37a525	Transaction IDs wrap around, per my proposal of 13-Aug-01. More documentation to come, but the code is all here. initdb forced.	2001-08-26 16:56:03 +00:00
Tom Lane	2589735da0	Replace implementation of pg_log as a relation accessed through the buffer manager with 'pg_clog', a specialized access method modeled on pg_xlog. This simplifies startup (don't need to play games to open pg_log; among other things, OverrideTransactionSystem goes away), should improve performance a little, and opens the door to recycling commit log space by removing no-longer-needed segments of the commit log. Actual recycling is not there yet, but I felt I should commit this part separately since it'd still be useful if we chose not to do transaction ID wraparound.	2001-08-25 18:52:43 +00:00
Peter Eisentraut	968d7733a1	Rename config.h to pg_config.h and os.h to pg_config_os.h, fix a number of places that were including the wrong files.	2001-08-24 14:07:50 +00:00
Tom Lane	7326e78c42	Ensure that all TransactionId comparisons are encapsulated in macros (TransactionIdPrecedes, TransactionIdFollows, etc). First step on the way to transaction ID wrap solution ...	2001-08-23 23:06:38 +00:00
Tom Lane	ef6ccb0bcc	Cleanup some minor oversights in optional-OIDs stuff.	2001-08-10 20:52:25 +00:00
Bruce Momjian	3e51868226	This patch is because Hurd does not support NOFILE. It is against current cvs. The Debian bug report says, "The upstream source makes use of NOFILE unconditionalized. As the Hurd doesn't have an arbitrary limit on the number of open files, this is not defined. But _SC_OPEN_MAX works fine and returns 1024 (applications can increase this as they want), so I suggest the below diff. Please forward this upstream, too." Oliver Elphick	2001-08-04 19:42:34 +00:00
Tom Lane	8a59f336bb	Minor performance improvement in MultiRecordFreeSpace.	2001-07-19 21:25:37 +00:00
Tom Lane	ed5c4e4a14	Improve documentation about reasoning behind the order of operations in GetSnapshotData, GetNewTransactionId, CommitTransaction, AbortTransaction, etc. Correct race condition in transaction status testing in HeapTupleSatisfiesVacuum --- this wasn't important for old VACUUM with exclusive lock on its table, but it sure is important now. All per pghackers discussion 7/11/01 and 7/12/01.	2001-07-16 22:43:34 +00:00
Tom Lane	b9f3a929ee	Create a new HeapTupleSatisfiesVacuum() routine in tqual.c that embodies the validity checking rules for VACUUM. Make some other rearrangements of the VACUUM code to allow more code to be shared between full and lazy VACUUM. Minor code cleanups and added comments for TransactionId manipulations.	2001-07-12 04:11:13 +00:00
Tom Lane	4fe42dfbc3	Add SHARE UPDATE EXCLUSIVE lock mode, coming soon to a VACUUM near you. Name chosen per pghackers discussion around 6/22/01.	2001-07-09 22:18:34 +00:00
Tom Lane	55432fedd2	Implement LockBufferForCleanup(), which will allow concurrent VACUUM to wait until it's safe to remove tuples and compact free space in a shared buffer page. Miscellaneous small code cleanups in bufmgr, too.	2001-07-06 21:04:26 +00:00
Tom Lane	42748087c1	First non-stub implementation of shared free space map. It's not super useful as yet, since its primary source of information is (full) VACUUM, which makes a concerted effort to get rid of free space before telling the map about it ... next stop is concurrent VACUUM ...	2001-07-02 20:50:46 +00:00
Tom Lane	a29f6c095c	Make the found-a-buffer-when-we-were-expecting-to-extend-the-rel path actually work. It had been throwing an Assert as of my recent changes to bufmgr.c, but was not really right even before that AFAICT.	2001-07-02 18:47:18 +00:00
Tom Lane	af5ced9cfd	Further work on connecting the free space map (which is still just a stub) into the rest of the system. Adopt a cleaner approach to preventing deadlock in concurrent heap_updates: allow RelationGetBufferForTuple to select any page of the rel, and put the onus on it to lock both buffers in a consistent order. Remove no-longer-needed isExtend hack from API of ReleaseAndReadBuffer.	2001-06-29 21:08:25 +00:00
Tom Lane	e0c9301c87	Install infrastructure for shared-memory free space map. Doesn't actually do anything yet, but it has the necessary connections to initialization and so forth. Make some gestures towards allowing number of blocks in a relation to be BlockNumber, ie, unsigned int, rather than signed int. (I doubt I got all the places that are sloppy about it, yet.) On the way, replace the hardwired NLOCKS_PER_XACT fudge factor with a GUC variable.	2001-06-27 23:31:40 +00:00
Jan Wieck	8d80b0d980	Statistical system views (yet without the config stuff, but it's hard to keep such massive changes in sync with the tree so I need to get it in and work from there now). Jan	2001-06-22 19:16:24 +00:00
Tom Lane	d8d9ed931e	Add support to lock manager for conditionally locking a lock (ie, return without waiting if we can't get the lock immediately). Not used yet, but will be needed for concurrent VACUUM.	2001-06-22 00:04:59 +00:00
Tom Lane	bbbc00af88	Clean up some longstanding problems in shared-cache invalidation. SI messages now include the relevant database OID, so that operations in one database do not cause useless cache flushes in backends attached to other databases. Declare SI messages properly using a union, to eliminate the former assumption that Oid is the same size as int or Index. Rewrite the nearly-unreadable code in inval.c, and document it better. Arrange for catcache flushes at end of command/transaction to happen before relcache flushes do --- this avoids loading a new tuple into the catcache while setting up new relcache entry, only to have it be flushed again immediately.	2001-06-19 19:42:16 +00:00
Bruce Momjian	49ce6fff1d	Allow removal of system-named pg_* temp tables. Rename temp file/dir as pgsql_tmp.	2001-06-18 16:13:21 +00:00
Tom Lane	2917f0a5dd	Tweak startup sequence so that running out of PROC array slots is detected sooner in backend startup, and is treated as an expected error (it gives 'Sorry, too many clients already' now). This allows us not to have to enforce the MaxBackends limit exactly in the postmaster. Also, remove ProcRemove() and fold its functionality into ProcKill(). There's no good reason for a backend not to be responsible for removing its PROC entry, and there are lots of good reasons for the postmaster not to be touching shared-memory data structures.	2001-06-16 22:58:17 +00:00
Tom Lane	1d584f97b9	Clean up various to-do items associated with system indexes: pg_database now has unique indexes on oid and on datname. pg_shadow now has unique indexes on usename and on usesysid. pg_am now has unique index on oid. pg_opclass now has unique index on oid. pg_amproc now has unique index on amid+amopclaid+amprocnum. Remove pg_rewrite's unnecessary index on oid, delete unused RULEOID syscache. Remove index on pg_listener and associated syscache for performance reasons (caching rows that are certain to change before you need 'em again is rather pointless). Change pg_attrdef's nonunique index on adrelid into a unique index on adrelid+adnum. Fix various incorrect settings of pg_class.relisshared, make that the primary reference point for whether a relation is shared or not. IsSharedSystemRelationName() is now only consulted to initialize relisshared during initial creation of tables and indexes. In theory we might now support shared user relations, though it's not clear how one would get entries for them into pg_class &etc of multiple databases. Fix recently reported bug that pg_attribute rows created for an index all have the same OID. (Proof that non-unique OID doesn't matter unless it's actually used to do lookups ;-)) There's no need to treat pg_trigger, pg_attrdef, pg_relcheck as bootstrap relations. Convert them into plain system catalogs without hardwired entries in pg_class and friends. Unify global.bki and template1.bki into a single init script postgres.bki, since the alleged distinction between them was misleading and pointless. Not to mention that it didn't work for setting up indexes on shared system relations. Rationalize locking of pg_shadow, pg_group, pg_attrdef (no need to use AccessExclusiveLock where ExclusiveLock or even RowExclusiveLock will do). Also, hold locks until transaction commit where necessary.	2001-06-12 05:55:50 +00:00
Tom Lane	2a6f7ac456	Move temporary files into 'pg_tempfiles' subdirectory of each database directory (which can be made a symlink to put temp files on another disk). Add code to delete leftover temp files during postmaster startup. Bruce, with some kibitzing from Tom.	2001-06-11 04:12:29 +00:00
Tom Lane	bdadc9bf1c	Remove RelationGetBufferWithBuffer(), which is horribly confused about appropriate pin-count manipulation, and instead use ReleaseAndReadBuffer. Make use of the fact that the passed-in buffer (if there is one) must be pinned to avoid grabbing the bufmgr spinlock when we are able to return this same buffer. Eliminate unnecessary 'previous tuple' and 'next tuple' fields of HeapScanDesc and IndexScanDesc, thereby removing a whole lot of bookkeeping from heap_getnext() and related routines.	2001-06-09 18:16:59 +00:00
Tom Lane	1173344e74	Adjust WAL code so that checkpoints truncate the xlog at the previous checkpoint's redo pointer, not its undo pointer, per discussion in pghackers a few days ago. No point in hanging onto undo information until we have the ability to do something with it --- and this solves a rather large problem with log space for long-running transactions. Also, change all calls of write() to detect the case where write returned a count less than requested, but failed to set errno. Presume that this situation indicates ENOSPC, and give the appropriate error message, rather than a random message associated with the previous value of errno.	2001-06-06 17:07:46 +00:00
Tom Lane	ddd96e1f21	Guard against malloc failure. Also, don't examine segP->lastBackend until we hold the spinlock.	2001-06-01 20:07:16 +00:00
Bruce Momjian	33f2614aa1	Remove SEP_CHAR, replace with / or '/' as appropriate.	2001-05-30 14:15:27 +00:00
Bruce Momjian	f6923ff3ac	Oops, only wanted python change in the last commit. Backing out.	2001-05-25 15:45:34 +00:00
Bruce Momjian	dffb673692	While changing Cygwin Python to build its core as a DLL (like Win32 Python) to support shared extension modules, I have learned that Guido prefers the style of the attached patch to solve the above problem. I feel that this solution is particularly appropriate in this case because the following: PglargeType PgType PgQueryType are already being handled in the way that I am proposing for PgSourceType. Jason Tishler	2001-05-25 15:34:50 +00:00
Bruce Momjian	dc0ff5c67a	Small code cleanups, formatting.	2001-05-18 21:24:20 +00:00
Tom Lane	eedb7d18fa	Modify RelationGetBufferForTuple() so that we only do lseek and lock when we need to move to a new page; as long as we can insert the new tuple on the same page as before, we only need LockBuffer and not the expensive stuff. Also, twiddle bufmgr interfaces to avoid redundant lseeks in RelationGetBufferForTuple and BufferAlloc. Successive inserts now require one lseek per page added, rather than one per tuple with several additional ones at each page boundary as happened before. Lock contention when multiple backends are inserting in same table is also greatly reduced.	2001-05-12 19:58:28 +00:00
Tom Lane	642107d5ba	Avoid unnecessary lseek() calls by cleanups in md.c. mdfd_lstbcnt was not being consulted anywhere, so remove it and remove the _mdnblocks() calls that were used to set it. Change smgrextend interface to pass in the target block number (ie, current file length) --- the caller always knows this already, having already done smgrnblocks(), so it's silly to do it over again inside mdextend. Net result: extension of a file now takes one lseek(SEEK_END) and a write(), not three lseeks and a write.	2001-05-10 20:38:49 +00:00
Bruce Momjian	82c9ce2c40	Small cleanup.	2001-05-08 19:00:26 +00:00
Bruce Momjian	415263b2d2	> Occasionally and without warning I get this from my daily vacuum > cronjob: > NOTICE: RegisterSharedInvalid: SI buffer overflow > NOTICE: InvalidateSharedInvalid: cache state reset > I don't understand what these mean. Should I be concerned about them > and what do they signify? No real need to worry. Those should've been downgraded to DEBUG-level messages a release or two back, but nobody bothered... Tom Lane	2001-05-07 17:20:19 +00:00
Tom Lane	08bf4d797b	Check for failure of malloc() and realloc() when allocating space for VFD entries. On platforms where dereferencing a null pointer doesn't lead to coredump, it's possible that this omission could have led to unpleasant behavior like deleting the wrong file.	2001-04-03 04:07:02 +00:00
Tom Lane	6cc6f18d15	open(2) flags saved for re-opening a virtual file should probably not include O_CREAT.	2001-04-03 02:31:52 +00:00
Tom Lane	244fd47124	_mdfd_getrelnfd() should include kernel error code in failure message.	2001-04-02 23:20:24 +00:00
Tom Lane	ff71301806	Spell __volatile__ correctly.	2001-03-27 01:16:24 +00:00
Tom Lane	ccd415c63f	Fix unportable assumptions about alignment of local char[n] variables.	2001-03-25 23:23:59 +00:00
Bruce Momjian	7cf952e7b4	Fix comments that were mis-wrapped, for Tom Lane.	2001-03-23 04:49:58 +00:00
Bruce Momjian	0686d49da0	Remove dashes in comments that don't need them, rewrap with pgindent.	2001-03-22 06:16:21 +00:00
Bruce Momjian	9e1552607a	pgindent run. Make it all clean.	2001-03-22 04:01:46 +00:00
Vadim B. Mikheev	ab36582a19	Check bufHdr->cntxDirty and call StartBufferIO in BufferSync() before acquiring shlock on buffer context. This way we should be protected against conflicts with FlushRelationBuffers. (Seems we never do excl lock and then StartBufferIO for the same buffer, so there should be no deadlock here, - but we'd better check this very soon).	2001-03-21 10:13:29 +00:00
Tom Lane	af6e88a9cf	Remove NEXTXID xlog record type to avoid three-way deadlock risk. NEXTXID isn't really necessary, per previous discussion in pghackers, but I mulishly insisted we should put it in anyway. Mea culpa.	2001-03-18 20:18:59 +00:00
Tom Lane	ddc5bc958a	When we add 'waiting' to the ps_status display, there should be a space in front of it. Improve comments a little.	2001-03-18 20:13:13 +00:00
Bruce Momjian	9de4b77cee	'waiting' status display had extra space, removed. Change the administrator to 'an' administrator.	2001-03-14 18:24:34 +00:00
Tom Lane	4d14fe0048	XLOG (and related) changes: * Store two past checkpoint locations, not just one, in pg_control. On startup, we fall back to the older checkpoint if the newer one is unreadable. Also, a physical copy of the newest checkpoint record is kept in pg_control for possible use in disaster recovery (ie, complete loss of pg_xlog). Also add a version number for pg_control itself. Remove archdir from pg_control; it ought to be a GUC parameter, not a special case (not that it's implemented yet anyway). * Suppress successive checkpoint records when nothing has been entered in the WAL log since the last one. This is not so much to avoid I/O as to make it actually useful to keep track of the last two checkpoints. If the things are right next to each other then there's not a lot of redundancy gained... * Change CRC scheme to a true 64-bit CRC, not a pair of 32-bit CRCs on alternate bytes. Polynomial borrowed from ECMA DLT1 standard. * Fix XLOG record length handling so that it will work at BLCKSZ = 32k. * Change XID allocation to work more like OID allocation. (This is of dubious necessity, but I think it's a good idea anyway.) * Fix a number of minor bugs, such as off-by-one logic for XLOG file wraparound at the 4 gig mark. * Add documentation and clean up some coding infelicities; move file format declarations out to include files where planned contrib utilities can get at them. * Checkpoint will now occur every CHECKPOINT_SEGMENTS log segments or every CHECKPOINT_TIMEOUT seconds, whichever comes first. It is also possible to force a checkpoint by sending SIGUSR1 to the postmaster (undocumented feature...) * Defend against kill -9 postmaster by storing shmem block's key and ID in postmaster.pid lockfile, and checking at startup to ensure that no processes are still connected to old shmem block (if it still exists). * Switch backends to accept SIGQUIT rather than SIGUSR1 for emergency stop, for symmetry with postmaster and xlog utilities. Clean up signal handling in bootstrap.c so that xlog utilities launched by postmaster will react to signals better. * Standalone bootstrap now grabs lockfile in target directory, as added insurance against running it in parallel with live postmaster.	2001-03-13 01:17:06 +00:00
Tom Lane	9c9936587c	Implement COMMIT_SIBLINGS parameter to allow pre-commit delay to occur only if at least N other backends currently have open transactions. This is not a great deal of intelligence about whether a delay might be profitable ... but it beats no intelligence at all. Note that the default COMMIT_DELAY is still zero --- this new code does nothing unless that setting is changed. Also, mark ENABLEFSYNC as a system-wide setting. It's no longer safe to allow that to be set per-backend, since we may be relying on some other backend's fsync to have synced the WAL log.	2001-02-26 00:50:08 +00:00
Tom Lane	496ea7a876	At least on HPUX, select with delay.tv_sec = 0 and delay.tv_usec = 1000000 does not lead to a one-second delay, but to an immediate EINVAL failure. This causes CHECKPOINT to crash with s_lock_stuck much too quickly :-(. Fix by breaking down the requested wait div/mod 1e6.	2001-02-24 22:42:45 +00:00
Tom Lane	e74ce0a566	As long as we're fixing this space calculation, let's actually do it right. We should MAXALIGN the individual items because we'll allocate them individually, not as an array.	2001-02-23 20:12:37 +00:00
Bruce Momjian	81b48493aa	Bruce Momjian <pgman@candle.pha.pa.us> writes: > Is there one LOCKMETHODCTL for every backend? I thought there was only > one of them. >> >> You're right, that line is erroneous; it should read >> >> size += MAX_LOCK_METHODS * MAXALIGN(sizeof(LOCKMETHODCTL)); >> >> Not a significant error but it should be changed for clarity ...	2001-02-23 18:28:46 +00:00
Bruce Momjian	a95ac415f7	More comment cleanups.	2001-02-22 23:20:06 +00:00


1118 Commits