postgresql

Commit Graph

Author	SHA1	Message	Date
Heikki Linnakangas	7e8b0b9ab1	Change error messages to print the physical path, like "base/11517/3767_fsm", instead of symbolic names like "1663/11517/3767/1", per Alvaro's suggestion. I didn't change the messages in the higher-level index, heap and FSM routines, though, where the fork is implicit.	2008-11-11 13:19:16 +00:00
Heikki Linnakangas	15c121b3ed	Rewrite the FSM. Instead of relying on a fixed-size shared memory segment, the free space information is stored in a dedicated FSM relation fork, with each relation (except for hash indexes; they don't use FSM). This eliminates the max_fsm_relations and max_fsm_pages GUC options; remove any trace of them from the backend, initdb, and documentation. Rewrite contrib/pg_freespacemap to match the new FSM implementation. Also introduce a new variant of the get_raw_page(regclass, int4, int4) function in contrib/pageinspect that let's you to return pages from any relation fork, and a new fsm_page_contents() function to inspect the new FSM pages.	2008-09-30 10:52:14 +00:00
Heikki Linnakangas	3f0e808c4a	Introduce the concept of relation forks. An smgr relation can now consist of multiple forks, and each fork can be created and grown separately. The bulk of this patch is about changing the smgr API to include an extra ForkNumber argument in every smgr function. Also, smgrscheduleunlink and smgrdounlink no longer implicitly call smgrclose, because other forks might still exist after unlinking one. The callers of those functions have been modified to call smgrclose instead. This patch in itself doesn't have any user-visible effect, but provides the infrastructure needed for upcoming patches. The additional forks envisioned are a rewritten FSM implementation that doesn't rely on a fixed-size shared memory block, and a visibility map to allow skipping portions of a table in VACUUM that have no dead tuples.	2008-08-11 11:05:11 +00:00
Heikki Linnakangas	a213f1ee6c	Refactor XLogOpenRelation() and XLogReadBuffer() in preparation for relation forks. XLogOpenRelation() and the associated light-weight relation cache in xlogutils.c is gone, and XLogReadBuffer() now takes a RelFileNode as argument, instead of Relation. For functions that still need a Relation struct during WAL replay, there's a new function called CreateFakeRelcacheEntry() that returns a fake entry like XLogOpenRelation() used to.	2008-06-12 09:12:31 +00:00
Bruce Momjian	9098ab9e32	Update copyrights in source tree to 2008.	2008-01-01 19:46:01 +00:00
Bruce Momjian	fdf5a5efb7	pgindent run for 8.3.	2007-11-15 21:14:46 +00:00
Tom Lane	6cc4451b5c	Prevent re-use of a deleted relation's relfilenode until after the next checkpoint. This guards against an unlikely data-loss scenario in which we re-use the relfilenode, then crash, then replay the deletion and recreation of the file. Even then we'd be OK if all insertions into the new relation had been WAL-logged ... but that's not guaranteed given all the no-WAL-logging optimizations that have recently been added. Patch by Heikki Linnakangas, per a discussion last month.	2007-11-15 20:36:40 +00:00
Tom Lane	295e63983d	Implement lazy XID allocation: transactions that do not modify any database rows will normally never obtain an XID at all. We already did things this way for subtransactions, but this patch extends the concept to top-level transactions. In applications where there are lots of short read-only transactions, this should improve performance noticeably; not so much from removal of the actual XID-assignments, as from reduction of overhead that's driven by the rate of XID consumption. We add a concept of a "virtual transaction ID" so that active transactions can be uniquely identified even if they don't have a regular XID. This is a much lighter-weight concept: uniqueness of VXIDs is only guaranteed over the short term, and no on-disk record is made about them. Florian Pflug, with some editorialization by Tom.	2007-09-05 18:10:48 +00:00
Tom Lane	04fbe29a83	Fix WAL replay of truncate operations to cope with the possibility that the truncated relation was deleted later in the WAL sequence. Since replay normally auto-creates a relation upon its first reference by a WAL log entry, failure is seen only if the truncate entry happens to be the first reference after the checkpoint we're restarting from; which is a pretty unusual case but of course not impossible. Fix by making truncate entries auto-create like the other ones do. Per report and test case from Dharmendra Goyal.	2007-07-20 16:29:53 +00:00
Tom Lane	b09cb0cf12	Remove the pgstat_drop_relation() call from smgr_internal_unlink(), because we don't know at that point which relation OID to tell pgstat to forget. The code was passing the relfilenode, which is incorrect, and could possibly cause some other relation's stats to be zeroed out. While we could try to clean this up, it seems much simpler and more reliable to let the next invocation of pgstat_vacuum_tabstat() fix things; which indeed is how it worked before I introduced the buggy code into 8.1.3 and later :-(. Problem noticed by Itagaki Takahiro, fix is per subsequent discussion.	2007-07-08 22:23:16 +00:00
Bruce Momjian	29dccf5fe0	Update CVS HEAD for 2007 copyright. Back branches are typically not back-stamped for this.	2007-01-05 22:20:05 +00:00
Tom Lane	ef07221997	Clean up smgr.c/md.c APIs as per discussion a couple months ago. Instead of having md.c return a success/failure boolean to smgr.c, which was just going to elog anyway, let md.c issue the elog messages itself. This allows better error reporting, particularly in cases such as "short read" or "short write" which Peter was complaining of. Also, remove the kluge of allowing mdread() to return zeroes from a read-beyond-EOF: this is now an error condition except when InRecovery or zero_damaged_pages = true. (Hash indexes used to require that behavior, but no more.) Also, enforce that mdwrite() is to be used for rewriting existing blocks while mdextend() is to be used for extending the relation EOF. This restriction lets us get rid of the old ad-hoc defense against creating huge files by an accidental reference to a bogus block number: we'll only create new segments in mdextend() not mdwrite() or mdread(). (Again, when InRecovery we allow it anyway, since we need to allow updates of blocks that were later truncated away.) Also, clean up the original makeshift patch for bug #2737: move the responsibility for padding relation segments to full length into md.c.	2007-01-03 18:11:01 +00:00
Bruce Momjian	f99a569a2e	pgindent run for 8.2.	2006-10-04 00:30:14 +00:00
Bruce Momjian	e0522505bd	Remove 576 references of include files that were not needed.	2006-07-14 14:52:27 +00:00
Tom Lane	defe93463c	Make the world safe for full_page_writes. Allow XLOG records that try to update no-longer-existing pages to fall through as no-ops, but make a note of each page number referenced by such records. If we don't see a later XLOG entry dropping the table or truncating away the page, complain at the end of XLOG replay. Since this fixes the known failure mode for full_page_writes = off, revert my previous band-aid patch that disabled that GUC variable.	2006-04-14 20:27:24 +00:00
Tom Lane	4243f2387a	Suppress attempts to report dropped tables to the stats collector from a startup or recovery process. Since such a process isn't a real backend, pgstat.c gets confused. This accounts for recent reports of strange "invalid server process ID -1" log messages during crash recovery. There isn't any point in attempting to make the report, since we'll discard stats in such scenarios anyhow.	2006-03-30 22:11:55 +00:00
Tom Lane	0a20207060	Arrange to emit a description of the current XLOG record as error context when an error occurs during xlog replay. Also, replace the former risky 'write into a fixed-size buffer with no overflow detection' API for XLOG record description routines; use an expansible StringInfo instead. (The latter accounts for most of the patch bulk.) Qingqing Zhou	2006-03-24 04:32:13 +00:00
Bruce Momjian	f2f5b05655	Update copyright for 2006. Update scripts.	2006-03-05 15:59:11 +00:00
Tom Lane	d5db3abfb6	Modify pgstats code to reduce performance penalties from oversized stats data files: avoid creating stats hashtable entries for tables that aren't being touched except by vacuum/analyze, ensure that entries for dropped tables are removed promptly, and tweak the data layout to avoid storing useless struct padding. Also improve the performance of pgstat_vacuum_tabstat(), and make sure that autovacuum invokes it exactly once per autovac cycle rather than multiple times or not at all. This should cure recent complaints about 8.1 showing much higher stats I/O volume than was seen in 8.0. It'd still be a good idea to revisit the design with an eye to not re-writing the entire stats dataset every half second ... but that would be too much to backpatch, I fear.	2006-01-18 20:35:06 +00:00
Bruce Momjian	436a2956d8	Re-run pgindent, fixing a problem where comment lines after a blank comment line where output as too long, and update typedefs for /lib directory. Also fix case where identifiers were used as variable names in the backend, but as typedefs in ecpg (favor the backend for indenting). Backpatch to 8.1.X.	2005-11-22 18:17:34 +00:00
Bruce Momjian	1dc3498251	Standard pgindent run for 8.1.	2005-10-15 02:49:52 +00:00
Tom Lane	7117cd3a77	Cause ShutdownPostgres to do a normal transaction abort during backend exit, instead of trying to take shortcuts. Introduce some additional shutdown callback routines to eliminate kluges like having ProcKill be responsible for shutting down the buffer manager. Ensure that the order of operations during shutdown is predictable and what you would expect given the module layering.	2005-08-08 03:12:16 +00:00
Tom Lane	b95ae32b41	Avoid WAL-logging individual tuple insertions during CREATE TABLE AS (a/k/a SELECT INTO). Instead, flush and fsync the whole relation before committing. We do still need the WAL log when PITR is active, however. Simon Riggs and Tom Lane.	2005-06-20 18:37:02 +00:00
Tom Lane	d0a89683a3	Two-phase commit. Original patch by Heikki Linnakangas, with additional hacking by Alvaro Herrera and Tom Lane.	2005-06-17 22:32:51 +00:00
Tom Lane	ee7ac7b11e	Modify XLogInsert API to make callers specify whether pages to be backed up have the standard layout with unused space between pd_lower and pd_upper. When this is set, XLogInsert will omit the unused space without bothering to scan it to see if it's zero. That saves time in XLogInsert, and also allows reversion of my earlier patch to make PageRepairFragmentation et al explicitly re-zero freed space. Per suggestion by Heikki Linnakangas.	2005-06-06 20:22:58 +00:00
Tom Lane	4c8495a1f2	Remove the mostly-stubbed-out-anyway support routines for WAL UNDO. That code is never going to be used in the foreseeable future, and where it's more than a stub it's making the redo routines harder to read.	2005-06-06 17:01:25 +00:00
Tom Lane	e92a88272e	Modify hash_search() API to prevent future occurrences of the error spotted by Qingqing Zhou. The HASH_ENTER action now automatically fails with elog(ERROR) on out-of-memory --- which incidentally lets us eliminate duplicate error checks in quite a bunch of places. If you really need the old return-NULL-on-out-of-memory behavior, you can ask for HASH_ENTER_NULL. But there is now an Assert in that path checking that you aren't hoping to get that behavior in a palloc-based hash table. Along the way, remove the old HASH_FIND_SAVE/HASH_REMOVE_SAVED actions, which were not being used anywhere anymore, and were surely too ugly and unsafe to want to see revived again.	2005-05-29 04:23:07 +00:00
Tom Lane	354049c709	Remove unnecessary calls of FlushRelationBuffers: there is no need to write out data that we are about to tell the filesystem to drop. smgr_internal_unlink already had a DropRelFileNodeBuffers call to get rid of dead buffers without a write after it's no longer possible to roll back the deleting transaction. Adding a similar call in smgrtruncate simplifies callers and makes the overall division of labor clearer. This patch removes the former behavior that VACUUM would write all dirty buffers of a relation unconditionally.	2005-03-20 22:00:54 +00:00
Tom Lane	0ce4d56924	Phase 1 of fix for 'SMgrRelation hashtable corrupted' problem. This is the minimum required fix. I want to look next at taking advantage of it by simplifying the message semantics in the shared inval message queue, but that part can be held over for 8.1 if it turns out too ugly.	2005-01-10 20:02:24 +00:00
PostgreSQL Daemon	2ff501590b	Tag appropriate files for rc3 Also performed an initial run through of upgrading our Copyright date to extend to 2005 ... first run here was very simple ... change everything where: grep 1996-2004 && the word 'Copyright' ... scanned through the generated list with 'less' first, and after, to make sure that I only picked up the right entries ...	2004-12-31 22:04:05 +00:00
Peter Eisentraut	0ed3c7665e	Small message clarifications	2004-11-05 17:11:34 +00:00
Tom Lane	23645f0582	Fix incorrect ordering of smgr cleanup relative to buffer pin cleanup during transaction abort. Add a regression test case to catch related mistakes in future. Alvaro Herrera and Tom Lane.	2004-09-06 17:56:33 +00:00
Bruce Momjian	15d3f9f6b7	Another pgindent run with lib typedefs added.	2004-08-30 02:54:42 +00:00
Bruce Momjian	b6b71b85bc	Pgindent run for 8.0.	2004-08-29 05:07:03 +00:00
Bruce Momjian	da9a8649d8	Update copyright to 2004.	2004-08-29 04:13:13 +00:00
Tom Lane	1a3de15a3a	Dept. of further reflection: I looked around to see if any other callers of XLogInsert had the same sort of checkpoint interlock problem as RecordTransactionCommit, and indeed I found some. Btree index build and ALTER TABLE SET TABLESPACE write data outside the friendly confines of the buffer manager, and therefore they have to take their own responsibility for checkpoint interlock. The easiest solution seems to be to force smgrimmedsync at the end of the index build or table copy, even when the operation is being WAL-logged. This is sufficient since the new index or table will be of interest to no one if we don't get as far as committing the current transaction.	2004-08-15 23:44:46 +00:00
Tom Lane	fe548629c5	Invent ResourceOwner mechanism as per my recent proposal, and use it to keep track of portal-related resources separately from transaction-related resources. This allows cursors to work in a somewhat sane fashion with nested transactions. For now, cursor behavior is non-subtransactional, that is a cursor's state does not roll back if you abort a subtransaction that fetched from the cursor. We might want to change that later.	2004-07-17 03:32:14 +00:00
Tom Lane	8801110b20	Move TablespaceCreateDbspace() call into smgrcreate(), which is where it probably should have been to begin with; this is to cover cases like needing to recreate the per-db directory during WAL replay. Also, fix heap_create to force pg_class.reltablespace to be zero instead of the database's default tablespace; this makes the world safe for CREATE DATABASE to handle all tables in the default tablespace alike, as per previous discussion. And force pg_class.reltablespace to zero when creating a relation without physical storage (eg, a view); this avoids possibly having dangling references in this column after a subsequent DROP TABLESPACE.	2004-07-11 19:52:52 +00:00
Tom Lane	573a71a5da	Nested transactions. There is still much left to do, especially on the performance front, but with feature freeze upon us I think it's time to drive a stake in the ground and say that this will be in 7.5. Alvaro Herrera, with some help from Tom Lane.	2004-07-01 00:52:04 +00:00
Tom Lane	2467394ee1	Tablespaces. Alternate database locations are dead, long live tablespaces. There are various things left to do: contrib dbsize and oid2name modules need work, and so does the documentation. Also someone should think about COMMENT ON TABLESPACE and maybe RENAME TABLESPACE. Also initlocation is dead, it just doesn't know it yet. Gavin Sherry and Tom Lane.	2004-06-18 06:14:31 +00:00
Tom Lane	2095206de1	Adjust btree index build to not use shared buffers, thereby avoiding the locking conflict against concurrent CHECKPOINT that was discussed a few weeks ago. Also, if not using WAL archiving (which is always true ATM but won't be if PITR makes it into this release), there's no need to WAL-log the index build process; it's sufficient to force-fsync the completed index before commit. This seems to gain about a factor of 2 in my tests, which is consistent with writing half as much data. I did not try it with WAL on a separate drive though --- probably the gain would be a lot less in that scenario.	2004-06-02 17:28:18 +00:00
Tom Lane	91d20ff7aa	Additional mop-up for sync-to-fsync changes: avoid issuing fsyncs for temp tables, and avoid WAL-logging truncations of temp tables. Do issue fsync on truncated files (not sure this is necessary but it seems like a good idea).	2004-05-31 20:31:33 +00:00
Tom Lane	9b178555fc	Per previous discussions, get rid of use of sync(2) in favor of explicitly fsync'ing every (non-temp) file we have written since the last checkpoint. In the vast majority of cases, the burden of the fsyncs should fall on the bgwriter process not on backends. (To this end, we assume that an fsync issued by the bgwriter will force out blocks written to the same file by other processes using other file descriptors. Anyone have a problem with that?) This makes the world safe for WIN32, which ain't even got sync(2), and really makes the world safe for Unixen as well, because sync(2) never had the semantics we need: it offers no way to wait for the requested I/O to finish. Along the way, fix a bug I recently introduced in xlog recovery: file truncation replay failed to clear bufmgr buffers for the dropped blocks, which could result in 'PANIC: heap_delete_redo: no block' later on in xlog replay.	2004-05-31 03:48:10 +00:00
Tom Lane	c3c09be34b	Commit the reasonably uncontroversial parts of J.R. Nield's PITR patch, to wit: Add a header record to each WAL segment file so that it can be reliably identified. Avoid splitting WAL records across segment files (this is not strictly necessary, but makes it simpler to incorporate the header records). Make WAL entries for file creation, deletion, and truncation (as foreseen but never implemented by Vadim). Also, add support for making XLOG_SEG_SIZE configurable at compile time, similarly to BLCKSZ. Fix a couple bugs I introduced in WAL replay during recent smgr API changes. initdb is forced due to changes in pg_control contents.	2004-02-11 22:55:26 +00:00
Tom Lane	87bd956385	Restructure smgr API as per recent proposal. smgr no longer depends on the relcache, and so the notion of 'blind write' is gone. This should improve efficiency in bgwriter and background checkpoint processes. Internal restructuring in md.c to remove the not-very-useful array of MdfdVec objects --- might as well just use pointers. Also remove the long-dead 'persistent main memory' storage manager (mm.c), since it seems quite unlikely to ever get resurrected.	2004-02-10 01:55:27 +00:00
Neil Conway	dfc7e7b71d	Code cleanup, mostly in the smgr: - Update comment in IsReservedName() to the present day - Improve some variable & function names in commands/vacuum.c. I was planning to rewrite this to avoid lappend(), but since I still intend to do the list rewrite, there's no need for that. - Update some smgr comments which seemed to imply that we still forced all dirty pages to disk at commit-time. - Replace some #ifdef DIAGNOSTIC code with assertions. - Make the distinction between OS-level file descriptors and virtual file descriptors a little clearer in a few comments - Other minor comment improvements in the smgr code	2004-01-06 18:07:32 +00:00
Peter Eisentraut	2afacfc403	This patch properly sets the prototype for the on_shmem_exit and on_proc_exit functions, and adjust all other related code to use the proper types too. by Kurt Roeckx	2003-12-12 18:45:10 +00:00
PostgreSQL Daemon	969685ad44	$Header: -> $PostgreSQL Changes ...	2003-11-29 19:52:15 +00:00
Peter Eisentraut	feb4f44d29	Message editing: remove gratuitous variations in message wording, standardize terms, add some clarifications, fix some untranslatable attempts at dynamic message building.	2003-09-25 06:58:07 +00:00
Bruce Momjian	f3c3deb7d0	Update copyrights to 2003.	2003-08-04 02:40:20 +00:00

1 2 3

113 Commits