postgresql

Commit Graph

Author	SHA1	Message	Date
Tom Lane	0007490e09	Convert the arithmetic for shared memory size calculation from 'int' to 'Size' (that is, size_t), and install overflow detection checks in it. This allows us to remove the former arbitrary restrictions on NBuffers etc. It won't make any difference in a 32-bit machine, but in a 64-bit machine you could theoretically have terabytes of shared buffers. (How efficiently we could manage 'em remains to be seen.) Similarly, num_temp_buffers, work_mem, and maintenance_work_mem can be set above 2Gb on a 64-bit machine. Original patch from Koichi Suzuki, additional work by moi.	2005-08-20 23:26:37 +00:00
Tatsuo Ishii	ba2fc7eb4b	Make GetMultiXactIdMembers() a public function.	2005-08-20 01:29:27 +00:00
Tom Lane	f57e3f4cf3	Repair problems with VACUUM destroying t_ctid chains too soon, and with insufficient paranoia in code that follows t_ctid links. (We must do both because even with VACUUM doing it properly, the intermediate state with a dangling t_ctid link is visible concurrently during lazy VACUUM, and could be seen afterwards if either type of VACUUM crashes partway through.) Also try to improve documentation about what's going on. Patch is a bit bulky because passing the XMAX information around required changing the APIs of some low-level heapam.c routines, but it's not conceptually very complicated. Per trouble report from Teodor and subsequent analysis. This needs to be back-patched, but I'll do that after 8.1 beta is out.	2005-08-20 00:40:32 +00:00
Tom Lane	f8d0a82bf9	Avoid an Assert failure if OuterUserId hasn't been set yet during AbortTransaction. This can happen if a backend's InitPostgres transaction fails (eg, because the given username is invalid). Per Alvaro.	2005-08-17 22:14:34 +00:00
Tom Lane	6ea05c16a4	Change a couple of "can't happen" error messages to be a shade more verbose when they do happen. The "left link changed unexpectedly" one in particular has been seen more than once in the field.	2005-08-12 14:34:14 +00:00
Tom Lane	721e53785d	Solve the problem of OID collisions by probing for duplicate OIDs whenever we generate a new OID. This prevents occasional duplicate-OID errors that can otherwise occur once the OID counter has wrapped around. Duplicate relfilenode values are also checked for when creating new physical files. Per my recent proposal.	2005-08-12 01:36:05 +00:00
Tom Lane	d90c531188	Autovacuum loose end mop-up. Provide autovacuum-specific vacuum cost delay and limit, both as global GUCs and as table-specific entries in pg_autovacuum. stats_reset_on_server_start is now OFF by default, but a reset is forced if we did WAL replay. XID-wrap vacuums do not ANALYZE, but do FREEZE if it's a template database. Alvaro Herrera	2005-08-11 21:11:50 +00:00
Bruce Momjian	949ebbd55e	Mention MD5 function index for indexing long values.	2005-08-11 13:22:33 +00:00
Tom Lane	24ff62d76f	Make new hints follow style guide.	2005-08-10 22:39:00 +00:00
Bruce Momjian	237be3cc29	Add hints to cases where indexes fail because of values that are too long.	2005-08-10 21:36:46 +00:00
Tom Lane	4568e0f791	Modify AtEOXact_CatCache and AtEOXact_RelationCache to assume that the ResourceOwner mechanism already released all reference counts for the cache entries; therefore, we do not need to scan the catcache or relcache at transaction end, unless we want to do it as a debugging crosscheck. Do the crosscheck only in Assert mode. This is the same logic we had previously installed in AtEOXact_Buffers to avoid overhead with large numbers of shared buffers. I thought it'd be a good idea to do it here too, in view of Kari Lavikka's recent report showing a real-world case where AtEOXact_CatCache is taking a significant fraction of runtime.	2005-08-08 19:17:23 +00:00
Tom Lane	0001e98d54	Code and docs review for pg_column_size() patch.	2005-08-02 16:11:57 +00:00
Tom Lane	2a4fad1a0e	Add NOWAIT option to SELECT FOR UPDATE/SHARE. Original patch by Hans-Juergen Schoenig, revisions by Karel Zak and Tom Lane.	2005-08-01 20:31:16 +00:00
Tom Lane	d42cf5a42a	Add per-user and per-database connection limit options. This patch also includes preliminary update of pg_dumpall for roles. Petr Jelinek, with review by Bruce Momjian and Tom Lane.	2005-07-31 17:19:22 +00:00
Bruce Momjian	5b0bfec414	Fix compile for no O_SYNC, but introduced with O_DIRECT.	2005-07-30 14:15:44 +00:00
Tom Lane	5d5f1a79e6	Clean up a number of autovacuum loose ends. Make the stats collector track shared relations in a separate hashtable, so that operations done from different databases are counted correctly. Add proper support for anti-XID-wraparound vacuuming, even in databases that are never connected to and so have no stats entries. Miscellaneous other bug fixes. Alvaro Herrera, some additional fixes by Tom Lane.	2005-07-29 19:30:09 +00:00
Bruce Momjian	c6b1724c67	Update O_DIRECT comment.	2005-07-29 03:25:53 +00:00
Bruce Momjian	c34bb00581	Use O_DIRECT if available when using O_SYNC for wal_sync_method. Also, write multiple WAL buffers out in one write() operation. ITAGAKI Takahiro --------------------------------------------------------------------------- > If we disable writeback-cache and use open_sync, the per-page writing > behavior in WAL module will show up as bad result. O_DIRECT is similar > to O_DSYNC (at least on linux), so that the benefit of it will disappear > behind the slow disk revolution. > > In the current source, WAL is written as: > for (i = 0; i < N; i++) { write(&buffers[i], BLCKSZ); } > Is this intentional? Can we rewrite it as follows? > write(&buffers[0], N * BLCKSZ); > > In order to achieve it, I wrote a 'gather-write' patch (xlog.gw.diff). > Aside from this, I'll also send the fixed direct io patch (xlog.dio.diff). > These two patches are independent, so they can be applied either or both. > > > I tested them on my machine and the results as follows. It shows that > direct-io and gather-write is the best choice when writeback-cache is off. > Are these two patches worth trying if they are used together? > > > \| writeback \| fsync= \| fdata \| open_ \| fsync_ \| open_ > patch \| cache \| false \| sync \| sync \| direct \| direct > ------------+-----------+--------+-------+-------+--------+--------- > direct io \| off \| 124.2 \| 105.7 \| 48.3 \| 48.3 \| 48.2 > direct io \| on \| 129.1 \| 112.3 \| 114.1 \| 142.9 \| 144.5 > gather-write\| off \| 124.3 \| 108.7 \| 105.4 \| (N/A) \| (N/A) > both \| off \| 131.5 \| 115.5 \| 114.4 \| 145.4 \| 145.2 > > - 20runs * pgbench -s 100 -c 50 -t 200 > - with tuning (wal_buffers=64, commit_delay=500, checkpoint_segments=8) > - using 2 ATA disks: > - hda(reiserfs) includes system and wal. > - hdc(jfs) includes database files. writeback-cache is always on. > > --- > ITAGAKI Takahiro	2005-07-29 03:22:33 +00:00
Tom Lane	e5d6b91220	Add SET ROLE. This is a partial commit of Stephen Frost's recent patch; I'm still working on the has_role function and information_schema changes.	2005-07-25 22:12:34 +00:00
Bruce Momjian	9af9d674c6	Remove unintended code addition.	2005-07-23 15:31:16 +00:00
Bruce Momjian	4098c8867d	Macro alignment cleanup.	2005-07-23 15:29:47 +00:00
Tom Lane	f2bf2d2dc5	Fix a couple of bogus comments, per Alvaro.	2005-07-13 22:46:09 +00:00
Tom Lane	d7207cfc6b	Even though I'd like to see full_page_writes go away before 8.1, a minimum requirement is that it not completely break the system meanwhile. Put the test in the right place.	2005-07-08 04:07:26 +00:00
Bruce Momjian	a923602855	Add pg_column_size() to return storage size of a column, including possible compression. Mark Kirkwood	2005-07-06 19:02:54 +00:00
Bruce Momjian	326a7a0788	Add GUC full_page_writes to control writing full pages to WAL.	2005-07-05 23:18:10 +00:00
Tom Lane	eb5949d190	Arrange for the postmaster (and standalone backends, initdb, etc) to chdir into PGDATA and subsequently use relative paths instead of absolute paths to access all files under PGDATA. This seems to give a small performance improvement, and it should make the system more robust against naive DBAs doing things like moving a database directory that has a live postmaster in it. Per recent discussion.	2005-07-04 04:51:52 +00:00
Tom Lane	e7e1694295	Migrate rtree_gist functionality into the core system, and add some basic regression tests for GiST to the standard regression tests. I took the opportunity to add an rtree-equivalent gist opclass for circles; the contrib version only covered boxes and polygons, but indexing circles is very handy for distance searches.	2005-07-01 19:19:05 +00:00
Teodor Sigaev	5350216156	Improve error messages and add comment	2005-07-01 13:18:17 +00:00
Teodor Sigaev	898a7bd13b	Bug fixes for GiST crash recovery. - add forgotten check of lsn for insert completion - remove level of pages: hard to check in recovery - some cleanups	2005-06-30 17:52:14 +00:00
Tom Lane	401de9c8be	Improve the checkpoint signaling mechanism so that the bgwriter can tell the difference between checkpoints forced due to WAL segment consumption and checkpoints forced for other reasons (such as CREATE DATABASE). Avoid generating 'checkpoints are occurring too frequently' messages when the checkpoint wasn't caused by WAL segment consumption. Per gripe from Chris K-L.	2005-06-30 00:00:52 +00:00
Tom Lane	b5f7cff84f	Clean up the rather historically encumbered interface to now() and current time: provide a GetCurrentTimestamp() function that returns current time in the form of a TimestampTz, instead of separate time_t and microseconds fields. This is what all the callers really want anyway, and it eliminates low-level dependencies on AbsoluteTime, which is a deprecated datatype that will have to disappear eventually.	2005-06-29 22:51:57 +00:00
Teodor Sigaev	4523e0b63a	Cleanup, remove unneeded pallocs	2005-06-29 14:06:14 +00:00
Teodor Sigaev	88b49cdc95	Code cleanup. gistfillbuffer accepts InvalidOffsetNumber.	2005-06-28 15:51:00 +00:00
Tom Lane	7762619e95	Replace pg_shadow and pg_group by new role-capable catalogs pg_authid and pg_auth_members. There are still many loose ends to finish in this patch (no documentation, no regression tests, no pg_dump support for instance). But I'm going to commit it now anyway so that Alvaro can make some progress on shared dependencies. The catalog changes should be pretty much done.	2005-06-28 05:09:14 +00:00
Teodor Sigaev	e8cab5fe49	Concurrency for GiST - full concurrency for insert/update/select/vacuum: - select and vacuum never locks more than one page simultaneously - select (gettuple) hasn't any lock across it's calls - insert never locks more than two page simultaneously: - during search of leaf to insert it locks only one page simultaneously - while walk upward to the root it locked only parent (may be non-direct parent) and child. One of them X-lock, another may be S- or X-lock - 'vacuum full' locks index - improve gistgetmulti - simplify XLOG records Fix bug in index_beginscan_internal: LockRelation may clean rd_aminfo structure, so move GET_REL_PROCEDURE after LockRelation	2005-06-27 12:45:23 +00:00
Tom Lane	b90f8f20f0	Extend r-tree operator classes to handle Y-direction tests equivalent to the existing X-direction tests. An rtree class now includes 4 actual 2-D tests, 4 1-D X-direction tests, and 4 1-D Y-direction tests. This involved adding four new Y-direction test operators for each of box and polygon; I followed the PostGIS project's lead as to the names of these operators. NON BACKWARDS COMPATIBLE CHANGE: the poly_overleft (&<) and poly_overright (&>) operators now have semantics comparable to box_overleft and box_overright. This is necessary to make r-tree indexes work correctly on polygons. Also, I changed circle_left and circle_right to agree with box_left and box_right --- formerly they allowed the boundaries to touch. This isn't actually essential given the lack of any r-tree opclass for circles, but it seems best to sync all the definitions while we are at it.	2005-06-24 20:53:34 +00:00
Tom Lane	9a09248edd	Fix rtree and contrib/rtree_gist search behavior for the 1-D box and polygon operators (<<, &<, >>, &>). Per ideas originally put forward by andrew@supernews and later rediscovered by moi. This patch just fixes the existing opclasses, and does not add any new behavior as I proposed earlier; that can be sorted out later. In principle this could be back-patched, since it changes only search behavior and not system catalog entries nor rtree index contents. I'm not currently planning to do that, though, since I think it could use more testing.	2005-06-24 00:18:52 +00:00
Tom Lane	e98edb5555	Fix the mechanism for reporting the original table OID and column number of columns of a query result so that it can "see through" cursors and prepared statements. Per gripe a couple months back from John DeSoi.	2005-06-22 17:45:46 +00:00
Tom Lane	b95ae32b41	Avoid WAL-logging individual tuple insertions during CREATE TABLE AS (a/k/a SELECT INTO). Instead, flush and fsync the whole relation before committing. We do still need the WAL log when PITR is active, however. Simon Riggs and Tom Lane.	2005-06-20 18:37:02 +00:00
Teodor Sigaev	1bfdd1a893	fix founded hole in recovery after crash, add vacuum_delay_point()	2005-06-20 15:22:38 +00:00
Teodor Sigaev	d544ec8bbd	1. full functional WAL for GiST 2. improve vacuum for gist - use FSM - full vacuum: - reforms parent tuple if it's needed ( tuples was deleted on child page or parent tuple remains invalid after crash recovery ) - truncate index file if possible 3. fixes bugs and mistakes	2005-06-20 10:29:37 +00:00
Tom Lane	d961a56899	Avoid unnecessary palloc overhead in _bt_first(). The temporary scankeys arrays that it needs can never have more than INDEX_MAX_KEYS entries, so it's reasonable to just allocate them as fixed-size local arrays, and save the cost of palloc/pfree. Not a huge savings, but a cycle saved is a cycle earned ...	2005-06-19 22:41:00 +00:00
Tom Lane	fc654583ab	Need #include <time.h> on some platforms.	2005-06-19 22:34:56 +00:00
Tom Lane	3f749924f8	Simplify uses of readdir() by creating a function ReadDir() that includes error checking and an appropriate ereport(ERROR) message. This gets rid of rather tedious and error-prone manipulation of errno, as well as a Windows-specific bug workaround, at more than a dozen call sites. After an idea in a recent patch by Heikki Linnakangas.	2005-06-19 21:34:03 +00:00
Tom Lane	e26b0abda3	Arrange to fsync two-phase-commit state files only during checkpoints; given reasonably short lifespans for prepared transactions, this should mean that only a small minority of state files ever need to be fsynced at all. Per discussion with Heikki Linnakangas.	2005-06-19 20:00:39 +00:00
Tom Lane	a8d1075f27	Add a time-of-preparation column to the pg_prepared_xacts view, per an old suggestion by Oliver Jowett. Also, add a transaction column to the pg_locks view to show the xid of each transaction holding or awaiting locks; this allows prepared transactions to be properly associated with the locks they own. There was already a column named 'transaction', and I chose to rename it to 'transactionid' --- since this column is new in the current devel cycle there should be no backwards compatibility issue to worry about.	2005-06-18 19:33:42 +00:00
Tom Lane	66b098492e	Dept. of second thoughts: regular COMMIT deletes deletable files before releasing locks, so COMMIT PREPARED should too.	2005-06-18 05:21:09 +00:00
Tom Lane	d0a89683a3	Two-phase commit. Original patch by Heikki Linnakangas, with additional hacking by Alvaro Herrera and Tom Lane.	2005-06-17 22:32:51 +00:00
Bruce Momjian	f4d907ca85	Remove old .backup files when we do pg_stop_backup(). This prevents a large number of .backup files from existing in pg_xlog/	2005-06-15 01:36:08 +00:00
Teodor Sigaev	37c839365c	WAL for GiST. It work for online backup and so on, but on recovery after crash (power loss etc) it may say that it can't restore index and index should be reindexed. Some refactoring code.	2005-06-14 11:45:14 +00:00
Tom Lane	c186c93148	Change the planner to allow indexscan qualification clauses to use nonconsecutive columns of a multicolumn index, as per discussion around mid-May (pghackers thread "Best way to scan on-disk bitmaps"). This turns out to require only minimal changes in btree, and so far as I can see none at all in GiST. btcostestimate did need some work, but its original assumption that index selectivity == heap selectivity was quite bogus even before this.	2005-06-13 23:14:49 +00:00
Bruce Momjian	51746c4549	Free buffer allocated via malloc (process is short-lived, but fix it anyway).	2005-06-09 22:36:27 +00:00
Tom Lane	dce83e76e0	Add missing #include -- mea culpa.	2005-06-09 21:01:25 +00:00
Tom Lane	82358ca936	Put a critical section around update of hash index metapage. Per discussion with Qingqing Zhou.	2005-06-09 18:23:50 +00:00
Tom Lane	f5b2f60bd1	Change WAL-logging scheme for multixacts to be more like regular transaction IDs, rather than like subtrans; in particular, the information now survives a database restart. Per previous discussion, this is essential for PITR log shipping and for 2PC.	2005-06-08 15:50:28 +00:00
Tom Lane	ee7ac7b11e	Modify XLogInsert API to make callers specify whether pages to be backed up have the standard layout with unused space between pd_lower and pd_upper. When this is set, XLogInsert will omit the unused space without bothering to scan it to see if it's zero. That saves time in XLogInsert, and also allows reversion of my earlier patch to make PageRepairFragmentation et al explicitly re-zero freed space. Per suggestion by Heikki Linnakangas.	2005-06-06 20:22:58 +00:00
Tom Lane	4c8495a1f2	Remove the mostly-stubbed-out-anyway support routines for WAL UNDO. That code is never going to be used in the foreseeable future, and where it's more than a stub it's making the redo routines harder to read.	2005-06-06 17:01:25 +00:00
Tom Lane	21fda22ec4	Change CRCs in WAL records from 64bit to 32bit for performance reasons. Instead of a separate CRC on each backup block, include backup blocks in their parent WAL record's CRC; this is important to ensure that the backup block really goes with the WAL record, ie there was not a page tear right at the start of the backup block. Implement a simple form of compression of backup blocks: drop any run of zeroes starting at pd_lower, so as not to store the unused 'hole' that commonly exists in PG heap and index pages. Tweak PageRepairFragmentation and related routines to ensure they keep the unused space zeroed, so that the above compression method remains effective. All per recent discussions.	2005-06-02 05:55:29 +00:00
Tom Lane	a91fa39028	Add test to WAL replay to verify that xl_prev points back to the previous WAL record; this is necessary to be sure we recognize stale WAL records when a WAL page was only partially written during a system crash.	2005-05-31 19:10:28 +00:00
Tom Lane	e92a88272e	Modify hash_search() API to prevent future occurrences of the error spotted by Qingqing Zhou. The HASH_ENTER action now automatically fails with elog(ERROR) on out-of-memory --- which incidentally lets us eliminate duplicate error checks in quite a bunch of places. If you really need the old return-NULL-on-out-of-memory behavior, you can ask for HASH_ENTER_NULL. But there is now an Assert in that path checking that you aren't hoping to get that behavior in a palloc-based hash table. Along the way, remove the old HASH_FIND_SAVE/HASH_REMOVE_SAVED actions, which were not being used anywhere anymore, and were surely too ugly and unsafe to want to see revived again.	2005-05-29 04:23:07 +00:00
Tom Lane	32e8fc4a28	Arrange to cache fmgr lookup information for an index's access method routines in the index's relcache entry, instead of doing a fresh fmgr_info on every index access. We were already doing this for the index's opclass support functions; not sure why we didn't think to do it for the AM functions too. This supersedes the former method of caching (only) amgettuple in indexscan scan descriptors; it's an improvement because the function lookup can be amortized across multiple statements instead of being repeated for each statement. Even though lookup for builtin functions is pretty cheap, this seems to drop a percent or two off some simple benchmarks.	2005-05-27 23:31:21 +00:00
Bruce Momjian	b492c3accc	Add parentheses to macros when args are used in computations. Without them, the executation behavior could be unexpected.	2005-05-25 21:40:43 +00:00
Bruce Momjian	6dc7760ac3	Add support for wal_fsync_writethrough for Darwin, and restructure the code to better handle writethrough. Chris Campbell	2005-05-20 14:53:26 +00:00
Tom Lane	ff0c143a3b	Make a comment pgindent-proof, per suggestion from Alvaro.	2005-05-19 23:58:51 +00:00
Tom Lane	ee3b71f6bc	Split the shared-memory array of PGPROC pointers out of the sinval communication structure, and make it its own module with its own lock. This should reduce contention at least a little, and it definitely makes the code seem cleaner. Per my recent proposal.	2005-05-19 21:35:48 +00:00
Neil Conway	c891e05f26	Cleanup GiST header files. Since GiST extensions are often written as external projects, we should be careful about what parts of the GiST API are considered implementation details, and which are part of the public API. Therefore, I've moved internal-only declarations into gist_private.h -- future backward-incompatible changes to gist.h should be made with care, to avoid needlessly breaking external GiST extensions. Also did some related header cleanup: remove some unnecessary #includes from gist.h, and remove some unused definitions: isAttByVal(), _gistdump(), and GISTNStrategies.	2005-05-17 03:34:18 +00:00
Neil Conway	eda6dd32d1	GiST improvements: - make sure we always invoke user-supplied GiST methods in a short-lived memory context. This means the backend isn't exposed to any memory leaks that be in those methods (in fact, it is probably a net loss for most GiST methods to bother manually freeing memory now). This also means we can do away with a lot of ugly manual memory management in the GiST code itself. - keep the current page of a GiST index scan pinned, rather than doing a ReadBuffer() for each tuple produced by the scan. Since ReadBuffer() is expensive, this is a perf. win - implement dead tuple killing for GiST indexes (which is easy to do, now that we keep a pin on the current scan page). Now all the builtin indexes implement dead tuple killing. - cleanup a lot of ugly code in GiST	2005-05-17 00:59:30 +00:00
Tom Lane	2ef172a2a4	Fix latent bug in ExecSeqRestrPos: it leaves the plan node's result slot in an inconsistent state. (This is only latent because in reality ExecSeqRestrPos is dead code at the moment ... but someday maybe it won't be.) Add some comments about what the API for plan node mark/restore actually is, because it's not immediately obvious.	2005-05-15 21:19:55 +00:00
Neil Conway	eb0d00a9be	Various style cleanups for GiST; no changes to functionality.	2005-05-15 04:08:29 +00:00
Neil Conway	3140437495	This patch refactors away some duplicated code in the index AM build methods: they all invoke UpdateStats() since they have computed the number of heap tuples, so I created a function in catalog/index.c that each AM now calls.	2005-05-11 06:24:55 +00:00
Neil Conway	f38e413b20	Code cleanup: in C89, there is no point casting the first argument to memset() or MemSet() to a char . For one, memset()'s first argument is a void , and further void * can be implicitly coerced to/from any other pointer type.	2005-05-11 01:26:02 +00:00
Bruce Momjian	35e1651508	Back out check for unreferenced files. Heikki Linnakangas	2005-05-10 22:27:30 +00:00
Neil Conway	dc5ebcfcce	Fix typo in comment.	2005-05-10 05:15:07 +00:00
Tom Lane	30f540be43	Repair very-low-probability race condition between relation extension and VACUUM: in the interval between adding a new page to the relation and formatting it, it was possible for VACUUM to come along and decide it should format the page too. Though not harmful in itself, this would cause data loss if a third transaction were able to insert tuples into the vacuumed page before the original extender got control back.	2005-05-07 21:32:24 +00:00
Tom Lane	4cd4ed0cc2	Fix case in which a debug printout would print already-pfreed data.	2005-05-07 18:14:25 +00:00
Tom Lane	278bd0cc22	For some reason access/tupmacs.h has been #including utils/memutils.h, which is neither needed by nor related to that header. Remove the bogus inclusion and instead include the header in those C files that actually need it. Also fix unnecessary inclusions and bad inclusion order in tsearch2 files.	2005-05-06 17:24:55 +00:00
Tom Lane	126eaef651	Clean up MultiXactIdExpand's API by separating out the case where we are creating a new MultiXactId from two regular XIDs. The original coding was unnecessarily complicated and didn't save any code anyway.	2005-05-03 19:42:41 +00:00
Bruce Momjian	76668e6eb4	Check the file system on postmaster startup and report any unreferenced files in the server log. Heikki Linnakangas	2005-05-02 18:26:54 +00:00
Tom Lane	6c412f0605	Change CREATE TYPE to require datatype output and send functions to have only one argument. (Per recent discussion, the option to accept multiple arguments is pretty useless for user-defined types, and would be a likely source of security holes if it was used.) Simplify call sites of output/send functions to not bother passing more than one argument.	2005-05-01 18:56:19 +00:00
Tom Lane	93b2477278	Use the standard lock manager to establish priority order when there is contention for a tuple-level lock. This solves the problem of a would-be exclusive locker being starved out by an indefinite succession of share-lockers. Per recent discussion with Alvaro.	2005-04-30 19:03:33 +00:00
Tom Lane	3a694bb0a1	Restructure LOCKTAG as per discussions of a couple months ago. Essentially, we shoehorn in a lockable-object-type field by taking a byte away from the lockmethodid, which can surely fit in one byte instead of two. This allows less artificial definitions of all the other fields of LOCKTAG; we can get rid of the special pg_xactlock pseudo-relation, and also support locks on individual tuples and general database objects (including shared objects). None of those possibilities are actually exploited just yet, however. I removed pg_xactlock from pg_class, but did not force initdb for that change. At this point, relkind 's' (SPECIAL) is unused and could be removed entirely.	2005-04-29 22:28:24 +00:00
Tom Lane	bedb78d386	Implement sharable row-level locks, and use them for foreign key references to eliminate unnecessary deadlocks. This commit adds SELECT ... FOR SHARE paralleling SELECT ... FOR UPDATE. The implementation uses a new SLRU data structure (managed much like pg_subtrans) to represent multiple- transaction-ID sets. When more than one transaction is holding a shared lock on a particular row, we create a MultiXactId representing that set of transactions and store its ID in the row's XMAX. This scheme allows an effectively unlimited number of row locks, just as we did before, while not costing any extra overhead except when a shared lock actually has to be shared. Still TODO: use the regular lock manager to control the grant order when multiple backends are waiting for a row lock. Alvaro Herrera and Tom Lane.	2005-04-28 21:47:18 +00:00
Tom Lane	19d127548c	Add comment about checkpoint panic behavior during shutdown, per suggestion from Qingqing Zhou.	2005-04-23 18:49:54 +00:00
Tom Lane	3842892492	Recent changes got the sense of the notnull bit backwards in the 2.0 protocol output routines. Mea culpa :-(. Per report from Kris Jurka.	2005-04-23 17:45:35 +00:00
Bruce Momjian	1a6ad669fb	Fix comment typo.	2005-04-17 03:04:29 +00:00
Tom Lane	5f0a974ea9	Reduce PANIC to ERROR in several xlog routines that are used in both critical and noncritical contexts (an example of noncritical being post-checkpoint removal of dead xlog segments). In the critical cases the CRIT_SECTION mechanism will cause ERROR to be promoted to PANIC anyway, and in the noncritical cases we shouldn't let an error take down the entire database. Arguably there should be no explicit PANIC errors in this module, only more START/END_CRIT_SECTION calls, but I didn't go that far. (Yet.)	2005-04-15 22:19:48 +00:00
Tom Lane	61b861421b	Modify MoveOfflineLogs/InstallXLogFileSegment to avoid O(N^2) behavior when recycling a large number of xlog segments during checkpoint. The former behavior searched from the same start point each time, requiring O(checkpoint_segments^2) stat() calls to relocate all the segments. Instead keep track of where we stopped last time through.	2005-04-15 18:48:10 +00:00
Tom Lane	8e14408028	Make equalTupleDescs() compare attlen/attbyval/attalign rather than assuming comparison of atttypid is sufficient. In a dropped column atttypid will be 0, and we'd better check the physical-storage data to make sure the tupdescs are physically compatible. I do not believe there is a real risk before 8.0, since before that we only used this routine to compare successive states of the tupdesc for a particular relation. But 8.0's typcache.c might be comparing arbitrary tupdescs so we'd better play it safer.	2005-04-14 22:34:48 +00:00
Tom Lane	162bd08b3f	Completion of project to use fixed OIDs for all system catalogs and indexes. Replace all heap_openr and index_openr calls by heap_open and index_open. Remove runtime lookups of catalog OID numbers in various places. Remove relcache's support for looking up system catalogs by name. Bulky but mostly very boring patch ...	2005-04-14 20:03:27 +00:00
Tom Lane	2193a856a2	Simplify initdb-time assignment of OIDs as I proposed yesterday, and avoid encroaching on the 'user' range of OIDs by allowing automatic OID assignment to use values below 16k until we reach normal operation. initdb not forced since this doesn't make any incompatible change; however a lot of stuff will have different OIDs after your next initdb.	2005-04-13 18:54:57 +00:00
Tom Lane	c3294f1cbf	Fix interaction between materializing holdable cursors and firing deferred triggers: either one can create more work for the other, so we have to loop till it's all gone. Per example from andrew@supernews. Add a regression test to help spot trouble in this area in future.	2005-04-11 19:51:16 +00:00
Tom Lane	ad161bcc8a	Merge Resdom nodes into TargetEntry nodes to simplify code and save a few palloc's. I also chose to eliminate the restype and restypmod fields entirely, since they are redundant with information stored in the node's contained expression; re-examining the expression at need seems simpler and more reliable than trying to keep restype/restypmod up to date. initdb forced due to change in contents of stored rules.	2005-04-06 16:34:07 +00:00
Tom Lane	47888fe842	First phase of OUT-parameters project. We can now define and use SQL functions with OUT parameters. The various PLs still need work, as does pg_dump. Rudimentary docs and regression tests included.	2005-03-31 22:46:33 +00:00
Tom Lane	8c85a34a3b	Officially decouple FUNC_MAX_ARGS from INDEX_MAX_KEYS, and set the former to 100 by default. Clean up some of the less necessary dependencies on FUNC_MAX_ARGS; however, the biggie (FunctionCallInfoData) remains.	2005-03-29 03:01:32 +00:00
Tom Lane	70c9763d48	Convert oidvector and int2vector into variable-length arrays. This change saves a great deal of space in pg_proc and its primary index, and it eliminates the former requirement that INDEX_MAX_KEYS and FUNC_MAX_ARGS have the same value. INDEX_MAX_KEYS is still embedded in the on-disk representation (because it affects index tuple header size), but FUNC_MAX_ARGS is not. I believe it would now be possible to increase FUNC_MAX_ARGS at little cost, but haven't experimented yet. There are still a lot of vestigial references to FUNC_MAX_ARGS, which I will clean up in a separate pass. However, getting rid of it altogether would require changing the FunctionCallInfoData struct, and I'm not sure I want to buy into that.	2005-03-29 00:17:27 +00:00
Tom Lane	119191609c	Remove dead push/pop rollback code. Vadim once planned to implement transaction rollback via UNDO but I think that's highly unlikely to happen, so we may as well remove the stubs. (Someday we ought to rip out the stub xxx_undo routines, too.) Per Alvaro.	2005-03-28 01:50:34 +00:00
Tom Lane	bf3dbb5881	First steps towards index scans with heap access decoupled from index access: define new index access method functions 'amgetmulti' that can fetch multiple TIDs per call. (The functions exist but are totally untested as yet.) Since I was modifying pg_am anyway, remove the no-longer-needed 'rel' parameter from amcostestimate functions, and also remove the vestigial amowner column that was creating useless work for Alvaro's shared-object-dependencies project. Initdb forced due to changes in pg_am.	2005-03-27 23:53:05 +00:00
Tom Lane	617dd33b6e	Eliminate duplicate hasnulls bit testing in index tuple access, and clean up itup.h a little bit.	2005-03-27 18:38:27 +00:00
Bruce Momjian	b1f57d88f5	Change Win32 O_SYNC method to O_DSYNC because that is what the method currently does. This is now the default Win32 wal sync method because we perfer o_datasync to fsync. Also, change Win32 fsync to a new wal sync method called fsync_writethrough because that is the behavior of _commit, which is what is used for fsync on Win32. Backpatch to 8.0.X.	2005-03-24 04:36:20 +00:00
Tom Lane	94e03330cb	Create a routine PageIndexMultiDelete() that replaces a loop around PageIndexTupleDelete() with a single pass of compactification --- logic mostly lifted from PageRepairFragmentation. I noticed while profiling that a VACUUM that's cleaning up a whole lot of deleted tuples would spend as much as a third of its CPU time in PageIndexTupleDelete; not too surprising considering the loop method was roughly O(N^2) in the number of tuples involved.	2005-03-22 06:17:03 +00:00

1 2 3 4 5 ...

1086 Commits