postgresql

Commit Graph

Author	SHA1	Message	Date
Tom Lane	88452d5ba6	Implement ALTER TABLE ADD UNIQUE/PRIMARY KEY USING INDEX. This feature allows a unique or pkey constraint to be created using an already-existing unique index. While the constraint isn't very functionally different from the bare index, it's nice to be able to do that for documentation purposes. The main advantage over just issuing a plain ALTER TABLE ADD UNIQUE/PRIMARY KEY is that the index can be created with CREATE INDEX CONCURRENTLY, so that there is not a long interval where the table is locked against updates. On the way, refactor some of the code in DefineIndex() and index_create() so that we don't have to pass through those functions in order to create the index constraint's catalog entries. Also, in parse_utilcmd.c, pass around the ParseState pointer in struct CreateStmtContext to save on notation, and add error location pointers to some error reports that didn't have one before. Gurjeet Singh, reviewed by Steve Singer and Tom Lane	2011-01-25 15:43:05 -05:00
Robert Haas	8ceb245680	Make ALTER TABLE revalidate uniqueness and exclusion constraints. Failure to do so can lead to constraint violations. This was broken by commit `1ddc2703a9` on 2010-02-07, so back-patch to 9.0. Noah Misch. Regression test by me.	2011-01-20 22:44:10 -05:00
Bruce Momjian	2896c87ce4	Force pg_upgrade's to preserve pg_class.oid, not pg_class.relfilenode. Toast tables have identical pg_class.oid and pg_class.relfilenode, but for clarity it is good to preserve the pg_class.oid. Update comments regarding what is preserved, and do some variable/function renaming for clarity.	2011-01-07 21:26:13 -05:00
Bruce Momjian	46d28820b6	Improve C comments about backend variables set by pg_upgrade_support functions.	2011-01-06 22:45:36 -05:00
Bruce Momjian	5d950e3b0c	Stamp copyrights for year 2011.	2011-01-01 13:18:15 -05:00
Robert Haas	53dbc27c62	Support unlogged tables. The contents of an unlogged table are WAL-logged; thus, they are not available on standby servers and are truncated whenever the database system enters recovery. Indexes on unlogged tables are also unlogged. Unlogged GiST indexes are not currently supported.	2010-12-29 06:48:53 -05:00
Robert Haas	5f7b58fad8	Generalize concept of temporary relations to "relation persistence". This commit replaces pg_class.relistemp with pg_class.relpersistence; and also modifies the RangeVar node type to carry relpersistence rather than istemp. It also removes removes rd_istemp from RelationData and instead performs the correct computation based on relpersistence. For clarity, we add three new macros: RelationNeedsWAL(), RelationUsesLocalBuffers(), and RelationUsesTempNamespace(), so that we can clarify the purpose of each check that previous depended on rd_istemp. This is intended as infrastructure for the upcoming unlogged tables patch, as well as for future possible work on global temporary tables.	2010-12-13 12:34:26 -05:00
Tom Lane	9f376e146b	Ensure an index that uses a whole-row Var still depends on its table. We failed to record any dependency on the underlying table for an index declared like "create index i on t (foo(t.*))". This would create trouble if the table were dropped without previously dropping the index. To fix, simplify some overly-cute code in index_create(), accepting the possibility that sometimes the whole-table dependency will be redundant. Also document this hazard in dependency.c. Per report from Kevin Grittner. In passing, prevent a core dump in pg_get_indexdef() if the index's table can't be found. I came across this while experimenting with Kevin's example. Not sure it's a real issue when the catalogs aren't corrupt, but might as well be cautious. Back-patch to all supported versions.	2010-11-02 17:15:07 -04:00
Tom Lane	2ec993a7cb	Support triggers on views. This patch adds the SQL-standard concept of an INSTEAD OF trigger, which is fired instead of performing a physical insert/update/delete. The trigger function is passed the entire old and/or new rows of the view, and must figure out what to do to the underlying tables to implement the update. So this feature can be used to implement updatable views using trigger programming style rather than rule hacking. In passing, this patch corrects the names of some columns in the information_schema.triggers view. It seems the SQL committee renamed them somewhere between SQL:99 and SQL:2003. Dean Rasheed, reviewed by Bernd Helmle; some additional hacking by me.	2010-10-10 13:45:07 -04:00
Magnus Hagander	9f2e211386	Remove cvs keywords from all files.	2010-09-20 22:08:53 +02:00
Joe Conway	5eb15c9942	SERIALIZABLE transactions are actually implemented beneath the covers with transaction snapshots, i.e. a snapshot registered at the beginning of a transaction. Change variable naming and comments to reflect this reality in preparation for a future, truly serializable mode, e.g. Serializable Snapshot Isolation (SSI). For the moment transaction snapshots are still used to implement SERIALIZABLE, but hopefully not for too much longer. Patch by Kevin Grittner and Dan Ports with review and some minor wording changes by me.	2010-09-11 18:38:58 +00:00
Robert Haas	debcec7dc3	Include the backend ID in the relpath of temporary relations. This allows us to reliably remove all leftover temporary relation files on cluster startup without reference to system catalogs or WAL; therefore, we no longer include temporary relations in XLOG_XACT_COMMIT and XLOG_XACT_ABORT WAL records. Since these changes require including a backend ID in each SharedInvalSmgrMsg, the size of the SharedInvalidationMessage.id field has been reduced from two bytes to one, and the maximum number of connections has been reduced from INT_MAX / 4 to 2^23-1. It would be possible to remove these restrictions by increasing the size of SharedInvalidationMessage by 4 bytes, but right now that doesn't seem like a good trade-off. Review by Jaime Casanova and Tom Lane.	2010-08-13 20:10:54 +00:00
Bruce Momjian	65e806cba1	pgindent run for 9.0	2010-02-26 02:01:40 +00:00
Robert Haas	e26c539e9f	Wrap calls to SearchSysCache and related functions using macros. The purpose of this change is to eliminate the need for every caller of SearchSysCache, SearchSysCacheCopy, SearchSysCacheExists, GetSysCacheOid, and SearchSysCacheList to know the maximum number of allowable keys for a syscache entry (currently 4). This will make it far easier to increase the maximum number of keys in a future release should we choose to do so, and it makes the code shorter, too. Design and review by Tom Lane.	2010-02-14 18:42:19 +00:00
Tom Lane	0a469c8769	Remove old-style VACUUM FULL (which was known for a little while as VACUUM FULL INPLACE), along with a boatload of subsidiary code and complexity. Per discussion, the use case for this method of vacuuming is no longer large enough to justify maintaining it; not to mention that we don't wish to invest the work that would be needed to make it play nicely with Hot Standby. Aside from the code directly related to old-style VACUUM FULL, this commit removes support for certain WAL record types that could only be generated within VACUUM FULL, redirect-pointer removal in heap_page_prune, and nontransactional generation of cache invalidation sinval messages (the last being the sticking point for Hot Standby). We still have to retain all code that copes with finding HEAP_MOVED_OFF and HEAP_MOVED_IN flag bits on existing tuples. This can't be removed as long as we want to support in-place update from pre-9.0 databases.	2010-02-08 04:33:55 +00:00
Tom Lane	1ddc2703a9	Work around deadlock problems with VACUUM FULL/CLUSTER on system catalogs, as per my recent proposal. First, teach IndexBuildHeapScan to not wait for INSERT_IN_PROGRESS or DELETE_IN_PROGRESS tuples to commit unless the index build is checking uniqueness/exclusion constraints. If it isn't, there's no harm in just indexing the in-doubt tuple. Second, modify VACUUM FULL/CLUSTER to suppress reverifying uniqueness/exclusion constraint properties while rebuilding indexes of the target relation. This is reasonable because these commands aren't meant to deal with corrupted-data situations. Constraint properties will still be rechecked when an index is rebuilt by a REINDEX command. This gets us out of the problem that new-style VACUUM FULL would often wait for other transactions while holding exclusive lock on a system catalog, leading to probable deadlock because those other transactions need to look at the catalogs too. Although the real ultimate cause of the problem is a debatable choice to release locks early after modifying system catalogs, changing that choice would require pretty serious analysis and is not something to be undertaken lightly or on a tight schedule. The present patch fixes the problem in a fairly reasonable way and should also improve the speed of VACUUM FULL/CLUSTER a little bit.	2010-02-07 22:40:33 +00:00
Tom Lane	b9b8831ad6	Create a "relation mapping" infrastructure to support changing the relfilenodes of shared or nailed system catalogs. This has two key benefits: * The new CLUSTER-based VACUUM FULL can be applied safely to all catalogs. * We no longer have to use an unsafe reindex-in-place approach for reindexing shared catalogs. CLUSTER on nailed catalogs now works too, although I left it disabled on shared catalogs because the resulting pg_index.indisclustered update would only be visible in one database. Since reindexing shared system catalogs is now fully transactional and crash-safe, the former special cases in REINDEX behavior have been removed; shared catalogs are treated the same as non-shared. This commit does not do anything about the recently-discussed problem of deadlocks between VACUUM FULL/CLUSTER on a system catalog and other concurrent queries; will address that in a separate patch. As a stopgap, parallel_schedule has been tweaked to run vacuum.sql by itself, to avoid such failures during the regression tests.	2010-02-07 20:48:13 +00:00
Tom Lane	70a2b05a59	Assorted cleanups in preparation for using a map file to support altering the relfilenode of currently-not-relocatable system catalogs. 1. Get rid of inval.c's dependency on relfilenode, by not having it emit smgr invalidations as a result of relcache flushes. Instead, smgr sinval messages are sent directly from smgr.c when an actual relation delete or truncate is done. This makes considerably more structural sense and allows elimination of a large number of useless smgr inval messages that were formerly sent even in cases where nothing was changing at the physical-relation level. Note that this reintroduces the concept of nontransactional inval messages, but that's okay --- because the messages are sent by smgr.c, they will be sent in Hot Standby slaves, just from a lower logical level than before. 2. Move setNewRelfilenode out of catalog/index.c, where it never logically belonged, into relcache.c; which is a somewhat debatable choice as well but better than before. (I considered catalog/storage.c, but that seemed too low level.) Rename to RelationSetNewRelfilenode. 3. Cosmetic cleanups of some other relfilenode manipulations.	2010-02-03 01:14:17 +00:00
Robert Haas	76a47c0e74	Replace ALTER TABLE ... SET STATISTICS DISTINCT with a more general mechanism. Attributes can now have options, just as relations and tablespaces do, and the reloptions code is used to parse, validate, and store them. For simplicity and because these options are not performance critical, we store them in a separate cache rather than the main relcache. Thanks to Alex Hunsaker for the review.	2010-01-22 16:40:19 +00:00
Tom Lane	9a915e596f	Improve the handling of SET CONSTRAINTS commands by having them search pg_constraint before searching pg_trigger. This allows saner handling of corner cases; in particular we now say "constraint is not deferrable" rather than "constraint does not exist" when the command is applied to a constraint that's inherently non-deferrable. Per a gripe several months ago from hubert depesz lubaczewski. To make this work without breaking user-defined constraint triggers, we have to add entries for them to pg_constraint. However, in return we can remove the pgconstrname column from pg_constraint, which represents a fairly sizable space savings. I also replaced the tgisconstraint column with tgisinternal; the old meaning of tgisconstraint can now be had by testing for nonzero tgconstraint, while there is no other way to get the old meaning of nonzero tgconstraint, namely that the trigger was internally generated rather than being user-created. In passing, fix an old misstatement in the docs and comments, namely that pg_trigger.tgdeferrable is exactly redundant with pg_constraint.condeferrable. Actually, we mark RI action triggers as nondeferrable even when they belong to a nominally deferrable FK constraint. The SET CONSTRAINTS code now relies on that instead of hard-coding a list of exception OIDs.	2010-01-17 22:56:23 +00:00
Bruce Momjian	f98fbc78c3	Preserve relfilenodes: Add support to pg_dump --binary-upgrade to preserve all relfilenodes, for use by pg_migrator.	2010-01-06 03:04:03 +00:00
Bruce Momjian	0239800893	Update copyright for the year 2010.	2010-01-02 16:58:17 +00:00
Tom Lane	cfc5008a51	Adjust naming of indexes and their columns per recent discussion. Index expression columns are now named after the FigureColname result for their expressions, rather than always being "pg_expression_N". Digits are appended to this name if needed to make the column name unique within the index. (That happens for regular columns too, thus fixing the old problem that CREATE INDEX fooi ON foo (f1, f1) fails. Before exclusion indexes there was no real reason to do such a thing, but now maybe there is.) Default names for indexes and associated constraints now include the column names of all their columns, not only the first one as in previous practice. (Of course, this will be truncated as needed to fit in NAMEDATALEN. Also, pkey indexes retain the historical behavior of not naming specific columns at all.) An example of the results: regression=# create table foo (f1 int, f2 text, regression(# exclude (f1 with =, lower(f2) with =)); NOTICE: CREATE TABLE / EXCLUDE will create implicit index "foo_f1_lower_exclusion" for table "foo" CREATE TABLE regression=# \d foo_f1_lower_exclusion Index "public.foo_f1_lower_exclusion" Column \| Type \| Definition --------+---------+------------ f1 \| integer \| f1 lower \| text \| lower(f2) btree, for table "public.foo"	2009-12-23 02:35:25 +00:00
Tom Lane	62aba76568	Prevent indirect security attacks via changing session-local state within an allegedly immutable index function. It was previously recognized that we had to prevent such a function from executing SET/RESET ROLE/SESSION AUTHORIZATION, or it could trivially obtain the privileges of the session user. However, since there is in general no privilege checking for changes of session-local state, it is also possible for such a function to change settings in a way that might subvert later operations in the same session. Examples include changing search_path to cause an unexpected function to be called, or replacing an existing prepared statement with another one that will execute a function of the attacker's choosing. The present patch secures VACUUM, ANALYZE, and CREATE INDEX/REINDEX against these threats, which are the same places previously deemed to need protection against the SET ROLE issue. GUC changes are still allowed, since there are many useful cases for that, but we prevent security problems by forcing a rollback of any GUC change after completing the operation. Other cases are handled by throwing an error if any change is attempted; these include temp table creation, closing a cursor, and creating or deleting a prepared statement. (In 7.4, the infrastructure to roll back GUC changes doesn't exist, so we settle for rejecting changes of "search_path" in these contexts.) Original report and patch by Gurjeet Singh, additional analysis by Tom Lane. Security: CVE-2009-4136	2009-12-09 21:57:51 +00:00
Tom Lane	0cb65564e5	Add exclusion constraints, which generalize the concept of uniqueness to support any indexable commutative operator, not just equality. Two rows violate the exclusion constraint if "row1.col OP row2.col" is TRUE for each of the columns in the constraint. Jeff Davis, reviewed by Robert Haas	2009-12-07 05:22:23 +00:00
Tom Lane	7fc0f06221	Add a WHEN clause to CREATE TRIGGER, allowing a boolean expression to be checked to determine whether the trigger should be fired. For BEFORE triggers this is mostly a matter of spec compliance; but for AFTER triggers it can provide a noticeable performance improvement, since queuing of a deferred trigger event and re-fetching of the row(s) at end of statement can be short-circuited if the trigger does not need to be fired. Takahiro Itagaki, reviewed by KaiGai Kohei.	2009-11-20 20:38:12 +00:00
Tom Lane	b2734a0d79	Support SQL-compliant triggers on columns, ie fire only if certain columns are named in the UPDATE's SET list. Note: the schema of pg_trigger has not actually changed; we've just started to use a column that was there all along. catversion bumped anyway so that this commit is included in the history of potentially interesting changes to system catalog contents. Itagaki Takahiro	2009-10-14 22:14:25 +00:00
Tom Lane	249724cb01	Create an ALTER DEFAULT PRIVILEGES command, which allows users to adjust the privileges that will be applied to subsequently-created objects. Such adjustments are always per owning role, and can be restricted to objects created in particular schemas too. A notable benefit is that users can override the traditional default privilege settings, eg, the PUBLIC EXECUTE privilege traditionally granted by default for functions. Petr Jelinek	2009-10-05 19:24:49 +00:00
Tom Lane	9072592946	Add ALTER TABLE ... ALTER COLUMN ... SET STATISTICS DISTINCT Robert Haas	2009-08-02 22:14:53 +00:00
Tom Lane	25d9bf2e3e	Support deferrable uniqueness constraints. The current implementation fires an AFTER ROW trigger for each tuple that looks like it might be non-unique according to the index contents at the time of insertion. This works well as long as there aren't many conflicts, but won't scale to massive unique-key reassignments. Improving that case is a TODO item. Dean Rasheed	2009-07-29 20:56:21 +00:00
Tom Lane	c1b9ec24ef	Add system catalog columns pg_constraint.conindid and pg_trigger.tgconstrindid. conindid is the index supporting a constraint. We can use this not only for unique/primary-key constraints, but also foreign-key constraints, which depend on the unique index that constrains the referenced columns. tgconstrindid is just copied from the constraint's conindid field, or is zero for triggers not associated with constraints. This is mainly intended as infrastructure for upcoming patches, but it has some virtue in itself, since it exposes a relationship that you formerly had to grovel in pg_depend to determine. I simplified one information_schema view accordingly. (There is a pg_dump query that could also use conindid, but I left it alone because it wasn't clear it'd get any faster.)	2009-07-28 02:56:31 +00:00
Bruce Momjian	d747140279	8.4 pgindent run, with new combined Linux/FreeBSD/MinGW typedef list provided by Andrew.	2009-06-11 14:49:15 +00:00
Tom Lane	32ea236361	Improve the IndexVacuumInfo/IndexBulkDeleteResult API to allow somewhat sane behavior in cases where we don't know the heap tuple count accurately; in particular partial vacuum, but this also makes the API a bit more useful for ANALYZE. This patch adds "estimated_count" flags to both structs so that an approximate count can be flagged as such, and adjusts the logic so that approximate counts are not used for updating pg_class.reltuples. This fixes my previous complaint that VACUUM was putting ridiculous values into pg_class.reltuples for indexes. The actual impact of that bug is limited, because the planner only pays attention to reltuples for an index if the index is partial; which probably explains why beta testers hadn't noticed a degradation in plan quality from it. But it needs to be fixed. The whole thing is a bit messy and should be redesigned in future, because reltuples now has the potential to drift quite far away from reality when a long period elapses with no non-partial vacuums. But this is as good as it's going to get for 8.4.	2009-06-06 22:13:52 +00:00
Tom Lane	5377ccbe24	Update obsolete comment in index_drop(). When the comment was written, queries frequently took no lock at all on individual indexes. That's not true any more, but we still need lock on the parent table to make it safe to use cached lists of index OIDs.	2009-05-31 20:55:37 +00:00
Tom Lane	948d6ec90f	Modify the relcache to record the temp status of both local and nonlocal temp relations; this is no more expensive than before, now that we have pg_class.relistemp. Insert tests into bufmgr.c to prevent attempting to fetch pages from nonlocal temp relations. This provides a low-level defense against bugs-of-omission allowing temp pages to be loaded into shared buffers, as in the contrib/pgstattuple problem reported by Stuart Bishop. While at it, tweak a bunch of places to use new relcache tests (instead of expensive probes into pg_namespace) to detect local or nonlocal temp tables.	2009-03-31 22:12:48 +00:00
Tom Lane	a95307b639	Teach reindex_index() to clear pg_index.indcheckxmin when possible. Greg Stark, slightly modified by me.	2009-03-27 15:57:11 +00:00
Tom Lane	ff301d6e69	Implement "fastupdate" support for GIN indexes, in which we try to accumulate multiple index entries in a holding area before adding them to the main index structure. This helps because bulk insert is (usually) significantly faster than retail insert for GIN. This patch also removes GIN support for amgettuple-style index scans. The API defined for amgettuple is difficult to support with fastupdate, and the previously committed partial-match feature didn't really work with it either. We might eventually figure a way to put back amgettuple support, but it won't happen for 8.4. catversion bumped because of change in GIN's pg_am entry, and because the format of GIN indexes changed on-disk (there's a metapage now, and possibly a pending list). Teodor Sigaev	2009-03-24 20:17:18 +00:00
Tom Lane	3cb5d6580a	Support column-level privileges, as required by SQL standard. Stephen Frost, with help from KaiGai Kohei and others	2009-01-22 20:16:10 +00:00
Bruce Momjian	511db38ace	Update copyright for 2009.	2009-01-01 17:24:05 +00:00
Heikki Linnakangas	3396000684	Rethink the way FSM truncation works. Instead of WAL-logging FSM truncations in FSM code, call FreeSpaceMapTruncateRel from smgr_redo. To make that cleaner from modularity point of view, move the WAL-logging one level up to RelationTruncate, and move RelationTruncate and all the related WAL-logging to new src/backend/catalog/storage.c file. Introduce new RelationCreateStorage and RelationDropStorage functions that are used instead of calling smgrcreate/smgrscheduleunlink directly. Move the pending rel deletion stuff from smgrcreate/smgrscheduleunlink to the new functions. This leaves smgr.c as a thin wrapper around md.c; all the transactional stuff is now in storage.c. This will make it easier to add new forks with similar truncation logic, like the visibility map.	2008-11-19 10:34:52 +00:00
Alvaro Herrera	03e5248d0f	Replace the usage of heap_addheader to create pg_attribute tuples with regular heap_form_tuple. Since this removes the last remaining caller of heap_addheader, remove it. Extracted from the column privileges patch from Stephen Frost, with further code cleanups by me.	2008-11-14 01:57:42 +00:00
Tom Lane	10e3acb8e7	Prevent synchronous scan during GIN index build, because GIN is optimized for inserting tuples in increasing TID order. It's not clear whether this fully explains Ivan Sergio Borgonovo's complaint, but simple testing confirms that a scan that doesn't start at block 0 can slow GIN build by a factor of three or four. Backpatch to 8.3. Sync scan didn't exist before that.	2008-11-13 17:42:10 +00:00
Tom Lane	902d1cb35f	Remove all uses of the deprecated functions heap_formtuple, heap_modifytuple, and heap_deformtuple in favor of the newer functions heap_form_tuple et al (which do the same things but use bool control flags instead of arbitrary char values). Eliminate the former duplicate coding of these functions, reducing the deprecated functions to mere wrappers around the newer ones. We can't get rid of them entirely because add-on modules probably still contain many instances of the old coding style. Kris Jurka	2008-11-02 01:45:28 +00:00
Tom Lane	5b5ee14a4b	Add a defense to prevent storing pseudo-type data into index columns. Formerly, the lack of any opclasses that could accept such data was enough of a defense, but now with a "record" opclass we need to check more carefully. (You can still use that opclass for an index, but you have to store a named composite type not an anonymous one.)	2008-10-14 21:47:39 +00:00
Heikki Linnakangas	15c121b3ed	Rewrite the FSM. Instead of relying on a fixed-size shared memory segment, the free space information is stored in a dedicated FSM relation fork, with each relation (except for hash indexes; they don't use FSM). This eliminates the max_fsm_relations and max_fsm_pages GUC options; remove any trace of them from the backend, initdb, and documentation. Rewrite contrib/pg_freespacemap to match the new FSM implementation. Also introduce a new variant of the get_raw_page(regclass, int4, int4) function in contrib/pageinspect that let's you to return pages from any relation fork, and a new fsm_page_contents() function to inspect the new FSM pages.	2008-09-30 10:52:14 +00:00
Tom Lane	4adc2f72a4	Change hash indexes to store only the hash code rather than the whole indexed value. This means that hash index lookups are always lossy and have to be rechecked when the heap is visited; however, the gain in index compactness outweighs this when the indexed values are wide. Also, we only need to perform datatype comparisons when the hash codes match exactly, rather than for every entry in the hash bucket; so it could also win for datatypes that have expensive comparison functions. A small additional win is gained by keeping hash index pages sorted by hash code and using binary search to reduce the number of index tuples we have to look at. Xiao Meng This commit also incorporates Zdenek Kotala's patch to isolate hash metapages and hash bitmaps a bit better from the page header datastructures.	2008-09-15 18:43:41 +00:00
Tom Lane	e5536e77a5	Move exprType(), exprTypmod(), expression_tree_walker(), and related routines into nodes/nodeFuncs, so as to reduce wanton cross-subsystem #includes inside the backend. There's probably more that should be done along this line, but this is a start anyway.	2008-08-25 22:42:34 +00:00
Heikki Linnakangas	3f0e808c4a	Introduce the concept of relation forks. An smgr relation can now consist of multiple forks, and each fork can be created and grown separately. The bulk of this patch is about changing the smgr API to include an extra ForkNumber argument in every smgr function. Also, smgrscheduleunlink and smgrdounlink no longer implicitly call smgrclose, because other forks might still exist after unlinking one. The callers of those functions have been modified to call smgrclose instead. This patch in itself doesn't have any user-visible effect, but provides the infrastructure needed for upcoming patches. The additional forks envisioned are a rewritten FSM implementation that doesn't rely on a fixed-size shared memory block, and a visibility map to allow skipping portions of a table in VACUUM that have no dead tuples.	2008-08-11 11:05:11 +00:00
Tom Lane	eca1388629	Fix corner-case bug introduced with HOT: if REINDEX TABLE pg_class (or a REINDEX DATABASE including same) is done before a session has done any other update on pg_class, the pg_class relcache entry was left with an incorrect setting of rd_indexattr, because the indexed-attributes set would be first demanded at a time when we'd forced a partial list of indexes into the pg_class entry, and it would remain cached after that. This could result in incorrect decisions about HOT-update safety later in the same session. In practice, since only pg_class_relname_nsp_index would be missed out, only ALTER TABLE RENAME and ALTER TABLE SET SCHEMA could trigger a problem. Per report and test case from Ondrej Jirman.	2008-08-10 19:02:33 +00:00
Alvaro Herrera	a3540b0f65	Improve our #include situation by moving pointer types away from the corresponding struct definitions. This allows other headers to avoid including certain highly-loaded headers such as rel.h and relscan.h, instead using just relcache.h, heapam.h or genam.h, which are more lightweight and thus cause less unnecessary dependencies.	2008-06-19 00:46:06 +00:00
Alvaro Herrera	5da9da71c4	Improve snapshot manager by keeping explicit track of snapshots. There are two ways to track a snapshot: there's the "registered" list, which is used for arbitrary long-lived snapshots; and there's the "active stack", which is used for the snapshot that is considered "active" at any time. This also allows users of snapshots to stop worrying about snapshot memory allocation and freeing, and about using PG_TRY blocks around ActiveSnapshot assignment. This is all done automatically now. As a consequence, this allows us to reset MyProc->xmin when there are no more snapshots registered in the current backend, reducing the impact that long-running transactions have on VACUUM.	2008-05-12 20:02:02 +00:00
Alvaro Herrera	f8c4d7db60	Restructure some header files a bit, in particular heapam.h, by removing some unnecessary #include lines in it. Also, move some tuple routine prototypes and macros to htup.h, which allows removal of heapam.h inclusion from some .c files. For this to work, a new header file access/sysattr.h needed to be created, initially containing attribute numbers of system columns, for pg_dump usage. While at it, make contrib ltree, intarray and hstore header files more consistent with our header style.	2008-05-12 00:00:54 +00:00
Tom Lane	cd902b331d	Change the rules for inherited CHECK constraints to be essentially the same as those for inherited columns; that is, it's no longer allowed for a child table to not have a check constraint matching one that exists on a parent. This satisfies the principle of least surprise (rows selected from the parent will always appear to meet its check constraints) and eliminates some longstanding bogosity in pg_dump, which formerly had to guess about whether check constraints were really inherited or not. The implementation involves adding conislocal and coninhcount columns to pg_constraint (paralleling attislocal and attinhcount in pg_attribute) and refactoring various ALTER TABLE actions to be more like those for columns. Alex Hunsaker, Nikhil Sontakke, Tom Lane	2008-05-09 23:32:05 +00:00
Alvaro Herrera	73b0300b2a	Move the HTSU_Result enum definition into snapshot.h, to avoid including tqual.h into heapam.h. This makes all inclusion of tqual.h explicit. I also sorted alphabetically the includes on some source files.	2008-03-26 21:10:39 +00:00
Alvaro Herrera	78f02ca1f5	Rename snapmgmt.c/h to snapmgr.c/h, for consistency with other files. Per complaint from Tom Lane.	2008-03-26 18:48:59 +00:00
Alvaro Herrera	d43b085d57	Separate snapshot management code from tuple visibility code, create a snapmgmt.c file for the former. The header files have also been reorganized in three parts: the most basic snapshot definitions are now in a new file snapshot.h, and the also new snapmgmt.h keeps the definitions for snapmgmt.c. tqual.h has been reduced to the bare minimum. This patch is just a first step towards managing live snapshots within a transaction; there is no functionality change. Per my proposal to pgsql-patches on 20080318191940.GB27458@alvh.no-ip.org and subsequent discussion.	2008-03-26 16:20:48 +00:00
Tom Lane	220db7ccd8	Simplify and standardize conversions between TEXT datums and ordinary C strings. This patch introduces four support functions cstring_to_text, cstring_to_text_with_len, text_to_cstring, and text_to_cstring_buffer, and two macros CStringGetTextDatum and TextDatumGetCString. A number of existing macros that provided variants on these themes were removed. Most of the places that need to make such conversions now require just one function or macro call, in place of the multiple notational layers that used to be needed. There are no longer any direct calls of textout or textin, and we got most of the places that were using handmade conversions via memcpy (there may be a few still lurking, though). This commit doesn't make any serious effort to eliminate transient memory leaks caused by detoasting toasted text objects before they reach text_to_cstring. We changed PG_GETARG_TEXT_P to PG_GETARG_TEXT_PP in a few places where it was easy, but much more could be done. Brendan Jurd and Tom Lane	2008-03-25 22:42:46 +00:00
Tom Lane	0688d84041	Add checks to TRUNCATE, CLUSTER, and REINDEX to prevent performing these operations when the current transaction has any open references to the target relation or index (implying it has an active query using the relation). The need for this was previously recognized in connection with ALTER TABLE, but anything that summarily eliminates tuples or moves them around would confuse an active scan. While this patch does not in itself fix bug #3883 (the deadlock would happen before the new check fires), it will discourage people from attempting the sequence of operations that creates a deadlock risk, so it's at least a partial response to that problem. In passing, add a previously-missing check to REINDEX to prevent trying to reindex another backend's temp table. This isn't a security problem since only a superuser would get past the schema permission checks, but if we are testing for this in other utility commands then surely REINDEX should too.	2008-01-30 19:46:48 +00:00
Tom Lane	d3b1b1f9d8	Fix CREATE INDEX CONCURRENTLY so that it won't use synchronized scan for its second pass over the table. It has to start at block zero, else the "merge join" logic for detecting which TIDs are already in the index doesn't work. Hence, extend heapam.c's API so that callers can enable or disable syncscan. (I put in an option to disable buffer access strategy, too, just in case somebody needs it.) Per report from Hannes Dorbath.	2008-01-14 01:39:09 +00:00
Tom Lane	eedb068c0a	Make standard maintenance operations (including VACUUM, ANALYZE, REINDEX, and CLUSTER) execute as the table owner rather than the calling user, using the same privilege-switching mechanism already used for SECURITY DEFINER functions. The purpose of this change is to ensure that user-defined functions used in index definitions cannot acquire the privileges of a superuser account that is performing routine maintenance. While a function used in an index is supposed to be IMMUTABLE and thus not able to do anything very interesting, there are several easy ways around that restriction; and even if we could plug them all, there would remain a risk of reading sensitive information and broadcasting it through a covert channel such as CPU usage. To prevent bypassing this security measure, execution of SET SESSION AUTHORIZATION and SET ROLE is now forbidden within a SECURITY DEFINER context. Thanks to Itagaki Takahiro for reporting this vulnerability. Security: CVE-2007-6600	2008-01-03 21:23:15 +00:00
Bruce Momjian	9098ab9e32	Update copyrights in source tree to 2008.	2008-01-01 19:46:01 +00:00
Bruce Momjian	fdf5a5efb7	pgindent run for 8.3.	2007-11-15 21:14:46 +00:00
Tom Lane	c293ba9eff	If an index depends on no columns of its table, give it a dependency on the whole table instead, to ensure that it goes away when the table is dropped. Per bug #3723 from Sam Mason. Backpatch as far as 7.4; AFAICT 7.3 does not have the issue, because it doesn't have general-purpose expression indexes and so there must be at least one column referenced by an index.	2007-11-08 23:22:54 +00:00
Tom Lane	6daef2bca4	Remove hack in pg_tablespace_aclmask() that disallowed permissions on pg_global even to superusers, and replace it with checks in various other places to complain about invalid uses of pg_global. This ends up being a bit more code but it allows a more specific error message to be given, and it un-breaks pg_tablespace_size() on pg_global. Per discussion.	2007-10-12 18:55:12 +00:00
Tom Lane	282d2a03dd	HOT updates. When we update a tuple without changing any of its indexed columns, and the new version can be stored on the same heap page, we no longer generate extra index entries for the new version. Instead, index searches follow the HOT-chain links to ensure they find the correct tuple version. In addition, this patch introduces the ability to "prune" dead tuples on a per-page basis, without having to do a complete VACUUM pass to recover space. VACUUM is still needed to clean up dead index entries, however. Pavan Deolasee, with help from a bunch of other people.	2007-09-20 17:56:33 +00:00
Tom Lane	d526575f89	Make large sequential scans and VACUUMs work in a limited-size "ring" of buffers, rather than blowing out the whole shared-buffer arena. Aside from avoiding cache spoliation, this fixes the problem that VACUUM formerly tended to cause a WAL flush for every page it modified, because we had it hacked to use only a single buffer. Those flushes will now occur only once per ring-ful. The exact ring size, and the threshold for seqscans to switch into the ring usage pattern, remain under debate; but the infrastructure seems done. The key bit of infrastructure is a new optional BufferAccessStrategy object that can be passed to ReadBuffer operations; this replaces the former StrategyHintVacuum API. This patch also changes the buffer usage-count methodology a bit: we now advance usage_count when first pinning a buffer, rather than when last unpinning it. To preserve the behavior that a buffer's lifetime starts to decrease when it's released, the clock sweep code is modified to not decrement usage_count of pinned buffers. Work not done in this commit: teach GiST and GIN indexes to use the vacuum BufferAccessStrategy for vacuum-driven fetches. Original patch by Simon, reworked by Heikki and again by Tom.	2007-05-30 20:12:03 +00:00
Alvaro Herrera	90cbc63fd1	Have TRUNCATE advance the affected table's relfrozenxid to RecentXmin, to avoid a later needless VACUUM for Xid-wraparound purposes. We can do this since the table is known to be left empty, so no Xid remains on it. Per discussion.	2007-05-16 17:28:20 +00:00
Tom Lane	fba8113c1b	Teach CLUSTER to skip writing WAL if not needed (ie, not using archiving) --- Simon. Also, code review and cleanup for the previous COPY-no-WAL patches --- Tom.	2007-03-29 00:15:39 +00:00
Tom Lane	e85a01df67	Clean up the representation of special snapshots by including a "method pointer" in every Snapshot struct. This allows removal of the case-by-case tests in HeapTupleSatisfiesVisibility, which should make it a bit faster (I didn't try any performance tests though). More importantly, we are no longer violating portable C practices by assuming that small integers are distinct from all pointer values, and HeapTupleSatisfiesDirty no longer has a non-reentrant API involving side-effects on a global variable. There were a couple of places calling HeapTupleSatisfiesXXX routines directly rather than through the HeapTupleSatisfiesVisibility macro. Since these places had to be changed anyway, I chose to make them go through the macro for uniformity. Along the way I renamed HeapTupleSatisfiesSnapshot to HeapTupleSatisfiesMVCC to emphasize that it's only used with MVCC-type snapshots. I was sorely tempted to rename HeapTupleSatisfiesVisibility to HeapTupleSatisfiesSnapshot, but forebore for the moment to avoid confusion and reduce the likelihood that this patch breaks some of the pending patches. Might want to reconsider doing that later.	2007-03-25 19:45:14 +00:00
Bruce Momjian	63c678d17b	Fix for COPY-after-truncate feature. Simon Riggs	2007-03-03 20:08:41 +00:00
Tom Lane	7bddca3450	Fix up foreign-key mechanism so that there is a sound semantic basis for the equality checks it applies, instead of a random dependence on whatever operators might be named "=". The equality operators will now be selected from the opfamily of the unique index that the FK constraint depends on to enforce uniqueness of the referenced columns; therefore they are certain to be consistent with that index's notion of equality. Among other things this should fix the problem noted awhile back that pg_dump may fail for foreign-key constraints on user-defined types when the required operators aren't in the search path. This also means that the former warning condition about "foreign key constraint will require costly sequential scans" is gone: if the comparison condition isn't indexable then we'll reject the constraint entirely. All per past discussions. Along the way, make the RI triggers look into pg_constraint for their information, instead of using pg_trigger.tgargs; and get rid of the always error-prone fixed-size string buffers in ri_triggers.c in favor of building up the RI queries in StringInfo buffers. initdb forced due to columns added to pg_constraint and pg_trigger.	2007-02-14 01:58:58 +00:00
Bruce Momjian	8b4ff8b6a1	Wording cleanup for error messages. Also change can't -> cannot. Standard English uses "may", "can", and "might" in different ways: may - permission, "You may borrow my rake." can - ability, "I can lift that log." might - possibility, "It might rain today." Unfortunately, in conversational English, their use is often mixed, as in, "You may use this variable to do X", when in fact, "can" is a better choice. Similarly, "It may crash" is better stated, "It might crash".	2007-02-01 19:10:30 +00:00
Bruce Momjian	ef65f6f7a4	Prevent WAL logging when COPY is done in the same transation that created it. Simon Riggs	2007-01-25 02:17:26 +00:00
Tom Lane	4431758229	Support ORDER BY ... NULLS FIRST/LAST, and add ASC/DESC/NULLS FIRST/NULLS LAST per-column options for btree indexes. The planner's support for this is still pretty rudimentary; it does not yet know how to plan mergejoins with nondefault ordering options. The documentation is pretty rudimentary, too. I'll work on improving that stuff later. Note incompatible change from prior behavior: ORDER BY ... USING will now be rejected if the operator is not a less-than or greater-than member of some btree opclass. This prevents less-than-sane behavior if an operator that doesn't actually define a proper sort ordering is selected.	2007-01-09 02:14:16 +00:00
Bruce Momjian	29dccf5fe0	Update CVS HEAD for 2007 copyright. Back branches are typically not back-stamped for this.	2007-01-05 22:20:05 +00:00
Bruce Momjian	f99a569a2e	pgindent run for 8.2.	2006-10-04 00:30:14 +00:00
Tom Lane	e093dcdd28	Add the ability to create indexes 'concurrently', that is, without blocking concurrent writes to the table. Greg Stark, with a little help from Tom Lane.	2006-08-25 04:06:58 +00:00
Tom Lane	09d3670df3	Change the relation_open protocol so that we obtain lock on a relation (table or index) before trying to open its relcache entry. This fixes race conditions in which someone else commits a change to the relation's catalog entries while we are in process of doing relcache load. Problems of that ilk have been reported sporadically for years, but it was not really practical to fix until recently --- for instance, the recent addition of WAL-log support for in-place updates helped. Along the way, remove pg_am.amconcurrent: all AMs are now expected to support concurrent update.	2006-07-31 20:09:10 +00:00
Tom Lane	6e38e34d64	Change the bootstrap sequence so that toast tables for system catalogs are created in the bootstrap phase proper, rather than added after-the-fact by initdb. This is cleaner than before because it allows us to retire the undocumented ALTER TABLE ... CREATE TOAST TABLE command, but the real reason I'm doing it is so that toast tables of shared catalogs will now have predetermined OIDs. This will allow a reasonably clean solution to the problem of locking tables before we load their relcache entries, to appear in a forthcoming patch.	2006-07-31 01:16:38 +00:00
Alvaro Herrera	92c2ecc130	Modify snapshot definition so that lazy vacuums are ignored by other vacuums. This allows a OLTP-like system with big tables to continue regular vacuuming on small-but-frequently-updated tables while the big tables are being vacuumed. Original patch from Hannu Krossing, rewritten by Tom Lane and updated by me.	2006-07-30 02:07:18 +00:00
Bruce Momjian	a22d76d96a	Allow include files to compile own their own. Strip unused include files out unused include files, and add needed includes to C files. The next step is to remove unused include files in C files.	2006-07-13 16:49:20 +00:00
Tom Lane	b7b78d24f7	Code review for FILLFACTOR patch. Change WITH grammar as per earlier discussion (including making def_arg allow reserved words), add missed opt_definition for UNIQUE case. Put the reloptions support code in a less random place (I chose to make a new file access/common/reloptions.c). Eliminate header inclusion creep. Make the index options functions safely user-callable (seems like client apps might like to be able to test validity of options before trying to make an index). Reduce overhead for normal case with no options by allowing rd_options to be NULL. Fix some unmaintainably klugy code, including getting rid of Natts_pg_class_fixed at long last. Some stylistic cleanup too, and pay attention to keeping comments in sync with code. Documentation still needs work, though I did fix the omissions in catalogs.sgml and indexam.sgml.	2006-07-03 22:45:41 +00:00
Bruce Momjian	277807bd9e	Add FILLFACTOR to CREATE INDEX. ITAGAKI Takahiro	2006-07-02 02:23:23 +00:00
Tom Lane	3fdeb189e9	Clean up code associated with updating pg_class statistics columns (relpages/reltuples). To do this, create formal support in heapam.c for "overwrite" tuple updates (including xlog replay capability) and use that instead of the ad-hoc overwrites we'd been using in VACUUM and CREATE INDEX. Take the responsibility for updating stats during CREATE INDEX out of the individual index AMs, and do it where it belongs, in catalog/index.c. Aside from being more modular, this avoids having to update the same tuple twice in some paths through CREATE INDEX. It's probably not measurably faster, but for sure it's a lot cleaner than before.	2006-05-10 23:18:39 +00:00
Tom Lane	a8b8f4db23	Clean up WAL/buffer interactions as per my recent proposal. Get rid of the misleadingly-named WriteBuffer routine, and instead require routines that change buffer pages to call MarkBufferDirty (which does exactly what it says). We also require that they do so before calling XLogInsert; this takes care of the synchronization requirement documented in SyncOneBuffer. Note that because bufmgr takes the buffer content lock (in shared mode) while writing out any buffer, it doesn't matter whether MarkBufferDirty is executed before the buffer content change is complete, so long as the content change is completed before releasing exclusive lock on the buffer. So it's OK to set the dirtybit before we fill in the LSN. This eliminates the former kluge of needing to set the dirtybit in LockBuffer. Aside from making the code more transparent, we can also add some new debugging assertions, in particular that the caller of MarkBufferDirty must hold the buffer content lock, not merely a pin.	2006-03-31 23:32:07 +00:00
Tom Lane	4e7d10c7cd	Comments in IndexBuildHeapScan describe the indexing of recently-dead tuples as needed "to keep VACUUM from complaining", but actually there is a more compelling reason to do it: failure to do so violates MVCC semantics. This is because a pre-existing serializable transaction might try to use the index after we finish (re)building it, and it might fail to find tuples it should be able to see. We got this mostly right, but not in the case of partial indexes: the code mistakenly discarded recently-dead tuples for partial indexes. Fix that, and adjust the comments.	2006-03-24 23:02:17 +00:00
Bruce Momjian	f2f5b05655	Update copyright for 2006. Update scripts.	2006-03-05 15:59:11 +00:00
Bruce Momjian	436a2956d8	Re-run pgindent, fixing a problem where comment lines after a blank comment line where output as too long, and update typedefs for /lib directory. Also fix case where identifiers were used as variable names in the backend, but as typedefs in ecpg (favor the backend for indenting). Backpatch to 8.1.X.	2005-11-22 18:17:34 +00:00
Bruce Momjian	1dc3498251	Standard pgindent run for 8.1.	2005-10-15 02:49:52 +00:00
Tom Lane	f26b91761b	Arrange for indexes and toast tables to inherit their ownership from the parent table, even if the command that creates them is executed by someone else (such as a superuser or a member of the owning role). Per gripe from Michael Fuhr.	2005-08-26 03:08:15 +00:00
Tom Lane	721e53785d	Solve the problem of OID collisions by probing for duplicate OIDs whenever we generate a new OID. This prevents occasional duplicate-OID errors that can otherwise occur once the OID counter has wrapped around. Duplicate relfilenode values are also checked for when creating new physical files. Per my recent proposal.	2005-08-12 01:36:05 +00:00
Tom Lane	3acca18d28	Fix ancient memory leak in index_create(): RelationInitIndexAccessInfo was being called twice in normal operation, leading to a leak of one set of relcache subsidiary info. Per report from Jeff Gold.	2005-06-25 16:53:49 +00:00
Neil Conway	9de97c5531	Trivial code clarity improvement to UpdateStats(); no functional change.	2005-06-20 02:07:47 +00:00
Tom Lane	ee3b71f6bc	Split the shared-memory array of PGPROC pointers out of the sinval communication structure, and make it its own module with its own lock. This should reduce contention at least a little, and it definitely makes the code seem cleaner. Per my recent proposal.	2005-05-19 21:35:48 +00:00
Neil Conway	3140437495	This patch refactors away some duplicated code in the index AM build methods: they all invoke UpdateStats() since they have computed the number of heap tuples, so I created a function in catalog/index.c that each AM now calls.	2005-05-11 06:24:55 +00:00
Tom Lane	278bd0cc22	For some reason access/tupmacs.h has been #including utils/memutils.h, which is neither needed by nor related to that header. Remove the bogus inclusion and instead include the header in those C files that actually need it. Also fix unnecessary inclusions and bad inclusion order in tsearch2 files.	2005-05-06 17:24:55 +00:00
Tom Lane	bedb78d386	Implement sharable row-level locks, and use them for foreign key references to eliminate unnecessary deadlocks. This commit adds SELECT ... FOR SHARE paralleling SELECT ... FOR UPDATE. The implementation uses a new SLRU data structure (managed much like pg_subtrans) to represent multiple- transaction-ID sets. When more than one transaction is holding a shared lock on a particular row, we create a MultiXactId representing that set of transactions and store its ID in the row's XMAX. This scheme allows an effectively unlimited number of row locks, just as we did before, while not costing any extra overhead except when a shared lock actually has to be shared. Still TODO: use the regular lock manager to control the grant order when multiple backends are waiting for a row lock. Alvaro Herrera and Tom Lane.	2005-04-28 21:47:18 +00:00
Tom Lane	162bd08b3f	Completion of project to use fixed OIDs for all system catalogs and indexes. Replace all heap_openr and index_openr calls by heap_open and index_open. Remove runtime lookups of catalog OID numbers in various places. Remove relcache's support for looking up system catalogs by name. Bulky but mostly very boring patch ...	2005-04-14 20:03:27 +00:00
Tom Lane	7c13781ee7	First phase of project to use fixed OIDs for all system catalogs and indexes. Extend the macros in include/catalog/*.h to carry the info about hand-assigned OIDs, and adjust the genbki script and bootstrap code to make the relations actually get those OIDs. Remove the small number of RelOid_pg_foo macros that we had in favor of a complete set named like the catname.h and indexing.h macros. Next phase will get rid of internal use of names for looking up catalogs and indexes; but this completes the changes forcing an initdb, so it looks like a good place to commit. Along the way, I made the shared relations (pg_database etc) not be 'bootstrap' relations any more, so as to reduce the number of hardwired entries and simplify changing those relations in future. I'm not sure whether they ever really needed to be handled as bootstrap relations, but it seems to work fine to not do so now.	2005-04-14 01:38:22 +00:00
Tom Lane	70c9763d48	Convert oidvector and int2vector into variable-length arrays. This change saves a great deal of space in pg_proc and its primary index, and it eliminates the former requirement that INDEX_MAX_KEYS and FUNC_MAX_ARGS have the same value. INDEX_MAX_KEYS is still embedded in the on-disk representation (because it affects index tuple header size), but FUNC_MAX_ARGS is not. I believe it would now be possible to increase FUNC_MAX_ARGS at little cost, but haven't experimented yet. There are still a lot of vestigial references to FUNC_MAX_ARGS, which I will clean up in a separate pass. However, getting rid of it altogether would require changing the FunctionCallInfoData struct, and I'm not sure I want to buy into that.	2005-03-29 00:17:27 +00:00
Tom Lane	ee4ddac137	Convert index-related tuple handling routines from char 'n'/' ' to bool convention for isnull flags. Also, remove the useless InsertIndexResult return struct from index AM aminsert calls --- there is no reason for the caller to know where in the index the tuple was inserted, and we were wasting a palloc cycle per insert to deliver this uninteresting value (plus nontrivial complexity in some AMs). I forced initdb because of the change in the signature of the aminsert routines, even though nothing really looks at those pg_proc entries...	2005-03-21 01:24:04 +00:00
Tom Lane	354049c709	Remove unnecessary calls of FlushRelationBuffers: there is no need to write out data that we are about to tell the filesystem to drop. smgr_internal_unlink already had a DropRelFileNodeBuffers call to get rid of dead buffers without a write after it's no longer possible to roll back the deleting transaction. Adding a similar call in smgrtruncate simplifies callers and makes the overall division of labor clearer. This patch removes the former behavior that VACUUM would write all dirty buffers of a relation unconditionally.	2005-03-20 22:00:54 +00:00
Tom Lane	f97aebd162	Revise TupleTableSlot code to avoid unnecessary construction and disassembly of tuples when passing data up through multiple plan nodes. A slot can now hold either a normal "physical" HeapTuple, or a "virtual" tuple consisting of Datum/isnull arrays. Upper plan levels can usually just copy the Datum arrays, avoiding heap_formtuple() and possible subsequent nocachegetattr() calls to extract the data again. This work extends Atsushi Ogawa's earlier patch, which provided the key idea of adding Datum arrays to TupleTableSlots. (I believe however that something like this was foreseen way back in Berkeley days --- see the old comment on ExecProject.) A test case involving many levels of join of fairly wide tables (about 80 columns altogether) showed about 3x overall speedup, though simple queries will probably not be helped very much. I have also duplicated some code in heaptuple.c in order to provide versions of heap_formtuple and friends that use "bool" arrays to indicate null attributes, instead of the old convention of "char" arrays containing either 'n' or ' '. This provides a better match to the convention used by ExecEvalExpr. While I have not made a concerted effort to get rid of uses of the old routines, I think they should be deprecated and eventually removed.	2005-03-16 21:38:10 +00:00
Tom Lane	a52b4fb131	Adjust creation/destruction of TupleDesc data structure to reduce the number of palloc calls. This has a salutory impact on plpgsql operations with record variables (which create and destroy tupdescs constantly) and probably helps a bit in some other cases too.	2005-03-07 04:42:17 +00:00
Tom Lane	5d5087363d	Replace the BufMgrLock with separate locks on the lookup hashtable and the freelist, plus per-buffer spinlocks that protect access to individual shared buffer headers. This requires abandoning a global freelist (since the freelist is a global contention point), which shoots down ARC and 2Q as well as plain LRU management. Adopt a clock sweep algorithm instead. Preliminary results show substantial improvement in multi-backend situations.	2005-03-04 20:21:07 +00:00
Tom Lane	0ce4d56924	Phase 1 of fix for 'SMgrRelation hashtable corrupted' problem. This is the minimum required fix. I want to look next at taking advantage of it by simplifying the message semantics in the shared inval message queue, but that part can be held over for 8.1 if it turns out too ugly.	2005-01-10 20:02:24 +00:00
PostgreSQL Daemon	2ff501590b	Tag appropriate files for rc3 Also performed an initial run through of upgrading our Copyright date to extend to 2005 ... first run here was very simple ... change everything where: grep 1996-2004 && the word 'Copyright' ... scanned through the generated list with 'less' first, and after, to make sure that I only picked up the right entries ...	2004-12-31 22:04:05 +00:00
Tom Lane	5374d097de	Change planner to use the current true disk file size as its estimate of a relation's number of blocks, rather than the possibly-obsolete value in pg_class.relpages. Scale the value in pg_class.reltuples correspondingly to arrive at a hopefully more accurate number of rows. When pg_class contains 0/0, estimate a tuple width from the column datatypes and divide that into current file size to estimate number of rows. This improved methodology allows us to jettison the ancient hacks that put bogus default values into pg_class when a table is first created. Also, per a suggestion from Simon, make VACUUM (but not VACUUM FULL or ANALYZE) adjust the value it puts into pg_class.reltuples to try to represent the mean tuple density instead of the minimal density that actually prevails just after VACUUM. These changes alter the plans selected for certain regression tests, so update the expected files accordingly. (I removed join_1.out because it's not clear if it still applies; we can add back any variant versions as they are shown to be needed.)	2004-12-01 19:00:56 +00:00
Tom Lane	9ffc8ed58b	Repair possible failure to update hint bits back to disk, per http://archives.postgresql.org/pgsql-hackers/2004-10/msg00464.php. This fix is intended to be permanent: it moves the responsibility for calling SetBufferCommitInfoNeedsSave() into the tqual.c routines, eliminating the requirement for callers to test whether t_infomask changed. Also, tighten validity checking on buffer IDs in bufmgr.c --- several routines were paranoid about out-of-range shared buffer numbers but not about out-of-range local ones, which seems a tad pointless.	2004-10-15 22:40:29 +00:00
Tom Lane	b1f8a37aa7	Fallout from changing index locking rules: we can reduce the strength of locking used by REINDEX. REINDEX needs only ShareLock on the parent table, same as CREATE INDEX, plus an exclusive lock on the specific index being processed.	2004-10-01 17:11:50 +00:00
Tom Lane	617d6ea7df	Fix unintended assignment of sequences to the containing schema's default tablespace --- they should always go in the database's default tablespace. Adjust heap_create() API so that it is passed the relkind to make this easier; should simplify any further tweaking of the same sort.	2004-08-31 17:10:36 +00:00
Bruce Momjian	b6b71b85bc	Pgindent run for 8.0.	2004-08-29 05:07:03 +00:00
Bruce Momjian	da9a8649d8	Update copyright to 2004.	2004-08-29 04:13:13 +00:00
Tom Lane	448eb0837f	Rearrange order of operations in heap_drop_with_catalog and index_drop so that we close and flush the doomed relation's relcache entry before we start to delete the underlying catalog rows, rather than afterwards. For awhile yesterday I thought that an unexpected relcache entry rebuild partway through this sequence might explain the infrequent parallel regression failures we were chasing. It doesn't, mainly because there's no CommandCounterIncrement in the sequence and so the deletions aren't "really" done yet. But it sure seems like trouble waiting to happen.	2004-08-28 21:05:26 +00:00
Tom Lane	efcaf1e868	Some mop-up work for savepoints (nested transactions). Store a small number of active subtransaction XIDs in each backend's PGPROC entry, and use this to avoid expensive probes into pg_subtrans during TransactionIdIsInProgress. Extend EOXactCallback API to allow add-on modules to get control at subxact start/end. (This is deliberately not compatible with the former API, since any uses of that API probably need manual review anyway.) Add basic reference documentation for SAVEPOINT and related commands. Minor other cleanups to check off some of the open issues for subtransactions. Alvaro Herrera and Tom Lane.	2004-08-01 17:32:22 +00:00
Tom Lane	2467394ee1	Tablespaces. Alternate database locations are dead, long live tablespaces. There are various things left to do: contrib dbsize and oid2name modules need work, and so does the documentation. Also someone should think about COMMENT ON TABLESPACE and maybe RENAME TABLESPACE. Also initlocation is dead, it just doesn't know it yet. Gavin Sherry and Tom Lane.	2004-06-18 06:14:31 +00:00
Tom Lane	e674707968	Minor code rationalization: FlushRelationBuffers just returns void, rather than an error code, and does elog(ERROR) not elog(WARNING) when it detects a problem. All callers were simply elog(ERROR)'ing on failure return anyway, and I find it hard to envision a caller that would not, so we may as well simplify the callers and produce the more useful error message directly.	2004-05-31 19:24:05 +00:00
Neil Conway	d0b4399d81	Reimplement the linked list data structure used throughout the backend. In the past, we used a 'Lispy' linked list implementation: a "list" was merely a pointer to the head node of the list. The problem with that design is that it makes lappend() and length() linear time. This patch fixes that problem (and others) by maintaining a count of the list length and a pointer to the tail node along with each head node pointer. A "list" is now a pointer to a structure containing some meta-data about the list; the head and tail pointers in that structure refer to ListCell structures that maintain the actual linked list of nodes. The function names of the list API have also been changed to, I hope, be more logically consistent. By default, the old function names are still available; they will be disabled-by-default once the rest of the tree has been updated to use the new API names.	2004-05-26 04:41:50 +00:00
Tom Lane	4af3421161	Get rid of rd_nblocks field in relcache entries. Turns out this was costing us lots more to maintain than it was worth. On shared tables it was of exactly zero benefit because we couldn't trust it to be up to date. On temp tables it sometimes saved an lseek, but not often enough to be worth getting excited about. And the real problem was that we forced an lseek on every relcache flush in order to update the field. So all in all it seems best to lose the complexity.	2004-05-08 19:09:25 +00:00
Tom Lane	dd16b7aa9e	Get rid of cluster.c's apparatus for rebuilding a relation's indexes in favor of using the REINDEX TABLE apparatus, which does the same thing simpler and faster. Also, make TRUNCATE not use cluster.c at all, but just assign a new relfilenode and REINDEX. This partially addresses Hartmut Raschick's complaint from last December that 7.4's TRUNCATE is an order of magnitude slower than prior releases. By getting rid of a lot of unnecessary catalog updates, these changes buy back about a factor of two (on my system). The remaining overhead seems associated with creating and deleting storage files, which we may not be able to do much about without abandoning transaction safety for TRUNCATE.	2004-05-08 00:34:49 +00:00
Tom Lane	077db40fa1	ALTER TABLE rewrite. New cool stuff: * ALTER ... ADD COLUMN with defaults and NOT NULL constraints works per SQL spec. A default is implemented by rewriting the table with the new value stored in each row. * ALTER COLUMN TYPE. You can change a column's datatype to anything you want, so long as you can specify how to convert the old value. Rewrites the table. (Possible future improvement: optimize no-op conversions such as varchar(N) to varchar(N+1).) * Multiple ALTER actions in a single ALTER TABLE command. You can perform any number of column additions, type changes, and constraint additions with only one pass over the table contents. Basic documentation provided in ALTER TABLE ref page, but some more docs work is needed. Original patch from Rod Taylor, additional work from Tom Lane.	2004-05-05 04:48:48 +00:00
Tom Lane	f0c9397f80	First steps towards statistics on expressional (nee functional) indexes. This commit teaches ANALYZE to store such stats in pg_statistic, but nothing is done yet about teaching the planner to use 'em. Also, repair longstanding oversight in separate ANALYZE command: it updated the pg_class.relpages and reltuples counts for the table proper, but not for indexes.	2004-02-15 21:01:39 +00:00
Tom Lane	87bd956385	Restructure smgr API as per recent proposal. smgr no longer depends on the relcache, and so the notion of 'blind write' is gone. This should improve efficiency in bgwriter and background checkpoint processes. Internal restructuring in md.c to remove the not-very-useful array of MdfdVec objects --- might as well just use pointers. Also remove the long-dead 'persistent main memory' storage manager (mm.c), since it seems quite unlikely to ever get resurrected.	2004-02-10 01:55:27 +00:00
Tom Lane	2f0d43b251	Review uses of IsUnderPostmaster, change some tests to look at whereToSendOutput instead because they are really inquiring about the correct client communication protocol. Update some comments. This is pointing towards supporting regular FE/BE client protocol in a standalone backend, per discussion a month or so back.	2004-01-28 21:02:40 +00:00
Neil Conway	192ad63bd7	More janitorial work: remove the explicit casting of NULL literals to a pointer type when it is not necessary to do so. For future reference, casting NULL to a pointer type is only necessary when (a) invoking a function AND either (b) the function has no prototype OR (c) the function is a varargs function.	2004-01-07 18:56:30 +00:00
Tom Lane	c607bd693f	Clean up the usage of canonicalize_qual(): in particular, be consistent about whether it is applied before or after eval_const_expressions(). I believe there were some corner cases where the system would fail to recognize that a partial index is applicable because of the previous inconsistency. Store normal rather than 'implicit AND' representations of constraints and index predicates in the catalogs. initdb forced due to representation change of constraints/predicates.	2003-12-28 21:57:37 +00:00
PostgreSQL Daemon	969685ad44	$Header: -> $PostgreSQL Changes ...	2003-11-29 19:52:15 +00:00
Jan Wieck	cfeca62148	Background writer process This first part of the background writer does no syncing at all. It's only purpose is to keep the LRU heads clean so that regular backends seldom to never have to call write(). Jan	2003-11-19 15:55:08 +00:00
Tom Lane	fa5c8a055a	Cross-data-type comparisons are now indexable by btrees, pursuant to my pghackers proposal of 8-Nov. All the existing cross-type comparison operators (int2/int4/int8 and float4/float8) have appropriate support. The original proposal of storing the right-hand-side datatype as part of the primary key for pg_amop and pg_amproc got modified a bit in the event; it is easier to store zero as the 'default' case and only store a nonzero when the operator is actually cross-type. Along the way, remove the long-since-defunct bigbox_ops operator class.	2003-11-12 21:15:59 +00:00
Tom Lane	c1d62bfd00	Add operator strategy and comparison-value datatype fields to ScanKey. Remove the 'strategy map' code, which was a large amount of mechanism that no longer had any use except reverse-mapping from procedure OID to strategy number. Passing the strategy number to the index AM in the first place is simpler and faster. This is a preliminary step in planned support for cross-datatype index operations. I'm committing it now since the ScanKeyEntryInitialize() API change touches quite a lot of files, and I want to commit those changes before the tree drifts under me.	2003-11-09 21:30:38 +00:00
Peter Eisentraut	7438af96fa	More message editing, some suggested by Alvaro Herrera	2003-09-29 00:05:25 +00:00
Peter Eisentraut	feb4f44d29	Message editing: remove gratuitous variations in message wording, standardize terms, add some clarifications, fix some untranslatable attempts at dynamic message building.	2003-09-25 06:58:07 +00:00
Tom Lane	a56a016ceb	Repair some REINDEX problems per recent discussions. The relcache is now able to cope with assigning new relfilenode values to nailed-in-cache indexes, so they can be reindexed using the fully crash-safe method. This leaves only shared system indexes as special cases. Remove the 'index deactivation' code, since it provides no useful protection in the shared- index case. Require reindexing of shared indexes to be done in standalone mode, but remove other restrictions on REINDEX. -P (IgnoreSystemIndexes) now prevents using indexes for lookups, but does not disable index updates. It is therefore safe to allow from PGOPTIONS. Upshot: reindexing system catalogs can be done without a standalone backend for all cases except shared catalogs.	2003-09-24 18:54:02 +00:00
Hiroshi Inoue	f5c5c3c6f7	Putting back the previous change must be the first thing. ALso put back a #ifndef ENABLE_REINDEX_NAILED_RELATIONS which was removed about a year ago.	2003-09-23 01:51:09 +00:00
Tom Lane	28847ae77d	Seems like a bad idea that REINDEX TABLE supports (or thinks it does) reindexing system tables without ignoring system indexes, when the other two varieties of REINDEX disallow it. Make all three act the same, and simplify downstream code accordingly.	2003-09-19 19:57:42 +00:00
Bruce Momjian	f3c3deb7d0	Update copyrights to 2003.	2003-08-04 02:40:20 +00:00
Bruce Momjian	089003fb46	pgindent run.	2003-08-04 00:43:34 +00:00
Tom Lane	d85286305d	Error message editing in backend/catalog.	2003-07-21 01:59:11 +00:00
Tom Lane	f3707d0705	Fix stupid oversight :-(	2003-05-29 00:54:42 +00:00
Tom Lane	fc8d970cbc	Replace functional-index facility with expressional indexes. Any column of an index can now be a computed expression instead of a simple variable. Restrictions on expressions are the same as for predicates (only immutable functions, no sub-selects). This fixes problems recently introduced with inlining SQL functions, because the inlining transformation is applied to both expression trees so the planner can still match them up. Along the way, improve efficiency of handling index predicates (both predicates and index expressions are now cached by the relcache) and fix 7.3 oversight that didn't record dependencies of predicate expressions.	2003-05-28 16:04:02 +00:00
Bruce Momjian	cde8bbc413	This patch makes the following changes to the documentation: - more work from the SGML police - some grammar improvements: rewriting a paragraph or two, replacing contractions where (IMHO) appropriate - fix missing utility commands in lock mode docs - improve CLUSTER, REINDEX, SET SESSION AUTHORIZATION ref pages Neil Conway	2003-02-19 04:06:28 +00:00
Tom Lane	5bab36e9f6	Revise executor APIs so that all per-query state structure is built in a per-query memory context created by CreateExecutorState --- and destroyed by FreeExecutorState. This provides a final solution to the longstanding problem of memory leaked by various ExecEndNode calls.	2002-12-15 16:17:59 +00:00
Tom Lane	3a4f7dde16	Phase 3 of read-only-plans project: ExecInitExpr now builds expression execution state trees, and ExecEvalExpr takes an expression state tree not an expression plan tree. The plan tree is now read-only as far as the executor is concerned. Next step is to begin actually exploiting this property.	2002-12-13 19:46:01 +00:00
Tom Lane	a0bf885f9e	Phase 2 of read-only-plans project: restructure expression-tree nodes so that all executable expression nodes inherit from a common supertype Expr. This is somewhat of an exercise in code purity rather than any real functional advance, but getting rid of the extra Oper or Func node formerly used in each operator or function call should provide at least a little space and speed improvement. initdb forced by changes in stored-rules representation.	2002-12-12 15:49:42 +00:00
Bruce Momjian	9b12ab6d5d	Add new palloc0 call as merge of palloc and MemSet(0).	2002-11-13 00:39:48 +00:00
Bruce Momjian	75fee4535d	Back out use of palloc0 in place if palloc/MemSet. Seems constant len to MemSet is a performance boost.	2002-11-11 03:02:20 +00:00
Bruce Momjian	8fee9615cc	Merge palloc()/MemSet(0) calls into a single palloc0() call.	2002-11-10 07:25:14 +00:00
Tom Lane	200b151615	Fix places that were using IsTransactionBlock() as an (inadequate) check that they'd get to commit immediately on finishing. There's now a centralized routine PreventTransactionChain() that implements the necessary tests.	2002-10-21 22:06:20 +00:00
Tom Lane	d2d0f42040	Use heap_formtuple not heap_addheader to construct pg_index tuples. heap_addheader is wrong because it doesn't cope with varlena fields, notably indpred.	2002-09-27 15:05:23 +00:00
Tom Lane	bc1088c28a	Get rid of bogus use of heap_mark4update in reindex operations (cf. recent bug report). Fix processing of nailed-in-cache indexes; it appears that REINDEX DATABASE has been broken for months :-(.	2002-09-23 00:42:48 +00:00

1 2 3 4 5 ...

449 Commits